From One Big PBIX to a Shared Dataset + Thin Reports


"When is it time to stop copying PBIX files and start reusing a shared model?"

Most BI developers begin their Power BI journey the same way: one PBIX file that does absolutely everything. It holds the data model, the queries, the measures, and all the report pages on top. It’s simple, self-contained, and easy to understand. That's normal, and there is nothing wrong with it. But let's imagine a scenario where several teams want the same dataset, each tweaked slightly for their own purpose.

Suppose your semantic model is also quite large. The quick fix is to make the adjustments and save a separate PBIX file for each audience. This works, but let's look at what it actually does. After a few months, what started as one neat, well-understood PBIX turns into a small family of very similar files:

  • Multiple PBIX files containing almost the same model
  • KPIs that should mean the same thing but don’t quite match
  • A growing fear that changing a core measure might break someone else’s report
  • A file so heavy that opening it feels like loading a small video game

This is the point where many teams realise they aren’t just maintaining reports anymore; they’re maintaining variations of the same model. That is usually when it’s time to step back and restructure things. The solution isn’t complicated: split the big PBIX into a shared dataset and build thin reports on top of it.

Making this shift makes your life a lot easier:

  • Your measures are defined in one place
  • Different teams can have tailored reports without duplicating the model
  • Fixes and enhancements flow automatically to all reports
  • The overall setup becomes easier to maintain and scale
  • You won't consume Fabric capacity multiple times for the same model

In today's blog, we will walk through what a shared dataset actually is, how thin reports work, when it’s worth splitting your PBIX, and a practical way to migrate without turning your project upside down.

What do Shared Dataset and Thin Report mean?

Okay, let's define them clearly before we move further. These terms sound quite technical and are often used interchangeably. But are they really the same thing? Not entirely.

In simple terms, a Shared Dataset contains the data model (tables, relationships, and measures) and no visuals. It is the single source of truth that every team should use, because all the base KPIs are defined here.

Thin Reports are reports that connect to the Shared Dataset instead of importing their own data. They contain the pages and visuals, but no data model of their own.

When do I need to move to Shared Datasets?

Not every report needs a shared dataset. A simple dashboard or a quick analysis for a team can live happily inside a single PBIX. But as soon as a report becomes useful to more than one group, and you can see your dataset growing, that is the sign that you should move towards a Shared Dataset.

Here are the signs I look for that usually tell me it’s time to stop duplicating PBIX files and start centralising the model.

  • When you are maintaining multiple versions of the same PBIX-: The same underlying data is used in multiple reports, perhaps with a different set of measures depending on the audience. In effect, you are creating multiple versions of the same model.
  • When different teams have the same KPIs but in their own reports-: Maintaining multiple truths is the most frustrating part. The KPIs end up not matching, and then you are asked to debug the difference.
  • When your PBIX is getting heavier day by day-: Your PBIX takes ages to load, and every change takes too long because of the lag. Splitting the data model from the visuals reduces the size of each report file.
  • And most importantly, governance-: When everyone has their own version of the same thing, it is very difficult to enforce the same KPI definitions. If you plan to move towards Dev/Test/Prod environments with consistent KPI definitions, a shared dataset is the way to go.

Protip-: From experience, I can say shared datasets are not a cure for every issue, but they are a stepping stone towards better maintenance and governance.
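
If you want to check the "same KPI, different numbers" sign before committing to a migration, you can compare measure definitions across your PBIX copies. Here is a minimal Python sketch. It assumes you have already exported each file's measures as name and DAX expression pairs (for example with Tabular Editor or DAX Studio; the export step is not shown). The file names and measures are hypothetical, and the normalisation is deliberately crude.

```python
from collections import defaultdict


def normalise(dax: str) -> str:
    """Strip all whitespace and lowercase, so cosmetic differences are ignored.

    This is a crude heuristic, not a DAX parser.
    """
    return "".join(dax.split()).lower()


def find_conflicting_measures(
    models: dict[str, dict[str, str]],
) -> dict[str, dict[str, list[str]]]:
    """Return measures whose definition differs between files.

    models maps a file name to {measure name: DAX expression}.
    The result maps each conflicting measure name to
    {normalised expression: [files that use it]}.
    """
    seen: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for file_name, measures in models.items():
        for measure_name, expression in measures.items():
            seen[measure_name][normalise(expression)].append(file_name)
    return {
        name: dict(variants)
        for name, variants in seen.items()
        if len(variants) > 1  # more than one distinct definition
    }


# Hypothetical exports from two copies of the same report.
models = {
    "Sales_Finance.pbix": {
        "Total Sales": "SUM ( Sales[Amount] )",
        "Margin %": "DIVIDE ( [Margin], [Total Sales] )",
    },
    "Sales_Marketing.pbix": {
        "Total Sales": "SUM(Sales[Amount])",
        "Margin %": "DIVIDE ( [Margin], [Total Sales] - [Returns] )",
    },
}

conflicts = find_conflicting_measures(models)
for measure, variants in conflicts.items():
    print(f"{measure} has {len(variants)} different definitions")
```

In this example only "Margin %" is flagged, because the two "Total Sales" definitions differ in spacing only. Any measure that shows up here is a KPI you are already maintaining in more than one place.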

Step-by-Step guide to move from PBIX to Shared Dataset

Step 1-: First, create a duplicate of the PBIX and rename it to something like Report_Name_Shared_Dataset.

Step 2-: Open the Shared Dataset version and delete the report pages, keeping only the data model: the tables, relationships, and measures. Power BI Desktop needs at least one page, so leave a single blank one. This gives us the base for the shared dataset.

Step 3-: Now, publish the dataset to the workspace. Once it is published, any report can connect to it.

Step 4-: Time to create a thin report. Open a blank PBIX and go to Get data > Power BI Semantic Model. You can now build the report directly, without worrying about data modelling. If you need the same visuals as in the original report, just copy and paste them across.
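
Once a few thin reports exist, it helps to check which reports in the workspace really point at the shared dataset and which are still running on their own copy of the model. The sketch below assumes you already have a list of reports where each entry has a name and a datasetId, similar to what the Power BI REST API report listing returns. Fetching that list (authentication and the HTTP call) is not shown, and all names and IDs here are hypothetical.

```python
from collections import defaultdict


def reports_by_dataset(reports: list[dict]) -> dict[str, list[str]]:
    """Group report names by the dataset (semantic model) they are bound to."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for report in reports:
        dataset_id = report.get("datasetId", "<no dataset>")
        grouped[dataset_id].append(report["name"])
    return {dataset_id: sorted(names) for dataset_id, names in grouped.items()}


# Hypothetical payload, shaped like a workspace report listing.
sample_reports = [
    {"name": "Finance Overview", "datasetId": "ds-shared-001"},
    {"name": "Marketing Overview", "datasetId": "ds-shared-001"},
    {"name": "Sales (old copy)", "datasetId": "ds-legacy-007"},
]

SHARED_DATASET_ID = "ds-shared-001"  # hypothetical ID of the shared dataset

grouped = reports_by_dataset(sample_reports)
thin_reports = grouped.get(SHARED_DATASET_ID, [])
stragglers = {ds: names for ds, names in grouped.items() if ds != SHARED_DATASET_ID}

print("Thin reports on the shared dataset:", thin_reports)
print("Reports still on their own model:", stragglers)
```

Anything listed under a different dataset ID is a candidate for the next round of migration.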

Simple, isn't it? The setup really is straightforward, but it can lead to problems if best practices aren't followed.

Here are the best practices that I have gathered over time. They will help you keep things clean:

  • Clear distinction between the model and reports-: Mixing the two is quite common, especially if you are new to the world of shared datasets and thin reports. As a rule of thumb, the dataset is the only place where your model and logic live. It serves as the foundation on which you build thin reports.
  • Your dataset will be the one source of truth-: This ensures that all reports based on the dataset share the same KPI definitions. If a KPI definition changes in the future, you only need to change it in one place.
  • Don't overburden your thin reports-: A thin report exists to use the shared dataset and present it. It is not the place to create an alternate data model. As a best practice, keep report-level measures to a minimum and keep the calculations in the shared dataset itself.
  • Make your Shared Dataset presentable-: Before you publish a shared dataset, do a basic cleanup by deleting unused measures and columns, so users don't get confused by them. Also, give business-friendly names to all your columns and measures. My checks generally include:
    • Running a Best Practice Analyzer
    • Using Measure Killer to remove unused columns and measures
    • Using Tabular Editor to organize all the measures in folders and subfolders
    • Double-checking the data types and column names in all tables
  • Clear ownership required-: The whole team shouldn't own a shared dataset, as that makes maintenance messy. You need a specific person who approves changes and keeps the structure of the dataset clear.
  • Improving the shared dataset over time-: As the KPIs grow, your shared dataset should adapt. New measures, columns, and tables can be added in the future, and it won't hurt as long as you make all these changes in one place.
  • Set up the security in the dataset-: Most importantly, you can't set up security in thin reports; they always inherit the security defined in the shared dataset. You also can't test roles in a thin report, so define and test them in the dataset.
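
To give a feel for the cleanup step, here is a rough Python sketch of the idea behind finding unused measures. It is not how Measure Killer or any other tool works internally. It assumes you have the model's measures as name and DAX expression pairs, plus the set of measures that your thin reports actually use in visuals. Both inputs are hypothetical here, and the dependency detection is a simple bracket match rather than real DAX parsing.

```python
import re

# Matches anything written as [Name]. This also catches column references,
# which are filtered out below because they are not measure names.
BRACKET_REF = re.compile(r"\[([^\]]+)\]")


def find_unused_measures(measures: dict[str, str], used_in_reports: set[str]) -> set[str]:
    """Return measures that no report uses, directly or through another measure."""
    needed: set[str] = set()
    stack = [name for name in used_in_reports if name in measures]
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        # Follow references to other measures inside this measure's expression.
        for ref in BRACKET_REF.findall(measures[name]):
            if ref in measures and ref not in needed:
                stack.append(ref)
    return set(measures) - needed


# Hypothetical measures from the shared dataset.
measures = {
    "Total Sales": "SUM ( Sales[Amount] )",
    "Total Cost": "SUM ( Sales[Cost] )",
    "Margin": "[Total Sales] - [Total Cost]",
    "Margin %": "DIVIDE ( [Margin], [Total Sales] )",
    "Old Sales YTD": "TOTALYTD ( [Total Sales], 'Date'[Date] )",
    "Test Measure": "1",
}

# Hypothetical: measures that appear in visuals across all thin reports.
used_in_reports = {"Margin %", "Total Sales"}

unused = find_unused_measures(measures, used_in_reports)
print(sorted(unused))
```

Note that "Margin" and "Total Cost" are kept even though no visual uses them directly, because "Margin %" depends on them. Treat the output as a list of candidates to review with the dataset owner, not as something to delete blindly.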

Once your shared dataset is in place and different teams start using it, it often becomes the default source for the numbers everyone cares about. People begin asking, “Is this coming from the main model?” or “Is this the official measure?” When that starts happening, you’re moving beyond the shared dataset and into something with more weight and responsibility. That’s where the idea of a golden dataset really starts to make sense.

Where Do “Golden Datasets” Fit In?

If you’ve spent any time around Power BI conversations, you’ve probably heard people talk about “golden datasets.” The term sounds fancier than it really is. In most organisations, a golden dataset is simply a shared dataset that has grown into the official source of truth for a particular business area.

You don’t start by declaring something a golden dataset. It becomes one organically as people rely on it and it proves itself reliable over time. A dataset normally earns this tag when it meets these criteria:

  • Multiple reports and teams depend on it
  • Someone is clearly responsible for maintaining it
  • Key business measures are defined there
  • The organisation agrees it’s the trusted source

Once a dataset reaches that point, the business typically wants to protect it a bit more. That can mean:

  • Putting it in a dedicated “data” workspace
  • Restricting who can edit it
  • Endorsing it as promoted or certified in the Power BI Service
  • Documenting measures and definitions
  • Adding proper approval for changes

None of this has to be heavy governance. Even simple recognition helps people understand that this dataset isn’t just another report; it’s the official version of sales numbers, finance metrics, or customer data.

The general path looks like this:

  1. One big PBIX
  2. Split into model + thin reports
  3. The model is used more widely
  4. Model becomes trusted
  5. And it’s treated as a golden dataset

So a golden dataset isn’t something different; I see it as the natural progression of the shared model and thin reports setup. There are a few things to keep in mind during this transition:

  • Don't rush for the 'Golden' label-: The label will come naturally once your shared dataset is used in multiple thin reports and users trust the logic you have put in place.
  • Clear communication-: Communicate openly whenever KPI definitions change. All business users should be aware of such changes.

To wrap up today's blog, here are some final words of advice. A single PBIX file is a perfectly fine starting point. But once you find yourself copying the file for different teams, adjusting KPIs in multiple places, or worrying about whether a change will break another version, that’s usually the sign it’s time to separate the model from the report layer.

A shared dataset gives you one place to define business logic and one place to maintain it. Thin reports let each audience have its own view without turning into another model you need to look after. The result is cleaner updates, consistent numbers, and less time spent searching for the “right” version of a measure.

You don’t have to adopt all the enterprise features on day one. Most teams start with a simple split, get used to maintaining the model in one PBIX, and gradually build better habits around naming, ownership, security, and change management. Over time, the shared dataset naturally becomes something the organisation trusts, and that’s how many golden datasets are born.

If you’re already juggling several PBIX files that look suspiciously similar, you’ve probably outgrown the one-file approach. A shared dataset with thin reports isn’t a big architectural leap; it’s just the next sensible step.



