
Reference v/s Duplicate in Power BI

Finally, we are back with new blogs. The idea behind these blogs is to share a problem of the week that I faced and the solution to it. As a Business Intelligence Analyst, one of my major responsibilities is to design an optimized data model and avoid many-to-many relationships. The basic approach to this problem is to create bridge tables out of a big flat table and build one-to-many relationships in the process.

Bridge tables can be created in the Power Query Editor. There are mainly two ways to do it: reference the big flat table, or duplicate it. So what is the difference between duplicate and reference? Let's dig deeper. Both give you a new query based on the main table. A duplicate copies every step applied to the main table so far and is independent from then on, so later changes to the main table do not reach it. A reference copies no steps; it starts from the output of the main table, so any later change to the main table flows into it.

The reference query always points to the main table and does not copy any of the main query's applied steps. Let's see how it works in Power BI. I am using the Sample Superstore data. Right-click the main table in the Power Query Editor and you will see both Duplicate and Reference.


First, we create a reference table from the Orders table and, to avoid a many-to-many relationship, remove duplicates from the Segment column. I removed all the other columns because this will be the Segment table that I relate to the other tables.


If you do that, you will get a single column with only three rows. Now let's look at the M script and the query dependencies behind this reference. Both are on the View tab: Advanced Editor shows the M script and Query Dependencies shows the dependency diagram.
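For readers who want to try it, here is a minimal sketch of what the reference query's M script looks like in the Advanced Editor. The inline Orders table and its Order ID and Sales values are made-up stand-ins so the sketch runs in a blank query; in the real file, Orders is the separate main query and the step names may differ.

```powerquery
let
    // Stand-in for the main Orders query (made-up sample rows).
    // In the real file Orders is its own query and this step does not exist.
    Orders = #table(
        type table [#"Order ID" = text, Segment = text, Sales = number],
        {
            {"A-1", "Consumer", 100},
            {"A-2", "Corporate", 200},
            {"A-3", "Home Office", 300},
            {"A-4", "Consumer", 400}
        }
    ),
    // What Reference generates: one step that points at the main query
    Source = Orders,
    // Keep only the Segment column
    #"Removed Other Columns" = Table.SelectColumns(Source, {"Segment"}),
    // One row per segment, so the table can sit on the "one" side of a relationship
    #"Removed Duplicates" = Table.Distinct(#"Removed Other Columns")
in
    #"Removed Duplicates"
```

The thing to notice is the Source step: it is just the name of the main query, which is why the reference always follows whatever the main table returns.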




You can see the steps working in the background, and it is evident that the reference table always points to the main table. Now let's do the same thing with a duplicate. Follow the same steps, but select Duplicate instead of Reference. Let's look at the query dependencies and the M script for it.



As you can see from the query dependencies, a duplicate table doesn't point to the main table. But if you check its applied steps, every step that had been applied to the main table is visible there.
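As a rough sketch, the duplicate's M script looks something like this. The inline table stands in for the real source step (in the actual file this would be the connector call that loads the Superstore data), and the Changed Type step is just an assumed example of a step copied from the main query; the rows are made-up sample values.

```powerquery
let
    // Copied from the main query: the source step.
    // An inline table with made-up rows stands in for the real data source here.
    Source = #table(
        {"Order ID", "Segment", "Sales"},
        {
            {"A-1", "Consumer", 100},
            {"A-2", "Corporate", 200},
            {"A-3", "Home Office", 300},
            {"A-4", "Consumer", 400}
        }
    ),
    // Copied from the main query: an example transformation step
    #"Changed Type" = Table.TransformColumnTypes(
        Source,
        {{"Order ID", type text}, {"Segment", type text}, {"Sales", type number}}
    ),
    // Added in the duplicate only: keep the Segment column
    #"Removed Other Columns" = Table.SelectColumns(#"Changed Type", {"Segment"}),
    // Added in the duplicate only: one row per segment
    #"Removed Duplicates" = Table.Distinct(#"Removed Other Columns")
in
    #"Removed Duplicates"
```

Unlike the reference, nothing here mentions the main query by name. The duplicate carries its own copy of the source and transformation steps, which is why it shows no link in the query dependencies view.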



A duplicate table runs its own full copy of the steps, so it tends to need more processing than a reference table. The main question is when to use the duplicate option. Use it when you want a mirror image of the big flat table, with all the applied steps, that you can change without affecting the original. The purposes of duplicate and reference are quite different, and the right choice depends on what you want to achieve in the end.



Thanks for reading. Let's connect on LinkedIn. For more such blogs, do follow us.





