
Difference between Star schema and Snowflake schema

Are you aware of schemas in databases? Why do we need a schema? Let's answer these questions. A schema is essentially a structure or framework that keeps your data organized. Data alone doesn't make much sense until it is organized and structured in a way that represents its logic. In a data warehouse, fact tables and dimension tables together make up a schema. There are mainly three types of schema: Star Schema, Snowflake Schema, and Galaxy Schema. In this blog, we will point out the differences between the Star Schema and the Snowflake Schema. Basically, the Snowflake schema is an advanced (pro) version of the star schema.

What is a Star Schema? It's a basic structure where the fact table is placed at the center and is surrounded by dimension tables. It helps you separate your quantitative data (facts) from your qualitative data (dimensions). But why is it named star? To answer that, you need to look at the arrangement of the fact table and the dimension tables in this schema: together they form a star shape. The design of this schema is simple, so queries take very little time to execute. Since the design is simpler, it also has fewer foreign keys. On the other hand, this sort of schema has high data redundancy, which means the same data can exist in different places and can cause data inconsistency.
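To make the idea concrete, here is a minimal sketch of a star schema using Python's built-in sqlite3 module. The table and column names (fact_sales, dim_product, dim_date) are hypothetical and only for illustration; the point is one central fact table with one foreign key per surrounding dimension table.

```python
import sqlite3

# In-memory database just for illustration; table names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables: descriptive (qualitative) attributes.
cur.execute("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT     -- denormalized: category lives directly on the dimension
)""")
cur.execute("""
CREATE TABLE dim_date (
    date_id   INTEGER PRIMARY KEY,
    full_date TEXT,
    month     TEXT,
    year      INTEGER
)""")

# Fact table at the center: quantitative measures plus one foreign key per dimension.
cur.execute("""
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    quantity   INTEGER,
    amount     REAL
)""")

# A typical star-schema query: one join per dimension, no chains of joins.
cur.execute("""
SELECT p.category, SUM(f.amount) AS total_sales
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
GROUP BY p.category
""")
print(cur.fetchall())
```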



What is a Snowflake Schema? By now you can probably guess that this schema, too, is named after its shape. It is an advanced version of the star schema. It is a multidimensional model where the fact table sits at the center surrounded by dimension tables, but in this schema the dimension tables are further split into one or more sub-tables. The dimension tables keep splitting until the data is normalized. Data normalization? It is a fancy word for structuring data in a way that reduces redundancy. Due to the more complex nature of this schema, queries take longer to execute, and the number of foreign keys is higher. On the plus side, this schema has low data redundancy thanks to normalization, and splitting dimension tables into lookup tables can save a lot of storage.
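Here is a hedged sketch of what snowflaking the earlier (hypothetical) product dimension might look like: the repeating category text moves into its own lookup table, and dim_product keeps only a foreign key to it.

```python
import sqlite3

# Hypothetical snowflake version of the product dimension from the star
# example: the category attribute is normalized out into its own lookup
# table, which adds one more foreign key and one more join at query time.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("""
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT     -- stored once instead of repeating on every product row
)""")

cur.execute("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id  INTEGER REFERENCES dim_category(category_id)
)""")

cur.execute("""
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity   INTEGER,
    amount     REAL
)""")
```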



But the main question is: which schema should you use, and in which scenarios? If you need to report on a simpler dataset and execute basic queries, go for the star schema, but keep in mind that it doesn't handle many-to-many relationships well; it is much better suited to one-to-many relationships. In my experience, the star schema works very well with BI tools such as Tableau, and it is the one I have worked with the most. If the fact table has many relationships through its dimension tables, then the snowflake schema can come to your rescue. The snowflake schema is a much better choice for a normalized RDBMS design, but not for OLAP; for OLAP you want denormalized data, which is what a star schema gives you. Both schemas have their pros and cons, and which one you choose depends on your business requirements. The sketch below illustrates the query-complexity trade-off.
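As a rough illustration of that trade-off, compare the two aggregation queries below against the hypothetical tables sketched earlier: the star version reaches the category with a single join, while the snowflake version needs an extra join through the lookup table.

```python
# Hypothetical queries against the two designs sketched above.

# Star schema: category sits directly on dim_product, so one join is enough.
star_query = """
SELECT p.category, SUM(f.amount) AS total_sales
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
GROUP BY p.category
"""

# Snowflake schema: category lives in its own lookup table,
# so the same report needs one extra join.
snowflake_query = """
SELECT c.category_name, SUM(f.amount) AS total_sales
FROM fact_sales f
JOIN dim_product  p ON p.product_id  = f.product_id
JOIN dim_category c ON c.category_id = p.category_id
GROUP BY c.category_name
"""
```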



Thanks for reading! Let's connect on LinkedIn.



