Difference between Star Schema and Snowflake Schema

Do you know what a schema is in a database? Why do we need one? Let's answer these questions. At its simplest, a schema is a structure or framework that keeps your data organized. Data alone doesn't make much sense until it is organized and structured in a way that reflects its logic. In a data warehouse, the fact tables and dimension tables together make up the schema. There are three main types of schema: Star Schema, Snowflake Schema, and Galaxy Schema. In this blog, we will point out the differences between the Star Schema and the Snowflake Schema. Basically, the snowflake schema is an extension of the star schema in which the dimension tables are normalized.

What is a Star Schema? It's a basic structure where the fact table is placed at the center and is surrounded by dimension tables. It helps you separate your quantitative data (the measures in the fact table) from your qualitative data (the descriptive attributes in the dimension tables). But why is it named star? To answer that, look at how the fact and dimension tables are arranged: the dimension tables radiate out from the fact table, forming a star shape. The design of this schema is simple, and queries take relatively little time to execute because each dimension is only one join away from the fact table. Since the design is simpler, it also has fewer foreign keys. This sort of schema has high data redundancy, which means the same data can exist in more than one place, and that can lead to data inconsistency.
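To make the star shape concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), the columns, and the sample rows are all made up for illustration; they are not taken from any real warehouse.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT      -- stored as text on every product row (redundant)
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER,
    month   INTEGER
);
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    quantity   INTEGER,    -- quantitative measure
    amount     REAL        -- quantitative measure
);
INSERT INTO dim_product VALUES
    (1, 'Laptop', 'Electronics'),
    (2, 'Phone',  'Electronics'),
    (3, 'Desk',   'Furniture');
INSERT INTO dim_date VALUES (1, 2023, 1), (2, 2023, 2);
INSERT INTO fact_sales VALUES
    (1, 1, 1, 2, 2000.0),
    (2, 2, 1, 1, 500.0),
    (3, 3, 2, 4, 800.0);
""")

# Sales per category: a single join from the fact table to one dimension.
star_rows = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(star_rows)  # [('Electronics', 2500.0), ('Furniture', 800.0)]
```

Notice that 'Electronics' is stored twice in dim_product. That is the redundancy mentioned above: if someone updates the category text on one row and not the other, the data becomes inconsistent.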



What is a Snowflake Schema? By now you can probably guess that this schema is also named after its shape. It is an extension of the star schema. It is a multidimensional model where the fact table sits at the center and is surrounded by dimension tables, but in this schema the dimension tables are themselves split into one or more further tables. The dimension tables keep being split until the data is normalized. Data normalization? It is a fancy term for structuring data so that each piece of information is stored only once, which reduces data redundancy. Because of this more complex structure, queries need more joins and usually take longer to execute. This schema also has more foreign keys. It has low data redundancy thanks to normalization, and splitting dimension tables into lookup tables can save a good amount of storage.
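Here is a minimal sketch of what that splitting looks like, again using Python's built-in sqlite3 module. The names (dim_category, dim_product, fact_sales) and the sample rows are hypothetical and only meant to illustrate the idea: the category text is moved out of the product dimension into its own lookup table.

```python
import sqlite3

# Hypothetical snowflake schema: the product dimension is normalized
# into a separate category lookup table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT     -- each category name is stored exactly once
);
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id  INTEGER REFERENCES dim_category(category_id)  -- extra foreign key
);
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_category VALUES (1, 'Electronics'), (2, 'Furniture');
INSERT INTO dim_product VALUES (1, 'Laptop', 1), (2, 'Phone', 1), (3, 'Desk', 2);
INSERT INTO fact_sales VALUES (1, 1, 2000.0), (2, 2, 500.0), (3, 3, 800.0);
""")

# Sales per category now needs two joins: fact -> product -> category.
snowflake_rows = conn.execute("""
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product  p ON p.product_id  = f.product_id
    JOIN dim_category c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()

# Number of rows in the lookup table: one per distinct category.
category_rows = conn.execute("SELECT COUNT(*) FROM dim_category").fetchone()[0]

print(snowflake_rows)  # [('Electronics', 2500.0), ('Furniture', 800.0)]
print(category_rows)   # 2
```

The trade-off shows up even in this toy example: each category name lives in one place, so there is less redundancy, but answering a simple question takes an extra join and an extra foreign key.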



But the main question is which schema to use, and under what scenarios? If you need to report on a simpler dataset and run basic queries, go for the star schema. Keep in mind that it does not handle many-to-many relationships well; it is much better suited to one-to-many relationships. In my experience, the star schema works very well with BI tools such as Tableau, and it is the schema I have worked with more. If the fact table has many relationships with the dimension tables, the snowflake schema can come to your rescue. The snowflake schema is a better fit for a normalized relational database (RDBMS) than for OLAP. For OLAP you generally want denormalized data, which is what the star schema gives you. Both schemas have their pros and cons, and the right choice depends on your business requirements.






