I am really sorry for the late reply. Snowflake doesn't have redundancy because it could have many layer of the dimension, but in Star schema maintenance may appear simple at the beginning, but the larger data warehouse you need to maintain, the harder it becomes (due to data redundancy). For example, let's assume you have product dimension and product settings connected to this dimension. If you have any update (or load issue) in the settings it will not affect the main dimension or the connected facts with this dimension. However, if you have one dimension Product and you need to add many settings or update/delete the dimension you will need to update all the facts connected to this dimension. Assume the following case We have fact table sales -> connected to dimension name "Product" this dimension contains one level includes the "Product_Settings". After you loaded the dimension and add the product_id into the sales fact table. You releaised that the table had an issue while loading "Product_Settings". In this case you have assume to reload the whole dimension again "in some cases delete the current and load the new data" then update the previous ids (which had an error) with the new ids. In case of Snowflake, you will just reload the "Product_settings" without touching the fact table. Please let me know if this clear or not.
Please Eng. Alaa, it would be more than great if you can give a real life example first to illustrate the concepts before moving into a new stage, I suggest that to better link and digest the content, otherwise all info will be lost and the confusion arise. Or try to put some reference for a demo data to apply and practice with guidelines how to implement those concepts and what are the best practices.
Hi Moshera, Every lecture has its style of examples. For example, in this lecture, we need to clarify different designs for the schema style. However, the use-case will be the same example for any facts or dimensions explained before. The main concept here is to clarify the dimension design difference between the Star and Snowflake and show the differences and the use case that we need to use for each type. I also agree with you; we need to provide more examples and use cases. We currently plan to do these sessions soon. Please keep following the next videos, and we will give additional videos for the parts which require more examples, and the schema types will be one of these lectures. I also need to share with you that the course is somehow long and explain different topics and lots of topics. We are trying to set some boundaries (we mean here the idea between illustrate every point and the main points), we are trying to balance between the deep dive into details and explain the core concepts and let the audience continue from the references. Also, this course has another appendix lecture which will include some more and advanced topic, but this lecture is the initial core concepts for everyone needs to start in the DWH. The last point, for each lecture, we provide the lecture link, and it includes the lecture notes and further reading. Please let me know if I understand and answer your question correctly, or you need additional info.
شرح راائع ومحتوي منظم ومفيد جداا ياريت تستمر في السلسه دي بالفيدوهات للاخر ان شاء الله هتبقي مرجع عربي قوي جداا بالتوفيق لكل فريق العمل
شكراً ليكي وجزاك الله خيرا
شششكراً من القلب على الشرح الموجز الجميل و الامثلة الواضحة
لو سمحت يا باشمهندس هو فيه شرح لل fact types والgranularity
الشرح ممتاز ما شاء الله 🌹
اه في شرح لل Fact type’s & dimensions
زى ايه ال tools ال بتسخدم مع ال star schema والتولز ال بتسخدم مع snowflake schema
can you explain why the maintenance in the SnowFlake is easy ???
I am really sorry for the late reply.
Snowflake doesn't have redundancy because it could have many layer of the dimension, but in Star schema maintenance may appear simple at the beginning, but the larger data warehouse you need to maintain, the harder it becomes (due to data redundancy).
For example, let's assume you have product dimension and product settings connected to this dimension. If you have any update (or load issue) in the settings it will not affect the main dimension or the connected facts with this dimension. However, if you have one dimension Product and you need to add many settings or update/delete the dimension you will need to update all the facts connected to this dimension.
Assume the following case
We have fact table sales -> connected to dimension name "Product" this dimension contains one level includes the "Product_Settings". After you loaded the dimension and add the product_id into the sales fact table. You releaised that the table had an issue while loading "Product_Settings". In this case you have assume to reload the whole dimension again "in some cases delete the current and load the new data" then update the previous ids (which had an error) with the new ids.
In case of Snowflake, you will just reload the "Product_settings" without touching the fact table. Please let me know if this clear or not.
the title of slide 13 should be snowflake schema characteristics
thanks
Please Eng. Alaa, it would be more than great if you can give a real life example first to illustrate the concepts before moving into a new stage, I suggest that to better link and digest the content, otherwise all info will be lost and the confusion arise. Or try to put some reference for a demo data to apply and practice with guidelines how to implement those concepts and what are the best practices.
Hi Moshera,
Every lecture has its style of examples. For example, in this lecture, we need to clarify different designs for the schema style. However, the use-case will be the same example for any facts or dimensions explained before. The main concept here is to clarify the dimension design difference between the Star and Snowflake and show the differences and the use case that we need to use for each type.
I also agree with you; we need to provide more examples and use cases. We currently plan to do these sessions soon. Please keep following the next videos, and we will give additional videos for the parts which require more examples, and the schema types will be one of these lectures.
I also need to share with you that the course is somehow long and explain different topics and lots of topics. We are trying to set some boundaries (we mean here the idea between illustrate every point and the main points), we are trying to balance between the deep dive into details and explain the core concepts and let the audience continue from the references.
Also, this course has another appendix lecture which will include some more and advanced topic, but this lecture is the initial core concepts for everyone needs to start in the DWH.
The last point, for each lecture, we provide the lecture link, and it includes the lecture notes and further reading.
Please let me know if I understand and answer your question correctly, or you need additional info.