Option 4 works best for microservices where each service stores its own data. They are essentially domain boundaries. It also means that you can change one without changing all of them if you need to, because tight coupling is usually worse than repeating yourself. Just for the sake of argument, what if you wanted to put a video timestamp on a video comment, but not on a post comment, or you want to be able to tag a region of the photo with a photo comment.
+1, and if you want to run analytics just pipe these data through an ETL/ELT process and run aggregations on another data warehouse with designs suited for OLAP, so having more tables on OLTP database is not really a problem - just need some schemas and a good naming practice
I would drop the photo and video tables and add a mediaUrl column to the post table. The url can be a link to a photo, a video, or be null. I would let the application handle that. I'm not using the database to do anything with the media and I'm certainly not storing any media as blobs. The post can have a one-to-many relationship with the comment table where one post has many comments. Love your content and presentation. Subscribed.
One caveat is thinking about how you relate to the "parent" table. For instance, if you have a many-many self-relation on the parent table, then you need to choose Option 3 otherwise you can overcomplicate trying to model the various types of self-relations.
Similar challenge earlier this year. Used technique virtually the same as option 3. Would have saved time and several brain cells if I had seen this first!
Some of these approchaes are implemented in some ORMs, like e.g. SQLAlchemy which supports three forms of inheritance: - single table inheritance - several types of classes are represented by a single table; - concrete table inheritance - each type of class is represented by independent tables; - joined table inheritance - the class hierarchy is broken up among dependent tables. Each class represented by its own table that only includes those attributes local to that class. Additionally, Django supports what they call "Generic Foreign Keys" and corresponds to your last solution.
I would like choose option1 then add simple checking constraint without going write store procedure to meet the requirment becuase storing comment in one place it is helpfull for reply and mentioning feature to incorporate it at one place and for performance optimization we can use view table
Thanks for sharing what you would do, that's good to know. I'm not sure you can add a check constraint to ensure that only one of the columns is populated, but I could be wrong.
I saied option 4 even before you display it caue it will reduce search time if you fetch only one of two type even maybe all types and offers independency
Call me crazy but I had to use the 5th method in a project where I had attachments table related to multiple content types (more content types still undetermined while developing the app). In the attachment table I added a reference_type and a reference_id Thanks Ben!
Surely there is another option - each content type gets its own table, comments are a single table, and there is a link table which relates a comment to one of the content tables. Similar to option 1/2 except the link table only needs one column per object type. You still have to enforce some of the rules like ensuring only one FK gets set but that could be a row constraint.
That could work. It sounds like you would have the same problem as option 1, with the link table having "one column per object type", and you would have to ensure that only one of the columns are populated.
If any of the approach stores a blank column even if it is a number then I am avoiding that. So option 1 and 2 are not for me. Option 3 is good but another thing I dont want is too much joins. So the best one for me would be Option 4.
I would prefer to go option 3 as it meets requirement for referential integrity as well as provide options to consolidate data with less joins to multiple table ..in option 4 in some cases I may have to put union all clause to get everything ..but option 3 i may not have to
What if you have multiple possible parent tables? Table1 (parent) 1:n Table2 (optional) 1:n Table3 (optional) 1:n Table4 (optional) Could be 1>2>3>4 but also 1>3>4 or 1>4 etc. Entries are also added and deleted so 1>2>3>4 can become 1>2>4. Seems impossible for me to create a proper view for this. 😅
Could you try a single table for all of these records, and then have a parent ID column that points to another record in the same table? It’s called a self join if you wanted to know more about it. But it only works if the records are the same type of record.
@DatabaseStar unfortunately the four tables store vastly different types of information. I guess my only option is to write a query for each possible combination, manually declaring null for the rows of the tables that arent included in the case and then union them together. Thx anyways. Real quality channel 😊👍
I think it's more of an intermediate problem. If you're new to SQL, you probably won't need to know this or understand all of this video. But you'll likely come across this problem in your project or career at some point.
Option 4 works best for microservices where each service stores its own data. They are essentially domain boundaries.
It also means that you can change one without changing all of them if you need to, because tight coupling is usually worse than repeating yourself. Just for the sake of argument, what if you wanted to put a video timestamp on a video comment, but not on a post comment, or you want to be able to tag a region of the photo with a photo comment.
Thanks for sharing, that's a good point and a good example.
Really enjoying your series of videos. Thanks.
Glad you like the videos!
+1 on option 4; less headache in the long run
Thanks for sharing!
+1, and if you want to run analytics just pipe these data through an ETL/ELT process and run aggregations on another data warehouse with designs suited for OLAP, so having more tables on OLTP database is not really a problem - just need some schemas and a good naming practice
I would drop the photo and video tables and add a mediaUrl column to the post table. The url can be a link to a photo, a video, or be null. I would let the application handle that. I'm not using the database to do anything with the media and I'm certainly not storing any media as blobs. The post can have a one-to-many relationship with the comment table where one post has many comments. Love your content and presentation. Subscribed.
Good idea, that could work, essentially combining all of the content into a single table. Thanks for sharing! And thanks for subscribing.
One caveat is thinking about how you relate to the "parent" table. For instance, if you have a many-many self-relation on the parent table, then you need to choose Option 3 otherwise you can overcomplicate trying to model the various types of self-relations.
That's a good point! Thanks for sharing.
Similar challenge earlier this year. Used technique virtually the same as option 3. Would have saved time and several brain cells if I had seen this first!
Good to hear one of these solutions was something you had used!
Some of these approchaes are implemented in some ORMs, like e.g. SQLAlchemy which supports three forms of inheritance:
- single table inheritance - several types of classes are represented by a single table;
- concrete table inheritance - each type of class is represented by independent tables;
- joined table inheritance - the class hierarchy is broken up among dependent tables. Each class represented by its own table that only includes those attributes local to that class.
Additionally, Django supports what they call "Generic Foreign Keys" and corresponds to your last solution.
That's good to know how ORMs handle it. I'm not surprised - application code can handle this much easier than databases.
I would like choose option1 then add simple checking constraint without going write store procedure to meet the requirment becuase storing comment in one place it is helpfull for reply and mentioning feature to incorporate it at one place and for performance optimization we can use view table
Thanks for sharing what you would do, that's good to know. I'm not sure you can add a check constraint to ensure that only one of the columns is populated, but I could be wrong.
You absolute can (ensure it via a check constraints) and it’s the most sane way to do it.
I saied option 4 even before you display it caue it will reduce search time if you fetch only one of two type even maybe all types and offers independency
True!
Call me crazy but I had to use the 5th method in a project where I had attachments table related to multiple content types (more content types still undetermined while developing the app).
In the attachment table I added a reference_type and a reference_id
Thanks Ben!
Good to know it worked for you!
@@DatabaseStar Well, I wasn't fully satisfied but yes it fulfilled the task
In software engineering, it's always KISS - Keep It Simply Stupid designs that are mostly efficient and easier to maintain.
That's a good principle to keep in mind.
Surely there is another option - each content type gets its own table, comments are a single table, and there is a link table which relates a comment to one of the content tables. Similar to option 1/2 except the link table only needs one column per object type. You still have to enforce some of the rules like ensuring only one FK gets set but that could be a row constraint.
That could work. It sounds like you would have the same problem as option 1, with the link table having "one column per object type", and you would have to ensure that only one of the columns are populated.
If any of the approach stores a blank column even if it is a number then I am avoiding that. So option 1 and 2 are not for me. Option 3 is good but another thing I dont want is too much joins. So the best one for me would be Option 4.
That makes sense, thanks for sharing!
I would prefer to go option 3 as it meets requirement for referential integrity as well as provide options to consolidate data with less joins to multiple table ..in option 4 in some cases I may have to put union all clause to get everything ..but option 3 i may not have to
That’s true. Thanks for sharing.
Amazing video
But What about Morph way?
I think its the best in this case
What do you mean by "morph way"?
@@DatabaseStar i think they refer to laravel's morph types
Option 5 is the polymorphic method.
What if you have multiple possible parent tables?
Table1 (parent) 1:n Table2 (optional) 1:n Table3 (optional) 1:n Table4 (optional)
Could be 1>2>3>4 but also 1>3>4 or 1>4 etc. Entries are also added and deleted so 1>2>3>4 can become 1>2>4. Seems impossible for me to create a proper view for this. 😅
Could you try a single table for all of these records, and then have a parent ID column that points to another record in the same table? It’s called a self join if you wanted to know more about it. But it only works if the records are the same type of record.
@DatabaseStar unfortunately the four tables store vastly different types of information. I guess my only option is to write a query for each possible combination, manually declaring null for the rows of the tables that arent included in the case and then union them together. Thx anyways. Real quality channel 😊👍
I like option 3.
That's good to know
vote for option 5 with no referential integrity whatsoever and actually deleting promptly comments, as they're mostly polluting. Including this one :(
Haha good one! Thanks for sharing :)
Should an SQL beginner even look at this video?
From the commentator's profile photos, this is the stuff of legends, I reckon some of them wrote SQL 1.0 😅😂😂😂
I think it's more of an intermediate problem. If you're new to SQL, you probably won't need to know this or understand all of this video. But you'll likely come across this problem in your project or career at some point.
@KenyanNetHunter - haha thanks! I'm not that old but I'm glad you like the content.