Want to improve your database design skills? Get my Database Design project Guides here (diagrams, explanations, and SQL scripts): www.databasestar.com/dbdesign/?
Thanks for this video, I really like your clear voice and concise explanations.
No problem! Glad it was helpful.
In my project, I added a CustomFieldDefinitions table and a CustomFields table for the Orders table.
My idea was to get a clean structure. It proved to be a bad choice when I tried to implement filtering of Orders and wanted CustomFields to be included in the filter.
I used Entity Framework, and the idea was to compile a structured query-string argument into a query. But you cannot flatten all custom fields per Order into columns that the ORM can handle.
My next attempt will be with a JSON field.
Thanks for sharing. Sometimes designs can work at one point but later on they can be hard to work with.
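For readers who want to see the filtering problem concretely, here is a minimal sketch using Python's built-in sqlite3 module rather than Entity Framework. The table names follow the comment above; the column names, the sample data, and the `filter_orders` helper are assumptions made up for illustration. Instead of flattening custom fields into columns, it adds one EXISTS subquery per filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
CREATE TABLE CustomFieldDefinitions (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE CustomFields (
    order_id INTEGER NOT NULL REFERENCES Orders(id),
    definition_id INTEGER NOT NULL REFERENCES CustomFieldDefinitions(id),
    value TEXT,
    PRIMARY KEY (order_id, definition_id)
);
INSERT INTO Orders VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
INSERT INTO CustomFieldDefinitions VALUES (1, 'priority'), (2, 'region');
INSERT INTO CustomFields VALUES
    (1, 1, 'high'), (1, 2, 'EU'),
    (2, 1, 'low'),  (2, 2, 'EU'),
    (3, 1, 'high'), (3, 2, 'US');
""")


def filter_orders(conn, filters):
    """Return ids of orders that match every (field name -> value) pair."""
    sql = "SELECT o.id FROM Orders o WHERE 1 = 1"
    params = []
    for name, value in filters.items():
        # One EXISTS clause per custom field; values are bound, never
        # concatenated into the SQL text.
        sql += """
          AND EXISTS (
            SELECT 1 FROM CustomFields cf
            JOIN CustomFieldDefinitions d ON d.id = cf.definition_id
            WHERE cf.order_id = o.id AND d.name = ? AND cf.value = ?
          )"""
        params += [name, value]
    sql += " ORDER BY o.id"
    return [row[0] for row in conn.execute(sql, params)]


matching = filter_orders(conn, {"priority": "high", "region": "EU"})
```

Every extra filter adds another subquery, which hints at why this structure gets awkward to drive from an ORM.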
I prefer 3NF Normalized Tables (6th in the video) and the JSON (7th) for my large scale app’s database design.
Thanks for the video. :)
Awesome, good to hear!
JSON support in most major DBMSes has come a long way, but if your data is highly dynamic to begin with, you are probably bringing a knife to a gun fight and should consider a NoSQL database for your application.
Good point. Using JSON data for custom fields, in a NoSQL database, could be the way to go.
I need ACID, but also need custom fields. What do I do? Have an id column in SQL that points to a document in MongoDB. Two databases?
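One option raised elsewhere in this thread is a JSON column inside the relational database, which keeps everything in one transactional store. The sketch below is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module, assumes the SQLite build includes the JSON functions (`json_extract`), and the `orders` table, its columns, and the sample data are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        total REAL NOT NULL,
        custom_fields TEXT NOT NULL DEFAULT '{}'
    )
""")

insert = "INSERT INTO orders (id, total, custom_fields) VALUES (?, ?, ?)"

# "with conn" commits on success and rolls back if an exception escapes.
with conn:
    conn.execute(insert, (1, 99.5, json.dumps({"region": "EU", "gift_wrap": True})))
    conn.execute(insert, (2, 12.0, json.dumps({"region": "US"})))
    conn.execute(insert, (3, 40.0, json.dumps({"region": "EU"})))

# Filter on a custom field stored inside the JSON document.
eu_orders = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM orders "
        "WHERE json_extract(custom_fields, '$.region') = ? ORDER BY id",
        ("EU",),
    )
]

# A failing statement undoes the whole transaction, custom fields included.
try:
    with conn:
        conn.execute(insert, (4, 5.0, json.dumps({"region": "EU"})))
        conn.execute(insert, (1, 7.0, "{}"))  # duplicate primary key
except sqlite3.IntegrityError:
    pass

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Order 4 is not kept because the second insert in the same transaction fails, so the regular columns and the custom fields stay consistent with each other.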
Data integrity can be maintained by creating a value table for each data type, storing attributes and values in different tables. This makes queries really complex, but that can be solved by introducing a flat table, some sort of read model, that can be read with a single SELECT query.
Yeah I can see how that can work, thanks for sharing.
Working with normalized (#6) all the time. JSON sometimes too (#7), but now I have to switch my articles to EAV.
Thanks for sharing, good to know!
For #3 the video missed the most important con: in a multi-tenant product, each customer could change the basic database schema with a click. That shuts down the solution right there. (If your situation is not multi-tenant, then you probably don't even need to worry about custom fields.)
Edit: and it's thousands or tens of thousands of columns, not 50-200.
That's a good point, if there are many customers then the number of custom fields would increase a lot. This is something to consider.
I would prefer #1 EAV, with #7 JSON as an alternative. Thanks for sharing.
EAV is bad; you chose the worst options. Normalized is best.
Thanks for sharing!
Yeah generally normalised is better
@DatabaseStar What if you need to create custom attributes for a certain class in your application? The normalized option obliges you to modify your app, creating a new class and a new table in the database, in order to have a new type of attribute. So in that case, using some sort of EAV solution is better, right?
I think the Modified EAV could be modified even more to be a bit more useful. Instead of a single customer_attributes table, why not three tables for the different data types (customer_string_attribute, customer_date_attribute, customer_number_attribute)? This allows for better data validation compared to EAV and removes the risk of multiple columns being populated. It does add more complexity when trying to query all attributes.
Good point! I think that can work as well.
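The three-table idea above could look roughly like the sketch below, written against SQLite via Python's sqlite3 module. The three attribute table names come from the comment; the `customer` table, the column names, the CHECK constraints, the `customer_all_attributes` view, and the `try_insert_number` helper are all assumptions for illustration. The view is one way to soften the "query all attributes" complexity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE customer_string_attribute (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    attribute_name TEXT NOT NULL,
    value TEXT NOT NULL,
    PRIMARY KEY (customer_id, attribute_name)
);
CREATE TABLE customer_number_attribute (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    attribute_name TEXT NOT NULL,
    value NUMERIC NOT NULL CHECK (typeof(value) IN ('integer', 'real')),
    PRIMARY KEY (customer_id, attribute_name)
);
CREATE TABLE customer_date_attribute (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    attribute_name TEXT NOT NULL,
    -- Only accept values already in YYYY-MM-DD form.
    value TEXT NOT NULL CHECK (value IS date(value)),
    PRIMARY KEY (customer_id, attribute_name)
);
-- One place to read every attribute, with values rendered as text.
CREATE VIEW customer_all_attributes AS
    SELECT customer_id, attribute_name, 'string' AS value_type, value
      FROM customer_string_attribute
    UNION ALL
    SELECT customer_id, attribute_name, 'number', CAST(value AS TEXT)
      FROM customer_number_attribute
    UNION ALL
    SELECT customer_id, attribute_name, 'date', value
      FROM customer_date_attribute;

INSERT INTO customer VALUES (1, 'Ada');
INSERT INTO customer_string_attribute VALUES (1, 'nickname', 'Countess');
INSERT INTO customer_number_attribute VALUES (1, 'credit_limit', 5000);
INSERT INTO customer_date_attribute VALUES (1, 'signup_date', '2024-01-15');
""")


def try_insert_number(conn, customer_id, name, value):
    """Insert a numeric attribute; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO customer_number_attribute VALUES (?, ?, ?)",
                (customer_id, name, value),
            )
        return True
    except sqlite3.IntegrityError:
        return False


text_was_accepted = try_insert_number(conn, 1, "discount_rate", "abc")

all_attributes = conn.execute(
    "SELECT attribute_name, value_type, value FROM customer_all_attributes "
    "WHERE customer_id = ? ORDER BY attribute_name",
    (1,),
).fetchall()
```

Each table can validate its own type, and a row can never have more than one value column populated, at the cost of a UNION whenever everything is read together.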
The Magento CMS indexing mechanism solves the entity-attribute-value performance problem, and the same approach could be applied to WordPress to solve its performance issues. I think NoSQL can be used alongside SQL: NoSQL for indexing large public data (e.g. e-commerce product data), while confidential records like orders and stock are stored in SQL for more secure storage.
That's a good point, good to see it can help for those systems.
@DatabaseStar This is a bit difficult to understand without any table examples of what it looks like. Is there a blog post where you have explained it with an example? Thanks in advance :)
Thanks for the feedback! Yeah, I've heard that before about this video and some others, so I try to add examples into some of my recent videos. I don't have any posts that have examples, unfortunately, but I can create one in the future.
Nice video! Anybody know what "PK" and "RK" refers to in these diagrams? This is the only thing I'm missing!
primary key
Ah, PK = Primary Key (the unique identifier for the row), and FK = Foreign Key (a reference to a Primary Key in another table)
I didn't think of the JSON (or potentially XML??? - is that a thing too?) version until you said it....
Good to know! Yeah XML could work but I don't see it used very often.
JSON is the best solution for dynamic fields; even big companies use it in their APIs.
That’s a good point!
Great video
Thanks!
Only a few of these options support custom fields and none of these support defining a custom field(column) (which would be different for each tenant) to which each row would have a value.
What do you mean by "only a few support custom fields"? And what do you mean by "none of these support defining a custom field"?
Each of them allows users to define custom fields, where the users can determine what information can be captured for a record.
If there's a need to have a new column where each row has a value, then this would be more like adding a new column to a table, and would be done with an Alter Table statement.
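As a small illustration of the Alter Table route mentioned above, here is a sketch in SQLite via Python's sqlite3 module. The `orders` table and the `delivery_notes` column are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL NOT NULL)")
conn.execute("INSERT INTO orders VALUES (1, 20.0), (2, 35.5)")

# The new column applies to every row; existing rows start out as NULL.
conn.execute("ALTER TABLE orders ADD COLUMN delivery_notes TEXT")
conn.execute(
    "UPDATE orders SET delivery_notes = ? WHERE id = ?", ("Leave at door", 1)
)

rows = conn.execute(
    "SELECT id, delivery_notes FROM orders ORDER BY id"
).fetchall()
```

Unlike the custom-field designs, this changes the schema for everyone who shares the table.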
I know very little about DBs (at least not the deep parts), but Dynamic Schema sounds dangerous even just hearing about it in this video!
Yeah it is a bit risky!
I've had much success with EAV using:
* (1) MSSQL and the Variant (sql_variant) type, so that you only need one value table (or column).
* (2) Strongly typed, optimized, hand-written SQL stored procedures (middle-tier ORM-generated SQL is not always a good idea for reporting queries).
* (3) A dynamic PIVOT stored procedure that generates a pivoted 'flat' table for reporting. Legally inject parameters (the list of required attributes) from the reporting user interface into the dynamic SQL PIVOT stored procedure to generate the pivoted flat table. I used COUNT and/or MAX as the PIVOT aggregate function.
* (4) Optimized indexes. You actually can index the tables easily.
Also, you don't really need to cast or convert values, since the very nature of the query limits the types (attributes) required, and the Variant data type is sent straight up to the user interface's reporting visualization elements. These alone solved my issues. I seeded the model (the value table) with millions of records and the performance is amazingly fast! As a bonus, you can further 'cube' the generated pivoted flat table to generate further summaries and statistics before presentation.
Great tip! That approach sounds good and it's good to hear it works for you.
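The pivot step described above can be sketched outside SQL Server too. The example below is not the commenter's stored procedure: it swaps in SQLite (via Python's sqlite3 module), which has no PIVOT operator, so it uses MAX(CASE ...) conditional aggregation instead. An untyped `value` column stands in loosely for the Variant type, and the `attribute` and `entity_value` tables, the index, and the `pivot` function are assumed names. Requested attribute names are checked against the attribute table before any SQL is built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- "value" has no declared type, so SQLite stores each value as given.
CREATE TABLE entity_value (
    entity_id INTEGER NOT NULL,
    attribute_id INTEGER NOT NULL REFERENCES attribute(id),
    value,
    PRIMARY KEY (entity_id, attribute_id)
);
CREATE INDEX idx_entity_value_attr ON entity_value (attribute_id, entity_id);
INSERT INTO attribute VALUES (1, 'color'), (2, 'weight_kg'), (3, 'in_stock');
INSERT INTO entity_value VALUES
    (1, 1, 'red'),  (1, 2, 1.5), (1, 3, 1),
    (2, 1, 'blue'), (2, 2, 3);
""")


def pivot(conn, requested):
    """Return one flat dict per entity with a key per requested attribute."""
    if not requested:
        raise ValueError("at least one attribute is required")
    known = {name: attr_id
             for attr_id, name in conn.execute("SELECT id, name FROM attribute")}
    unknown = [name for name in requested if name not in known]
    if unknown:
        raise ValueError(f"unknown attributes: {unknown}")
    # Only generated aliases (c0, c1, ...) reach the SQL text; attribute
    # ids are passed as bound parameters.
    cols = ", ".join(
        f"MAX(CASE WHEN attribute_id = ? THEN value END) AS c{i}"
        for i in range(len(requested))
    )
    sql = (f"SELECT entity_id, {cols} FROM entity_value "
           "GROUP BY entity_id ORDER BY entity_id")
    params = [known[name] for name in requested]
    return [
        {"entity_id": row[0], **dict(zip(requested, row[1:]))}
        for row in conn.execute(sql, params)
    ]


flat = pivot(conn, ["color", "weight_kg"])
```

Entities that lack a requested attribute simply get None for that key, which mirrors the NULLs a pivoted reporting table would contain.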
Very helpful for me
That’s great!
Dynamic Design looks 😬 due to the risk of columns changing on a live server. It needs strong validation.
I prefer JSON, EAV structure.
Thanks! Yeah there are some risks for this approach.
While I enjoyed the video, I am none the wiser as to which option is best or better. Maybe a list ranked from "don't do" to "best option" would be good.
Good idea, thanks for the feedback.
Or just use a NoSQL DB
Yeah that is an option!
👍👍
Thanks!
Normalisation
Yeah that is one approach
I think the 1st solution was much better
Oh thanks, that’s good to know
@DatabaseStar currently watching your "7 database design mistakes to avoid" video👍
I think you oversimplified the EAV by A LOT.
Yeah I did simplify it for the video, it can get out of hand pretty easily. What else would you add for EAV?
@DatabaseStar Take some notes from this video: ua-cam.com/video/WneHTRZVbec/v-deo.html