Dear Database Star, thank you for this amazing well explained video. One suggestion I would make for the problem of a location is the concept of Country. Not always is clear the political definition of the place that is being addressed. With this in mind, one internationally well accepted solution is the one adopted by the World Health Organization (WHO) which adopts a self-related table of Administrative Political Divisions. Each of these will have a property (varchar or something mappable to an ENUM) that defines its type (e.g., country, state, city, street, etc.). For instance, one big mistake is to assume that an administrative political division as the United Arab Emirates is a country. Further more, as you well, pointed out in this video, not every country has state subdivision. Therefore, the WHO's approach sanates all these problems.
Another concern is when you want to app to do reverse geocoding. When you create a geocoding request, you have to populate each property with the correct data. It's much easier when you separate all the address data as you described.
Good question! That's something you can do to make the list more defined. I chose not to do it because the list could be quite long and I'm not sure if a full list could be collated (list every city/state/province in the world). But it is possible and a good option.
@@DatabaseStar Do you know of any data source, Json or csv that contain all these different cities, states, provinces and so on around the world or would be better to rely on an address API and then enter in the database based on the response from it?
What if this database needs to handle high load? All these joins will slow it down, especially many to many. So I guess denormolized(intro design) is better in this case?
Good question. Yes, having things in a single table (denormalised) will likely improve query performance if there are a lot of SELECT queries happening (aka high load). But the tradeoff is that it's harder to update. It's worth considering.
Thanks for the great video, quick question why have a many to many relation between the customer and the address instead of doing a one to many relation where if a user is deleted, or a user deletes an old address the user_id would be set to null.
Good question. The reason for a many-to-many is that an address exists even though a user does not live there. If a user changes their address, this could be handled by inserting a new record for address and removing the relationship to the old address. This would allow historic data to be determined (e.g. a history of their addresses, customer orders, etc). It also allows for multiple users to have a single address (e.g. two people living in the same house have individual accounts for an online shopping website). A one-to-many relationship can still work, with the points you mention though.
I would like to see videos for the other addressing designs you mentioned such as ip address, names, profanity words and emails including how to validate them.
Hi, thanks for the great video. I'm in a case where i have 2 address for one entity (typically a start_address and end_address). How should i handle it ?
In most e-commerce setups, the address is typically linked to other details like the receiver's name and phone number. How could we incorporate that into this design, or is it even possible?
You could connect it to the user table in the same way. Or have a kind of "user_contact" table that connects both addresses and users, and also connects phone numbers.
Hey Ben, you have been tremendous help. How would you model the data in a situation where you have 2 entities such as "employee" and "project". Where the project table would only need attributes such as "city", "region". Whereas the employee would need the attributes listed in the video. Would having separate address entities for employee and project be redundant or a good practice? Thanks again!
Good question. You could do it a few ways, but my initial thoughts are to have a full "address" table to link to the employee table, as that's what an employee needs. A project could have a related "city" table, which would just have information about city and region, like a cut-down version of the address table. You could relate the address table to this city table, so the cities are not stored in two places.
What happens when a user updates their address (like a typo) ? The solution that comes to my mind is: - If the address is only associated to the user who created it, the address field could be updated - If the address is associated to two or more users, then create a new address field But the problem with modifying existing address fields is if it's already associated to an order, so in that case it's better to just leave "incorrect" addresses rather than correct them?
What about normalization and redundant data ? I have some doubts about completely normalize the database. What do you think ? This example it is not normalized. But to normalize, a lot of tables should be created: one to country, another to the state, city, neighborhood, street . . . 6 tables at least. To request a complete address, a big join should be done with the person table, for example, and those 6 tables . . . In theory is the right thing to do. But in pratice ... I don't know.
Good question. This is a common concern with desigining databases. In short, there's a balance between creating extra tables to reduce redundancy (tables for country, state, city, etc as you mentioned) and having them all in one table. I think we would need to consider the user interface and the types of queries that would be run. E.g. if a country is selected from a drop-down by a user, a list of countries in a separate table would make sense. If the user manually types in the city or state, then perhaps this does not need a separate table. But if they select it from a list, a separate table may be needed. If the application runs queries to get the full address, then you would need to consider the impact of joining to many tables. Joins aren't inherently bad - with small tables and indexes, the query could still perform well.
@@DatabaseStar In case the join/view of the person/user table and those 6 tables at least is not performant, would it make sense to create seperate table that contains this data in a read-only state that could be queried without joins or perhaps caching the address information for a person in Redis?
Thanks! I've created a separate table for country because there is a defined list of countries (it's a list that does not change very often at all), and is something that can be selected from a drop-down on a screen. We could do the same thing for other fields, but the list would be longer and could change more often.
“A customer can have many addresses” but …what would be the statement for the reverse to sustain a many to many relationship? “An address can be had by many customers?” Could you please clarify? Awesome videos by the way.
Nice video as always, ensuring validation of past record never cross my mind. I have a question, currently I am developing a project for practice. It has 4 types of user, `patient`, `dentist`, `dental assistant`, and `admin`. Should I put all users in 1 table with role column, or separate them to different tables.
Good question. My initial response would be to put them all in one table, with a column for their role. This is because I assume they will all have the same kind of information stored about them (name, perhaps a created date, perhaps active or inactive, email address, password). You can use other tables to determine what their permissions are if you need that.
Hello Database Star First of all, Great Vid's so far! Very well explained. I wanted to know, why you have in the last section of the Video the Text "* not to scale". What do you mean by that or to what do you reference it? Thanks in Advance!
Thanks! Ah that was just a bit of humour - saying that the image of the person and the image of the house are not to scale, meaning they are not a realistic size, as the person is taller than the house.
I've used this data structure for over twenty years. As data types have become much more sophisticated it's been interesting to experiment with json for both user and form data MySQL and MariaDb. MySQL is very fast and has proven very efficient/fast in search/retrieve/display of large forms. Your thoughts would be appreciated
Glad to hear this structure has worked for you. I don't have a lot of experience working with JSON but I can understand it has some benefits such as easily-configurable structure. I don't know a lot about the performance of it. Many recent versions of databases are adding more support for JSON so I think it's only going to get better.
@@DatabaseStar I think for NoSQL JSON document DB databases that the main benefit is when all the data can embedded in the Json document, because relations need to be managed by the application and could become a mess as nothing is consistent, however since there is no defined schema it is much more flexible and does not need migrations, but yeah you would still need to plan some kind of schema 🤣
always brainstorming for hours about how should we store the address! thank you for the solutions benchmarks.
No problem, glad you found it helpful
Insanely helpful, can't thank you enough for these, so happy I stumbled upon this channel.
Thanks! Glad you liked it.
Yes , YOU ARE A DATABASE STAR 🌟.
Easy To Understand Your Teaching and video Presentation.
Keep Doing More Videos ♥️
Thanks!
Dear Database Star, thank you for this amazing well explained video. One suggestion I would make for the problem of a location is the concept of Country. Not always is clear the political definition of the place that is being addressed. With this in mind, one internationally well accepted solution is the one adopted by the World Health Organization (WHO) which adopts a self-related table of Administrative Political Divisions. Each of these will have a property (varchar or something mappable to an ENUM) that defines its type (e.g., country, state, city, street, etc.). For instance, one big mistake is to assume that an administrative political division as the United Arab Emirates is a country. Further more, as you well, pointed out in this video, not every country has state subdivision. Therefore, the WHO's approach sanates all these problems.
Good points about the concept of countries and how it can be improved!
Extremely good I was searching for such simple explanation
Thanks a ton
You're welcome!
Thank you Ben - always thought about these during design but never followed through this thought process.
Thanks, glad it helped!
Wow....everything just makes sense...very informative video. You just saved me from making a very huge mistake. Thank you🤝
Glad it was helpful!
Another concern is when you want to app to do reverse geocoding. When you create a geocoding request, you have to populate each property with the correct data. It's much easier when you separate all the address data as you described.
That's a good point, another reason to have separate address data. Thanks for sharing.
Hello!
Why don't we create separate tables for city and regions as we did with country?
Good question! That's something you can do to make the list more defined. I chose not to do it because the list could be quite long and I'm not sure if a full list could be collated (list every city/state/province in the world). But it is possible and a good option.
@@DatabaseStar Do you know of any data source, Json or csv that contain all these different cities, states, provinces and so on around the world or would be better to rely on an address API and then enter in the database based on the response from it?
What if this database needs to handle high load? All these joins will slow it down, especially many to many. So I guess denormolized(intro design) is better in this case?
Good question. Yes, having things in a single table (denormalised) will likely improve query performance if there are a lot of SELECT queries happening (aka high load). But the tradeoff is that it's harder to update. It's worth considering.
Thanks for the great video, quick question why have a many to many relation between the customer and the address instead of doing a one to many relation where if a user is deleted, or a user deletes an old address the user_id would be set to null.
Good question. The reason for a many-to-many is that an address exists even though a user does not live there. If a user changes their address, this could be handled by inserting a new record for address and removing the relationship to the old address. This would allow historic data to be determined (e.g. a history of their addresses, customer orders, etc). It also allows for multiple users to have a single address (e.g. two people living in the same house have individual accounts for an online shopping website).
A one-to-many relationship can still work, with the points you mention though.
I would like to see videos for the other addressing designs you mentioned such as ip address, names, profanity words and emails including how to validate them.
Good idea, I can create those videos.
Hi, thanks for the great video. I'm in a case where i have 2 address for one entity (typically a start_address and end_address). How should i handle it ?
You could have a joining table between entity and address, which can store the multiple addresses.
In most e-commerce setups, the address is typically linked to other details like the receiver's name and phone number. How could we incorporate that into this design, or is it even possible?
You could connect it to the user table in the same way. Or have a kind of "user_contact" table that connects both addresses and users, and also connects phone numbers.
Looking forward into more videos like this...🖤
Thanks! Yeah I have plenty more planned.
Hey Ben, you have been tremendous help. How would you model the data in a situation where you have 2 entities such as "employee" and "project". Where the project table would only need attributes such as "city", "region". Whereas the employee would need the attributes listed in the video. Would having separate address entities for employee and project be redundant or a good practice? Thanks again!
Good question. You could do it a few ways, but my initial thoughts are to have a full "address" table to link to the employee table, as that's what an employee needs. A project could have a related "city" table, which would just have information about city and region, like a cut-down version of the address table.
You could relate the address table to this city table, so the cities are not stored in two places.
@@DatabaseStar Ohhh! Haha it took me a minute to understand what you meant but I got it now. Thanks 1000 times!
What happens when a user updates their address (like a typo) ?
The solution that comes to my mind is:
- If the address is only associated to the user who created it, the address field could be updated
- If the address is associated to two or more users, then create a new address field
But the problem with modifying existing address fields is if it's already associated to an order, so in that case it's better to just leave "incorrect" addresses rather than correct them?
Yeah I think if the address is associated to anything like an order then don’t edit it, make a new address. Your solution is a good one.
Is it a good idea to create a separate table for states and cities as well?
Yeah that is a good idea. It depends on how you'll use the data and whether you want users to select from a list of enter a free-text value.
@@DatabaseStarDatabaseStar Entering a free text would likely become problematic as the user does not care how they input their address usually.
Informative video. Please make this types of video. Thank you...
Thanks!
how would you handle PO-BOX type addresses in this instance?
Good question. I would probably store the PO Box details in the address line 1 field.
I would add a PO Box column to the table. I would also add a constraint on the table requiring the postal not be null.
What about normalization and redundant data ? I have some doubts about completely normalize the database. What do you think ? This example it is not normalized. But to normalize, a lot of tables should be created: one to country, another to the state, city, neighborhood, street . . . 6 tables at least. To request a complete address, a big join should be done with the person table, for example, and those 6 tables . . . In theory is the right thing to do. But in pratice ... I don't know.
Good question. This is a common concern with desigining databases. In short, there's a balance between creating extra tables to reduce redundancy (tables for country, state, city, etc as you mentioned) and having them all in one table. I think we would need to consider the user interface and the types of queries that would be run. E.g. if a country is selected from a drop-down by a user, a list of countries in a separate table would make sense. If the user manually types in the city or state, then perhaps this does not need a separate table. But if they select it from a list, a separate table may be needed.
If the application runs queries to get the full address, then you would need to consider the impact of joining to many tables. Joins aren't inherently bad - with small tables and indexes, the query could still perform well.
@@DatabaseStar In case the join/view of the person/user table and those 6 tables at least is not performant, would it make sense to create seperate table that contains this data in a read-only state that could be queried without joins or perhaps caching the address information for a person in Redis?
great video database star. quick question, why do we create a separate table just for country?
Thanks! I've created a separate table for country because there is a defined list of countries (it's a list that does not change very often at all), and is something that can be selected from a drop-down on a screen.
We could do the same thing for other fields, but the list would be longer and could change more often.
@@DatabaseStar perfect.. that makes sense.. thank you again
Great Job .
Thanks!
Thanks a lot for this
You’re welcome!
“A customer can have many addresses” but …what would be the statement for the reverse to sustain a many to many relationship? “An address can be had by many customers?” Could you please clarify? Awesome videos by the way.
Thanks! Yes, it could be "an address can be had by many customers", or even "held by many customers" or "related to".
@@DatabaseStar Thanks!
Nice video as always, ensuring validation of past record never cross my mind. I have a question, currently I am developing a project for practice. It has 4 types of user, `patient`, `dentist`, `dental assistant`, and `admin`. Should I put all users in 1 table with role column, or separate them to different tables.
Good question. My initial response would be to put them all in one table, with a column for their role. This is because I assume they will all have the same kind of information stored about them (name, perhaps a created date, perhaps active or inactive, email address, password). You can use other tables to determine what their permissions are if you need that.
@@DatabaseStar Many thanks, I will try to implement that. 😁
Muito bom seu vídeo. Aprendi muito mesmo sem saber inglês, somente pelas imagens e animações..
Thanks!
thank you
You’re welcome!
Hello Database Star
First of all, Great Vid's so far! Very well explained.
I wanted to know, why you have in the last section of the Video the Text "* not to scale". What do you mean by that or to what do you reference it?
Thanks in Advance!
Thanks! Ah that was just a bit of humour - saying that the image of the person and the image of the house are not to scale, meaning they are not a realistic size, as the person is taller than the house.
Can you simplify superapp database?
contain initial features only
What's a superapp database?
Hey Ben, I just sent you an email about working with you through our website. I'd like to hire you for a consultation session. Please advice. Thanks!
Thanks Andre! I saw the email and will respond soon.
I've used this data structure for over twenty years. As data types have become much more sophisticated it's been interesting to experiment with json for both user and form data MySQL and MariaDb. MySQL is very fast and has proven very efficient/fast in search/retrieve/display of large forms.
Your thoughts would be appreciated
Glad to hear this structure has worked for you. I don't have a lot of experience working with JSON but I can understand it has some benefits such as easily-configurable structure. I don't know a lot about the performance of it. Many recent versions of databases are adding more support for JSON so I think it's only going to get better.
@@DatabaseStar I think for NoSQL JSON document DB databases that the main benefit is when all the data can embedded in the Json document, because relations need to be managed by the application and could become a mess as nothing is consistent, however since there is no defined schema it is much more flexible and does not need migrations, but yeah you would still need to plan some kind of schema 🤣
I thought this was Common practice?
It might be, but I haven't seen it that often.
cool
Thanks!