Secret To Optimizing SQL Queries - Understand The SQL Execution Order
- Published May 15, 2023
- Get a Free System Design PDF with 158 pages by subscribing to our weekly newsletter: bytebytego.ck.page/subscribe
Animation tools: Adobe Illustrator and After Effects.
Check out our bestselling System Design Interview books:
Volume 1: amzn.to/3Ou7gkd
Volume 2: amzn.to/3HqGozy
The digital version of System Design Interview books: bit.ly/3mlDSk9
ABOUT US:
Covering topics and trends in large-scale system design, from the authors of the best-selling System Design Interview series.
Great video! One addition: The "EXPLAIN" command is an invaluable tool for optimizing SQL queries. It provides a detailed execution plan, allowing the developers to understand how the database engine processes a query. By analyzing the execution plan, you can address the performance bottlenecks with proper optimizations, e.g. proper indexes.
Thanks for sharing this.
Thanks a lot for the addition, really good :)
@cmertayak - I second that. It's an awesome command I use many times at work to optimise. It's my go-to command for improving query execution.
Thanks for sharing
Oh yes, if you run EXPLAIN in a desktop client like MySQL Workbench, it shows you a detailed diagram of your query, which is quite useful.
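As a rough illustration of the EXPLAIN workflow described in this thread, here is a minimal sketch using SQLite's EXPLAIN QUERY PLAN through Python's built-in sqlite3 module. The orders table, its columns, and the index name idx_orders_date are assumptions made up for the example; other engines use a different EXPLAIN syntax and output format.

```python
import sqlite3

# Hypothetical schema for illustration only; column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, amount INTEGER)"
)

query = "SELECT customer_id, amount FROM orders WHERE order_date >= '2023-01-01'"


def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Plan before any index exists on order_date.
before = plan(query)

# Add an index on the filtered column and look at the plan again.
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")
after = plan(query)

print("before:", before)
print("after: ", after)
```

Comparing the two printed plans shows whether the engine switched from scanning the table to searching the index. The exact wording of the plan text varies between SQLite versions.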
Opt for indexes on columns used in SELECT, WHERE, and JOIN clauses.
Use full column comparisons to get data instead of partial or computed comparisons (e.g. startsWith).
Avoid ORDER BY on large data retrieval.
Use LIMIT with a smaller number and pagination for more data.
Could you explain how? What if I need a large amount of data retrieved with ORDER BY? How would I use LIMIT and pagination in this case? Thanks
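One common way to combine ORDER BY, LIMIT, and pagination on a large result is keyset (or "seek") pagination: each page starts after the last row of the previous page instead of using a growing OFFSET. The sketch below is only an illustration, not the video's method. The orders table, the idx_orders_date_id index, the fetch_page helper, and PAGE_SIZE are all assumptions, and whether a given engine actually uses the index for this predicate depends on its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; names are assumptions for the example.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (id, order_date, amount) VALUES (?, ?, ?)",
    [(i, f"2023-01-{i:02d}", 10 * i) for i in range(1, 26)],
)
# An index matching the sort order lets the engine read rows already sorted.
conn.execute("CREATE INDEX idx_orders_date_id ON orders (order_date, id)")

PAGE_SIZE = 10


def fetch_page(last_seen=None):
    """Return the next page ordered by (order_date, id).

    last_seen is the (order_date, id) pair of the final row on the
    previous page, or None for the first page.
    """
    if last_seen is None:
        sql = (
            "SELECT id, order_date, amount FROM orders "
            "ORDER BY order_date, id LIMIT ?"
        )
        params = (PAGE_SIZE,)
    else:
        last_date, last_id = last_seen
        sql = (
            "SELECT id, order_date, amount FROM orders "
            "WHERE order_date > ? OR (order_date = ? AND id > ?) "
            "ORDER BY order_date, id LIMIT ?"
        )
        params = (last_date, last_date, last_id, PAGE_SIZE)
    return conn.execute(sql, params).fetchall()


page1 = fetch_page()
page2 = fetch_page(last_seen=(page1[-1][1], page1[-1][0]))
page3 = fetch_page(last_seen=(page2[-1][1], page2[-1][0]))
print(len(page1), len(page2), len(page3))
```

Note that this only helps when the sort key is a stored, indexable column. Sorting on an aggregate, as in the video's example, still requires computing the aggregate first.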
The way you explained this with the animations is awesome. Great job. Very well explained.
Very good intro. Would like a more detailed explanation on more complex queries.
they don't do detailed explanations. it's basically "use indexes". don't sort lots of data. well, thanks.
@jonbaird9718 agreed, YouTube is made for juniors
bro this way of teaching really, really makes sense. thanks a lot for these visuals.
Thank you for a fantastic visualization of the SQL queries execution order. That's exactly what I have been missing in the other materials. I really appreciate your style of teaching
Simple and to the point explanation. Love it. Thanks 👍
This is the best explanation I've ever seen. Big thumbs for you!
*Explanation level is so beautiful!*
One of the best SQL videos I have come across, just the way it is put together and the infographics. If you are learning SQL, you really should understand the mechanics behind optimizing queries, how databases work. Just adding more hardware or VM resources will not fix the issue if your queries are not optimized properly.
Very well presented, thanks for explaining SARGAble concept
Very profound, please share more on SQL like windows and CTE, your explanation is very approachable.
Thank you for your time and effort in explaining these subjects. I really like it, and moreover I am able to retain the concepts easily. Thanks again.
Wow. To the point with knowledge I can use today. Thank you.
wow, what an awesome introduction to SQL optimization.
Love your channel. Your videos are great.
Excellent video explaining basic concepts in very short time..❤
Impressive graphic animation, could you please share how the execution plan animation was done
Awesome visualization, I've been loving all the short videos on this channel!
Clarifying Q. The execution order has SELECT happening after HAVING, so this should mean that the calculated column total_spent doesn't exist at the time the HAVING clause is evaluated?
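One way to look at this question: in the logical order, HAVING filters on the aggregate itself, which exists once GROUP BY has run, so writing the aggregate expression in HAVING never needs the SELECT alias. Some engines, such as MySQL and SQLite, additionally accept the alias in HAVING as an extension, while others reject it. The sketch below runs both forms in SQLite; the orders table and its amount column are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; table and column names are assumptions.
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 600), (1, 500), (2, 300), (3, 1000)],
)

# Portable form: HAVING repeats the aggregate expression, no alias needed.
portable = conn.execute(
    """
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer_id
    HAVING SUM(amount) >= 1000
    ORDER BY customer_id
    """
).fetchall()

# Alias form: accepted by SQLite as an extension, not by every engine.
with_alias = conn.execute(
    """
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer_id
    HAVING total_spent >= 1000
    ORDER BY customer_id
    """
).fetchall()

print(portable)
print(with_alias)
```

Both queries return the same rows here, which suggests the alias in HAVING is a convenience the engine resolves back to the aggregate rather than evidence that SELECT runs first.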
Additionally, for the optimizer to "make up" a reasonably good plan (from the various alternatives), it needs to know a bit about the data (value) distribution. This is where STATISTICS / ANALYZE (depending on the DB vendor) come in handy. They help the optimizer estimate the various steps (rows, size of data, etc.) of each plan and figure out which of the different plans is the best candidate to execute. It is therefore important to collect this information on critical columns (usually join and WHERE clause columns). It is also important to refresh this information regularly so that the optimizer does not make bad decisions based on stale statistics. Very bad things can happen with stale statistics.
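As a small illustration of the statistics step described above, SQLite exposes it through the ANALYZE command, which stores its results in the sqlite_stat1 table. This is a sketch for one engine only; the orders table and the idx_orders_customer index are assumptions, and other vendors use different commands and catalogs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with 1000 rows spread over 10 customers.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 10, i) for i in range(1000)],
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Collect statistics; the optimizer can use them for later planning.
conn.execute("ANALYZE")

# Each row holds: table name, index name, and a space-separated stat string
# whose first number is the estimated row count.
stats = conn.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = 'idx_orders_customer'"
).fetchall()
print(stats)
```

Re-running ANALYZE after large data changes is the SQLite equivalent of keeping statistics fresh.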
Understanding how the DB engine works with indexes is key. You may assume that WHERE purchase_date >= 2022 AND purchase > 100 would perform the same if you have separate indexes on purchase_date and purchase, but a composite index might be required... The order of conditions in the WHERE clause may also matter, as it can help reduce the dataset before the second condition is applied.
WHERE order has no effect on most sql systems. The only way you can force SQL to filter data first is to use a derived query.
these videos are amazing!!!! thanks!!!
Awesome as usual! Thanks a lot!
You should select from the orders table then join the customers since your where clause is a column in orders table! Your SQL is joining on unnecessary rows from orders & customers!
Excellent explanation, thanks!
Great video, very informative and well explained bravo!
Great video!! Very helpful! Thanku sir!
Your presentation is so pleasant to watch, is it manually key-framed in the video editor or are there tools to do that naturally?
Nice and simple explanation.Thanks
As usual, excellent and to the point video!
Superb video! Simple explanation on query optimisation.
thanks, helped clear up some issues I had.
Thank you, this was really helpful.
Hi Sir, thank you 🙏 for taking the time to explain SQL. Sorry, I am new, and this is very helpful.
I heard it called "predicate pushdown" when you move a condition earlier in the plan
Index usage tip: When using params in your query (e.g., select .... where year > ?), databases may not utilize an index if it is unbalanced. For instance, if you have approximately 1 million rows with year = 2022 and only 1000 rows with year = 2023, the database cannot predict whether the parameter will be useful for filtering. To resolve this issue, pass the value directly in the query itself, allowing the execution plan to determine if the index is suitable for the intended purpose.
As I wrote in my comment, a good understanding of how your DB engine works is key. And they are all different. So never assume that a good query on MySQL will be a good query on Postgres, Oracle or any other SQL engine.
this opens the gate for SQL injection, don't do this
@maf_aka I think the idea was not to use prepared statements *where you don't need them.* E.g. if you already have validation in place that ensures your received value is enum (number, null, etc.) - you can be sure no SQL injection is possible there - so no need to use prepared statements *there.*
Ok, but then you get a different query plan for each (different parameter / set of parameters) query
@lethern2 yep. but that's why you need to understand how your db engine works
Very simple and to the point, love the visualization too
good things to practice for the interview. Thanks
Fantastic explanation.
Very good video. It is really helpful.
cool, didn't think it's possible to include all these concepts in 6 min video. One thing, it's great to watch it when you want to summarise already existing knowledge
Amazing. Thank you!
oh my goodness, this is too good for non IT background jumping ship to see where AI will land. Thx. You are my 3blue1brown for IT
Great video!
Best explanation ever
Thanks. Good to know! Useful!
You guys are awesome!
Thanks for your sharing Bro's.
Thanks for this! Will there be a transcription soon?
This query actually does not need to join customers table since all the fields are present in the orders table already. (unless there are invalid / dirty customer_id data in the orders table and you want to filter them out)
This stuff is gold. Thank you for making this available for free. Really appreciate it!
1:26
Great. Thanks for sharing..
So good explanations
Thank you so much!
thanks a lot for your content
Well explained. However, I do miss 1) the generation of multiple query plans and the selection among them (cost estimation) and (as an element of this) 2) the different table access tactics (sequential scan, index access, or index-only).
Nice bird's-eye view introduction.
It is not clear how to 'use appropriate indexes' to optimize for sorting, and how to implement pagination. Especially in your example where the sort order is made on an aggregate.
I feel like this is a bit misleading because sometimes where and select influence the first stage. As you said, when there’s a covering index, the database won’t read the entire table. So the select and where influence what is read from the source.
Order and limit can also come in at the source if the index can be used for the ordering. You refer to this when you talk about “sorting the whole table”.
CTEs and sub queries are not mentioned but that’s okay i guess.
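The covering-index point raised in this comment can be observed in an execution plan. The sketch below uses SQLite, where the plan text says COVERING INDEX when every column the query needs is present in the index, so the table itself is not read. The orders table and the idx_orders_date_amount index are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; names are assumptions for the example.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, amount INTEGER)"
)
conn.execute("CREATE INDEX idx_orders_date_amount ON orders (order_date, amount)")


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as a list of strings."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Every referenced column (order_date, amount) lives in the index.
covered = plan(
    "SELECT order_date, amount FROM orders WHERE order_date >= '2023-01-01'"
)

# customer_id is not in the index, so the table must also be consulted.
not_covered = plan(
    "SELECT customer_id, amount FROM orders WHERE order_date >= '2023-01-01'"
)

print("covered:    ", covered)
print("not covered:", not_covered)
```

This is one concrete case where the SELECT list and the WHERE clause together influence what is read from the source, as the comment argues.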
thanks so much!
Thank you very much!
good explanation
Hi, the actual plan should be derived from EXPLAIN and EXPLAIN ANALYZE rather than from the query itself, right?
Lord Buddha. I'm looking for an active data flow visualization that can shorten data query response times! A great video, it saved me today. Leaving with 1 subscription as a fan! 🔍⚡
What tool do you use to generate your animations?
Should building a CTE and then running a non-sargable query on it also be avoided?
Amazing!
so in the above example, which columns should we index?
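For context on the sargable versus non-sargable distinction this question builds on, here is a minimal sketch in SQLite. Wrapping the indexed column in a function hides it from the index, while an equivalent range condition on the bare column lets the index be searched. The orders table, the idx_orders_date index, and the sample rows are assumptions, and the sketch does not cover the CTE case being asked about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data for illustration only.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
    [("2022-12-31", 10), ("2023-01-01", 20), ("2023-06-15", 30), ("2024-01-01", 40)],
)
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")

# Non-sargable: the function call on order_date prevents an index search.
non_sargable = (
    "SELECT id, amount FROM orders WHERE strftime('%Y', order_date) = '2023'"
)

# Sargable: the same filter expressed as a range on the bare column.
sargable = (
    "SELECT id, amount FROM orders "
    "WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01'"
)


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as a list of strings."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


non_sargable_plan = plan(non_sargable)
sargable_plan = plan(sargable)
non_sargable_rows = sorted(conn.execute(non_sargable).fetchall())
sargable_rows = sorted(conn.execute(sargable).fetchall())

print(non_sargable_plan)
print(sargable_plan)
```

Both queries return the same rows; only the access path differs.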
In this example the 'total_spent' alias is already in use in the HAVING clause without defining. How is that possible?
yes, I have the same question, it doesnt make sense...
Having uses total_spent from the SELECT, so how come HAVING is executed before the SELECT?
I'd say so too. This is an error. First the SELECT part is evaluated, then the HAVING part.
I have always thought that the Sql structure is poorly designed by not starting from FROM and placing the reference at the end of the statement, for example in a SELECT it should go just before ORDER BY, in an UPDATE the SET after WHERE, etc. Somehow they wanted to remedy the problem by introducing the WITH clause but I'm sure many regret that whoever designed the language should have worked a little harder at the time.
order_date is mentioned as indexed - is that implicit or explicitly defined?
Very good video
This is pretty cool.
Can you make a video explaining the difference between system design and software architecture?
Something doesn't add well here. If you notice HAVING clause refers to 'total_spent' which is defined in SELECT, so dependency wise HAVING should be after SELECT and not before it.
Question: at the end of the video you mentioned do not sort the whole data and use pagination for optimizing ORDER BY and LIMIT. Those are the things I use for pagination! What do you mean by that?
The other thing is from your video LIMIT happens after ORDER BY. How come it can help when ORDER BY has already happened?!
Btw great videos and content, thank you for these
Will it be even faster if we always order where first and join after?
00:45 Understanding SQL query execution and optimization techniques
01:30 Understanding SQL execution plans can optimize queries for better performance
02:15 Optimizing SQL queries through index usage
03:00 Writing sargable queries is essential for optimizing database performance.
03:45 Sargable queries improve query performance.
04:30 Understanding the SQL execution order is crucial for query optimization
05:15 Optimizing SQL Queries with Indexes
05:57 Understanding SQL execution order is key
Crafted by Merlin AI.
hi, can you enable captions/subtitle for this video? thank you!
What program is this used in the presentation?
I still don't understand the difference between the first point noted at 3:19 and the second point noted at 3:23. Would you mind re-explaining it? Thank you!
Thank you for your video.
I have been working in IT for 10 years, but I never knew the order between JOIN and WHERE
until I watched this video.
Can anyone help me understand when functions like COUNT or SUM are executed? Is it after LIMIT?
Can you/anyone please explain execution of case when and window function with group by
I always thought that the SELECT happened before HAVING, considering that we can use SELECT aliases in the HAVING filter.
my app hasn't reached 40 queries per second yet but i will implement this just in case my app becomes the next amazon :D
can someone explain to me what mutant query plans are, with a real-life example?
Is that a typo in the first select clause, total spent should be total_spent?
yes, i think so, and I have another question, 'Having' uses total_spent from the SELECT, so how come HAVING is executed before the SELECT? Doesnt make sense...
0:12 _[JOIN comes before WHERE]_
is there any way to make the WHERE clause execute first to narrow the rows required to make the JOIN in the first place??
this is the only reason i still do this using a nested query rather than JOIN
a CTE can be beneficial in your use case.
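The CTE approach suggested in this reply can be sketched as follows: filter the orders first, then join the reduced set. Keep in mind that many optimizers already move such filters ahead of the join on their own (the "predicate pushdown" mentioned in another comment), so the two forms may end up with the same plan. The tables, columns, and sample rows below are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; names are assumptions.
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, amount INTEGER
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES
        (1, 1, '2022-12-30', 50),
        (2, 1, '2023-02-01', 70),
        (3, 2, '2023-03-05', 90);
    """
)

# Filter written after the join, as in a typical query.
join_then_filter = conn.execute(
    """
    SELECT c.name, o.amount
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.order_date >= '2023-01-01'
    ORDER BY o.id
    """
).fetchall()

# Filter written first inside a CTE, then joined.
filter_then_join = conn.execute(
    """
    WITH recent AS (
        SELECT id, customer_id, amount
        FROM orders
        WHERE order_date >= '2023-01-01'
    )
    SELECT c.name, r.amount
    FROM customers c
    JOIN recent r ON r.customer_id = c.id
    ORDER BY r.id
    """
).fetchall()

print(join_then_filter)
print(filter_then_join)
```

Both forms return the same rows. Whether the CTE version is faster on a given engine is something to confirm with that engine's EXPLAIN output rather than assume.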
I think Order by is evaluated before select as order by might change selected rows...is it correct?
Yes this vid is full of mistakes
Will this work with MySQL as well?
Very nice
Why are we using HAVING total_spent >= 1000, but not WHERE total_spent >= 1000? Can you please explain?
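A short way to see the difference: WHERE filters individual rows before grouping, so an aggregate such as SUM does not exist yet at that point, whereas HAVING filters whole groups after aggregation. The sketch below shows SQLite rejecting an aggregate in WHERE and accepting it in HAVING. The orders table, its columns, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; table and column names are assumptions.
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2022-12-31", 5000),
        (1, "2023-01-10", 600),
        (1, "2023-02-10", 500),
        (2, "2023-03-01", 300),
        (3, "2023-04-01", 1000),
    ],
)

# Attempt 1: aggregate in WHERE. The engine refuses because WHERE is
# evaluated per row, before any group totals exist.
where_error = None
try:
    conn.execute(
        "SELECT customer_id, SUM(amount) FROM orders "
        "WHERE SUM(amount) >= 1000 GROUP BY customer_id"
    )
except sqlite3.Error as exc:
    where_error = str(exc)

# Attempt 2: row filter in WHERE, group filter in HAVING.
having_rows = conn.execute(
    """
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders
    WHERE order_date >= '2023-01-01'
    GROUP BY customer_id
    HAVING SUM(amount) >= 1000
    ORDER BY customer_id
    """
).fetchall()

print("WHERE with aggregate failed:", where_error)
print(having_rows)
```

The 2022 order for customer 1 is removed by WHERE before the totals are computed, which is why that customer's total reflects only the 2023 rows.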
Why are there no subtitles? I need subtitles. Thank you very much!
Ambiguous query
subtitles not available
is there a way to contact you? I have some specific questions on indexes?
Where is the translation of the CC?
you should have more subtitles
What about mongodb ?
This episode has no subtitles..