21 - Snowflake Data Warehouse Internals (CMU Advanced Databases / Spring 2023)
- Published Jan 14, 2025
- Prof. Andy Pavlo (www.cs.cmu.edu...)
Slides: 15721.courses....
15-721 Advanced Database Systems (Spring 2023)
Carnegie Mellon University
58:43 I'm super curious about Andy's opinion. Too bad it's cut 😂😂
26:07 I can say exactly how they're doing this. It's called the Query Acceleration Service. It's enabled on a cluster-by-cluster basis, and the customer DOES pay for the added compute resources, per second, for the time that they've been burst onto the stack to help out the query processing. Snowflake will never EVER "borrow" compute cycles from one customer's allocated compute cluster to help out another customer's query. To imply as much is disingenuous.
Is that "In Baltimore" by K Mack in the intro? You got some good taste in hip-hop.
Terrific talk - thanks. Same question as @taLin: "DBMS supports query plan hints" (47:09)? I wasn't aware of that. Is it true? That would be completely against the grain.
"59:00" The gang fight video is so funny!!! Where can I find this video online?
Snowflake automatically clusters micro-partitions in the background, but what does the clustering key actually do to the stored data? I'm guessing the cluster key rearranges rows within a micro-partition. In other words, how does the query optimizer use the clustering key?
Let's say there are 1M micropartition files for a table. You issue a query and the WHERE clause includes a predicate on one of the cluster keys. Snowflake says "I know the ranges of values in the files, so I only need to scan 10K of these files to get the answer". It enables "file pruning" during the read phase.
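To make the pruning idea concrete, here is a minimal sketch in Python. It is not Snowflake's implementation; the `MicroPartition` class, the `prune` function, and the key ranges are all made up for illustration. It assumes only that each file carries a min/max value for the cluster key and that the predicate is a range on that key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroPartition:
    """Hypothetical per-file metadata: the value range of the cluster key."""
    name: str
    min_key: int
    max_key: int


def prune(partitions, lo, hi):
    """Keep only files whose [min_key, max_key] range overlaps [lo, hi].

    Files that cannot contain a matching row are skipped without being read.
    """
    return [p for p in partitions if p.max_key >= lo and p.min_key <= hi]


# Well clustered: each file covers a narrow, disjoint slice of the key space.
clustered = [MicroPartition(f"part-{i}", i * 100, i * 100 + 99) for i in range(10)]

# Poorly clustered: every file spans the whole key space, so nothing can be skipped.
unclustered = [MicroPartition(f"part-{i}", 0, 999) for i in range(10)]

# Predicate: WHERE key BETWEEN 250 AND 260
scan_clustered = prune(clustered, 250, 260)
scan_unclustered = prune(unclustered, 250, 260)

print(len(scan_clustered), "of", len(clustered), "files scanned (clustered)")
print(len(scan_unclustered), "of", len(unclustered), "files scanned (unclustered)")
```

In this toy model the clustering key matters because of how rows are spread across files, not how they are ordered inside one file: the narrower each file's key range, the more files the min/max check can rule out.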
Hi Andy,
Thanks a lot for the lecture! Can you please clarify what you mean by "DBMS supports query plan hints" (47:09)? As far as I know, because of Snowflake's "as a service" philosophy, there are no hints and no plans to implement them.
Thanks,
Tania.
Still no runtime hints. Maybe he's talking about how you can see the query plan as executed in the Query Profile view, select nodes to focus on in the new UI, and also get a plan for a given SQL statement with the explain_json function.
Spills of the off-camera discussions?