Maybe it's worth mentioning the difference between Transparent Huge Pages and regular Huge Pages: most databases support Huge Pages, but not the transparent mechanism.
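To make the comment's distinction concrete, here's a minimal sketch (assuming Linux and Python 3.8+; elsewhere the madvise call is simply skipped). `MADV_HUGEPAGE` only opts a mapping into the *transparent* huge-page mechanism; "classic" huge pages are reserved explicitly (e.g. `mmap` with `MAP_HUGETLB` against a pre-allocated pool), which is what most databases use:

```python
import mmap

def try_thp(buf_size=4 * 1024 * 1024):
    """Allocate an anonymous mapping and, where supported, ask the kernel
    to back it with transparent huge pages. Returns True if the advice
    was accepted, False otherwise."""
    m = mmap.mmap(-1, buf_size)
    advised = False
    # MADV_HUGEPAGE is Linux-only and controls THP, not classic huge pages.
    if hasattr(mmap, "MADV_HUGEPAGE"):
        try:
            m.madvise(mmap.MADV_HUGEPAGE)
            advised = True
        except OSError:
            pass  # kernel built without THP, or advice rejected
    m.close()
    return advised

print(try_thp())
```

Databases that want huge pages deterministically (PostgreSQL's `huge_pages = on`, for example) go through the explicit route instead, precisely because THP's background compaction can cause latency spikes.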
I'm actually binge watching these. Andy is hilarious.
Just don't use NULL; there can't be a data type for an absence of data (a phrase like "null value" is even more contradictory and absurd). The relational data model itself does not have NULL: it is based on two-valued logic, not three-valued logic. I do think this is an awesome course, but in general it does not properly observe the theoretical background of the relational data model.
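For readers unfamiliar with the three-valued logic being criticized here, a minimal sketch of SQL's NULL semantics in Python (using `None` for NULL/UNKNOWN; the truth tables follow the SQL standard):

```python
def sql_and(a, b):
    # SQL three-valued AND: FALSE dominates, then UNKNOWN (None).
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def sql_or(a, b):
    # SQL three-valued OR: TRUE dominates, then UNKNOWN (None).
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def sql_not(a):
    # NOT UNKNOWN is still UNKNOWN.
    return None if a is None else (not a)

# NULL = NULL evaluates to UNKNOWN, not TRUE -- the source of many surprises.
print(sql_and(True, None))   # None (UNKNOWN)
print(sql_or(False, None))   # None (UNKNOWN)
print(sql_or(True, None))    # True
```

This is exactly the departure from two-valued logic the comment objects to: a WHERE clause drops rows whose predicate is UNKNOWN, which behaves like FALSE in some positions but not in others (e.g. under NOT).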
I guess both mirror and delta-main are supported in SQL Server. In VLDB '15, it was announced that users can use the columnstore as either a clustered index or a secondary index.
So compression for speed is not effective anymore? Is that true for all types of columns and workloads? Vague question, I know, but several years ago I heard it was better to compress almost everything with zstd-like algorithms because disk I/O was the bottleneck. Is that no longer the case?
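The trade-off behind this question is easy to measure yourself. A small sketch using the standard-library `zlib` (zstd would show the same shape, with better ratios at the same speed); the synthetic data mimics a low-cardinality column, which compresses extremely well:

```python
import time
import zlib

def compress_stats(data, level):
    """Time one compression pass at the given zlib level
    and return (compression ratio, elapsed seconds)."""
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(data) / len(out), elapsed

# Highly repetitive payload, like a serialized low-cardinality column.
data = b"order_status=SHIPPED;" * 50_000

for level in (1, 6, 9):
    ratio, secs = compress_stats(data, level)
    print(f"level {level}: ratio {ratio:.1f}x in {secs * 1e3:.2f} ms")
```

Whether compression pays off depends on where the bottleneck sits: when reads come off spinning disk or a network, trading CPU for fewer bytes wins; when data is already in memory or on fast NVMe, decompression CPU can dominate instead.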
Could you help clarify what the frontend NSM component of Delta Lake is? According to their paper, the client writes directly to the log and the object files on the object store (S3).
I am not sure if it is just me who can't hear any audio.
I think it got re-uploaded. I was previously unable to watch it, but it works fine for me now.
@dsds-rj9rg I didn't do anything. It must have been a YouTube glitch.
amazing lectures❤
31:52 Super interesting part about the optimal size of the row groups in PAX files
Me too. Personally, I guess the original Parquet and ORC were designed by Hadoop community users. The row-group size is tied to HDFS block sizes and may not be as suitable for SSDs or other storage.
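To make the row-group discussion concrete, here's a toy sketch of a PAX-style layout (names and sizes are illustrative, not Parquet's actual on-disk format): rows are split into fixed-size row groups, and within each group values are stored column by column, so a scan of one column touches contiguous data:

```python
def to_pax(rows, group_size):
    """Split rows into fixed-size row groups; within each group,
    store values column-major (one list per column), as PAX does."""
    groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        # Transpose the chunk: row tuples -> per-column lists.
        columns = [list(col) for col in zip(*chunk)]
        groups.append(columns)
    return groups

rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
print(to_pax(rows, group_size=2))
# [[[1, 2], ['a', 'b']], [[3, 4], ['c', 'd']], [[5], ['e']]]
```

In real formats the group size is chosen in bytes (Parquet historically defaulted to sizes matching an HDFS block, e.g. 128 MB), which is exactly the HDFS coupling the comment points out; on SSDs a different size may amortize seeks better.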
The "why you can't do it in Rust" part was the saddest moment of the lecture.
What does it have to do with Rust anyway? This is a DB course; it's all about theory. Andy isn't favoring any particular language here.
It was a joke, what's sad about it?
People like you give Rust and Rustaceans a bad reputation.