Strategies for optimizing your BigQuery queries

Поділитися
Вставка
  • Опубліковано 2 гру 2024

КОМЕНТАРІ • 25

  • @AlesNajmann
    @AlesNajmann Рік тому +5

    This is the simply most important video to watch by the users of BigQuery 🔥

  • @premmohantyagi
    @premmohantyagi 3 роки тому +11

    Perfect and easy examples to describe the impact of optimization.

  • @majorcemp3612
    @majorcemp3612 3 роки тому +4

    How do you know the "where" clause is performed before the join (5:17) ? Don't you have to do a sub query on the left joined (t2) table ? (also does this sub query be less performant ?)

    • @leighajarett221
      @leighajarett221 3 роки тому +5

      You should be able to tell from the execution details :) In the first stage of the query, the workers will read from distributed storage and filter the data before shuffling and joining (which usually happens in the next stage). You can see the WHERE clause and the JOIN applied in the details of the stage to understand the order. You don't need the where clause in a subquery to filter the data before joining, but subqueries can be helpful if you want to pre-aggregate the data before joining to reduce the amount of data joined.

    • @majorcemp3612
      @majorcemp3612 3 роки тому +1

      @@leighajarett221 so I don't need a where clause in subquery because everytime the where clause is being done at the distributed level ? Like everytime ? Or should I put a subquery to be sure ? (and if I put this subquery to be sure will I get bader performances ?)

  • @sevanajemian
    @sevanajemian 3 роки тому +2

    amazing video for our new joiners :))

  • @divertechnology
    @divertechnology 2 роки тому +1

    great video i thought it gonna be light but is very deep in knowledge

  • @abhishekjena9581
    @abhishekjena9581 2 роки тому +1

    nice way of presentation .....good stuff

  • @SasawatChanate
    @SasawatChanate Рік тому

    5:50
    Does the common practice of putting WHERE 1=1 as the first expression has any affect on this?

  • @dheer211
    @dheer211 3 роки тому +2

    Great video thanks Leigha

  • @michelkant
    @michelkant 3 роки тому +1

    Nice one! At how much data processing do you recommend to start working on optimizing the queries? As in when will the effect really start to show an impact? 1gb of processing, 10, 1000 or 1tb?

    • @leighajarett221
      @leighajarett221 3 роки тому +1

      That's a great question! I think it depends on your pricing model. For example, if you're paying on-demand for each query you may be inclined to start optimizing queries that are scanning whatever #GB translates to a reasonable cost savings. This will also depend on how often the query is being run - for example if a query scanning 1TB of data is only run once that's $5, but if you run it everyday that's a lot!
      I would try to pinpoint queries that are run on a regular basis and are consuming the most resources, but set some threshold based on the total estimated ROI. For example, (the number of times the query will be run each year) x (the total cost of the query) > $1000 (or whatever is significant to your business)
      Check out the pricing calculator if you're estimating query prices: cloud.google.com/products/calculator/

  • @xiaoyangshawnhuang1251
    @xiaoyangshawnhuang1251 2 роки тому +1

    thanks for the great explanation. I like "stay curious"! :)

  • @Alex315101
    @Alex315101 Рік тому

    really curious why such simple examples can not be automatically handled by query optimizer (which is already done by other databases)

  • @jackylee6839
    @jackylee6839 3 роки тому

    At 5:25, why does clustering based on common join keys improve the efficiency more than partitioning?

    • @samhithnamburu
      @samhithnamburu 2 роки тому

      Clustering on join keys leads to least number of shuffles to perform the join

  • @jaysyanshar
    @jaysyanshar 3 роки тому +1

    is the concept can also be applied to other SQL databases?

    • @yellowsubmarine7059
      @yellowsubmarine7059 Рік тому

      Some these concepts could apply to other SQL database management systems (DBMS). Also some DBMS' are smart enough to apply some of these optimizations when generating the execution plan without having the user making changes to their SQL query.

  • @ahmedhelal174
    @ahmedhelal174 2 роки тому +2

    Thank you

  • @MyBestHibou
    @MyBestHibou 3 місяці тому

    A little surprise how little the default optimizer can do for us...

  • @Code-ie8ub
    @Code-ie8ub 3 роки тому +1

    thanks!

  • @gangsterrapmix8570
    @gangsterrapmix8570 3 роки тому +1

    Google Google many businesses are always liked by people all over the world Thank you Google

  • @viacomsspy5439
    @viacomsspy5439 3 роки тому

    Recommended is broken

  • @hritvikmohan4048
    @hritvikmohan4048 3 роки тому +3

    Maal ho yrr tum toh....