Update a Table with Millions of Rows in SQL (Fast)

  • Published 25 Jul 2024
  • 📝 Get my free SQL Cheat Sheets: www.databasestar.com/get-sql-...
    🎓 Learn how to improve the performance of your SQL: databasestar.mykajabi.com/get...
    Running an SQL UPDATE statement on a table with millions of rows can often take a long time.
    There are several different ways you can update this data so that it runs faster.
    In this video, I explain and demonstrate five different approaches to updating data in a table with millions of records.
    The demonstration is done in Postgres, but the concepts are valid for Oracle, SQL Server, MySQL, and other vendors.
    ⏱ TIMESTAMPS:
    00:00 - Our sample data
    00:47 - Method 1
    01:22 - Method 2
    02:09 - Method 3
    04:38 - Method 4
    05:25 - Method 5
    🔗 VIDEO LINKS:
    Scripts used in this video: github.com/bbrumm/databasesta...
  • Science & Technology
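
The five methods themselves aren't spelled out in the description (the actual scripts are in the linked GitHub repo), but as a rough point of reference, the simplest approach, a single UPDATE using REPLACE, might look something like this in Postgres. The table and column names below are assumptions for illustration, not the video's data:

    -- Minimal sketch, not the video's actual script: the table (customers),
    -- column (address) and search/replace strings are assumed.
    -- Without a WHERE clause, every row is visited and rewritten;
    -- REPLACE simply leaves non-matching values unchanged.
    UPDATE customers
    SET    address = REPLACE(address, 'Street', 'St');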

COMMENTS • 16

  • @prashlovessamosa
    @prashlovessamosa 1 year ago +2

    Your channel is very helpful, sir. Thank you for providing knowledge for free.

  • @RAMII19780529
    @RAMII19780529 1 year ago +5

    It would seem that the first method would always be the fastest. The REPLACE function basically runs an INSTR on every row, changing the data when it finds a match. Every other example will do more work. For example, in Method 2, the INSTR runs on every row, then REPLACE has to search the subset of rows again to locate the text and replace it. The other methods all add additional overhead while still searching some fields twice. The fastest way would be to run it in parallel if your DB supports it.

    • @Firecul
      @Firecul 1 year ago +1

      I'm a relative newbie with databases, but I'd have to agree with you here. I'm not sure how it could get faster than checking everything only once and updating as needed.
      Perhaps this wasn't the best example to get across what he meant.

    • @DatabaseStar
      @DatabaseStar 1 year ago +2

      Yeah, that's true, and it explains why the first method is fastest. The database is pretty good at determining how to run the update statement efficiently.

    • @DatabaseStar
      @DatabaseStar 1 year ago +3

      That's a good point! Yes, I think the example in this video showed that the simplest method was the fastest. If a different type of update were done, perhaps it would be slower, so it probably wasn't the best example!
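
For context on the thread above: the variant the commenter describes as Method 2 adds a search predicate, so the text is effectively located twice, once by the WHERE clause and once more by REPLACE. A minimal sketch under the same assumed names (Postgres has no INSTR; LIKE or strpos plays that role here):

    -- Hypothetical Method 2-style statement: the WHERE clause searches every row,
    -- then REPLACE searches the matching rows again before rewriting them.
    UPDATE customers
    SET    address = REPLACE(address, 'Street', 'St')
    WHERE  address LIKE '%Street%';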

  • @slaybryn5504
    @slaybryn5504 4 months ago

    Can you show me how to update a MySQL database so the Address column is masked in a custom way, replacing characters with * and & signs, e.g. 'address' becomes '*d&ras&', across all 30 million rows? How do I solve this? Please help.

    • @DatabaseStar
      @DatabaseStar 4 months ago

      I think you can use a function that replaces characters in a string with others. In MySQL it's called REPLACE.
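
As a rough illustration of that suggestion (an assumption, not a full masking solution), nested REPLACE calls in MySQL can swap chosen characters for * and &. Note that REPLACE substitutes every occurrence of a substring rather than masking by position, and the table, column, and masking rule below are made up for the example:

    -- Illustrative only: replace every 'a' with '*' and every 'e' with '&'.
    -- The table/column names and the masking rule are assumptions.
    UPDATE customers
    SET    address = REPLACE(REPLACE(address, 'a', '*'), 'e', '&');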

  • @coolcha
    @coolcha 1 year ago

    Would updating in batches be beneficial for a live app, thereby keeping the app available for use in between the batches?

    • @DatabaseStar
      @DatabaseStar 1 year ago

      Yeah, it might be good in that situation. It works well if you have a scheduled maintenance period: run updates during that window (for example, one hour each night), then run more updates the next day.

    • @coolcha
      @coolcha 1 year ago

      @@DatabaseStar I see. Thanks.
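
One common way to batch an update like the one discussed above, sketched here with assumed table and column names, is to limit each pass to a fixed number of matching rows and re-run the statement until it reports that zero rows were updated; between passes the table remains available to the live app:

    -- Hypothetical batching sketch (Postgres): repeat until the statement
    -- reports UPDATE 0. Each pass rewrites at most 10,000 rows that still
    -- contain the search text.
    UPDATE customers
    SET    address = REPLACE(address, 'Street', 'St')
    WHERE  id IN (
        SELECT id
        FROM   customers
        WHERE  address LIKE '%Street%'
        LIMIT  10000
    );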

  • @karthiks4347
    @karthiks4347 25 days ago

    👏👏👍

  • @kane_lives
    @kane_lives 10 months ago +3

    What is the point of this video?
    The most obvious and naive method is the fastest, while the rest are convoluted and do extra work for no good reason.
    99% of the videos on this channel are good to great, but I would just delete this one if I were in your shoes.

    • @DatabaseStar
      @DatabaseStar 10 months ago +2

      Thanks for the feedback! I guess the point of the video is that, yes, the simplest method is the fastest. There are many other ways it could be done that may seem faster but are actually slower in this example.

    • @slaybryn5504
      @slaybryn5504 4 months ago

      @@DatabaseStar I thought that Method 3 (batching) would be fastest for larger tables of around 100 million rows, because the naive Method 1 can't run over that many rows in one go and would stop. Is that right?