when indexes are useless | The Backend Engineering Show

Поділитися
Вставка

КОМЕНТАРІ • 45

  • @hnasr
    @hnasr  3 роки тому +5

    head to database.husseinnasser.com to get a discount coupon for my Introduction to Database Engineering course

  • @carlosdelgado5632
    @carlosdelgado5632 3 роки тому +27

    I happen to realize about this about 2 weeks ago when I created an index on a column with "status" and saw no increase of performance at all, as the matter fact it actually decreased so I guess I learned it the hard way 😂

  • @squirrel1620
    @squirrel1620 3 роки тому +17

    I was just troubleshooting a long running query and found that one of the tables being joined was missing an index. The SQL EXPLAIN showed exactly what index should be added. Added it and bam, 40s query down to 1s

  • @talhaabdurrahman9407
    @talhaabdurrahman9407 3 роки тому +11

    the way you said bahrain made me so happy

  • @emjaytripleo
    @emjaytripleo 3 роки тому +2

    Just bought your course Yesterday, looking forward to it

  • @marsilinouzaky2748
    @marsilinouzaky2748 3 роки тому +7

    Great video Hussein, hope u have a great vacation.
    I think two more ways indexes can be useless if it's on a column that changes value very very often where maintaining a rebalanced tree will become challenging and introduce a lot of overhead. Also if the rows are not that many then index might not provide a lot of benefits.

    • @mohamedmohamedy3085
      @mohamedmohamedy3085 2 роки тому

      exactly.
      one of the reasons we generate random test data, is to force the DB engine to use the index.

  • @RustinRoberts
    @RustinRoberts Рік тому

    Great video. This helped my understand indexing a bit deeper. Thank you

  • @sirajul-anik
    @sirajul-anik 2 роки тому

    on point 2, i had figured out this one a few months ago. I was totally shocked.

  • @noorulamin2652
    @noorulamin2652 2 роки тому +1

    explained very well thank you Hussein Nasir for such awesome content we really appreciate your content & hard work keep it up 👌

  • @bashardlaleh2110
    @bashardlaleh2110 3 роки тому +1

    The composite index thing is really interesting

  • @lakhveerchahal
    @lakhveerchahal 3 роки тому +3

    I'm not sure of this but this might be the case (it can be wrong) -
    When u only do
    select * from employees where condition;
    This * can cause the db to do heap scan instead of index scan because it anyways need to fetch every column from heap, so to skip two hops it goes to table scan in the first place.
    Please correct me if it's wrong.

    • @ili101
      @ili101 3 роки тому +2

      Doesn't sound logical to me. If you have million rows and you do "SELECT * WHERE id = 42069" I don't see way it will scan million rows instead of using the index then getting this 1 row from the heap. From what I know select * is only bad for network traffic and CPU/RAM/IO for fetching/formatting the output. Except probably if you select *only* from the indexes then you don't need the heap at all and skip this extra step?

    • @marsilinouzaky2748
      @marsilinouzaky2748 3 роки тому +1

      The index doesn't store all columns values so it doesn't really matter what columns you're looking for, instead what matters in this case is the where clause column, the DB will scan this index based on the where column, get the primary key, then it has access to all columns related to this primary key, so specifying one column or * won't make much of a difference if you have an index that will be used to filter your rows.
      Someone can correct me if I'm wrong :)

    • @lakhveerchahal
      @lakhveerchahal 3 роки тому

      Yes I guess, it'll still use the index scan 99% of the time, it's just bad for the I/O.

    • @mohamedmohamedy3085
      @mohamedmohamedy3085 2 роки тому

      selecting only happens after finding the rows.

  • @VivekYadav-ds8oz
    @VivekYadav-ds8oz 3 роки тому +3

    I had questions regarding the propagation of query from backend to database - why do we introduce so much overhead in this part?
    We use textual commands and pass it to the database for lexing *for every single query* (queries get cached but this only helps if query is exactly the same), then parsing, then actually doing its actual job. But since most of the query is programmatically generated, why don't we have a binary format for this, where we just send a blob C-style structure over the network. It would have all the information database needs without needing to filter through the human junk.
    Also, I am not an engineer (yet!) so no real world experience, but do databases run on different machines than the server? For all examples and usecases encountered by me, they've always been on the same machine, so using the network (especially TCP for localhost networking!) seems unnecessarily expensive. There should be an alternative method communication that doesn't require network but uses some Unix-y message passing of some sorts.

    • @rishabhanand4270
      @rishabhanand4270 3 роки тому +8

      yes, databases are best kept away from your server. This way it's decoupled and you can scale each other independently. A server should do exactly what it's name suggests: serve your clients. Whatever requests your client has, it should note it down and send it to the "kitchen" where "food" is prepped, cooked and sent to the clients. Sometimes "food" is also cached :).
      Coming to the network problem: it's not always a full TCP connection establishment, there is a connection pool from your server to the database which is almost always active and your server can just use these existing connections to quickly send over the raw query. Yes, there is still a little overhead but it's still better than keeping your database on the server itself when we talk about scale.
      Finally, we don't send "parsed" queries over the network because then you are parsing it on your Server. As explained by Hussein in this video, the parsing is dynamic, i.e, the database looks at a query and comes up with the most efficient plan based on statistics which it stores in the database itself. So you're loosing all the good database optimizations if you just send the parsed query directly to the database. Also, if tomorrow the database changes it's parser (maybe an upgrade), you will have to upgrade all your server instances to have the same version of the parser. It's a headache nobody wants. Let the database do it's job of caching, looking at your query to plan and executing your query.

    • @bepamungkas
      @bepamungkas 3 роки тому

      To beat the server query planner, you'll need actual knowledge of the data. AFAIK, only SQLite allows you to do that by embedding the entire logic into its driver; making it a classic serverless RDBMS.
      note: Of course there is cloud serverless db like cosmos or aurora, but those are a bit more obscure in implementation due to several abstraction layer. SQLite use simplified model where db interface is treated as part of the application itself.
      Regarding your last question: Yes. It is advisable to separate db with backend due to predictability factor. Roundtrip latency is generally an acceptable cost compared to resource exhaustion during peak load (since on single-machine scenario, every additional load will be multiplied by at least two: backend processing and db access). And latency is usually much more predictable compared to load.
      Having said that, for localhost you could use UNIX socket- or named pipes on Windows- to avoid TCP "overhead". However, on obscure edge cases where you need do bug hunting, TCP-based connection generally easier to debug. And using TCP from the get go means you're not locked to single scenario, or worse, made wrong assumptions due to difference in system behavior.

  • @Gabriel-sc3mg
    @Gabriel-sc3mg 9 місяців тому

    and how does indexes works when we make a join by a PK it will search the results by Index and then search in the other table? great video btw

  • @shiskeyoffles
    @shiskeyoffles 3 роки тому +1

    Similar situation!
    Saving in watch later

  • @siya.abc123
    @siya.abc123 3 роки тому +1

    Enjoy your vacation bro, we will all be here when you return.

  • @rishiagrawal4260
    @rishiagrawal4260 3 роки тому +1

    can you please make a video on HLS ? and more importantly how you explore/read untouched technology ?

    • @hnasr
      @hnasr  3 роки тому +1

      I think this will help ua-cam.com/video/1-KmLc0c2sk/v-deo.html

  • @marawanlotfy1477
    @marawanlotfy1477 3 роки тому +1

    hi hussein
    love from egypt man ❤️

  • @mihaidorobantu8574
    @mihaidorobantu8574 2 роки тому

    Regarding the first name case - can we not just create an index on a normalized version of the first name table?

  • @HSBTechYT
    @HSBTechYT 2 роки тому

    I had a question about composite indexes in nosql/mongo
    How does mongo know what's left and what's right ?
    And let's say I have a index for userid property but _id (inbuilt) is also indexed right ?
    So will this affect performance?

  • @wilsonovasea
    @wilsonovasea 3 роки тому

    good video.. even better to display with 1.5 speed

  • @henrydesousa8264
    @henrydesousa8264 3 роки тому

    Regards from Costa Rica.

  • @johnnychang3456
    @johnnychang3456 2 роки тому +1

    Another one is when you do a LIKE query with percent sign on the left of your where clause. E.g, select sth from xx where xxx LIKE %q%
    Since % means all possibilities, the database has no way to apply any index scan.

    • @mohamedmohamedy3085
      @mohamedmohamedy3085 2 роки тому

      also, iLike with/without the leading % will cause a full table scan.

  • @emptymeta5849
    @emptymeta5849 3 роки тому

    Yo, big fan of your outage content.
    Could we get some content on the recent roblox haloween outage?
    Have you heard anything about that?
    Edit: Didn't see the part about vacation, hopefully we get some good outage content when hes back

  • @dixztube
    @dixztube 3 роки тому

    Trying to learn kotlin by just reading the documentation. Ugggggh. I like JavaScript but need to leave my comfort zone.
    Hope your family is doing well. Thanks for a great channel !

  • @jasdeepsinghgrover2470
    @jasdeepsinghgrover2470 3 роки тому

    When the index is implied but not mentioned.. Like object type(having index) is car and we are searching for BMW but we never mentioned object type to be car in the query.

  • @ahmedtawil7705
    @ahmedtawil7705 3 роки тому

    How you make it ?!!
    really when i face a problem , i just find that you talking about in newly videos!! 😂😂
    Thank you Hussein

    • @hnasr
      @hnasr  3 роки тому

      I read minds 😍

  • @kooshasangari2555
    @kooshasangari2555 3 роки тому

    ⭐ Thanks Man, You are #1 ⭐

  • @hitmusicworldwide
    @hitmusicworldwide 3 роки тому

    Hey! Let us know how life is back home ! That sounds fantastic!

  • @erlangparasu6339
    @erlangparasu6339 3 роки тому

    please give sql code or benchmark result when it useless

  • @rishavagarwal6531
    @rishavagarwal6531 3 роки тому

    Thank you sir for the video :)

  • @Vijay-Yarramsetty
    @Vijay-Yarramsetty 2 роки тому

    commenting for the sake of UA-cam algorithm, to help him monetize from the video.

  • @mughees52
    @mughees52 3 роки тому

    Welcome back to bahrain in Advance

  • @rahulbera454
    @rahulbera454 2 роки тому

    500th like ❤️

  • @electronlibre4163
    @electronlibre4163 3 роки тому

    Don't we say indices instead of indexes? 😁