Lecture 10: Open Addressing, Cryptographic Hashing

  • Published Dec 1, 2024

COMMENTS • 103

  • @ozzyfromspace
    @ozzyfromspace 3 years ago +43

    "Back when I was a grad student, I got a PhD writing programs in C, never using any other structure than arrays, because I didn't like pointers."
    Ah! Finally, a man of culture ☺️🏆☮️😂

  • @sergeykholkhunov1888
    @sergeykholkhunov1888 3 years ago +34

    03:18 open addressing
    05:36 probing
    11:58 insert with probing
    16:19 search with probing
    21:55 deletion problem
    30:05 probing strategies (getting a hash function suitable for open addressing)
    30:23 linear probing
    36:09 double hashing probing
    39:06 uniform hashing assumption
    46:16 cryptographic hashing

    • @anantggwr
      @anantggwr 3 years ago

      Russia?

    • @sergeykholkhunov1888
      @sergeykholkhunov1888 3 years ago +1

      @anantggwr yes

    • @anantggwr
      @anantggwr 3 years ago

      @sergeykholkhunov1888 Wow! I have some Russian friends. Your last name is kind of similar to theirs. I guessed it right!

    • @canada4590
      @canada4590 2 years ago

      This is computer science; why isn't it taught in front of computers?

    • @_stakegambler
      @_stakegambler 2 years ago

      @canada4590 How?

  • @melvin6228
    @melvin6228 5 years ago +80

    "Back when I was a grad student, I got a PhD writing programs in C, never using any other structure than arrays, because I didn't like pointers."
    I really wonder what would be behind this.

    • @slavkochepasov8134
      @slavkochepasov8134 4 years ago +12

      Memory management and unsafe pointer typing are the main sources of coding complexity in C. Want an easy life => no pointers. That's how Java was conceived! :)

    • @MGtvMusic
      @MGtvMusic 3 years ago +2

      @slavkochepasov8134 Java is what came to mind haha

  • @anmolsharma9539
    @anmolsharma9539 4 years ago +22

    Kudos to the camera man for that drone shot of the flying pillow

  • @slavkochepasov8134
    @slavkochepasov8134 4 years ago +9

    Awesome question on delete_me and search that caught the professor off guard. 43:16 Yes, the open addressing implementation as presented in this lecture has to guarantee at least one empty slot in the table for search to work correctly. Alternatively, search can report failure once it has exhausted all m trials. A practical implementation should watch both the delete_me and the empty slot counts. Once the table becomes overpopulated with delete_me flags and low on empty slots, it has to rehash the existing keys to get back into a "balanced" state.
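
The bookkeeping described in this comment could look roughly like the sketch below. It assumes linear probing and Python's built-in `hash`; the class name `ProbingTable`, the load threshold of 1/2, and the rebuild rule "more tombstones than live keys" are illustrative choices, not something stated in the lecture.

```python
EMPTY = None
DELETED = object()  # stand-in for the lecture's delete_me flag


class ProbingTable:
    """Open-addressing set with linear probing and tombstone bookkeeping."""

    def __init__(self, m=8):
        self.slots = [EMPTY] * m
        self.live = 0        # slots holding real keys
        self.tombstones = 0  # slots holding DELETED

    def _probe(self, key):
        m = len(self.slots)
        start = hash(key) % m
        for i in range(m):           # at most m trials, so a search always ends
            yield (start + i) % m

    def search(self, key):
        for j in self._probe(key):
            if self.slots[j] is EMPTY:
                return False         # a truly empty slot ends the search
            if self.slots[j] is not DELETED and self.slots[j] == key:
                return True
        return False                 # all m trials exhausted

    def insert(self, key):
        if self.search(key):
            return
        # Keep keys plus tombstones at or below half the table, which also
        # guarantees that at least one EMPTY slot always remains.
        if 2 * (self.live + self.tombstones + 1) > len(self.slots):
            self._rebuild(2 * len(self.slots))
        for j in self._probe(key):
            if self.slots[j] is EMPTY or self.slots[j] is DELETED:
                if self.slots[j] is DELETED:
                    self.tombstones -= 1
                self.slots[j] = key
                self.live += 1
                return

    def delete(self, key):
        for j in self._probe(key):
            if self.slots[j] is EMPTY:
                return
            if self.slots[j] is not DELETED and self.slots[j] == key:
                self.slots[j] = DELETED
                self.live -= 1
                self.tombstones += 1
                if self.tombstones > self.live:
                    self._rebuild(len(self.slots))  # same size, drops tombstones
                return

    def _rebuild(self, m):
        keys = [k for k in self.slots if k is not EMPTY and k is not DELETED]
        self.slots = [EMPTY] * m
        self.live = self.tombstones = 0
        for k in keys:
            self.insert(k)
```

Because `_probe` yields at most m slots, `search` terminates even if the table somehow had no empty slot, and the rebuild keeps tombstones from accumulating indefinitely.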

  • @fgfanta
    @fgfanta 2 years ago +3

    Years later, preparing again for an interview, I am back here.

  • @satadhi
    @satadhi 7 years ago +28

    Wait, so even at MIT the people sitting at the back don't answer questions (16:05)?

  • @prydt
    @prydt 7 years ago +7

    This series is amazing!

  • @guyarbel2387
    @guyarbel2387 4 years ago +6

    Thank you student who asked for proof (kind of) 44:50

  • @caseyli5580
    @caseyli5580 6 years ago +43

    am I the only one who really likes the sound of the really big fat chalk that they use for these lectures :)

  • @TheJohnny966
    @TheJohnny966 7 years ago +7

    Skip at 43:10 for cryptographic hashing

  • @dylancutler1978
    @dylancutler1978 6 years ago +10

    I feel like the disgruntled comments on this video were people who expected this lecture to be on cryptographic hashing when really it's on linear probing.

  • @omkarchavan2259
    @omkarchavan2259 7 years ago +11

    thank you MIT.

  • @angelc4794
    @angelc4794 5 years ago +7

    I knew he wanted to fling one to the very back. Bwahahahhaha~ Always my favorite part. Gives me good feels.

  • @digama0
    @digama0 6 years ago +11

    I wonder if Erik is in the audience while Srini is teaching... The analysis at 31:55 is incorrect - linear probing is not O(log n) time (expected), it is O(1) time for totally random hashes. This is because although the largest clusters are O(log n), the average cluster length is O(1) (long clusters are *very* low probability). Erik sets us straight in ua-cam.com/video/Mf9Nn9PbGsE/v-deo.htmlm24s (which, from the timing of the classes, may have even been to the same students in the following year)!
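
The claim in this comment can be checked empirically with a small simulation. This is only a sketch: it assumes a fixed load factor of 0.5 and models a "totally random hash" with Python's `random` module, and the function name `linear_probing_stats` is made up for illustration. For reference, Knuth's classical estimate for an unsuccessful search under linear probing is about (1 + 1/(1-α)²)/2 probes, which does not depend on n.

```python
import random


def linear_probing_stats(m, alpha=0.5, trials=2000, seed=0):
    """Fill a table of m slots to load factor alpha using linear probing with
    random start slots, then return (average probes of an unsuccessful
    search, longest run of occupied slots ignoring wrap-around)."""
    rng = random.Random(seed)
    used = [False] * m
    for _ in range(int(alpha * m)):
        j = rng.randrange(m)           # models a totally random hash value
        while used[j]:
            j = (j + 1) % m            # linear probing: try the next slot
        used[j] = True

    total = 0
    for _ in range(trials):
        j = rng.randrange(m)
        probes = 1                     # counts the final probe of an empty slot
        while used[j]:                 # walk to the end of the cluster
            j = (j + 1) % m
            probes += 1
        total += probes

    longest = run = 0
    for occupied in used:
        run = run + 1 if occupied else 0
        longest = max(longest, run)
    return total / trials, longest


for m in (2**10, 2**14):
    avg, longest = linear_probing_stats(m)
    print(m, round(avg, 2), longest)
```

If the comment is right, the average should stay roughly flat as m grows while the longest cluster creeps upward.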

  • @dormistepa
    @dormistepa 7 years ago +13

    At 29:24, what would happen if I insert 496 instead of 999? My understanding is that it will be inserted at h(496,1) (since the flag is "deleteMe", the insertion is made there), and therefore 496 would be duplicated (at h(496,1) and h(496,3))!
    Is this right? I would appreciate your help.

    • @Gukslaven
      @Gukslaven 7 years ago +8

      Great question, I think you're right it's a problem. I would say the insert function should work differently - if deleteMe is found, remember it, but keep searching until you find None. Then insert at the first deleteMe, unless you find a matching key in which case overwrite. But strange to think he would make this mistake.

    • @Gukslaven
      @Gukslaven 7 years ago +2

      Just want to rephrase this question, easier for me to understand at least: we insert 586 at index 1; then insert 496, let's say h(496, 0) was index 1, so we do h(496, 1) which is index 3. And it gets inserted. Then we delete 586, replace it with DeleteMe. Then we insert 496 again, but now it gets saved in index 1 (following what the lecturer is saying that DeleteMe and None should be treated the same for inserting). So there's two copies of 496.

    • @vishalsr8935
      @vishalsr8935 7 years ago +2

      Good point. The main intent of a hash table is to replace the old record with the new one if we insert the same key again. Assume the duplicates are there: a search will return the first entry in the probe sequence, which is the most recent one. Yet I do not see how delete would work in this case, because it removes only the first entry while the duplicate is still there. Also, you do not want duplicates, especially if you need to maintain M >= N.

    • @vishalsr8935
      @vishalsr8935 7 years ago

      I thought about this for a while, and I would say rehashing is the solution. It is mentioned in the later part of this lecture.

    • @pnachtwey
      @pnachtwey 6 years ago

      I don't like chaining; it is slow due to dynamically allocating and deallocating memory. I don't like the delete-me flag either. When inserting, I would do a linear search until an empty slot is found. Then I would put the difference between the original index and the new index into a field of the original key's structure. Now, when the original key is removed, I would check whether there is an offset other than 0. 0 means there were no duplicates. If it is non-zero, I would add this offset to the original index to find where the data for the duplicate key was stored.
      If you don't like incrementally searching for the next empty slot, you can step by any number you want. When a key is deleted, I would also consider moving the data from the last structure back to the slot of the key that was deleted and marking the last structure as empty. In practice I have had very few collisions because I make my hash tables BIG. Memory is cheap! Just don't let it get paged out to disk.
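
The fix proposed earlier in this thread (remember the first DeleteMe slot, keep probing until an empty slot or a matching key turns up) might be sketched as follows. The probe sequences in `SEQ` are invented to reproduce the 586/496 scenario from the thread with a table of 5 slots; they are not the lecture's hash function.

```python
EMPTY = None
DELETED = object()   # stand-in for the DeleteMe flag


def insert(table, key, probe):
    """Insert key unless it is already present; reuse the first DeleteMe slot.

    probe(key, i) returns the slot for the i-th trial, i = 0 .. m-1.
    Returns the slot where the key ends up.
    """
    first_free = None
    for i in range(len(table)):
        j = probe(key, i)
        if table[j] is EMPTY:
            if first_free is None:
                first_free = j
            break                      # the key cannot be further along
        if table[j] is DELETED:
            if first_free is None:
                first_free = j         # remember it, but keep looking
        elif table[j] == key:
            return j                   # already stored: no duplicate
    if first_free is None:
        raise RuntimeError("table is full")
    table[first_free] = key
    return first_free


# Hypothetical probe sequences: 496 collides with 586 at slot 1, then goes to 3.
SEQ = {586: [1, 0, 2, 3, 4], 496: [1, 3, 0, 2, 4]}
probe = lambda k, i: SEQ[k][i]

table = [EMPTY] * 5
insert(table, 586, probe)      # lands in slot 1
insert(table, 496, probe)      # slot 1 is taken, lands in slot 3
table[1] = DELETED             # delete 586
slot = insert(table, 496, probe)
print(slot, table.count(496))  # prints "3 1": still a single copy of 496
```

The cost of this variant is that an insert may probe past a DeleteMe slot it ends up reusing, which is the same work a search for that key would do anyway.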

  • @innostellarvtp4484
    @innostellarvtp4484 4 years ago +4

    For the Delete algorithm, why don't we just replace the last occurrence of the non-empty item after probing through the hash table with the slot that is being deleted? It would be more natural and consistent (although it might take more runtime).

    • @thomasisaacsini
      @thomasisaacsini 1 year ago

      I think the key you're suggesting to move (let's call it K) at the end of some chain wouldn't necessarily have been placed in the slot where you're deleting key D. K got to its slot via its own route.

  • @LordMoopCow
    @LordMoopCow 3 years ago

    This is the best professor the rest are a bunch of hippie stoners and high voice plebs

  • @mustafakarakas1116
    @mustafakarakas1116 3 years ago

    29:22 -> During an insert, stopping the loop on seeing a slot marked DeleteMe and writing the new element at that index is not correct, because it makes it possible for the same element to appear more than once in the hash table.
    What should be done is to treat DeleteMe like a normal element and continue the loop; that is, if there is a DeleteMe at (999,2), keep looking for a place by checking (999,3).
    In short: overwriting a DeleteMe element is not good practice because it may cause the same element to be added multiple times.

  • @vishalsr8935
    @vishalsr8935 7 years ago +27

    got a Ph.D. in CS without using pointers :) only care about time not space

    • @ppantg1
      @ppantg1 6 years ago +7

      Well, pointers take up more space... and both approaches can do well on time if you resize... did you mean the other way around?

  • @adamvs1
    @adamvs1 10 years ago +9

    Excellent lecture series, but I do wish the training course for the audio/visual operators would teach them that if they keep swinging the camera around, we can't read the blackboard.

    • @YashGupta-gz1me
      @YashGupta-gz1me 9 years ago +37

      Yes I agree that the lecture series are great, but I disagree with the second argument. I think the camera operators do a very good job in focussing the material that is being taught and in focus of the lecture on the video and not on the blackboard only.

    • @mitocw
      @mitocw 9 years ago +30

      Adam V-S, Yash Gupta: There are also lecture notes available for the course (besides exams and assignments with solutions) that might help you; see the course on MIT OpenCourseWare for the full materials: ocw.mit.edu/6-006F11

    • @IT__PRANJAL_BAJPAI
      @IT__PRANJAL_BAJPAI 2 years ago

      @YashGupta-gz1me Yes, Yash, absolutely right.

  • @pnachtwey
    @pnachtwey 6 years ago

    This was one of his better lectures.

  • @melvin6228
    @melvin6228 5 years ago +1

    I wonder if open addressing really uses less memory when your load factor needs to be 0.5 compared to a higher load factor when you use chaining. I think for small data sets this isn't the case, because you need to be much more aggressive with table doubling when you do open addressing.

  • @misliclc
    @misliclc 5 years ago +3

    Why would the problem with a search using open addressing only appear when one deletes an item from the table? After a couple of inserts we could have the following slot configuration as it's written at 13:45:
    0 - empty
    1 - 586
    2 - 133
    3 - empty
    4 - 204
    Wouldn't the given search algorithm fail if we search for 204 as the third element happens to be empty? It seems like it's not necessary to delete anything from the table to have a broken search. Do I understand how we traverse this structure wrong? And how do we start from the second element in the first place?

    • @stephanekamga1336
      @stephanekamga1336 5 years ago +3

      The insertions are not linear; they use a deterministic hash function which gives you the index at which to insert the item in the array. So if you use the same function both to insert and to search, you are good to go. Issues arise when you have delete operations.

    • @matiascosarinsky
      @matiascosarinsky 2 years ago

      Not really, when searching you are using the same deterministic function you used for insertion. That is what determines how you traverse the array.

  • @l0b01
    @l0b01 11 years ago +4

    At 9:00, shouldn't it be hash of k and 0 to hash of k and m - 1 rather than hash of k and 1 until hash of k and m - 1? (ASCII art detection algo won't let me write function calls)

  • @宋一小
    @宋一小 1 year ago

    What is the hashing function the professor is using around 14:00? How does key = 133 get inserted into slot 2?

    • @tinywhale3954
      @tinywhale3954 1 year ago

      It's not a "real" hashing function, just one he is making up in his head for the sake of the example.

  • @meghnasingh9941
    @meghnasingh9941 6 years ago

    What is the time complexity of a successful search in an open-addressing hash table with uniform hashing, given that x is the load factor with x = n/m, where n is the number of elements and m is the total number of slots in the array?

  • @noguide
    @noguide 6 years ago +1

    Did he almost disclose his password at 47:12? May not be a problem, because there would still be |first daughter's name length|! possibilities (if all letters are unique), but still narrows the search space down a bit, and you even already have the password's length (if you know the name, that is, not that I care, I just gave a quick thought to this as an exercise, the name is just a variable for me). Would a. rainbow table help when you have a hint like this, if you somehow got hold of /etc/passwd? Maybe still not an issue with a salt? I insist, all these questions are purely academic, but I guess that it shows how careful you have to be these days. I know that he is an MIT professor, and must know what he is doing, but I would like to prove that he hasn't given anything away, as he claimed. Maybe I will come back to this after learning some cryptography, which was already in the pipeline.

    • @melvin6228
      @melvin6228 5 years ago

      I clicked on the report YouTube user button because you wrote this comment (edit: I now realize it might have been a better idea to report this comment directly, but the reasons listed aren't related to your comment). I was hoping I could write an explanation, but unfortunately I couldn't, and I do think an explanation is helpful (which is why I'm replying to your comment, so that YouTube mods have an easier time finding out why I reported you). My intention is that this comment is analyzed by a mod and that a mod will give you feedback on whether such a comment is helpful for the community or not, that's it.
      I think your comment has an interesting insight, but IMO your comment does more harm than good. Unfortunately, these types of open security discussions on the internet always have a weird trade-off. However, as the comment specifically targets one person, I think it's more harmful than helpful.
      If you want to discuss techniques on how to hack someone's password, IMO this is not the place. It's not for me to decide, but I do hope that YouTube mods will check whether this type of security speculation targeted at one person is a good thing.
      Final thing: it is clear that you don't intend any harm by your comment, you're just looking for an interesting discussion. But I do think there should be a clear line about what's acceptable and what isn't and I think your comment might be crossing the line due to the heterogenous nature of the audience. If you were saying this at a hacker conference, it would've been fine, but we're not at a hacker conference.
      (I'll delete this comment within 2 weeks, if the mods then haven't seen the connection as to why I clicked on the report button, it's on them)

  • @whoareyou1694
    @whoareyou1694 3 years ago +1

    Creative license or not I need access to this material any recommendations?

    • @mitocw
      @mitocw 3 years ago +3

      All the materials that we have available are at: ocw.mit.edu/6-006F11. Best wishes on your studies!

  • @firefly_benotx
    @firefly_benotx 7 years ago

    How many times will I be probing in open addressing? It says to keep searching until we encounter k or find an empty slot. But if there is no empty slot, where should we stop the search, given that we only know where we started?

    • @ujan754
      @ujan754 6 years ago +1

      If you don't have an empty slot it obviously means that the hash table is full, provided that your probe sequence is a permutation of all the available slots. In that case once the sequence is exhausted you can conclude that the key isn't present.

  • @rstark
    @rstark 2 years ago

    Awesome!

  • @Logan1selva
    @Logan1selva 4 years ago

    13:42 How does it happen to be 4? I don't understand... is 4 his desired slot for inserting 496?

    • @Logan1selva
      @Logan1selva 4 years ago

      14:13 Is this random insertion? If it is, then why do you look at slot 4 to insert 496 on the first attempt? And how do you then look all the way up at slot 1 on attempt 2? How does he specify h(496,3) = 3 even before inserting 496 at slot 3? Is slot 3 third on his list of desired slots? 😵

    • @Logan1selva
      @Logan1selva 4 years ago

      Can anyone clear up this doubt, please? I apologise if these questions sound dumb 🙃

    • @davidjiang7929
      @davidjiang7929 4 years ago

      Remember, space is limited. The results of hashing are distributed uniformly across the whole array, so there is a chance that another key already sits in the same slot. In that case, you go to hash value #2.

    • @vigneshsubramanian8040
      @vigneshsubramanian8040 3 years ago

      It's just an example to explain how probing works.
      The hash function can generate a probe sequence for a given key. So basically, h(k, 1) means "return the first value in the probe sequence"; if that slot is taken, you go to h(k, 2), which returns the 2nd value in the probe sequence.
      So in this example, h(496, 1) returns 4.
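
The idea that each key owns a fixed "wish list" of slots, tried in order, can be sketched like this. The seeded shuffle in `probe_sequence` is only a stand-in that mimics a key-dependent permutation; it is not the hash function used on the board, so the concrete slot numbers will differ from the lecture's 4, 1, 3.

```python
import random


def probe_sequence(key, m):
    """Return a key-dependent permutation of the slots 0..m-1.

    The first entry plays the role of h(key, 1), the second of h(key, 2), ...
    """
    rng = random.Random(key)       # same key -> same sequence, every time
    order = list(range(m))
    rng.shuffle(order)
    return order


def insert(table, key):
    """Place key in the first free slot of its probe sequence; return the trial number."""
    for trial, slot in enumerate(probe_sequence(key, len(table)), start=1):
        if table[slot] is None:
            table[slot] = key
            return trial
    raise RuntimeError("table is full")


def search(table, key):
    """Follow the same sequence that insert used; stop at an empty slot."""
    for slot in probe_sequence(key, len(table)):
        if table[slot] is None:
            return None
        if table[slot] == key:
            return slot
    return None


table = [None] * 5
for k in (586, 133, 204, 496):
    insert(table, k)
print(table, search(table, 496))
```

Because the sequence depends only on the key, search retraces exactly the slots insert looked at, which is why nothing about the order needs to be stored.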

  • @ibgib
    @ibgib 2 years ago

    47:13 - I thought this guy is cuckoo giving away hints to his Password. But giving him the benefit of the doubt, I'd say he's being clever (and I applaud him for it) in leading up to 1-way hash functions where you can have publicly available information (the fact it's related to his daughter's name ~ `h(x)`) without revealing the secret (his password ~ `x`).
    I thought his teaching was too good not to praise his finesse here.
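
The one-way idea in this comment, publishing `h(x)` without revealing `x`, can be illustrated with a short sketch. It assumes SHA-256 from Python's `hashlib` as the stand-in for `h`, and the secret string is made up. Real password storage would normally add a per-user salt and a deliberately slow function; this shows only the one-way check.

```python
import hashlib


def h(x: str) -> str:
    """One-way function: easy to compute, believed infeasible to invert."""
    return hashlib.sha256(x.encode("utf-8")).hexdigest()


stored = h("made-up secret")       # only the digest is kept on the server


def check(attempt: str) -> bool:
    """Verify a guess by hashing it and comparing digests."""
    return h(attempt) == stored


print(check("made-up secret"), check("wrong guess"))
```

Anyone may see `stored`, yet the only known general way to get back to `x` is to guess candidates and hash them, which is why low-entropy secrets remain risky even behind a good hash.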

  • @tanvishinde805
    @tanvishinde805 3 years ago

    What does "h2(k) is relatively prime to m" mean?

    • @tanvishinde805
      @tanvishinde805 3 years ago +2

      Got it.. "Two integers are relatively prime when there are no common factors other than 1. This means that no other integer could divide both numbers evenly. Two integers a,b are called relatively prime to each other if gcd(a,b)=1 .
      For example, 7 and 20 are relatively prime." Ref: math.libretexts.org/Courses/Mount_Royal_University/MATH_2150%3A_Higher_Arithmetic/4%3A_Greatest_Common_Divisor_least_common_multiple_and_Euclidean_Algorithm/4.4%3A_Relatively_Prime_numbers#:~:text=Definition,and%2020%20are%20relatively%20prime.

    • @dharmiknaik1772
      @dharmiknaik1772 2 years ago

      They do not have any common factor other than 1.
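
Why the condition matters for double hashing can be seen in a small experiment, assuming the usual probe formula h(k, i) = (h1(k) + i·h2(k)) mod m. The concrete values 3, 5, 6 and m = 8 below are arbitrary illustrations.

```python
from math import gcd


def double_hash_probes(h1, h2, m):
    """Slots visited by h(k, i) = (h1 + i*h2) mod m for i = 0 .. m-1."""
    return [(h1 + i * h2) % m for i in range(m)]


m = 8
good = double_hash_probes(3, 5, m)   # gcd(5, 8) == 1
bad = double_hash_probes(3, 6, m)    # gcd(6, 8) == 2

print(good, len(set(good)))  # [3, 0, 5, 2, 7, 4, 1, 6] 8 -> every slot is reached
print(bad, len(set(bad)))    # [3, 1, 7, 5, 3, 1, 7, 5] 4 -> half the table is never probed
```

When gcd(h2, m) = 1 the sequence is a permutation of all m slots; otherwise it cycles through only m / gcd(h2, m) of them, so an insert could fail even though empty slots exist. Choosing m as a power of two and forcing h2(k) to be odd is one common way to satisfy the condition.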

  • @yawofori-addae3888
    @yawofori-addae3888 6 years ago +1

    Why does search(496) fail?

    • @hoanghungpham473
      @hoanghungpham473 5 years ago +1

      Because it found an empty slot, which results in a failure.

    • @akhilr94
      @akhilr94 2 years ago

      Because it's mentioned earlier that h(496,2) = 1 where 586 already exists (now deleted). The empty clause in search(k) will hold true there and search fails.

  • @hakanahlstrom8310
    @hakanahlstrom8310 7 years ago

    I don't understand the professor's answer to the question at 44:24.
    (m-n)/m is easy to understand, but I don't understand the transformation from that to 1-alpha and then to 1 over p. :S

    • @ppantg1
      @ppantg1 6 years ago +5

      1/p = the number of trials (or probes, in this case) required to get a hit (an insert or search, in this case). For example, if p = 0.5 you need 1/0.5 = 2 trials on average (think of flipping a coin). If p = 0.1 then you need 1/0.1 = 10 trials on average for a success, since each trial has a 0.1 (one tenth) chance of a hit.
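
The two steps asked about in this thread can be written out as follows. This assumes, as a simplification, that every probe independently lands on an empty slot with the same probability p, with α = n/m denoting the load factor.

```latex
p = \frac{m-n}{m} = 1 - \frac{n}{m} = 1 - \alpha,
\qquad
\mathbb{E}[\text{trials}]
  = \sum_{i=1}^{\infty} i \, p \, (1-p)^{i-1}
  = \frac{1}{p}
  = \frac{1}{1-\alpha}.
```

The sum is the mean of a geometric distribution: the probability that the first success happens on trial i is p(1-p)^{i-1}. For α = 0.5 this gives 2 expected probes, and for α = 0.9 it gives 10.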

  • @furkanefezerenuz7445
    @furkanefezerenuz7445 6 months ago

    is it normal that I understood nothing?

  • @PamirTea
    @PamirTea 7 years ago

    Great lecture.

  • @pratik_shrestha
    @pratik_shrestha 4 years ago

    21:24
    Is that Mark Zuckerberg, second from the bottom-left corner?

    • @tungo7941
      @tungo7941 4 years ago

      I don't think so. Mark went to Harvard.

  • @neuron8186
    @neuron8186 3 years ago

    Views are inversely proportional to the index number of the video. People are not tough, huh!

  • @JoeR14
    @JoeR14 11 years ago +1

    Hahaha he should have thrown a flick... Not a bad toss though. Good lecture too.

  • @Logan1selva
    @Logan1selva 4 years ago

    37:08 What is i?

  • @davidroonie1336
    @davidroonie1336 4 years ago

    What does he give to the students?

    • @mitocw
      @mitocw 4 years ago +1

      Seat cushions XD ua-cam.com/video/HtSuA80QTyo/v-deo.html

    • @IT__PRANJAL_BAJPAI
      @IT__PRANJAL_BAJPAI 2 years ago

      @mitocw Glad you still answer students!

  • @monk_learn
    @monk_learn 3 years ago

    Since 2013 this video has only 850 likes?

  • @emiliamorgan5052
    @emiliamorgan5052 1 year ago

    What's with the cushions though, lol

    • @mitocw
      @mitocw 1 year ago

      Srini Devadas, "You know this class after while is going to get boring. Right? Every class gets boring. So we, you know, try and break the monotony here a bit. And so-- And then the other thing that we realized was that these seats you're sitting on-- this is a nice classroom-- but the seats you're sitting on are kind of hard. Right? So what Eric and I did was we decided we'll help you guys out, especially the ones who are-- who are interacting with us. And we have these cushions that are 6.006 cushions." (from the first lecture)

  • @kevsingh
    @kevsingh 2 years ago

    While this professor is still great, I feel like he's all over the place compared to Erik...

  • @이효건-o4o
    @이효건-o4o 3 years ago

    13:48 OOPS moment

  • @florianwicher
    @florianwicher 6 years ago +1

    Here is a brilliant video that explains how RSA encryption works - it uses the one way trapdoor functions mentioned in the video! ua-cam.com/video/wXB-V_Keiu8/v-deo.html

  • @angladephil
    @angladephil 6 years ago

    Thanks, MIT, for these free lectures. This one was not the best, especially the first part, since in many cases you don't know N and thus can't choose M so that M is greater than N...

  • @gyanig8501
    @gyanig8501 4 years ago

    Am I the only one who watches the lecture series in 2x?

  • @rickyleung9312
    @rickyleung9312 7 years ago

    Lots of filler words make it hard for me to understand your idea, e.g. "I will give you a sense of why the statement I am going to make is true". It would be a lot better if you just said what's wrong.

  • @lockersrandom6161
    @lockersrandom6161 4 years ago +2

    Thank you MIT.