Thanks Gautam for making the lectures publicly available. Appreciate it!
Examples of Failed Attempts at Data Anonymization
1. NYC Taxi and Limo Commission - Chris Whong '14 (1:50)
2. Netflix Prize - Narayanan and Shmatikov '08 (17:40)
Thanks for the great lecture! One thing I want to add: the random IDs mentioned at 15:03 do not work because a random ID is, in essence, only a pseudonym. As soon as the actual relationship between a person and their ID is uncovered, all the traces recorded under that same ID cease to be protected. That is why it does not work and cannot solve the problem of data linkage attacks.
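To make the pseudonym point concrete, here is a minimal sketch. The trip data, names, and the `link` helper are all made up for illustration and are not taken from the lecture or the NYC taxi release. It replaces each name with a fixed random ID, then shows that knowing a single trip is enough to pull out every other trip stored under that ID.

```python
import secrets

# Hypothetical toy data: (person, pickup, dropoff). None of this is real.
raw_trips = [
    ("alice", "5th Ave", "JFK"),
    ("alice", "Home St", "Clinic Rd"),
    ("bob", "Main St", "City Hall"),
]

# "Anonymize" by swapping each name for a fixed random ID (a pseudonym).
pseudonym = {}
released = []
for person, pickup, dropoff in raw_trips:
    if person not in pseudonym:
        pseudonym[person] = secrets.token_hex(8)
    released.append((pseudonym[person], pickup, dropoff))


def link(released, known_pickup, known_dropoff):
    """Return every released trip that shares an ID with the one known trip."""
    ids = {
        rid
        for rid, pickup, dropoff in released
        if (pickup, dropoff) == (known_pickup, known_dropoff)
    }
    return [(pickup, dropoff) for rid, pickup, dropoff in released if rid in ids]


# Side information: the attacker saw Alice take one trip to the airport.
exposed = link(released, "5th Ave", "JFK")
print(exposed)  # includes the clinic trip the attacker never observed
```

The random ID hides the name but not the grouping, so one linked record exposes the whole group.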
Thanks for these lectures.
This is too good. Thank you sir!
Amazing content!
Thanks a lot for this nice work.
Great lecture! May I ask what note writing app you are using?
Thanks! I answered you on Twitter, but if anyone else is looking: Xournal++
@GautamKamath I was literally about to ask the same thing.
Also fantastic lecture! I subscribed :)
Hello, I have a question about the Netflix example. Would leaving a review on IMDb necessarily mean that the person watched the movie on Netflix? I assume millions of people watch movies on Netflix and don't leave IMDb reviews for all of the movies they watched there, so I am kinda confused about how a match could be made with high probability. Sorry if the question seems dumb.
It's not necessarily a 100% guarantee of a match. But if there's a person who has (say) 100 movies and scores which seem to match up, then this is a pretty strong indication. You don't need a perfect match; it works even with weaker correlation and noise. Check out their original paper for more details.
@GautamKamath Will check it out. Thanks for getting back to me and making the course available for everyone :)
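For anyone else wondering how an imperfect match can still be convincing, here is a rough sketch. It is a simplified stand-in and not the algorithm from the Narayanan and Shmatikov paper; the function names, the tolerance `tol`, the threshold `min_gap`, and all ratings are assumptions made up for this example. The idea is to count the movies where a public profile roughly agrees with each released record, and only claim a match when the best record clearly beats the runner-up.

```python
def match_score(public, record, tol=1):
    """Count movies present in both profiles whose ratings differ by at most tol."""
    return sum(
        1
        for movie, rating in public.items()
        if movie in record and abs(record[movie] - rating) <= tol
    )


def best_match(public, dataset, min_gap=2):
    """Return the top-scoring record ID, or None if its lead is below min_gap."""
    ranked = sorted(
        ((match_score(public, record), rid) for rid, record in dataset.items()),
        reverse=True,
    )
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][0] - ranked[1][0] < min_gap:
        return None  # too ambiguous to claim a match
    return ranked[0][1]


# Hypothetical released records (pseudonymous ID -> {movie: rating}).
dataset = {
    "u1": {"A": 5, "B": 2, "C": 4, "D": 1, "E": 3},
    "u2": {"A": 3, "B": 5, "F": 4},
    "u3": {"C": 1, "D": 5, "G": 2},
}

# Hypothetical public reviews: incomplete, slightly off, and with one movie
# ("H") that does not appear in the released data at all.
public = {"A": 4, "C": 4, "D": 2, "H": 5}

print(best_match(public, dataset))
```

The public profile covers only part of what "u1" watched and two of its ratings are off by one, yet "u1" still stands out from the other records, which is why missing reviews and noise do not prevent a match.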