How Binary Search Makes Computers Much, Much Faster
Вставка
- Опубліковано 27 вер 2020
- Featuring binary versus linear search, and non-clustered indexes. Uh, indices. However you want to say it. • MORE BASICS: • The Basics
Written with Sean Elliott / seanmelliott • Camera by Tomek • Graphics by William Marler wmad.co.uk
🟥 MORE FROM TOM: www.tomscott.com/
(you can find contact details and social links there too)
📰 WEEKLY NEWSLETTER with good stuff from the rest of the internet: www.tomscott.com/newsletter/
❓ LATERAL, free weekly podcast: lateralcast.com/ / lateralcast
➕ TOM SCOTT PLUS: / tomscottplus
👥 THE TECHNICAL DIFFICULTIES: / techdif
I am, as ever, extremely thankful for animator William Marler for handling all the graphics here!
epic
Pinned one month ago hahaha
Hello
is says your comment is from a month ago yet this video is 13 seconds old....?
One month ago ?! Wow...
there is actually a third kind of search:
Windows search. It takes your query, scraps it, returns 6 random database entries and 2 ads. Then it recursively shuffles the deck to look busy until you close it in frustration.
Are you talking about the search bar in Windows 10?
This is the almost the only way I launch apps/programs and how I convert currencies and timezones and translates words and lookup words definitions and lookup general info about movies/shows/actors. And it works perfectly for these cases.
I've never actually used it for searching for files, because I always know where exactly my files are 😅
Pop os
@@HassanSelim0 It works fine for sure, but there are many times where it plain sucks - So my idea why its bad it because of its index/priority system - Like it should value .exe files or important app's the most, but often I haft to spell out the entire thing before it even suggest it, or other issues. Its nothing like googles auto fill for sure, but its also based on high-machine learning and spying on us...Something MS search bar also does, but not as good as google textbox I figure.
@@HassanSelim0 it refuses to find programs for me too btw. ive tried for a very long time to launch OBS from the search bar, until i gave up and just added another desktop icon
@@HassanSelim0 How do you do "convert currencies and timezones and translates words and lookup words definitions and lookup general info about movies/shows/actors" with windows search?
Tom did what I think most people should do about terrible people in history: acknowledge their work, while making sure the world remembers they were terrible.
Well said. All this tearing down of statutes because the subject's actions don't fit today's morality has to stop.
Cry me a river hippie. Go back far enough and everyone was “terrible” based on modern weak minded standards.
@@jangamaster8677 People like you wouldn't survive a day without "modern weak minded standard"
@@jangamaster8677 so sexual harassment is a modern weak minded standard?
Omar B. Kar not exactly his point... calm down guys
For those that are still struggling with understanding binary search: Imagine telephone books not having the names in sorted order, and you have to look to ALL the names until you find the right one. That would be so unpractical. Instead, the entries are ordered by name, and you just keep skipping over a couple of pages until you overshoot, and then go back and look closer. That is the kind of magnitude we are talking about.
I understood it, but this is a good analogy.
What the heck is a telephone book? You mean ebook.
ThePotaToh yellow pages
To realize I'm this old to remember scouring names in Yellow Pages and this action helped to understand binary research even after watching Tom's video is.....
unbelievable (say it like MrWhoseTheBoss).
Jk, good analogy. Thank you!
@@ThePotaToh your too young. Think of "contacts" list
When you really think about it, the existence of search engines is something absolutely incredible that we just take for granted
Fun to build as well.
That's why the internet has been called the world's largest library with the world's worst indexing system.
The speedy secret of the google algorithm is ignoring as many search terms as possible "unless" "you" "do" "this" "forever"
Seriously! Is it just me, or has Google search massively gone downhill over the last decade? It's just broken auto-answers, AMP pages, ads, and the same dozen or so top results that come up no matter how you tweak the search query.
Leo It's probably search engine optimization catching up to current algorithms. Although a search engine's job is to get you what you want, those who want you to browse nonsense you don't want (e.g. clickbait headlines) aren't into that.
especially annoying that google ignores keywords like "not". Its really hard to search for thing like "shirt that is not red" as google just gives you pictures of tom scott.
@@real_dddf While Google does not treat "not" as a keyword it does allow you to exclude words using the - operator. A search for Shirt -red for instance will only show results where red is not mentioned.
@@real_dddf there are options for this. if you put a '-' in front of a word, it won't show results with this word. try searching for "Tom scott the park bench" and "-Tom scott the park bench" and see how the results differ (without the " ) (or in your case "shirt -red". The non-advertisement results are all not red)
Binary search for Tom Scotts' age. Results: Somewhere between 15 and 50
lmaoo
35.
@Hunter The Based God Burn
The volume called "Red Shirts", one of Tom's favourites.
Also a real book, one of my favorites I've read!
this is my new favourite pixar movie
Wait..the a113.... Nice one.
I'm temped to reply, because I like your videos and wanna be noticed for a second, and you did the same thing
okay, but thanks for stealing my comment Seamus 🙄
Note the box is from the Lux Foundation Library
Rupert You realise the graphics guy did that so people, not just you, would find it? It’s not like it’s extremely hidden, a lot of people noticed.
A week after my CS class covered the concept, Tom Scott comes around an explains it in an easier to understand format
My teacher used his video on sorting system
Haha same here, Gr 12?
so what's missing?
@@BloodSprite-tan nothing is missing per se, more that the Cs class also covered how to write the binary search and indexing and sorting to make binary search efficient.
im literally going over this rn in my cs class lmao. UA-cam knows
Dictionaries in some languages actually have a second indexing system. In Chinese for example, since the written character does not necessarily relate to its pronunciation, dictionaries often have a by-stroke index at the end. (The main index, i.e. the order characters appear in, is ordered by the latinization of the characters' pronunciations. There are multiple latinization schemes, but dictionaries will usually pick one depending on the region.) This way, if you know how a character is written but don't know how it's pronounced or latinized, there's still a way for you to find it.
Cool
Today, Hanyu Pinyin is the standard in all Sinophone countries. Older books might still use one system or another, such as Wade-Giles, Yale, etc. but today most won't stray from Pinyin
That is, unless it's Taiwan. The Zhuyin alphabet, developed specifically for Chinese under the Republic of China, is still widely used for how intuitive it is
Fun fact: Isaac Asimov had at least one book in every main section of the Dewey Decimal System.
I can’t wait to read the one on history of Australia /s
@@AlexTheGamer97 spoiler alert: a lot of get killed
Sneaking around slipping copies on the shelves when the librarians' backs were turned.
@@AlexTheGamer97 You will never find it. It was stolen ;-)
@@666Tomato666 as compared to any other colonial country?
I already know this from the education system, but I still feel like I’m learning when I watch this
I bet you dont study in a public school in the us. Just a feeling
It's a type of storytelling and science communication.
@@justaguy1182 I learned this in public school
Tomer Zaitsev No, a public school in the UK actually
@@seastilton7912 Were you perhaps from the era when they had BBC Micros in the schools?
Makes a book about how to order a library
Librarians:
where does this go then?
025 Library operations
Rick Astley is president check on my channel
Yo Dewey, we heard you wanted to sort books about how to sort books, so we put a section for books about sorting books in your method for sorting books so you can search for books about searching for books.
@@HeavyMetalMouse wasn't it yo dawg? Nevertheless, you just posted meme props to you gent sir.
Bertrand Russell has entered the chat
2:09 there's also exponential search (start at the lowest index, double it until you've passed the thing you're looking for, then perform a search (usually binary search) between where you ended and the index you looked at right before). It's used when you don't know how many items there are.
"I don't remember the title, but I know it was red"
_pulls out book called redshirts_
Hah, it's cause Tom wears red shirts.
_By John Scalzi_
Wait, I know that author, I've read his books.
OH. I READ _THAT_ BOOK.
omg I didn't even notice that! Fantastic book that. And I don't even watch Star Trek!
I dont get it
@@Vocaloid_Cevirmeni i don’t know what the exact reference is, but there’s probably a real book called redshirts by John Scalzi
@@kwibloupthesomething There is
And how binary makes Tom's videos get put in everyone's recommended much much faster
What???????
Yay!
Nothing against it...
Box A113! Can’t skip that past me!
From the Lux Foundation Library, no less
I rewound the video because I didn't notice, then when I saw I giggled
saw that too haha
0:40
Can anyone say why it's such a significant number?
Tom: “It’s a long, long way...”
My Brain: “TO BA SING SE”
Fantastic video as always Tom!
There's no war in Ba Sing Se.
they look so prettayyyy
to mukumbura!
We could always take the Secret Tunnel.
@@starlightsall Through the mountain? *(But that would get us to Omashu)
Fun fact: the ‘A113’ @ 0:40 is a reference to Disney Pixar movies. In most of the Disney movies created, this code has appeared on random objects or doors, computers, etc. ‘A113’ is the room name in which many famous Disney directors studied; many of them work on movies together today.
Or it could just be a random code that Tom wrote down
This guy is a computer scientist, philosopher, artist, musician, and incidentally a UA-camr.
+ linguist
+ game show semi-finalist
@@KoyasuNoBara The most important one
+ grumpy emoji expert
+ ex-politician
Ar-r-r!
"if it's unordered, there are no shortcuts - linear search is the best you can do"
quantum search: allow us to introduce ourselves
Well thats cheating
I guess, you could turn a search algorithm into a Quantum Annealing problem, which would make it work on a D-Wave QC. So that would mean you can handle an search of *drum roll* 2048 items! A general purpose QC would do a lot better on this problem, as it can actually work with qubits representing an index, rather than a bit mask, but the largest practical search algorithm I've seen was with 10 qubits, making it work as a search in an index of just 1024 items, which is even worse. We might have pushed it a bit further, I'm a bit out of date, but the point is, theoretical work on quantum computing greatly outpaced what we can do in practice. For a QC to replace indexed binary search and use unindexed data directly, you'd need 30-40 qubit general purpose QCs. And there are some really hard problems, both engineering and theoretical, that prevent us from pushing a qubit count that high on general purpose QC.
@@konstantinkh bro just solved quantum computing in a UA-cam comment
@@michaelba7791 Not sure I'm following. I'm explicitly pointing out why practical quantum computing we have isn't adequate. And my background is quantum theory, albeit, in particle physics rather than computing. Though, I have been out of academia for nearly a decade now.
@@konstantinkh they were being sarcastic because they hardly understand what you're saying, most likely
The day Tom changes t-shirt will be the day the end is near
Lmao
"How bad do you have to be to be kicked out in 1905?" Is literally so funny I had to pause the video.
Great to see Tom filming in the real computer history centre, after so many months.
The animation and talk are both enlightening. Thank you for this great video Tom!
0:19 The attention to detail here is incredible. Besides the obvious joke about Tom’s red shirts, John Scalzi is an actual author that wrote a book called _Redshirts_
The secret sauce to Google indexing is the same as Dewey's system : inherent built-in biases liberally sprinkled with revolutionary thinking.
Using GaGa for a search is like doing "research" using the yellow pages. The best plumber must be "AAAAAAAA111 PLUMBing'
JR
To be fair, Google is amazing how it does things. However, a lot of the searches will be cached and returned for similar searches, and they have thousands of servers, so people access different ones. Still amazing tho given the sheer scale of it all!
I can't imagine they just make it snappier by some engineer saying "ah stick an index on that column it'll be better" :D
wrong
Now that Arthur Library Card song makes sense to me finally I know who Dewey is!
DW: WHO IS DEWEY
Tom: Terrible person. Absolutely awful.
I like how this animator pays homage to A113
Sjoerd Quaak And Doctor Who series 4 on the same box
My teacher: Talks about anything
Me: Bored
Tom: Talks about anything
Me: interested
Even though I'm a software developer and watch these videos more for entertainment rather than for learning, I must say it really helps to be reminded why we do such things and what can be achieved if we think about implementing stuff like a search and how it can be optimized rather than to say "ahh there's a library for it, bubble sort? meh, will do"
“Box A113”, isn’t that some Pixar meme?
Close, it's an inside joke for students of CalArts, of which many went on to work for Pixar
Also, the Lux Foundation Library where the box is from is a prominent location in Doctor Who, series 4
Timestamp please
@@joeybaseball7352 6:52
﴾ ﴿ I’m guessing that’s a joke, since the vid is only 6:51 long 🙂
I love the ping-ponging that happens with this search. The search time actually has a half-life
I worked part-time at a library before and during when everything was digitized so talking about making/sorting/editing index card stacks brought back a lot of amusing memories of finding forgotten cards for books that came off the shelves decades before, or a card for each printing of a particular book that's been replaced with new editions a few times.
Very occasionally we'd find a 'ghost book', a book on the shelf that had no index cards whatever, despite being checked out multiple times by readers.
00:05 The dark blue book in the middle says "How to Hide Yourself in a Tom Scott Video - William Marler" !
And the big orange one is a Hitchhiker's Guide to the Galaxy reference.
He's the animator on this video. But there's more fun to be seen. Books by various UA-camrs with links to Tom are mentioned: Hannah Fry, Matt Parker, Hannah Witton, Steve Mould.
I love details like this hidden in backgrounds.
1:41, a fear that many people have is to be forgotten after they die. Its fair to say that within 100 years most people are forgotten and the ones that aren't are often those that we remember as a stark reminder not to be them.
My personal favorite index for book orders is how a NY library organizes by height just so they can fit all their books on shelf's.
That can go hand in hand with Dewey order, though: arrange bookcases in Dewey order, and arrange the shelves in each case according to how tall the books are in that case's subrange of Dewey numbers.
This sounds like a dumb system until you're actually faced with trying to put a book on a shelf that is too short. Then, suddenly, sorting by size is the only system that makes sense.
Most libraries I've seen have a section for oversized books. The rest of the books are then ordered by some sort of numbered classification system.
@@chaosordeal294 My preservation & conservation teacher in library school once mentioned that ordering books by height is a good way to keep them from getting uneven exposure, which helps with their long term conservation.
*shelves
Bubble sorts and binary chops, brings back memories!
College is such a waste of time. Literally nobody ever has written a bubble sort, deliberately or accidentally. It's a conspiracy of the education system to prevent people from learning to program with actually interesting tasks.
Happy to hear Tom saying indices / indexes! I got a lot of comments on the cleanroom video saying I should have said vortices instead of vortexes. Same thing. Thank you Tom
I think it was Orson Bean who made the same joke I did, 40 years earlier: Shouldn't the plural of Kleenex be Kleenices?
Another optimization is starting the searching process in the background while the user types before the search button is clicked. That way irrelevant results are eliminated and total search time is abated.
2105: Tom Scott, what a guy, absolutely no whiff of scandal about him.
Me: *already knows about binary search*
Also me: I’d love to see how Tom explains it
Hey Tom, I just want to say thanks for making this series. I started the GCSE Comp Sci course this September and I feel like I'm already ahead since a lot of the things you cover are in one way or another on the course we're studying (e.g. databases or Huffman coding from what I remember), and all the other things you cover in The Basics that are not on the course still give me an insight on the how or why of computers, rather than just the what, and a lot of very interesting niches. I really appreciate the work that goes into these videos, thanks!
It was so close to being called the Truman decimal system too if that newspaper hadn't stuffed it up.
Good one!
Never heard indices explained so well. Cleared a lot up for me about the basic concept. Thanks for making this!
Hi Tom,
Please make more of "The Basics". I love to watch these and the way you explain things is phenomenal.
PS. I'd also love to see more lengthy videos and "The Advanced" series!
Thanks.
I always enjoy how smooth and to the point your animation is.
Not only is Dewey decimal an example of an index, it's an example of binary search, and also binary insertion. When books are added, they get a number in between two existing numbers according to some rule. If the new book can't be inserted because the two existing numbers are adjacent, an additional digit is tacked on instead. In this way an infinite number of books can be added without reindexing the existing collection.
Now we literally have containers of all human knowledge in gradient: from rare science data to meme and porn.
Fingerprints of humankind. Dope.
Actually the range just goes from memes to porn. Science data and everything else is somewhere in between.
@@ilyaholt8607 dont forget piles of ads everywhere
I've been programming for 10 years and 3 years professionally, but your basics videos are still super interesting and full of great stuff! Wonderful video. :)
4:02 genuinely expecting Tom to say “a cluster ****” 😂
3:02 Thought you might want to know that the captions are messed up! It says "forget about the side w nd so on," instead of "forget about the side and so on"!
Was the error you made intentional?
@@barneystinson3358 you?
@@barneystinson3358 Are you okay, Barney?
It's forget about the side we don't want and so on
You should be able to suggest an English translation by clicking on the three dots directly under the right side of the video.
As a former librarian, I actually laughed out loud at 1:30
I love of he said bad even for that time
Thanks, this explained index searching better than a year of a minor in big data management.
As always, thanks for making such a great video, Tom, keep them coming!
The computer indexes I'm used to have the value that it's indexed on and the location of where the record is (a "pointer"). This allows for immediate access to the record without having to search again on the main file.
A lot of data structures use that technique. Make a "node" of two variables, the data and a pointer to another node or an empty address. Then you can link a bunch together in different ways for sorting and searching.
Bravo! *How to compress a day of college into 7 minutes.* Wish I had this as a student. Knew the theory, but not about Dewy's biography. That was a nice touch which never came up in Computer Studies nor Information Systems, and would have been a good reason why my University used another method.
Here in the US, college and university libraries tend to use Library of Congress cataloging, which has more detailed categories and tends to work better for larger collections. Some large public library systems use it too, like the two largest systems in my area (Hennepin County Library and St Paul Public Library).
The combo of a good teacher and a skilled animator makes tutorials excellent ! Thanks to the team.
2:36 "You have to spend some time to sort them" with bubble sort or insertion sort!
I have my Computer Science teacher to thank fo those two terms.
Great video, Tom! I work with huge databases on a daily basis and was never formally trained so this helped me understand indices better. (Also, I like that you say "indices," since I'm generally alone among my coworkers, who say "indexes.")
OK, so if you code (a great tool for RDBA), build your own dataset. Then build your own index. Then a deeper one. Then run tests for lookups on all three datasets.
Dewey is a fantastic example of an uncomfortable truth. Sometimes terrible people can do great things.
Dewey is not the only example. Heisenberg is literally Nazi
I legitimately believe your short videos should be shown in a variety of subject lessons in secondary schools nationally.
They are informative, accurate, researched and extremely well explained by taking the form of a narrative.
People can really learn about the way the world around them works from your efforts.
Tom, I do not know how you find the time and effort to produce these videos but they are brilliant! Top one. Nice one. Get sorted.
0:11 Wrong! I worked in a bookstore for 9 years, whenever customers showed up not remembering any pertinent information about a book they were looking for, the color they nearly-always gave as the only bit of "useful" information they could remember was BLUE, not red!
Also, they were always dead certain of this "super important" "fact." Incidentally, the predominant color of the actual book, if found, could be just about anything... though rarely blue.
And when the mechanic asks what kind of car you have, tell them you have a blue one, with a gasoline engine.
Hi Tom! I just thought I'd mention that there is an error in the subtitles at around 3 minutes into the video, the sentence is missing a couple of words and the odd letter.
Love the sidenote on Dewey, good and important of you to say.
Thanks Tom I love these videos!
I was looking for data structures and here's tom scott with his video, its like a god sent gift!
I sort stuff like this:
If it's on the floor then leave it be
I'm literally learning about this in my data structures class right now! Thanks for the review.
When I took an advanced data structures class in college, I thought I would be learning about how to store data.
The algorithms I learned there for faster data parsing and sorting really opened my eyes.
Dewey was a bad, google is a bad.
Conclusion: indexing makes bad people.
“Was a bad”? Why are you talking like a goddamn toddler?
@@Jayfive276 sounds like you are also a bad.
@@Dunkster74 Wee aw all dhe baad
So Dewey was interested in sorting people as well. Way ahead of his time!
Tom, this is the exact subject I just had a lecture on earlier today and barely understood. Thanks for helping me with my Computer Science homework!
Bruh this guy makes some of the weirdest yet most informative videos ever dudes
getting fired due to complaints about you in 1905 is insane
Definitely.
0:33 This isn’t a Pixar movie Scott!
Tom can make any boring topic interesting
Just started doing this for my uni degree, thanks for uploading!! It’s make a ton of difference understanding Binary data search
Dewey was kicked out of the American Library Association in 1905 for wearing a t-shirt that said, "tell your ankles to stop staring at my eyes".
2 ways to search: No! There's way more, but even as the usual common search methods, there's linear, binary, and *HASH*.
just the basics
Like as someone who does coding, I wonder why I’m even watching this, but the way you explain it makes me just enjoy the ride
Sorting and search algorithms are always nice to hear about. :) When you started talking about searchability on multiple critera I couldn't help but think about the k-d tree which lets one, in "one go", perform a binary search for an entry on however many critera or dimensions desired. Quite neat. :)
So you are saying that I can turn my GT 740 into RTX 3090 using binary search. Noice!
"How to hide yourself in a Tom scott Video" well that didn't work. I found you, Wiliam marler, I found you!
Wait till Tom discovers hashmaps
Another search is to place the last used item on top, and use linear search. Frequently used items are near the top and more readily found, while rarely used items are farther down.
nice a113 reference
"Don't you know the Dewey Decimal system!?!?" - Conan the Librarian
I started 5 weeks ago with my computer science studies and got this topic (the basics of it) 3 weeks ago, but Tom can explain it way better than my teacher did
I was so engrossed in the library sorting segment at the beginning that I forgot the title and length of the video and upon seeing the intro, thinking it was the outro, went "Oh that was nice." And went to scroll down when Tom continued to speak and spooked me.
Wow, Dewey seems like a cool gu-... oh god..
Actually the last time I've been at the library the ordered the books by the color of their binding. That was so hard to find what you wanted
Scientific journals, in the pre-computer days, always had one color of their covers, for decades. So the book shelves were always stacks and stacks of red ("Journal of the ABC") followed by stacks and stacks of green ("Journal of the DEF") followed by stacks and stacks of yellow ("Journal of the GHI") followed by stacks and stacks of blue ("Journal of the JKL")...
I never fully understood indexes in a DB until now. Thank you!
Absolutely brilliant. Never heard a better explanation of the topic. You are extremely talented in explaining things!
The fastest sorting method is the one that randomly shuffles your list until it’s in order
big brain time
tbf yea its the slowest and fastest at the same time
That dewey dude sounds like a legend!
Edit: NO HES NOT NO HES NOT
Well that aged quickly.
Bruh that was fast
huh
People love unsuspecting twists, ur vids, are like that. I luv ur vids.
I would've liked this video anyways, but it gets EXTRA love for the shout-out to John Scalzi at the very beginning.