Getting Sorted & Big O Notation - Computerphile
- Published 20 Jul 2024
- How well sorted is your algorithm? Choosing the right method to sort numbers has a huge effect on how quickly a computer can process a task. Alex Pinkney talks about two popular sorting algorithms and how they 'scale up.'
Follow up film "Quick Sort": • Quick Sort - Computerp...
Alex's code that generated the data for the tests:
github.com/apinkney97/Sorts
Alex's graph of all the results:
eprg.org/allplots.pdf
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. See the full list of Brady's video projects at: periodicvideos.blogspot.co.uk/...
"you'd have to be quite a good programmer to work out how to do something that runs that badly..."
Fantastic quote!
An interesting consideration is the mechanism of storage. At one defense contractor I worked at, we had machines with no real RAM to speak of, and even transient data would be stored on tape media. Because of that, a ping-pong bubble sort was always the fastest method in practice because it dealt with adjacent elements (didn't have to deal with the seek times). Because of the linear seek times, it saved time by actually doing work as the tape wound one way or the other.
To this day, he is still random sorting...
Really glad you are liking it - there are PLENTY more to come!
PS: Tell your mates about us!
>Brady
Nice that you included bogo sort!
Good video, but I'd like to challenge Alex's theory at 4:50 that smaller lists are "nearly sorted and therefore BubbleSort is faster" when he's using uniform random data. The data rather suggests a tipping point between bubble and merge somewhere around 20~40 elements, which is around when you'd run out of Level 1 CPU cache (64 bytes). That most likely explains why bubble sort is initially much faster, because doing 1000 swaps on 32 elements in L1 cache is less CPU work than a handful of recursive method calls in the JVM, each allocating 2 arrays and copying data.
It should remind everyone that Big-O is just theory ignoring how computers really work. These days cache misses dominate performance, unlike in the 90's when multiplications were slow and memory was fast.
I'd also like to point out that his approach to micro-benchmarking Java code is prone to producing incorrect data. For small input sizes he is running in interpreted mode (JIT requires 10k iterations). Also, you don't want to call `nanoTime()` in a tight loop; it's an expensive call and thus will dominate the cost for smaller input sizes. Look into JMH next time you want to benchmark in Java.
It's astounding to see how common it is for these professors or engineers to always have a Rubik's cube on their desk.
Indeed it was not me... Most of the computerphile videos are made by Sean Riley (we always say who made the film in the full video description if you want to double check)
I'll still be making a few myself and come along for the odd interview (just because I like being involved!) - but Sean is the main man and you can usually assume he is the man behind the camera/edit!
>Brady
This video would have helped me a lot back when I was in simple algorithms. You should do more videos on algorithms. Maybe Quicksort, I always thought it was beautiful how well it worked.
Brady! This channel is really turning into something great. Being an IT guy, it's really apparent how little people understand fundamental computer concepts that can really be a benefit in any vocation. I am really excited to see the networking video :D.
Please don't stop putting videos up on Computerphile!
This is my favorite channel Brady!
I can't wait for more on sorting. I know it very well but it is very nice to hear someone explain it so well in just 10 minutes.
Thanks... Sean, unlike me, knows how to use After Effects...
>Brady
Hi, we will put the links in the description when they are available - at the time of writing this, the two videos 'teased' at the end are not yet available and the annotation simply says 'coming soon' - hope that helps >Sean
So far I've liked all the Computerphile videos, but I think this is one of the better ones. I'm a 16 year old self taught programmer and I think this channel is a great way of introducing new people to computing. Great job Brady!
Thank you for making this video. This shows that the audience is really important to you guys.
Algorithm complexity is something I have wanted to learn about for a while, and this video has given me a basic look. Keep the good work up!
"He invented a tree sort that uses fewer logs."
~ cartoon in ancient Dr. Dobb's Journal
Amazing work with the animations! Makes merge sort so much easier to understand!
:D so so glad you posted the code. this got me back into some programming after a long hiatus for work! i managed to incorporate a simple pre-load system so you don't have to reprogram the app each time you want to change the range of your test XP simple enough but was still super fun :D
I really liked how long and in-depth this episode was. I noticed that most of the Computerphile episodes were pretty short.
wow, that last algorithm is so amazing :)
It will be covered soon! >Sean
Nicely explained video. Computing science concepts like Big O notation and sorting are really interesting.
That paper they write on takes me back to my primary school days. Nice touch. Well done
That is the most mindblowing computer-related thing I've heard in a long time.
Thank you so much for this video. It really helped me understand the concepts of these algorithms.
Incredible! I feel like taking CS Algorithms again but with more fun since it is familiar :). Definitely support other people's suggestions on algorithms videos. Among those: more sorting (quick sort, selection sort, bucket sort and comparison etc.); O notations deserve a separate topic; P=NP (probably for later videos since it is more advanced); data structures: queues, trees, stacks; graphs and graph algorithms. Overall, great channel!
Excellent video! Please continue on this route here.
Best video on this channel so far! Great job!
I appreciate the effort you are putting to explain this.
algorithms and data structures exam next friday. those sorting videos are pretty good for understanding. thanks, brady :)
This channel is so precious
I wish the title included Big O notation! I was recently looking up more information on the subject and this was a much better explanation than the rest!! =)
Done ;) >Sean
Wow! Great demonstration illustrating Big O notation
I would like to see more on programming languages, their history, pros and cons, basic abilities of each, thank you!
I'm loving these videos guys! Keep up the great work :D
I haven't been a huge fan of this channel Brady, even though I was really looking forward to it being launched. That being said, I really enjoyed this video! This is exactly the kind of stuff I would like to see!
Thumbs up from me!
The best sorting algorithm is using recursivity. The video was excellent. Your channel is getting great
nice video and explanation for those who are new to Algorithm Analysis
Hey Computerphile, Really loving the videos on this new channel, especially this one! Would you be able to do a video on recursion and its applications in algorithms, and further, how to write algorithms using recursion. I would really like to know the thought process behind how to write good recursive programs. Thanks :)
Bubble sort is the simplest one to implement, and a good introduction to algorithms in general. Also, it serves as an excellent example of how different algorithms which at a first glance might both seem more or less efficient to untrained eyes actually perform very differently.
Quicksort will be covered soon! >Sean
I don't understand why merge sort is always given an initial recursive decomposition step. You can form the initial base lists by simply collecting the elements 2 at a time with the last pair as a single item should the number of elements be odd.
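A minimal sketch of the bottom-up idea described in the comment above: the base lists are built by taking the elements two at a time (with a single leftover item when the count is odd), and neighbouring lists are then merged until one remains. The function names `merge` and `bottom_up_merge_sort` are made up for this illustration and do not come from Alex's code.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the rest of the other side is already in order.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def bottom_up_merge_sort(items):
    """Sort without a recursive split: start from pairs and merge upwards."""
    # Base lists: collect elements 2 at a time; an odd count leaves a single item.
    runs = []
    for start in range(0, len(items), 2):
        pair = list(items[start:start + 2])
        if len(pair) == 2 and pair[0] > pair[1]:
            pair.reverse()
        runs.append(pair)

    # Merge neighbouring runs until only one run is left.
    while len(runs) > 1:
        next_runs = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                next_runs.append(merge(runs[k], runs[k + 1]))
            else:
                next_runs.append(runs[k])  # odd run out, carried to the next round
        runs = next_runs

    return runs[0] if runs else []
```

For example, `bottom_up_merge_sort([5, 2, 4, 7, 1])` starts from the base lists `[2, 5]`, `[4, 7]` and `[1]` and returns `[1, 2, 4, 5, 7]`.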
Quick sort does this:
1) Choose a random pivot; the pivot is the value every other number in the list gets compared against
2) Create 2 lists, "greater" and "less"
3) Go through each number (except the pivot) in the list. If it's greater than the pivot, add it to "greater". If it's less than or equal, add it to "less"
4) Recursion! Basically, repeat 1-3 for each and every "greater", append the pivot, then append the "less" list after you apply steps 1-3. It's hard to explain, but relatively easy to code
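The steps in the comment above can be sketched roughly as follows. This is only an illustration of the list-based variant the comment describes, not the in-place quicksort most libraries use, and the name `quick_sort` is an assumption. The sketch returns the numbers in ascending order, so the sorted "less" list goes before the pivot and the sorted "greater" list after it.

```python
import random


def quick_sort(items):
    """Return a new list with the items in ascending order."""
    # A list of zero or one items is already sorted; this stops the recursion.
    if len(items) <= 1:
        return list(items)

    # Step 1: choose a random pivot.
    pivot_index = random.randrange(len(items))
    pivot = items[pivot_index]

    # Steps 2 and 3: split every other item into "less" and "greater".
    less, greater = [], []
    for index, value in enumerate(items):
        if index == pivot_index:
            continue  # the pivot itself is placed between the two lists later
        if value > pivot:
            greater.append(value)
        else:
            less.append(value)  # less than or equal to the pivot

    # Step 4: sort each part the same way and join them around the pivot.
    return quick_sort(less) + [pivot] + quick_sort(greater)
```

The pivot is random, but the result is not: `quick_sort([3, 1, 2, 3, 0])` always returns `[0, 1, 2, 3, 3]`.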
Amazing the n! example with the cards!! Never thought of it like this. I'm gonna go and look if you have a quick sort one now since you explain this so well =D
Rumour has it he is still there to this day.
Love this video and happy to see more stuff in this area :)
This is the best one so far! More videos like this!
This is the best video yet!
Just a few months ago we were doing sorting algorithms in my CS course and I was bored enough to implement 20 different sorting algorithms and benchmark them. Here's my top 8 algorithms for 1 mil elements: 1. Bit-adaptive radix sort (can do 200 mil numbers in 2 seconds on my PC) 2. Flashsort 3. Introsort 4. Mergesort 5. Quicksort 6. Shellsort (very simple to implement) 7. Heapsort 8. Smoothsort (look it up, it's really cool).
I never understood merge sort till today. Thanks!
we could compute the time to compute for each method, take the best one before sorting!! what an awesome sorting algorithm!
Thank you so much you're a life saver I was looking at the mark scheme for a question on this and I was so confused
Oh man, you did sorting without going through Quicksort! That's the most famous algorithm in computer science!!! Great video regardless, too bad this wasn't here last semester when I was taking Data Structures and Algorithms.
In fact, "Timsort", which is the built-in library sort in a bunch of languages, is a combination of merge sort and insertion sort (which is kind of a better-organized bubble sort, which does the same number of comparisons, but only O(n) swaps). It's particularly worthwhile to have great performance for lists that are either already in order or nearly in order.
there's a russian ballet that actually demonstrates sorting algorithms
For those wanting a lovely visual way of seeing sorting algorithms, the appropriately named sorting-algorithms site has them all for comparison. Its problem sizes can't get too large but it's one of the best references for various cases I know of.
If he were to explain it with pictures like he did these, it would be incredibly easy to understand. Sure, the concept of a pivot and recursion might be slightly more difficult than these concepts, but that's why we're here!
Computerphile, I subscribed before this video finished loading.
I want to see radix sort, and a discussion of how you can beat the theoretical limits if you're willing to break the rules. (If you have a limited number of values, you can sort really fast by putting the cards in piles by number and never comparing them with each other.) A lot of the biggest improvements in CS come from solving the problem you actually have to solve, rather than the general case.
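The "piles" idea from the comment above is usually called counting sort. A minimal sketch, assuming the cards are whole numbers in a known range; the function name `pile_sort` and its `lowest`/`highest` parameters are invented for this example.

```python
def pile_sort(cards, lowest, highest):
    """Sort integers known to lie in [lowest, highest] without comparing cards."""
    # One pile per possible value; each pile only records how many cards it holds.
    piles = [0] * (highest - lowest + 1)
    for card in cards:
        piles[card - lowest] += 1

    # Read the piles back in order, from the lowest value to the highest.
    result = []
    for offset, count in enumerate(piles):
        result.extend([lowest + offset] * count)
    return result
```

No card is ever compared with another card, so the n log n limit for comparison sorts does not apply: the work grows with the number of cards plus the number of piles. For instance, `pile_sort([3, 1, 2, 3, 1], 1, 3)` returns `[1, 1, 2, 3, 3]`.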
Ohhh a networking video! Now I'm really excited.
I'm new to programming so looking at the end code with knowledge of purpose and process is very helpful
Wow, this is exactly what I thought you should do next.
How does a computer "know" that 3 is a lower number than 4?
Can't wait for more sorting!
What about a 2^b^n sort? It checks what the range of values for the data type is, generates a list of random numbers equal to the size of the list, and checks if it both has the same values as the original list, and makes sure that it's sorted. It would probably be called the Coincidence Sort.
Great job man. This video was great!
Really really really really really interesting material!
Thank you for posting this, I needed to see that picture. :)
Love the animations. Very cool.
This was nice. Would've been nicer to have had this before my workshop about collections and sorting, but whatever.
Shuffle sorts are the best kind of sorts, because at least you had fun!
Thanks! >Sean
Sorting algorithms are tough to make O(n!), but there are algorithms for other tasks that are worse than O(n!) by a long shot. Things like O(n^n) have come up in my own mathematical research (case generation and resolution for a problem).
This was really interesting, I liked this over the stuff such as the "hair algorithm", although I think that you were laying down the base
Bubble sort is a zombie. It will never die, no matter how many times you try to kill it. It always comes back.
At the end of the universe, when the heat death is almost complete, and there's almost nothing left in the entire universe... someone will be there teaching bubble sort. It will never, ever, ever die. Even though it's the most horrible sorting algorithm that anybody has ever devised.
Gnome sort, as far as I can recall, is about making one step back after the switch, and it seems to make more sense than starting from the beginning.
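A small sketch of gnome sort as the comment above recalls it: after a swap, step back one position rather than restarting from the beginning. The function name `gnome_sort` is just a label for this illustration.

```python
def gnome_sort(items):
    """Return a sorted copy, stepping back one place after every swap."""
    a = list(items)
    pos = 0
    while pos < len(a):
        if pos == 0 or a[pos - 1] <= a[pos]:
            pos += 1  # this pair is in order (or we are at the start): move on
        else:
            a[pos - 1], a[pos] = a[pos], a[pos - 1]
            pos -= 1  # one step back to re-check the item just moved left
    return a
```

`gnome_sort([4, 2, 3, 1])` returns `[1, 2, 3, 4]`.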
What a great channel! Brings me back to my university days.
Btw: thumbs up for including BogoSort. There's probably only one sorting algorithm that's even crazier: Intelligent Design Sort. It says something like "Look at your data. Some higher being has decided that this is the order you need. Therefore consider it sorted" Intelligent Design Sort uses O(1) time (constant time) in all cases.
Sorting Algorithms are a thing of beauty!
It would be awesome if someone came up with English subtitles for these videos (just like numberphile!). Because English is not my mother tongue and you all got funny accents. Thanks for the channel!!
Previous videos were very good but this is the best one yet
Yes! Many databases use techniques very similar to what's used in heapsort. If you keep your heap structure you can search, add and remove in log(n) (worst case) time. Well it's not exactly the same way, but it is extremely similar and uses the same idea behind it. As a sorting algorithm it loses out against mergesort in the real world, but the idea of using a tree structure gets used all over the place.
Look up red-black trees, they're quite ingenious ;)
Makes a huge huge difference in real programming. The software I am coding has an "industry standard" to meet which is to process 1 million records of 200 fields inside 15 minutes.
15mins = 900 seconds to do 1 million, which is less than 1ms per record.
Shaving fractions of ms off of processing times is incredibly important at times.
Anyone else having problems loading this video? It only seems to be this one. Not sure what is going on. Hopefully the issue is resolved soon since I'd really like to see it.
How come merge sort at n=4 takes 53 times longer than when n=5?
They were talking about this way of sorting in the end of the video (the throwing the cards in the air part). It's called bogosort
I call Radix sort above the sorting algorithms brought up in this video. It works for pretty much all numbers that a computer can use, and also would work for characters (since those can be cast as integers).
Of course, it won't work when you can only use a method that compares two objects at once since it doesn't work in that way, but still.
It's good to see some actual computer science in computerphile. Also, quicksort rules merge sort drools!
Bubble sort doesn't rely on a list's last element being the largest. Every time you parse the list (pass over every element), the largest element within the unsorted portion (the left) is moved into the sorted portion (the right). Thus, the series is sorted by ordering values in descending order of magnitude (backwards).
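To make the comment above concrete, here is a minimal sketch that records the list after every pass, so the growth of the sorted portion on the right is visible. The function name `bubble_sort_passes` is hypothetical, and this simple version always runs every pass rather than stopping early.

```python
def bubble_sort_passes(items):
    """Bubble sort a copy of items, returning a snapshot of the list after each pass."""
    a = list(items)
    snapshots = []
    # After each pass, everything to the right of `end` is in its final place.
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
        snapshots.append(list(a))
    return snapshots
```

For `[3, 1, 4, 2]` the snapshots are `[1, 3, 2, 4]`, `[1, 2, 3, 4]` and `[1, 2, 3, 4]`: the largest value, 4, reaches the right-hand end on the first pass, then 3 settles next to it on the second.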
"n" is the number of items in the list. The overall efficiency is the function of "n" that tells you the total number of steps in the algorithm. For Bubble Sort, worst-case it has to swap every pair of items on every pass. Whenever it does, it needs to do another pass. So that's n passes and n swaps each--n^2. Merge sort splits the list in half each time, so the total number of splits is log-base-2 of n. Then it compares each element together for each pair, so that's n comparisons. So n x log n.
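One way to check the step counts described in the comment above is to count comparisons directly. The sketch below is an illustration only: both function names are invented here, and the bubble sort is a common variant that shortens each pass and stops once a pass makes no swaps.

```python
def bubble_sort_comparisons(items):
    """Return (sorted_list, comparisons) for a bubble sort with early exit."""
    a = list(items)
    comparisons = 0
    end = len(a) - 1
    swapped = True
    while swapped and end > 0:
        swapped = False
        for i in range(end):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        end -= 1  # the largest remaining item is now in place
    return a, comparisons


def merge_sort_comparisons(items):
    """Return (sorted_list, comparisons) for a top-down merge sort."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, left_count = merge_sort_comparisons(items[:mid])
    right, right_count = merge_sort_comparisons(items[mid:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons
```

On a reversed list of 64 items (the worst case for bubble sort), this bubble sort makes 64 × 63 / 2 = 2016 comparisons, while the merge sort stays under n × log2(n) = 64 × 6 = 384.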
This was awesome! Do one on quick sort too! Also do one where you answer the question "How does Google return search results from billions and billions of websites almost instantly?"
Bubble sort will always be relevant, because it's easy to remember, easy to write and relatively quick on extremely small datasets, along with having extremely low overhead.
Thanks for the link, mate.
I was just about to comment about it being called bogo sort, but he got it in! I'm happy now.
Nice video, very educative. Could you post more algorithm explanation videos like this? Thank you
A bit later on after other videos, a video on the for loop and its OO sibling foreach, and how it leads to spatial and temporal locality in memory, and how it can be unwound to be executed in parallel depending on branching with speculative execution could be interesting. There's easily 5 minutes there, and if you take your time to cover it in some depth with simple explanation you could get 15 minutes that would be interesting ;)
We looked at bubble sorting in the decision maths unit in A level Maths. The different sorting methods are quite interesting :-)
You are correct, it is generally implemented with recursion.
Exactly - it has to be a pretty specific reason. The systems I work on use them to generate formula based pricing, where one price can be based on another price, based on another price etc - the recursion moves down through the price hierarchy - it would have been very hard to achieve the same result without it. Sure can be confusing trying to work out pricing errors though.