My dad claimed that whenever he had a clever idea and went to do the research, two physicists in particular had already published it, and he named me in their honor. In my opinion it's more on-brand that when picking a name those two names felt "comfortable", and it was only later that he realized why they were familiar, so he created the story after the fact.
@@ericmarcelino4381 then I've probably made it a couple of times... mostly when I have to deal with something so outdated or obscure that basically no one else has to deal with it XD (or my google-fu was insufficient)
I remember someone once telling me that "if your problem is hard enough, you eventually stop trying to compute a solution and start computing how long it takes to obtain a solution." The last time I did that I concluded that my code would run for 12 days. Mentally, I thought, "That gives me 12 days to think of a better approach." By the next morning I had thought of a different algorithm that ran successfully in about 0.1s!
XKCD - number 1205. I used that to show why I wrote a program that turned a boring & repetitive manual process that normally took two hours into a program run that took 5 minutes total (including manually entering the data into the program and grabbing the results). This process had to be done 2-3 times a month, so me taking 8 hours to intermittently write and debug was worth it. I also made it use an external parameter file so it would be easy to update over time.
@@toddkes5890 That XKCD strip comes up at least once a month at work. I maintain a CI/CD pipeline and we're constantly spitballing automation improvements that might cut down on the manual work.
@@ford9501 I feel like a lot of my coworkers violate this XKCD comic constantly. Spending weeks or months trying to optimize something that takes 5 mins of labor and only needs to be done about once a month or every other month.
@@coopergates9680 Actually that manager was decent and just smiled at the improvement. Of course, the lookup file having to be updated nearly every time meant it wasn't perfect, so that might have helped out.
At my lab we have a few computers running very inefficient Matlab code to translate .tif images into awful million-line .csv files that our machine learning collaborator needs for his AI stuff. Right now, a 1600x1600x512 volume is processed overnight and we have roughly 100 of those. It should take roughly a month for the processing to be done. Honestly, I could sit down for a month and learn C++ well enough to make it run in minutes instead, but like, I have other stuff to do and I took the executive decision that we could afford to sacrifice a workstation for that amount of time. It also helps that it gives me something to report to my PI at the weekly group meeting while my own experiments are failing lmao
@@garak55 Everything about this both horrifies me, and causes me physical pain. My first instinct was to ask why this person would want an image in csv format (and why they outsourced this process to what I'm assuming is the math department), but I fear the answer.
I guess Matt has so many projects and videos and books in various stages of completion that he doesn't care if any particular one is done a month later than it should be. I like to do things one at a time, so having to wait a month until it is done would frustrate me enough to go back and get the program to finish in minutes instead of days or months.
Yes, but you will also find 100 wrong answers to go along with it, because for each genius gracing us with their presence there are legions of Dunning-Kruger cretins
In order to become that person that's better than everyone else on the internet, it takes seeing every other single person doing that thing, and still thinking that you can do better.
@@ko-Daegu And once you get that job, you'll lose it if you waste your time pre-emptively optimizing without any business benefit. Keep that in mind too :)
I've heard of YouTubers leaving minor mistakes in their video (e.g. misspellings) to provoke people commenting corrections as an engagement hack. Matt is taking this to another level by releasing optimisable code that provokes so much engagement he even gets a whole new video out of it! Genius move 😅
@@y_fam_goeglyd the title is correct. The number he gave in the intro was supposed to be a multiple, not a percent. He said the same number again a moment later as "bunchofnumbers times faster"
@@japaneserequired6314 That was the implication lol. I would imagine writing this in machine code would not be a challenge with a reward worth the effort, but it would be interesting to witness.
@@japaneserequired6314 Because compilers (no offense) are about 600x better at writing Assembly than you'll ever be at it - even if you started to do nothing else than Assembly programming today until the end of your life. The amount of optimization a compiler bakes into Assembly code has reached such ridiculous levels of complexity (such as division by invariant multiplication and refactoring virtually any math problem as a higher order polynomial) that it's a waste of time to even try competing. Really, you could count on one hand the number of people who even play in the same ballpark as modern compilers, and those people usually design said compilers to begin with. You're better off using C or C++ and letting that translate your code into Assembly; even unoptimized C code will be orders of magnitude more efficient than any Assembly code you could ever even conceive of - all thanks to the ridiculously insane amount of optimizations a compiler implements.
@@Finkelfunk Nah. I turned compiler optimisation off and it was only about 10% faster. Admittedly, that was an older version of my code, which I have since sped up a lot by fixing memory handling, but I got a similar result from changing my check detection function (chess engine) from scanning across the entire board in one pass, to doing a pass only to find the king and then scanning radially from him.
I would argue it's the complete opposite; they aren't thinking outside the box, they're trying to think as far inside the box (aka the computer) as possible. Programmers are translators, the best answer is the one that is simplest to understand, the clearest explanation with no unnecessary fluff, and requires the least prior knowledge. That means less time trying to figure out what you were trying to say, and more time working on the problem.
Whatever you want to call it, I just want to add that translating the problem into a graph and then running one of the many highly optimized algorithms from that field is a very well-known technique. Other examples would be isomorphism and matchings. Definitely very impressive though.
@@williamrutherford553 Yes, agree. I call it "machine empathy" when you know where the bottlenecks are, what the CPU and I/O routines are having to do. Often it doesn't matter but when it does it can make a huge difference to run time.
@@andrewharrison8436 When you know what exactly is the least amount of connected processes necessary to make the most effective solution for the most complex problems. We're truly living the most exciting of times.
@@williamrutherford553 I would argue that you are arguing semantics. Having a profound understanding of the environment you work in allows creative solutions. These are certainly creative solutions and are outside the realm of what a programmer would normally do to run something as silly as this. Thus, outside the box.
One day Matt will find a halfway solution to the Riemann Hypothesis just to "give it a go", and then 3 months later a viewer will have solved it completely, just to spite him
Python is slow, but not THAT slow. There was a pure-Python solution with about 60 lines of code that took 628 ms, which I was able to reduce down to 121 ms with Numba on an old Core 2 Duo E7500 without parallelization. Usually the bottleneck is the algorithm, not the language used.
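For anyone curious, the Numba part is basically one decorator on the hot loop. A generic sketch of the technique (a made-up example of mine, not the actual 60-line solution):

    import numpy as np
    from numba import njit

    @njit
    def count_disjoint_pairs(masks):
        # Compiled to machine code on first call; the loop then runs at C-like speed.
        n = masks.shape[0]
        count = 0
        for i in range(n):
            for j in range(i + 1, n):
                if masks[i] & masks[j] == 0:  # no shared letters
                    count += 1
        return count

    masks = np.array([0b101, 0b010, 0b110], dtype=np.uint32)
    print(count_disjoint_pairs(masks))  # 1: only the first two masks are disjoint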
@@silverfire222 There's a link to a spreadsheet in the video description with a bunch of submissions. You can filter by language and sort by execution time. It has links to the source codes too.
Matt is a stand-up guy. His effort was thoroughly and effortlessly destroyed by a hundred hobbyist and full-time coders, and he turned it into a math problem to teach people. Truly a legend
competitive programmers will literally spend months improving an obscure and functionally useless program that nobody will ever need to use irl by 0.0000001 milliseconds
Just know that these solutions are actually pretty elementary for programmers and are the type of thing that would show up on an interview for someone fresh out of college.
@@namef Actual competitive programming (or at least a super common modern take on it) is more like spending anywhere between a minute and an hour on a problem (that you have anywhere from no clue about to an entirely memorized solution for) until you either give up because there's an algorithm you haven't read about yet, or get code that works and runs under the time requirement. Then forgetting about it and moving on to the next problem - basically going for quantity over quality. But this sort of incremental optimization of a single problem against other people's solutions, sort of like speedrunning, is way cooler I think tbh. If you were referring to a specific competition where this sort of quality approach occurs, please fill me in.
I'm pretty sure Desmos could do it with ease. No idea about getting a TI-80 series to do it. Most of the time would be spent reading words from another computer and compressing them into an abstract set of bits.
You would be surprised how much your recreational math helps out coders. Sometimes we just need to look at code from different angles and every little bit of novel mathematics we discover improves our flexibility. Thank you Matt.
For me, the part starting around 9:25 describing the differences in algorithms was the most interesting part. As a general rule, if you can simplify any problem to bitmasks and bitwise operations (especially and, or and xor), it will run really, really fast on practically any CPU. And it turns out it applies here, too.
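For example, a tiny Python sketch of that representation (my own toy example, not any particular submission):

    # Pack each word's letters into a 26-bit int: bit 0 = 'a', bit 1 = 'b', ...
    def mask(word):
        m = 0
        for ch in word:
            m |= 1 << (ord(ch) - ord('a'))
        return m

    # Two words share no letters exactly when the AND of their masks is zero.
    print(mask("fjord") & mask("waltz") == 0)  # True: no common letters
    print(mask("fjord") & mask("jumps") == 0)  # False: both contain 'j'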
In 2007 I was developing a service for a major multinational mobile operator. My initial algorithm took 22 minutes to run for places in downtown Athens (very high cell coverage). After a tiny improvement in the code and changing the way the data was stored in Oracle Spatial, for the same coordinates it took 800 ms. Needless to say I was very happy that day.
@@jasonb.9790 The competition tried to create something similar when the service was launched. The main competitor CosmOTE paid 700000 Euros to a Finnish company to make something similar. My problem was I couldn't load cell coverage rasters into Oracle spatial (10gR2) and I had to use vectors (shapefiles). If you do point in polygon to shapes that have 60000-100000 edges (cell coverage) the built in Oracle spatial point in polygon algorithm takes forever.
@@barongerhardt yeaaaah I've seen that as well, though the improvements are usually on the scale of 3-4 orders of magnitude (I think the biggest I've seen before was 6)... this was an improvement of 9 orders of magnitude... that's kinda crazy... though to be fair, our code and data structures generally don't start that bad
I remember years back hearing a story about someone who got a job to implement a saving system for a long-running process (a week or so) because it sometimes broke during execution - sort of like saving Matt's progress every day or so until the program had finished. Instead the programmer spent a few days optimising the application to make it run in under an hour. Making the thing run 100x faster meant the whole saving system was unnecessary, and was instead transformative for the business as they could get results so much faster. Now it wasn't running billions of times faster like in this situation, but it's still a great story that had actual impact.
@@MSheepdog 36 hrs down to 2 hrs 10 min... got the customer to sign off on releasing a held payment of $14M in 1985 :) We were happy to acknowledge receipt of that check - the CEO sent me a nice memo for my part interfacing between the customer and a college intern (I pointed out a lot of potential improvements to him)...
I remember in my first year of applied computer science, we got the task to find all prime numbers between 0 and some number. At first you write code and are so happy seeing it finding the right answers. We got that same assignment several times throughout the year. It was so fascinating seeing the improvement in coding and algorithms to get that same task done in just a fraction of the time it took you the last time.
_It doesn't matter how fast you get the wrong answer!_ /s
Sadly too many programmers are ignorant that there are 3 types of optimizations (from slowest to fastest):
* Bit-twiddling (focus on bit hacks and other clever optimizations)
* Algorithmic (focus on O(n) complexity)
* Data-Oriented Design (focus on minimizing cache misses)
Computer Science typically focuses on O(n) complexity. It is an OK place to start but a bad place to end, i.e. two algorithms can have the exact same O(n) but in practice one can be 16x slower! Andrei Alexandrescu gave an interesting talk, _Sorting Algorithms: Speed Is Found In The Minds of People - Andrei Alexandrescu - CppCon 2019_, where he invented a new sorting algorithm (!) along with showing that the standard way we measure sorting performance is incomplete (!!). We should be measuring sorting as (C(n) + M(n) + kD(n))/n - roughly: comparisons, plus moves, plus a weighted measure of how far apart successive memory accesses land.
Everyone should start with a (slow) reference version. The act of writing it helps you understand the problem. Then you can start applying optimization techniques. For example my first version of this solver took 20 minutes. The second version took 2 minutes. The third version took 2.5 seconds.
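To see the cache-miss point concretely, here's a rough demo (assumes NumPy; exact numbers vary by machine). Both functions perform the same additions, but one walks memory in storage order while the other jumps around:

    import time
    import numpy as np

    a = np.random.rand(4000, 4000)  # ~128 MB, far larger than any CPU cache

    def sum_by_rows(m):
        # Row slices are contiguous: sequential, prefetch-friendly reads.
        return sum(m[i, :].sum() for i in range(m.shape[0]))

    def sum_by_cols(m):
        # Column slices stride 4000*8 bytes between elements: far more cache misses.
        return sum(m[:, j].sum() for j in range(m.shape[1]))

    for f in (sum_by_rows, sum_by_cols):
        t = time.perf_counter()
        f(a)
        print(f.__name__, time.perf_counter() - t)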
Step 1 in my process for solving algorithmic challenges is "Does Knuth have a solution for that?". A lot of the time the answer is YES. Absolute legend.
Back in the 80's I wrote a program in BBC BASIC to allocate pupils to classes. At 'A level' there were five 'bands': p, q, r, s, t. Band p, for example, could have subjects maths, physics, chem, etc. There were about 7 subjects per band. A pupil would choose 3 or 4 subjects, and the job was then to see if there was a band that had all their choices. Ideally all 220 pupils would be accommodated. If there were too many failures then the subjects in each band could be rejigged. Also, the numbers of pupils allocated to each band should be roughly similar. I wrote the program for 8-bit BBC B micros. It worked but took a few days for each run. (BBC BASIC is an interpreted language and very slow.) Then the 32-bit Acorn Archimedes came out. I tried the program on it and it was not much faster. I then had one of the few genuine flashes of inspiration in my career, and realised that instead of subjects being represented as strings, each one could be just a single bit in a 32-bit word. I rewrote the program and tried it. It ran in 1 minute 30 seconds. Bitwise ANDing, Matt called it. I then used that technique in many other programming problems. It works!
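The same trick in modern Python, as a toy reconstruction (my own names, obviously not the original BBC BASIC):

    subjects = ["maths", "physics", "chem", "bio", "art", "history", "french"]
    bit = {name: 1 << i for i, name in enumerate(subjects)}  # one bit per subject

    def to_mask(names):
        m = 0
        for name in names:
            m |= bit[name]
        return m

    band_p = to_mask(["maths", "physics", "chem", "bio"])
    pupil = to_mask(["maths", "chem"])

    # The band accommodates the pupil iff every chosen subject's bit is set:
    print(pupil & band_p == pupil)  # True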
I think we need to switch units here. In the time the program ran light traveled less than 150 kilometers (in a vacuum), less than the length of the M25 around London
This is my favourite video you have ever released. Please do more of this type. It is super interesting; even though people like me have no coding experience, your explanation is so clear we can understand it.
This is why the open-source community is so valuable to humanity. Seriously. Great job guys! There is always a smarter geek! I seriously love this community. It isn't about being the best, it is about learning from the best -- and there is plenty of best to go around in this community. In my profession (software engineering) there is a common expression "standing on the shoulders of giants." It basically means that whenever I write an amazing piece of code, a lot of the inspiration for that code rests in the accomplishments from those who came before me. And if I go my entire life and only find one or two new novel ways of doing something, the only thing I would want is for my accomplishments to be used in future code and made even better.
IMO it's rarely about a 'smarter geek', as there are heaps of people of similar intellectual capacity involved, and the best concepts may well come from the 'stupidest' among them. What is magic is the collaboration and the speed of evolution it creates. It means our 'stupidest' - the artist/linguist types with no mathematical skills at all - can have good ideas nobody else in the 'room' has come up with and possibly never would; different perspectives can be a powerful thing. And that person's idea can be picked up and polished and evolved rapidly through the iterations of the more skilled number theory and computer science types.
Open-source works best when there is a clear specific problem with a "known" optimal goal outcome, like in this video. But open source development kind of breaks down for things like "what is the best user-interface" or "best workflow" which are more vague and up to interpretation. I mean look at something like GIMP :\
I am greatly impressed with how unprecious Matt is about his Python code (which was completely correct, just computationally expensive) and how delighted he is by the community response
"computationally expensive" - I love this phrase, sometimes maths and coding terms can be just so satisfyingly descriptive and tickle my brain 😁 Also, agree with your comment!
As a C++ guy, this makes me want to find a way to do this entirely at compile time, leading to a (runtime) speed of however long it takes to print 500 words.
I was going to say you couldn't [portably] read the input file at comptime so it wouldn't be a valid comparison, but depending on the input format maybe you could #include it directly... if it's comma separated that would totally work
Feel free to change the metric so that you measure both compile time and runtime. Then you can compete, and you will find that your solution will perform (relatively) poorly -- the preprocessor is not known for being fast.
My favorite part is how Benjamin says in the spreadsheet "Just for reference! I did not actually run this implementation." Basically couldn't be bothered, only an insane person would run this lol
I'm actually a bit curious to see how long it would have taken, if he'd run it. Matt's run was done, iirc, on an old laptop; it's quite likely that Benjamin's testing hardware is significantly faster. (It would probably still be so slow that it's not worth the bother, but an interesting thought none the less.)
It's fair enough for Matt to say his code was "good enough" when it turns out it wasn't buggy and gave the right answer; if it took 30 days to crash or give an obviously wrong result (very possible) it would be a different story. Personally when it didn't finish after 5 or 10 minutes I would have intervened haha, computers are stupendously fast these days!
Indeed. The optimizations are impressive, but the most impressive thing to me was the fact that Matt had the patience to wait a month for the computer to give him an answer😁
Thank you for the video - calculated leadership on creating improvement (PCQI). Aside from achieving great results, you are charting the path to the next step in problem solving/improvement. Hope for a good 2023 for all.
@@iantaakalla8180 Here's a little challenge where Matt should know a:

    // A 5 GHz CPU will take over 48 years to produce a result
    uint64_t a = 0, b = 1, n = 7678603279843596748 >> 1;
    while (n--) b += a += b; // {no overflow checks}
    uint64_t s[] = {a, 0};
    std::cout << s[0]; // the "<<" was eaten by the comment box; presumably this prints a
I love how reminiscent of Alan Turing's work on the Enigma this is. A bunch of people working together to try and optimize the solution to a linguistics problem through a mix of clever computer manipulation and even more clever maths. Absolutely brilliant!
@@namef But not just a competitive environment - an open and friendly competitive environment. I work in a closed competitive environment (working for a corporation competing against other corporations for market share) and much of the state of the art is stale because everyone has to first learn each other's previous mistakes and current best approaches. We even know that some of our markets are relatively safe because the cost of somebody entering it (learning all that we learned the hard way) is simply too much to do against an existing competitor. In an open, collaborative environment like this where people share their methods and learn from each other's mistakes, that's where progress is made. That's what science, academia, and even patents were originally about - bringing knowledge into the public sphere where others could work with it and learn from it.
My favorite part about this is that one of the people who made one of these improved versions is Stefan Pochmann, the inventor of one of the most widely used methods for solving a Rubik's Cube while blindfolded. You sure have quite the audience!
Ah, of course, he's one of the biggest names in the field of solving a Rubik's cube while blindfolded, something that I totally knew existed before just now.
18:50 "Very rarely will Python go into production". As a Python programmer, having worked in 3 of the most major animation studios in Europe, I can attest that Python code does go into production, and it can be just as reliable as other languages, if not even more because of its readability, making it easier to understand for the people manipulating the code (the developers). We use it to build plugins, to automate repetitive tasks, inside the software used by the artists, such as for instance opening and saving the scene in the right place in the complex folder structure. But I do understand where you're coming from. If you're working on a piece of code that needs performance, one example in our area would be the rendering engine, then you need a lower level language such as C/C++, which are indeed optimized for performance.
Very well put. For high-performance computing, the cycle is as Matt described, but the vast majority of software development is not high performance computing. Your machine learning framework might be written in Rust, C or C++, but you interact with it using Python because it's easier.
Just a thing Matt, if you ever read this, great video man! I really appreciate your passion for these things! And it's a great exercise you've sparked, and it's nice that you took the time to compile the results for us in this video. I'm saying this because my comment was a bit easy to make, I feel a bit guilty about that. Hope you don't mind too much :)
I love how I can be in one corner of YouTube where people are talking about computer programming, improving code efficiency just for fun, 30min dives into computing and mathematics… And then I can click away and watch a guy whose entire channel is dedicated to taking the longest bong rips humanly possible. It’s nice, I like this place
As a programmer by trade, I want to push back a bit on the "Python isn't for production and lower level languages are more robust" idea.
Python is a very safe language in the sense that you won't get memory leaks or segfaults, unlike C or C++. Where you do run into problems (with large applications) is type mismatches and silent failures that always plague dynamically typed languages. This is a big reason why most applications these days that need to be somewhat performant, but more importantly need to be very robust, are written in C# or Java, since those languages are both memory safe _and_ type safe.
Side note: Rust also is memory safe and type safe, with the speed of C, which is why it's caught on so widely and quickly in the last decade. The main sacrifice that Rust makes is that its compiler is very strict, so it's harder to learn and harder to program in. But that's worth it for many people in many applications!
Python is also used in production all the time for what it's supposed to be: a scripting language. Python scripts are great for setting environments up or automating file manipulation. And perhaps the biggest thing of all is that Python is a great tool as an orchestrator of highly performant C libraries. Most famously this takes place with machine learning frameworks, where you can write Python code that leverages CUDA and C libraries to do incredibly efficient and incredibly parallel computation.
Loved the video!
Python is also great as a "glue" language between APIs and databases. Think integrating multiple services between different companies, and providing new, combined products.
@@andrew_ray If you haven't tried it yet, you should check out Haskell! The type system is so great that it's often said that "if it compiles, it works".
I once inherited code that was partly a SQL stored procedure (previous programmers enjoyed using cursors wherever possible). It was fine for a few categories and detail sections on a database with only 10,000 records, looking at a regional subset of said data. Over the years (10 years later), it ballooned to something like a dozen categories and detail sections on a database of over 100,000 records in the main table (with literally millions of records elsewhere), searching ALL data instead of regional. This code took well over 24 hours to run (Me: "Whhhyyyy????? Why would you use the report in ways for which it was never intended? lol") - and would time out during peak hours.
First iteration, I got rid of the cursors and did a clean main query/subquery with temporary tables (the cursors ran the same 15-20+ table queries 13 times with 1 or 2 criteria changes); that got it down to 30 minutes (2 hours in peak hours) before removing the cursors, and maybe 5 seconds after removing them. The next bottleneck was that the detail code was being run 3,000 to 5,000 times separately (the entire render of the report page took 1-2 seconds per 10-millisecond query). "Ah ha!" (The previous reporting tool we used rendered reports a bit differently.) I then thought, "Fine! I'll just load the two queries into my compiled .NET DLL code and feed the datasource as needed!" Report run time was now 2 or 3 seconds (the shortest time it takes any report to render, regardless of querying time). To this day, that refactoring is still my favorite.
Congratulations! That's a great result. Also, congratulations on having managers/executives willing to let you spend time on that (even if it IS your actual job). Most companies would be "eh, it's annoying but it works, and it's not worth spending time/money on improving it."
Then you spend another week or two deciding how to roll out these improvements, so your 24 hrs > 2 seconds improvement happens over a 10-20 year career. E.g. before each performance review, knock another hour off the run time.
@@falconerd343 "If the piece of QWERTY does what it's supposed to, no matter how cumbersome it is to use, and how much system resources it gobbles up, it ain't broken, don't fix it."
I believe Rust/C/C++ can all reach the same runtime performance for any given problem. High-performance code in any of them is often written by consulting the generated assembly and tweaking the code until the desired assembly is reached. The difference is going to be in development time, maintainability, safety, compatibility, etc.
The task of comparing the presence of letters in different words is a perfect fit for evil bitwise hack languages like C, though. Rust's whole thing is that it's the fastest memory-safe language. Meaning that its performance is not as important a feature as its reliability even in very large codebases that took a long time to develop.
I was perusing the spreadsheet and realized that the main difference between the best two solutions (Rust vs C) was that the C one was parallelized. Which is kind of hilarious given that Rust makes parallelizing generally easier lol. Then I took a look at miniBill's Rust solution and he seems to be preparing to parallelize his code, so maybe we'll catch up!
I definitely think “gunpowdery blacksmith” is more satisfying than “showjumping veldcraft”. I also would guess that “veld” and “fjeld” are etymologically related, which makes me wonder if you can have a “vord”. EDIT: I do concede that "showjumping veldcraft" describes The Lion King.
_veld_ and _fjeld_ aren't etymologically related, with the former coming from Proto-Germanic ∗felþą, and the latter coming from Proto-Germanic ∗falisaz.
Actually that bitwise part is the exact equivalent (in an abstract sense) of your Python code - it just uses bits as a representation of a set, with ones flagging which elements of the domain belong to a given set (which works really well in this case because the domain has so few elements).
Yeah, I supposed converting a word into an alphabetized binary representation could be a hashing algorithm. It's just not secure, which isn't a problem for this application.
And it's inherently parallelized, since the CPU can perform both the bitwise operations and the zero check for 32 or even 64 bits in parallel, which is only possible when using a bitwise representation and native CPU instructions.
The fact that your viewers found a new solution to an old problem makes me wonder: Why not pose an old problem once a year, which hasn't been solved yet? A problem which you find interesting and would enjoy creating a video about. Who knows, with the kind of talent I saw today, your viewers just might answer an age old question. Perhaps ask the channel which sorts of things they'd like to learn more about that has a component which hasn't been solved yet. Great job as always, Matt; I look forward to the next one.
Just drop them as homework. "... In statistics, Dantzig solved two open problems in statistical theory, which he had mistaken for homework after arriving late to a lecture by Jerzy Neyman." en.wikipedia.org/wiki/George_Dantzig
@ that reminds me of the guitarist Buckethead hearing a tape guitar virtuoso Shawn Lane had made by splicing together snippets of audio, inspired by Conlon Nancarrow's piano roll compositions, and then very nearly learning to play it, thinking that Shawn Lane had actually played it straight.
@@nokenwari If we can crowd source a solution to racism via 'putering then you'll make a lot of people very happy - for the five seconds necessary for humanity to find something else to be bigoted about. In the meantime what about the four colour map problem?
19:09 "That's just the production life cycle of code" I can attest to this. Years ago we tried converting a decade worth of customer data to a new system using a 4GL programming language (kind of like Python). It was estimated to run at least 30 days, provided nothing went wrong. Then it was re-written in SQL to do it all on the database, which ran in about 4 hours. Still long, but obviously much better.
4:30 Don't feel bad that you can't beat experts in their own field. Pat yourself on the back because you have the ability to valiantly participate in *_every_* field.
Crowdsourcing optimization on slow code sounds like a fun competition. This is a really fascinating video. It's crazy just how fast people can make code run.
crowdsourcing and competitions to find more secure, faster, error-free code, etc. - the world could be a better place. Guess this is what they call GNU on steroids. If only it wasn't so fractured. Or at least a cycle of consolidation should be added, so 1 distro ----> 2000 distros ----> 5 distros. Hmm, sounds like the general function of neural networks...
i love how this went from a little joke included in the video to an informal contest within the community to now a lesson on benchmarking, statistical significance, and oddly specific performance optimizations this is a fantastic channel and i'm glad to be along for the ride now the real question is can we golf this? aim for smallest source size that solves it in under an hour or something?
Check out Benjamin's spreadsheet. There's a python solution there by Stefan Pochmann which is 18 lines of code and runs in a little over 2s on my system. Truly amazing stuff.
I feel the pain. I had an assignment in university to fill in each side of a cake with decorations (5 rectangles filled in as perfectly as possible with other rectangles). The code's initial runtime was something like 10 million "universe heat deaths" and I got that down to sub 1 second. Now that's an improvement.
Yup. I like to keep my notes well organized, so I type up everything I can in Notepad using 12 pt Lucida Console (Courier is acceptable too, but not my first choice).
It would be very interesting to be able to graph not only the time each program took to find the answer, but also the time it took them to write it, and see how they correlate.
Getting down to fluctuations in caching is where it gets good, Matt! Time to come full circle with an algorithm which is possibly faster, but has to run repeatedly for 1 month before thermal RNG cooperates.
This reminds me a bit of the "impossible" performance that came out of the Commodore 64 in the 80s. Due to the exceptional popularity, people started using it in ways that tweaked some crazy performance out of it. And when some started, it became a challenge for others and suddenly it did magical things.
Yes, I don't think any hardware has ever been (and continues to be to this day) pushed so far beyond its originally intended use as the C64. The demo scene is probably responsible for most of the techniques.
Matt does this so well. He plays with math(s) oddities in a way that invites and welcomes collaboration, and then brags about the accomplishments of the collaborators. This channel is a superb example of motivating people in an entirely positive way, and should be required review material for all teaching and management curricula. Obviously much of learning is by rote and through tedious repetition, but examples like these "useless" problems, presented by an enthusiastic evangelist as a cooperative opportunity, give a refreshing reprieve from the more normal "Here is a problem, now work out the preexisting correct answer and show the expected work". Thanks Matt.
I am very anti-Python but you sold it short. You will definitely find Python code in production. And there are ways to make Python safer like using mypy for static typing.
@@DadundddaD They mean it should be #1 on YouTube trending in the gaming tab because this challenge was about speed running a problem solving task. It was a joke.
I think it's interesting to see the rapid development of improved code, and how long each program took to run versus how much time had passed since the original piece of code. It's kind of similar to Moore's law. If you show the world inefficient code, they'll make it 40,000,000,000 times more efficient in 2 days. I think we'll call it "Parker's law"
Is Parker's law when you give it a go, but your solution is so slow that someone else can learn about the problem, improve the code, and run it to get a solution before your code even produces a result?
5:30 This part about having to start calculations while still importing data from the text file made me think of the wacky ways I've tried to optimize Zachtronics puzzles
It's very true. The software I write is all in Python (with some C extensions where needed for performance). A lot of it is taking large data sets (some over 2TB) from a network file server and parsing them, modelling them (ML), and producing more like them (interpolation). In many cases, it takes longer to read and write that data to the server than it does to actually parse, model, and generate new data. Pipelining and parallelizing are very important in this kind of data processing.
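A bare-bones version of the overlap trick, in case anyone hasn't tried it (toy stand-in function, not our actual pipeline):

    import time
    from concurrent.futures import ThreadPoolExecutor

    def fetch(i):
        time.sleep(0.1)  # stands in for a slow network/file read
        return i * 2     # stands in for the processing step

    # Eight reads wait on I/O at once instead of one after another,
    # so 16 items take roughly 0.2 s of waiting rather than 1.6 s.
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(fetch, range(16)))
    print(results)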
Matt! I see you wrote your own while-loop to find combinations. You should know that the standard library already has a function for that in the itertools module. I didn't know that for a long time, so I ended up writing my own general implementation to reuse whenever I needed it. Turns out most of the things we do have already been done better, but I still had a lot of fun :)
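For reference, it looks like this - e.g. every pair from a list (using the famous five words):

    from itertools import combinations

    words = ["fjord", "gucks", "nymph", "vibex", "waltz"]
    for pair in combinations(words, 2):  # every unordered pair, no manual loop
        print(pair)

    # combinations(candidates, 5) likewise yields every 5-word candidate set.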
I love your channel, I think something that could improve your visuals would be to use a fixed width font so that digits line up, it was especially noticeable in the segment where you were lining up binary numbers for the bitwise and segment. Keep making great videos! Thanks!
I saw this unfold in my recommended list and have been looking forward to your reply, and you more than delivered! Would've been interesting to see some Big O comparisons perhaps. Lovely video!
I can entirely agree with you. I always enjoyed maths in school but ended up studying social sciences/business in uni. This channel (as well as all the numberphile/all the other phile's) helped me get back into maths and start getting into programming. They really do help make it fun and remind me of the beauty and challenge of math 😁😁
Great video, but it's not true that Python rarely makes it into production (mentioned at 18:50). Python is incredibly popular in production for web backends. Very rarely is a web backend doing something that is compute limited, but they do have very complex, ever-changing business requirements. For this reason, Python is a great choice as it's just faster to write than C or C++. The primary cost of a web application is developer time, not compute resources, so most projects optimise for the former. Since Python is an easy-to-use general purpose language, lots of people learn it, so it's also quite easy to hire developers for (compared to Go or Rust for example).
Since web backends generally do highly parallel processing of very simple tasks (e.g. you want to handle 10,000 people posting a new comment), it is very easy to scale that horizontally by just having multiple servers processing requests. Generally the bottleneck in web is the database & network speeds, not compute time. So switching from Python to C++ gives minimal speed benefits except in extreme cases. E.g. imagine it takes 3 seconds to post a comment to YouTube. A fairly accurate breakdown would be: 2 seconds of that is network latency, 0.8 seconds is the database, 200ms is the Python. Sure, if you highly optimised and switched to C, you could maybe get it down to 5ms, but even though you sped up your code by 40x, you only changed the overall speed from 3s -> 2.805s, less than a 10% improvement in overall performance.
One final thing to mention is that a lot of Python is actually C under the hood using extremely optimised algorithms, so it is usually pretty easy to get very good performance if you are doing things that have already been solved. It's only when you need to do something completely unique mathematically that you run into performance issues that can't be resolved in Python.
Anyway, you make amazing content. I just wrote this up so that people don't get discouraged from learning Python; it's absolutely used in production all over the world and is a very useful and desired skill to have for web, machine learning, mathematics and almost any field of programming.
before even clicking on the video my response was "Well that's not terribly difficult. All you have to do is make the code absolutely terrible and anyone who knows anything should be able to spruce it up quite a lot" but what actually happened is actually quite impressive
And before this, making the fox and dog obsolete, was the alcoholic "Pack my box with five dozen liquor jugs." Which, all things considered, is pretty good - 10x6 bottle diameters is a pretty reasonable box shape (especially when material strength and manual handling are considered).
It's very cool to me that people like you can take papers that scientists published and explain them in this format and have them become easily available to the general public. And like this video's example, it really becomes an extension of the experiments/findings of the original paper/idea
A lot of programmers (especially/specifically web devs) tend to throw even the smallest performance improvement out the window in order to maintain """readability""". It sucks because performance optimizations are really interesting, plus code can be both efficient and readable.
@@namef I have not made any web stuff, only calculations and engineering stuff. But what I was trying to say was that I love writing the code, checking that it works, and then optimizing the code to run faster. But at some point the optimization becomes tedious and then it ceases to be enjoyable for me to write the code. Sometimes you still have to do it, if you need to run the code a lot of times or if the code takes a lot of time. But if you need to run the code once and it takes 30 s, then there is no need to do something that I do not enjoy. But if the optimization is enjoyable to me, I will still do it on code that only needs to run once and takes less than a second, just because I find it enjoyable. (cause I love optimizing code 🙂)
@@namef Surely the smallest performance improvements are the first ones that should be thrown out? Oh, you mean the smallest changes to the code, not the changes that produce the least improvement! Someone really should come up with a less ambiguous language and figure out how to get people to use it (and a way of maintaining that unambiguity through natural development of the language...)
@@namef as a web dev, I do spend my time thinking about optimization when I can, because I enjoy it. But "Easy To Change" is really more valuable to the business than microsecond speedups for the end user. As much as it pains me to say it, it is only really a good use of dev time to optimize well for known bottlenecks. (Arguably clean, simple, readable code will more or less coincide with algorithmically efficient code anyway. So far the user-facing bottlenecks I've seen have been embarrassingly sloppy to begin with.)
The second I saw your code running in 32 days I knew it was gonna get beaten down to seconds; I didn't expect the parallel bit adding leading to microseconds
It's incredible how dramatic an effect extreme (and even simple) optimizations can have on the efficiency of code and algorithms. I used to be heavily into this kind of stuff during my computer science degree (not so much since I got a job actually programming)
The quick brown fox... was actually used in teletype communications a lot. It tested both the Baudot-code paper tape reader at the sending end and the paper tape puncher at the receiving end for proper operation. The human operator would see at a glance whether the transmission was received ungarbled. Back in the day this type of communication took place over HF analog radio
Always fun to do stuff like this, in undergrad I would pretty often write a crappy "it works" version of code and then try to write a more optimized version before the first one finished. I had a pretty good track record of getting code that would have taken ~10+ hours down to more like 10 minutes. In fact I generally don't have a lot of patience for any code that's going to take more than an hour or so.
Generally that's how it is supposed to be. Micromanaging and doing performance analysis with unfinished code is impossible, so first write the code, test it, find the areas which need improvements and then improve them. This is a guilty pleasure of all beginner programmers: a lot of people will think about the performance of an operation way too much when the function or functionality doesn't require it, nor is it complete. Having an algorithm is fine, but changing it multiple times while working on it to further "improve" the performance is often a waste of time. This is not to say to go at it without a plan; however, like you said, first make a working version, then improve it. Now you can test it and actually see the difference your changes are making, which will then tell you whether or not you need to improve performance.
It's quite telling of the general competency of society that there are so many skilled mathematicians and programmers that this puzzle was crushed for no reward. It's kind of incredible what happens when there is a public leaderboard.
Thanks - that was great fun. I'm a fan of code optimization and often we found software optimization of 1 or 2 orders of magnitude was pretty easy to achieve. This story ticked me over the edge to subscribe.
Mathematicians wrote NumPy and SageMath so they apparently like Python as well. Personally I love a functional language with minimal syntax. Like I love LISP. It's a personal problem.
I would like to correct the mentioned "python is not for production" sentence, as I am an engineer who puts Python into production :), and that is perfectly fine. The key insight is: what is the goal? If you want speed in your production code, you definitely want to use C/C++/Rust, Go, perhaps Java nowadays - something that compiles and is optimized for speed. But if you want a piece of code that is not required to be the fastest, where the key point is that it must be just good enough, but which guys like Matt could even touch and put things inside, Python is perfect. It is perfect for writing tools and helper scripts that can go into production, as we have done for 15 years. I'm sure this is also why Python is so popular, not because so many people prototype things :).
Matt wrote a piece of code, ran it, waited a MONTH for the result and then said Python is not for production, while I am sure the same winning sub-second code could be rewritten in Python and would perform MUCH better than the original naive implementation. I'm sure Matt would be able to write C++ code for this that would run for a month :D. It was mostly not about the chosen languages, but about the poor choices of algorithm and data structures. I know, the same code in C++ will be much faster than the same code in Python. But not by this margin.
I was just wondering: if maths is also about finding a solution efficiently (O-notation, P/NP problems, etc...), how was this MONTH-long execution acceptable to Matt? :D. After all, he is a mathematician. And imagine, after a MONTH, it turns out your Python script contains a syntax error in the printout_result() function :). Or imagine that after 25 days of execution you spot an error in the implementation. I'm sure he tested it with shorter words though. Good video apart from this as always, enjoying the channel :).
I agree. Additionally, production code often combines several languages. For example Python is popular in machine learning, but the low-level routines in these Python libraries are often implemented in a lower-level language like C++
"but requires that guys like Matt could even touch it and put things inside, python is perfect." 🤣 "I'm sure Matt would be able to write C++ code for this that would run for a month :D" -- Ouch! rofl
"I was just wondering that if Maths is also about finding a solution efficiently..." Is it though? I was of the mind that Math generally is more concerned about clever/insightful symbol manipulation to arrive at a solution much more so than it is about said solution's computational efficiency. Your examples are in fields of math that are the closer end of the spectrum to Computer Science and are specifically about applying math _to_ the problems of determining efficiency (O-notation) and completeness (P/NP), so I don't think that they are a good indicator of the concerns of Math and mathematicians as a whole. Computer Science, on the other hand, is very much about efficiency as a core concern, since the whole point of it is to apply the algorithms to actual machines actually processing the computations. The major example here is the "store the words as an unsigned 32-bit integer and use bitwise-and to compare them" strategy that is at the core of all of the best solutions. That is pretty much entirely a Computer Science/Electrical Engineering insight that relies on one's knowledge of how computers work. That said, I agree that maybe a mathematician is not the best person to discuss the readiness of a programming language for production or prototype usage, so I agree with you on that point.
Thank you for proposing the challenge Matt! It's been fun! The solution by Landon and myself referenced at the end truly was an effort that involved everyone else already mentioned in some way.
That 500us mentioned (currently at 300us now by the way) has its "DNA" from about 10 different people in it.
For me, this was as much of a social exercise as it was a mathematical and programming one.
Can't wait for the next challenge!
What I want to know is how the file read time was reduced by more than a factor of ten...
A great example of open source success!
@@rmsgrey Multithreading
Mathematics and programming *are* social exercises! :)
@@Exachad if nobody is using memory mapping, then it can still be speed up. Not all languages/OSs support mapping, but it is the fastest way to get data into memory.
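In Python that's the mmap module - a minimal sketch (assuming the word list sits in words_alpha.txt; error handling omitted):

    import mmap

    with open("words_alpha.txt", "rb") as f:
        # Map the file into the process's address space: the OS pages it in
        # on demand, skipping an extra copy through buffered I/O.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            words = mm.read().split(b"\n")
    print(len(words))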
Coder: spends a month writing code to complete a task in a millisecond.
Parker: just runs code for a month.
Result: a tie!
Meanwhile, Web Devs spend too much time coding AND executing…
More like "spends an hour writing code to complete a task in a millisecond". The optimizations listed here are extremely natural and are the kinds of things that show up in basic coding interview questions.
the guy who solved it in 6.7 milliseconds probably solved it in a few minutes the first time; this was just to flex on how quick he could make it.
So logically, if I spend years, it'll either finish in almost 0 time, or be so fast that I get the answer randomly in the past before I even start coding.
Considering someone got to 900 seconds within one day, I believe the win is clearly on the programmers' side.
As a programmer, there is nothing more emotionally conflicting than when someone appreciates your code, wants to make it better, and then humiliates you by optimizing it to ridiculous proportions
If having your code optimised makes you feel humiliated, you have a badly placed ego, dude :( It should be amazing
@@camaradearthur3531 It's not about optimizing it, but if, like op says, it is optimized to "ridiculous proportions", you would feel a little bit humiliated, especially if you tried your best
@@lucassXDDD it's basically like deadlifting 400 lb and getting congratulated by a dude who then proceeds to deadlift 1,000 lb. It hits different than if the dude just lifted 500.
@@writerman2934 Great analogy!
@@lucassXDDD Although the feeling of frustration does increase neuroplasticity. So reading the code of the other person while feeling bad about it makes you learn the other method efficiently.
I like how this is basically a speedrun category now
"I just broke the 100% glitchless set-seed parker-word-search wr speedrun!" - competative programmers
I'm looking forward to the Summoning Salt video on this
Parker%, I love it
Now that I think about it... it would be super cool to have a programming "Speed Run" site with sets of problems and then categories for different languages and things with leaderboards... You would have to have rules to make it fair. E.g. reading from a standard file everyone uses (from RAM?), no pre-processing allowed, take the average of X runs, running on a particular free-tier AWS instance hardware type, outputting to screen not disk, etc. You could have folks just upload their code and have the server run it multiple times in a standardized / calibrated / regularly tested sandbox and then post the time back onto the leaderboard. Reminds me of Zachtronics games leaderboards.
And that's where the record stands.
The way to get answers on the internet is not to ask for answers, it's to provide the wrong answer and wait for people to correct you.
Reverse Engineering and Reverse Psychology. Going backwards is always the answer.
Trite
This is Karatsuba's law.
@@asheep7797 No, this is actually Murphy’s law
LOL this comment thread is so funny because it proves this law correct.
It's actually called Cunningham's Law
The smallest length of time that it is possible to run that code in, shall in the future be called Parker Time
I second this motion.
@@WarrenGarabrandt I microsecond this motion.
I picosecond this motion
I femtosecond this motion.
I attosecond this motion
Seeing the 4,000,000% I immediately thought, "Typical exaggeration. Oh, Stand-Up Maths. It'll be an understatement."
Understatement by quite a bit, it’s 4,000,000 *times* faster, not % faster.
@@weirdboi3375 no, it’s 400 MILLION times faster, not 4.
40 BILLION percent faster, just insane
It’s possible, if your code runs at a very low level and his code takes hundreds of milliseconds
@@zainahmed4172 bro his code took like a month lmfao
"You can save days and days of hard work by just spending a few hours reviewing the pre-existing literature" is Matt's way of saying "just read the docs"
Impressive how he literally just called out every first-year computer science student with only one sentence
RTFM
@@namef calling out? Finding solutions on your own is a very important skill to have, even if they aren't optimal.
I think it's funny because I've heard the opposite saying in other engineering fields. My thermodynamics professor frequently joked that weeks of literature review and model building can save you hours of lab time lol
Let's be honest, the record here is that he wrote a program that took 20 days to run. That wasn't easy, not even in Python!
No, writing a program that takes ages to run is easy. Having confidence it will finish successfully is the hard bit.
32 days, to be accurate
@@0LoneTech Properly implementing Bogosort is actually really difficult.
@@0LoneTech while true do
    if time > age then break end
end
And set age to however long you want it to run :P
@@tahunuva4254 That's a lovely way to learn the TIME variable wraps around every 24 hours (e.g. in C64 basic). Similarly surprising, unix gettimeofday doesn't.
Btw, this should be a series. Come up with a problem, solve it inefficiently, post it to the channel, and run a speedrun competition to see who can do it better. That seems like a fun way to learn programming.
I would love that
I would watch this show
upvote!!!
Cunningham's law applied.
This is a proven method to get an answer on technical forums.
1. Ask your question.
2. Switch to an alt account and give a horribly wrong answer.
3. Profit. Nerds may not care enough to answer your question but they will NEVER miss a chance to prove someone wrong.
Hands down best part of all of this is finding that Matt's definition of "efficient enough" is just < lifetime of the universe
... when it should be, at most, less than the (remaining) lifetime of Matt. Otherwise, no podcast. (-:
Hahaha good enough for a mathematician, he just needs the answer and then the code can be deleted after all, not really worried about performance
I guess he's not concerned about being alive when the computer finally finds the answer.
True mathematician spirit... between now and the end of the universe...
"I have proven a unique solution exists."
"what is it?
"irrelevant details!"
This is the craziest Tool Assisted Speedrun progression I've ever seen.
Fr
The moral of the story is: No matter how hard you optimize, you can't get to the answer faster than Matt because he already finished. :)
Benjamin Passen made it in 50 min on the first day, possibly before Matt had even compiled his code for its 32-day run; he literally got results a month before Matt got his own, bruh
@@Mikasey I think you misunderstood. Unless he coded it to run in negative time, it would arrive at the answer after Matt's.
Matt's finished running well before June 18th, while Benjamin's started running on June 19th.
@Helge Holm he ran it the day after the podcast, June 2nd
2:10
@@HelgeHolm ah, now I get it, he made the calculations before the podcast and just talked about them in it; I for some reason assumed he did stuff on inspiration after the podcast, my bad
@@Mikasey No hire :)
Half my job is "researching the preexisting literature", only to find there was one paper you didn't notice that did what you were doing faster, better, and earlier.
My dad claimed that whenever he had a clever idea and went to do the research, two physicists in particular had already published it, and he named me in their honor. In my opinion it's more on-brand that when picking a name those two felt "comfortable", and it was only later that he realized why they were familiar, so he created the story after the fact.
Well, that's StackOverflow in a nutshell.
@@privacyvalued4134 true - half of software development is checking Stack Overflow for someone else who has had to do it before XD
@@SharienGaming you know you've made it in programming when your problem doesn't exist yet on Stack Overflow
@@ericmarcelino4381 then I've probably made it a couple of times... mostly when I have to deal with something so outdated or obscure that basically no one else has to deal with it XD
(or my google-fu was insufficient)
I remember someone once telling me that "if your problem is hard enough, you eventually stop trying to compute a solution and instead compute how long it takes to obtain a solution." The last time I did that I concluded that my code would run for 12 days. Mentally, I thought, "That gives me 12 days to think of a better approach." By the next morning I had thought of a different algorithm that was successful in about 0.1 s!
XKCD - number 1205
I used that to show why I wrote a program that converted a boring & repetitive manual process that normally took two hours into a program run that took 5 minutes total (and this was including manually entering the data into the program and grabbing the results). This process had to be done 2-3 times a month, so me taking 8 hours to intermittently write and debug it was worth it. I also made it use an external parm file so it would be easy to update over time.
@@toddkes5890 That XKCD strip comes up at least once a month at work. I maintain a CI/CD pipeline and we're constantly spitballing automation improvements that might cut down on the manual work.
@@ford9501 I feel like a lot of my coworkers violate this XKCD comic constantly. Spending weeks or months trying to optimize something that takes 5 mins of labor and only needs to be done about once a month or every other month.
@@toddkes5890 Then the greedy upper management just gave you heaps more work to do instead of a raise or promotion or other credit?
@@coopergates9680 Actually that manager was decent and just smiled at the improvement. Of course, the lookup file having to be updated nearly every time meant it wasn't perfect, so that might have helped out.
It still baffles me that you were alright with it running for a month. I'd let it run for a few hours tops before I gave up or tried to improve it.
he had to do it for the content
At my lab we have a few computers running very inefficient Matlab code to translate .tif images into awful million-line .csv files that our machine learning collaborator needs for his AI stuff. Right now, a 1600x1600x512 volume is processed overnight and we have roughly 100 of those. It should take roughly a month for the processing to be done.
Honestly, I could sit down for a month and learn C++ well enough to make it run in minutes instead, but, like, I have other stuff to do, and I made the executive decision that we could afford to sacrifice a workstation for that amount of time. It also helps that it gives me something to report to my PI at the weekly group meeting while my own experiments are failing lmao
@@garak55 hehe fair enough
@@garak55 Everything about this both horrifies me, and causes me physical pain. My first instinct was to ask why this person would want an image in csv format (and why they outsourced this process to what I'm assuming is the math department), but I fear the answer.
I guess Matt has so many projects and videos and books in various stages of completion that he doesn't care if any particular one is done a month later than it should be. I like to do things one at a time, so having to wait a month until it is done would frustrate me enough to go back and get the program to finish in minutes instead of days or months.
Remember kids:
No matter who you are,
no matter what you do,
there's someone on the internet
better than you.
Best way to find the right answer is to be wrong on the internet
Yes, but you will also find 100 wrong answers to go along with it, because for each genius gracing us with their presence there are legions of Dunning-Kruger cretins
In order to become that person that's better than everyone else on the internet, it takes seeing every other single person doing that thing, and still thinking that you can do better.
I'm that person
@@DemoniteBL no you aren’t bro 💀 self proclaimed genius is never accurate
As a software engineer, I appreciate your humility. Code, given enough time, can be optimized quite a lot.
You won’t pass a freshgrad job interview in my country with such a method
@@ko-Daegu And once you get that job, you'll lose it if you waste your time pre-emptively optimizing without any business benefit. Keep that in mind too :)
@@ko-Daegu same.
I've heard of YouTubers leaving minor mistakes in their videos (e.g. misspellings) to provoke people into commenting corrections as an engagement hack.
Matt is taking this to another level by releasing optimisable code that provokes so much engagement he even gets a whole new video out of it!
Genius move 😅
He added a mistake in the title (compared to the number he gave in the beginning of the video) to boot!
@@y_fam_goeglyd the title is correct. The number he gave in the intro was supposed to be a multiple, not a percent. He said the same number again a moment later as "bunchofnumbers times faster"
Is that why he claimed 400 million times faster was 4 million % faster at the start of the video? Cuz I still haven't figured that one out.
He did that here too. "Paralyzed computing."
Let's add in the binary representation of "bread" not starting with a "1".
Watching the cascade from python to java to c++ to c was so satisfying.
The closer you get to machine code, the faster it will be. I'm surprised nobody attacked it in straight machine code.
@@japaneserequired6314 That was the implication lol. I would imagine writing this in machine code would not be a challenge with a reward worth the effort, but it would be interesting to witness.
@@japaneserequired6314 Because compilers (no offense) are about 600x better at writing Assembly than you'll ever be at it - even if you started doing nothing but Assembly programming today until the end of your life.
The amount of optimization a compiler bakes into Assembly code has reached such ridiculous levels of complexity (such as division by invariant multiplication and refactoring virtually any math problem as a higher-order polynomial) that it's a waste of time to even try competing. Really, you could count on one hand the number of people who even play in the same ballpark as modern compilers, and those people usually design said compilers to begin with. You're better off using C or C++ and letting that translate your code into Assembly; even unoptimized C code will be orders of magnitude more efficient than any Assembly code you could ever conceive of - all thanks to the ridiculously insane number of optimizations a compiler implements.
@@Finkelfunk that is wishful thinking. Gcc is so full of bugs in code generation, it is crazy.
@@Finkelfunk Nah. I turned compiler optimisation off and the optimised build was only about 10% faster.
Admittedly, that was an older version of my code, which I have since sped up a lot by fixing memory handling, but I got a similar result from changing my check-detection function (chess engine) from scanning across the entire board in one pass to doing a pass only to find the king and then scanning radially from him.
Coders who get to that point where they really think outside the box and just come up with these cool solutions are so cool to me.
I would argue it's the complete opposite; they aren't thinking outside the box, they're trying to think as far inside the box (aka the computer) as possible. Programmers are translators, the best answer is the one that is simplest to understand, the clearest explanation with no unnecessary fluff, and requires the least prior knowledge. That means less time trying to figure out what you were trying to say, and more time working on the problem.
However you want to call it, I just want to add that translating the problem into a graph and then running one of the many highly optimized algorithms from that field is a very well-known technique. Other examples would be isomorphism and matchings. Definitely very impressive though.
@@williamrutherford553 Yes, agree. I call it "machine empathy" when you know where the bottlenecks are, what the CPU and I/O routines are having to do. Often it doesn't matter but when it does it can make a huge difference to run time.
@@andrewharrison8436 When you know exactly the smallest number of connected processes necessary to make the most effective solution for the most complex problems. We're truly living in the most exciting of times.
@@williamrutherford553 I would argue that you are arguing semantics. Having a profound understanding of the environment you work in allows creative solutions. These are certainly creative solutions and are outside the realm of what a programmer would normally do to run something as silly as this. Thus, outside the box.
And yet none of that hyper-efficient code would likely exist today... if not for Matt. The catalyst. The inspiration. The legend.
The Parker Programming!
One day Matt will find a halfway solution to the Riemann Hypothesis just to "give it a go", and then 3 months later a viewer will have solved it completely, to spite him
The Parker Effect!
@@EngineerLume what does it mean? can you explain?
@@johnwickgaming3118 Watch the Numberphile Parker Square video
Python is slow, but not THAT slow. There was a pure-Python solution with about 60 lines of code that took 628 ms, which I was able to reduce to 121 ms with Numba on an old Core 2 Duo E7500, without parallelization. Usually the bottleneck is the algorithm, not the language used.
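For anyone who hasn't used it, the Numba pattern is roughly this - a made-up hot loop for illustration, not the actual 628 ms solution:

from numba import njit
import numpy as np

@njit  # compiles this function to machine code on first call
def count_disjoint_pairs(masks):
    # masks: 26-bit letter masks; count pairs of words sharing no letters.
    n = masks.shape[0]
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            if masks[i] & masks[j] == 0:
                total += 1
    return total

masks = np.array([0b1, 0b10, 0b11], dtype=np.int64)
print(count_disjoint_pairs(masks))  # 1: only the first two are disjoint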
This was in no way a problem with the language. This was absolutely a problem with his code.
Is the repo public? I'd like to have a look at it
@@silverfire222 There's a link to a spreadsheet in the video description with a bunch of submissions. You can filter by language and sort by execution time. It has links to the source codes too.
@@mad_vegan Thank you!
it'd be interesting to have a 'by compiler' subcategorization too
Matt is a stand up guy. His effort was thoroughly and effortlessly destroyed by a hundred hobbyist and full time coders, and he turned it into a math problem to teach people. Truly a legend
"Stand up" - Also, "Stand up comedian" :)
Never underestimate the obsessive compulsiveness of programmers.
competitive programmers will literally spend months improving an obscure and functionally useless program that nobody will ever need to use IRL by 0.0000001 milliseconds
Never underestimate the motivation that can come from wanting to embarrass Matt Parker
Just know that these solutions are actually pretty elementary for programmers and are the type of thing that would show up on an interview for someone fresh out of college.
@@namef Actual competitive programming (or at least a super common modern take on it) is more like spending anywhere between a minute and an hour on a problem (that you have anywhere from no clue about to an entirely memorized solution for) until you either give up because there's an algorithm you haven't read about yet, or get code that works and runs under the time requirement. Then forgetting about it and moving on to the next problem - basically going for quantity over quality.
But this sort of incremental optimization of a single problem against other people's solutions, sort of like speedrunning, is way cooler I think tbh. If you were referring to a specific competition where this sort of quality-focused approach occurs, please fill me in.
Nerds* is more accurate, as nerds will try to do loads of things more efficiently.
Now I want someone to port the fastest code to graphical calculator and see if it would still run faster than Matt's time
My thought was a BBC micro. If one of those could beat his time, that would be amazing.
I'm pretty sure Desmos could do it with ease. No idea about getting a TI-80 series to do it. Most of the time would be spent reading words from another computer and compressing them into an abstract set of bits.
@@simonwillover4175 I mean, a TI84+ can run Doom. It's probably possible without needing another device.
@@Mikowmer Probably not enough memory in a TI-84+, but I reckon the newer eZ80-based calculators have a chance of working.
@@mattsadventureswithart5764 or a commodore 64
You would be surprised how much your recreational math helps out coders. Sometimes we just need to look at code from different angles and every little bit of novel mathematics we discover improves our flexibility. Thank you Matt.
For me, the part starting around 9:25 describing the differences in algorithms was the most interesting. As a general rule, if you can simplify any problem to bitmasks and bitwise operations (especially AND, OR and XOR), it will run really, really fast on practically any CPU. And it turns out it applies here, too.
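A minimal sketch of the trick in Python (my own toy version, not any particular submission):

def mask(word):
    # Set bit 0 for 'a', bit 1 for 'b', ..., bit 25 for 'z'.
    m = 0
    for c in word:
        m |= 1 << (ord(c) - ord('a'))
    return m

# Two words share no letters exactly when the AND of their masks is 0.
print(mask("fjord") & mask("waltz"))  # 0 -> disjoint
print(mask("fjord") & mask("field"))  # nonzero -> they share letters

One AND plus a zero check replaces an entire set-intersection, and the CPU does it in a single instruction.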
In 2007 I was developing a service for a major multinational mobile operator. My initial algorithm took 22 minutes to run for places in downtown Athens (very high cell coverage). After a tiny improvement in the code and changing the way the data was stored in Oracle Spatial, for the same coordinates it took 800 ms. Needless to say I was very happy that day.
It's a great feeling
Good job. 22 minutes is 1320 seconds is 1320000ms, knocking that down to 800ms is a factor of 1650.
@@cxpKSip I wanted to write a paper on it because my point-in-polygon implementation was insanely faster, but the company asked me not to do it
@@asicdathens Did they not let you because other companies could then begin to compete in that service? Or some other reason?
@@jasonb.9790 The competition tried to create something similar when the service was launched. The main competitor, CosmOTE, paid 700,000 euros to a Finnish company to make something similar. My problem was I couldn't load cell coverage rasters into Oracle Spatial (10gR2), so I had to use vectors (shapefiles). If you do point-in-polygon against shapes that have 60,000-100,000 edges (cell coverage), the built-in Oracle Spatial point-in-polygon algorithm takes forever.
That has to be the best example of optimization I've ever seen.
So it would seem
Meh, seen better.
@@barongerhardt yeaaaah I've seen that as well, though the improvements are usually on the scale of 3-4 orders of magnitude (I think the biggest I've seen before was 6)... this was an improvement of 9 orders of magnitude... that's kinda crazy... though to be fair, our code and data structures generally don't start that bad
I remember years back of hearing a story about someone who got a job to implement a saving system for a long running process (like a week or so) because it sometimes broke during execution - sort of like saving Matt's progress every day or so until the program had finished.
Instead the programmer spent a few days optimising the application to make it run in under an hour. Making the thing run 100x faster meant the whole saving system was unnecessary, and it was instead transformative for the business, as they could get results so much faster.
Now it wasn't running billions of times faster like in this situation, but it's still a great story that had actual impact.
@@MSheepdog 36 hrs down to 2 hrs 10 min... got the customer to sign off on releasing a held payment of $14M in 1985 :) We were happy to acknowledge receipt of that check - the CEO sent me a nice memo for my part interfacing between the customer and a college intern (I pointed out a lot of potential improvements to him)...
I remember in my first year of applied computer science, we got the task to find all prime numbers between 0 and some number. At first you write code and are so happy seeing it finding the right answers. We got that same assignment several times throughout the year. It was so fascinating seeing the improvement in coding and algorithms to get that same task done in just a fraction of the time it took you the last time.
Yes, there's a set of problems that should be included in all beginner level classes about algorithms and that is one of them.
@@chrisdawson1776 🤡 "🤓"
This is such a neat resource! I'll try to apply it some time in the future
_It doesn't matter how fast you get the wrong answer!_ /s
Sadly too many programmers are ignorant that there are 3 types of optimizations (from slowest to fastest):
* Bit-twiddling (focus on bit hacks and other clever micro-optimizations)
* Algorithmic (focus on big-O complexity)
* Data-Oriented Design (focus on minimizing cache misses)
Computer science typically focuses on big-O complexity. It is an OK place to start but a bad place to end; i.e., two algorithms can have the exact same big-O but in practice one can be 16 times slower! Andrei Alexandrescu gave an interesting talk, _Sorting Algorithms: Speed Is Found In The Minds of People_ (CppCon 2019), where he invented a new sorting algorithm (!) along with showing that the standard way we measure sorting performance is incomplete (!!). We should be measuring sorting with (C(n) + M(n) + k*D(n))/n, where C(n) counts comparisons, M(n) counts moves, and D(n) the average distance between array accesses (k being a machine-dependent weight).
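You can count the C(n) part yourself in a few lines of Python - a crude toy, nowhere near Alexandrescu's actual methodology, which also weighs moves and access distance:

import random

class Counted:
    # Wraps a value and counts how many times it gets compared.
    count = 0
    def __init__(self, v):
        self.v = v
    def __lt__(self, other):
        Counted.count += 1
        return self.v < other.v

data = [Counted(random.random()) for _ in range(10_000)]
sorted(data)  # Python's Timsort only needs __lt__
print(Counted.count, "comparisons")  # on the order of n*log2(n)

Two sorts with identical comparison counts can still differ wildly in speed once cache misses enter the picture, which is the whole point of the talk.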
Everyone should start with a (slow) reference version. The act of writing it helps you understand the problem. Then you can start applying optimization techniques. For example my first version of this solver took 20 minutes. The second version took 2 minutes. The third version took 2.5 seconds.
Nice that's a good touch from the teacher.
It's like something my granddad used to say: "It's not about being the smartest, but learning from the smartest to become wiser".
Step 1 in my process for solving algorithmic challenges is "Does Knuth have a solution for that?". A lot of the time the answer is YES. Absolute legend.
Back in the 80's I wrote a program in BBC basic to allocate pupils to classes. At 'A level' there were 5 'bands'. p,q,r,s,t. Band p for example could have subjects maths, physics, chem, etc. There were about 7 subjects per band. A pupil would choose 3 or 4 subjects, and the job was then to see if there was a band that had all their choices. Ideally all 220 pupils would be accommodated. If there were too many failures then the subjects in each band could be rejigged. Also, the numbers of pupils allocated to each band should be roughly similar. I wrote the program for 8 bit BBC B micros. It worked but took a few days for each run. (BBC basic is an interpreted language and very slow.)
Then the 32 bit Acorn Archimedes came out. I tried the program on it and it was not much faster. I then had one of the few genuine flashes of inspiration in my career, and realised that instead of subjects being represented as strings, each one could be just a single bit in a 32 bit word. I rewrote the program and tried it. It ran in 1 minute 30 seconds.
Bitwise ANDing Matt called it. I then used that technique in many other programming problems. It works!
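For anyone curious, a rough Python reconstruction of that idea (subject names invented, obviously not the original BBC BASIC):

SUBJECTS = {s: 1 << i for i, s in enumerate(
    ["maths", "physics", "chem", "biology", "art", "history", "french"])}

def to_mask(choices):
    # OR together one bit per subject.
    m = 0
    for s in choices:
        m |= SUBJECTS[s]
    return m

band_p = to_mask(["maths", "physics", "chem", "biology", "art"])
pupil = to_mask(["maths", "physics", "chem"])

# The pupil fits the band when every chosen subject's bit is also in the band's mask:
print((pupil & band_p) == pupil)  # True

String comparisons per subject become one AND per pupil-band pair, which is exactly why the rewrite went from days to 90 seconds.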
500 microseconds is 660 times faster than an average blink of the eye, or just about the time it takes a thought to cross the brain. Incredible.
You can't even let go of the enter key that fast.
I think we need to switch units here.
In the time the program ran light traveled less than 150 kilometers (in a vacuum), less than the length of the M25 around London
it cant be faster than my thoughts because i dont have any
This is my favourite video you have ever released. Please do more of this type. It is super interesting, even if people like me have no coding experience, your explanation is so clear we can understand it.
This is why the open-source community is so valuable to humanity. Seriously. Great job guys! There is always a smarter geek! I seriously love this community. It isn't about being the best, it is about learning from the best -- and there is plenty of best to go around in this community. In my profession (software engineering) there is a common expression "standing on the shoulders of giants." It basically means that whenever I write an amazing piece of code, a lot of the inspiration for that code rests in the accomplishments from those who came before me. And if I go my entire life and only find one or two new novel ways of doing something, the only thing I would want is for my accomplishments to be used in future code and made even better.
IMO it's rarely about a "smarter geek" - there are heaps of people of similar intellectual capacity involved, and the best concepts may well come from the "stupidest" among them. What is magic is the collaboration and the speed of evolution it creates. It means the "stupidest" - the artist/linguist types with no mathematical skills at all - can have good ideas nobody else in the "room" has come up with and possibly never would; different perspectives can be a powerful thing. And that person's idea can be picked up, polished, and evolved rapidly through the iterations of the more skilled number theory and computer science types.
7:36 he said it was behind a Patreon paywall, what do you mean open source
en.wikipedia.org/wiki/Standing_on_the_shoulders_of_giants is 800 years old. And it remains as true, indeed.
YES ,besides papyrus which is smoke
Open-source works best when there is a clear specific problem with a "known" optimal goal outcome, like in this video. But open source development kind of breaks down for things like "what is the best user-interface" or "best workflow" which are more vague and up to interpretation. I mean look at something like GIMP :\
I am greatly impressed with how unprecious Matt is about his Python code (which was completely correct, just computationally expensive) and how delighted he is by the community response
"computationally expensive" - I love this phrase, sometimes maths and coding terms can be just so satisfyingly descriptive and tickle my brain 😁
Also, agree with your comment!
As a C++ guy, this makes me want to find a way to do this entirely at compile time, leading to a (runtime) speed of however long it takes to print 500 words.
I was going to say you couldn't [portably] read the input file at comptime so it wouldn't be a valid comparison buttt depending on the input format maybe you could #include it directly... if it's comma separated that would totally work
and from there you just have to write thousands of lines of template hell and wait a few weeks for it to build. I'll leave that bit to you
Yeah, if you start codegenning, there won't be much of the challenge left, eh?
Easy, the words could be easily encoded into the program file itself using a static storage buffer. Near-instant execution.
Feel free to change the metric so that you measure both compile time and runtime. Then you can compete, and you will find that your solution will perform (relatively) poorly -- the preprocessor is not known for being fast.
Matt's the type of guy to search for perfect numbers by checking every integer between 1 and N for divisibility and summing them
My favorite part is how Benjamin says in the spreadsheet "Just for reference! I did not actually run this implementation."
Basically couldn't be bothered, only an insane person would run this lol
I'm actually a bit curious to see how long it would have taken, if he'd run it. Matt's run was done, iirc, on an old laptop; it's quite likely that Benjamin's testing hardware is significantly faster. (It would probably still be so slow that it's not worth the bother, but an interesting thought none the less.)
@@mnxs That's a good point.. now I am tempted to run it..
I love it when videos are ever so slightly shorter than 30 minutes, it's the perfect thing to watch in my break
It's fair enough for Matt to say his code was "good enough" when it turns out it wasn't buggy and gave the right answer; if it took 30 days to crash or give an obviously wrong result (very possible) it would be a different story.
Personally when it didn't finish after 5 or 10 minutes I would have intervened haha, computers are stupendously fast these days!
Indeed. The optimizations are impressive, but the most impressive thing to me was the fact that Matt had the patience to wait a month for the computer to give him an answer😁
Thank you for the video of calculated leadership on creating improvement (PCQI). Aside from achieving great results, you are plotting the moment to a detailed channel that opens the path across the next step in problem solving/improvement. Hope for a good 2023 for all.
This video is a testament to the potential power of open source. Your code got improved millions of times thanks to the community effect. Amazing!!!
It goes to prove Matt, as a programmer you are a great mathematician...
“Matt was an average programmer but he was a BRILLIANT MATHEMATICIAN!”
@@iantaakalla8180 Here's a little challenge where Matt should know a shortcut:
// A 5 GHz CPU will take over 48 years to produce a result
uint64_t a = 0, b = 1, n = 7678603279843596748 >> 1;
while (n--)
    b += a += b; // {no overflow checks}
uint64_t s[] = {a, 0};
std::cout << (char *)s; // print the bytes of `a` as a null-terminated string
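(Spoiler for the shortcut, assuming I've read the loop right: the pair (a, b) just walks up the Fibonacci sequence, so fast doubling gets the same 64-bit residue in about 60 steps instead of ~4x10^18 iterations. A Python sketch:)

MOD = 1 << 64

def fib_pair(m):
    # Returns (F(m), F(m+1)) mod 2**64 via fast doubling, O(log m) steps.
    if m == 0:
        return 0, 1
    a, b = fib_pair(m >> 1)    # a = F(k), b = F(k+1) with k = m // 2
    c = a * (2 * b - a) % MOD  # F(2k)
    d = (a * a + b * b) % MOD  # F(2k+1)
    return (d, (c + d) % MOD) if m & 1 else (c, d)

n = 7678603279843596748 >> 1
print(fib_pair(2 * n)[0])  # what `a` holds when the 48-year loop ends; the challenge then prints its bytes as text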
I couldn't code my way out of a box, but I can still very much appreciate this competitive coding. Good job, Matt.
while True:
    if boxExists:
        remove_box()
        break
    else:
        continue
while true:
print "Help Im stuck in this box"
@@benjaminaharon460 while True:
print ("Help, I'm stuck in this box!")
I second this comment 👍 and I also appreciate the coding replies to it very much 😂
Ask for help from step bro
What's fun is that there's probably a lot of optimization left, as sub 7 ms was achieved without Assembly optimization.
I love how reminiscent of Alan Turing's work on the Enigma this is. A bunch of people working together to try and optimize the solution to a linguistics problem through a mix of clever computer manipulation and even more clever maths. Absolutely brilliant!
I'm glad you pointed that out; it really shows how much smart people working together in a competitive environment can accomplish
@@namef But not just a competitive environment - an open and friendly competitive environment.
I work in a closed competitive environment (working for a corporation competing against other corporations for market share) and much of the state of the art is stale because everyone has to first learn each other's previous mistakes and current best approaches. We even know that some of our markets are relatively safe because the cost of somebody entering it (learning all that we learned the hard way) is simply too much to do against an existing competitor.
In an open, collaborative environment like this where people share their methods and learn from each other's mistakes, that's where progress is made. That's what science, academia, and even patents were originally about - bringing knowledge into the public sphere where others could work with it and learn from it.
My favorite part about this is that one of the people who made one of these improved versions is Stefan Pochmann, the inventor of one of the most prolific methods for solving a rubik's cube while blindfolded. You sure have quite the audience!
Ah, of course, he's one of the biggest names in the field of solving a Rubik's cube while blindfolded, something that I totally knew existed before just now.
And his version is 18 lines of python which run in a handful of seconds. He wins, as far as I am concerned.
18:50 "Very rarely will Python go into production".
As a Python programmer, having worked at three major animation studios in Europe, I can attest that Python code does go into production, and it can be just as reliable as other languages, if not more so, because its readability makes it easier to understand for the people manipulating the code (the developers). We use it to build plugins and to automate repetitive tasks inside the software used by the artists - for instance, opening and saving the scene in the right place in the complex folder structure.
But I do understand where you're coming from. If you're working on a piece of code that needs performance - one example in our area would be the rendering engine - then you need a lower-level language such as C/C++, which is indeed optimized for performance.
Very well put. For high-performance computing, the cycle is as Matt described, but the vast majority of software development is not high performance computing. Your machine learning framework might be written in Rust, C or C++, but you interact with it using Python because it's easier.
@@imai_kinami Yeah, great example :)
Just a thing Matt, if you ever read this, great video man! I really appreciate your passion for these things! And it's a great exercise you've sparked, and it's nice that you took the time to compile the results for us in this video.
I'm saying this because my comment was a bit easy to make, I feel a bit guilty about that. Hope you don't mind too much :)
The second he said Python doesn't go into production, I could feel the existence of this comment.
@@alxjones That's what makes it both a good (generates likes) and a bad (too obvious) comment.
I love how I can be in one corner of YouTube where people are talking about computer programming, improving code efficiency just for fun, 30min dives into computing and mathematics…
And then I can click away and watch a guy whose entire channel is dedicated to taking the longest bong rips humanly possible.
It’s nice, I like this place
As a programmer by trade, I want to push back a bit on the "Python isn't for production and lower-level languages are more robust" idea.
Python is a very safe language in the sense that you won't get memory leaks or segfaults, unlike C or C++. Where you do run into problems (with large applications) is type mismatches and silent failures that always plague dynamically typed languages. This is a big reason why most applications these days that need to be somewhat performant, but more importantly need to be very robust, are written in C# or Java, since those languages are both memory safe _and_ type safe. Side note: Rust is also memory safe and type safe, with the speed of C, which is why it's caught on so widely and quickly in the last decade. The main sacrifice that Rust makes is that its compiler is very strict, so it's harder to learn and harder to program in. But that's worth it for many people in many applications!
Python is also used in production all the time for what it's supposed to be: a scripting language. Python scripts are great for setting environments up or automating file manipulation. And perhaps the biggest thing of all is that python is a great tool as an orchestrator of highly performant C libraries. Most famously this takes place with machine learning frameworks, where you can write python code that leverages CUDA and C libraries to do incredibly efficient and incredibly parallel computation.
Loved the video!
Python is also great as a "glue" language between APIs and databases. Think integrating multiple services between different companies, and providing new, combined products.
I am also a programmer, and I have worked on a production Python codebase.
I still agree with Matt's assessment.
I don't know about anyone else, but I like it when compilers stop me from making mistakes.
@@andrew_ray If you haven't tried it yet, you should check out Haskell! The type sytem is so great that it's often said that "if it compiles, it works".
@@andrew_ray typescript FTW!
I once inherited code that was partly a SQL stored procedure (previous programmers enjoyed using cursors where-ever possible). It was fine for a few categories and detail sections on a database with only 10,000 records and looking at a regional subset of said data. Over the years (10 years later), it ballooned to something like a dozen categories and detail sections on a database of over 100,000 records in the main table (with literally millions of records elsewhere), searching for ALL data instead of regional. This code took well over 24 hours to run (Me: "Whhhyyyy????? Why would you use the report in ways for which it was never intended? lol") - and would time out during peak hours.
First iteration, I got rid of the cursors and did a clean main query with subquery temporary tables (the cursors ran the same (15-20+ table) queries 13 times with 1 or 2 criteria changes), and got it down to 30 minutes (2 hours in peak hours) before removing the cursors, and maybe up to 5 seconds after removing them. The next bottleneck was that the detail code was being run 3,000 to 5,000 times separately (the entire render of the report page took 1-2 seconds per each 10-millisecond-long query).
"Ah ha!" (The previous reporting tool we used rendered reports a bit differently) I then thought, "Fine! I'll just load the two queries into my compiled .NET DLL code and feed the datasource as needed!". Report run time was now 2 or 3 seconds (that's the shortest time it takes any report to render, regardless of querying time). To this day, that refactoring is still my favorite.
Congratulations! That's a great result. Also, congratulations on having managers/executives willing to let you spend time on that (even if it IS your actual job). Most companies would be "eh, it's annoying but it works, and it's not worth spending time/money on improving it."
@@falconerd343 It's state government. Great work athmosphere where I work, actually.
Then you spend another week or two deciding how to roll out these improvements, so your 24 hrs → 2 seconds improvement happens over a 10-20 year career. E.g. before each performance review, knock another hour off the run time.
@@falconerd343 "If the piece of QWERTY does what it's supposed to, no matter how cumbersome it is to use, and how much system resources it gobbles up, it ain't broken, don't fix it."
Rust enjoyers (like me) were so close to being able to brag about Rust being BLAZINGLY FAST
Blazingly fast doesn't have to be fastest. Rust is definitely blazingly fast. :)
I believe Rust/C/C++ can all reach the same runtime performance for any given problem. High-performance code in any of these languages is often written by consulting the generated assembly and tweaking the code until the desired assembly is reached. The difference is going to be in development time, maintainability, safety, compatibility, etc.
It makes sense not to find Rust the fastest
Simply cuz Rust devs are not that many
+
Many of the crazy optimization gurus use C++
The task of comparing the presence of letters in different words is a perfect fit for evil bitwise hack languages like C, though.
Rust's whole thing is that it's the fastest memory-safe language. Meaning that its performance is not as important a feature as its reliability even in very large codebases that took a long time to develop.
I was perusing the spreadsheet and realized that the main difference between the best two solutions (Rust vs C) was that the C one was parallelized. Which is kind of hilarious given that Rust generally makes parallelizing easier lol.
Then I took a look at miniBill's Rust solution and he seems to be preparing to parallelize his code, so maybe we'll catch up!
That bitwise anding technique is brilliant! :) Bit-ops usually are, it's almost like a whole other field of maths.
I definitely think “gunpowdery blacksmith” is more satisfying than “showjumping veldcraft”. I also would guess that “veld” and “fjeld” are etymologically related, which makes me wonder if you can have a “vord”.
EDIT: I do concede that "showjumping veldcraft" describes The Lion King.
"Veldcraft" sounds Dutch somehow.
_veld_ and _fjeld_ aren't etymologically related, with the former coming from Proto-Germanic ∗felþą, and the latter coming from Proto-Germanic ∗falisaz.
@@zapfanzapfan In all likelihood. The Dutch did have African colonies, and both Dutch and Norwegian are Germanic languages.
@@miners_haven how about veld and field? I can't be wrong here
@@SpencerTwiddy yes, they are indeed etymologically related
Actually, that bitwise part is the exact equivalent (in an abstract sense) of your Python code - it just uses bits as a representation of a set: ones flag which elements of the domain belong to a given set (which works really well in this case because the domain has so few elements).
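Right - and it's quick to check that the two representations always agree (throwaway illustration, hypothetical word pair):

a, b = "bread", "lymph"

# Set version: words clash when their letter sets intersect.
disjoint_sets = not (set(a) & set(b))

# Bit version: one bit per letter; intersection becomes a single AND.
def mask(w):
    return sum(1 << (ord(c) - ord('a')) for c in set(w))

disjoint_bits = (mask(a) & mask(b)) == 0

print(disjoint_sets, disjoint_bits)  # True True - same answer, wildly different speed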
Yeah, I suppose converting a word into an alphabetized binary representation could be a hashing algorithm. It's just not secure, which isn't a problem for this application.
@@tylerpeterson4726 Using a bitmap instead of a set is a common method of optimization.
@@sophiophile In Java there is even an EnumSet implementation for this exact use case
And it's inherently parallelized, since the CPU can perform both the bitwise operations and the zero check for 32 or even 64 bits in parallel, which is only possible when using a bitwise representation and native CPU instructions.
The fact that your viewers found a new solution to an old problem makes me wonder: Why not pose an old problem once a year, which hasn't been solved yet? A problem which you find interesting and would enjoy creating a video about. Who knows, with the kind of talent I saw today, your viewers just might answer an age old question. Perhaps ask the channel which sorts of things they'd like to learn more about that has a component which hasn't been solved yet. Great job as always, Matt; I look forward to the next one.
Just drop them as homework.
"... In statistics, Dantzig solved two open problems in statistical theory, which he had mistaken for homework after arriving late to a lecture by Jerzy Neyman."
en.wikipedia.org/wiki/George_Dantzig
@ that reminds me of the guitarist Buckethead hearing a tape guitar virtuoso Shawn Lane had made by splicing together snippets of audio, inspired by Conlon Nancarrow's piano roll compositions, and then very nearly learning to play it, thinking that Shawn Lane had actually played it straight.
"Why not pose an old problem once a year, which hasn't been solved yet?"
ooh ooh do racism
@@nokenwari If we can crowd source a solution to racism via 'putering then you'll make a lot of people very happy - for the five seconds necessary for humanity to find something else to be bigoted about.
In the meantime what about the four colour map problem?
19:09 "That's just the production life cycle of code" I can attest to this. Years ago we tried converting a decade worth of customer data to a new system using a 4GL programming language (kind of like Python). It was estimated to run at least 30 days, provided nothing went wrong. Then it was re-written in SQL to do it all on the database, which ran in about 4 hours.
Still long, but obviously much better.
4:30 Don't feel bad that you can't beat experts in their own field. Pat yourself on the back because you have the ability to valiantly participate in *_every_* field.
The mantra of the polynerd.
Crowdsourcing optimization on slow code sounds like a fun competition. This is a really fascinating video. It's crazy just how fast people can make code run.
crowdsourcing and competitions to find more secure, faster, error-free code, etc. - the world could be a better place. Guess this is what they call GNU on steroids. If only it wasn't so fractured. Or at least a cycle of consolidation should be added, so 1 distro ----> 2000 distros ----> 5 distros. Hmm, sounds like the general function of neural networks...
i love how this went from a little joke included in the video to an informal contest within the community to now a lesson on benchmarking, statistical significance, and oddly specific performance optimizations
this is a fantastic channel and i'm glad to be along for the ride
now the real question is can we golf this? aim for smallest source size that solves it in under an hour or something?
Check out Benjamin's spreadsheet. There's a python solution there by Stefan Pochmann which is 18 lines of code and runs in a little over 2s on my system.
Truly amazing stuff.
We should find a bigger alphabet so we can test it on that.
This is great. I think if Matt's code hadn't taken so long, fewer people would have been encouraged to improve it. Brilliant stratagem. 🤭
I feel the pain. I had an assignment in university to fill in each side of a cake with decorations (5 rectangles filled in as perfectly as possible with other rectangles).
The code's initial runtime was something like 10 million "universe heat deaths", and I got that down to sub-1 second. Now that's an improvement.
I cannot overstate how much it frustrates me that when you're discussing bitwise AND you don't have the columns lined up and in monospace...
Yup. I like to keep my notes well organized, so I type up everything I can in Notepad using 12 pt Lucida Console (Courier is acceptable too, but not my first choice).
Also, bread has an a.
@@Kwauhn. I mean, not all the text needs to be monospace, but bitwise operations make the most sense with consistent columns.
@@MudakTheMultiplier I know haha, just giving my anecdote about monospace fonts. They do make formatting plain text easier.
Also saying "the 1's bit is A, the 2's bit is B, [...] and so on" then labelling the most significant bit on-screen as A....
The binary digit for the "A" in "BREAD" being a 0 - obviously a cosmic ray bit flip!
I look forward to you implementing ECC video encoding Matt 🙂
Parker Bread
It would be very interesting to be able to graph not only the time each program took to find the answer, but also the time it took them to write it, and see how they correlate.
You'd also need to consider how long they've been learning and working in the language
Getting down to fluctuations in caching is where it gets good, Matt! Time to come full circle with an algorithm which is possibly faster, but has to run repeatedly for 1 month before thermal RNG cooperates.
This reminds me a bit of the "impossible" performance that came out of the Commodore 64 in the 80s. Due to its exceptional popularity, people started using it in ways that tweaked some crazy performance out of it. And when some started, it became a challenge for others, and suddenly it did magical things.
Yes, I don't think any hardware has ever been (and continues to be to this day) pushed so far beyond its originally intended use as the C64. The demo scene is probably responsible for most of the techniques.
Matt does this so well. He plays with math(s) oddities in a way that invites and welcomes collaboration, and then brags about the accomplishments of the collaborators. This channel is a superb example of motivating people in an entirely positive way, and should be required review material for all teaching and management curricula. Obviously much of learning is by rote and through tedious repetition, but examples like these "useless" problems, presented by an enthusiastic evangelist as a cooperative opportunity, give a refreshing reprieve from the more usual "Here is a problem, now work out the preexisting correct answer and show the expected work". Thanks Matt.
I am very anti-Python but you sold it short. You will definitely find Python code in production. And there are ways to make Python safer like using mypy for static typing.
I'm also anti-python.
Python kinda mid ngl
This project is a true testament to the power of open source work. Fantastic!
This better go to #1 Trending for gaming for these speedruns.
What do you mean? You mean it should be in Speedrun trends?
@@DadundddaD what?
@@DadundddaD They mean it should be #1 on YouTube trending in the gaming tab because this challenge was about speedrunning a problem-solving task. It was a joke.
I think it's interesting to see the rapid development of improved code, and how much faster each program ran as time passed from the original piece of code. It's kind of similar to Moore's law. If you show the world inefficient code, they'll make it 40,000,000,000 times more efficient in 2 days. I think we'll call it "Parker's law"
Correction: You mean Cunningham's Law, where you post the wrong solution online so someone else can correct you and-
oh my god
Is Parker's law when you're giving it a go, but your solution is so slow that someone else can learn about the problem, improve the code, and run it to get a solution before your code even produces a result?
@@raffimolero64 was that a reference to Parker's Squares?
@@raffimolero64 lol great comment
when the video should be over but you're only 1/3 of the way through, love it.
5:30 This part about having to start calculations while still importing data from the text file made me think of the wacky ways I've tried to optimize Zachtronics puzzles
It's very true. The software I write is all in Python (with some C extensions where needed for performance). A lot of it is taking large data sets (some over 2TB) from a network file server and parsing them, modelling them (ML), and producing more like them (interpolation). In many cases, it takes longer to read and write that data to the server than it does to actually parse, model, and generate new data. Pipelining and parallelizing are very important in this kind of data processing.
at this point, the initial statement "I wrote some terrible code" is an understatement.
"I wrote the most horrifying code known to humanity"
I think this fits
This feels more like, Cunningham's Law at this point.
Matt! I see you wrote your own while-loop to find combinations. You should know that the standard library already has a function for that in the itertools module. I didn't know that for a long time, so I ended up writing my own general implementation to reuse whenever I needed it. Turns out most of the things we do have already been done better, but I still had a lot of fun :)
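E.g., assuming the list is called words, the whole bookkeeping loop collapses to:

from itertools import combinations

words = ["fjord", "gucks", "nymph", "vibex", "waltz"]
for pair in combinations(words, 2):  # every pair, no index juggling
    print(pair)
# combinations(words, 5) would enumerate five-word candidates directly
# (though, as the video shows, brute-forcing all of them is the slow part).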
I love your channel. I think something that could improve your visuals would be to use a fixed-width font so that digits line up; it was especially noticeable in the segment where you were lining up binary numbers for the bitwise AND.
Keep making great videos! Thanks!
I saw this unfold in my recommended list and have been looking forward to your reply, and you more than delivered! Would've been interesting to see some Big O comparisons perhaps. Lovely video!
thanks for doing these videos, you've impacted my mathematical studies and motivation. you've made maths fun.
I can entirely agree with you. I always enjoyed maths in school but ended up studying social sciences/business in uni. This channel (as well as all the numberphile/all the other phile's) helped me get back into maths and start getting into programming. They really do help make it fun and remind me of the beauty and challenge of math 😁😁
Great video, but it's not true that Python rarely makes it into production (mentioned at 18:50)
Python is incredibly popular in production for web backends. Very rarely is a web backend doing something that is compute limited, but they do have very complex always changing business requirements. For this reason, Python is a great choice as it's just faster to write than C or C++. The primary cost of a web application is developer time, not compute resources, so most projects optimise for the former. Since python is an easy to use general purpose language, lots of people learn it so it's also quite easy to hire developers for (compared to GO or Rust for example).
Since web backends generally do highly parallel processing of very simple tasks (e.g. you want to handle 10,000 people posting a new comment), it is very easy to scale that horizontally by just having multiple servers processing requests. Generally the bottleneck in web is the database & network speeds, not compute time. So switching from Python to C++ gives minimal speed benefits except in extreme cases.
E.g. imagine it takes 3 seconds to post a comment to YouTube. A fairly accurate breakdown would be: 2 seconds of that is network latency, 0.8 seconds is the database, and 200 ms is the Python. Sure, if you highly optimised and switched to C, you could maybe get it down to 5 ms, but even though you sped up your code by 40x, you only changed the overall time from 3 s -> 2.805 s, less than a 10% improvement in overall performance.
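In code, that back-of-the-envelope arithmetic (with the made-up numbers from above) is just:

# Total request time = network + database + application code (seconds).
before = 2.0 + 0.8 + 0.2     # with the Python backend
after = 2.0 + 0.8 + 0.005    # after a 40x rewrite of just the code part
print(before / after)        # ~1.07, i.e. only about 7% faster overall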
One final thing to mention is a lot of python is actually C under the hood using extremely optimised algorithms, so it is usually pretty easy to get very good performance if you are doing things that have already been solved. It's only when you need to do something completely unique mathematically that you run into performance issues that can't be resolved in Python.
Anyway you make amazing content, I just wrote this up so that people don't get discouraged from learning Python, it's absolutely used in production all over the world and is a very useful and desired skill to have for Web, Machine learning, Mathematics and almost any field of programming.
Can't wait to see the version of this video for his new double jigsaw code video!
Before even clicking on the video, my response was "Well, that's not terribly difficult. All you have to do is make the code absolutely terrible and anyone who knows anything should be able to spruce it up quite a lot" - but what actually happened is quite impressive
'Sphinx of black quartz, judge my vow' beats out the brown fox and lazy dog by several letters (at a mere 29), and is still perfectly coherent
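(Checkable in a couple of lines, if anyone doubts the 29:)

from string import ascii_lowercase

s = "Sphinx of black quartz, judge my vow"
letters = [c for c in s.lower() if c.isalpha()]
print(len(letters))                          # 29
print(set(letters) == set(ascii_lowercase))  # True: every letter appears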
And it is infinitely cooler.
And before this, making the fox and dog obsolete, was the alcoholic "Pack my box with five dozen liquor jugs."
Which, all things considered, is pretty good - 10x6 bottle diameters is a pretty reasonable box shape (especially when material strength and manual handling are considered).
@@jayayerson8819 hey it’s a plot point from ella minnow pea!
This sounds like a line from a screamo song. I could totally see someone screaming this
It's very cool to me that people like you can take papers that scientists published and explain them in this format and have them become easily available to the general public.
And like this video's example, it really becomes an extension of the experiments/findings of the original paper/idea
TwoMinutePapers is an excellent channel in this spirit -- all the latest breakthrough papers in Artificial Intelligence and computer graphics
Representing the alphabet in bits is actually quite clever! Would have never thought of that
I love optimizing code... At least to a point. I still like all the steps to be relatively logical.
A lot of programmers (especially/specifically web devs) tend to throw even the smallest performance improvement out the window in order to maintain """readability"""
It sucks 'cause performance optimizations are really interesting, plus code can be both efficient and readable
@@namef I have not made any web stuff, only calculations and engineering stuff. But what I was trying to say was that I love writing the code, checking that it works, and then optimizing the code to run faster. But at some point the optimization becomes tedious, and then it ceases to be enjoyable for me. Sometimes you still have to do it, if you need to run the code a lot of times or if the code takes a lot of time. But if you need to run the code once and it takes 30 s, then there is no need to do something that I do not enjoy. And if the optimization is enjoyable to me, I will still do it on code that only needs to run once and takes less than a second, just because I find it enjoyable. (cause I love optimizing code 🙂)
@@namef Surely the smallest performance improvements are the first ones that should be thrown out?
Oh, you mean the smallest changes to the code, not the changes that produce the least improvement!
Someone really should come up with a less ambiguous language and figure out how to get people to use it (and a way of maintaining that unambiguity through natural development of the language...)
@@rmsgrey Isn't assembly/assembler very fast?
@@namef as a web dev, I do spend my time thinking about optimization when I can, because I enjoy it. But "easy to change" is really more valuable to the business than microsecond speedups for the end user. As much as it pains me to say it, it is only really a good use of dev time to optimize for known bottlenecks.
(Arguably, clean, simple, readable code will more or less coincide with algorithmically efficient code anyway. So far the user-facing bottlenecks I've seen have been embarrassingly sloppy to begin with)
The second I saw your code running in 32 days, I knew it was gonna be beaten down to seconds; I didn't expect the parallelism and bitwise tricks to take it to microseconds
It's incredible what a dramatic effect extreme (and even simple) optimizations can have on the efficiency of code and algorithms. I used to be heavily into this kind of stuff during my computer science degree (not so much since I got a job actually programming)
I think coders coding code to speedrun a problem is an awesome idea
The quick brown fox... was actually used in teletype communications a lot. It tested both the Baudot-code paper tape reader at the sending end and the paper tape punch at the receiving end for proper operation. The human operator would see at a glance whether the transmission was received ungarbled. Back in the day this type of communication took place over HF analog radio
and the dog kept being lazy.. never got that out of him
@@RoelandJansen I always thought it should've been a cat, but the alphabet is badly designed
@@trueriver1950 Yeah, you are right. Four lazy cats here and one dog, supposedly on meth or something
Always fun to do stuff like this. In undergrad I would pretty often write a crappy "it works" version of code and then try to write a more optimized version before the first one finished. I had a pretty good track record of getting code that would have taken ~10+ hours down to more like 10 minutes. In fact, I generally don't have a lot of patience for any code that's going to take more than an hour or so.
Generally that's how it is supposed to be. Micromanaging and doing performance analysis on unfinished code is impossible, so first write the code, test it, find the areas which need improvement, and then improve them.
This is a guilty pleasure of all beginner programmers. A lot of people will think about the performance of an operation way too much when the function or functionality doesn't require it, nor is it even complete. Having an algorithm is fine, but changing it multiple times while working on it to further "improve" the performance is often a waste of time.
This is not to say you should go at it without a plan; however, like you said, first make a working version, then improve it. Now you can test it and actually see the difference your changes are making, which will tell you whether or not you need to improve performance.
It's quite telling of the general competency of society that there are so many skilled mathematicians and programmers that this puzzle was crushed for no reward. It's kind of incredible what happens when there is a public leaderboard.
Thanks - that was great fun. I'm a fan of code optimization, and we often found software optimizations of 1 or 2 orders of magnitude were pretty easy to achieve. This story tipped me over the edge to subscribe.
And that is how progress can be made: Work together, share ideas, improve things to their best. Great example.
"Very rarely Python code goes into production" - Matt Parker, 2022
A mathematician's perspective, I guess.
Mathematicians wrote NumPy and SageMath so they apparently like Python as well. Personally I love a functional language with minimal syntax. Like I love LISP. It's a personal problem.
I would like to correct the mentioned "Python is not for production" sentence, as I am an engineer who puts Python into production :), and that is perfectly fine. The key insight is: what is the goal? If you want speed in your production code, you definitely want something that compiles and is optimized for speed: C/C++/Rust, Go, perhaps Java nowadays. But if the code is not required to be the fastest, just good enough, and the key point is that guys like Matt can touch it and put things inside, Python is perfect. It is perfect for writing tools and helper scripts that go into production, as we have been doing for 15 years. I'm sure this is also why Python is so popular, not because so many people prototype things :).
Matt wrote a piece of code, ran it, waited a MONTH for the result, and then said Python is not for production, while I am sure the winning sub-second code could be rewritten in Python and would still perform MUCH better than the original naive implementation. I'm sure Matt would be able to write C++ code for this that would run for a month :D. It was mostly not about the chosen languages, but about the poor choices of algorithm and data structures. I know the same code in C++ will be much faster than in Python, but not by this margin.
I was just wondering: if maths is also about finding a solution efficiently (O-notation, P/NP problems, etc.), how was this MONTH-long execution acceptable for Matt? :D After all, he is a mathematician. And imagine if, after a MONTH, it turned out your Python script contained a syntax error in the printout_result() function :). Or imagine spotting an error in the implementation after 25 days of execution. I'm sure he tested it with shorter words, though.
Good video apart from this as always, enjoying the channel :).
I agree. Additionally, production code often combines several languages. For example, Python is popular in machine learning, but the low-level routines in those Python libraries are often implemented in a lower-level language like C++
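As a tiny sketch of that split (just an illustration, using NumPy's compiled sum as the stock example):

# NumPy's vectorised sum runs in compiled code under the hood,
# so it vastly outpaces the equivalent interpreted Python loop.
import numpy as np

xs = np.arange(1_000_000, dtype=np.int64)

total_py = 0
for x in xs:              # every iteration goes through the interpreter
    total_py += int(x)

total_np = int(xs.sum())  # one call into NumPy's C implementation

assert total_py == total_np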
"but requires that guys like Matt could even touch it and put things inside, python is perfect." 🤣 "I'm sure Matt would be able to write C++ code for this that would run for a month :D" -- Ouch! rofl
"I was just wondering that if Maths is also about finding a solution efficiently..."
Is it though? I was of the mind that math is generally more concerned with clever/insightful symbol manipulation to arrive at a solution than with that solution's computational efficiency.
Your examples are from the fields of math at the closer end of the spectrum to Computer Science, and they are specifically about applying math _to_ the problems of determining efficiency (O-notation) and completeness (P/NP), so I don't think they are a good indicator of the concerns of math and mathematicians as a whole.
Computer Science, on the other hand, is very much about efficiency as a core concern, since the whole point of it is to apply algorithms to actual machines actually processing the computations. The major example here is the "store the words as an unsigned 32-bit integer and use bitwise-and to compare them" strategy that is at the core of all of the best solutions (a quick sketch follows at the end of this comment). That is pretty much entirely a Computer Science/Electrical Engineering insight that relies on one's knowledge of how computers work.
That said, maybe a mathematician is not the best person to discuss the readiness of a programming language for production or prototype usage, so I agree with you on that point.
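To make the letter-mask trick concrete, here is a rough Python sketch (word_to_mask is a hypothetical helper name, not taken from any of the actual solutions; the 26 letter bits fit comfortably in the 32-bit integer mentioned above):

# Rough sketch of the bitwise-AND strategy: two words share no letters
# exactly when the AND of their letter masks is zero.
def word_to_mask(word: str) -> int:
    mask = 0
    for ch in word.lower():
        mask |= 1 << (ord(ch) - ord('a'))  # bit 0 = 'a', ..., bit 25 = 'z'
    return mask

fjord = word_to_mask("fjord")
print(fjord & word_to_mask("gucks") == 0)  # True: no shared letters
print(fjord & word_to_mask("float") == 0)  # False: both have 'f' and 'o'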
I feel as though this should be dubbed "the Parker algorithm" as a nod towards the beloved "Parker square"