going fast is about doing less

Поділитися
Вставка
  • Опубліковано 29 січ 2025

КОМЕНТАРІ • 282

  • @NStripleseven
    @NStripleseven Рік тому +1496

    Funny how the video started with “alright so let’s do the first thing you’d probably think of, just caching some states in a hashmap” and ended with “so now we get rid of the hashmap.”

    • @leddoo
      @leddoo  Рік тому +253

      spoilers!!! 😆
      but yeah, when i discovered that, i knew i had to make this video :D

    • @entcraft44
      @entcraft44 Рік тому +44

      That's actually a good example on why you should always test your optimizations. They might not always actually work, and worse, they can stop working if you tweak something else.

    • @spfab3429
      @spfab3429 Рік тому +48

      ​@@entcraft44 I'd argue it's not so much an example of "test your optimizations" and more an example of "ensure your earlier assumptions/optimizations are still valid". He did test the hash map when he implemented it and it did improve the performance, but he changed so much of the code and the algorithm afterwards, that by the end, the code that he originally implemented the hash map for simply doesn't exist anymore. And the new code, that exists now, has therefor a basically untested hash map.

    • @linebreaker8751
      @linebreaker8751 Рік тому

      Actually it may be needed to make one more clay robot than needed for geode robot becouse of the obsidian robot. And at 16:30 he just missed that fact! Actulally he should check if there is enough of obsidian robots first before than checking geode robots! The except of that this optimazition was amazing!

    • @NStripleseven
      @NStripleseven Рік тому +3

      @@linebreaker8751 I don’t remember the exact code that was displayed but I think it accounts for the maximum possible amount of clay you could need in one minute, which just happens to usually be that of the geode bot.

  • @matt-dx1jo
    @matt-dx1jo Рік тому +479

    I can verify this rule works. Whenever I do less myself (i.e., copy other people's code more) the code ends up being much faster.

    • @notnullnotvoid
      @notnullnotvoid Рік тому +24

      Sounds like it's high time to get good.

    • @batlin
      @batlin Рік тому +35

      @@notnullnotvoid that's a pretty unpleasant comment.

    • @korn6657
      @korn6657 Рік тому +16

      ​@@notnullnotvoid harsh but fair

    • @hhjkfvbu5846
      @hhjkfvbu5846 Рік тому

      Hhg

    • @gianni50725
      @gianni50725 Рік тому +3

      @@batlinits a joke just like the original comment

  • @Avighna
    @Avighna Рік тому +338

    The fact that you ended with removing the hashmap ... simply amazing!
    You started with the slowest possible approach and then worked your way up to the best where you were exploring so less that the hashmap became the overhead! That's some optimization!
    Don't be too scared to start with the slowest approach or use hashmaps, you might end up discarding them at the end like this guy. Amazing video!

  • @Inirit
    @Inirit Рік тому +162

    Incredible video. I've been a professional software engineer for nearly 10 years but I don't often have to deal with optimization problems and as such my low-level optimization skills are relatively poor. Walking through all your steps so clearly and concisely really helps me understand some of the different techniques that can be used and is a great addition to my general skills as a programmer. The final optimization in this particular scenario being to merely remove the naive first pass at optimization (i.e. the hash map) was especially revealing.

  • @kanuos
    @kanuos Рік тому +242

    The last optimization was just the perfect icing on the cake. I was thinking along the lines of "Oh, now is the time for multi-threading!" but never in a million years I would have guessed THAT. Subscribed!

  • @gingeas
    @gingeas Рік тому +16

    incredible explanation of branch-and-bound. I notice every year there are quite a few questions that don't have a fancy solution, but just require optimizing an okay one; for example the 2022 day 16 problem ("Proboscidea Volcanium") which is another branch-and-bound, or the 2020 day 15 ("Rambunctious Recitation") which just requires a slightly-optimized brute force solution (it's the Van Eck sequence and has no optimal solution)
    these sorts of questions i really like for aoc as it helps break the stigma to those i teach programming to in that you don't have to memorize 75 algorithms to be good at competitive programming: just need a propensity to sit down and solve a good puzzle!
    looking forward to your later videos

  • @OrtinFargo
    @OrtinFargo Рік тому +16

    I find myself continuously returning to this video because it is such a well-done video. it doesn't cover naive but commonly thought optimizations like multi-threading, which could lead to poor performance on older devices, nor does it delve into the process of optimizing the hashmap to the point of oblivion! What sets this video apart is that it demonstrates how to approach a problem and generalize it to enhance the algorithm. Many other videos that merely provide the memoization solution and call it a day, however this video goes the extra mile by presenting clear and concise steps for general optimization that could be used in any application. In my opinion, this video is an invaluable resource and explain in an entertaining manner that you have gained yourself a subscriber.

  • @pseudo_goose
    @pseudo_goose Рік тому +69

    This was definitely one of the most fun problems to optimize this season.
    I used a similar approach, but it looks pretty different (in particular, non-recursive) because I've had a lot of experience implementing search algorithms for these kinds of problems. They are basically pathfinding problems, where you try to find the path of moves (or sequence of actions) with the least cost (or the largest score).
    In my case, I just used BFS, where my score at each state is the "minimum achievable score": geodes_collected + (time_remaining * num_geode_robots) . I also added some simple best-case pruning - if it can't beat the best "minimum achievable score" so far, even if it built geode robots in every future turn, then it's not worth searching that branch.
    I also noticed that you can remove "waiting" from the list of actions, which significantly reduces the number of states. My logic goes: Waiting is a redundant state, because you always have a specific robot you want to build (unless you are running out of time), and you always want to build that robot as soon as possible. So my action becomes the question "what is the next robot I want to build?" and the waiting is built into the state transition, it fast-forwards until either time runs out or it can build that robot.
    12:40: Another way to make this safe is to use the bytemuck crate, which encapsulates the unsafe transmute using a bunch of checks on the type layout - requiring #[repr(C)], ensuring there isn't any padding that would be uninitialized, etc. I use bytemuck a lot for shader bindings in graphics programming; since they (mostly) use the C memory layout, I can just specify all the shader variables in a struct and then easily convert it to a byte slice to write to a GPU buffer.

    • @leddoo
      @leddoo  Рік тому +18

      thanks for mentioning bytemuck, i'll definitely recommend that next time!

  • @AndreasHontzia
    @AndreasHontzia Рік тому +69

    The number of times where I could just slam dunk a programming problem with math is quite high. Knowing that you are in the right complexity class is key. I like your theoretical analysis of the game.

  • @Aidiakapi
    @Aidiakapi Рік тому +13

    Props for this video, this was a really interesting problem, and you explained your steps and reasoning behind them very well. For fun, I went back and reviewed my own code, and what I did for this day, and there's some more interesting optimizations:
    - I implemented "no idling" differently, and instead calculate the time until one robot could be produced with the current resources, and skip ahead to that frame. This cuts out all the "waiting states".
    - I set up a much tighter upper bound, by having two assumptions, and then calculating the time until we can start producing geodes, and how many it'd produce. The assumptions are:
    --- After the first clay robot has been produced, ore is infinite.
    --- After the first obsidian robot has been produced, clay is infinite.
    - I also compute: lower bound = owned geodes + geode robots * remaining time.
    - Instead of recursion, it uses a binary heap, with states ordered by highest upper bound first, and then highest lower bound. Once the upper bound is >= any previously seen lower bound, we know we are done, and the previously seen lower bound is the actual maximum.
    So we visit a state, to test paths where we produce any of the new robots, and then compute new bounds on the geode count for those states. For the example in the video, on part one (24 minutes), it visits 343, and computes bounds for 1246 states, which runs in 29µs. For part two (32 minutes), it visits 5373, and computes bounds for 21259 states, which runs in 865µs.

    • @leddoo
      @leddoo  Рік тому +6

      those numbers are sick :D
      gonna have to give your approach a try at some point, sounds super fun!

    • @duncathan_salt
      @duncathan_salt Рік тому +4

      are those assumptions about infinite resources sufficiently robust? would it not be possible to construct a blueprint which breaks those assumptions?
      my initial thought would be to treat resources as infinite as soon as you have as many robots for that resource as the highest cost of that resource. is that too conservative?

    • @Aidiakapi
      @Aidiakapi Рік тому +2

      ​@@duncathan_salt They are an overestimation, but the upper bound is never used as an answer, it is only used to cull the search space, because if using these overly optimistic assumptions, you couldn't even beat a previously seen result, it could never be a solution worth visiting.
      The real answer could *never* produce more than ore/clay than infinite, so the real amount of geodes produced could *never* be higher than this upper bound.
      The key is the interaction with the lower bound, which is just "how much would I have if I idled the rest of the time". Once a state we've visited, where it just idles the remainder of the time, can produce more or the same amount of geodes, than our optimistic upper bound with infinite resources, we know that the solution with this upper bound could never be better than any previously seen.
      Because we pick off the item with the highest upper bound, any other states we might have to visit will also never be able to beat a previously seen state.
      Hope that explains it a bit better.

    • @duncathan_salt
      @duncathan_salt Рік тому +1

      @@Aidiakapi ah! I understand now. that's very clever! tyvm for the detailed explanation, it's much appreciated.

  • @sinom
    @sinom Рік тому +4

    The camera didn't die. It was just doing less

  • @matroqueta6825
    @matroqueta6825 Рік тому +24

    I had a very similar experience with AoC 2019 day 18, Many Worlds Interpretation (except that problem took me about 4 or 5 days to complete). Likewise went from a recursive solution that would DNF, to using a cache to getting it to finish in a couple of mins, and then improving it gradually by finding ways to identify clearly terrible strategies (like walking over a key but not picking it up in the 2019 case) and removing them from evaluation.
    Many of the advanced AoC problems are really great at teaching the "do less = go fast" principle in a very natural way

    • @thekwoka4707
      @thekwoka4707 Рік тому +6

      Yeah, in 2022, there is one related to movement between caves to turn valves.
      You can actually step through caves, but you actually only need to be concerned with the order of valves to turn, since any steps in caves not getting you directly to the next valve is wasted.
      My fastest solution actually just bruteforced every possible valve order end value since it was so quick compared to any kind of simulation.

  • @Rin-qj7zt
    @Rin-qj7zt Рік тому +8

    That cache removal took me off guard but it makes so much sense and is a perfect example of unintuitive optimization processes. There will be solutions implemented that were only solutions at the time they were implemented.

  • @devtadeo
    @devtadeo Рік тому +120

    5 months ago UA-cam recommended me your video of why python is slower than your language, it's impressive to see someone that really puts effort in videos with those animations and good explanation of the topics, one of the best channels I've ever subscribed ❤

    • @leddoo
      @leddoo  Рік тому +16

      aww, thanks!
      if you haven't seen them already, i think you'd also love the videos by @strager_ (who i've stolen the tip/exercise thing from 😁)

    • @kissinger2867
      @kissinger2867 Рік тому

      @@leddoo UA-cam just recommended me this video and I'm already sold. It's so rare to find a computer science channel with true computer scientist who is passionate and knowledgeable about the subject of computation and not a tech influencer.

    • @devtadeo
      @devtadeo Рік тому

      @@leddoo yeah i know @strager_ i love his videos, i pretty much follow every programmer that explains with animations/slides like tantan altough i dont understand rust but the videos are enjoyable anyways

  • @jukkajylanki4769
    @jukkajylanki4769 Рік тому +2

    Great video! Here is one further state space optimization: remove 'idling' as an option. That is, instead of each child node being the parent node + 1 turn passes, make each child node be the result of a "what do I choose to build next?" choice. For each choice, calculate immediately how many turns it will take to build the thing (how many turns are needed to wait for resources + do the build) before recursing to the child. That is a generalization that should obviate the awkward can_build booleans and removes all "I idled" child nodes from the graph.

    • @JoQeZzZ
      @JoQeZzZ Рік тому +1

      Yep, this + caching is how I solved the problem when I did it.
      It's not a matter of idling, it's a matter of deciding what to build next and waiting the required number of turns to do so.
      There are definitely further easy optimisations, like not building a producing robot if you're already producing enough as was stated in the video, but simply removing all idling steps was enough for me to get it into acceptable (making a cup of coffee abd coming back) running range for me.

  • @SciDiFuoco13
    @SciDiFuoco13 Рік тому +32

    Very cool video! I remember bruteforcing this in 5 minutes and getting on the global leaderboard just to then spend 5 hours trying to optimize it down to milliseconds. I used most of the techniques used in the video, although it took me a while to realize especially the second to last one (although I implemented it in a slightly different way, i.e. for every kind of robot wait until you can build it and then do it; this means that exploring a node doesn't coindice with a single turn though). In the end I spent a lot of type trying to find even tigher upper bounds (it's surprising how many more nodes you can skip with that!)

    • @leddoo
      @leddoo  Рік тому +4

      oh, that sounds really interesting! do you have your solution on GH?

  • @lewismassie
    @lewismassie Рік тому +6

    Realising what you had to do for that last step is the true wizardry of programming. I certainly wouldn't have figured that out

  • @CWunderA
    @CWunderA Рік тому +1

    This was fantastic! What a great example of how to think about and tackle a complex problem systematically. How you walked through each step, the intuition behind each change, and built the solution up from parts was perfect. I would love to see more videos like this.

  • @CottidaeSEA
    @CottidaeSEA Рік тому +8

    I wish I got to do more stuff like this at work. This is the kind of stuff that really excites me. There's something in me that just goes kind of crazy when I make something blazingly fast.
    I made a function go from taking 3 minutes to just below 2 seconds before, I could've done even more, but I couldn't really justify it. It was also fast enough by a long shot even if it had taken 30 seconds.

    • @jimmyhirr5773
      @jimmyhirr5773 Рік тому

      Not sure what you do at work, but I've heard that these industries put a high priority on performance:
      Video games (particularly ones with high graphical quality)
      High frequency trading firms

    • @CottidaeSEA
      @CottidaeSEA Рік тому +1

      @@jimmyhirr5773 Yeah, they do. I primarily work with e-shops. Performance is a big part of what we do to stay competitive, but it's very much just until it's "good enough" and nothing more.

  • @iliedemian8461
    @iliedemian8461 Рік тому +2

    PLEASE UPLOAD MORE OF THESE VIDS
    this is the best someone has explained a DP problem ever i love this video❤❤

  • @Noitcereon
    @Noitcereon Рік тому +4

    Tips in the video:
    #1 (1:40): Try doing less
    #2 (4:13): Start with an algorithm that is definitely correct
    #3 (6:34): Use caches to avoid expensive work multiple times
    #4 (8:23): dynamic programming is just recursion + caching
    #5 (13:30): Understand how abstractions work, so you can use them effectively

    • @CWunderA
      @CWunderA Рік тому

      #6 Benchmark/test each optimization and remove earlier optimizations if they are no longer beneficial

  • @TheRealBrenny
    @TheRealBrenny Рік тому +1

    I don't usually comment, but that ending was the most satisfying ending I've seen in a long time.

  • @teofilciobanu
    @teofilciobanu Рік тому +3

    I just discovered your channel and this is such a great video. Very fun to watch and very inspiring. Great work, I'm looking forward for your future videos.

  • @protheu5
    @protheu5 Рік тому +1

    It's like a magician's performance, that twist at the end. Brilliant!

  • @skyslycer
    @skyslycer Рік тому +11

    This is so cool. I didn't understand anything after a certain point but still interesting.

  • @sanjsahayam5271
    @sanjsahayam5271 Рік тому +1

    Love the step-by-step approach to optimisations used here. Great job! Love to see more of these types of videos. 🙌🏾

  •  Рік тому +2

    Best software video I've seen lately with a great ending, subscribed.

  • @Jankoekepannekoek
    @Jankoekepannekoek Рік тому +2

    Never expected that last optimisation! Subbed!

    • @leddoo
      @leddoo  Рік тому +1

      i didn't see it coming either, that's why i made this vid :D

  • @yash1152
    @yash1152 Рік тому +1

    6:40 yes, i was wondering that too
    and it's awesome. the terminology used - hits/misses matches the terms used in paging and page replacement algorithms pretty well :)

  • @cyanoswag
    @cyanoswag Рік тому +4

    Your video style and clarity in the explanation is awesome! Thank you.

  • @zbynekjurica
    @zbynekjurica Рік тому +1

    This is a really great video! So much information, learnt a lot. Thank you! Would love to see more aoc optimization videos in the future!

  • @taukakao
    @taukakao Рік тому +25

    I think what's important to see here is that every optimization has it's cost, either code readability, complexity, or overhead (as with the HashMap). Balancing them is the most important task in my opinion.
    I think in many cases having easier to understand code can be more important than the speed increase. (And sometimes a seemingly better solution will result in a worse execution time, especially hard to notice in garbage collected languages)

  • @kjaamor2057
    @kjaamor2057 Рік тому +13

    I'm a novice programmer who has never used Rust but I watched this from start to finish. Superb piece of presenting, aside from anything else.
    I'm probably several years off implementing something like this, but as a novice its a nice way of being introduced to concepts.

  • @Roy-K
    @Roy-K Рік тому +1

    Absolutely incredible, your explanation as you went really helped to guide me through the thought process you used - hope to take advantage of this soon!

  • @thegreatdissector
    @thegreatdissector Рік тому +1

    I watched this video a few days ago and decided to hold off on watching the rest so I could have a go at solving the problem on my own time. I spent too much time trying to do so and eventually caved to see what "the solution" was.
    Turns out, there is no "the solution"! That was exactly my problem. I was trying to find an elegant algo that would get the job done. But this video revealed this is nothing more than simply beginning with brute force, followed by a series of clever prunes to the space. Moral of the story - just start with the brute force!

  • @raylopez99
    @raylopez99 Рік тому +11

    Nice, it's like checking the min-max score on a chess tree and then truncating the branches that will not give a higher score. How chess engines work. My solution was to simply randomly try different combinations and see what works, then store the learned preferred combination in the final production code. I wonder if this is "almost as good" as the binary tree approach here; of course it will by definition be suboptimal but it's easier to code for sure.

    • @thekwoka4707
      @thekwoka4707 Рік тому +1

      The chances you would get the right thing that way are extremely low with the context of the problems.
      You have to check a lot of different blue prints that have different robot resource costs to find the best.

  • @spicybaguette7706
    @spicybaguette7706 Рік тому +61

    One low-hanging fruit optimization is to use a faster hash function, like fnv. SipHash is only really necessary in cases where you have untrusted/user input as your key material, because it's HashDOS resistant

    • @thekwoka4707
      @thekwoka4707 Рік тому +5

      Or even a really naive stringification of the state.
      Just the number values of each section (time, ore, bots? With a hyphen in between lol
      But removing the cache entirely is faster

    • @professortrog7742
      @professortrog7742 Рік тому +13

      @@thekwoka4707 seeing as all those numbers are 8-bit ints i would say that it is quite obvious how to glue them together without falling back on strings.

    • @leddoo
      @leddoo  Рік тому +22

      good point!
      using fnv made the initial version about 30% faster

    • @kintrix007
      @kintrix007 Рік тому +7

      ​@@thekwoka4707 I am failing to see how that would bring any benefit? You have eight 8-bit integers in a struct. That means you have 8 bytes. You have it in a struct, which is to say that memory is grouped together. Thus, if you read 8 bytes, you just got all of the data in a non-ambiguous way as a u64, an 8 byte integer.
      What you are proposing is to instead allocate up to 8 times 3 bytes for the numbers and 7 bytes for the dashes. You made 8 bytes of data grouped together in one place into 31 bytes of data grouped together in another place. I have severe doubts that would lead to speedup. You just made the hashing function slower by having to hash more data.

    • @gwentarinokripperinolkjdsf683
      @gwentarinokripperinolkjdsf683 Рік тому +1

      @@leddoo was it enough to justify using the hash in the final version or no?

  • @Teflora
    @Teflora Рік тому +1

    That was a satisfying ending! Tbh did not expect that, but it makes a lot of sense!

    • @codesymphony
      @codesymphony Рік тому

      why didn't he show code for last step?

  • @colox97
    @colox97 Рік тому

    the last 2 software project i started are paused cuz i couldn't figure out a way to solve the problems i was having.. it's so nice to see someone reasoning trough a problem and solving it efficiently

  • @DB-Barrelmaker
    @DB-Barrelmaker Рік тому +3

    I think I now understand the point of memorization. Man what a round about way of realizing something

  • @SWinxyTheCat
    @SWinxyTheCat Рік тому +4

    The best way is to just return the correct answer immediately

  • @HippieInHeart
    @HippieInHeart Рік тому +1

    I'm not even gonna pretend that I understood most of what you said, since I'm pretty much still at the very beginning of learning programming, but it was pretty interesting regardless. Thanks for making this video. Maybe it will help me one day in the future.

  • @glevco
    @glevco Рік тому +2

    Great video, keep it up! Would love to see more like this, specially in Rust

  • @jeffreysun7983
    @jeffreysun7983 Рік тому

    A good lesson to take from this is to solve as much of a problem as possible before letting the machine take over. I can imagine someone working on this one long enough that they can practically solve it with pen and paper, then the final algorithm can be in the μs range

  • @ataraxianAscendant
    @ataraxianAscendant Рік тому +5

    babe wake up new leddoo video

  • @iRemmi
    @iRemmi 4 місяці тому +1

    this is an excellent, well explained video. thank you

  • @lucasa8710
    @lucasa8710 Рік тому +1

    this is a masterpiece video. incredible

  • @Turalcar
    @Turalcar Рік тому +1

    I would usually put "memory layout" under "doing less" as in doing less memory fetching.

  • @jeremysollars5922
    @jeremysollars5922 Рік тому

    Continue this style of video and you got yourself a long term viewer

  • @1vader
    @1vader Рік тому +5

    I'm pretty sure the unsafe transmute is undefined behavior if you don't specify repr(C). As you said, the field layout is undefined, which also means there can be padding in between and reading padding is undefined behavior.

    • @leddoo
      @leddoo  Рік тому +3

      yeah, i should have probably mentioned padding.
      but in this case, there's no UB, cause the struct consists of 8x u8 values, and those are transmuted to a u64. a u64 is of course 8 bytes large, and neither u64s nor u8s have any invalid bit patterns.
      if the compiler did insert padding (which would be sub-optimal here), the size of the struct wouldn't be 8 bytes, so the transmute would raise an error.

    • @KaneYork
      @KaneYork Рік тому

      The fact that it's "either a compile error or not UB" is actually mind blowing to me, someone who is used to bad tools that let you do bad things

  • @RoryDavidWatts
    @RoryDavidWatts Рік тому +2

    This is an excellent video, thank you for putting the effort into making it.

    • @leddoo
      @leddoo  Рік тому

      glad you enjoyed it! :)

  • @jonarmani8654
    @jonarmani8654 Рік тому +5

    so much was deleted by the end, even your camera
    doing more with less

  • @dariogifc0
    @dariogifc0 Рік тому +2

    You forgot to mention that 12.7 million is approximately the population of Bavaria. Unforgettable omission.

  • @Razvvannn
    @Razvvannn Рік тому

    Just stumbled upon your channel. Great video! Explained gorgeous the ideas! Instant subscribe

  • @meowoasdgjoiagjoi
    @meowoasdgjoiagjoi Рік тому

    wow, I wish I could like a video twice, that ending was amazing

  • @chezlonian
    @chezlonian Рік тому +2

    This man talks like the engineer I want to be. He has an understanding of low-level processes that make me embarrassed to claim to be an engineer. How do I learn this kind of stuff? Is there a community?

    • @leddoo
      @leddoo  Рік тому

      there is, in fact!
      i have a discord, link in the description.
      but the real deal is the handmade network! (handmade.network/ discord.gg/hmn )
      other than that, i always followed my curiosity, when i wondered "how does this thing actually work underneath?"

    • @chezlonian
      @chezlonian Рік тому

      @@leddoo Looks like the discord link in the description isn't working :(

  • @girishpoojari2910
    @girishpoojari2910 Рік тому +1

    Great video, I didn't understand everything but still some of it made sense. Thank you,

  • @tunafllsh
    @tunafllsh Рік тому +1

    This is amazing! Less is more until more is less.

  • @sebaztian18
    @sebaztian18 Рік тому +1

    Incredible explanations! Keep it up!

  • @bigjukebox3370
    @bigjukebox3370 Рік тому +1

    this is brilliant. Well done and very interesting example

  • @romanstingler435
    @romanstingler435 Рік тому +1

    Definitively worth a follow

  • @senzmaki
    @senzmaki Рік тому +1

    there's no reason for a programming video to go this hard 🐐🐐🐐

  • @gawwad4073
    @gawwad4073 Рік тому

    That ending was just beautiful

  • @BarsDemirdelen
    @BarsDemirdelen Рік тому +1

    One way I optimized this problem was building a better state tree. Instead of having actions like if can build robot, build robot, or wait, I changed my actions to be: Either build a x robot at t2 or build a y robot at t4. There is no wait action at all. I believe your no idling pruning is in the end equivalent to this, but I am mentioning this as just a different way of thinking.

    • @leddoo
      @leddoo  Рік тому +1

      > a different way of thinking.
      yup, a few others have done it that way too, and i love it :D

  • @prestonharrison2140
    @prestonharrison2140 Рік тому +5

    This is an insanely cool video.

  • @orbital1337
    @orbital1337 Рік тому

    As someone who has worked quite a bit on optimization problems like this, I also enjoyed this particular problem of AOC and your video about it! Branch and bound is a classic technique and this is a nice and simple example of how powerful it can be. Just for fun I benchmarked my own solution that I wrote a couple months ago against yours. Averaging over 10 runs I get:
    Part 1 | Part 2
    Mine | 1.3 ms | 47 ms
    Yours | 6.3 ms | 107 ms
    Our approaches are a bit different. You use DFS whereas I use BFS with a lower bound heuristic (build the most expensive robot possible). These are both okay (usually best-first search ala A* is better but I was too lazy for that). You use some domain knowledge to prune like only building geode robots when possible but I didn't bother with any pruning outside of comparing the lower and upper bounds. On the other hand, I put in more work into my upper bound which often gives the biggest bang per buck for branch and bound algorithms.
    Usually a good strategy for an upper bound is to relax the problem to something simpler that you can solve exactly. The relaxation I came up with is the following. Given a starting state of resources, create a copy of the resources for each type of robot - lets call this their personal inventory. Building a robot takes resources from their personal inventory. But when a robot *produces* a resource, that resource is added to *each* robot type's personal inventory. Each minute, you're allowed to build up to one robot of *each* type at once.
    This is clearly a relaxation because each robot type's personal inventory is an upper bound on the resources that you would have if they all shared resources. And since you can always decide to just build one robot, a solution to the original problem is a feasible solution to the relaxation. On the other hand, it is easy to construct an optimal solution to this relaxation: simply build robots whenever possible. So that is the upper bound that I used for branch and bound.
    Less is more, but sometimes more is less! In this case, doing a lot more work per state reduces the number of states to ~2300 in part 1 and ~81600 in part 2.
    Code: gist.github.com/t-troebst/466cb7017a295ab4d4a571238876f0a8

  • @thoperSought
    @thoperSought Рік тому

    that was a lot of fun!
    I don't use Rust, but I've been curious about it for a while, and I really enjoyed this

  • @reprC
    @reprC Рік тому +2

    At 12:06, technically it is safe to transmute this into a u64, but Rust doesn’t guarantee preserving field order for structs, only drop order. You’d ideally want to annotate Pack with #[repr(C)] to prevent rustc from reordering fields. Also, it may be a good idea to explicitly align the struct too ( with #[repr(C, align(8))] ). Struct alignment is calculated as the max alignment of each of its fields. Since Pack is a struct of u8’s, it has an alignment of 1, but u64 has an alignment of 8. mem::transmute()’s docs says the compiler will take care of ensuring compatible alignment, but I am not 100% sure if adjusting the unaligned value adds any overhead so explicitly aligning couldn’t hurt.
    EDIT: unpaused and saw you addressed the field ordering and repr. Hopefully this comment is still helpful for anyone interested

    • @leddoo
      @leddoo  Рік тому +1

      hmm yeah, repr(align) could be a good idea 🤔

    • @gfasterOS
      @gfasterOS Рік тому +1

      @@leddoo I think you need more than just alignment, it's conceivable that a profile-guided optimization would introduce padding within the struct. Without repr(C), it is reasonable to assume that the transmute is undefined behavior.
      Irrational aversion to unsafe isn't good, but there are a ton of potential footguns and it really should be treated as a "going faster" optimization reserved for when you know the safe variant is actually *much* slower.

    • @jole0
      @jole0 Рік тому

      Why does it matter if the fields are reordered?

    • @yjlom
      @yjlom Рік тому

      @@jole0 it doesn't, since they're never communicated between different versions (or even instances) of the program

  • @TaiwoMegbope
    @TaiwoMegbope Рік тому

    Thank you very much! This is profoundly useful.

  • @ItsLaxe
    @ItsLaxe Рік тому +1

    incredible video, thanks a lot

  • @chrissalgaj4111
    @chrissalgaj4111 Рік тому +1

    Absolutely awesome vid!

  • @curtmack
    @curtmack Рік тому

    One of the big reasons I love Lisp languages is that I can write a macro that memoizes functions for me.

  • @dudusami89
    @dudusami89 Рік тому +1

    amazing content, thank you for your time!

    • @leddoo
      @leddoo  Рік тому

      thanks for taking the time to leave such a nice comment! 😁

  • @syncrossus
    @syncrossus Рік тому

    My first thought was to explore the tree using A*. I'd have to actually work through it more thoroughly to be sure, but think it would be equivalent to all of your optimizations except "u8" and "enough bots".

  • @awfultrash888
    @awfultrash888 10 місяців тому

    Pretty good video man!

  • @keokawasaki7833
    @keokawasaki7833 Рік тому

    it is so satisfying to see those numbers drop

  • @atlasxatlas
    @atlasxatlas Рік тому +1

    i was struggling with aoc22 d19 for months now.
    ofc i thought of brute forcing it but instantly disregarded this method cause i was determined to fine some more logic way to solve this.
    i tunnlevisioned this ideal solution and at the end couldn't find it.
    finally i watched this video, which i had saved for a month after i watched it a little and realized it's about the same problem which i was stuck on.
    this is incredible stuff. im so glad i stopped being stubborn. i let go of my self assigned goal of 'if i do this it will be by my own force'.
    the _smartest brute force_ is an incredible method and i am so glad to have learned it.
    thank you

  • @denversupermarket7484
    @denversupermarket7484 Рік тому +1

    That was beautiful

  • @filipenicoli_
    @filipenicoli_ Рік тому

    that was just beautiful

  • @jm-alan
    @jm-alan Рік тому +1

    This is exactly the kind of progressive optimization I love.
    You have me reconsidering my Rust solutions for Project Euler, wondering if there are more unseen optimizations to be had...

  • @alssoldat7185
    @alssoldat7185 Рік тому +1

    Nice work, please consider making more like this, I think it is a category of coding information that is under represented on UA-cam.
    If anyone has suggestions for other sources of such information, please do share :D

    • @leddoo
      @leddoo  Рік тому +1

      i put some more links into the description. you may also enjoy strager_'s work!

    • @alssoldat7185
      @alssoldat7185 Рік тому +1

      @@leddoo Hehe yeah, I already discovered Casey and tantan, they are both great! I have also stumbled across strager, but only a video at a time, I'll give the channel a look :D Thanks!

  • @isaiahhonor991
    @isaiahhonor991 Рік тому

    Wonderful video!

  • @majorjohnson8001
    @majorjohnson8001 Рік тому

    Came back to this just now to implement things myself to have this code to look back on in the future and what's odd is that I'm not getting the same cache states-visited and cache-hit-rate that you did and removing the cache actually slowed down my code (110ms -> 160ms). I don't quite have the same low level optimizations in C# that you have in Rust, but I did use the best equivalent that I could--using bytes and HashCode.Combine( {8 parameters} )--and I had 34 seconds to your 44 seconds on the u8 step (it's where I started) and did the next three optimizations all at once, so I didn't time those. Removing the no-idling gets me... 210ms. Then removing enough_bots gets me... 185ms (yes, it went down when I did that).
    Even tried in-lining a lot of the CanBuild() and GetMaxCost() methods and that still only brought my average time down by about 10ms (smaller than the variance between runs).
    I'm getting the right result, just confused as to the timing discrepancy.

    • @leddoo
      @leddoo  Рік тому

      hmm yeah, that's pretty odd.
      all of the domain specific opts should be lang independent.
      do you have the code on GH?

  • @oysteinmb1
    @oysteinmb1 Рік тому

    This is a great video! Enjoy your other content too, but would love to get some more "tutorial"/"think like a programmer" videos!

  • @lesterdelacruz5088
    @lesterdelacruz5088 Рік тому +3

    My approach to optimization is always dfs and memoize and hope for the best haha. Been programming for 10+ years haha and clearly I haven’t progressed since year one.

  • @theghost9362
    @theghost9362 Рік тому

    Simply amazing

  • @Sammi84
    @Sammi84 Рік тому +1

    That ending was wild.

  • @ac130kz
    @ac130kz Рік тому +1

    maybe it's also useful to replace 4 fields with slices and use templates?

  • @silviogames
    @silviogames Рік тому +1

    How cool is that please,
    Wie cool ist das bitte

  • @RiverHurt
    @RiverHurt Рік тому +1

    How did you generate the first game tree? Was it generated with code or did you assemble it manually?

    • @leddoo
      @leddoo  Рік тому +1

      i wrote a brute force version that printed the state on every call (with indentation) & then built the tree manually based on that in google slides.

  • @ExCyberino
    @ExCyberino Рік тому +1

    Pretty good, you'll probably get big soon

    • @leddoo
      @leddoo  Рік тому +2

      thanks! i've been working out for ~2 months now 🌚

    • @ExCyberino
      @ExCyberino Рік тому

      ​@@leddoo I actually meant your channel will get big.
      But I'll believe your body is gonna get big as well. Keep at it

    • @Gunth0r
      @Gunth0r Рік тому

      @@leddoo oh, so you do gardening too?

    • @mastershooter64
      @mastershooter64 Рік тому

      @@leddoo Keep going for 2 more years you'll definitely be big, building muscle is no joke man you gotta get good sleep every night, eat well and lots of protein and hit the gym regularly and of course don't overtrain. It's hard work but if you keep at it for 2 or 3 years it's really worth it. sleep is very important.

    • @leddoo
      @leddoo  Рік тому

      @@mastershooter64 hehe, thanks for the tips!
      eating enough is definitely the hardest part for me. i'm the kind to get lost in a coding problem & forget to eat the entire day. (though dw that doesn't happen too often :D)
      and while i like to track how i spend my time, tracking calories has never really worked for me. though i recently got a new, more accurate scale, so now i at least have some objective feedback loop going on.

  • @thekwoka4707
    @thekwoka4707 Рік тому

    I ran my solution (typescript) to see how long it took and it was 67s for part one, 72 for 2 and used 5gb ram
    And that was with aggressive branch pruning for cases where that branch will never be good (buying a robot you could have bought last turn but you didn't do anything, being able to afford everything and not not buying anything, don't buy past whatever the theoretical maximum limit is for a kind of bot to be able to produce all other bots each turn)
    No state caching since the branch pruning should make states that would match be virtually impossible.

  • @hrmny_
    @hrmny_ Рік тому

    I'm curious about replacing the hashmap key, are rust release mode optimizations not able to basically do the same thing?
    Also, why not do a custom `Hash` implementation for it?

  • @EmersonPeters
    @EmersonPeters Рік тому +1

    Could someone explain what the final step actually was? Was it replaced with a 2D array?

    • @leddoo
      @leddoo  Рік тому

      the final step was deleting all the code related to the hash map.
      so the hashmap param itself & all the lookup/insertion logic.

  • @minuskelvin3619
    @minuskelvin3619 Рік тому

    Branch-and-bound and symmetry breaking are overpowered

  • @0xlogn
    @0xlogn Рік тому

    12:00 - why not manually impl Hash for Pack with the transmute?

    • @leddoo
      @leddoo  Рік тому

      you may also need to implement PartialEq manually then (with the transmute) (not clear that llvm would optimize that).
      both those impls together are like 10 lines of code.
      doing it with a `u64` is just simpler.

  • @laubannenberg5446
    @laubannenberg5446 Рік тому

    This is really cool, it's a very nice demonstration of how using a couple of key techniques in a spiral of improvements can whittle down a problem. But after seeing this, I wonder if you can't draw out this approach to optimizing even further?
    The rules of this game say you can build only one robot per turn. So once you hit time X the point where you can make one geode robot every turn, everything else after that should be expressible as a simple function of Y time -> Z geodes.
    And it only takes a finite number of steps to get to that point in the fastest possible way. Essentially, each moment in time until the takeoff point is a special case with a knowable geode value.
    So you could also just hard-code all those special cases for time < X, and the general function for time >= X.
    That's a bit banal solution, but then, the problem itself has no real randomness or surprises that require dynamic adaptation; the only variable is how many time steps you get. And it's in keeping with the general way of optimization here: aggressively use domain knowledge to do less. It's just kinda the big brother to optimizing a fixed-length loop by unrolling it.

  • @eboatwright_
    @eboatwright_ Рік тому +1

    whoa. subscribed 👍

  • @weiSane
    @weiSane Рік тому +1

    Yo leddoo video idea. Can you do a video on how NASA writes space proof code and give us examples baded on their coding style and thought processes

  • @litaoz
    @litaoz Рік тому

    How does the states skipped get calculated at 8:08 ? I’m still confused by that.