Premature Optimization

Share
Embed
  • Published Nov 21, 2024

COMMENTS • 1.7K

  • @rijaja
    @rijaja 1 year ago +9546

    A sentence I stole online and sometimes tell my classmates is "If your code doesn't work, we don't care how fast it doesn't work". I think it's a great piece of advice.

    • @Starwort
      @Starwort 1 year ago +604

      My data structures and algorithms lecturer said something similar, an anecdote about him replacing a broken algorithm. The exchange was along the lines of 'But your algorithm takes 3 times as long as my algorithm!' 'Yes, but _my_ algorithm gets the right answer'

    • @FightLegacy
      @FightLegacy 1 year ago +168

      Make it work, make it clean, make it fast.

    • @lawrencedoliveiro9104
      @lawrencedoliveiro9104 1 year ago +217

      Correctness comes before efficiency.
      Sometimes people also say “I can make it run a bit faster if it doesn’t have to handle all these edge cases”. Or, in other words, if it doesn’t have to be correct.

    • @Dash323MJ
      @Dash323MJ 1 year ago +102

      A classmate of mine accidentally wrote a tree search with 2 extra unnecessary recursion calls, and the professor determined that the code would eventually work, but it would take much longer than the age of the universe. So... sometimes a bad algorithm is no better than code that doesn't work.

    • @codingbloke
      @codingbloke 1 year ago +100

      That's a clever adaptation of the shorter one they use often when training devs: "It's easier to make working code go faster than it is to make fast code work"

  • @landon.packrat3281
    @landon.packrat3281 1 year ago +1748

    Here is a case I got early in my software development career. Our older software package was taking almost three hours to read the configuration files. My boss gave me the task to convert the configuration files to binary rather than human-readable text, because it would be three times faster. I pushed back and asked for research time, because while there were a lot of large configuration files, they were not THAT large! After doing my research, I found the real problem.
    There were two files that contained data that had to be correlated together. Each entry in the first file had a key to a data entry in the second file, and they needed to be connected in memory via pointers before the program could run. They were read via functions that opened up the file, searched for an entry, closed the file, and returned the entry. It started by manually reading the first file to get a list of keys for every entry in the first file. Then for each key, it would call the function to open the file, search for the key, and close the file. It would then repeat the process for the second file.
    My solution:
    Open both files and read all data entries into memory. Once everything is in memory in two lists, iterate through the first list and match it to the correct data entry in the second list.
    Loading times went from three hours to thirty seconds.
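
The fix described above can be sketched in a few lines of Python (the comma-separated file format and all names are invented for illustration): each file is opened and read exactly once, and the correlation becomes an O(1) dictionary lookup per key instead of an open-search-close pass per key.

```python
def load_entries(path):
    """Read every 'key,value' line of a file into a dict in a single pass."""
    entries = {}
    with open(path) as f:  # the file is opened exactly once
        for line in f:
            key, _, value = line.partition(",")
            entries[key] = value.strip()
    return entries

def correlate(first_path, second_path):
    """Join the two files in memory instead of re-scanning a file per key."""
    first = load_entries(first_path)
    second = load_entries(second_path)
    # an O(1) dict lookup replaces an open-search-close pass per key
    return {key: (value, second.get(key)) for key, value in first.items()}
```

With n entries, the original did O(n) full-file scans per file; this does one scan per file plus n constant-time lookups, which is where a three-hours-to-thirty-seconds drop comes from.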

    • @owlon225
      @owlon225 1 year ago +239

      so basically you moved the opening of the file in front of the iteration and that got you from 3h to 30 seconds? that's awesome

    • @landon.packrat3281
      @landon.packrat3281 1 year ago +258

      @@owlon225 Yep. Because then I only needed to read each file once instead of n times.

    • @skeletonboxers7336
      @skeletonboxers7336 1 year ago +23

      points at ur icon
      Nice :)

    • @shauas4224
      @shauas4224 1 year ago +41

      Had similar problem but on my small program to automatically search for counterexamples for math problem. Having to constantly open and close files is SO slow

    • @nilsp9426
      @nilsp9426 1 year ago +81

      I work with statistical software and I am pretty sure I can say: 100% of my unnecessary performance issues were based on badly written loops that repeated something unnecessarily. I think this is another such case: the file opening and closing was repeated for no reason, and probably even in an explicitly programmed loop. In my use cases, everything that is called only once can be utter garbage in terms of performance without anyone noticing. In your case, this would mean that a function call that opens the file in 100 ms would be equally as good as one that opens it in 1 ms. Only when you force your computer to repeat something unnecessary a million times do you reach time frames that can be perceived by a human. So my 80% move is to pull as much as I can out of loops. In your case it would be the file opening and closing.
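
The "move invariant work out of the loop" pattern is easy to show with a toy Python sketch (names made up; the same shape of problem as the repeated file open/close): the loop body is the part that runs a million times, so that is where a small cost multiplies.

```python
def count_hits_slow(items, allowed_list):
    """Pessimized: a linear scan of allowed_list repeats on every iteration."""
    n = 0
    for x in items:
        if x in allowed_list:  # O(len(allowed_list)) work, done len(items) times
            n += 1
    return n

def count_hits_fast(items, allowed_list):
    """Hoisted: pay a one-time cost before the loop instead of a repeated one inside it."""
    allowed = set(allowed_list)  # invariant work moved out of the loop
    n = 0
    for x in items:
        if x in allowed:  # O(1) membership test inside the loop
            n += 1
    return n
```

Both functions return the same answer; only where the repeated work sits changes.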

  • @ZarBluestar
    @ZarBluestar 1 year ago +3472

    One lesson that always stuck with me since college was that software engineering doesn’t just have one time-cost. It has three: runtime cost, development time cost, and repair/maintenance cost. Which happen to be a perfect allegory to the performance, velocity, adaptability triangle

    •  1 year ago +54

      That's a great point

    • @thunderstruck2727
      @thunderstruck2727 1 year ago +7

      Thanks for sharing! Great lesson

    • @Microphunktv-jb3kj
      @Microphunktv-jb3kj 1 year ago +33

      This video makes me feel like I'm over-optimizing and putting too much effort into adaptability on a real estate full stack app I'm making...
      which probably won't have more than 500-1000 objects in the database, and maybe 10-20 images per object...
      It just made me think I should optimize more for the images rather than the app itself (which won't have insane traffic anyway).

    • @YamianGodlike
      @YamianGodlike 1 year ago +37

      @ Actually, it's 3 points... Because... Triangle?
      I will get my coat.

    • @KennedyApproach
      @KennedyApproach 1 year ago +13

      If it takes (me) more time to optimize it than any and all users of the app are ever going to get back, then (I) don't optimize.

  • @TonyHammitt
    @TonyHammitt 1 year ago +97

    This brings back a memory from grad school. A co-worker of mine insisted that the time to load data was too long and so he cooked up examples that showed the worst case, plotted out on a linear graph and everything. None of our problems used data sets big enough to get into the "bad region". In fact, all of them were small enough to make the normal method a bit faster, which of course was hidden down in the corner of his non-logscale graph. It wasn't even worth the time to have the meeting, not only because he proved himself wrong, but also because our programs would spend a couple milliseconds loading the data, then process it for minutes to hours. Optimizing a non-problem with something that would have made it worse. Great job!

    • @demonslime
      @demonslime 1 year ago +3

      Was he not aware that the data loaded in a few milliseconds?

    • @TonyHammitt
      @TonyHammitt 1 year ago +6

      @@demonslime I guess he was thinking that we could optimize the neural network training down to a few milliseconds on large data sets. In the mid-90s. A very optimistic expectation! 😆

  • @JochemKuijpers
    @JochemKuijpers 1 year ago +1969

    Besides "premature optimization", there's another thing to avoid, which is "premature pessimization": an absence of performance-critical thinking, writing poorly performing code without thought. If performance is never on your mind during initial development, you're going to have to do a lot of work to get any amount of performance out of your application later. So some degree of performance-critical thinking *is* a good habit to form. Knowing what patterns to avoid from the get-go will speed up development, rather than having to double back to refactor those poor patterns regardless. I feel like "premature optimization is the root of all evil" is often used to argue that performance doesn't matter at all, which is not an absolute truth; it's often more complicated than that. It's better to judge on a case-by-case basis. But don't get stuck in conversations about pre-increment or post-increment, because all experienced developers know that it really doesn't matter in the vast majority of cases, and those are typically not the optimizations that come up in code reviews.
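
A classic example of the kind of pessimization that costs nothing to avoid (shown here in Python, though the point is language-agnostic): building a string by repeated concatenation can copy the accumulated string on every iteration, while the join form does a single pass and is no harder to write.

```python
def build_csv_pessimized(fields):
    """Each += may copy the whole accumulated string: O(n^2) characters moved."""
    line = ""
    for field in fields:
        if line:
            line += ","
        line += field
    return line

def build_csv(fields):
    """One pass and one allocation; also the more readable form."""
    return ",".join(fields)
```

Knowing the second form from the start is exactly the habit the comment describes: no premature micro-tuning, just not writing the slow pattern in the first place.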

    • @Chex_Mex
      @Chex_Mex 1 year ago +126

      Yeah, I'm glad he talked about macro-level decisions and micro-level performance considerations. The "premature optimization is the root of all evil" reputation is due to people making code unreadable and hard to maintain for seemingly vapid performance gains, sometimes even things that the compiler would have optimized for you already, like function inlining. It's the macro-level decisions, such as choosing the right data structures and database designs, that are really important. For example, incorporating parallelization into a design to increase performance is extremely hard to do after the fact, since it relies heavily on the structure of your data being easily parallelizable and on minimizing shared data.

    • @NihongoWakannai
      @NihongoWakannai 1 year ago +30

      In a way that's part of adaptability. You want to design the software so that performance can easily be improved later if needed.

    • @PowerStudios1000
      @PowerStudios1000 1 year ago +25

      Yeah, totally agree with this. Kind of an extreme example, but I once took some code from taking many hours to run to less than a minute. It was saving function outputs to disk and then loading them elsewhere to get the output; this was being done inside a function being numerically integrated to simulate forwards in time, so it was being evaluated a LOT. In this case, just knowing the order-of-magnitude performance difference between keeping results in memory versus reading and writing to disk would have been useful for the original author in preventing this kind of extreme performance pessimisation.

    • @lawrencedoliveiro9104
      @lawrencedoliveiro9104 1 year ago +32

      It’s not that performance doesn’t matter, it’s that it’s often unintuitive where the performance bottlenecks will lie.

    • @BillLambert
      @BillLambert 1 year ago +20

      This * 1000. Using the right algorithms and data structures will get you most of the way there. Instruction-level optimization is something I have super-rarely done outside of game engines and embedded devices. Hand-written assembly code on an Arduino can yield measurable gains, but on a modern PC (or Mac), the compiler does a better job than the human 99.9% of the time, and when it doesn't, it's often because your high-level code is a mess.

  • @absurdworlds
    @absurdworlds 1 year ago +429

    To be fair, people take "premature optimization is evil" too much to heart and stop thinking about performance altogether. By the time people start noticing, it becomes "a death by a thousand cuts". I worked on a codebase like this once. Higher-ups kept complaining about things taking too long, but we couldn't pinpoint it to any particular part of the code.

    • @soppaism
      @soppaism 1 year ago +67

      Yes, it can be used as an excuse for complacency. Often times we have multiple implementation choices with similar complexity/maintainability/implementation effort - being aware of the best option there is NOT premature optimization.

    • @Weaver_Games
      @Weaver_Games 1 year ago +25

      I have literally refunded games because they were so unoptimized. Especially in gaming, your resulting frame rate is exceptionally important to the end user.

    • @ChaotikmindSrc
      @ChaotikmindSrc 1 year ago +34

      The real sentence in its context is more like "'premature micro optimization is evil"
      which makes a lot more sense

    • @Ehal256
      @Ehal256 1 year ago +15

      @@ChaotikmindSrc the real sentence is basically "premature optimization in ~97% of cases is bad, but in the remaining 3% it's very important"

    • @ChaotikmindSrc
      @ChaotikmindSrc 1 year ago +18

      @@Ehal256 I disagree, premature optimization is good, premature MICRO optimization is evil

  • @somefive
    @somefive 1 year ago +1002

    i love CodeAesthetic's videos, they're visually pleasing, the content is well made, i can see him getting 1 million subs in no time

    • @Gamer-kn7fi
      @Gamer-kn7fi 1 year ago +2

      This!

    • @fuery.
      @fuery. 1 year ago +17

      Let's be honest he only has 6 videos (including this one) and he's got 151K subs so I wouldn't be surprised

    • @_Amilio_
      @_Amilio_ 1 year ago +2

      Same 👏

    • @Redditard
      @Redditard 1 year ago +3

      and i forgot to sub to him shit

    • @matthias916
      @matthias916 1 year ago +2

      kinda sad he doesn't credit the ones who came up with these concepts but I agree with the visual part

  • @flam1ngicecream
    @flam1ngicecream 1 year ago +833

    One thing you didn't mention here is that not all performance problems are because your code is taking a long time to do a thing. Sometimes your code is waiting on an external resource (like a database or API) to do a thing, and the optimization is to make fewer calls to it and rely on it for as little as possible

    • @videoguy640
      @videoguy640 1 year ago +35

      or do things async

    • @kenfrostyypaws9113
      @kenfrostyypaws9113 1 year ago +83

      Definitely "do less work" is a huge part of performance

    • @krux02
      @krux02 1 year ago +20

      effectively, most performance problems are something waiting. Cache misses, for example, let the CPU idle for around 200 cycles (or more, depending on the CPU). Having too many of them certainly makes things slow.

    • @SolutionBuffet
      @SolutionBuffet 1 year ago +1

      I suspect this would be discovered when running the profiler, or?

    • @haakoflo
      @haakoflo 1 year ago +9

      In my opinion, part of the problem is when we think of things like database connections and API's as "external", and that ensuring speed there is someone else's problem.
      While this _can_ be true for API's provided by other companies, internally in a company, the comunication between components is often the main cause of many kinds of problems, including performance.
      Developers need to understand the costs of accessing components outside their own code base, whether that is a database, API, local storage or some shared filesystem/cloud storage.
      While making fewer calls is generally a good idea, and should be a primary goal when setting up an enterprise architecture, trying to rely less on "external" resources can make your code both bloated/unmaintainable and slow in some cases.
      Often, the main source of slowdowns is when developers insist on accessing databases or APIs to retrieve data a single element at a time, as if they're hash maps. Let's say some customer is logged in to your system and wants to see their purchase history, and your database has an order table and an order-line table. If you first transfer the order table as a collection, and then loop over each order-line, it could take 1-2 seconds to retrieve the order history for the customer, sometimes more. If, on the other hand, you join those tables and return everything using one query, this can be cut to less than 50 ms.
      Not only does this speed up your service, it also reduces the load on the DB, speeding up everyone else's work, too.
      The same goes for API's, except this requires even more cross-service design work. An API that returns only atomic information elements can be extremely slow, and can be sped up by orders of magnitude if it exposes endpoints that allow retrieval of larger collections of data at once.
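
The order-history example above can be sketched with Python's built-in sqlite3 (schema and names invented for illustration). An in-memory database hides the network latency, but the difference in round-trip count is the point: one query per order versus one query total.

```python
import sqlite3

# Toy schema: one row per order, one row per line item on an order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_lines (order_id INTEGER, item TEXT);
    INSERT INTO orders VALUES (1, 'alice'), (2, 'alice');
    INSERT INTO order_lines VALUES (1, 'book'), (1, 'pen'), (2, 'mug');
""")

def history_n_plus_one(customer):
    """One query per order: the per-call round trip dominates on a real network."""
    orders = conn.execute(
        "SELECT id FROM orders WHERE customer = ?", (customer,)).fetchall()
    result = []
    for (order_id,) in orders:
        lines = conn.execute(
            "SELECT item FROM order_lines WHERE order_id = ?",
            (order_id,)).fetchall()
        result.extend((order_id, item) for (item,) in lines)
    return result

def history_joined(customer):
    """One round trip: the database correlates the tables itself."""
    return conn.execute(
        "SELECT o.id, l.item FROM orders o"
        " JOIN order_lines l ON l.order_id = o.id"
        " WHERE o.customer = ?", (customer,)).fetchall()
```

Both return the same rows; with N orders the first issues N+1 queries, the second always issues one.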

  • @VulpeculaJoy
    @VulpeculaJoy 1 year ago +809

    When working with robotics applications in Python and C++, the workflow almost always involves:
    1. write something quick and dirty in Python
    2. grow the code to the point where Python's poor performance becomes a problem
    3. rewrite everything in C++
    Most of the time this is all it takes to achieve the performance requirements. The reason I still find it useful to do something in Python first is that it helps shape the features the program needs. If you're fluent in both programming languages, it's fairly easy to translate between them.

    • @jesustyronechrist2330
      @jesustyronechrist2330 1 year ago +118

      Honestly, I've found that you basically have to rewrite your code from scratch at some point or another. It's basically impossible to write code without gathering technical debt; 1 dubious design choice makes 80% of the code this dilly-dally bubblegum workaround due to you not formatting the input enough or something. In rewriting, I typically notice how I can turn my 1000 lines into 100, partly because I won't use most of the functions and partly because I for some reason do practically the same thing in 7 different ways over and over again.

    • @NithinJune
      @NithinJune 1 year ago +7

      from what i hear rust is good for that

    • @HannesMrg
      @HannesMrg 1 year ago +20

      Where I worked we had a mix of both. Python for accessibility and fast development. All interfaces and wrappers were in python. The resource intensive code was then written in c and the library used within python.

    • @ChaotikmindSrc
      @ChaotikmindSrc 1 year ago +10

      I tend to do the same with C#: a client asks me for a C# program, performance becomes a problem because of the growing features/requirements, and I end up rewriting it in C++

    • @SVGc1993
      @SVGc1993 1 year ago +4

      If I have a piece of code that runs slow in Python because of large data sets, I convert it to Rust using GPT.

  • @argore9709
    @argore9709 1 year ago +83

    I always struggle with thinking "what's the best way to implement this" rather than "how can I implement this", ending up making overly complicated stuff instead of just getting something to work and improving it afterwards
    This video has been really informative!

    • @tristantheoofer2
      @tristantheoofer2 9 months ago

      im the opposite actually. generally i try getting it to work and if it does at all i dont try making it faster because im too fuckin lazy to do so

    • @rasmusturkka480
      @rasmusturkka480 4 months ago +1

      @@tristantheoofer2 I'm the opposite. I don't try to make it work and if it works by accident I make sure it stops working

    • @Ttt-n1w
      @Ttt-n1w 1 month ago +1

      I like to think of it as: how can I write this so that everything can be flipped on its head in 2 lines, gracefully? Maintainability more often than not is the biggest pain point in real world situations. A slow app is better than no app, but spaghetti will turn your head into confetti. Not sure how much sense that made. Ah well.

  • @arisweedler4703
    @arisweedler4703 1 year ago +411

    At work, a customer filed a bug because our distributed system didn't converge its state fast enough. They would make a change in the controller and wait around 2 minutes for it to converge. This was a pain for them because it was a container solution, so a container not coming up in 2 minutes meant k8s would shut it down. This forced their containers into an infinite loop of trying to come up, being too slow, and getting killed.
    So I spent 6 months developing profiling tools and optimizing, eventually getting convergence down to 20 seconds from 120. But then another (smarter) guy at my work added a “readiness gate” that basically just told k8s “hey this pod isn’t ready yet so just wait”, and that immediately solved the customer problem. It took him a few weeks to do it, sure, it wasn’t instant, but. It was the RIGHT solution. Optimization was not.
    Even when optimization seems like the right solution, even when customers and PMs were literally asking me for it, it was not the right solution.
    That’s my premature optimization story :)

    • @VioletGiraffe
      @VioletGiraffe 1 year ago +16

      Amazing story, thanks for sharing! Sounds like you're working on some exciting stuff!

    • @MTAG
      @MTAG 1 year ago +57

      The takeaway of course is: identify the problem and only then fix it. Listen to the customer carefully. The problem was not in the speed department, although speed can be a solution to it; it's just not the optimal one.

    • @arisweedler4703
      @arisweedler4703 1 year ago +17

      @@MTAG if only that was as easily done as said!

    • @w花b
      @w花b 1 year ago +9

      @@MTAG there's a reason the customer is just that. A customer. Not a developer.

    • @cyrileo
      @cyrileo 1 year ago +1

      🤔 That sounds like a great lesson in understanding when optimization is needed and when it's not! 💡
      ~ with ❤️ from repliesgpt

  • @christianbouwense4702
    @christianbouwense4702 1 year ago +260

    Cannot stress enough the usefulness of profilers. I had been intimidated by them my entire career until I took the plunge to diagnose a problem with our React unit tests being very slow. Turns out the arranging and acting were not the bottlenecks, but the special assertions we were making. It totally took me off guard and was such a joy to find out what was really going on.
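
For anyone similarly intimidated: Python's built-in cProfile is representative of how little setup a profiler needs (the slow function below is a stand-in, not the commenter's actual React setup).

```python
import cProfile
import io
import pstats

def slow_assertions(data):
    """Stand-in for an expensive test step: a quadratic duplicate check."""
    return [x for x in data if data.count(x) > 1]

profiler = cProfile.Profile()
profiler.enable()
slow_assertions(list(range(200)) * 2)
profiler.disable()

# Print where the time actually went, sorted by cumulative time.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
report = out.getvalue()
print(report)
```

The report ranks functions by where time was actually spent, which is exactly how surprises like "it's the assertions, not the setup" get found.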

    • @erikl9146
      @erikl9146 1 year ago +3

      Could you elaborate on what profilers entail in this scenario?

    • @Chris-cx6wl
      @Chris-cx6wl 1 year ago +5

      What are examples of react profilers ?

    • @coachflow
      @coachflow 1 year ago

      Yes please

    • @T0m1s
      @T0m1s 1 year ago +8

      Profilers are useful if you have clear bottlenecks, but very often this isn't the case. Imagine your program is bubble sort, and you decided it will be like that because you "don't want premature optimisation". Well, now it's slow for large data sets, and no profiler will help you, and you'll have to rewrite your program from scratch for no benefit. It pays to be very careful at the beginning, don't listen to people who talk about "premature optimisation" without understanding what it means.

    • @bermast4240
      @bermast4240 1 year ago +1

      To be fair, profilers tend to have really poor, unergonomic interfaces imo. I was spooked by them for a long time too, and I still haven't found one that I really like. But good grief have they saved me a lot of time, now that I know how to use them.

  • @3lH4ck3rC0mf0r7
    @3lH4ck3rC0mf0r7 1 year ago +387

    This is the kind of content young programmers need. The dogmas you're taught in programming courses and college aren't always the best fit for your project, and people need to be pragmatic. Focus less on ideals and more on doing things that work. There's a balance to everything.

    • @wadecodez
      @wadecodez 1 year ago +8

      Completely agree. From my experience as you approach 2 comp sci experts the number of unsolvable problems reaches infinity. For those who do not have a background in programming, this problem can be expressed as a system of equations which simplify to approximately 1/(1-x) where x is the number of needed comp sci experts per organization. Sorry if the math is off, I'm just a PHP developer.

    • @AntiAtheismIsUnstoppable
      @AntiAtheismIsUnstoppable 1 year ago +6

      One example of readability over performance is the For... Each loop.
      It's very readable, because the variable itself explains what is happening.
      But it's also slower than a regular for... next, which in return is way less readable.
      I think if the For... Each needs to be optimized because it is the main thing slowing things down, maybe the language itself is the wrong choice. Maybe programming directly in assembly is better; you can control every cycle completely.

    • @3lH4ck3rC0mf0r7
      @3lH4ck3rC0mf0r7 1 year ago

      @@AntiAtheismIsUnstoppable Until you realize you suck at assembly and the compiler just did a better job than you... lol.

    • @AntiAtheismIsUnstoppable
      @AntiAtheismIsUnstoppable 1 year ago +2

      @@3lH4ck3rC0mf0r7 It was also partly an ironic comment, but I think there is a logical limit to the performance of many languages, such that if you want them to be faster, you need hacks, which is where you might want to think about switching language instead.
      I remember back in the old days on the C64, when you wanted to make a really fast for loop, you replaced the 0 with a dot:
      for i=. to 255:poke 1024+i,160:next
      is VERY fast in C64 BASIC, because it's literally run like machine code, due to some bug. But it's also not that readable, and I could just as easily have made a dedicated MC routine for it and called it using SYS.

    • @3lH4ck3rC0mf0r7
      @3lH4ck3rC0mf0r7 1 year ago +1

      @@terraformer316 And without destroying the source code too.

  • @30robtaylor
    @30robtaylor 1 year ago +46

    Great video! Wanted to emphasize that "premature optimization" does not equate to "do no optimization", and I think your video hit the big one by focusing on data structures. We can be critical thinkers and have reasonable predictions to know when we should do those big improvements early to save a lot of pain later. It's always a gamble doing things early, so if you're not pretty sure, then set up your design to learn as fast as possible. Knowing when it's a "problem" is best discovered before it becomes a really big one.

  • @drvortex
    @drvortex 1 year ago +899

    I feel this is one of the best channels about programming that I know of. Instead of asking to give us more content, I want to advocate for the opposite: please keep up the quality even if it means spacing out videos. Nevertheless, thank you for doing what you're doing.

    • @notveryobservant1056
      @notveryobservant1056 1 year ago +11

      I agree completely, I like the quality and don't really look for speed in the people I watch.

    • @kiyu3229
      @kiyu3229 1 year ago +1

      @@cyrileo tf?

    • @cyrileo
      @cyrileo 1 year ago +1

      @@kiyu3229 what?

    • @virgodem
      @virgodem 1 year ago +13

      Ah, so you want his videos to have better performance/quality at the cost of velocity, I see.

    • @yoshikagekira7600
      @yoshikagekira7600 1 year ago +1

      true that.

  • @CapteinObvious
    @CapteinObvious 3 months ago +3

    Great video for javascript coders who don't want to learn how computers actually work. For anyone working or interested in working on software that requires good performance: No, optimization is not an afterthought that can be "easily patched" at the end of the development cycle. Architecture of data and the way it connects together in a complex project plays a massive role in performance, and this is not something that optimizing compilers are allowed to modify. Most stuff in this video is false or only applies to non-realtime software. Keep looking for actually valuable information, it's out there.

  • @Fork501
    @Fork501 1 year ago +220

    I haven't heard it expressed this way, but I like how you laid this out.
    There are two stages of optimization:
    * Best practices
    * Best implementations
    We typically start with the best practices and hope that they're the way to go. These optimizations are things like array type, design patterns, etc. For most code, it's fine and the way to go for readability. Best implementation comes into play when your "likely-optimized" code is part of a loop that has to run at lightning speeds. We never know about the best implementation until we have a problem.
    Which tools do you use for optimization?
    Keep up the great content!

    • @ayaderg
      @ayaderg 1 year ago +14

      this is exactly it! much better to spend your time considering why you're looping through something so much in the first place than spend it optimizing the speed of the loop, up until you're completely out of other things to resolve

  • @skyking469
    @skyking469 1 year ago +8

    I'm so glad this channel isn't rushing to produce videos; that each video is carefully researched is the best thing. Probably the best channel for understanding programming as a concept.

  • @jswlprtk
    @jswlprtk 1 year ago +239

    You are one of the most valuable channels I have subscribed to. Please keep up the good work, appreciate it.

    • @bernard-ng
      @bernard-ng 1 year ago +3

      Totally agree

    • @cyrileo
      @cyrileo 1 year ago

      Thanks for the kind words 😊 I hope CodeAesthetic can continue to help people improve their programming skills 💻
      ~ with ❤️ from repliesgpt

  • @billy65bob
    @billy65bob 1 year ago +58

    Fun fact about the post-fix vs pre-fix syntax.
    There are legit some languages where ++i will ALWAYS be faster than i++. It is indeed because they make a copy, but the copy isn't used, and the compiler is too primitive to realise this and remove the temporary store.
    Outside of such cases, the compiler (or JIT if you swing that way) will pretty much always be smart enough to eliminate any difference between the two.

    • @Yay295
      @Yay295 1 year ago +12

      The thing is though, you can just always use ++i and not have to rely on the compiler doing the right thing.

    • @kurtclark8560
      @kurtclark8560 1 year ago +4

      computers and compilers were not always so fast; I've been doing ++i for so many years that i++ looks goofy

    • @daveatkinson644
      @daveatkinson644 10 months ago +2

      My issue with this has always been, if you don't know the difference between what pre- and post-increment do, you should! And if you know, why wouldn't you use the correct one? I despair of my fellow coders who think they "know enough now, and don't need to learn anymore"... _never_ stop learning...

    • @joestevenson5568
      @joestevenson5568 8 months ago +1

      Yes. ++i will ALWAYS give you the good case. i++ is not guaranteed.
      It's totally appropriate to call this out in code review.

  • @HamzaQayyum
    @HamzaQayyum 1 year ago +94

    Idea for a video: how the hard part of coding up a large project from scratch is often not implementing features, but rather the software *design*: how all your code is laid out. Getting a good design in place at the beginning is often instrumental in implementing features in your project later down the line without losing your sanity.

    • @matthias916
      @matthias916 1 year ago +4

      he's copying these from a book, he's not making these things up

    • @mv2e19
      @mv2e19 1 year ago +3

      @@matthias916 what book?

    • @matthias916
      @matthias916 1 year ago +9

      @@mv2e19 Most of his videos seem to come from books like "Clean Code". Just to make sure I'm not misunderstood: I like his videos, I'm just a little disappointed that he doesn't credit the source and makes it look like he came up with these concepts himself

    • @Dobby-gy5uz
      @Dobby-gy5uz 1 year ago +26

      @@matthias916 These concepts are well known in the CS field though, it's not like those books were the first to discuss it.

    • @matthias916
      @matthias916 1 year ago +1

      @@Dobby-gy5uz Well, most of these concepts were formed by the authors of these books, people like Martin Fowler and Kent Beck

  • @fearfulfred
    @fearfulfred 1 year ago +70

    For a class in uni I was writing a basic synthesizer in C for a custom embedded system developed by some engineering students from prior years. This system didn't even have an FPU so everything needed to be in fixed point.
    At the beginning I was separating everything into functions in their own C files, but when we started to introduce DSP into the mix, I started getting pops in the audio output.
    I had just read Game Engine Architecture by Jason Gregory and learned about the instruction cache and stack resetting from function calls, so I moved every function to be either inline in a header or as a really long function style macro. And it worked fine, I was able to play multiple notes with DSP without any pops.
    Similarly, at one point in uni a friend was writing an HRTF audio plugin and found that by making the entire audio processing block in one function with no calls to std vector subscript, it went from 1 voice to 99 voices playing concurrently.
    In the world of real-time audio operating at 48 kHz, unfortunately code aesthetic is not allowed as function calls will actually slow things down enough to be noticeable to the end user. But for other real time applications the frame rate is almost never high enough for "the overhead" of a function call to be noticeable.

    • @99temporal
      @99temporal Рік тому +6

      But the thing is: You can always separate things and create a macro to join them later, if need be.
      This way your code is readable AND it's fast

    • @noergelstein
      @noergelstein Рік тому +5

      If you write a function as static and call it only once, it is basically guaranteed to be inlined, so separating out code into a static function in the same file has no performance cost. (static free function, not static member function of a class)

  • @flamewave000
    @flamewave000 Рік тому +65

    I've always used the theory of critical sections. Some parts of your code will be within a "Critical Section" while the rest is in a "Non-Critical Section". Critical sections are areas where performance matters because it is code that will be executed either over vast amounts of data, or within a continuous loop (eg. a game update/render loop). Non-critical sections is code that is executed either infrequently, or due to an explicit event. Critical sections will typically only exist in about 5% of your code. Everything else is non-critical and requires little to no optimization outside of what you get for free from the compiler. The critical section code does require optimization, but it should also be carefully designed to try and maintain readability and maintainability. If either of those are broken, documentation can be provided so that future developers can study and understand why and how the code is doing what it is (such as referencing the mathematical formulas being used to calculate a complex solution). Again though, 95% or more of your time is spent working on non-critical code that requires readability and maintainability over performance and does not need to be optimized at all.
    For example, I built a simple menu system for a micro controller's LCD screen. This code is not efficient at all, but I don't care because the UI only ever updates when a user pushes a button, so who cares if it takes a whole 250ms to refresh the screen. What I care about is readability because UI can be complex, even at a small scale, and I want to be able to fix bugs as easily as possible. What does matter for performance however, is the sensor monitor which needs to be highly reactive and efficient but only accounts for ~10% of my actual code base. The UI is ~40% and the rest is File/Network IO and business logic that all runs slowly because it doesn't have to be fast and is limited to the speed of IO.

    • @tomysshadow
      @tomysshadow Рік тому +18

      The term Critical Section is already used in programming to refer to code that should only be run in a single thread at a time, but yes. The concept you are describing is generally referred to as a "hot path" or "performance path," as in, it's the codepath that will be run the most often so needs to be the most performant.

    • @UWHabs1
      @UWHabs1 Рік тому +2

      Yeah, if you have a section that is super slow due to a file io, for example, it doesn't matter if something else around that takes 10ns or 20ns to run. Sure, you can speed up that other piece by 2X, but it's not going to make a difference overall.
      But if that function is run a million times in a performance critical section, yeah, every little bit counts.

    • @NihongoWakannai
      @NihongoWakannai Рік тому +4

      Tbh even in the cases where I need optimization, I very rarely find myself needing absolutely perfect optimization.
      More often I'm like "oops, that is a bit slow, lemme just cache some of that data and now it runs 10x faster"
      Unless you're working on something that truly pushes hardware to its limits, basic optimizations are likely all you need 99% of the time.

    • @lawrencedoliveiro9104
      @lawrencedoliveiro9104 Рік тому

      There is the term “critical path”, from project-management theory.
      But yes, a typical program’s use of CPU time does tend to follow a pareto distribution, and concentrating effort on speeding up the “hot” bits will help more than elsewhere.

  • @npip99
    @npip99 8 місяців тому +2

    I think an important aspect to making these decisions is being well-versed in computer science in general.
    E.g., even if you haven't written a compiler, the concept that i++ and ++i have the same SSA form after collapsing unused variables. So, they will obviously compile to the same assembly after SSA optimization.

  • @JoeTaber
    @JoeTaber Рік тому +21

    Great video! I like Casey Muratori's "philosophy of non pessimization" as a framework for macro optimization.

  • @Dudeplay
    @Dudeplay Рік тому +7

    That video hits me right in the heart. I was focusing almost entirely on performance and adaptability, which often killed me when adding new features. I ended up refactoring for the new feature, then did the same thing again. I'm currently in the process of realizing it and stopping the over-optimizing. I now often have small functions where I know I did a simple solution, but if it should ever become a performance issue (which it never will), I could optimize it in a couple of minutes. I can already feel my productivity rising, which also raises my happiness and motivation to build more nice features. Thank you for the video, it confirms my thoughts and motivates me to go further!

  • @amythistfire5042
    @amythistfire5042 Рік тому +49

    At a previous position, we painstakingly went through each line and found that a function was being called for each element of a list and each element of the previous version every time the list was changed. So appending 3 to [1,2] was run 8 times for [1,2,3], [1], [2], [3], [1,2], [1], [2].

  • @isovideo7497
    @isovideo7497 9 місяців тому +2

    In many situations, algorithm choice is of enormous importance, particularly in real-time applications. Optimization requirements may need to be part of the initial design, so "it depends" on whether or not it can be considered "premature".

  • @akshay-kumar-007
    @akshay-kumar-007 Рік тому +305

    Profiling your code is something that literally only a few people do. People give talks at conferences about how to make things BLAZINGLY FAST, but they forget to mention that the suggested approach may not fit every use case.
    code maintainability > performance (Most of the time 😅)

    • @patrickspiegel5524
      @patrickspiegel5524 Рік тому +53

      I profile my code and usually it just shows that I write shitty SQL statements.

    • @akshay-kumar-007
      @akshay-kumar-007 Рік тому +32

      @@patrickspiegel5524 no one knows how to write efficient SQL queries, even chatGPT

    • @farfa2937
      @farfa2937 Рік тому +2

      @@patrickspiegel5524 Well that's an easy problem to solve at least lol. Unless the problem is how much data you need back, then you're kinda f'd.

    • @TheDragShot
      @TheDragShot Рік тому +14

      @@patrickspiegel5524 writing SQL queries is tricky, especially when you're handling large data sets. There is a point where using regular 1:1 indexes stops being beneficial and instead slows you down due to IO bottlenecks. There are other types of indexes that can help, and sometimes it's somehow more effective to do plain joins (using disk and/or RAM) when there's a lot of data to cross over. After working with Oracle databases for a few years, I've grown used to verifying the database's execution plans for my queries once or twice before settling down with them; sometimes I end up having to use hints in order to instruct the database engine on how to poll for the matching rows at each table and in what order.

    • @FinlayDaG33k
      @FinlayDaG33k Рік тому +10

      I profile my code but that's because it allows me to see if I only "mildly screwed up" or "screwed up really badly".
      If a piece of code should run basically instantly but takes a few seconds, I know something may not be right.
      But if it takes a few ms longer than I expected, I generally just slap a TODO there for when I don't have anything better to do.

  • @dagoberttrump9290
    @dagoberttrump9290 Рік тому +9

    I think all of those topics you talked about (function call overhead, i++ vs ++i...) are things that a sincere developer will look at at some point in his career, and digging into them brings you ahead and sheds light on corners of computer science that help you understand things better. That is the stuff that matters a lot, as it is the ground that new concepts are built upon and where new ideas sprout. Investigating the ++i vs i++ case has taught me a lot about how CPUs work. It is a very interesting case, as i++ can actually be faster on some processors (the CISC family) due to instruction-level parallelism, whereas on RISC processors you're probably better off using ++i. I never quite understood the fear of writing optimized software... If you make it a habit, it will come naturally and not hinder the development process. It will require some thinking at first, but if you want to stop thinking you're probably better off choosing a different job, let's be honest.

  • @BryceDixonDev
    @BryceDixonDev Рік тому +33

    I agree with basically everything *except* the reasoning stated for why post/pre-increment doesn't really matter. My reasoning for having a preference in every situation is that there is a definitive difference. Not that that difference has a huge performance or functional impact, but that it is defined at all. When someone 10 years from now is reading through my code (fairly common for AAA game studios) and sees a post-increment, I want them to be sure they can say "it's there for a reason" and that they *don't* say "well, post-increment is slow so I'll just switch it to pre-increment" and inadvertently create bugs.
    My issue with people defaulting to post-increment because "the compiler will just optimize it to pre-increment anyway" is:
    1. It won't *always* do that if they're not truly synonymous, which can lead to unexpected behavior; and
    2. When we write code, it's not just for the compiler, but for future readers (including our future selves) to understand our thought process when we were writing it.
    The way I think about it is that if increment operators weren't so short, would we make justifications for it not really mattering which was used? If instead of code being
    `for (int i{ 0 }; i < 100; i++) { /* ... */ }`
    it was
    `for (int i{ 0 }; i < 100; math::increment_integer_and_return_previous_value(i)) { /* ... */ }`
    I think it'd be obvious people's immediate question is "why do we care about the previous value here?" and that the answer of "we don't, but it doesn't matter because the compiler will optimize it" wouldn't be satisfactory.
    Aside: there's also a greater issue of user created `operator++` implementations which incur side-effects (which may differ between pre and post increment). There's a whole discussion to be had about if that's even good practice (it's not, please don't make operators with side-effects, I beg you), but since we live in a world where millions of people think OOP is a good idea I can't make any assumptions about it.

    • @Drillgon
      @Drillgon Рік тому +7

      That's a really good point. The difference between the two that matters isn't performance usually, it's intention.

    • @williamdrum9899
      @williamdrum9899 Рік тому +1

      I was never a fan of the ++ thing since from what I understand it's a holdover from assembly, where many CPUs had a specific instruction just to add 1 to a register (typically it was called INC for increment.) Nowadays, not every CPU has a dedicated INC operator so I really don't see how
      "int x = ++i + 7" would compile differently than "int x = (i + 1) + 7"

    • @williamdrum9899
      @williamdrum9899 Рік тому

      Technically every operator has side effects because it advances the instruction pointer 🤪

    • @Diamonddrake
      @Diamonddrake Рік тому +1

      ​@@williamdrum9899 "int x = ++i + 7" and "int x = (i + 1) + 7" assign the same value to x, but they aren't interchangeable: ++i writes the incremented value back into i, while (i + 1) leaves i unchanged. (It's i++ that yields the old value of i before incrementing.) That side effect is what makes the distinction valuable in programming.

    • @williamdrum9899
      @williamdrum9899 Рік тому

      @@Diamonddrake I don't get it

  • @captaincrispy2184
    @captaincrispy2184 Рік тому +120

    I don't think THIS is what my girlfriend urged me to research on UA-cam

  • @Emil_96
    @Emil_96 Рік тому +8

    Man you already helped me and especially my colleagues so much when they interact with my code
    I regularly rewatch all your videos to keep writing better code and I recommend you to every coder
    The biggest help you gave were the nesting video and the readability video (I don't remember if the latter one was a whole video but it really stuck with me)
    Keep making such great content 🔥

  • @grproteus
    @grproteus Рік тому +70

    in a big C++ project we focused on velocity and adaptability. We had to redo pretty much everything when optimizing for performance, because what we built initially was that bad.
    NEVER leave performance for later. Try to make something that is 'good enough' performance-wise for starters. Otherwise you'll end up remaking everything in the long run.

    • @techpiller2558
      @techpiller2558 Рік тому +9

      I'd argue that you did that project correctly. I don't think code should be held holy; it should just be trashed for the new, better version. It's difficult to optimize beforehand, before you know where the bottlenecks are. You'll also easily slip into the optimization mindset, which is to delve into the endless rabbit hole of optimization, and doing that before you know what is important can be a detrimental over-engineering nightmare. Especially if you combine performance and adaptability just to notice later that what it is doing is not exactly what was needed, and even the adaptability was wrong. Then you did a lot of work, used a lot of time, and still have to trash and redo. I think it is important to start with velocity, trash it, and iteratively add adaptability and performance when you have more information on the direction these should take, as these take resources.

    • @grproteus
      @grproteus Рік тому +26

      ​@@techpiller2558 not really. On the contrary, experience has proven that it was way too difficult to optimize. Once a foundation has been laid and 10-20 projects are based on that foundation, changes are exponentially hard to make.
      So yes, in principle you should optimize last, I agree, but in most cases a certain effort needs to be made to make something that can easily be optimized later to achieve some level of performance.
      This is not a given. You have to work for it.
      And I refuse to accept the mindset that "if it works, even if it is ugly, difficult to work with and slow as heck, it's ok".

    • @1495978707
      @1495978707 Рік тому

      While you shouldn’t be completely unaware of performance issues, you need to fully determine what your program is and then actually measure what your performance limitations are, and prioritize them

    • @1495978707
      @1495978707 Рік тому +1

      Also I should note, this is why it is so important to separate interface from backend. It’s certainly worth thinking a lot about what your interface should be, because it’s changing that which causes everything to break

    • @techpiller2558
      @techpiller2558 Рік тому +1

      @@grproteus I'd argue then that somewhere down in that line, too little trashing and redoing were done where it was supposed to.

  • @Swonke
    @Swonke Рік тому +58

    As a very early stage programmer, the caution against over-optimization is very timely for me. I'll start working on a function to accomplish the same task over and over, and make that function adaptable enough to be called from anywhere and fit a variety of circumstances....before even confirming if the task itself will work for my needs...
    I guess I've had to redo code enough times that I want to do it right the first time. However this video helps me see that going back to optimize is much easier and less frustrating than having to scrap hours of work because it turns out none of it was necessary to begin with...

    • @bluesillybeard
      @bluesillybeard Рік тому +4

      I've been programming for years (well, only really 2 years, but still), and I write dog-crap garbage code to begin with, for prototyping. I try to make it as understandable as possible, but I don't go out of my way to make it readable and intuitive, and I don't do much to make it extendable or flexible.
      Once I am done prototyping, I refactor and rework the code to make it good enough, and once it's good enough I stop there.

    • @peterbonnema8913
      @peterbonnema8913 Рік тому +1

      Excellent insight. I think the keyword here is working iteratively.

    • @cyrileo
      @cyrileo Рік тому +1

      I hear ya! 😉 Premature optimization can be the devil for sure- we have to remember to step back and look at the bigger picture first. 🤔
      ~ with ❤️ from repliesgpt

    • @aurias42
      @aurias42 Рік тому +1

      Gosh I feel that one. But I've been programming for almost a decade and it's still my preferred style. Given, very little of that time has been doing it for corporate, so we'll see if that changes.
      But the joy of working with a pleasing, adaptable system that resembles an engine within itself is just too satisfying to forego, so I maintain the view of sacrificing dev speed for architecture.

  • @Ian.McGregor
    @Ian.McGregor Рік тому +3

    New videos please? These are excellent - I’m a full time engineer watching your videos on my day off. Nice work!

  • @peezieforestem5078
    @peezieforestem5078 Рік тому +13

    If you're solving any kind of problem that isn't already solved, the likelihood that you will run into an issue that forces a complete redesign of large parts of your code base is high. Therefore, the main argument against premature optimization is that most of the code it applies to will not make it into the final product, i.e. most of the effort you put in is wasted.
    It's a different story if you're designing basically the same API for the Nth time and already know how it will go, more or less.

  • @itaysas105
    @itaysas105 Рік тому +5

    I work a lot on math-heavy code, and there are a few steps I always try to implement to optimize it: vectorization, swapping matrix axes, and memoization. Vectorization is by far the most effective and makes the code more readable; the other two are very quick to implement and check. If the code is still slow, profiling is an amazing tool: find the problem function and work on that!

  • @ThePrimeTimeagen
    @ThePrimeTimeagen Рік тому +29

    great video

  • @lollertoaster
    @lollertoaster Рік тому +4

    I really like this triangle analogy. I'd expand it like this:
    - at the beginning you have nothing, so the velocity is the only important aspect of development
    - as you go back over some features, you find things you often modify and repeated code, and you rework those parts so they're easier to modify, so adaptability becomes on par with velocity
    - as you finally approach the final design, you go back over the features you have and might want to tweak them to desired values. If this is a game, then at this stage you probably have a feature-complete vertical slice of the game and want to make adding new, non-coding elements like maps, models, etc. easier. So adaptability becomes the most important aspect of the development, and new mechanics happen sporadically as need or inspiration comes
    - finally, at the end, you have the program you wanted to make. It runs horribly and hopefully you have a powerful machine to run it. Now you can find parts that are inefficient with profiling tools and make them faster (probably at the cost of extensibility). Most of your code will probably run once, few times during the runtime, once every few seconds. Optimization of anything other than most executed code is usually a waste of time. The only concern now is performance.
    At the end you have an extensible program with few, rigid, ultra-efficient nodes that are never to be touched.

  • @KatieZeldaKat
    @KatieZeldaKat Рік тому +9

    I remember a particular problem in my code where I was trying to remove recursion, as it is generally avoided at all costs in terms of performance. I solved the problem with loops and tracking data top down, only to realize the data structures were actually making it take twice as long.
    The real problem with the data structures is that they were allocating memory regardless of whether the loop would actually go down far enough for it to matter. Heck, often times the recursion wouldn't happen, and the end case was done in one function call! It truly was a waste of time after a generally quick enough solution which did what it needed to.

    • @CD4017BE
      @CD4017BE Рік тому +1

      When replacing recursion with a loop + list data structure, I usually don't do it for performance but rather to avoid a potential stack overflow if the recursion has the possibility to go more than ~100000 layers deep. Because, depending on the programming language and whether you run on a non-main thread, the call stack might only be able to allocate a few megabytes (unlike Heap which can take the entire available ram).

    • @JanVerny
      @JanVerny Рік тому

      On the other hand, you should always remove recursion not for performance reasons, but for the sake of the sanity of anyone who, at some other point, has to read and understand what the function is actually doing.

  • @rutabega306
    @rutabega306 Рік тому +4

    Personally I don't like the increment example because postfix and prefix are virtually identical from a readability standpoint. It's a good idea to just pick one and stick with it for consistency, so why not pick the sometimes-faster one?
    At any rate, premature optimization is a very important topic, especially when performance hurts readability. Or when there are actually no gains at all (I recently saw someone turn a bunch of JS string concatenations into array concatenations for "performance" - they didn't measure so we'll never know how the JIT optimized). Glad you covered it!

  • @Alsadius
    @Alsadius Рік тому +10

    I will say that even if performance isn't worth optimizing in a lot of cases, there's still some basic things that can be worthwhile. For example, I'm an Excel/VBA guy, so you basically take any other language, and knock three or four orders of magnitude off the speed. For big jobs, it can easily take minutes to run a complex macro. And there's a couple basic things you can do to speed it up with almost zero effort - most commonly, turn off screen updating, and turn off auto-calculation (then turn them back on when you're done). That can easily be an 80% improvement in a lot of real-world situations, for a whopping four lines of code. (It occasionally causes headaches, especially if your code crashes mid-execution, but there's huge gains to be had).
    So there's certain kinds of optimization that actually do justify premature use. But it's rare even in my neck of the woods, and...well, VBA is an adorable pile of 40 year old jank, so perhaps this isn't the most useful example for people who code in other languages.

    • @cyrileo
      @cyrileo Рік тому

      I hear ya! 👂😎 Esp when it comes to big jobs, any improvement helps! 📈
      ~ with ❤️ from repliesgpt

  • @444haluk
    @444haluk Рік тому +2

    7:14 "Will my startup fail?" damn, Code. You hit the place that hurts the most.

  • @_xord
    @_xord Рік тому +13

    A funny and informative video. It's good to find someone who has a good balance of productivity and humour. Most other code channels lean more towards one or the other.

  • @jmwild1
    @jmwild1 Рік тому +3

    I had a real problem for years early in my career wanting to "get it perfect" the first time. So I would craft the whole thing in my head and then write the code. As I eased into an agile, TDD (or BDD) mindset, it was a huge relief from stress and anxiety.

  • @HotAkPL
    @HotAkPL Рік тому +37

    While it is not worth worrying too much about optimization while working on a project, I still think it is a good thing to explore when the opportunity is there. Knowing how to write optimized code will generally make you a better coder in the long run.

    • @NihongoWakannai
      @NihongoWakannai Рік тому +9

      The point is that people focus on it too much and think it's the only thing that makes you a good programmer.
      If you're "coding" then sure, but a programmer needs to know how to write good architecture more than highly optimized code.

    • @YASxYT
      @YASxYT Рік тому +8

      @@NihongoWakannai Software today is much slower than it should be. Hardware keeps getting faster but a lot of software doesn't, because there are many programmers who don't care about performance. Or worse, some programmers don't even know how to write performant code.

    • @NihongoWakannai
      @NihongoWakannai Рік тому

      @@YASxYT of course they don't care about performance, because it doesn't matter. What matters is finishing the project quickly and within deadlines, not spending 3 extra months getting extra performance that the end user doesn't even notice.

    • @Diamonddrake
      @Diamonddrake Рік тому +3

      @@NihongoWakannai The user always notices. You need a new cell phone every 2 years because code keeps getting slower and it annoys you, not because your cellphone CPU slows down with age.

    • @NihongoWakannai
      @NihongoWakannai Рік тому

      @@Diamonddrake No, the user does not always notice. No one notices the difference between a 1ms latency and 0.5ms latency. Users notice significant performance issues, not tiny ones.

  • @DemonixTB
    @DemonixTB Рік тому +3

    I hate when people misuse this quote. The full quote reads: "We should forget about small efficiencies, say about 97% of the time. Premature optimization is the root of all evil, yet we should not pass up our opportunities in that critical 3%." This is rarely done. What he meant by optimization was things like cycle counting, CPU-specific hot-path definitions, and cache-line fitting of memory structures. He did NOT mean that using layers upon layers of language abstractions to accomplish something 5 minutes sooner is a no-brainer. Don't **pessimise** your code: write reasonably performant code from the start. You'll find it's much easier to read, and much easier to work in an environment that only does what it's supposed to, rather than a bunch of busywork where you have to bring out the call stacks just to find where the small amount of useful work gets done at all. That triangle is a false trichotomy: you can easily have performant code that has full velocity and is reasonably adaptable. Simply write code that does exactly what it has to to accomplish its task, and if you need a more general solution, write the slow path next to it. You can always add new features this way too.

  • @oof-software
    @oof-software Рік тому +6

    Adding a memory pool for when you constantly allocate and deallocate the same type of object but never need more than N at the same time was a nice performance increase for me.
    You can even add a log entry for when you need to fall back to the default allocator in case you unexpectedly needed more than N objects.

  • @bananalife
    @bananalife Рік тому +4

    ORMs are always a good place to look when you start having performance issues. DB queries often come out on top of the performance-issues list, whether you are making too many of them or they are simply unoptimized. When used without care, ORMs often produce suboptimal queries and object-loading problems.

  • @johncarlogamboa7331
    @johncarlogamboa7331 Рік тому +7

    In Unity, an interesting optimization I discovered is that you shouldn't deeply nest rapidly moving objects all under one parent, because when a child object moves, it forces the parent to recalculate its bounds... imagine that recalculation happening frequently. The solution was to not parent moving objects. It looked messy in the scene hierarchy, but hey, at least it runs faster than parenting them for "organizational" reasons.

  • @hydropage2855
    @hydropage2855 10 місяців тому +1

    I was once making a chat bot that used an API, and when you said it's often one silly thing slowing everything down, that really resonated. Whenever I requested something from the API, the response would be massive, and keeping only the things I needed to store in memory improved speed and memory usage dramatically. But this was something I only cared about once I analyzed my memory usage later down the line.

  • @IGassmann
    @IGassmann Рік тому +4

    Man you're becoming the best programming language channel on YT

  • @ZeEduardo95
    @ZeEduardo95 3 місяці тому +1

    This is a GOATED video. In the end it's about using your own judgement instead of applying rules set in stone. Who would have figured! Your channel is amazing. Keep it going.

  • @nova7114
    @nova7114 Рік тому +6

    I was working on implementing visual transitions in Java on an individual-pixel basis. I saw a bunch of videos and did research on branchless programming, and it was recommended that code involving graphics be branchless.
    I spent several hours converting all of my functions (while shoehorning in being able to multiply by the result of a comparison with bit operators), and only at the end did I decide to test the difference between the new and old code. The old code was several times faster. I thought about it for a while before realizing that the problem with branchless functions is that they compute all of the possible values and then throw away the ones we don't need. Some of my functions were making rather expensive calculations, but only for a limited range of inputs; all other cases were either copying a value or returning zero. Making them branchless caused the simple cases to have a performance penalty that far exceeded any branch mispredictions that would have occurred.
    What I learned: when trying to optimize a lot of similar bits of code, do one piece, compare it to the old, and only then implement it for the rest if it is faster.

    • @dominiccasts
      @dominiccasts Рік тому +9

      Graphics programming (at least with GPU acceleration) is really quite complicated, particularly in the branching case. Generally branch mispredictions and overcalculation only tend to happen if every shader unit in a block ends up taking different paths, but if they are branching on a variable in a common way that tends to work out fine. Also, GPU programming gets complicated by drivers, especially for nVidia cards, since those will often have optimizations built in for ways that a big game has done something, which means that what might look worse on paper may be faster because the driver has been written to recognize what you are doing and optimize it specifically.

    • @treelibrarian7618
      @treelibrarian7618 Рік тому +1

      There's a fairly simple way around the issue you described: do the simple quick operations in one loop and when you find data points that need the complex, slow processing put them on a stack (I usually implement the stack as an array with write always, conditional increment) then have a second loop that goes back over the stack doing the complex work on everything it finds. Both loops can be branchless without throwing away work (except the single memory write to the same memory location for every non-complex data point).
      But of course there are plenty of cases where branchless code is not the answer - branches are only bad when they are unpredictable, and any repeating sequence (up to about length 16, I think) can be predicted, e.g. yynynnyn.yynynnyn.yynynnyn... Only when the sequence is truly unstructured does prediction fail, like in sorts and searches. (At least in scalar CPU code - AVX and GPU code is a different matter.)

  • @octaviaz409
    @octaviaz409 1 year ago +1

    Honestly I'm really glad I found this channel. These videos are all so high quality and enjoyable to watch.

  • @phazechange3345
    @phazechange3345 1 year ago +3

    The problem with this video is that it completely misrepresents the positions of literally everyone:
    For the more performance oriented side, you're misrepresenting them by saying they care about menial micro things like i++ vs ++i.
    For the "root of all evil" side, you are misrepresenting them by suggesting that they do literally anything at all.
    In reality:
    The performance side of the argument argues simply to not pessimize your code. Don't write poorly performing code just because that "seems" to be the easiest. Don't blame your 5,000ms web request on "lol, that's just IO". We literally just argue in favour of not picking extremely slow tech/algos from the get-go, and of actually measuring from time to time. That's it.
    The "root of all evil" side loudly advocates *to never ever even bother taking metrics of your performance, or even to think about performance characteristics at all*. This side argues that ALL PERFORMANCE MEASUREMENTS AND OPTIMIZATION OF ANY KIND ARE PREMATURE OPTIMIZATION. JUST THROW HARDWARE AT IT.

  • @carlwummel9050
    @carlwummel9050 1 year ago

    This one was really well made. Not too fast not overloaded, slow talk... easy to follow. Thank you!

  • @Feonen
    @Feonen 1 year ago +11

    Just a quick chime-in on the ++i/i++ debate:
    The reason I believe ++i should be "best practice" is that i++ can lead to subtle bugs: it *post*-increments, meaning the result of "i++" is the value of "i" *before* the increment.
    Which can easily be overlooked if it's used as part of a larger expression!
    Therefore I recommend either:
    - Use ++i everywhere.
    - Never use i++ as part of a larger expression.
    Or use Rust, which has neither (for this exact reason) and just do: i += 1;
    :D

    • @NathanHedglin
      @NathanHedglin 1 year ago +2

      But I love doing "return i++" jk jk

    • @blinking_dodo
      @blinking_dodo 1 year ago +1

      i++; // make it a separate statement instead.

    • @marklarization
      @marklarization 1 year ago

      I was using i++ for years without realizing what it actually did. When I learned the truth, I was so shocked that I now exclusively use ++i.

  • @pratikchakravorty983
    @pratikchakravorty983 1 year ago

    This guy is like the kaptain kristian of coding topics. Videos are sleek high quality and just a joy to watch

  • @gunnarthorburn1219
    @gunnarthorburn1219 1 year ago +22

    I think the problem is that discussions about optimization, or premature optimization, are often so simplified (like the difference between i++ and ++i - we agree about focusing on readability here).
    Quite often two nested loops can be replaced by a sort followed by one loop. That brings the cost down from O(n²) to O(n log n). O(n²) is often too expensive, and O(n log n) is most often acceptable. I think this is good practice. In most languages sort is easy to use, and the code is often easier to reason about when data is sorted.
    I think, when you write code that runs on a server, or a reusable library, it makes sense to make informed optimizations (like the one above) immediately. When writing code that will run in a web browser, the CPU cycles are cheaper - and if the application is fast enough, then it's fast enough.

    • @jasonpepas4290
      @jasonpepas4290 1 year ago +3

      Considering the hellscape that is modern day web performance, I'm not sure I understand how CPU cycles are cheaper on the client.

    • @gunnarthorburn1219
      @gunnarthorburn1219 1 year ago +4

      @@jasonpepas4290 If I build the system, and the web application, and host the servers, I pay for the CPU-cycles that run on my servers.
      But I do not pay for the CPU cycles that are wasted in your browser, when you use my application.

    • @serhatarslan4138
      @serhatarslan4138 1 year ago +1

      I wonder what is the difference in optimization regarding i++ and ++i?

    • @gunnarthorburn1219
      @gunnarthorburn1219 1 year ago +1

      @@serhatarslan4138, it used to be said that one was faster than the other. It was purely about the generated ASM-instructions, and I doubt it matters at all on modern CPUs.

    • @serhatarslan4138
      @serhatarslan4138 1 year ago +1

      I agree with you. Given the difference between ++i and i++, modern compilers generate the same code using optimization techniques. These are cookies for compilers :D @gunnarthorburn1219

  • @koekeritisVideos
    @koekeritisVideos 4 months ago

    I just discovered your channel recently and I love it!
    I feel like there is a gap in content for intermediate programmers like this, and this channel fills that gap beautifully with really useful topics covered in a streamlined and neatly explained way. Thank you for providing us with that :)

  • @vitulus_
    @vitulus_ 1 year ago +4

    Love the triangle - it put into words what I have been thinking. My rule is: once a component/idea is used for the third time, abstract/generalise it instead of hacking around the differences. I'm not into generalising everything straight away; as long as you have experience with how things should be structured, it's typically not an issue. Performance should always be in the back of one's mind, otherwise you'll eventually ruin velocity, although full optimisation should be done when weak points are discovered later during profiling (what you call micro vs macro optimisation).

  • @mennovanlavieren3885
    @mennovanlavieren3885 1 year ago

    Much appreciated video. What I like the most:
    1. The distinction between Macro and Micro optimizations. Don't convolute the discussion. Macro aligns with good architecture, Micro opposes it.
    2. Go for data structures (or database structures) first. Data locality is where the battle is won. And it forces you to think about your architecture.

  • @jaypii3222
    @jaypii3222 1 year ago +18

    My advice for performance gains:
    1. Database design (missing indexes on large tables, overly complex queries)
    2. Database queries (making even small sets of data large and not utilizing indexes)
    3. Everything file related (caching with files instead of RAM, too many files in one folder, too many file operations)
    4. External APIs (too many calls)

    • @peterbonnema8913
      @peterbonnema8913 1 year ago +2

      For those writing console apps, anything related to IO is slow, including writing to console. So make sure to take out those debug lines before running!

    • @cyrileo
      @cyrileo 1 year ago

      Good advice 👌 That might be a better approach than blindly optimizing code!
      ~ with ❤️ from repliesgpt

    • @BrunodeSouzaLino
      @BrunodeSouzaLino 1 year ago

      5. Don't use XML for anything.

  • @peterszilvasi752
    @peterszilvasi752 1 year ago +1

    Once upon a time, we developed an auditing framework in .NET Core 2.1 that used sockets to send data between processes/machines. We placed a lot of synchronization objects (semaphores, mutexes, etc.) to ensure data integrity. The code was very smelly... it felt wrong to me. Anyway, the tests passed and the output looked good, so we did not clean up the code. Later, when .NET Core 2.1 was upgraded to .NET 5, the tests started to fail and deadlocks happened sporadically. It turned out that the code failed because .NET 5 is much faster than .NET Core 2.1. So we had to clean up the code, remove a lot of unnecessary synchronization, and use a modern asynchronous programming model. Sometimes clean code and performance go hand in hand.
    P.S. I loved the little stories (Mark Zuckerberg, GTA5, etc.) during the videos. Great job!

  • @TheRationalPi
    @TheRationalPi 1 year ago +5

    I do real-time signal processing algorithms for embedded systems, and sometimes you need to consider performance in the early stages of development. For example, I may be given the choice between doing something in the time domain or the frequency domain, and depending on if I'm doing more convolutions or array multiplications, one of those choices could be orders of magnitude faster. And because the choice of domain cascades through the whole system, changing later might mean scrapping the whole project.
    That said, this applies only to the "core" algorithmic portion of my code. Even in signal processing, there is a lot of peripheral code that isn't nearly so performance constrained, and prioritizing flexibility/maintainability is the way to go.

    • @dominiccasts
      @dominiccasts 1 year ago +2

      I mean "premature" is the operative word, and it sounds like while it was early in the project, it wasn't premature to optimize then.

    • @TheRationalPi
      @TheRationalPi 1 year ago

      @@dominiccasts Totally agree.

  • @jonopens
    @jonopens 1 year ago +1

    I look forward to your videos quite a bit. You have a reasonable, rational approach to the thesis of each video, so I hope you will continue for a long time to come. Cheers!

  • @Temulgeh
    @Temulgeh 1 year ago +8

    to me pre increment vs post increment isn't about performance but about code clarity and intention
    if i just want to increment i'll use pre increment, if i want to iterate on something while actually using the post increment's functionality, i'll use post increment

    • @davidg5898
      @davidg5898 1 year ago +2

      Same.
      I consider what is most sensible for the end condition, return, or output, and pick the increment type based on that.

  • @richardblain4783
    @richardblain4783 1 year ago

    6:44 it seems to me you’ve shown that the answer is, “No, pre- and post-increment have the same performance, at least in optimized code, and provided the result of the operation is ignored.” Thank you for investigating this question and making the answer available to everyone. You’ve saved us, collectively, a great deal of work.

  • @sergioquijanorey7426
    @sergioquijanorey7426 1 year ago +7

    Totally agree. Had performance issues in my machine learning project, done in Python. I thought that list comprehensions were the problem, but I decided to profile first. Turns out it was some useless copying from RAM to GPU memory :D

    • @VioletGiraffe
      @VioletGiraffe 1 year ago +4

      As my boss says: you never truly know what your code does until you've looked at it with a debugger.

    • @cyrileo
      @cyrileo 1 year ago

      Nice catch! 🔎 Always profile before making optimizations 🙌
      ~ with ❤️ from repliesgpt

  • @LuigiDiPaolo-vw4xs
    @LuigiDiPaolo-vw4xs 1 year ago

    This is the first video I see from your channel. I rarely subscribe to channels, and even more rarely subscribe after watching only one video.
    You've just got a new subscriber. Great content dude

  • @MichaelDarrow-tr1mn
    @MichaelDarrow-tr1mn 1 year ago +57

    how is your logo a C

  • @JamesWjRose
    @JamesWjRose 1 year ago +2

    I found your channel a few days ago - thank you very much for all the content. I have been a software dev since the early 90s, and for most of my career optimization was NEVER an issue... until I started making VR applications. VR is the exception to the rule about optimization: basically EVERYTHING about VR needs to be optimized to hit the NECESSARY frame rates.

  • @YamianGodlike
    @YamianGodlike 1 year ago +10

    Your array vs hashset example reminds me of something I ran into during my limited time coding:
    As a default I used an array and += to add elements to it. This worked fine in testing but was slow in production. Switching from the array to a generic list and using Add() made it way faster. Further testing showed that the array was faster for a low number of elements (which was my test data), but when it met real data (up to 600k elements) it was inadequate.

  • @MarkSrg
    @MarkSrg 11 months ago

    I just recently discovered this channel and I have to say it's amazing. Thank you for the useful and beautiful content!
    As a French engineering student I had classes about algorithmic complexity (which is a bit painful but really useful), and I think that it belongs to the 80% moves along with data structures. A classic case study is sorting arrays/lists of numbers, and the sorting algorithm can make a huge difference when working with large ones.

  • @jgilbert42
    @jgilbert42 1 year ago +3

    The best reason for Zuckerberg to use PHP was that he already knew how to make a website with PHP.
    There are plenty of times when way faster code doesn't take more time to write - sometimes it's even faster to write. It depends on the knowledge of the programmer.

  • @ibrahimsoukak437
    @ibrahimsoukak437 22 days ago

    I love your videos man. The pace of the video as well as the animations are really good and easy to follow

  • @WillBlocher
    @WillBlocher 1 year ago +10

    Another hit. This channel is so good. Yes, I think that people come out of formal education with this over-emphasis on performance. They've had weeks to do their homework, so forget velocity. They're moving onto the next chapter next week, so forget extensibility. Big O notation is bouncing around in their head like the windows screensaver and their CS idols got to the moon with less memory than a wrist calculator. They think "anyone can get it done, but only the best can do it right and that's what gets them noticed". Then they graduate and land in their first production cycle and realize "oh, it is not the case that anyone can get it done".

  • @daffyf6829
    @daffyf6829 1 year ago +1

    I was surprised to discover how young this channel is. The content and quality of your videos are top notch. I appreciate the humor too.
    I'm an amateur, so I've never had to deal with performance. But I'm often paralyzed by trying to simplify my code. I'm not sure how to describe it in few words. Basically, getting from point A to point B, I'll end up with 20 nested if-else statements and be only halfway. Then I spend hours reading through the documentation looking for a built-in function, and implementing and debugging my ham-fisted attempts. I think of them as learning opportunities, but because I'm an amateur these lessons are few, far between, and often forgotten. It's super discouraging.

  • @Metagross31
    @Metagross31 1 year ago +8

    I recently learned, that thinking about memory layout is much more important than I thought!
    I was working on some numerical calculations and tried to do some optimization.
    The first thing I did was flatten some nested vectors into plain vectors to improve parallelization. Sadly, this did not result in as much of a speedup as I had hoped; in some parts it even slowed the code down, despite making better use of all available threads.
    Then I did some more digging into how the memory is laid out and how the vectors are iterated over, and noticed that it was very impractical for caching. So I just shifted the memory layout around a little bit, without changing anything about the algorithms used. This alone brought a speedup of over 50%. I never thought it would make THAT much of a difference, but I am glad I did it :)

    • @theodorealenas3171
      @theodorealenas3171 1 year ago

      Wait how? You told the program to align a piece of data somewhere? How do you do that?

    • @john.darksoul
      @john.darksoul 1 year ago +5

      @@theodorealenas3171 (can't say for the original commenter, but) when working with contiguous data, the most cache-friendly way to iterate over elements is sequentially. As you access an element, the next few will be loaded into cache (cache controllers detect memory access patterns + data is cached in chunks), and access time for them will be near instant. This can be illustrated with a 2D array (say, a matrix of floats) stored in memory line by line. If you process it line by line, it'll be fast, because the next element in a line is also the next value in memory (and thus benefits from caching), but if you iterate over columns, almost every memory access will be a cache miss and take a very long time in comparison, despite the algorithm being essentially the same.

    • @theodorealenas3171
      @theodorealenas3171 1 year ago

      @@john.darksoul I think what they did is they aligned the beginning of the data with a memory block. I think that's what they're referring to. So one load operation brings a good number of the data you want, and I wonder, how do you typically express that in non assembly code

    • @puppergump4117
      @puppergump4117 1 year ago +1

      ​@@theodorealenas3171 It could have to do with a condition being optimized with branch prediction. The cpu will typically execute the first instruction(s) of the loop while it checks the condition. In cases such as (i < vector.size()), the condition is always true until the end, meaning every loop but the last one is sped up a bit. But if it's sometimes true, sometimes false, it will be much slower than it could be, such as in a case where you have different control paths inside a loop.

    • @john.darksoul
      @john.darksoul 1 year ago +1

      @@theodorealenas3171 yeah, well, it would make sense if the amount of data is really small, i.e. fits in a single cache line, but for large amounts of data it wouldn't make much difference (if at all) I believe
      In C11, there's aligned_alloc, which allows you to specify the alignment in bytes, and some platforms have their own functions that do a similar thing (_mm_malloc, posix_memalign, etc.)

  • @onlyvvv
    @onlyvvv 1 year ago

    I work as a runtime performance engineer in the games industry, and I can say without a doubt that everything in this video hits home so true.
    The virtuous cycle of my job is:
    - Have a KPI of what we are trying to hit; the performance GOAL
    - Measure;
    - Make sure your measurement is correct....
    - Form a hypothesis of what's 'slow'; does it actually miss the performance GOAL?
    - If it does miss the goal, take action
    - execute loop until you hit the goal (but always remember you can ask 'is the goal valid')
    Rinse and repeat, and always bias for not spending infinite cycles in this loop.

  • @david_sanchez
    @david_sanchez 1 year ago +11

    The passive aggressiveness is epic! I love it.
    That whole "performance" bit is such a rampant issue in development. Just look at any Stackoverflow question and you're guaranteed to see some speed freak in the comments making suggestions about "don't do that, do this, it's faster" when they have absolutely no idea what the use case(s) is/are. Not only that, the OP didn't even ask "how do I make this faster?" They just want to make it work. Definitely some popcorn worthy reading if you ever need to brighten your day. But then again, I'm not a developer so what do I know? 🤷‍♂

  • @tobb10001
    @tobb10001 1 year ago

    For the previous videos I was like "Yeah, that's also what I try to do and tell people all the time...", but this one caught me.
    Thanks for that perspective.

  • @handomize
    @handomize 1 year ago +5

    Hey, really enjoy the videos that you put out. Really liked the dog-walking part because I run into that constantly... trying to make a system highly adaptable and adding complexity where it doesn't need it, making development take 5 times longer. I'm now at the end of a 2-month project that should have taken 3 weeks... I don't know how to balance speed and adaptability.

  • @403gtfo
    @403gtfo 1 year ago +2

    That was a really good and chill explanation of something I struggle with all the time.
    I've had projects fail due to me (the solo developer in the company) trying to focus too much on adaptability and performance where it wasn't needed. Feels bad...
    10/10 would zen watch again, have a sub :)

  • @ChristianBrink
    @ChristianBrink 1 year ago +13

    Your stuff is fantastic. These are all the things I have been advocating for 15+ years. You present it in a much more accessible way than I ever did (corporal punishment and yelling :) )
    I am now having my team watch all of your videos - Thank you and keep up the great work!
    I also liked how you touched on YAGNI - Young devs can get caught up in that as they start writing better code

  • @negritoojosclaros
    @negritoojosclaros 10 months ago +1

    This vid is so true!!! Thinking about performance before you even have a problem is nasty!!

  • @enotirab
    @enotirab 1 year ago +5

    Side note: PHP is a MUCH better language today than it was when Facebook was first created. It still has problems, but it's constantly improving.

  • @jonashautier9169
    @jonashautier9169 1 year ago +1

    Your videos are great! A colleague shared one at work today, and it's funny that both of us found your channel :)
    Really good content, well edited and with a lot of tips for developing better.
    I hope that your channel will grow, you earned it !

  • @snk-js
    @snk-js 1 year ago +10

    I'm developing a game, and that concept of adaptability is very valuable to me while crafting a complex system; maybe velocity is gained over time while you focus fully on adaptability.

  • @blablabla7796
    @blablabla7796 10 months ago +2

    I’m personally in the ++i camp. Not because it’s faster, but because it conveys a meaning to the reader. If I read i++, I always subconsciously have to consider the side effect of saving the value, doing some operations with the copy, possibly returning it from a function, then incrementing the value. With ++i, I don’t have to consider that possibility. Just increment the number, and use that, maybe. Having to think less about something when reading someone else’s code is worth it in my opinion.

  • @JakubSK
    @JakubSK 10 months ago +3

    Always optimize as you write code.

  • @ThijmenCodes
    @ThijmenCodes 1 year ago

    Haha, when I saw the video title I figured you would add a gimmick to show us how much time is wasted worrying about premature optimization. Loved the counter, great video.

  • @TheRealBigYang
    @TheRealBigYang 1 year ago +20

    There are some optimization tricks in C and C++ which don't impact readability and velocity. One of them is organising your structs: a struct allocates memory in chunks based on the biggest variable type inside of it (up to the size of the biggest basic type). Sometimes just shifting a few lines of code can make a huge performance difference. Remember, code is compiled and executed top down. The basics of the basics.

    • @sameer_c
      @sameer_c 1 year ago +1

      It's a union rather than a struct

    • @williamdrum9899
      @williamdrum9899 1 year ago

      Ah yes, the wonderful world of padding. Or why using "char" instead of "int" for a person's age doesn't actually save memory

    • @pigpuke
      @pigpuke 1 year ago

      It's not based on the biggest field in the struct but the target machine architecture. If you target a 32-bit platform then you get 32-bit field packing, if you target an 8-bit platform you get 8-bit field packing. You can also force the field packing directly regardless for a struct.

    • @williamdrum9899
      @williamdrum9899 1 year ago

      @@pigpuke Didn't know people used C for 8 bit platforms

    • @pigpuke
      @pigpuke 1 year ago

      @@williamdrum9899 I think you missed the point

  • @cefcephatus
    @cefcephatus 9 months ago

    Thanks for helping me cope with existential crisis. This video is actually the tool I can use to hit my inner voice in its spiritual face, to stop me from trying to optimize stuff while the deadline is on. I used to just ignore it and write clean code unless performance was an issue. But in the design phase, I couldn't ignore it. This vid might help with that.

  • @maninalift
    @maninalift 1 year ago +46

    I still think that you should default to ++i. You rarely want the behavior of i++; it's a more conceptually complicated operation. You don't need to test to know that ++i isn't going to be slower, although you might guess that most of the time it won't be any faster either. On rare occasions you actually want the behaviour of i++, in which case it might help if you default to ++i everywhere else, because the difference acts as a sign: "hey, you know the weird way that i++ works? I actually want that here".

    • @rauldragu9447
      @rauldragu9447 1 year ago +20

      Yeah, I was thinking the same. ++i isn't better because it's faster - no one would expect it to be. It's better because it's more correct, because you are actually stating your intent.

    • @etodemerzel2627
      @etodemerzel2627 1 year ago +15

      There's an even better alternative: don't use the ++ operator at all. I've stopped using it after experiencing its absence in modern languages. My code only got better.

    • @maninalift
      @maninalift 1 year ago +5

      @@etodemerzel2627 yeah, I think that's probably a good idea. When we were writing low-level C manually iterating over pointers and counters all of the time, it made sense.

    • @redking36
      @redking36 1 year ago

      Which cases would use ++i? It seems to me like it would always skip the first element of something or start the count too high.
      Well, I guess there's no difference as long as it's on its own line of code.

    • @rauldragu9447
      @rauldragu9447 1 year ago

      @@redking36 Everywhere you mindlessly write post increment, you should probably use the pre increment. Use it anywhere you want a value to be incremented NOW. The only place I find myself needing the post increment is when I need to append to an array and update the size i.e. ```array[size++];```. An example that I recently encountered where the post increment would have introduced a sneaky bug was needing to find a '#' in a file and then copy the color code past that. I did ```const auto begin = ++std::find(std::istreambuf_iterators..., '#'); std::copy(begin, ...);``` because these are forward iterators so you can't just add + 1 at the end, and if I used the post increment begin would still point to the '#'.
      Also, when would it skip the first element? The only scenario I can think of is something like ```while (i++ < n) { ... }```, but I consider this bad form anyway and would just use a for loop: ```for (; i < n; ++i) { ... }``` - it was created for this, after all.

  • @punims
    @punims 1 year ago

    That was genius! Loved the existential increment counter.

  • @johnbarbero757
    @johnbarbero757 1 year ago +6

    With a background in games programming I agree with the overall theme of your video (though it becomes more complex, as features will be cut if they can't be made performant enough down the line). With regards to ++i vs i++, I'll say that in most cases it will not matter, as the compiler will see the context and optimize away the difference. I'd argue, however, that ++i is more correct if you don't actually intend to use the value, and that it's good practice to opt for that, and only swap to i++ when you need the current value of i and then increment.

  • @ainoakeisari
    @ainoakeisari 1 year ago

    I wholeheartedly agree with your "is this conversation worth it" assessment.