How to make C++ run FASTER (with std::async)

  • Published Sep 7, 2024
  • Go to www.hostinger.c... and use code "cherno" to get up to 91% OFF yearly web hosting plans. Succeed faster!
    Patreon ► / thecherno
    Instagram ► / thecherno
    Twitter ► / thecherno
    Discord ► thecherno.com/...
    Series Playlist ► thecherno.com/cpp
    This video is sponsored by Hostinger

COMMENTS • 444

  • @TheCherno
    @TheCherno  4 роки тому +86

    Thanks for watching, hope you enjoyed the video! Don't forget to check out hostinger.com/cherno and use code "cherno" to get up to 91% OFF yearly web hosting plans! ❤

    • @CatDaWinchi
      @CatDaWinchi 4 роки тому

      Hello, what do I need to create a std::mutex object? I'm compiling the project with mingw-gcc (using CMake, under Windows) and I cannot include the mutex library (#include <mutex> and then use it). The only solution I've found is to use the WinAPI wrappers around those calls, like HANDLE mtx = CreateMutex(...) and the others declared in the Windows headers. Is there a way to use std::mutex?
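
      For context on the MinGW question: whether <mutex> is available generally depends on the toolchain's thread model (the POSIX-threads builds of MinGW-w64 ship it; older win32-threads builds did not), so treat that as an assumption about the asker's setup rather than a definitive fix. Once the header is available, plain standard usage is enough, no WinAPI required; a minimal sketch:

      #include <mutex>

      std::mutex g_Mutex;
      int g_Counter = 0;

      void Increment()
      {
          std::lock_guard<std::mutex> lock(g_Mutex); // RAII: unlocks when `lock` goes out of scope
          ++g_Counter;
      }

      int main()
      {
          Increment();
          return g_Counter == 1 ? 0 : 1;
      }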

    • @FergusGriggs
      @FergusGriggs 4 роки тому

      91%?

    • @artyquantum7283
      @artyquantum7283 4 роки тому +2

      They are already giving a 90% Black Friday sale; I don't think they pay any regard to the link you provided.

    • @267praveen
      @267praveen 4 роки тому

      Hey Cherno
      Can you please elaborate a bit on using the std::future ...

    • @bulentgercek
      @bulentgercek 4 роки тому

      ​@@artyquantum7283 But they will give a share to Cherno if you click that link. Because it passes that cherno keyword in the url. That's how sponsorship works with the links.

  • @jovanrakocevic9360
    @jovanrakocevic9360 4 роки тому +317

    Someone has possibly mentioned it already, but you can pass by reference to threads; you just need to use std::ref().
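
    For readers skimming the thread, a minimal sketch of what that looks like; the vector and loader names are placeholders, not Cherno's actual code:

    #include <functional>
    #include <future>
    #include <string>
    #include <vector>

    static void LoadInto(std::vector<std::string>& out, const std::string& path)
    {
        out.push_back("loaded:" + path); // stand-in for Mesh::Load(path)
    }

    int main()
    {
        std::vector<std::string> meshes;
        // std::async (like std::thread) copies its arguments by default; std::ref wraps
        // the vector in a std::reference_wrapper so the callee sees the original object.
        auto f = std::async(std::launch::async, LoadInto, std::ref(meshes), std::string("truck.fbx"));
        f.wait(); // real code with several concurrent tasks would also need a mutex around push_back
    }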

    • @1495978707
      @1495978707 3 роки тому +9

      What is the difference between this and a reference?

    • @jovanrakocevic8263
      @jovanrakocevic8263 3 роки тому +51

      @@1495978707 Pointers and references have slightly different semantics: a reference can never refer to nothing the way a nullptr can, so it's guaranteed to be "valid", and you can't change what it refers to once it's created. It's nothing you can't do with a pointer, but it enforces those rules.
      If you mean why you can't just pass a reference instead of std::ref... I'm not sure. I think it has to do with references not being copyable (I guess this can also be worked around with std::move() if you're okay with the reference possibly being invalid). But this is fairly low-level C++ arcana and I'm not claiming to know for sure; it's just a guess.

    • @w.mcnamara
      @w.mcnamara 3 роки тому +153

      @@1495978707 std::thread requiring you to use std::ref instead of & is a deliberate design decision by the C++ committee to keep you from shooting yourself in the foot when creating a thread.
      If you (for example) pass a local variable to a thread by reference, the variable may go out of scope before the thread uses it (because the thread executes in parallel), and when that memory is then accessed through the reference in the newly created thread, the result is undefined behaviour.
      Requiring people to use std::ref forces them to think about why they're passing by reference (and, if they're making an error, possibly realize it) and makes the intent explicit in the code.
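
      A small sketch of the hazard described above (all names are made up for illustration; the "bad" version is exactly what std::ref makes you pause and think about):

      #include <chrono>
      #include <iostream>
      #include <thread>

      void PrintLater(const int& value)
      {
          std::this_thread::sleep_for(std::chrono::milliseconds(100));
          std::cout << value << '\n'; // in the bad case this reads destroyed memory
      }

      void Bad()
      {
          int local = 42;
          std::thread t(PrintLater, std::ref(local)); // compiles fine, but...
          t.detach();                                 // ...`local` dies when Bad() returns: undefined behaviour
      }

      void Good()
      {
          int local = 42;
          std::thread t(PrintLater, std::ref(local));
          t.join();                                   // `local` is guaranteed to outlive the thread's use of it
      }

      int main()
      {
          Good(); // Bad() is shown only to illustrate the mistake; don't call it
      }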

    • @user-qy2le2fo7r
      @user-qy2le2fo7r 3 роки тому +16

      ​@@w.mcnamara Perfect explanation !

    • @mostafatajic6457
      @mostafatajic6457 3 роки тому +5

      ​@@w.mcnamara Thank you for such a clear explanation!

  • @oleksiimoisieienko1430
    @oleksiimoisieienko1430 4 роки тому +66

    If you preallocate the vector with the number of "meshes", you can pass an index to the async function and you don't need any locks.
    Mutexes and memory allocation are among the top CPU eaters; if you eliminate them, performance increases a lot. Without preallocation, the vector will reallocate several times during push_back, and with the lock your algorithm will sometimes wait for another thread to finish pushing into the vector. If you give each task its own index, you can run load_mesh without a lock and it will be safe, because every thread accesses a unique index. The result should be much better.
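
    A minimal sketch of the index-per-task idea; Mesh and LoadMesh are placeholders for the video's Mesh::Load:

    #include <future>
    #include <string>
    #include <vector>

    struct Mesh { std::string source; };
    static Mesh LoadMesh(const std::string& path) { return Mesh{path}; }

    int main()
    {
        const std::vector<std::string> files = {"truck0.fbx", "truck1.fbx", "truck2.fbx"};

        std::vector<Mesh> meshes(files.size());      // preallocated: no push_back, no reallocation
        std::vector<std::future<void>> futures;
        futures.reserve(files.size());

        for (std::size_t i = 0; i < files.size(); ++i)
        {
            futures.push_back(std::async(std::launch::async, [&meshes, &files, i]
            {
                meshes[i] = LoadMesh(files[i]);      // each task writes a unique index: no data race
            }));
        }

        for (auto& f : futures)
            f.get();                                 // wait for all loads before using `meshes`
    }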

    • @maxmustermann3938
      @maxmustermann3938 10 місяців тому +4

      In this specific example, this will barely gain you anything (even though I agree with you). What's taking time here is definitely not some small reallocation of a vector that ultimately just contains 19 references to meshes, and those mutexes are not going to be noticeable when the actual tasks running in parallel are comparatively this "big". If you have a lot of small tasks, though, the points you mention become very important.

  • @igorswies5913
    @igorswies5913 4 роки тому +275

    Nobody:
    The Cherno:
    I need 19 Soviet trucks NOW

  • @TheOnlyJura
    @TheOnlyJura 4 роки тому +112

    You could increase the performance even more by increasing the vector's size and passing just one reference to a cell as an argument to the function. That way you don't need to reallocate all the time + you get rid of the mutex.

    • @minneelyyyy
      @minneelyyyy Рік тому +1

      That could work at first, but if you end up extending this code and something somewhere suddenly needs to push to that vector, you've just created UB. It's best never to hold a pointer to an element inside a dynamic array.
      You could get rid of the reallocations by just setting the capacity, and get rid of the mutex by returning the loaded mesh and then collecting all of the results at the end, as sketched below.
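
      A rough sketch of that collect-the-results approach; Mesh and LoadMesh are placeholders for the video's Mesh::Load:

      #include <future>
      #include <string>
      #include <vector>

      struct Mesh { std::string source; };
      static Mesh LoadMesh(const std::string& path) { return Mesh{path}; }

      int main()
      {
          const std::vector<std::string> files = {"truck0.fbx", "truck1.fbx"};

          std::vector<std::future<Mesh>> futures;
          futures.reserve(files.size());
          for (const auto& file : files)
              futures.push_back(std::async(std::launch::async, LoadMesh, file));

          std::vector<Mesh> meshes;
          meshes.reserve(futures.size());
          for (auto& f : futures)
              meshes.push_back(f.get()); // no mutex: only this thread touches `meshes`
      }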

    • @TheRealZitroX
      @TheRealZitroX 2 місяці тому

      Isn't there a way to do an atomic push_back?

    • @TheRealZitroX
      @TheRealZitroX 2 місяці тому

      Edit: no, there isn't, but you could hold an atomic size and use it to guard against someone else pushing at the same time. You can make it safe that way, but by the time you've spent that effort you could just lock the vector, and nobody would notice the micro delay.

  • @malgailany
    @malgailany 4 роки тому +106

    Parallel stacks view blew my mind. Didn't know such a feature exists!
    Thanks.

    • @Ariccio123
      @Ariccio123 4 роки тому +1

      Trust me, it's made a huge difference for me in the past. Always have it open in a tab!

    • @schoebey
      @schoebey 4 роки тому +4

      Right? I've been using VS for a long time (15+ years), but still discover new things about it. Thanks for that!

  • @Light-wz6tq
    @Light-wz6tq 4 роки тому +159

    Best C++ channel ever!

    • @TheMR-777
      @TheMR-777 4 роки тому

      No doubt

    • @Quancept
      @Quancept 3 роки тому

      100% right.

    • @kingsnowy3037
      @kingsnowy3037 3 роки тому +1

      *ChiliTomatoNoodle* wants to know your location

  • @attcs
    @attcs 4 роки тому +35

    This is a good presentation of "how NOT to do it". There are at least two major issues.
    1. Usage itself. C++17 has execution policies; this code can be written much more simply, and with better performance, without any mutex or the overhead of countless threads:
    m_Meshes.resize(meshFilepaths.size());
    std::transform(std::execution::par, meshFilepaths.begin(), meshFilepaths.end(), m_Meshes.begin(), [](auto const& file) { return Mesh::Load(file); }); // or probably par_unseq
    std::async can also take a function with a return value; you would just call future::get in a second for loop. The video's solution bypasses the obvious way and is hardly readable or maintainable.
    There are also plenty of non-standard ways to parallelize the work; OpenMP 2.0 and the Parallel Patterns Library (PPL) are built into Visual Studio.
    2. Parallel file I/O can have major performance problems, especially if the files don't fit in the same hard-drive cache (large files, fragmented data). In this example the same file is read over and over from the cache, so that issue never shows up.
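
    A compilable sketch of that execution-policy approach (C++17; with GCC/libstdc++ you typically also need to link against TBB). Mesh and Mesh::Load are stand-ins for the video's code:

    #include <algorithm>
    #include <execution>
    #include <string>
    #include <vector>

    struct Mesh
    {
        std::string source;
        static Mesh Load(const std::string& path) { return Mesh{path}; }
    };

    int main()
    {
        const std::vector<std::string> meshFilepaths = {"truck0.fbx", "truck1.fbx", "truck2.fbx"};

        std::vector<Mesh> meshes(meshFilepaths.size());
        std::transform(std::execution::par, meshFilepaths.begin(), meshFilepaths.end(),
                       meshes.begin(),
                       [](const std::string& file) { return Mesh::Load(file); });
        // Each result is written to a distinct slot, so no mutex is needed.
    }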

    • @Kabodanki
      @Kabodanki 4 роки тому +3

      Would love to see a response to that. Having threads popping up everywhere isn't great; to paraphrase the Digital Foundry folks, it's distracting. Mutexes should be avoided when they aren't necessary, and having lots of threads is not great either (MHW on PC is a great example of that). I/O is a big problem as well; just look at what Mark Cerny had to say about game sizes and the issues they cause, like copying the same asset multiple times (inflating game sizes in the process) so it can be loaded quickly: "there are some pieces of data duplicated 400 times on the hard drive."

    • @mohammedalyousef5072
      @mohammedalyousef5072 4 роки тому

      The compiler I'm using doesn't support execution policies, but it does support std::async. As far as I can tell, GCC has only recently added full support for them.

  • @commanderguyonyt0123
    @commanderguyonyt0123 4 роки тому +83

    The reason the reference doesn't work is that std::async takes the arguments for the function by copying them. When the function is called, std::async has its own copy of the meshes, so LoadMesh receives std::async's variable. TL;DR: std::async passes the arguments by copying them (like a lambda with [=]).

    • @connorhorman
      @connorhorman 4 роки тому +4

      However, it will unwrap std::reference_wrapper into T&.

    • @NomoregoodnamesD8
      @NomoregoodnamesD8 4 роки тому +4

      call std::ref(x) and std::cref(x) to pass by reference in thread/async invocations

    • @AvivCMusic
      @AvivCMusic 4 роки тому

      I'm confused. std::async takes a "forwarding reference", aka Args&&. We pass into it an lvalue (the meshes), which turns the forwarding reference to an lvalue reference &. std::async will later take that reference, and pass it into the function passed in, which also takes a reference. Where does the copy happen?

    • @NomoregoodnamesD8
      @NomoregoodnamesD8 4 роки тому +3

      @@AvivCMusic since the function given to std::async may be run on a different thread, arguments are copied so that once you've called async, you may dispose of those arguments (assuming you pass by value rather than calling std::ref or std::cref, which instead basically passes by pointer)

    • @AvivCMusic
      @AvivCMusic 4 роки тому +1

      @@NomoregoodnamesD8 Oh, do you mean std::async explicitly makes a copy of the arguments, inside its body?
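
      One way to see the copy for yourself is to print addresses; a small sketch (not from the video), relying only on the documented behaviour that std::async stores decayed copies of its arguments:

      #include <functional>
      #include <future>
      #include <iostream>
      #include <string>

      // Takes a const reference; with plain std::async this binds to async's internal copy.
      static void Inspect(const std::string& path)
      {
          std::cout << "inside task: " << static_cast<const void*>(&path) << '\n';
      }

      int main()
      {
          std::string file = "truck.fbx";
          std::cout << "in caller:   " << static_cast<const void*>(&file) << '\n';

          // Without std::ref, async decay-copies `file`; the printed addresses differ.
          auto copied = std::async(std::launch::async, Inspect, file);
          copied.get();

          // With std::ref, the task sees the caller's object; the addresses match.
          auto referenced = std::async(std::launch::async, Inspect, std::ref(file));
          referenced.get();
      }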

  • @guywithknife
    @guywithknife 4 роки тому +46

    I just want to take a moment to contemplate the way he pronounced “cached”

    • @StianF
      @StianF 3 роки тому +7

      Ohh, I thought he said "caged"

  • @leixun
    @leixun 4 роки тому +45

    *My takeaways:*
    1. One of the difficulties of making things run in parallel is to figure out the dependencies 2:38
    2. Check parallel stack graph 19:54

  • @Ariccio123
    @Ariccio123 4 роки тому +105

    You missed one huge problem: std::async launches a new thread on *every* invocation. This burned me back in 2015 when MSVC switched from the (better) thread-pooling implementation to the standards-compliant one. My program (altWinDirStat) can invoke up to millions of std::async tasks, so it now hangs almost immediately, at about 700 threads.

    • @michmich6645
      @michmich6645 3 роки тому +8

      Bruh

    • @IExSet
      @IExSet 2 роки тому

      TBB looks better, and now we can use latches and barriers from C++20 ???

    • @Energy0124HK
      @Energy0124HK 2 роки тому

      What are the alternatives to std::async then? Or how can we limit the number of threads it creates?

    • @JamesGriffinT
      @JamesGriffinT 2 роки тому +1

      @@Energy0124HK The alternative is to build your own thread pool tailored to your individual needs.
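
      Short of a full thread pool, one lightweight way to bound how many loads run at once is to gate task creation with a semaphore; a sketch assuming C++20 (std::counting_semaphore) and placeholder Mesh/LoadMesh names:

      #include <algorithm>
      #include <future>
      #include <semaphore>
      #include <string>
      #include <thread>
      #include <vector>

      struct Mesh { std::string source; };
      static Mesh LoadMesh(const std::string& path) { return Mesh{path}; }

      int main()
      {
          const std::vector<std::string> files(100, "truck.fbx");
          const unsigned maxInFlight = std::max(1u, std::thread::hardware_concurrency());
          std::counting_semaphore<> slots(maxInFlight);

          std::vector<std::future<Mesh>> futures;
          futures.reserve(files.size());
          for (const auto& file : files)
          {
              slots.acquire();                       // block until a slot is free
              futures.push_back(std::async(std::launch::async, [&slots, file]
              {
                  Mesh m = LoadMesh(file);
                  slots.release();                   // free the slot for the next task
                  return m;
              }));
          }

          for (auto& f : futures)
              f.get();
      }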

    • @jhon3991
      @jhon3991 2 роки тому

      @Alexander Riccio `std::async launches a new thread on every invocation`, how about g++?

  • @RC-1290
    @RC-1290 2 роки тому +1

    For a second I thought this was going to be a video about avoiding the c standard library. And then I read the second half of the title ;)

  • @furball_vixie
    @furball_vixie 4 роки тому +117

    1st, wow i just woke up to this. perfect to watch while eating breakfast i guess

  • @gonzalosanchez9191
    @gonzalosanchez9191 2 роки тому +2

    By far this is one of the best channel on youtube, thanks a lot for all this series.

  • @Albert-lr7ky
    @Albert-lr7ky Рік тому +1

    Great video! You have always been such a nice teacher. One thing to add (possibly someone mentioned it already): if we want to wait for all the async() calls to finish and get their results, we need to call get() on the future objects. For example, in your code:
    for (const auto& file : meshFilepaths) {
        m_Futures.emplace_back(std::async(std::launch::async, LoadMesh, &m_Meshes, file));
    }
    /*
     * if we wish to get the result value and keep processing,
     * we need to call get() on every future object
     */
    for (auto& futureObject : m_Futures) {
        futureObject.get();
    }

  • @lukassoo
    @lukassoo 4 роки тому +106

    2 videos in less than 24 hours?
    Impossible...
    Cherno found a way to make videos faster too!

    • @brakenthemole2377
      @brakenthemole2377 4 роки тому +45

      multithreaded video editing

    • @guestimator121
      @guestimator121 4 роки тому +8

      Even his hair looks like he was riding a bike 200miles per hour

    • @sky_beast5129
      @sky_beast5129 4 роки тому +6

      *Cherno.brain started using multithreading*

    • @Mulla_killer
      @Mulla_killer 4 роки тому +7

      Sponsors make people to achieve the impossible

    • @suntzu1409
      @suntzu1409 3 роки тому +1

      Cherno overclocked himself

  • @abdosoliman
    @abdosoliman 4 роки тому +2

    Dude, you literally saved my life; I was about to do multithreading C++98-style before finding this by coincidence. *Thank you*
    Could you make something about multiprocessing? So far I haven't found a clean way to do it with the standard library.

    • @bumbarabun
      @bumbarabun 4 роки тому +1

      The only problem is that this is a naive and most probably not very effective way to do multithreading.

  • @op3003
    @op3003 2 роки тому

    This How to finally showed me how to use async inside a loop. Did not find an answer on any of the 10 sites I've been on trying to find the answer.
    You rock Cherno!

  • @oozecandy
    @oozecandy 4 роки тому +6

    Wow- I just googled the term "std::async" and got an answer by one of my favorite cpp explainers which is only a few days old :)

  • @vladuchoo
    @vladuchoo 4 роки тому +2

    Dude, that's a great stuff, please keep it up!
    Another suggestion would be multithreading + immutability, lock-free constructs.

  • @XxxGuitarMadnessxxX
    @XxxGuitarMadnessxxX 2 роки тому +1

    I realize this is a two-year-old vid by now, but at ~18:30 you could also make the async function take an rvalue reference with '&&'. I'm not familiar with your 'Ref' there, but I think the vector would then take a reference to those rvalue references passed in (maybe something similar to using std::forward(...)?). I honestly have no idea, just guessing at this point lmao

  • @MelroyvandenBerg
    @MelroyvandenBerg 3 роки тому +11

    7:40 why do you first load it into a string vector, and later do a for-loop again :\? You can just put the Mesh::Load() in the while loop directly.

  • @andreylapshov9418
    @andreylapshov9418 14 днів тому

    The main point to remember is that the std::future returned from std::async blocks in its destructor, unlike a future obtained from a manually created std::promise.

  • @LoSte26
    @LoSte26 3 роки тому +1

    Starting from C++17, the standard library actually has a built-in parallel for loop; it's as simple as this:
    #include <execution> // needed for std::execution::par, which enables parallel execution
    #include <algorithm> // needed for the std::for_each algorithm
    ...
    std::for_each(std::execution::par, std::begin(meshFilepaths), std::end(meshFilepaths), [&](auto const& file)
    {
        std::lock_guard lock{s_MeshesMutex};
        m_Meshes.push_back(Mesh::Load(file));
    });
    That's it; no need to store the std::futures manually anymore, and the algorithm itself tries to balance the number of spawned threads. It's almost identical to a sequential for_each loop, except for std::execution::par (which makes the algorithm run in parallel) and the lock_guard/mutex. The lock_guard template parameter doesn't need to be specified thanks to CTAD (it's deduced since C++17).

    • @ycombinator765
      @ycombinator765 2 роки тому

      I'm a noob, but I think the Fluent C++ author (Jonathan Boccara) is right that C++ should not have deceptive abstractions, and this is exactly an example of one. What Cherno did is the right mix of verbosity and simplicity. I'm not arguing, just saying that some of the complexity should stay visible. Anyway, thanks for this.

  • @ArlenKeshabyan
    @ArlenKeshabyan 4 роки тому

    You don't need any mutex at all to do that. Just add a local std::vector of futures, iterate through the file paths, and do
    futures.push_back(std::async(std::launch::async, [](auto&& filepath){ return Mesh::Load(filepath); }, current_filepath));
    Right after that, iterate through all the futures you've just added and do m_Meshes.push_back(current_future.get());
    That's it.

  • @willinton06
    @willinton06 3 роки тому

    Well, this just makes me appreciate the simplicity of C#'s concurrency model, not even looking at Go's model, which is somehow even simpler. All of this could be done with a simple foreach, a Task list, and Task.WhenAll.

  • @chrisparker9672
    @chrisparker9672 4 роки тому +13

    Note that MSVC has a longstanding bug where `.valid()` on a future may erroneously return true even after calling `.get()`

  • @andreylapshov9418
    @andreylapshov9418 14 днів тому

    1. Use std::ref instead of a raw pointer.
    2. Assuming your Mesh class has a move constructor, it's better to use move semantics for the push_back into the mesh vector, so we avoid an unnecessary copy.
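
    A tiny sketch of point 2; the Mesh payload here is made up:

    #include <utility>
    #include <vector>

    struct Mesh
    {
        std::vector<float> vertices; // placeholder payload
    };

    int main()
    {
        std::vector<Mesh> meshes;
        Mesh mesh;
        mesh.vertices.assign(1'000'000, 0.0f);

        meshes.push_back(std::move(mesh)); // steals the vertex buffer instead of copying it
    }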

  • @jeanehu-pludain6640
    @jeanehu-pludain6640 Рік тому +13

    Please don't put background music on at the same time you're talking.

  • @Sebanisu
    @Sebanisu 2 роки тому +1

    I use async for loading a lot of textures. For saving some files I use a detached thread. I'm creating a map editor and I had some batch ops that I didn't want to wait or lock waiting on the writing to the hard drive. This video was very helpful.

  • @ryanj.s9340
    @ryanj.s9340 4 роки тому +2

    Thank you for making this video. I've wanted you to dive a little deeper into multithreading; hope we see some of this implemented in your game engine series. Rock on, Cherno.

  • @chiefolk
    @chiefolk 3 роки тому

    nice to see how the cherno grew during this series now vs in first video : )

  • @alimehrvarz8391
    @alimehrvarz8391 3 роки тому +2

    "Real-world" examples are great, thanks.

  • @thewelder3538
    @thewelder3538 Рік тому +1

    A game is nearly always mostly single-threaded. Very little in a game actually requires multithreading. Loading data, decompressing, etc.: none of these require, or would much benefit from, multithreading; they are synchronous operations. The models may all be separate, and their textures too, but you need all the models and textures loaded in order to render a scene. So although threading this stuff may give you some small advantage, you need everything ready, and the threading gives you a much higher probability of failure. Thus you end up having to deal with shared data, synchronisation and deadlocks.
    Okay so loading and decompressing could be done on separate threads, but the reality is that your I/O is inevitably going to be slower than your decompression, so it's always going to end up waiting for the load thread to complete. As said in this video, threading needs to be well thought out and far too many people thread stuff that really should be single threaded. Games are NOT good example of something that really benefits from multithreading because their workloads aren't distinct enough. Image/video rendering, sound/music processing, certain hashing/crypto algorithms are much better examples of where you get a real win with multithreading as many operations are all distinct and work well when done in parallel.

    • @Debaser36
      @Debaser36 Рік тому

      I tend to disagree. I always use mt for my (small) games and when profiling this, it's always WAY faster than when not doing it. Of course it needs to be thought out properly, but when it is, it can be quite powerful in games.

    • @thewelder3538
      @thewelder3538 Рік тому

      @@Debaser36 I'm not saying that there aren't places that you can use multithreading and get an easy win, but the core game code is nearly always single threaded. Now I've been writing games at well known software houses for more years than I care to remember, and stuff like the main game loop and most other things are single threaded. I've seen stuff like physics and even AI pushed out to threads, but you still end up waiting on those threads and the context switching isn't cost-free. Threading adds a serious level of complexity and there's always a risk of deadlocks and other things that can be very difficult to find and debug. I try to never use multithreading unless I'm sure that what I'm doing is completely independent, or something that is totally compartmentalised. Games do benefit from multithreading, just not in the ways that most people think.

    • @Debaser36
      @Debaser36 Рік тому

      @@thewelder3538 Well, yes, I'm with you there. In a game loop I would rarely use multithreading. I sometimes use it when a completely separate system doesn't have much to do with the loop. But usually you do a lot of work BEFORE entering the game loop, and you can usually parallelize a lot of that, imho.

    • @thewelder3538
      @thewelder3538 Рік тому

      @@Debaser36 I agree, but that's generally preparation stuff for a gameplay scene or arena and not something you do on a per frame basis. During preparation there's probably loads you can push out to a thread: background loading, tuning, data organising etc. I do this all the time, but once the loop is running, it's nearly all running on a single thread, unless there's something I know I can do safely and QUICKLY in the background without causing hitching or a stutter if it doesn't complete in time. The real parallelizing is done on the GPU where virtually everything is on a thread.

    • @Debaser36
      @Debaser36 Рік тому +1

      @@thewelder3538 yes of course. Seems we are on the same page :)

  • @Energy0124HK
    @Energy0124HK 2 роки тому

    What is that Ref class he used in the video? At 11:57, the std::vector of Ref objects. Did he explain it elsewhere?

  • @xxXSashaBardXxx
    @xxXSashaBardXxx 4 роки тому +3

    perfect, as always! Thank you, Cherno.

  • @ShivamJha00
    @ShivamJha00 4 роки тому

    A 23min C++ video from Cherno. Now that's what I call a hell of a day

  • @AntoniGawlikowski
    @AntoniGawlikowski 4 роки тому

    Best episode I've watched so far - real world example and not some artificial sandbox makes much more sense for me :)

  • @dirtInfestor
    @dirtInfestor 4 роки тому +26

    If you want to use a reference instead of a pointer at 18:20 you have to use std::ref.
    Also is there any reason to use async instead of std::thread here? Besides showing us std::async of course

    • @lengors1674
      @lengors1674 4 роки тому

      I think it's because with async you can easily switch between multithreaded and single-threaded execution. With async you can pass a function with a return value, which you can't do with thread; so if you have a function with a return value that you want to run in parallel, with std::thread you'd have to convert it to a function with no return value and use some other mechanism to share data between that thread and the calling thread.

    • @dirtInfestor
      @dirtInfestor 4 роки тому +1

      @@lengors1674 Im aware of that, but since the futures here are not used, there seems to be no reason to use async over thread

    • @lengors1674
      @lengors1674 4 роки тому +1

      @@dirtInfestor yeah, you're right I misread your question

    • @foxvulpes8245
      @foxvulpes8245 4 роки тому

      @@dirtInfestor I believe the way he is showing it makes it easier if you later want return values. The real question is: what's the performance difference between thread and async?
      For a journeyman coder like myself, the async method was more useful.

    • @shadowassasinsniper
      @shadowassasinsniper 4 роки тому +2

      std::async is typically pooled and will be faster to kick off than std::thread (which forces a new thread to be created)

  • @BenDol90
    @BenDol90 Рік тому

    Legend has it that he's still working on the mutex video

  • @coolumar335
    @coolumar335 4 роки тому

    Mr. Chernikov, you are one of the best software/ game developers as well as teachers in existence right now. Thank you and kudos!

  • @shushens
    @shushens 3 роки тому

    C++ threads by design take their arguments by value. Trying to pass something by reference will still copy it, so either a pointer or std::ref is the way to go :)

  • @AlienFreak69
    @AlienFreak69 4 роки тому +5

    Will you do a video on how to use the GPU for processing stuff?
    I'm talking about using the GPU for moving thousands of objects on the screen with each one having its own collision data. That kinda stuff is usually too taxing for the CPU.

  • @arthopacini
    @arthopacini 4 роки тому +9

    You don't need to pass a pointer to the std::vector; you can actually pass a reference, but in the std::async call you need to wrap it with std::ref,
    so the async call would be:
    std::async(std::launch::async, LoadMesh, std::ref(m_Meshes), file);

    • @SauvikRoy
      @SauvikRoy 4 роки тому +1

      It follows the std::bind semantics: it creates a copy of whatever you pass to it, unless you wrap it with std::ref.

  • @MsAskop
    @MsAskop 4 роки тому

    You talk a lot about performance and I really like your videos.
    Can you please make a video talking about data oriented design?
    Some people say that is the best approach to write faster code.
    I love you so much. Great job with this channel, cherno! ♥️

  • @dimts2701
    @dimts2701 4 роки тому +2

    Thank you for the video Cherno! Could you please make a video explaining all the different types of parallel computation (openmp, mpi, std::async...)?

  • @GrantSchulte
    @GrantSchulte 4 роки тому +4

    Is your Mesh::Load(...) function loading the meshes to the GPU or just storing vertex data in RAM? How are you able to load the all the models at the same time without messing up the OpenGL states?

  • @eddyecko94
    @eddyecko94 4 роки тому

    Thanks for this video! You’ve helped a lot. I’ve never considered financially supporting anyone but you might be the 1st.

  • @ccflan
    @ccflan 4 роки тому

    Each time i see one of your videos i learn something new, thank you

  • @lucasgasparino6141
    @lucasgasparino6141 3 роки тому

    OpenACC is also an option, as it allows use of GPUs without losing execution generality (if no GPU is detected it behaves like OpenMP). Or just go ballistic and use CUDA/OpenCL :)

    • @sebastiangudino9377
      @sebastiangudino9377 Рік тому

      Why though? GPU parallelism is not a magic solution that makes everything faster just because. There is nothing wrong with loading models on the CPU; actually, it's preferable, since in a game or other real-time rendering situation the GPU is likely already busy rendering frames, as it is supposed to (and it uses parallelism to do so anyway). Using CUDA to load some models at runtime is beyond overkill.

  • @broccoloodle
    @broccoloodle 4 роки тому +5

    Is it more readable to use lambda to pass in `std::async` instead of writing a whole new `static void` function?

    • @TheCherno
      @TheCherno  4 роки тому +9

      It might be more convenient to use a lambda, but I definitely wouldn't call it more ~readable~. The end result is the same but with an actual function we get to name it (hence more readability), as well as keep the scope of our std::async function less polluted. In practice, I personally use both; it depends.
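
      For readers following along, both forms side by side; LoadMesh and the vector here are placeholders, not the video's actual code:

      #include <future>
      #include <string>
      #include <vector>

      static void LoadMesh(std::vector<std::string>* meshes, const std::string& file)
      {
          meshes->push_back("loaded:" + file); // the real code would lock a mutex here
      }

      int main()
      {
          std::vector<std::string> meshes;
          const std::string file = "truck.fbx";

          // Named function: the intent is readable at the call site.
          auto a = std::async(std::launch::async, LoadMesh, &meshes, file);
          a.get();

          // Lambda: convenient for one-off work; it captures what it needs.
          auto b = std::async(std::launch::async, [&meshes, file] { meshes.push_back("loaded:" + file); });
          b.get();
      }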

  • @kylefillingim9658
    @kylefillingim9658 4 роки тому +2

    I like the video. I usually use c# myself, and have utilized the parallel for and parallel foreach loops numerous times as well as the thread safe data structures. Is there any chance you could do a video on utilizing the gpu to speed up math operations, say matrix multiplication? You are awesome. Keep making these videos

  • @SriNiVi
    @SriNiVi 4 роки тому

    Amazingly precise presentation. Great job. Will be looking for your videos more.

  • @Knittely
    @Knittely 4 роки тому +4

    While I like how you explain stuff here, I have to say your speed-up is somewhat exaggerated. First, note that the hard drive is normally the bottleneck for loading. Loading from different areas of the disk at the same time can even be worse, because the reads are no longer sequential, which is what spinning hard drives like best (for an SSD it matters much less). So if you actually read different files from different locations, it could be slower than just reading each file in one chunk.
    The drive also has a page/cache system that buffers data around the location you're reading. By reading the same file every time, you get the benefit of an unusually fast "hard drive". What this really means is that parsing the files takes long compared to loading them from disk (which would normally be the slow part), and since the parsing can be parallelized, that's where the huge speed-up comes from.
    My suggestion for loading meshes from disk is to have one thread that just reads the raw files and hands them to worker threads to parse.
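
    A rough sketch of that one-reader/many-parsers layout; everything here (the fake disk read, the skipped parse step, all names) is a placeholder:

    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>
    #include <vector>

    struct RawFile { std::string path, bytes; };

    std::queue<RawFile> g_Queue;
    std::mutex g_QueueMutex;
    std::condition_variable g_QueueCV;
    bool g_DoneReading = false;

    void ReaderThread(const std::vector<std::string>& paths)
    {
        for (const auto& path : paths)
        {
            RawFile raw{path, "...file contents..."}; // stand-in for the actual disk read
            {
                std::lock_guard<std::mutex> lock(g_QueueMutex);
                g_Queue.push(std::move(raw));
            }
            g_QueueCV.notify_one();
        }
        {
            std::lock_guard<std::mutex> lock(g_QueueMutex);
            g_DoneReading = true;
        }
        g_QueueCV.notify_all();
    }

    void ParserThread()
    {
        while (true)
        {
            std::unique_lock<std::mutex> lock(g_QueueMutex);
            g_QueueCV.wait(lock, [] { return !g_Queue.empty() || g_DoneReading; });
            if (g_Queue.empty())
                return;                         // nothing left and the reader is finished
            RawFile raw = std::move(g_Queue.front());
            g_Queue.pop();
            lock.unlock();
            // ...parse raw.bytes into a mesh here (CPU-bound, safe to do in parallel)...
        }
    }

    int main()
    {
        const std::vector<std::string> paths = {"truck0.fbx", "truck1.fbx", "truck2.fbx"};
        std::thread reader(ReaderThread, std::cref(paths));
        std::vector<std::thread> parsers;
        for (unsigned i = 0; i < 2; ++i)
            parsers.emplace_back(ParserThread);
        reader.join();
        for (auto& p : parsers)
            p.join();
    }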

    • @Borgilian
      @Borgilian 3 роки тому

      Could even go further and memory map the file before chunking it. The OS might've cached the file in main memory and that will make the access extremely fast (no need to load it from disk). If it's not cached, at worst it will be just as slow as a typical disk read (the first time). Also, multiple meshes could load the same texture multiple times (in case of a multi-mesh model), so it could benefit from a caching storage system (load unique textures only, and retrieve an index to the texture already in storage if another mesh requires it).

  • @alexhirsch889
    @alexhirsch889 4 роки тому +2

    Have you considered using (or talking about in your upcoming video) a singly-linked list? Herb Sutter has a great talk ("Lock-Free Programming (or, Juggling Razor Blades)") in which he explains a multithreaded slist. Really cool stuff, and since an slist would be very powerful in the mesh example (no lock would ever be required), you should give it a thought.

  • @jeffwonk2024
    @jeffwonk2024 3 роки тому +1

    Excellent explanation! Top quality.

  • @m96fa40
    @m96fa40 5 місяців тому

    6:30 into the video:
    "Let's dive into this and take a look into some code..."
    YAY!.. :D

  • @ChrisM541
    @ChrisM541 8 місяців тому

    13:29 The current thread being utilised will have one/more unique ID's - is it possible to have a unique ID sequentially allocated and use that as an index into your own custom vector table? If yes, that will do away with any requirement for mutex/locking.

  • @xxdeadmonkxx
    @xxdeadmonkxx 4 роки тому +1

    I believe you have to use std::ref to pass reference arguments to std::async

  • @maniaharshil
    @maniaharshil 8 місяців тому

    Can you please cover the RAII ?🤗 by the way thank you cherno , sharing your journey with us and being our go to guide in cpp world !!

  • @ccfrodo
    @ccfrodo 4 роки тому

    Is there a particular reason you rolled your own loop instead of using std::transform with a parallel execution policy?
    std::vector<Ref<Mesh>> output(filenames.size());
    std::transform(std::execution::par_unseq, begin(filenames), end(filenames), begin(output), [](const std::string& path) -> Ref<Mesh> { return Mesh::Load(path); });

  • @maltr2447
    @maltr2447 4 роки тому

    Never really understood asynchronous programming, but that really helped, thank you!

  • @alexneudatchin2161
    @alexneudatchin2161 6 місяців тому

    It's not a bad solution, but you're doing the whole thing under one lock. In DragonFly BSD, say, you have thread-safe linked lists that only lock on append/remove operations, i.e. only where the pointer reassignment takes place.

  • @chris_burrows
    @chris_burrows 4 роки тому +3

    Does this work out the box for loading textures? or do you have to do some locking and juggling to prevent one texture calling glBindTexture() while the previous is still bound and loading?

    • @TheCherno
      @TheCherno  4 роки тому +2

      OpenGL can't really* be multi-threaded, so no.

    • @MarkusLobedann
      @MarkusLobedann 4 роки тому +2

      And what about when using direct state access?

  • @DaveChurchill
    @DaveChurchill 4 роки тому +1

    No need to declare the mutex in the global scope if it's going to be static. Put the declaration inside the function!

    • @32gigs96
      @32gigs96 4 роки тому

      Dave Churchill he needs to see this!

    • @TheCherno
      @TheCherno  4 роки тому

      Except it's going to be needed when reading from the vector for rendering.

    • @DaveChurchill
      @DaveChurchill 4 роки тому

      @@TheCherno Ah, good point! I wasn't aware of that part of the architecture.

  • @shifbar
    @shifbar 3 роки тому

    Hi Cherno, I love your videos about C++. They are very clear, helpful and nice ;-) The only request I have is to turn down the volume of the background music, because it's hard for me to stay focused with the music :-(

  • @gushleb
    @gushleb 3 роки тому +2

    Great videos; I've been improving my code regularly just by watching them. One thing in this one didn't make sense to me though: I don't think owning a mutex locks access to a resource (the meshes vector). Rather, it forbids any other code that needs the same mutex from executing while it's locked.

    • @tatianabasileus
      @tatianabasileus 3 роки тому

      Yeah, which is why the mutex needs to be acquired and released on the resource-provider side.

  • @therealgunny
    @therealgunny 4 роки тому

    A resource system is probably the easiest part; what I'm really interested in is seeing how you can multithread a rendering pipeline.
    BTW, references are not real objects, so they cannot be copied or reassigned, and they are constant by default, so they have to be initialized at declaration. Pointers are real objects: they can be copied and assigned, they are not constant by default, so you can rebind them at runtime, and they can be null!

  • @markusdd5
    @markusdd5 4 роки тому +1

    What is the reason for not using OpenMP to parallelize the for loop? Much more concise, easier to use I think.

  • @maxmustermann3938
    @maxmustermann3938 10 місяців тому

    I wonder what's happening inside of your Mesh class - are you creating the GPU buffers outside of it at a later point? Since when you're doing that multithreaded, you would need to do some extra work in whatever graphics API you are using, which might get complicated when you aren't really in control of the threads as they are part of some threadpool created by the standard library.

  • @xyzwio
    @xyzwio 4 роки тому +1

    You added a mutex to the threads pushing data into your meshes vector, but other threads can still read from the vector while it's being updated, which can eventually crash.

    • @egor.okhterov
      @egor.okhterov 4 роки тому

      Yes, it's better not to mutate global state. We can construct an immutable vector after everything has loaded.
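
      One common way to protect the readers as well is a readers-writer lock; a small sketch using std::shared_mutex (C++17) with placeholder names:

      #include <mutex>
      #include <shared_mutex>
      #include <string>
      #include <vector>

      struct Mesh { std::string source; };

      std::vector<Mesh> g_Meshes;
      std::shared_mutex g_MeshesMutex;

      void AddMesh(Mesh mesh)
      {
          std::unique_lock lock(g_MeshesMutex);   // exclusive: blocks readers and other writers
          g_Meshes.push_back(std::move(mesh));
      }

      std::size_t MeshCount()
      {
          std::shared_lock lock(g_MeshesMutex);   // shared: many readers may hold this at once
          return g_Meshes.size();
      }

      int main()
      {
          AddMesh(Mesh{"truck.fbx"});
          return MeshCount() == 1 ? 0 : 1;
      }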

  • @vadiks20032
    @vadiks20032 2 роки тому

    me after watching cherno's video: wow whatr a nice video! didn't know it's all so simple!
    commentors: *doom theme*

  • @bhaskart488
    @bhaskart488 4 роки тому

    I hope you reach 500K by the end of this year...
    maybe million:)

  • @TopConductor
    @TopConductor 3 роки тому

    Isn't it better not to copy the mesh into meshes with push_back, but rather do push_back(std::move(mesh))? Or am I wrong, and copy elision rules this out and the move overload of push_back is called automatically?

  • @stenhealey7320
    @stenhealey7320 11 місяців тому

    Hi, threads aren't a guaranteed method of achieving multi-processing (only multi-threading). std::async doesn't parallelize unless you have the right hardware and operating system. How do you guarantee this method works cross-platform?

  • @changthunderwang7543
    @changthunderwang7543 4 роки тому

    You are such a good teacher dude

  • @Zorgosto
    @Zorgosto 3 роки тому +1

    Can't you use the std::for_each method in c++ together with the parallel_policy in std::execution to get a parallel for loop?

  • @arthopacini
    @arthopacini 4 роки тому +4

    Nice video, as always!
    One question though: instead of LoadMesh receiving a pointer to the meshes, could the async call pass them with std::ref?
    I know this needs to be done when using std::thread and passing by reference; is that the case here too? I've never used std::async, to be honest.

  • @littlebearandchicken2077
    @littlebearandchicken2077 4 роки тому

    Can't wait for a The Cherno Rust channel

  • @PolarBearOC
    @PolarBearOC 4 роки тому +2

    When you used a reference there I wanted to see if it would work out for you, and I'd have been surprised if it did. This isn't supposed to work with a reference; I think you mixed up C# with C++ when you wrote it, since a C# ref argument works similarly to a pointer argument in C++.

  • @flaminghollows3068
    @flaminghollows3068 3 роки тому +1

    This video:
    CyberPunk 2077: YES!!

  • @skrya1248
    @skrya1248 4 роки тому

    Hostinger is the best! No. 1 support :D

  • @pc5207
    @pc5207 4 роки тому +2

    That was cool,keep going on !!!

  • @secrus
    @secrus 4 роки тому +1

    Two questions.
    1. Have you tried CLion by JetBrains? Any opinion?
    2. Is there any point in using .h files instead of .hpp? With modern C++ we can't really include them from C programs (and the only reason I can see for using .h is sharing code between C and C++), so what's the point?

  • @user-jd1zx
    @user-jd1zx 4 роки тому

    this channel deserve much more subs

  • @sunthlower4812
    @sunthlower4812 4 роки тому +1

    This was really cool! Thank you man :D

  • @robertmoats1890
    @robertmoats1890 Рік тому

    I think you could have used a reference to m_Meshes (rather than a pointer) by using std::ref(m_Meshes). Don't ask me what this function does. I guess it uses magic to make the reference copyable.

  • @davidm.johnston8994
    @davidm.johnston8994 4 роки тому

    Great video, man, liked the practical exemple, even though I'm new.

  • @antonfernando8409
    @antonfernando8409 2 роки тому

    Does std::async work on Linux? Sorry, I thought you needed pthread_mutex lock/unlock for mutual exclusion on Linux, or are those OS-level lock APIs wrapped inside std::mutex/std::async?

  • @clarkd1955
    @clarkd1955 4 роки тому

    If you try to speed up your disk loading by using multiple threads, you will read the disk more slowly. Switching disk access from one thread to another means you move the disk head more, and that is very slow. These comments are just nonsense.

  • @marcpanther7924
    @marcpanther7924 4 роки тому +2

    Does std::async handle thread priorities and stuff? I know priorities are irrelevant for this example video, but i'm just trying to draw parallels with the Windows API that I'm used to.

    • @32gigs96
      @32gigs96 4 роки тому +2

      Marc Panther no that’s no how async works I’m pretty sure.

  • @ultimatesoup
    @ultimatesoup 11 місяців тому

    You can get even better performance boost by combining this stuff with pmr

  • @erwinschrodinger2320
    @erwinschrodinger2320 3 роки тому +1

    Great as always!

  • @Bereseker
    @Bereseker 9 днів тому

    5:34 Ohhh , like Loading assets!

  • @paulmichaud3230
    @paulmichaud3230 3 роки тому +1

    no instance of overloaded function "async" matches the argument list.

  • @ilyboc
    @ilyboc 4 роки тому

    that moment when you knew the trucks didnt appear because you used ref instead of pointer is just something I wouldnt have thought of right away like you did :/

  • @arthopacini
    @arthopacini 4 роки тому +5

    One more question: does std::async use N threads on my system, N being std::thread::hardware_concurrency() (all the hardware threads on the CPU)? Is it possible to explicitly tell the function the maximum number of threads to use? Thanks!

    • @johnadams7843
      @johnadams7843 4 роки тому +2

      I have the same question. I'm sure it can be done by declaring a variable numThreads and writing the code to only launch that many threads at a time, though I'm not sure how practical that is with the variety of processors on the market.

    • @mateuszabramek7015
      @mateuszabramek7015 4 роки тому +1

      On Linux there is a file limiting the maximum number of threads per process; if I remember correctly it's set to 2k. On Windows I made a test opening threads in a loop and gave up at 4k.

    • @manni2
      @manni2 4 роки тому +1

      Up to stdlib implementation. MSVC uses a Thread pool, dispatching work to those threads in turn. GCC will just create as many threads as you create async tasks. On GCC, it can thus be reasonable to limit the rate at which threads are created by some outside mechanism.

    • @Ariccio123
      @Ariccio123 4 роки тому

      @@manni2 It *used* to. I've used std::async in open-source code (a big GitHub project, altWinDirStat), and when I used it for the primary parallelism in 2013-2015 it worked like that, and all was great. However, the standard actually requires a new thread on each invocation, and MSVC changed to that for VS2015. Now it launches threads with no limit, and it's pretty darn easy to hang the application because of it.
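
      Whichever behaviour your standard library has, a portable way to keep the thread count bounded is to chunk the work yourself; a sketch with placeholder names:

      #include <algorithm>
      #include <future>
      #include <string>
      #include <thread>
      #include <vector>

      struct Mesh { std::string source; };
      static Mesh LoadMesh(const std::string& path) { return Mesh{path}; }

      int main()
      {
          const std::vector<std::string> files(1000, "truck.fbx");
          const std::size_t workers = std::max(1u, std::thread::hardware_concurrency());
          const std::size_t chunk = (files.size() + workers - 1) / workers; // ceiling division

          std::vector<std::future<std::vector<Mesh>>> futures;
          for (std::size_t begin = 0; begin < files.size(); begin += chunk)
          {
              const std::size_t end = std::min(begin + chunk, files.size());
              futures.push_back(std::async(std::launch::async, [&files, begin, end]
              {
                  std::vector<Mesh> loaded;
                  loaded.reserve(end - begin);
                  for (std::size_t i = begin; i < end; ++i)
                      loaded.push_back(LoadMesh(files[i]));
                  return loaded;                 // one thread per chunk, not per file
              }));
          }

          std::vector<Mesh> meshes;
          meshes.reserve(files.size());
          for (auto& f : futures)
              for (auto& m : f.get())
                  meshes.push_back(std::move(m));
      }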

  • @bassbatterer
    @bassbatterer 4 роки тому

    At 15:00, shouldn't you then put a try/catch around the push_back, with the catch unlocking the mutex? Otherwise, if the thread throws before completing, the resource stays locked.
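
    For what it's worth, this is exactly the problem RAII lock types solve: std::lock_guard releases the mutex in its destructor even if push_back throws, so no manual try/catch is needed. A minimal sketch with placeholder names:

    #include <exception>
    #include <mutex>
    #include <string>
    #include <vector>

    std::mutex g_MeshesMutex;
    std::vector<std::string> g_Meshes;

    void AddMesh(const std::string& mesh)
    {
        std::lock_guard<std::mutex> lock(g_MeshesMutex); // locks here
        g_Meshes.push_back(mesh);                        // may throw (e.g. std::bad_alloc)
    }                                                    // unlocks here, throw or not

    int main()
    {
        try { AddMesh("truck.fbx"); }
        catch (const std::exception&) { /* the mutex is already unlocked at this point */ }
    }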

  • @ytwhiletrue
    @ytwhiletrue 8 місяців тому

    hyperthreading should be able to offer a parallel for loop for you.. in c++..