C++ Tutorial: How to use CRTP to speed up your code

Поділитися
Вставка
  • Опубліковано 7 вер 2024

КОМЕНТАРІ • 51

  • @marcusmors8485
    @marcusmors8485 Рік тому +4

    Sick improvement. Can't wait to see my professor's face after applying this. Thank you so much.

  • @Paul-fn2wb
    @Paul-fn2wb 2 роки тому +7

    Hi Zen, great video! I searched for "crtp" in youtube and found this one. Thank you for a clear explanation of the topic. The benchmark times surprised me a lot.

    • @ZenSepiol
      @ZenSepiol  2 роки тому +2

      Thanks a lot. In theory its only the vtable call that gets eliminated.
      However due to that, most modern compilers are able to very aggressively optimize resulting in a much bigger speed up.

  • @Bobbias
    @Bobbias 2 роки тому +3

    I've come across crtp before but never really understood its value. Now I sure as hell do. That performance improvement is absolutely wild, especially for such a simple example.

    • @ZenSepiol
      @ZenSepiol  2 роки тому +1

      Glad it was helpful! Modern compilers are amazing at optimizing your code, but you need to provide them with the right options. CRTP is one of those options.

    • @Dziaji
      @Dziaji 6 місяців тому

      This doesn't actually help. All your classes are now derived from different base types, so it completely negates the purpose of virtual functions to begin with because you can't make an array with a mix of different derived type pointers, which is basically the main reason that virtual functions were created.

  • @krzysztofdabrowski5085
    @krzysztofdabrowski5085 Рік тому +1

    Than u very much it allowed me to understand this better!

  • @Gismo359
    @Gismo359 2 роки тому +3

    The new "deducing this" feature might be worth mentioning here, as it can essentially replace most CRTP use cases but with much less code

    • @ZenSepiol
      @ZenSepiol  2 роки тому

      Already looking forward to the new C++23 stuff! "Deducing this" is one of my favorites!

    • @VioletGiraffe
      @VioletGiraffe 2 роки тому

      I don't see how the two are related, can you elaborate?

    • @Gismo359
      @Gismo359 2 роки тому +3

      @@VioletGiraffe Deducing this essentially allows you to inherit Base without passing Derived as a template parameter. Functions with a deducing this are templates, which infer the type of *this*, so in this case *decltype(this)* will be Derived*, which is pretty much what CRTP does. Except that deducing this includes const/volatile and references, so saves having to reimplement the same methods a dozen times each. I suggest you take a quick look at the open-std article - it has some great examples.

    • @VioletGiraffe
      @VioletGiraffe 2 роки тому

      @@Gismo359, interesting, I had no idea this would be valid code. Thanks!

  • @houseisforthesoul
    @houseisforthesoul Рік тому +1

    great explanation

  • @danwu7275
    @danwu7275 2 роки тому +2

    thank you! please make more c++ videos.

  • @Dziaji
    @Dziaji 6 місяців тому

    omg, those eyes! Never seen anything like them.

  • @chankayau
    @chankayau 3 місяці тому

    thanks for the effort

  • @seanmuniz4651
    @seanmuniz4651 2 роки тому +1

    Based and CRTP pilled ❤️

  • @MaitreBart
    @MaitreBart 2 роки тому +3

    Subject well introduced.
    I find crtp quite an interesting application of compile-time polymorphism.
    But from what I can foresee, any possible implementation of the interface needs to be a template class, thus be coded in header files, thus in some way, "public". So I guess for companies that wish to make their libraries implementation hidden, this is a no-go, right?
    Also, I'd would be interested to see a couple of examples where one can use this principle in a multiple inheritance context (e.g. vertical-only and fanned out).

    • @ZenSepiol
      @ZenSepiol  2 роки тому +1

      If you expose the templated classes via your Interface, you are right.
      You could still speed up things internally using CRPT if you explicitly instantiate the templates that you know are getting used. That way the implementation can still be in the cpp file.

    • @MaitreBart
      @MaitreBart 2 роки тому

      Thank you for taking time to reply. Your feedback confirms my understanding: It's the same technique for any library-targetted code involving class-TMP.

  • @peskyjellyfish
    @peskyjellyfish 2 роки тому +1

    Thanks for the clear explanation of CRTP. In your example you add up the loop index in the Count function. Clang knows this specific loop and usually replaces it with the closed sum formula N(N+1)/2. In this case, the CRTP might just help with inlining the Count function, so Clang can optimize the looping. I'm not sure you'd get such a dramatic performance boost just by avoiding the virtual function overhead. If you change the count function to a `counter += n % 10` do you still see increased performance?

    • @ZenSepiol
      @ZenSepiol  2 роки тому

      Great point! The improvement is definitely not just because of the missing vtable call. Looking at the assembly there is much more involved, including inlining, loop unrolling and vectorized instructions.
      The point I was trying to make is that using CRTP allows your compiler to aggressively optimize. Because of the standard this is not possible without normal inheritance.

  • @rvoros
    @rvoros 2 роки тому +1

    Let me challenge your intro to CRTP.
    1) It'd need some explanation how come the template can refer to an undefined type when defining Derived?
    2) RunCRTP has to be templated so Derived must be known by it and also must be known by the caller.
    So then simply using T* obj as arg would do the job without the fuss of CRTP. Thus based on this example, CRTP doesn't offer anything. We need an example where it really shines. This isn't it. (and no others shine either that I have seen)
    I want to show the power of giving a good example. Recursion is a nice idea but when books demonstrate it through the calculation of factorial you see it's pointless because a for loop would do the job much simpler and easier so you don't get it. What's the point? Because the example was not strong enough.
    On the other hand if you introduce the concept of recursion through tree traversing, for example, then it's obvious. No questions asked.

    • @ZenSepiol
      @ZenSepiol  2 роки тому

      Thanks for the challenge, let me try to answer.
      1) In the initial CRTP paper, is was even deemed impossible: "told them that their compiler was broken and that this shouldn’t compile" (sites.google.com/a/gertrudandcope.com/info/Publications/InheritedTemplate.pdf). However, the c++ standard says, that it is valid code, if the base class _structure_ is independent of the derived class.
      2) Did you try using T* obj as arg and compared the runtimes? I would be surprised, if it is equally fast, since usually the compiler is allowed less optimization in that case. You are right that "Derived" needs to be known. This can be circumvented by using type erasure techniques, e.g. introducing an additional abstract base class as common interface. The code becomes more complex, but is equally fast (but much harder to explain).
      Sidenote: A much better example for recursion are all Dynamic Programming based problems. They need recursion to be solved, tree-traversal does not. Tree-traversal can be nicely explained using recursion, but most of the real code soon runs into stack issues. All the code I know for larger databases uses some sort of Morris traversal.

  • @user-tp1re9ro2e
    @user-tp1re9ro2e 2 роки тому +1

    I believe it's a bad idea to write code avoiding normal inheritance in favor of CRTP. In most simple cases compiler can replace dynamic call to a simple static function call. And performance will be exactly the same (I've checked it)
    I do not deny that there are situations where the compiler cannot (cannot yet) get rid of a virtual call, and the method invocation time is comparable to the running time of that method. But it happens very rarely (almost never)

    • @ZenSepiol
      @ZenSepiol  2 роки тому

      As always, it depends. I show the technique. CRTP surely shouldn't be used to replace any kind of inheritance.

  • @jthrush
    @jthrush 2 роки тому +1

    I see a big drawback of this is that (AFAIK) you can’t have an array of interfaces. Each element would need its own template type. You could probably make it work with tuples but i don’t see a way to make it work with arrays.

    • @ZenSepiol
      @ZenSepiol  2 роки тому

      Yes, you are right. Interfaces are not the use case for CRTP. It is often used in frameworks though.

  • @peterfordham3562
    @peterfordham3562 Рік тому

    That's performance improvement can't just be explained by avoiding the virtual dispatch. You should really look at the compiled code in cases like this to see what really happened.

    • @ZenSepiol
      @ZenSepiol  Рік тому

      Yes you are right. Usually this pattern allows the compiler to optimize the code much better and not only get rid of the vtable.

  • @ItsBaffledd
    @ItsBaffledd 9 місяців тому

    Excuse my ignorance but how does this differ from just skipping the CRTP method and just having each child class implement their own "void DoSomething()" that is not virtual?
    You are still defining the child class with RunCRPT so you could also just use the child class directly? This doesnt "Skip" inheritance because you cant have
    Base* b = c; it needs to still define the type like so
    Base* b = child
    So why not just use
    Child* cptr = c;
    The template removes any kind of anonymity to the parent type and just says "This is your exact parent type". I dont see the use case in this example?

    • @ZenSepiol
      @ZenSepiol  6 місяців тому

      You still avoid code duplication.

  • @roykimor
    @roykimor 2 роки тому +1

    if you know the template type you're going to pass to RunCRPT then its just like accepting the generated Base_CRTPImplemented generated compiler name and you can't use it in containers of type interface really

    • @ZenSepiol
      @ZenSepiol  2 роки тому

      Yes this is true. If you want to store the interface in containers you need some kind of type erasure. It‘s not impossible but far more complex.

  • @lucy-pero
    @lucy-pero 2 роки тому

    so static dispatch is faster than dynamic dispatch at runtime. Mind-blowing revelations in this video. Jesus.

    • @ZenSepiol
      @ZenSepiol  2 роки тому

      I also try to explain how to do it and that it enables a lot of compiler optimizations beyond just the dispatch itself.
      Thanks for watching.

  • @yudaegam
    @yudaegam Рік тому

    Aaaaaaawesome

  • @Antagon666
    @Antagon666 2 місяці тому

    Well, but it's really not polymorphic in the true sense of the word. Can't put different objects into vector and iterate over them whilst expecting polymorphism.

  • @Ztaticify
    @Ztaticify 2 роки тому

    Wait, isn't this just forcing the compiler to monomorphize the method implementations? I mean it's pretty cool to see an easy way to handle it in c++, but rust just does that by default. Either way I'll keep this in mind, it's a pretty useful trick

    • @ZenSepiol
      @ZenSepiol  2 роки тому

      You are right.
      Each time the template is instantiated with a different type, the binary will have an additional copy for exactly that type.
      Rust is amazing, I would love it to be more widespread.

  • @RyanLynch1
    @RyanLynch1 Рік тому

    can you do a video about why virtual methods are so slow and when they should be used?

    • @ZenSepiol
      @ZenSepiol  Рік тому

      They are not necessarily slow. But they sometimes prevent the compiler from optimizing the binary.
      Inheritance and virtual methods make a lot of sense for code reuse and maintainability. Sometimes performance is not everything.

  • @dibyojyotibhattacherjee897
    @dibyojyotibhattacherjee897 2 роки тому

    Will there be any c++ project series or advent of code in c++?

  • @JurekOK
    @JurekOK Місяць тому

    why is C++ so complicated?

  • @Dziaji
    @Dziaji 6 місяців тому

    This actually doesn't work. It completely removes the point of a virtual function to begin with. The problem is that all the derived types will now have different base types. Usually, the point of a virtual function is so you can have an array of base type pointers, and by calling the virtual function from the base type pointer, each object "knows" its derived type and calls the corresponding function.
    Using the way he describes in this video, you can't make an array of base types, because each derived type has its own unique base type, so you have to make an array for each derived type, and if you have separate arrays for each type, you might as well make each array with its own derived type, so the concept of a virtual function becomes pointless. Using his example, you might as well make the template function take "(T* obj)" as a parameter instead of "(Base *obj)".
    The CRTP technique CAN be used to speed up virtual functions a bit, but you have to add another non-template base class that your crtp class inherits from so that all the cinstances can be casted to the same type to make an array, and you have to create a function pointer as a member, and in the constructor of the derived class, you set that pointer equal to a static function defined in the derived class. This means that the runtime will not have to do a lookup to figure out which dervied class type to use, the object will have the function pointer as one of its own data members. There are a couple points to take into consideration for this method:
    - It will not be as fast as a non virtual function call, because of the fact that the location of the function is not known at compile time, so you have an indirection where the pointer needs to be accessed before the function is called, but since you are probably about to access other members of that object anyway, the value of the pointer is probably already cached (or was about to be cached) so it won't even add an additional RAM access, so it is super fast compared to doing a lookup for that function pointer by first accessing the virtual method table.
    - if you only have 1 or 2 virtual functions using this method, you will save ram over the standard virtual function scheme, because each object will have 1 extra pointer, and that would be the case with a virtual function as well, but since this method has no virtual function table, you save on RAM. However, each additional pseudo-virtual function you create in this way will add another pointer to every instance, while virtual functions will continue to only add a single pointer to each instance, so it could end up wasting a lot of RAM, so in that case, you are trading RAM for speed.
    - You have to be careful with the default assignment operator of the base class, because setting a base class instance equal to another will copy the pseudo virtual function pointer as well, and if you are setting a base type equal to a derived type casted to a base type, you will crash if you try to call this virtual method. So to deal with that, you need to explictly implement the assignment operator for your base class and set that pointer equal to the base class's version of the function, and to properly copy a dervied class, you have to use the standard "base *Dervied::Copy()" function, and you can use the pseudo-virtual scheme for that too.
    - Your pseudo virtual member functions will now be static functions, so you have to call them with the "this" pointer as the first member like you would have to do with C, so to call them, you have to do obj.PseudoVirtualFn(obj, otherParams), but you can get around this by using an intermediate function that does this for you automatically.

    • @ZenSepiol
      @ZenSepiol  6 місяців тому

      Thanks for the elaborate comment! Interfaces and base class pointers are one of the reasons for inheritance. But there are more reasons, e.g. less code, easier maintenance, clear structure, class extensions and quite some more.

    • @Dziaji
      @Dziaji 6 місяців тому

      @@ZenSepiol There are more reasons for inheritance, but there aren't more reasons for virtual functions. Virtual functions were created entirely to keep track of the inherited type from a base pointer. As you showed, you can completely get around virtual functions if you aren't working with base pointers.