Unity DOTS vs Assembly Benchmark - Which is fastest?

  • Published 30 May 2024
  • Performance comparison between 6 different Unity versions, one of them being DOTS, against a custom assembly version with DirectX 11, plus a discussion of the results and what they mean for game development.
    Check out my game Moose Miners on Steam: store.steampowered.com/app/25...
    Check out my game Sidestep Legends on Steam: store.steampowered.com/app/20...
    Check out my game Dextram on Steam: store.steampowered.com/app/10...
    Want to discuss assembly game dev or anything else related to Lingon Studios and our games? Join our discord: / discord
    Source code for benchmark project: github.com/maskrosen/combat-b...
    Original Unity project: github.com/Unity-Technologies...
    00:00 Introduction
    02:09 The benchmark
    02:48 Testing hardware
    03:12 Code walkthrough
    03:24 Game objects version code
    04:58 Data oriented version code
    07:38 Burst compiled version code
    08:18 Unity DOTS version code
    10:36 Assembly version code
    12:19 The results
    12:56 Frame time results
    16:36 Memory usage results
    17:55 CPU usage results
    18:58 GPU usage results
    19:46 Disk space results
    20:33 Ease of use and productivity discussion
    27:25 Performance by default
    28:34 Conclusion
    #gamedev #indiedev #gamedevelopment #indiegamedev #unity
  • Gaming

COMMENTS • 249

  • @lingonstudios
    @lingonstudios  1 year ago +4

    Wishlist my new game Moose Miners on Steam if you want to support the channel: store.steampowered.com/app/2591410/Moose_Miners/?

  • @Jonathan-di1pb
    @Jonathan-di1pb 11 months ago +114

    From C# Unity straight to asm is crazy, props for powering through that. I would think you could get a similar level of control in C though.

  • @lufog
    @lufog 11 months ago +65

    The strongest argument against assembler is not its poor readability, but its non-portability between different architectures. To port your code to arm, for example, you need to rewrite it from scratch.

    • @lingonstudios
      @lingonstudios  11 months ago +14

      Yeah, if portability is important for a project then assembly is not the right tool for the job

  • @sebito6668
    @sebito6668 11 months ago +52

    What would be really interesting is seeing how a C version compares to the ASM version.
    Since C is portable but still very low level it would be interesting to see how much this added abstraction layer would cost and what may differ in the resulting assembly! :D

    • @lingonstudios
      @lingonstudios  11 months ago +9

      Yeah, I'll probably make a video comparing that at some point

    • @RamsLiff
      @RamsLiff 11 months ago +5

      C compilers are so strong that they are better at producing ASM than almost anyone. If you can see a difference, it probably won't matter.

    • @SnowyPup
      @SnowyPup 11 months ago +1

      @lingonstudios I'd be really interested in seeing a C version -- maybe comparing a multithreaded ASM version and a multithreaded C version would be cool. While ASM is nice, gaming on ARM seems like it's starting to pick up and I feel like re-writing the app for ARM and for x86 separately wouldn't be feasible.
      Additionally, have you considered making an OpenCL/CUDA version? Might not be practical, but then there's no need to transfer data from the CPU to the GPU, if it's all in vram anyways.

    • @alexdesander
      @alexdesander 11 months ago +1

      @RamsLiff Yeah, hand-writing assembly is really only worth it when you look at small chunks of code and try to optimize those.

    • @SomeRandomPiggo
      @SomeRandomPiggo 4 months ago

      @SnowyPup I think the only case where rewriting for ARM would really be necessary is on mobile phones. I think it'll be a reeeeeeally long time before we see ARM being used over x86 where performance matters more than power consumption.

  • @nthnnnnnnnnn
    @nthnnnnnnnnn 11 months ago +4

    I really like the flow of the video and the relevancy of the techniques you've used. Also, it's pretty helpful that it covers the whole spectrum from GameObjects to asm. Thanks for making this.

  • @WhiteBoxDev
    @WhiteBoxDev 1 year ago +64

    Have you considered making assembly tutorials in the context of game development? Your video style would be excellent for that sort of thing.

    • @lingonstudios
      @lingonstudios  1 year ago +32

      I have given it a thought every now and then. It's just that making tutorials takes a lot of time, and I'm not sure enough people are interested in an actual tutorial about assembly game development. I think videos like this work better since people can be interested in how fast things can be with assembly without being interested in working with assembly themselves.

    • @MM-24
      @MM-24 1 year ago +7

      @lingonstudios One thing would be a quick overview of how to do it, even a guide. I could see that being a very, very popular video over time. I think most programmers at one time or another have thought about using assembly. If your tutorial could be a guide ... that could be very popular.

    • @MM-24
      @MM-24 1 year ago +2

      Hate to double post - something like Fireship's 100 sec overview videos (maybe make it 300 sec) would be amazing. And talk about tools and process and where to go for help.

    • @marcsolis4896
      @marcsolis4896 1 year ago

      I would love to see that too (but to be honest, I don't think it will be a very common topic for most game developers, so I'm not sure if it will be worth it from a YouTuber's perspective).

    • @BrentMalice
      @BrentMalice 1 year ago +1

      @lingonstudios I'd never use assembly for anything but I'm watching this out of general curiosity. Did skip a lot of the dry math stuff that I'm too dumb to understand.
      It's neat because this is how it used to have to be done, right? So seeing even a slice in that context is neat.

  • @chosencode5881
    @chosencode5881 1 year ago +5

    Thank you, I really appreciate this. For the scale and complexity of my games I need to rely on Unity for features (such as VFX, animations, UI, terrain and various level creation tools from the asset store). I would love to see more content of you building with DOTS!

    • @lingonstudios
      @lingonstudios  1 year ago +4

      Thanks :) Yeah there are many benefits with Unity you get for no extra work compared to a custom solution where you would have to make them yourself. And with DOTS there is definitely some good performance to get compared to normal Unity.
      I probably won't be building anything else with DOTS for now though. For the project I made this comparison for, the performance of huge amounts of entities is the most important thing and I don't mind making the other things by myself.

  • @vialomur__vialomur5682
    @vialomur__vialomur5682 1 year ago +8

    Just wow! I've written some simple code in assembly and that's not that hard but.. It is amazing that you are capable of writing such code!

  • @UnofficialFoneE
    @UnofficialFoneE 1 year ago +35

    Thanks for taking the time to put together the video! It's a little frustrating whenever a new DOTS performance comparison video comes out because it is very difficult to capture the speed of DOTS without sinking thousands of hours into it. Which is what I see in a lot of these videos.
    For instance, using the MathF library instead of the Unity.Mathematics library -- which is the one built for Burst -- can make those operations incompatible and lead to big optimization misses. Also, since DOTS supports built-in multithreading, developers tend to abuse this, leading to slower code. So, in the future, it would be nice to see a challenge between yourself and someone who has a lot of experience with DOTS!
    Lastly, although this is unrelated to DOTS, Unity's renderers are not the best. Any custom solution will 100% blow them away. And really, including rendering into the final numbers felt a little disingenuous considering the point of the video was a comparison between DOTS and Assembly.
    All in all, I appreciate anyone who takes the time to explore DOTS. And in your case, it might be notable to point out that you have access to x86 SIMD instructions through the Unity.Burst.Intrinsics namespace so you can freely mix assembly SIMD and C# in your DOTS code!

    • @sumofat4994
      @sumofat4994 11 months ago +1

      Dude if you need to be super experienced to get the PERF out of dots it violates their promise of perf by default.

    • @UnofficialFoneE
      @UnofficialFoneE 11 months ago +7

      @sumofat4994 Depends what you mean by "performance." If anything, the video shows that you can get great performance without being very experienced with DOTS. But the benchmark was DOTS versus handwritten assembly... We were not trying to compare performance by default.

    • @sumofat4994
      @sumofat4994 11 months ago

      @UnofficialFoneE I know I'm replying to ur comment

    • @UnofficialFoneE
      @UnofficialFoneE 11 months ago

      @sumofat4994 I am curious about your thoughts on reaching the "performance by default" promise. In my experience, it seems like DOTS achieves this. My comment was not that DOTS fails to give you performance by default, but that, in order to get the maximum performance out of it, like any tool, you need to be experienced with it. Which is important to the video because the goal is really about exploring the peak capabilities of both approaches.

    • @delphicdescant
      @delphicdescant 11 months ago

      So spend thousands of hours getting DOTS to do what it claims to do, or spend a similar amount of time (or less) just writing your own engine code for a better result.
      Yeah, every time I chance upon a video about Unity, I remain glad that I didn't go down the unity path when I got into game dev.
      ECS is dead simple. Lots of people used it in their custom engines before unity decided to pretend they invented it.
      The only question left is: How did Unity manage to botch such a simple concept so badly that it takes that much effort to use it properly?

  • @lingonstudios
    @lingonstudios  11 months ago +44

    I made a follow up video testing some improved versions of the Unity DOTS version made by some viewers with more experience in DOTS. I also made a multi-threaded assembly version to test them against. Check it out here: ua-cam.com/video/82XkA2r8HNQ/v-deo.html

  • @jokersterritory
    @jokersterritory 11 months ago

    Super cool video and I would love to see a comparison between the Assembly and C/C++ versions!

  • @Gapil
    @Gapil 1 year ago +2

    Looking forward to a follow-up video with upgrades based on the feedback received on the Reddit thread! :)

    • @lingonstudios
      @lingonstudios  1 year ago +1

      Yeah. Two new versions of the DOTS version have been made by people from the Reddit thread, both of which improved the performance. So I'll probably make a follow-up video showcasing the fastest of those versions. "Spoiler alert", it still was not as fast as the assembly version ;)

  • @simoncowell1029
    @simoncowell1029 11 months ago

    Cool video! Yes please, I would like to see a showdown between Assembly, C and C++!

  • @AlexanderBukh
    @AlexanderBukh 11 months ago

    awesome experiment and demonstration!

  • @synchaoz
    @synchaoz 11 months ago +1

    I'm always amazed and fascinated about performance comparisons like these. What irks me is that I don't comprehend it well enough to understand how to actually make a game out of it that isn't just simple quads moving around.

  • @joakimgille9677
    @joakimgille9677 8 months ago

    Cool video. /Edit: Asked about the multi-threaded sync-points in DOTS and its effect on the result; but i was just recommended a follow-up vid you did that addressed it. :)

  • @GMDeatHSoul
    @GMDeatHSoul 11 months ago +1

    Very cool video, thanks!

  • @Ketpain
    @Ketpain 1 year ago

    You sir, are a wizard in my eyes. Hope one day I can be there.

  • @voidbinary
    @voidbinary 11 months ago +2

    Chris Sawyer would be proud.
    Always wild to see how much overhead is introduced with other languages and how efficiently the same idea can be done in assembly while taking significantly less disk space.
    I personally miss the old game dev times, in which efficiency was both a must and a by-product of not having go-to engines like UE, Unity etc.
    Really shows how much is possible in a fraction of the size if done at a low level.

  • @myelinsheathxd
    @myelinsheathxd 1 year ago

    Amazing review thx for sharing

  • @gryasl6231
    @gryasl6231 1 year ago +5

    wow, the subject of the video is just so interesting

  • @mlecz
    @mlecz 1 year ago +29

    I would like to see how the multi-threaded assembly version would perform. The problem is that multithreading can impose a considerable overhead, which we recover only after adding a few threads.
    In many cases, the multithreaded solution to a problem can be significantly slower than the original single-threaded one even at 2-4 threads, but things begin to improve significantly as the number of threads increases.

    • @lingonstudios
      @lingonstudios  1 year ago +39

      I'm glad to let you know that I actually just completed a multi-threaded assembly version. It did scale quite well actually, so well that it ended up being bottlenecked by the GPU. I'll make a follow-up video about it together with some measurements from some improved versions of the DOTS version made by people more experienced with Unity DOTS. So keep an eye out for that.

    • @mlecz
      @mlecz 1 year ago +5

      @lingonstudios Wow nice, I'm looking forward to the video about it

    • @ross9263
      @ross9263 11 months ago

      A well designed job system should not cause that much overhead at all. What are you talking about? Not if locks, context switching, deadlocks, etc. are properly designed around.

  • @ozanyasindogan
    @ozanyasindogan 1 year ago +1

    Wow, that's impressive! You really made a big benchmark project, thank you for your effort and for sharing this. I didn't even know that you could link Assembly or C binaries with Unity, that's a big finding for me. Although that will work only on Windows and x86-based CPUs as it is now, I mean the Assembly code was crafted that way, that's a great way of optimizing. I've seen Zig compiler results on Dave's Garage. I wonder if Zig could produce even more optimized code for this task and be linked to Unity. Respect! :)

    • @lingonstudios
      @lingonstudios  1 year ago +1

      I think you misunderstood the assembly version. The assembly version is completely standalone and custom built linked directly with DirectX 11. So there is no connection between it and Unity. I just compared it against Unity DOTS since I wanted to see how it would stand against a pure assembly version.

  • @Yamyatos
    @Yamyatos 10 months ago +2

    While initially surprised, I guess it makes sense. The thing is DOTS has to work with everything, while any custom solution (especially to rather small problems such as this) can take even more shortcuts. I wouldn't have thought the difference would be this drastic. Still, I think when complexity scales up, it would be less of a difference. I also wouldn't be keen on making an actual game in assembly lol.

  • @cunikmaxiumus755
    @cunikmaxiumus755 1 year ago +4

    Great work, interesting comparison. One thing I miss is just doing multi-threading and seeing the difference between DOTS and multi-threaded C#, since DOTS is the only one with multi-threading in there. It would be nice to see how much all the DOTS work that has been done actually gives you back in performance.
    For game objects you could extrapolate logic to thread tasks using structs or just C# classes, and just move game objects into the correct positions and rotations when the tasks are done so Unity renders them. And the non-game-object version is even simpler, I suppose..

  • @empireempire3545
    @empireempire3545 1 year ago +3

    Big yes for c vs c++ vs assembly showdown :D

    • @lingonstudios
      @lingonstudios  1 year ago

      Yeah, many have said they want to see it and I am curious about it myself as well, so I guess I have to make that now at some point :)

  • @retticle
    @retticle 1 year ago +11

    Some pretty awesome info! I'm surprised Unity ECS has so much memory overhead. I would love to see a Bevy version as well.

    • @lingonstudios
      @lingonstudios  11 months ago +1

      Thanks! A lot of people have been asking for a bevy version, so maybe I'll have a look at it for a future video

  • @me_owe_ski
    @me_owe_ski 11 months ago

    Amazing video. One thing to add would be using a custom renderer for Entities that uses indirect instancing. It could help optimize the scene a bit further.

  • @mehmet2247
    @mehmet2247 11 months ago

    Awesome comparison

  • @nihil75
    @nihil75 1 year ago +7

    Great vid!! I had a similar experience with DOTS.. It's pretty good, but not as good as expected.. especially when I added physics to it.. I would still use it though :)

    • @lingonstudios
      @lingonstudios  1 year ago +4

      Yeah, it did give a significant performance boost compared to the other Unity versions. So for anyone using Unity, it is definitely worth it.

  • @Kevroa1
    @Kevroa1 11 months ago

    Fantastic video

  • @timothy0098
    @timothy0098 2 months ago

    I must say, I really think it is amazing what the DOTS team has managed to gain in performance with Unity. That it is so close to assembly in performance is insane, as you need to remember this is cross-platform code. Just incredible. What they have achieved is really next level.

  • @not_herobrine3752
    @not_herobrine3752 11 months ago +1

    now this is what i call real programming; subscribed

  • @ruslansmirnov9006
    @ruslansmirnov9006 7 months ago

    brilliant benchmark

  • @DonC876
    @DonC876 11 months ago +2

    Would be really cool if someone or a group of people with more experience of developing high-performance Unity apps with DOTS would try to improve the code and then let you bench again. Not using the Mathematics library is a big one I think, and in general I would be really, really interested in a follow-up and in finding out how much there's still to gain on these DOTS performance numbers. Great video

    • @lingonstudios
      @lingonstudios  11 months ago +4

      A few people took a look at it and made improved versions of the DOTS version so a follow-up video about that is coming. A lot of people have kept repeating that the mathf library ruins performance, but I looked at the assembly generated from it and it was completely reasonable assembly, no function calls or anything that would ruin the performance. And not surprisingly, when they replaced the mathf calls with Unity.Mathematics there was no difference at all in performance

    • @DonC876
      @DonC876 11 months ago

      @lingonstudios I thought the point of Mathematics was to enable the use of SIMD instructions, but I mostly work with shaders and am not very well versed in optimizing CPU-side code. I am excited to see the follow-up video though :)

  • @yannmassard3970
    @yannmassard3970 10 months ago

    Didn't imagine I would see pure ASM coding in game dev since the end of the Amiga. Thanks

  • @pablomirandaandrade3712
    @pablomirandaandrade3712 1 year ago

    Great video!

  • @PirateDave83
    @PirateDave83 11 months ago

    Great job LS!! You're one of the best programmers on YouTube, but for this project I think you are trying to shoot a fly with a bazooka when you need a fly swatter. For this I suggest using C or C++ instead. They are optimized to machine level and the code is not impossible to understand (even for those who follow you). For the rest I congratulate you. I think it won't be long before some big software house contacts you (if they haven't already...)!!

  • @hosseinanisi224
    @hosseinanisi224 10 months ago

    You should definitely compare the performance with unreal c++, btw great video

  • @SoaringSimulator
    @SoaringSimulator 11 months ago

    There is a Unity tech called Project Tiny. It is a set of workflow features and a specialized build pipeline that allows you to create small, lightweight games.

  • @godDIEmanLIVE
    @godDIEmanLIVE 11 months ago

    That was a monumental effort. Really cool. To be fair, DOTS already is a fantastic improvement and will benefit a lot of projects out of the box. I wonder if you could write DOTS for most of your stuff and ASM for critical parts?

    • @lingonstudios
      @lingonstudios  11 months ago +3

      I don't think it would help to bring assembly into the DOTS version, since the actual assembly code in the DOTS version looks to be good. I think it is the memory model in DOTS that ends up being the bottleneck in this benchmark. So having a very large amount of simple entities seems to come with a significant overhead in DOTS compared to a custom solution.

  • @mrcrackerist
    @mrcrackerist 11 months ago +1

    Currently trying to write a game engine in C and Vulkan and this blows my mind :)

    • @lingonstudios
      @lingonstudios  11 months ago +1

      I haven't dived properly into the "new" graphics APIs yet, but from what I've seen using Vulkan/DX12 compared to OpenGL/DX11 seems to be about the same amount of extra work as using assembly compared to using C. Maybe I'll go all the way and use DX12 with assembly at some point in the future if DX11 ends up being too much of a bottleneck

    • @mrcrackerist
      @mrcrackerist 11 months ago

      @lingonstudios I am a bit more familiar with C than Assembly so I went for it, but there seems to be a lot of repetition in Assembly so I was thinking of using C with Assembly modules.
      Anyway, Vulkan is fun but has a lot of boilerplate. I haven't touched DX11 or DX12 though.

  • @benceszeplaki7712
    @benceszeplaki7712 7 months ago

    Great video! Not really sure why you picked ASM over C, but I guess it can be just preference. Since you are benchmarking data oriented ECS approaches here, it would be nice to see a Bevy implementation as well.

  • @developerdeveloper67
    @developerdeveloper67 11 months ago +1

    You are a f-ing genius, man. Let's go! ASM games for the win! 💪

  • @nah82201
    @nah82201 1 year ago +1

    Great video! Thanks for sharing.
    Do you have the graphs posted anywhere? Really hard to read on YouTube without being able to zoom in like on a picture (on a phone).

    • @lingonstudios
      @lingonstudios  1 year ago +1

      Sure, here is the google sheets document with the graphs and the measurement data
      docs.google.com/spreadsheets/d/1BHHkyL6oRTCZ2RkaBtGdHv3rwbANc7zqRNYade0e5Bg/edit?usp=sharing
      Hopefully that should be somewhat useable on mobile as well

    • @nah82201
      @nah82201 11 months ago

      @lingonstudios that’s awesome! Thanks for sharing. 🙏🏻

  • @bluzenkk
    @bluzenkk 1 year ago +21

    Would love to see how C++ performs versus assembly under the same parameters

    • @lingonstudios
      @lingonstudios  1 year ago +6

      Yeah, me too. So I'll probably make a video about that as well at some point

  • @xaviervitor
    @xaviervitor 1 year ago +10

    Really liked this video. Can you make the same comparison with Raylib? Would be really cool to see how it stacks up. Thanks and good luck with your project!

    • @nihil75
      @nihil75 1 year ago +3

      ha! just working on raylib project and would love to know.

    • @lingonstudios
      @lingonstudios  1 year ago +6

      Glad you liked it! Yeah, I already have the start of the Raylib version in the repo on GitHub, so I might turn this whole thing into a series, and one with Raylib would be one of the first to come next

    • @xaviervitor
      @xaviervitor 1 year ago

      @lingonstudios cool! looking forward to it.

  • @bolloxim1
    @bolloxim1 11 months ago +1

    In assembler you have another control option which can be a drastic performance gain: the ability to prefetch data before you need it. This requires a complete understanding of your hardware to set up prefetching effectively. It was especially important on older hardware like the PS3, where a memory fetch from L2 to L1 took 54 cycles. On modern Intel/AMD we are looking at about 12 cycles, but L3 loads are still really slow and can be in the 80-cycle range, so prefetching lines before you need them is still a good idea when dealing with a high volume of data throughput. I've managed to optimize C code by 20x in assembler with good lookahead cache architecture and cache-line-friendly data organization.

    • @lingonstudios
      @lingonstudios  11 months ago +1

      Yeah, the data is laid out in a way that allows for good cache usage: it is just plain arrays, and I was careful not to include more data than needed so that as much as possible fits in the caches. However there is no manual prefetching of data on the x86_64 platform, you just get a 64 byte cache line every time you access memory. But yeah, being cache friendly is of great importance for performance. I think that is the main reason for the performance difference between the Unity DOTS version and the assembly version. Unity DOTS seems to introduce quite a bit of overhead data for each entity, which I suspect leads to a lot more cache evictions and a need to fetch data more often from L3 or even RAM, while the assembly version should be able to operate mostly from L2 on my CPU
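
The cache argument in this reply can be made concrete with some back-of-the-envelope arithmetic. The following is only a sketch: the function names are hypothetical, and the per-entity sizes and cache budget used in the note afterwards are illustrative assumptions, not figures from the benchmark. The only number taken from the thread is the 64-byte cache line.

```c
#include <stddef.h>

/* 64-byte cache line, as mentioned in the reply above. */
#define CACHE_LINE_BYTES 64u

/* Number of whole entities that fit in one cache line.
 * Assumes bytes_per_entity is non-zero. */
size_t entities_per_line(size_t bytes_per_entity)
{
    return CACHE_LINE_BYTES / bytes_per_entity;
}

/* Number of cache lines touched when streaming once through
 * `count` tightly packed entities (rounded up). */
size_t lines_touched(size_t count, size_t bytes_per_entity)
{
    size_t total = count * bytes_per_entity;
    return (total + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}

/* Returns 1 if the whole working set fits in a cache of
 * `cache_bytes` bytes, 0 otherwise. */
int fits_in_cache(size_t count, size_t bytes_per_entity, size_t cache_bytes)
{
    return count * bytes_per_entity <= cache_bytes;
}
```

With assumed numbers, 40,000 entities at 24 bytes each occupy 960,000 bytes and fit in a 1 MiB cache, while the same count at 64 bytes each needs 2,560,000 bytes and does not. That is the kind of gap per-entity overhead data can open up.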

  • @onerimeuse
    @onerimeuse 1 year ago

    Forget the Cs, let's see assembly vs rust! (I mean, don't forget c, honestly, I love all of these comparison kinds of videos)

    • @lingonstudios
      @lingonstudios  1 year ago

      Yeah, I'll probably make some more comparisons with other languages and engines/frameworks. Maybe rust will be one of them

  • @jasonwylie1445
    @jasonwylie1445 11 months ago

    Great video, I would have liked to see it compared to a compute shader also

  • @gamebuster800
    @gamebuster800 11 months ago

    I've decided to write some core loops of my game, which leans heavily on simulating something, in C. I haven't been able to benchmark the difference yet (since I'm not done yet) but looking at your results... I'm expecting a pretty decent difference. Writing the core loop in C also allows me to test and benchmark the simulations outside of Unity, which also helps since my simulation logic is quite complex.
    My current simulation is running inside a single GameObject, like your data-oriented example.
    I feel like my development speed with Unity is only high at the start. The moment things start to become complicated, any time saved earlier is nullified by time spent debugging. Even the code > compile > run loop is incredibly slow for Unity compared to writing it in plain C. Like you said, writing in assembly (or C in my case) is the most enjoyable.
    It seems that your assembly version runs outside Unity. I intend to write my performance-sensitive stuff in C and load that into Unity. I can then do more high-level stuff in plain C# GameObjects while the performance-critical parts are thoroughly tested for performance and correctness in C.
    While I was writing stuff in C, I realised I can just run my C code in a browser using emscripten, which has been a terrible distraction as well.
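
The split described in this comment (hot loop in C, tested on its own, then loaded into Unity) tends to be easiest when the C side exposes plain functions over caller-owned buffers. A minimal sketch, assuming a hypothetical `sim_step` function and a made-up update rule; it is not taken from the commenter's project.

```c
#include <stddef.h>

/* Hypothetical simulation kernel: advances `count` values by one step.
 * All state lives in caller-owned arrays, so the same function can be
 * driven by a standalone C test harness or by a host application.
 * Returns how many values are strictly above `threshold` after the step. */
size_t sim_step(float *value, const float *rate, size_t count,
                float dt, float threshold)
{
    size_t above = 0;
    for (size_t i = 0; i < count; ++i) {
        value[i] += rate[i] * dt;      /* simple explicit update */
        if (value[i] > threshold) {
            ++above;
        }
    }
    return above;
}
```

Because the function only takes pointers, sizes and scalars, it can be checked for correctness and timed in a small C program, and the same signature could later be bound from C# through a P/Invoke `DllImport` declaration when built as a native plugin.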

  • @tudorelRo
    @tudorelRo 1 year ago

    Great video, very insightful, I do have a question tho. I saw that you made separate builds and measurements for mono and il2cpp for Unity using game objects and Unity using data-oriented design, but you didn't do the same for Burst and DOTS, why is that so? (I am assuming that in the comparison both Unity Burst and Unity Dots are built using mono since it is not specified)

    • @lingonstudios
      @lingonstudios  1 year ago +1

      The Burst and DOTS versions are built with il2cpp. I did not bother doing mono vs il2cpp in those since pretty much all the code is compiled with Burst in those versions anyway, so there is no code that would affect the performance that would be changing.

  • @alicivrilify
    @alicivrilify 1 year ago

    Thanks and congratulations for the work! A question: Have you heard of Entitas for Unity? Do you know how it compares to DOTS?

    • @lingonstudios
      @lingonstudios  1 year ago +1

      Yeah I did hear about that a long time ago, before Unity DOTS was announced. Did not know it still existed. I don't know how it compares against DOTS, but my guess would be that it is not as fast. If comparing to the versions I made in the video, my guess is that it would be faster than the gameobject version, but slower than the dataoriented non-DOTS version.

  • @Holy_Cannon
    @Holy_Cannon 11 months ago

    This is an amazing video, you helped explain the registers very well. Are you planning on doing a DirectX12 video or tutorial anytime soon?

    • @lingonstudios
      @lingonstudios  11 months ago

      No plans on any videos with DirectX12 at the moment, I still have not dived into those "new" APIs. I'm still using DirectX11 and OpenGL for my games and projects. Maybe I'll get into DirectX12 if I find DirectX11 becoming a bottleneck in multithreaded workloads, but I don't think that will be the case in practice for my projects, since they are mostly about rendering a lot of the same type of mesh rather than a lot of different meshes. And for that DirectX should work fine.

  • @Volker-Dirr
    @Volker-Dirr 11 months ago

    Nice Video. Question about 19:48 "Disk space usage". Is that the disk space of the binary or of the source? Can you also add the results of the missing one (So source or binary; just the missing one)?

    • @lingonstudios
      @lingonstudios  11 months ago +2

      It is the size of the build, so the runnable binary and any additional data needed. The size of the source depends on what you measure so it would be hard to select something that seems reasonable to compare.
      The code files of the assembly version are going to be larger than the Unity C# files since there is more code in the assembly version, but still so small sizes it won't be very noticeable. If we include other files than raw code files then we need to decide if we want to treat the whole Unity engine installation as part of it or not, or if we want to look at the size of the cache files generated by Unity (which can be rather large, the Library folder in the Unity project is almost 4gb).

  • @oswoya
    @oswoya 11 months ago

    Which resources did you use to learn assembly for game programming?

  • @KonradGM
    @KonradGM 1 year ago +1

    Wait a second, I'm a little confused: GameObjects Mono and IL2CPP is the regular MonoBehaviour workflow, right? But what about the other Unity Mono vs Unity IL2CPP? Is it only the data-oriented stack, or ECS/DOTS?

    • @lingonstudios
      @lingonstudios  1 year ago

      The Unity Mono and Unity il2cpp are just normal C# written in a data oriented way. So no classes for the data, the data for the bees are just a bunch of arrays. No special Unity features are used in these versions, so no ECS or anything else from DOTS.
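
The "just a bunch of arrays" layout described in this reply is commonly called structure-of-arrays. Below is a minimal C sketch of the idea, assuming hypothetical field names (`pos_x`, `vel_x`, ...) and a fixed capacity; the actual benchmark versions are written in C# and assembly and may be organised differently.

```c
#include <stddef.h>

#define MAX_BEES 1024

/* One array per field instead of one object per bee. */
typedef struct {
    float pos_x[MAX_BEES];
    float pos_y[MAX_BEES];
    float vel_x[MAX_BEES];
    float vel_y[MAX_BEES];
    size_t count;
} Bees;

/* Appends a bee and returns its index, or MAX_BEES if the arrays are full. */
size_t bees_add(Bees *b, float x, float y, float vx, float vy)
{
    if (b->count >= MAX_BEES) {
        return MAX_BEES;
    }
    size_t i = b->count++;
    b->pos_x[i] = x;
    b->pos_y[i] = y;
    b->vel_x[i] = vx;
    b->vel_y[i] = vy;
    return i;
}

/* Moves every bee by velocity * dt. The loop only reads and writes
 * the four arrays it needs, all of them contiguous in memory. */
void bees_move(Bees *b, float dt)
{
    for (size_t i = 0; i < b->count; ++i) {
        b->pos_x[i] += b->vel_x[i] * dt;
        b->pos_y[i] += b->vel_y[i] * dt;
    }
}
```

The design choice is that a system such as movement touches only the fields it uses, so unrelated per-bee data never gets pulled through the cache along with them.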

  • @xacesraskey5155
    @xacesraskey5155 1 year ago +1

    Is IEnableableComponent preferred over handling alive/dead with structural changes?

  • @dexterman6361
    @dexterman6361 10 months ago

    Is it possible to use C++ for your processing and see how that goes? I want to try something, but am not skilled enough!
    Thank you for this video! I'll take a look at the code! A lot to learn!
    Quick edit, I think the memory usage also includes DLL and other shared bits. It could be the unity runtime itself, the "engine" bits and all other runtime dll being loaded. But not sure.

  • @assemblyrtsdev
    @assemblyrtsdev 1 year ago +3

    I had no idea that IL2CPP could rival Burst's performance. Thanks a lot for this thorough comparison!

    • @lingonstudios
      @lingonstudios  1 year ago

      Yeah I was a bit surprised as well. There are probably ways of making the Burst version faster, but given how much trouble it was using Burst outside of DOTS I don't see the point in bothering. Better to just go straight to DOTS in that case since Burst fits much better in that context.

    • @assemblyrtsdev
      @assemblyrtsdev 1 year ago

      Update: I discovered that in my own tests, Mono is almost as fast as IL2CPP, even for data-oriented code.

    • @lingonstudios
      @lingonstudios  1 year ago

      @assemblyrtsdev That is interesting. In the test I made in the video, as you can see in the results, there was a really big difference between Mono and il2cpp, especially for the data-oriented code. Could it be that you had a bottleneck that was not in the C# code perhaps?

    • @assemblyrtsdev
      @assemblyrtsdev 1 year ago

      @lingonstudios I just built combat-bees-improved myself, and for me IL2CPP is also a lot faster than Mono in this case. But for some reason there is no meaningful difference in my own EcsPerformanceComparisons repo.

  • @delphicdescant
    @delphicdescant 11 months ago

    "Things do what you tell them to do and nothing else."
    That's my kind of programming. If you haven't checked out Zig before, I think it would fit in nicely with your preferences. I know I've had a blast with it.

    • @lingonstudios
      @lingonstudios  11 months ago +1

      Yeah, Zig has caught my interest a bit, it seems quite cool. Maybe I'll try it out against assembly in a future video in this series.

  • @kiririn39m8
    @kiririn39m8 1 year ago +5

    Thanks for the vid. By the way, what do you think about Unreal's Actor-component approach? As an ex-Unity developer I find it quite... odd, and anything I write in it seems much worse than even Unity in terms of performance and code clarity. I have a feeling that Unreal, or at least Actor-component, is a super outdated design. I couldn't find anyone else who had also moved from Unity to Unreal to ask their opinion so far.

    • @lingonstudios
      @lingonstudios  1 year ago +1

      I have not worked with Unreal so I don't know the exact details around the Actors. But I have heard that Unreal is quite object oriented which seems weird to me that they would make it like that. I think if I ever were to use Unreal I would probably mostly just use the renderer part of the engine, since that is what is impressive in it anyways, and then build my own engine around it almost. Maybe I'll check out Unreal for a follow-up to this video and put it against assembly in this benchmark. Could be interesting

    • @GonziHere
      @GonziHere 1 year ago +3

      @lingonstudios Unreal has recently introduced Mass, which is their own DOTS and it's in C++, so a comparison would certainly be interesting. That being said, I've moved from that engine because of its OOP madhouse. I'm tinkering with Godot, because it's FOSS, it's C++ and doesn't get in the way of data-oriented design. I honestly wouldn't expect you to enjoy anything about Unreal, but you might actually somewhat enjoy Godot.

    • @gandev5285
      @gandev5285 1 year ago +2

      @lingonstudios A follow-up video using Unreal Engine would be very interesting to me. I use Unreal Engine professionally and the Actor / Component system is sort of a nightmare IMO (yes it is very OOP). Mass Entity has got me very interested and I'm wondering how it stacks up to Unity DOTS, as I haven't seen a comparison of those two yet. Also a comparison to a custom solution would be very interesting as well. I don't think Mass is as far along as Unity DOTS however. It would also be interesting to hear your perspective on Unreal in general (although I think I may already know the answer to that for the most part). Just found your channel today and subscribed :)

  • @everythingcouldbesimplify818
    @everythingcouldbesimplify818 1 year ago +1

    Insane

  • @cloudsquall88
    @cloudsquall88 11 months ago

    Since you are clearly very knowledgeable about ECS, could you possibly make a comparison with Bevy also? This was a great video!

  • @TminusDoom
    @TminusDoom 1 year ago +1

    I'm curious what the profiler was saying during the Unity tests.

  • @nickgennady
    @nickgennady 11 months ago +1

    This is probably a lot to ask, but it would be cool to see a C version vs Unity and assembly.
    Also, aren't there multiple versions of assembly for different architectures? If so, would you need to write the game multiple times in assembly?

    • @lingonstudios
      @lingonstudios  11 months ago +1

      I will probably do a video comparing a C version to these in the future. Yeah, there is a different assembly language for each architecture. This is x86_64 assembly, which is what is used for most desktop computers and laptops as well as the last two generations of Xbox and PlayStation. If you wanted a version to run on mobile, Nintendo Switch or Apple's new computers you would have to rewrite it completely in ARM assembly.

    • @Leonard_MT
      @Leonard_MT 11 months ago +1

      @lingonstudios Exciting! If the comparison with C becomes a reality, what compiler and optimization flags do you plan to use?

  • @pikkoblank7123
    @pikkoblank7123 11 months ago

    Not really a programmer here so I don't really get a lot of this, but how hard is it to optimize in assembly?

  • @shApYT
    @shApYT 7 months ago +1

    Thanks. My next game will be an ASIC.

  • @ITR
    @ITR 5 months ago

    For burst compiled code you can use "in" and "ref" to send in structs like arrays and float3 without having to use unsafe

  • @Luxalpa
    @Luxalpa 11 months ago +1

    If you write to a register, you'll want to give the CPU some time before you go and read from it. So the reads and writes should be more separate. It would be interesting to compare your asm version with one written in C!

  • @wilykary
    @wilykary 11 months ago

    Just a question, why did you make your game in assembly? It's not cross-platform and using something like C is orders of magnitude easier, plus the compiler does optimizations and would probably be just as fast as pure assembly.

    • @lingonstudios
      @lingonstudios  11 months ago +1

      The games I have released so far have not been made in assembly. The first one was in Java (never again), and the second one was made in C with Raylib. I did work on a traffic sim game in assembly for a while in my free time, which has been put on hold for now since that project was a little bit too big to do right now. I plan on making a game in assembly that I will release, and I'll probably start this autumn. The reason for that is partly that I find assembly very enjoyable to write, but also performance. The thing is that if you want really good performance you need to make proper use of SIMD instructions for the performance-critical parts of the game. And you can't really trust the compiler to do that automatically; it works sometimes and sometimes not. So you'd have to babysit the compiler and always check what assembly it generates, and to make sure you get the code you want you'd use intrinsics, which is basically writing assembly inside C/C++. I personally find just writing pure assembly to be more pleasant than writing C/C++ code with intrinsics.
      Cross-platform is of course a downside of pure assembly, but since I am focusing on PC games, it is really not that important.
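To illustrate the "intrinsics are basically assembly inside C" point, here is a small hedged sketch: adding two float arrays four lanes at a time with SSE intrinsics. The function name `add_floats` is made up for this example, and the scalar fallback is only there so the snippet still builds on targets without SSE.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* dst[i] = a[i] + b[i] for n floats. */
void add_floats(float *dst, const float *a, const float *b, size_t n) {
    assert(dst && a && b);
    size_t i = 0;
#if defined(__SSE__)
    /* Each intrinsic corresponds closely to one instruction (movups, addps). */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar loop handles the tail, or everything when SSE is unavailable. */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

The explicit version guarantees the vector instructions are used on SSE targets, whereas the plain scalar loop leaves that decision to the compiler's auto-vectorizer.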

    • @wilykary
      @wilykary 11 months ago +1

      @lingonstudios Damn, you really are a freak for enjoying writing assembly over C/C++. I guess the time to performance tradeoff is worth it if you're enjoying the process 🙂.

  • @sirdorius361
    @sirdorius361 11 months ago

    "Switch to Twitter time" haha, I'm borrowing this one

  • @thygrrr
    @thygrrr 10 months ago

    Unity DOTS actually lets you write to the array ("AsParallelWriter"), but random access and aliasing can be a problem.
    The easiest way is to have one fresh array, Allocator.Temp or Allocator.TempJob, write to that, and in another dependent job, simply copy the written array back once all threads are done.
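The "write to a fresh array, then copy back" pattern can be sketched outside Unity too. This is plain single-threaded C standing in for the two jobs, not Unity's job system; `smooth_into` and `copy_back` are hypothetical names, and the neighbour-averaging is just a stand-in for any update that reads other elements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Phase 1 (stand-in for the parallel job): reads only `src`, writes only
   `scratch`. Each output index is written exactly once and no input is
   modified, so the iterations do not depend on each other. */
void smooth_into(const float *src, float *scratch, size_t n) {
    assert(src != scratch);
    for (size_t i = 0; i < n; ++i) {
        float left  = src[i > 0 ? i - 1 : i];      /* clamp at the left edge  */
        float right = src[i + 1 < n ? i + 1 : i];  /* clamp at the right edge */
        scratch[i] = (left + src[i] + right) / 3.0f;
    }
}

/* Phase 2 (stand-in for the dependent job): must run only after phase 1
   has finished for every index. */
void copy_back(float *dst, const float *scratch, size_t n) {
    memcpy(dst, scratch, n * sizeof *dst);
}
```

Updating `src` in place would let later iterations read already-modified neighbours, which is exactly the aliasing problem the scratch array avoids.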

  • @ThePoke151
    @ThePoke151 11 months ago +3

    Can you do a comparison with Bevy? It's a Rust-based, ECS-based game engine and I think it would fit in there nicely! :)

    • @TheSast
      @TheSast 11 months ago +3

      Would love to see that!

  • @ViaConDias
    @ViaConDias 1 year ago +2

    Love it. You should make this an ASM channel. You are great at both writing and explaining it so I think you should do that and leave all these other (scripting) languages to the wannabe coding channels 😉😅
    2 ideas off the top of my head:
    - Push this bees example to the very limit in efficiency and performance
    - If you really want to work with a scripting language then integrate one into a game written in ASM (like Lua in C/C++)

    • @lingonstudios
      @lingonstudios  11 months ago +3

      I am thinking of making this benchmark a series and comparing assembly to different programming languages/frameworks and game engines. The next video will be a follow-up to this one with an improved version of the Unity DOTS and a multithreaded assembly version. There are still some (perhaps significant) improvements that can be done to even this multithreaded assembly version. But now it ended up being bottlenecked on the GPU for this benchmark, so I'll have to make some adjustments to the test in case I want to push it even further.

    • @ViaConDias
      @ViaConDias 11 months ago

      @lingonstudios Nice. I'm really looking forward to that video. I don't know your setup, but if the GPU is the bottleneck maybe do the calculations and don't push it to the GPU? When you've got it all optimized, find a kid in the neighborhood with a crazy setup to run it. Would that be an option? I'm running a 3080 so you can always hit me up if I could be of any help.
      ASM is more important today than ever. Not because we need to write it, but because too many developers are so abstracted from the hardware that they write very bad code because they just do not understand. They think ASM is just a really old and bad language like Fortran or Cobol; they do not know that ASM is "touching metal". Many don't even really understand that there is "metal" underneath it all. They don't look lower than the interpreter running their code, and for many even that is magic.

  • @HyperDev00
    @HyperDev00 1 year ago +1

    Can the assembly code be portable to different devices? And can you please make tutorials on assembly!!

    • @lingonstudios
      @lingonstudios  1 year ago +1

      Since this is x86_64 assembly it will only run on x86_64 CPUs. That means PCs, and you could probably make it run on PS5 or the latest Xbox, since both of those use that platform as well. However, this particular code base is built with Microsoft's assembler and uses DirectX, which means it will only run on Windows PCs. You could run it on Linux with something like Proton though. For assembly tutorials I would recommend www.youtube.com/@WhatsACreel/, he already has a bunch of good tutorials on the subject.

  • @user-jo5et3ur1p
    @user-jo5et3ur1p 11 months ago

    I don't get how this is a thing. Eventually every programming language compiles to asm, right?! So in that case it would be more like: can I write better asm than the C compiler or the C++ compiler??

  • @treesd
    @treesd 8 months ago +2

    Vote for C++ vs Assembly showdown

  • @simpson6700
    @simpson6700 11 months ago +1

    Now I'm curious if there are any modern assembly games and how they perform.

  • @Kazyek
    @Kazyek 11 months ago

    23:17 the Rust language would like a word with you on that! ;)

  • @turun_ambartanen
    @turun_ambartanen 11 months ago +2

    Nice comparison.
    I very much prefer the Linux way of counting CPU usage: 100% = 1 core. This makes it independent of the core count and much easier to read.
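Converting between the two conventions is a single multiplication. A tiny sketch in C; the function name `per_core_percent` is made up, and the numbers in the note below are illustrative, not measurements from the video.

```c
#include <assert.h>

/* Convert "percent of the whole CPU" into "percent of one core",
   where 100% means one fully busy logical core. */
double per_core_percent(double whole_cpu_percent, int logical_cores) {
    assert(logical_cores > 0);
    return whole_cpu_percent * (double)logical_cores;
}
```

For example, a single-threaded program that fully occupies one core of a 16-thread CPU shows up as 6.25% in the whole-CPU convention and as 100% in the per-core convention.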

  • @SNkael
    @SNkael 10 months ago

    Hello, I have a question please.
    I don't know Unity DOTS and Burst, but I have heard that Burst combined with Unity DOTS is the most powerful way. But in this comparison, Burst is slower. Why? It seems to use a more precise technique than DOTS but loses performance.
    Thanks and nice work!

    • @lingonstudios
      @lingonstudios  10 months ago +1

      The version called Burst in the benchmark is Burst without DOTS. It was just a simple port of the normal Unity version to use Burst. That version is still single threaded since it does not use the job system. The DOTS version also uses the Burst compiler, but it also uses the Entities framework and the jobs system for multithreading.

    • @SNkael
      @SNkael 10 months ago

      @lingonstudios ok thank you :)

  • @ruwe1962
    @ruwe1962 11 months ago

    Maybe you could enable SIMD calculations in the Burst/DOTS variants, e.g. by using float4 instead of float3 and precalculated vectors for multiplications instead of multiplying componentwise with scalar constants.

  • @KingdomTerrahearts
    @KingdomTerrahearts 1 year ago +1

    Honestly, the biggest draw of unity for me is the ease of switching target platforms, the asset store and the ease of making animations and all that (also tools for assisting my workflow). I know there are better optimized and less resource intensive ways of making games, but my motivation would take a big hit if I had to work on visual studio for mostly everything, idk never tried assembly

    • @KingdomTerrahearts
      @KingdomTerrahearts 1 year ago

      What I was trying to say was that for what I do Unity is better, but in your case performance is the most important thing, meaning I understand your decision.

    • @lingonstudios
      @lingonstudios  1 year ago +1

      Yeah, I completely agree. There are a lot of good things included in the Unity package that helps a lot while making games. And with that in mind, the performance increase you can get by using DOTS is just a really nice bonus.
      But yea for me the performance part was extra important since making a simulation with a really large number of entities is how I am planning to make my future game stand out. And looking at the numbers it seems that it will likely be able to stand out quite a bit even against games made in DOTS.

  • @ohfor4523
    @ohfor4523 11 months ago

    I think the real answer is mix and match. You don't need to write the entire thing in ASM or the entire thing in C#. There are pros and cons to each as you rightfully point out; exploring a hybrid approach would be worth a watch.

  • @nathanfranck5822
    @nathanfranck5822 11 months ago +2

    I really did think the Dots version would slap the custom asm version around, since you'd expect the compiler to find all sorts of crazy obscure optimizations. Definitely was surprised!

  • @Hazzel31337
    @Hazzel31337 1 year ago

    I am using Unity and this is very interesting to see, I didn't even know about il2cpp! High quality comparison video! I would be interested to see how Unreal with C++ compares. I heard they built their own ECS; no clue how far along and usable it is or how it compares, but it would be interesting to know.

    • @lingonstudios
      @lingonstudios  1 year ago

      Thanks! il2cpp is a good boost in performance if you have written performance oriented C# code so definitely recommend using it if you have use for some better cpu performance and don't worry that much about mod support. I'm not familiar with any ECS in Unreal, but then I have not really used Unreal yet. Maybe I'll give it a try for a future entry in this benchmarking series.

  • @developerdeveloper67
    @developerdeveloper67 11 months ago

    Can you make a video on how you open a Windows window with assembly? Maybe also how you use OpenGL?

    • @developerdeveloper67
      @developerdeveloper67 11 months ago

      Okay I noticed now you have a tutorial on doing DX on ASM, so I guess you don't do OpenGL.

  • @HiHi-iu8gf
    @HiHi-iu8gf 11 months ago

    neato

  • @rafa_br34
    @rafa_br34 11 months ago

    It's impressive that you got to that point at creating a game in assembly, however, wouldn't it be easier/smarter to use C++? Assembly isn't usually used anymore because most C++ compilers have tons of optimizing tricks (or that's what I've heard)

  • @dan2800
    @dan2800 11 months ago

    About the DOTS version: if you're running 2400 MT/s memory, that's a 1200 MHz Infinity Fabric clock, and it's rated for 1800 MHz. Since DOTS uses all available threads, cross-CCD communication could have an impact on performance.
    For CPU usage we are talking about the whole CPU. A better representation would be to use 100% per thread, so DOTS would be 1500% and ASM something like 160%. Also, CPU boost frequency and other programs could affect the scheduling, and for the GPU, core and memory frequency can differ from build to build.

  • @AtomicBl453
    @AtomicBl453 11 months ago

    it'd be interesting to see how this would perform in higher level languages like Ruby

  • @jon5155
    @jon5155 7 months ago

    I guess the idea won't be making a game in asm but if it's done in c/c++ there's an option to use inline asm for optimization.

  • @PMX
    @PMX 11 months ago

    Assembler also means you tie yourself to a processor architecture and would need to rewrite a lot to port your game from say x86 to arm

  • @hpw-dev
    @hpw-dev 14 days ago

    Thanks for this benchmark

  • @holthuizenoemoet591
    @holthuizenoemoet591 11 months ago

    Why not for example go with normal C instead of assembly, like 2011 or something?

  • @JeanLescure
    @JeanLescure 10 months ago +1

    Assembly vs C vs C++ please

  • @plasmamac
    @plasmamac 11 months ago

    Wow

  • @petrow_
    @petrow_ 11 months ago

    What's the game at 1:20?

    • @lingonstudios
      @lingonstudios  11 months ago +1

      Not sure exactly which one you mean since that timestamp is just in the switch between two clips. But the first one with the buildings and cars is the city builder game I am working on in assembly; it is on hold for now though. There are a few videos of it on this channel. The other one with the ranger character and the bear is my most recently released game Sidestep Legends, which is available on Steam and which I have a few videos about here on the channel as well.

  • @NickEnchev
    @NickEnchev 11 months ago

    Memory usage isn't that important, what's more important is seeing how effectively the application is utilizing the cache. It's well known that data-oriented approaches minimize cache misses at the expense of using a lot more memory. I'd be curious to see how well your data-oriented approach utilizes the cache. I know the Data class is static, and the array references are static, but my C++ brain imagines allocating them with "new" would put them on the heap, correct? Since you've split up the data quite a bit, could you be ending up with too much indirection in systems that utilize multiple arrays in their update methods?

    • @lingonstudios
      @lingonstudios  11 months ago +2

      Well, they go hand in hand. The less memory you need to read, the higher the percentage of it that can be present in the cache. So it is usually a win to reduce the data you need to store so more entities fit into the cache. In C#, if I recall correctly, there is no use of static memory, so static is just heap-allocated memory that exists for the whole runtime of the program. But even in C/C++, where static memory allocations actually exist, it would not make that much difference compared to heap allocating them with malloc or new. You might be able to get one fewer pointer dereference with a statically allocated array if you use it directly instead of passing the array pointer as a parameter to a function, but in this case it would not matter, since that is not where the CPU time is spent anyway.
      The data is structured so the cache should be used rather well. The Movement struct, for example, bundles the position and the velocity together since they were mostly used together. Usage of the rest of the data was more spread around in the different systems, so I opted to have them as their own arrays.
      If I were to make an improved version of the assembly version (it could be possible with the Unity versions as well with Burst and intrinsics), I would split out the data even more and not use a Vector3 struct, but instead have an array each for x, y and z of the positions and velocities. That would enable using all 8 SIMD lanes available in 256-bit SIMD instead of the 3 being used in the assembly version right now. The cache usage would still be fine as well, since even though you would access more arrays, that data needs to be read anyway, and all the data you get in each cache line while reading from memory will be used in the next iterations of the loop.
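The layout described in the last paragraph (one array per component, so eight bees fill one 256-bit register) could look roughly like this in C with AVX intrinsics. This is a sketch of the idea, not the benchmark's code: `integrate_axis` is a made-up name, and the scalar fallback is there so it still builds when AVX is not enabled.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>  /* AVX intrinsics: _mm256_loadu_ps, _mm256_mul_ps, ... */
#endif

/* pos[i] += vel[i] * dt for one component. With separate x, y and z arrays
   this is called once per axis, e.g. integrate_axis(pos_x, vel_x, dt, n). */
void integrate_axis(float *pos, const float *vel, float dt, size_t n) {
    assert(pos && vel);
    size_t i = 0;
#if defined(__AVX__)
    const __m256 vdt = _mm256_set1_ps(dt);  /* dt broadcast to all 8 lanes */
    for (; i + 8 <= n; i += 8) {
        __m256 p = _mm256_loadu_ps(pos + i);  /* 8 consecutive bees */
        __m256 v = _mm256_loadu_ps(vel + i);
        _mm256_storeu_ps(pos + i, _mm256_add_ps(p, _mm256_mul_ps(v, vdt)));
    }
#endif
    /* Scalar loop handles the tail, or everything when AVX is unavailable. */
    for (; i < n; ++i) {
        pos[i] += vel[i] * dt;
    }
}
```

Because every loaded float belongs to a different bee's same component, all eight lanes do useful work, and each cache line fetched from `pos` or `vel` is fully consumed by consecutive iterations.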