Rust's iterators and its other zero-cost abstractions are what make it so good, in my opinion. Being able to write more readable code, and less of it, and still get more optimized code than I could've written explicitly myself is just a win-win. Sure, memory safety is great, but even std C++ iterators can have poor performance, depending on how you use them. Because a Rust iterator over a slice knows the length ahead of time, the compiler can skip the per-element bounds checks that plain indexing would need. Using abstractions in Rust is often faster than explicit code. Of course it's not a silver bullet (e.g. you can't simply break out of a for_each closure; bailing out early takes something like try_for_each).
Yeah I think Rust is a super cool language. I'm hoping to learn more of it over time! After learning Rust iterators I found myself somewhat frustrated that Go doesn't have iterators that can give me at least near-native performance. I have a whole file of Go benchmarks filled with iterator attempts that are way slower than regular for loops. Makes me sad.
@@UnitOfTimeYT The trait system is just awesome. There are so many extremely useful traits in std that you just keep uncovering. IMO, Rust is frustrating to learn because it combines concepts from a lot of languages, and the best way to learn it is to just learn the basics, get a rough overview of what else is possible, and learn the rest whenever you find a use case for it. It makes little sense to try to learn it all at once, since it's way too much and you likely don't need many of the features because they are for specific use cases you don't have. If you start using async, learn about futures. You need to iterate over your custom type? Learn the iter traits. You realise you need a big code chunk multiple times but can't use a function? Learn macros.
Yeah that's a good point. Rust definitely pulls a lot of good parts from a lot of languages. I just feel like the spec can be kind of large. My experience with learning was kind of like 1. I want to write a physics simulation in a Rust ECS 2. Bevy is a popular rust ECS, let's use that 3. Let's try to get a basic example from Bevy docs 4. Oh I have no idea how borrow checking works 5. Let me read the rust book 6. Oh now I kind of understand how the borrow checker works 7. Let's try that again 8. Loop 5-7 every time I don't know something The problem I had though, was that at the start I don't know rust, so I don't know what features do what. So when Bevy uses a feature (or mentions a feature), I'm basically forced to go learn what that feature is. And because the rust spec is somewhat big I end up having to learn a lot of it just to use Bevy. I think some parts of this is me being a little lazy, where I just want to get my program written. But I think there is a tough to find balance (for any language) of "At what point do we add this new language feature, and at what point do we not". I'm not trying to criticize, and I don't even think I know the solution. This was just the experience I had.
@@UnitOfTimeYT I think that's fine. The same applies to other languages. E.g. in Rust I don't always opt for iterators, unless I know it's straightforward. I usually start with simple for loops and get it to work, then optimize later. Perfectionism is a struggle when learning something new, but I try to deal with it by just doing it (applies to many things). I like to go nitty-gritty into details but it often leads to rabbit holes, so unless I am either stumbling upon a better way, or I know I need it, I just keep it as is until I've had more time with the language/APIs. Chances are that a light bulb goes off in your head when you try to implement something different with similar semantics, and you can use that to refactor your old solution. A good example of this is polymorphism in Rust. There are different ways to go about it. Usually I start with enums that wrap structs, because they are the easiest to work with, even though they may not be the right solution for what I need. Traits on the other hand are very useful in cases where you 1) have a lot of variants and 2) want to allow others to "plug in" to your system in some way (like Bevy's plugin system). Enums don't always apply well for these cases, because they are kind of "static", so they are mostly useful for cases where you don't want to expose that kind of flexibility on purpose.
Yeah totally agree. Building software is a very iterative process and it's definitely fun to have those realizations of "Oh man this is way better than the old thing I was doing"
An excellent video showing how Golang can be fast if memory is correctly managed. It would be really interesting to get an update using the latest versions of Golang, specifically using Profile-Guided Optimisation (PGO).
4:00 I've never heard the term "box" used in programming outside of Rust lingo (and basic explanations of variables). It means heap-allocated objects there. What do you mean by it in this case? Also, good job on matching Bevy's ECS, even if the optimizations ended up being a bit gnarly. Also, queries can be iterated faster in Bevy by using the for_each method rather than the iter methods that I found in your repository. You should try updating them! I could try to fork and benchmark it if you want, sounds interesting.
Oh interesting. I actually didn't know that box was a specific term inside the rust ecosystem. The first place I heard the term "boxing/unboxing" was in Java where they convert things like primitive ints into capital letter `Integer` (where Integers inherit from the java base class `Object`). In the video I'm using it kind of loosely: Basically, in Go when you have a generic function, sometimes they will monomorphize it once then pass multiple types into that single function. At that point the only way to pass multiple different types into a single function is to make all the types look like one type. Go achieves this by passing in (I think) a pointer to the base data type, then they also pass in a dictionary which holds some runtime information and a Virtual function table. The virtual function table serves as what I was calling the "boxed" value. If you take a look at the planetscale article in the description, you can read a much much better and more detailed explanation. Also, I didn't know about the bevy for_each way of iterating. I'll have to look into that some more! You're more than welcome to create a fork and open a PR with those changes, someone else already has too! But if you don't I definitely will!
@@UnitOfTimeYT It's a similar concept, just in Rust it's more about size (in memory) and ownership, with the contained type being less-important (lots of caveats around traits here). As a former Go developer gone Rust, the biggest hurdle was paradigm, in the end. Functional code works well in both languages, just the syntax is more specific in Rust, while the typing is more specific in Go (imo).
Rust has the Box generic type which is a container / wrapper of a value of a different type. The concept of boxing exists in programming in general, and is the concept of wrapping over a value of some type.
@@UnitOfTimeYT My prof used to use unboxing/unwrapping interchangeably when talking about Optionals, so I guess the analogy is just helpful with many things in programming.
I gave up on Go to switch to Rust -- primarily because they were taking way too long to release generics. Rust has a very rapid, but stable, release cycle, and Go will get left in the dust. Which is unfortunate, I believe it's one of the best languages we have for programming, but the speed at which limitations are removed is hurting the adoption.
Yeah Go is definitely reluctant to release new features sometimes. But I'm generally pretty happy with where the language is. There are a few times where my problem could be solved with a missing language feature, and I have to work around it. But I think I'm in the uncommon domain of doing game development in Go.
But aren't Go and Rust used in completely different situations? Rust is used in most low level systems while Go is primarily used in cloud native distributed services. How can one be a direct competitor of the other?
Excellent work! Working on a VERY similar problem at the moment and watching your process has been helpful, if for nothing else, to validate some decisions I've made along the way. I've mostly been looking at Unity's DOTS/ECS implementation as reference, but being closed-source, I've had to make lots of assumptions based on the API alone. In any case, this is great content. Hoping to see more adoption of Go for games.
Hey! Super happy to hear that! I also made this video where I go over the high level implementation of my ECS architecture: ua-cam.com/video/71RSWVyOMEY/v-deo.html In the description of the video I linked, I put some references to some really good articles about ECS design by some authors of the more mainstream ECS frameworks. Best of luck!
Not a game programmer myself, but I do write Go in my day-to-day work. Learnt a lot from this video about Go code optimization. Do you have a compiled resource where I can learn more on this topic?
Hey! Glad you enjoyed the video! I don't really have a compiled resource on performance optimization. But there are quite a few good blogs out there that will sometimes talk about it. Here's a few I know of: 1. Russ Cox's blog: research.swtch.com/ 2. Dave Cheney's blog: dave.cheney.net/high-performance-go (Specifically this one: dave.cheney.net/2020/04/25/inlining-optimisations-in-go ) 3. Planetscale about generics: planetscale.com/blog/generics-can-make-your-go-code-slower Hope that helps!
Thanks! One thing to keep in mind is that this is a physics performance benchmark. All of the memory is pre-allocated. Normally, one of the performance advantages that Rust has over Go is that there is no GC slowing it down, but in this case there are only about 5 really big arrays, which means the GC doesn't have very much work to do (each array represents one GC item, so only 5 checks are needed). [This is an excerpt from a comment I left to Wander Watterson. You can see the full response there]
@@jongxina3595 In complex apps the problem gets worse. If a GC-based language serves your purpose, go for it. If the GC becomes the bottleneck, you may have to invest development effort later on, either optimizing like the author did here (mind you, it was a lot of work for such a small codebase) or migrating to a non-GC language.
People really like hating on GC'd langs, especially in gamedev, but people still use Unity (C#), Unreal (Blueprints), and Godot (GDScript), and they all use GC and/or interpreted langs on top of C/C++. Anyways, good job!
Thanks! I appreciate it! Yeah I'm not really sure how the GC gets "compiled in" to the final binary in, say, a Unity game. I think a lot of the more critical components are in C++. But I think like 90% of games that I'm interested in playing could run just as well in a GC language. I think good algorithms and memory layout are the big optimization targets. One thing I have noticed in my own coding in Go is that, because it's so easy to allocate memory, I sometimes add allocations (without thinking) to my critical code loops rather than pre-allocating. You could do this in any language, but it's easier to do accidentally in a GC language, I think. Usually not hard to fix: just move the allocation up further and re-use/pool memory better.
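To make the "move the allocation up and re-use memory" fix concrete, here is a minimal Go sketch. The function names and the "indices below a limit" task are made up for illustration and are not taken from the video's code; the only technique shown is reusing a caller-owned buffer with `buf[:0]`.

```go
package main

import "fmt"

// collectNearAlloc builds a fresh result slice on every call. Called once per
// frame, it may allocate (and later hand garbage to the GC) every frame.
func collectNearAlloc(xs []float64, limit float64) []int {
	var out []int // grows through append, so it can allocate several times
	for i, x := range xs {
		if x < limit {
			out = append(out, i)
		}
	}
	return out
}

// collectNearReuse writes into a buffer owned by the caller. buf[:0] keeps the
// backing array, so once the capacity is large enough no allocation happens.
func collectNearReuse(buf []int, xs []float64, limit float64) []int {
	buf = buf[:0]
	for i, x := range xs {
		if x < limit {
			buf = append(buf, i)
		}
	}
	return buf
}

func main() {
	xs := []float64{0.5, 3.0, 1.5, 9.0}
	buf := make([]int, 0, len(xs)) // allocated once, outside the hot loop
	for frame := 0; frame < 3; frame++ {
		buf = collectNearReuse(buf, xs, 2.0)
		fmt.Println(frame, buf, collectNearAlloc(xs, 2.0))
	}
}
```

Both functions return the same indices; the difference only shows up in an allocation profile, which is worth measuring rather than assuming.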
Sorry for the impolite comments from some Rustaceans here. Great experiment! But as you mentioned, I'm definitely curious about workloads with more disparate allocations. Hmm, but in that case maybe the allocator becomes a factor too, and in Rust you can change it from the system default to jemalloc or mimalloc, so there are more variables to consider. It’s really hard to isolate only the ECS code to compare. More than performance, though, I’m curious how much you feel the Rust style of using types helps you avoid errors, or how much it gets in the way of making progress. The more interesting trade-off between Go and Rust, I think, is that Go does less static analysis in exchange for fast compiles and more attention to debugging, while Rust does more static analysis (with annotations and API restrictions) in exchange for less debugging and slower compiles.
Took a brief look at your code, and found a low hanging fruit: your native() for Rust didn’t preallocate the memory for all the stuff like Go forces you to do with make. You can use Vec::with_capacity(size) to match Go behavior. More precisely since I remember Go zeros memory on make, you can use the macro vec![default; size] to replicate zeroing. (Check doc for details)
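For readers on the Go side, here is a small sketch of the two `make` forms this comment is comparing against. The helper names are hypothetical; the mapping to `vec![default; size]` and `Vec::with_capacity(size)` is an analogy, not an exact equivalence.

```go
package main

import "fmt"

// newZeroed returns n elements that already exist and are all 0.
// Rough Rust analogue: vec![0.0; n].
func newZeroed(n int) []float64 {
	return make([]float64, n)
}

// newReserved returns an empty slice with room for n elements.
// Rough Rust analogue: Vec::with_capacity(n).
func newReserved(n int) []float64 {
	return make([]float64, 0, n)
}

func main() {
	a := newZeroed(4)
	b := newReserved(4)
	fmt.Println(len(a), cap(a), a) // 4 4 [0 0 0 0]
	fmt.Println(len(b), cap(b), b) // 0 4 []

	b = append(b, 1.5) // uses the reserved space, no reallocation needed
	fmt.Println(len(b), cap(b), b) // 1 4 [1.5]
}
```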
Haha don't worry, I haven't seen anything too impolite yet :) But yeah, this benchmark doesn't really highlight one of the core benefits Rust has over Go, which is that it has no GC. Luckily, by organizing in an ECS I think my GC impact will be much less than if I had objects sporadically defined by pointers. I think in games it's somewhat normal to pre-allocate chunks of memory to work in so that you don't have to allocate inside your physics or render loop. So the timing code just operates on the actual "do physics work" part of the code, and I pre-allocated all of the variables I needed. About Rust style, I do sometimes wish I had algebraic data types and match statements in Go. That's one thing that is really useful for modelling your data correctly. But right now I'm trying to figure out if it's the best way of modelling the data in all cases. Sometimes I think I want to use an ADT, but I forget the impact of having to unwrap the inner type (especially when your enum scales to have more and more members). I'm not really sure of the right cutoff for when to use an ADT/enum and when to build things with interfaces. Once I eventually struggle and figure out a nice interface for my type, I'm usually pretty happy and stable though. I do have maybe one controversial opinion about Rust (which I definitely might be wrong on!). I feel like they have added too many new language features, which makes it less and less accessible for someone like me to pick up. In my personal case, I'd prefer to have better developer ergonomics, even if it cost me some cycles on the CPU. I also sometimes feel like Rust library developers err on the side of "Let's model this API with the same amount of entropy as the underlying problem" whereas Go developers err on the side of "Let's model like 90% of this underlying problem in a really simple way". 
Because I more often than not fall into the 90% of developers, I'd rather have a bunch of simple APIs than one perfect but complex one. All that said, I have only used a handful of packages, but of the handful I used, I found most to be somewhat hard to grasp. Anyways, I appreciate your comment. Sorry for the rant. I really think Rust is a cool language, and when I was learning it I tossed and turned in bed many times thinking "Man, should I switch from Go?"
At 1:20 you say 35ms and 18ms but the graph shows 350ms and 180ms. The same for other graphs... seems what you say is right, just messed up with Y axis.
Rust also does bounds checking. You can use the non-bounds-checking API directly (e.g. the unsafe get_unchecked), but if you index directly into arrays in Rust, you get bounds checks unless the compiler can prove they are unnecessary.
Bounds check elimination in Rust can also be done like he described in the video. Since Rust uses LLVM as its backend, it is insanely good at optimizing away redundant computation like bounds checking. However it is still difficult since there are often cases in which it actually isn't possible for the compiler to optimize such things out.
Yeah there are probably similar tricks you can do in Rust to remove bounds checks inside loops by adding some prechecks. I didn't look into how much LLVM will automagically optimize that sort of thing though. In Go there were a few situations where the compiler couldn't detect that BCE was possible on its own, so I had to make sure I got it to understand that it was safe to do BCE, mostly by comparing array lengths before a loop.
@The Silenced If I had to guess, I'd think that regardless of how you loop over a single array (either with an iterator or for idx in 0..container.len()), the bounds checking would be implicit in the loop itself (i.e. both loops guarantee we won't go out of bounds, so bounds checks can be eliminated). The problem I was having in my Go code was that I'd be looping over multiple arrays that were the same length. In the for loop, I loop over a single array but then I index into multiple other arrays. Each of the "multiple other arrays" now requires bounds checks because the compiler can't prove at compile time that the index stays within their length for the whole loop. Thanks for watching, and thanks for the comment!
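The multiple-arrays case described here can be sketched in Go roughly as follows. This is an illustration with made-up function names, not the code from the video; the reslice-before-the-loop hint is one commonly used idiom, and whether a given compiler version actually drops the check is something to verify (for example with `go build -gcflags="-d=ssa/check_bce/debug=1"`) rather than assume.

```go
package main

import "fmt"

// integrateChecked loops over pos and indexes vel with the same i. The compiler
// knows i is valid for pos, but it has no length relationship for vel, so the
// vel[i] access may keep its bounds check on every iteration.
func integrateChecked(pos, vel []float64, dt float64) {
	for i := range pos {
		pos[i] += vel[i] * dt
	}
}

// integrateHinted reslices vel to len(pos) before the loop. That single
// up-front operation panics if vel cannot be resliced to that length, and
// afterwards len(vel) == len(pos), which gives the compiler something to prove
// vel[i] in bounds with.
func integrateHinted(pos, vel []float64, dt float64) {
	vel = vel[:len(pos)]
	for i := range pos {
		pos[i] += vel[i] * dt
	}
}

func main() {
	pos := []float64{0, 1}
	vel := []float64{2, 4}
	integrateHinted(pos, vel, 0.5)
	fmt.Println(pos)
	integrateChecked(pos, vel, 0.5)
	fmt.Println(pos)
}
```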
This is awesome content! It's pretty cool seeing how well Go did, and I really was expecting to see a bigger difference between the two. I guess in the case of this particular simulation it makes sense that GC wouldn't be too big of an issue. I've been meaning to try my hand at learning Go, because I think it could complement Rust pretty well and this video may just be the motivation I needed to get over that hill lol
Thanks! Glad you liked it! Lol Yeah I think Rust and Go are not so different in the types of problems that you can solve with them. I think Rust will often be slower to write but faster to run and Go will be the exact opposite of that. So depending on the exact problem you're solving, Rust or Go might be a better choice. I had been meaning to try my hand at learning Rust, so this video was also my motivation for doing that :) Best of luck with your Go journey!
From what I’ve seen, Go is nice because it includes the batteries. It’s got a lot more concrete functionality in its standard library and it’s easier to write in. However, the main thing I feel Rust has going for it (aside from its longevity and stability-oriented design that gives users the ability to extend the language) is its flexibility. With a bit more work, Rust can compile to almost any target, including microcontrollers and WebAssembly, without resorting to a similar-but-not-quite language (like TinyGo). In Rust, the only thing you lose is the standard library, most of which is actually re-exported from the core library; you just need to bring your own allocator (this is usually done by a controller-specific crate in your dependencies).
@@sploders1019 Yeah there's definitely some nice things in Rust that I wish Go had. Go seems much more targeted for things like webapps and less targeted for embedded. In my game engine I actually compile my Go code to run on wasm though and it seems to work pretty well!
In my experience with these 4 languages, Go feels like what would come out if people experienced in Python tried to remake C, and Rust feels like people experienced in Haskell trying to remake C++. Pretty wild stuff, both with their own different utility
Near native performance iterators would be so damn nice in go but I can kind of understand why they’re resistant to adding them. Their whole mantra is simplicity and sometimes performance gets sacrificed to it. And for most things people use go for it’s actually a pretty nice trade off
Go's mantra is not simplicity. It is whatever the Go team decided that Wednesday. Go is like Turkmenistan and the Go team is like the Grand Turkman that issues fatwas for renaming months.
Awesome video! But I was expecting the difference to be much larger though. Maybe you haven't compiled the Rust code in release mode (cargo run --release)?
Glad you liked it! Don't worry, I did run the code in --release mode. One thing to keep in mind is that this is a physics performance benchmark. All of the memory is pre-allocated. Normally, one of the performance advantages that Rust has over Go is that there is no GC slowing it down, but in this case there are only about 5 really big arrays, which means the GC doesn't have very much work to do (each array represents one GC item, so only 5 checks are needed). Ultimately the biggest thing we measured in the native Go vs Rust benchmark was: "How good is the compiler at optimizing this physics looping code?" Go has a custom compiler and Rust uses LLVM, so that's at least one difference. The other difference (that I mentioned in the video) is that I'm much more used to writing fast Go code, so I might have done some things that I didn't realize were slow when writing my Rust benchmark. Overall I don't think Rust is *slower* than Go. I don't even think that Go is *slower* than Rust. I think that someone can write a slow program in either and someone could write a fast program in either. I also think that the average Rust program will probably be faster than the average Go program, mostly because of the GC being slower than static frees, but as I mentioned, that wasn't a feature of this video. Glad you enjoyed!
@@UnitOfTimeYT Excellent explanation! That makes a lot more sense to me. All in all, programming languages are just tools for the job; it all depends on the person using the tool correctly, not on how blazingly fast a tool claims to be. I think Go is generally considered to be slower than Rust just because people don't use it as efficiently as you do, which tends to make people assume that Rust is the fastest even though that isn't always the case.
@@UnitOfTimeYT you also did some compiler flag trickery, Rust can also do a bunch of that. The funniest imo is the one that sets the concurrent build jobs to 1, makes the compiler extremely slow but hey it can do some extra optimizations
Oh interesting. I guess some optimizations aren't possible with parallel builds? Yeah this was my first time using compiler flags to check things like BCE and inlining. It was super useful for finding the slow points of this weird experiment.
@@UnitOfTimeYT For Rust, I once tried setting LTO (link-time optimization) to true, which turns on "fat" LTO across the whole dependency graph. The default only does thin LTO within the local crate, which compiles faster but isn't good at inlining code between dependencies. Turning it on can make the binary smaller and increase performance.
haha. Yeah I had to upgrade for the sake of your ears. I feel a little bad about the mic quality back then, when it was me with a $15 mic zip tied to a phone mount. Now I have a bit more professional mic setup lol
Don't know if someone else has suggested this yet, but you should try to rearrange things so the invalid-entity check isn't required in the iteration loop. It's a great feature of an ECS that it lets you make certain assumptions and skip those kinds of tests.
Internally I have to make a tradeoff of what to do if an entity gets deleted from my component arrays. I can either 1. Mark the slot as a "hole" -> Fastest delete method, but requires extra check in the iteration loop 2. Move the components from the last index of the array into the deleted array index I kind of do a combination of 1 and 2 where: I mark holes up to a point, but then repack if there are too many holes. Option 1 saves me work for things like particle systems where I'm constantly adding and deleting particle entities. At this point though, the majority of my processing time is spent outside the core ECS and inside major systems like rendering or collision detection. Thanks for watching!
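The two delete strategies above can be sketched in Go like this. `Store` is a hypothetical single-component array invented for the example, not the author's ECS, and a real implementation would also need to keep its entity-to-index lookup in sync whenever elements move.

```go
package main

import "fmt"

// Store is a hypothetical component array with hole tracking.
type Store struct {
	vals  []float64
	alive []bool // false marks a hole left by a delete
	holes int
}

func NewStore(vals []float64) *Store {
	alive := make([]bool, len(vals))
	for i := range alive {
		alive[i] = true
	}
	return &Store{vals: vals, alive: alive}
}

// DeleteHole is option 1: mark the slot as a hole. O(1), but every iteration
// over the store now has to check alive[i].
func (s *Store) DeleteHole(i int) {
	if s.alive[i] {
		s.alive[i] = false
		s.holes++
	}
}

// Repack compacts the live values to the front, removing every hole.
// Indices change here, which is what an entity lookup would need to track.
func (s *Store) Repack() {
	n := 0
	for i, ok := range s.alive {
		if ok {
			s.vals[n] = s.vals[i]
			n++
		}
	}
	s.vals = s.vals[:n]
	s.alive = s.alive[:n]
	for i := range s.alive {
		s.alive[i] = true
	}
	s.holes = 0
}

// Sum iterates with the extra hole check that option 1 requires.
func (s *Store) Sum() float64 {
	total := 0.0
	for i, v := range s.vals {
		if s.alive[i] {
			total += v
		}
	}
	return total
}

// DeleteSwap is option 2: move the last element into slot i and shrink by one.
// No holes are left, but element order is not preserved.
func DeleteSwap(vals []float64, i int) []float64 {
	last := len(vals) - 1
	vals[i] = vals[last]
	return vals[:last]
}

func main() {
	s := NewStore([]float64{1, 2, 3, 4})
	s.DeleteHole(1)
	fmt.Println(s.Sum(), len(s.vals), s.holes)
	s.Repack()
	fmt.Println(s.Sum(), len(s.vals), s.holes)
	fmt.Println(DeleteSwap([]float64{1, 2, 3, 4}, 1))
}
```

A combined scheme like the one described would call `Repack` only once `holes` passes some threshold; where to put that threshold is a tuning decision this sketch leaves open.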
Sorry for the confusion. There's nothing wrong with the for i := range slice. The problem comes with passing the function to be executed *into* the Map function, which puts it inside the for loop. Because the function is dynamically passed in, it's really hard for compilers to inline, so Go can't inline the function inside the for loop. This removes a lot of optimization opportunities, making the code quite a lot slower. In other languages they'll write "iterators" which replicate the for loop logic in the outer function so inlining is more achievable. These sorts of optimizations are fairly specific to my problem set (looping over an enormous number of objects). Most of the time, dynamically passing functions in is good enough performance-wise. Thanks for watching!
I don’t understand how iterator in other languages help. You still pass in say a custom lambda to a map function and the map will pass an iterator inside your lambda, right ? What do you mean it can “replicate the loop logic in an outer function” specifically what is outer and is there an inner function for your case?
@@ChuPang This is what I mean. Sorry if there's any errors in it. I didn't test any of the code: gist.github.com/unitoftime/a95003111e5bacd47d9b4a7d71197a1d The problem is more: I couldn't find a way to write iterators (like other languages have) in Go without losing performance equivalent to just using a map function.
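As a rough illustration of the callback-versus-loop difference discussed in this thread (this is not the contents of the linked gist, and `Position`, `Map`, and `moveAll` are names made up for the example):

```go
package main

import "fmt"

type Position struct{ X, Y float64 }

// Map calls f on every element. Inside Map, f is just a function value, so
// unless the compiler manages to inline Map into the caller and see through f,
// every element costs an indirect call that blocks further loop optimization.
func Map[T any](items []T, f func(*T)) {
	for i := range items {
		f(&items[i])
	}
}

// moveAll is the hand-written version: the work sits directly in the loop body,
// so there is no call per element for the compiler to get rid of.
func moveAll(items []Position, dx float64) {
	for i := range items {
		items[i].X += dx
	}
}

func main() {
	a := []Position{{0, 0}, {1, 1}}
	b := []Position{{0, 0}, {1, 1}}

	Map(a, func(p *Position) { p.X += 2 })
	moveAll(b, 2)

	fmt.Println(a, b) // same result, potentially different cost per element
}
```

How large the gap is depends on the Go version and on what the inliner decides, so a benchmark on the real workload is the only reliable answer.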
I think you could make the algorithm significantly faster by optimizing the collision-check loop at 0:34. You check every distance twice there (once from A to B and again from B to A). I'm assuming that the distance function is quite expensive. You could either cache (commutatively) the results of the distance function or keep track of which positions have already been compared and skip. Of course, it's not really relevant to this video, if you used the same algorithm in both languages.
Yeah there are definitely some algorithmic optimization opportunities in the benchmarks. Probably the simplest would be to compare each distance only once by having the inner loop start from the outer loop position + 1 (instead of starting from 0 each time). There are also some spatial lookup optimizations you can do, such as quadtrees, KD-trees, spatial hashes, or, as suggested by another comment, Bounding Volume Hierarchies. But you're right, I was mostly interested in the looping performance and not the algorithms for this particular video. Thanks for the comment and thanks for watching!
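A minimal Go sketch of the "start the inner loop at outer position + 1" idea, using a made-up `Vec2` type and a simple squared-distance test rather than the benchmark's actual collision code:

```go
package main

import "fmt"

type Vec2 struct{ X, Y float64 }

func distSq(a, b Vec2) float64 {
	dx, dy := a.X-b.X, a.Y-b.Y
	return dx*dx + dy*dy
}

// countOrdered checks every ordered pair (i, j) with i != j, so each pair of
// points is tested twice: once as (A, B) and once as (B, A).
func countOrdered(pts []Vec2, r float64) int {
	n := 0
	for i := range pts {
		for j := range pts {
			if i != j && distSq(pts[i], pts[j]) <= r*r {
				n++
			}
		}
	}
	return n
}

// countUnique starts the inner loop at i+1, so each pair is tested once.
// That is n*(n-1)/2 distance checks instead of n*(n-1).
func countUnique(pts []Vec2, r float64) int {
	n := 0
	for i := 0; i < len(pts); i++ {
		for j := i + 1; j < len(pts); j++ {
			if distSq(pts[i], pts[j]) <= r*r {
				n++
			}
		}
	}
	return n
}

func main() {
	pts := []Vec2{{0, 0}, {1, 0}, {5, 5}}
	fmt.Println(countOrdered(pts, 1.5), countUnique(pts, 1.5)) // 2 1
}
```

If a collision needs to affect both entities, the single check in `countUnique` would have to apply the result to both `i` and `j`.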
@@UnitOfTimeYT Incrementing the start value of the inner loop sounds elegant. I didn't even think of that. And I've never heard of the other optimization options you mentioned. Time to do some research :-) Thx for your reply!
@@UnitOfTimeYT maybe interesting: for bevy, i made a plugin called bevy_spatial intended to allow easy drop-in usage of kd-tree (and for static objects, r*-tree)
If you like using assembly etc., there are some optimization guides you might want to look at. I don't know anything about Golang, but they'll tell you how long different instructions take on different architectures, how many pipelines they have, etc.
That's a good suggestion. Yeah, I try not to get too far down in the weeds unless I think it's important, but reading a guide would probably help me find potential optimization targets faster.
Have you implemented spatial access (i.e. only iterating through objects that are within a certain distance from yourself)? If you have, I wonder if that explains the performance difference between the Rust and Go code once you added back in the finite health bounds, as I'm sure you didn't consider it in the Rust implementation. Then again, it seems the Go and Rust implementations follow the same curve, just scaled differently, so perhaps you didn't consider it in the Go implementation either. In Rust, one option to do this is with the bevy_spatial plugin for Bevy, which allows you to perform spatial queries for performance (otherwise operations will be O(n^2)). This might be an interesting and rewarding API implementation to pursue in your own engine. Also, I feel I should point out (tangentially to Errata No. 1) that in Rust, type aliases are considered to refer to the same type, so they will share monomorphizations in a similar way to how you described stenciling. Also, we do have dynamic dispatch in Rust by way of so-called "trait objects" using the dyn keyword.
For this video, I didn't do any spatial access optimizations in any of the benchmarks. I just did the really basic, naive n^2 collision checks. KD-trees and spatial hashes are two optimization targets I'm going to look at in the future (and when I do, I want to make a video about it hopefully). One thing I regrettably didn't cover in the video was the delta between Bevy and my Go ECS. A few things might contribute to this (I'll put this in the errata section too, now that I know at least one person reads it :)): I think numbers 1 and 2 are probably the biggest contributors to Bevy running slow. 1. Bevy has A LOT more features than my ECS. Each of those features might contribute to a global optimization for most cases, but in my benchmark contribute to a slowdown. I expect that most people aren't disabling multithreading and the system scheduler and manually executing every game loop. 2. Bevy has a few different layers at which you can use the ECS. I'm not a Bevy expert, so I may have written the code in an abnormal way where Bevy ran slower than it usually would have. Very specifically, Bevy seemed to slow down the more I started adding and deleting components. I'm not sure of the limitation there, but I thought it was notable. 3. Bevy isn't the fastest Rust ECS framework according to this: github.com/rust-gamedev/ecs_bench_suite - I learned this actually kind of recently, after all my code was written for Bevy. On the Rust monomorphization note you made, that's interesting. I didn't know that. The unfortunate part (if you're interested, read the planetscale article in the description) about the Go generics implementation is that *every pointer is the same GC shape*, which means that *every pointer shares the same set of generic functions*, and Go has to box/unbox these on every function call (which really adds up in critical code paths).
@@UnitOfTimeYT That's an interesting and somewhat unfortunate note about Go generics. I really wonder why they didn't simply choose to go the monomorphization path - the reason that monomorphization takes up compiletime in Rust is primarily because of the much more complex and advanced typesystem that Rust has and the potentially huge numbers of monomorphizations that might need to be generated. I assume - or at least hope - that the decision to use GC stenciling was informed by extensive profiling and consideration before standardization, but at a glance it seems a very strange decision to me, as I don't see any parallel features that Go has with Rust that would cause monomorphized generics to inflate compile times in the same manner.
@@nekodjin Yeah I'm not really sure either. I think that the typical Go applications might have been more inclined towards the GC stenciled version of generics. I think that most people aren't writing the kind of Go code that I'm writing, which might fit into Go's narrative of solving the majority of the problem. I guess we will have to wait and see how it turns out in a few versions, they might have some long-term optimization plan for it that we don't know about (Fingers crossed) :)
@@UnitOfTimeYT I would add the "rayon" crate in the "check_collision" function to parallelize the collision detection. You can use the par_iter_mut method (available through rayon's prelude), which returns a parallel iterator over mutable references to the elements of a collection. You can then use the for_each method of the ParallelIterator trait to apply a function to each element in parallel.
@@romanstingler435 Thanks for the info, for this particular comparison I was doing single threaded performance. But it is cool how easy it is to parallelize code when the data is laid out this way :)
Hey guys, isn’t it possible to use some kd-tree instead of the second loop? It of course has nothing to do with the Rust vs Go comparison, but a quick check for collisions looks like a problem solvable by a better data structure.
Yeah definitely. From what I've read online, it seems like KD-trees and spatial hashes are good optimization strategies to reduce the number of comparisons you have to do in the world. I'm hoping to make a video about it one day. I also wonder what a hybrid structure of coarse-grained spatial hashing + fine-grained KD-trees would look like. So I want to look into that too.
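For a feel of what the spatial hash side could look like, here is a small uniform-grid sketch in Go. Everything in it (`Grid`, `Near`, the cell size) is hypothetical and only shows the basic idea: bucket points by cell, then compare a point only against the 3x3 block of cells around it. It assumes the query radius is no larger than the cell size.

```go
package main

import (
	"fmt"
	"math"
)

type Point struct{ X, Y float64 }

type cell struct{ cx, cy int }

// Grid is a hypothetical uniform-grid spatial hash: each cell coordinate maps
// to the indices of the points that fall inside that cell.
type Grid struct {
	size  float64
	cells map[cell][]int
}

func (g *Grid) cellOf(p Point) cell {
	return cell{int(math.Floor(p.X / g.size)), int(math.Floor(p.Y / g.size))}
}

// NewGrid buckets every point once; this is rebuilt (or updated) per frame.
func NewGrid(size float64, pts []Point) *Grid {
	g := &Grid{size: size, cells: make(map[cell][]int)}
	for i, p := range pts {
		c := g.cellOf(p)
		g.cells[c] = append(g.cells[c], i)
	}
	return g
}

// Near returns the indices of points within radius r of pts[i]. It assumes
// r <= g.size, so only the 3x3 block of cells around the point can contain hits.
func (g *Grid) Near(pts []Point, i int, r float64) []int {
	var out []int
	c := g.cellOf(pts[i])
	for dx := -1; dx <= 1; dx++ {
		for dy := -1; dy <= 1; dy++ {
			for _, j := range g.cells[cell{c.cx + dx, c.cy + dy}] {
				ex, ey := pts[j].X-pts[i].X, pts[j].Y-pts[i].Y
				if j != i && ex*ex+ey*ey <= r*r {
					out = append(out, j)
				}
			}
		}
	}
	return out
}

func main() {
	pts := []Point{{0.5, 0.5}, {1.2, 0.5}, {9, 9}, {-0.3, 0.5}}
	g := NewGrid(1.0, pts)
	fmt.Println(g.Near(pts, 0, 1.0)) // neighbours of point 0; point 2 is never even compared
}
```

With roughly evenly spread entities this turns the all-pairs check into a handful of comparisons per entity; with everything piled into one cell it degrades back to n^2, which is where tree structures tend to be suggested instead.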
@@UnitOfTimeYT I found myself in this situation, and BVHs (bounding volume hierarchies) are easier to implement, and their construction looks like a KD-tree construction. The magic of a BVH is that you can update an existing BVH incrementally for most frames, so there's no full rebuild every frame.
I've seen other comparisons like this, where C code that's been optimized over the last several decades and looks unlike a natural writing of the concept... is compared to Rust code that's 10 minutes old. If you give both projects equal time that's one way, but I think equal in-depth knowledge would be a better comparison. I.e. if for Go you have to understand how the inlining system works, you need to give Rust a similar advantage and go panics/auto-fail if something can't be learned about rust of equal value.
Yeah that's a valid point. I'm sure there are some things I did which aren't idiomatic Rust because I'm unfortunately not a rust veteran. That said, the code is fairly simple in the "native benchmark" and didn't include any of the optimizations mentioned in the video. For the ECS library comparisons which did contain my optimizations (ie my ECS vs Bevy) - my assumption is that the Bevy authors have already done their own set of optimizations, so in that sense Bevy probably has had more time than my ECS. I left an errata at the end of the description which contains a few notes about why Bevy performed a bit slower than my ECS. The biggest contributor (I think) is that Bevy just simply has more features than my library. There might be optimization in Bevy which improves performance for general cases (probably some thread safety features), but causes a slowdown in my specific case. Or just convenience features that add up to a net slowdown. Thanks for watching! Hope you enjoyed!
@@cheako91155 The Go compiler was released in 2012; LLVM was released in 2002. The development of Go began in 2007; development of Rust began in 2006. The Rust Bevy library is older than the owner's video code. All conditions are biased in Rust's favour. Go's good performance in these conditions deserves recognition.
@@baxiry. I was talking about comparisons to C code, obviously. For Rust to generate optimized assembly might take the same level of hackery we see in C code today, but those tricks haven't been discovered. Rust needs at least a few decades to be mastered to a comparable level.
Yeah I totally think there is some language sitting out there with a perfect balance of Rust and Go features. I know very little about building languages, but I'm itching to figure that one out lol
@@UnitOfTimeYT I don't think that such a language exists, but you can always create your own! :) As a Go programmer learning Rust for the first time, what things stand out to you about Rust, and what things do you have in Go that you miss in Rust? What are your favorite features or properties of Go?
@@nekodjin haha. If I had infinite time in my life then I might try to make a language. Maybe one day when I've tried a few more langs I'll try to build one. The things I like in Rust are: Race-Free concurrency, ADTs/Enums, and the fact that generics are always monomorphized. I'd probably like macros, but I don't understand those well enough to really have an opinion on them. I like the general idea of having a way to write code that writes code though. For the inverse, I like that Go is really simple, and that I can focus more on getting my algorithms to work rather than focus on figuring out what language feature I need to use in what location. I like that Go builds are really fast. And not that I've done much with async, but I have a feeling I won't like it. I did the javascript version of async in javascript and I wasn't much of a fan. I like how Go's concurrency is modeled (maybe there is a rust equivalent though). Obviously , lots of opinions most of which might change as I learn more :P
Glad you liked the video! I started programming maybe like 13 years ago, and I originally got into it because I wanted to make video games. So I've been failing at making games for about 13 years, with some breaks in the middle :) I'm very recently doing Go and Gamedev much more full time and trying to be a little bit more organized and efficient with how I make things.
In the bevy example I used the bevy iterators. In the native example, I did regular for loops but with all the arrays being preallocated the same size at the start I assume LLVM can do BCE already. Though I did not check. I made that assumption because in the Go native example, the compiler was able to automatically do BCE.
nice video - i really like the ergonomics of developing with ECS and it's nice to see implementations in various languages. I have some thoughts on your "why is bevy slower" description a) That's probably the largest point - while you won't get much out of parallel queries (can't parallelize mutable access to the same data), Bevy can parallelize even the for loops inside a system, by using methods such as par_for_each(_mut) b) by default, bevy uses what it calls "Table" storage for components - this is fast to iterate but slow to update. In use cases where you might want to add/remove components a lot, you can switch to SparseSet storage on a per-component basis. c) The ECS benchmark you mentioned is outdated by 2 years, and was also recently archived to represent this. Bevy has had multiple optimizations since then. I see you using bevy_ecs 0.7 - in 0.9 there were some decent performance improvements which would be interesting to see. Sadly, i'm too lazy to run the benchmarks myself.
Thanks for the thoughts! Glad you liked the video. Yeah I also like organizing my games into an ECS, the structure feels lot less spaghetti to me than anything else I've used. a) Yeah I agree b) Oh interesting, I didn't know this, but it does align with what I saw where Bevy slowed down once I started deleting entities every frame. Let me add this note to my errata. c) I'll add this one too. I always thought it'd be cool to make a video about how the performance of some thing (probably a language) improves over the years by running the same benchmark on every version. I feel like it's cool to see how software improves over time. Thanks!
Yeah, relatively speaking it's safer than say C or C++. You can still have null pointers and things like that, but Go will prevent any invalid memory accesses as panics AFAIK. As an aside, you can write memory unsafe programs with race conditions: research.swtch.com/gorace
Not to start any language wars, but while making my rust benchmarks I learned that Rust will provide memory safety across threads (which is really cool!) because the borrow checker will prevent data races altogether. So I think Rust might actually be considered a bit safer than Go. Of course I do still love Go! :)
Good video. Makes me sad so many Rust programmers can't see past "ohh Go has GC so it must be slower than Rust". Especially on something like this where garbage collection isn't particularly common.
Yeah, I was kind of surprised myself about Go GC being little to no impact. I was going to have a section of the video about "The impact of the Go GC", but I disabled the Go GC and it didn't speed anything up. I think the lesson is that, even in a Garbage collected language, you can still kind of "manage your memory" in somewhat manual ways to get better performance.
@@UnitOfTimeYT Go's GC is much more lightweight than Java, that much I am sure of. From your comment it seems that Go's GC only adds overhead when a novice is writing the code and doesn't manage memory properly. If the quality of code is like Rust, there will be no impact of Go's GC. But the problem is that you never know the impact of GC unless you have something to compare to.
I think it's more about choice, you can have GC in Rust(it was before 1.0) but different projects need different things. Rust allows(strives) for GC, single allocation, group allocation or no allocation. In some instances a GC can kill a program. In others GC vs non GC is meaningless. In some GC is even better than non GC. It's just about knowing your data and testing to know what will work best for you.
@@dynfoxx You're talking about the runtime for managing green threads which was required for IO bound apps. It is now known as tokio. It is not a GC. There are unofficial GC implementations if you want to use, but the only official paradigm which comes remotely close to doing what GC does is Reference Counter and it is not a GC. By no allocation, do you mean lazy execution? The resource will be allocated only when condition is fulfilled? Even so it is not GC. Check once_cell crate, everything is checked at compile time.
I’d like people to compare tools that are used in the same context; Rust & Go aren’t used to solve similar problems. I'd rather see a Zig vs Rust comparison, that would be very interesting.
I don't really know Zig, but from what I've read it seems to have some nice features. I do think that Rust and Go is a valid comparison, IMO they are both fairly high-level systems programming languages, but with different philosophies on how to abstract the developer from hard things such as concurrency, memory safety, package management, etc.
I didnt understand one bit just came for a Rust Vs Go Fight saw the graphs and liked the video 👍 Wish I were as knowledgeable as you 😅😅Too Hard for my brain to get these things 🙏 I should have read the word "physics" in the title and stayed away 😆
Hey, yeah for the rust code I ran in release mode, which as I understand it will enable all optimizations. And then for Go there is really only one build mode. Cheers!
haha yeah. I actually tried to do an experiment to see the performance delta between debug builds and --release builds, but debug build took so long I couldn't stomach waiting for results and gave up. O3 appears to be the default opt-level for release builds, which is what I used.
@@UnitOfTimeYTi know it may sound crazy so for you and other readers to understand why I’m suggesting it: Julia has been shown to be comparably as fast as C, while being as easy as Python. It achieves performance by means of a combination of monomorphization and LLVM optimizations. It’s becoming popular in the scientific community which needs to do, among other things, particle simulations similar to this video. It has a REPL and commands (macros actually) that spit out the intermediate (LLVM) and final assembly code. And it’s backed by the MIT. If the focus of this channel is purely game development, Julia can be at most a curiosity as it lacks the libraries and integrations for game development. But high-performance numerical computing and data visualization are some of the very reasons for its existence… 🙂👋
No Problem! This is all single threaded performance. I'm hoping that once I support multithreading I can do a similar comparison type video but with multiple threads.
Yeah you go with clickbait just by calling your channel what it is. Bear in mind there is no such thing as time there is only speed. Time is concept of how much speed of human in relativity of position in space to the rotation of the earth and it's relative position in space has passed. Time is just a thing made up to get context for the brain that is unable to wrap it's mind around this law to make it more accessible :)
@@UnitOfTimeYT don't sweat it, thanks for the answer. You know how quickly it could escalate in the comment sections. Have a fine day and keep on providing your videos. :)
My main takeaway from this video: A naive code in non GC based languages will always outperform naive code in GC based languages. I code in Rust, and didn't have any idea that Go code can be optimized this way. You have gone to such lengths to reduce GC checks in Go, it's mind blowing. Reminded me of Just-JS web framework. I want to know your take on reducing GC overhead in highly concurrent Go applications. By GC overhead I mean the periodic cleanup of useless items taking up resources, as Rust has no GC to clean up.
Haha thanks! ECS is a fairly well established pattern for game development, which does have the side effect of reducing GC load (because memory is more manually organized), but the majority of the optimizations that I spent time on were around inlining and bounds check elimination (BCE). Also, of course I had to write the ECS framework to begin with :) Thanks for watching!
I don't think that it's accurate to say that GC makes up all of the difference here. In some situations, GC can be more efficient than reference counting. I think the primary difference in performance comes from such facts as the inefficient way that Go implements generics, as well as the fact that the Rust compiler is just outright better at optimizing code than the Go compiler by a wide margin.
@@CheaterCodesYou need to do reference counting to do anything that even remotely resembles a real program, so reference counting is really just the default rust way.
I didn't do a C++ comparison here unfortunately, but I have made a video in the past called "Go OpenGL performance" which compares C++ and Go in terms of how many sprites they can draw on a screen each frame. C++ is better, but Go was close enough behind that I wasn't too bothered.
No matter what you do, Rusties will glorify their language like they are in a cult. The marketing was for sure done correctly by big corporations for political reasons, not engineering ones. It's kinda sad to read in the comments how people discovering our tech industry are thinking. Meanwhile Go and Zig are trying to take their time, as we should do in an engineering world. If you want to have fun, you could have a native layer with Zig then bind your game logic in Go. You will have remarkable results by using a binding of FLECS in Zig. Go is not meant for video games due to slower native calls. Zig and Go work well together though. Take your time to create useful, meaningful and timeless solutions.
I haven't played with Zig too much, but I've heard good things about it. I actually made a video about OpenGL performance in Go (Basically hoping to measure the CGO impact, by comparing it to a similar C++ implementation). I was somewhat pleased with the results. It turns out the CGO impact doesn't really bother me. That said, I mostly make pixel art games. If you had to write a game which had an enormous amount of OpenGL calls you'd definitely see some more impact than me.
While some people are obsessed with Rust, I don't think it's more than any Go or C++ fanatics. What companies pushed Rust? As far as I know Rust is basically a community-driven language (has been for many years). In what way is Rust going too fast? Not trying to be confrontational, just think it's good to talk to people with different ideas.
You performed benchmarks without creating a rust release build … a more accurate comparison would require you to build your own ECS in rust like you did in Go lol.
Stop fighting with numbers, that's not cool and very childish. The way you do your programs affect how they will perform, doesn't matter if it is C#, C, Go, Rust or even pure assembly.
Great video! It was very nice following along the progress and finally seeing it all together was very interesting. My ecs still beats yours :elbowcough:
I actually only recently learned that Go had a `goto` statement. So you're saying that people dance around inlining by putting everything in the same function and using `goto` to switch between different methods? I assume you can't `goto` some code in another method.
@@UnitOfTimeYT You can't jump from one method to another using goto, unless you switch to Assembler. Gotos were very useful in complex numeric methods for increasing a performance, but required incredible programming skills which very limited number of people have.
@@UnitOfTimeYT My classical advice would be to avoid using gotos unless it is absolutely positively necessary. Gotos, as a general rule, make programs much more logically complex and messy, making them harder to understand, harder to maintain, and harder to debug. The situation in which it is impossible to get a result that is "good enough" without using goto is vanishingly rare.
For a 5 minutes video it packs a whopping amount of great insights. Thanks a lot
Rust iterators and its other zero-cost abstractions are what makes it so good in my opinion. Being able to write more readable code, and less of it, and get more optimized code than I could've written explicitly myself is just a win-win.
Sure memory safety is great, but even std C++ iterators can have poor performance, depending on how you use them. Thanks to Rust's borrow checker, it can optimize iterators by not needing to do boundary checks, since it knows the size ahead of time.
Using abstractions in Rust is often faster than explicit code. Of course it's not a silver bullet (e.g. there's no way to bail out early in an iterator closure).
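For anyone who wants to see the difference being described, here is a small sketch comparing an indexed loop with the iterator form. The function names are made up for illustration, and whether the optimizer actually removes the checks in the indexed version is something to confirm in the generated assembly rather than assume. The last function shows that short-circuiting adapters such as `find` do cover some early-exit cases.

```rust
/// Index-based sum: each `values[i]` is a checked access unless the
/// optimizer can prove that `i` is always in range.
fn sum_indexed(values: &[f32]) -> f32 {
    let mut total = 0.0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

/// Iterator-based sum: there is no index, so there is nothing to bounds-check.
fn sum_iter(values: &[f32]) -> f32 {
    values.iter().sum()
}

/// `find` stops at the first element that matches the predicate.
fn first_negative(values: &[f32]) -> Option<f32> {
    values.iter().copied().find(|v| *v < 0.0)
}

fn main() {
    let values = [1.0, 2.5, -3.0, 4.0];
    println!(
        "{} {} {:?}",
        sum_indexed(&values),
        sum_iter(&values),
        first_negative(&values)
    );
}
```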
Yeah I think Rust is a super cool language. I'm hoping to learn more of it over time! After learning rust iterators I found myself somewhat frustrated that Go doesn't have iterators that can give me at least near-native performance. I have a whole file of Go benchmarks filled with iterator attempts that are way slower than regular for loops. Makes me sad
@@UnitOfTimeYT The trait system is just awesome. There are so many extremely useful traits in std that you just keep uncovering.
IMO, Rust is frustrating to learn because it combines concepts from a lot of languages, and the best way to learn it is to just learn the basics, get a rough overview of what else is possible, and learn the rest whenever you find a use case for it.
It makes little sense to try to learn it all at once since it's way too much, and you likely do not need many of the features because they are for specific use cases you don't have.
If you start using async, learn about futures. You need to iterate over your custom type? Learn the iterator traits. You realise you need a big code chunk multiple times but can't use a function? Learn macros.
Yeah that's a good point. Rust definitely pulls a lot of good parts from a lot of languages. I just feel like the spec can be kind of large. My experience with learning was kind of like
1. I want to write a physics simulation in a Rust ECS
2. Bevy is a popular rust ECS, let's use that
3. Let's try to get a basic example from Bevy docs
4. Oh I have no idea how borrow checking works
5. Let me read the rust book
6. Oh now I kind of understand how the borrow checker works
7. Let's try that again
8. Loop 5-7 every time I don't know something
The problem I had though, was that at the start I don't know rust, so I don't know what features do what. So when Bevy uses a feature (or mentions a feature), I'm basically forced to go learn what that feature is. And because the rust spec is somewhat big I end up having to learn a lot of it just to use Bevy.
I think some parts of this is me being a little lazy, where I just want to get my program written. But I think there is a tough to find balance (for any language) of "At what point do we add this new language feature, and at what point do we not". I'm not trying to criticize, and I don't even think I know the solution. This was just the experience I had.
@@UnitOfTimeYT I think that's fine. The same applies to other languages. E.g. in Rust I don't always opt for iterators, unless I know it's straightforward. I usually start with simple for loops and get it to work, then optimize later.
Perfectionism is a struggle when learning something new, but I try to deal with it by just doing it (applies to many things). I like to go nitty gritty in details but it often leads to rabbit holes, so unless I am either stumbling upon a better way, or I know I need it, I just keep it as is until I've had more time with the language/APIs. Chances are that a light bulb goes off in your head when you try to implement something different with similar semantics and you can use that to refactor your old solution.
A good example of this is polymorphism in Rust. There are different ways to go about it. Usually I start with enums that wrap structs, because they are the easiest to work with, even though they may not be the right solution for what I need.
Traits on the other hand are very useful in cases where you 1) have a lot of variants and 2) want to allow others to "plug-in" to your system in some way (like Bevy's plugin system).
Enums don't always apply well for these cases, because they are kind of "static", so they are mostly useful for cases where you don't want to expose that kind of flexibility on purpose.
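A minimal sketch of the two approaches, with made-up types (`Shape`, `HasArea`, `Square`) purely for illustration: the enum is a closed set that only its owner can extend, while the trait lets any other crate add its own type.

```rust
/// Closed set of variants: adding a shape means editing this enum
/// and every `match` over it.
enum Shape {
    Circle { radius: f32 },
    Rect { width: f32, height: f32 },
}

impl Shape {
    fn area(&self) -> f32 {
        match self {
            Shape::Circle { radius } => std::f32::consts::PI * radius * radius,
            Shape::Rect { width, height } => width * height,
        }
    }
}

/// Open set: any downstream type can implement the trait and "plug in".
trait HasArea {
    fn area(&self) -> f32;
}

struct Square(f32);

impl HasArea for Square {
    fn area(&self) -> f32 {
        self.0 * self.0
    }
}

/// Works with any mix of implementors, at the cost of dynamic dispatch.
fn total_area(items: &[Box<dyn HasArea>]) -> f32 {
    items.iter().map(|item| item.area()).sum()
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rect { width: 2.0, height: 3.0 },
    ];
    for shape in &shapes {
        println!("enum area: {}", shape.area());
    }

    let plugins: Vec<Box<dyn HasArea>> = vec![Box::new(Square(2.0)), Box::new(Square(3.0))];
    println!("trait total: {}", total_area(&plugins));
}
```

The enum version keeps everything in one place and the compiler checks that every `match` is exhaustive; the trait version trades that for extensibility.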
Yeah totally agree. Building software is a very iterative process and it's definitely fun to have those realizations of "Oh man this is way better than the old thing I was doing"
Thanks! Great video!
Thanks! Glad you liked it! Thanks for the superlike!
An excellent video showing how Golang can be fast if memory is correctly managed. It would be really interesting to get an update using the latest versions of Golang, specifically using (PGO) Profile Guided Optimisation.
Rust basically implies most of what you were optimizing manually in Go. Borrow checker and exhaustive compile time checks allow for that as a default.
To be fair, Cart, the guy who started Bevy, is pretty freaking brilliant.
Yeah Bevy is an absolutely massive production and hosts waaaaay more features than I have in my ECS right now.
@stysner4580 Bevy has its own ECS
@@UnitOfTimeYT You are doing a good, hard job. You can get over it.
who knows ? We may see your work being compared to Godot Unity in the near future
4:00 I never heard the term "box" used in programming outside of Rust lingo (and basic explanations for variables). It means heap allocated objects there. What do you mean by it in this case? Also, good job on matching Bevy's ECS, even if the optimizations ended up being a bit gnarly
Also, queries can be iterated faster in bevy by using the for_each method rather than using the iter methods that I found in your repository. You should try updating them! I could try to fork and benchmark it if you want, sounds interesting
Oh interesting. I actually didn't know that box was a specific term inside the rust ecosystem. The first place I heard the term "boxing/unboxing" was in Java where they convert things like primitive ints into capital letter `Integer` (where Integers inherit from the java base class `Object`). In the video I'm using it kind of loosely: Basically, in Go when you have a generic function, sometimes they will monomorphize it once then pass multiple types into that single function. At that point the only way to pass multiple different types into a single function is to make all the types look like one type. Go achieves this by passing in (I think) a pointer to the base data type, then they also pass in a dictionary which holds some runtime information and a Virtual function table. The virtual function table serves as what I was calling the "boxed" value. If you take a look at the planetscale article in the description, you can read a much much better and more detailed explanation.
Also, I didn't know about the bevy for_each way of iterating. I'll have to look into that some more! You're more than welcome to create a fork and open a PR with those changes, someone else already has too! But if you don't I definitely will!
@@UnitOfTimeYT It's a similar concept, just in Rust it's more about size (in memory) and ownership, with the contained type being less-important (lots of caveats around traits here). As a former Go developer gone Rust, the biggest hurdle was paradigm, in the end. Functional code works well in both languages, just the syntax is more specific in Rust, while the typing is more specific in Go (imo).
@@tsalVlog Even JavaScript has boxing btw
Rust has the Box generic type which is a container / wrapper of a value of a different type. The concept of boxing exists in programming in general, and is the concept of wrapping over a value of some type.
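A short sketch of what that looks like in practice; the `List` type and `sum` function are illustrative names only. `Box<T>` owns a heap-allocated `T`, and one common reason to reach for it is a recursive type, where the inner value needs a fixed-size pointer.

```rust
/// A recursive type needs indirection: `Box` stores the rest of the list
/// on the heap, so `List` itself has a known size.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    // "Boxing": move a plain value onto the heap behind an owning pointer.
    let boxed: Box<i32> = Box::new(41);
    // "Unboxing": dereference to get the value back out.
    let unboxed: i32 = *boxed + 1;
    println!("{}", unboxed);

    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{}", sum(&list));
}
```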
@@UnitOfTimeYT My prof used to use unboxing/unwrapping interchangeably when talking about Optionals so I guess the analogy is just helpful with many things in programming.
I gave up on Go to switch to Rust -- primarily because they were taking way too long to release generics. Rust has a very rapid, but stable, release cycle, and Go will get left in the dust. Which is unfortunate, I believe it's one of the best languages we have for programming, but the speed at which limitations are removed is hurting the adoption.
Yeah Go is definitely reluctant to release new features sometimes. But I'm generally pretty happy with where the language is. There are a few times where my problem could be solved with a missing language feature, and I have to work around it. But I think I'm in the uncommon domain of doing game development in Go.
lmao 😹
Good old Fortran, C and C++, full-stack language will probably lack performance optimization options. Julia and WASM are more promising.
Man, I can't wait till hype kills Rust because it's not the magic solution every dweeb keeps saying it is to everything.
But aren't Go and Rust used in completely different situations? Rust is used in most low level systems while Go is primarily used in cloud native distributed services. How can one be a direct competitor of the other?
Excellent work! Working on a VERY similar problem at the moment and watching your process has been helpful, if for nothing else, to validate some decisions I've made along the way.
I've mostly been looking at Unity's DOTS/ECS implementation as reference, but being closed-source, I've had to make lots of assumptions based on the API alone.
In any case, this is great content. Hoping to see more adoption of Go for games.
Hey! Super happy to hear that!
I also made this video where I go over the high level implementation of my ECS architecture: ua-cam.com/video/71RSWVyOMEY/v-deo.html
In the description of the video I linked, I put some references to some really good articles about ECS design by some authors of the more mainstream ECS frameworks.
Best of luck!
Go does have some things now for creating iterators, might be worth checking out.
Ah yeah I'm very looking forward to iterators! I think they are behind a feature flag still. But I still need to dig into it a bit!
this is amazing!
thx for making and sharing
Thanks! Appreciate the encouragement! Glad you liked it!
Not a game programmer myself, but I do write Go in my day-to-day work. Learnt a lot from this video about Go code optimization. Do you have a compiled resource on where I can learn more on this topic?
Hey! Glad you enjoyed the video! I don't really have a compiled resource on performance optimization. But there are quite a few good blogs out there that will sometimes talk about it. Here's a few I know of:
1. Russ Cox's blog: research.swtch.com/
2. Dave Cheney's blog: dave.cheney.net/high-performance-go (Specifically this one: dave.cheney.net/2020/04/25/inlining-optimisations-in-go )
3. Planetscale about generics: planetscale.com/blog/generics-can-make-your-go-code-slower
Hope that helps!
Amazing that you got a GC language running this fast
Thanks! One thing to keep in mind is that this is a physics performance benchmark. All of the memory is pre-allocated. Normally, one of the performance advantages that Rust has over Go is that there is no GC slowing it down, but in this case there are only like 5 really big arrays, which means the GC doesn't have very much work to do (each array represents one GC item, so only 5 checks are needed).
[This is an excerpt from a comment I left to Wander Watterson, You can see the full response there]
gc hate is waaay overblown
@@jongxina3595 In complex apps the problem aggravates. If your purpose is served in GC based apps, go for it. If GC is the bottleneck you may have to invest in development effort later on (either optimizing like the author did here, mind you it was a lot of work for such a small codebase) or migrate to a non GC based language.
@@jongxina3595 Garbage collecting is… garbage. Avoid garbage languages.
@@jongxina3595 GCs are the best thing ever made.
Till the day it gets in the way.
Nice research, thanks for sharing!
People really like hating on GC'd langs, especially in gamedev, but people still use Unity(C#)/Unreal(Blueprints)/Godot(Gdscript) and they all use GC and/or interpreted langs on top of C/C++.
Anyways good job!
Thanks! I appreciate it! Yeah i'm not really sure how the GC gets "compiled in" to the final binary in, say, a Unity game. I think a lot of the more critical components are in C++. But I think like 90% of games that I'm interested in playing could run just as well in a GC language. I think good Algorithms and memory layout are the big optimization targets.
One thing I have noticed in my own coding in Go is that, because it's so easy to create memory, I sometimes add allocations (without thinking) to my critical code loops rather than pre-allocating. You could do this in any language, but it's easier to accidentally do in a GC language I think. Usually not hard to fix: Just move the allocation up further and re-use/pool memory better.
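The same fix looks roughly like this in Rust, as a sketch with made-up function names (`neighbors_alloc`, `neighbors_into`) rather than code from the benchmark: the first version allocates a new `Vec` per call, the second writes into a buffer the caller owns and reuses every frame.

```rust
/// Allocates a fresh Vec on every call: convenient, but costly in a hot loop.
fn neighbors_alloc(positions: &[f32], center: f32, radius: f32) -> Vec<usize> {
    positions
        .iter()
        .enumerate()
        .filter(|(_, p)| (**p - center).abs() <= radius)
        .map(|(i, _)| i)
        .collect()
}

/// Reuses a caller-owned buffer: `clear` keeps the capacity, so once the
/// buffer is big enough, later calls do not allocate.
fn neighbors_into(positions: &[f32], center: f32, radius: f32, out: &mut Vec<usize>) {
    out.clear();
    for (i, p) in positions.iter().enumerate() {
        if (p - center).abs() <= radius {
            out.push(i);
        }
    }
}

fn main() {
    let positions = [0.0, 1.0, 5.0, 1.5];
    // Allocate once, outside the per-frame loop.
    let mut scratch = Vec::with_capacity(positions.len());
    for frame in 0..3 {
        neighbors_into(&positions, 1.0, 1.0, &mut scratch);
        println!("frame {}: {:?}", frame, scratch);
    }
    println!("one-off: {:?}", neighbors_alloc(&positions, 1.0, 1.0));
}
```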
hhhhh. agree
Sorry for the impolite comments from some Rustaceans here. Great experiment! But as you mentioned, definitely curious about workloads with more disparate alloc. Hmm but in that case, maybe the allocator becomes a factor too, and in Rust you can change it from system default to jemalloc or mimalloc, so more variables to consider. It’s really hard to isolate only the ECS code to compare.
More than performance though I’m curious how much you feel the Rust style of using types helps you avoid errors or annoys you from progression. The more interesting trade off between Go and Rust is I think of Go, less static analysis for fast compile and more attention to debugging, versus of Rust, more static analysis (and annotation and API restrictions) for less debugging and slower compile.
Took a brief look at your code, and found a low hanging fruit: your native() for Rust didn’t preallocate the memory for all the stuff like Go forces you to do with make. You can use Vec::with_capacity(size) to match Go behavior. More precisely since I remember Go zeros memory on make, you can use the macro vec![default; size] to replicate zeroing. (Check doc for details)
I see, that initial allocation is not timed anyway.
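For reference, the two preallocation options mentioned above behave differently, which this small sketch shows (the wrapper function names are just for illustration). `Vec::with_capacity` reserves space but leaves the length at zero, closer to Go's `make([]T, 0, n)`, while `vec![default; size]` produces a filled vector of that length, closer to `make([]T, n)`.

```rust
/// Reserves room for `size` elements; the Vec is still empty.
fn reserved(size: usize) -> Vec<f32> {
    Vec::with_capacity(size)
}

/// Creates `size` elements, all set to 0.0.
fn zeroed(size: usize) -> Vec<f32> {
    vec![0.0; size]
}

fn main() {
    let a = reserved(1_000);
    let b = zeroed(1_000);
    // `a` must be filled with `push`; indexing `a[0]` here would panic.
    println!("reserved: len={} capacity>={}", a.len(), a.capacity());
    // `b` can be indexed straight away.
    println!("zeroed:   len={} first={}", b.len(), b[0]);
}
```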
Haha don't worry, I haven't seen anything too impolite yet :) But yeah, this benchmark doesn't really highlight one of the core benefits rust has over Go. Which is that it has no GC. Luckily, by organizing in an ECS I think my GC impact will be much less than if I had objects sporadically defined by pointers. I think in games its somewhat normal to pre-allocate chunks of memory to work in so that you don't have to allocate inside your physics or render loop. So the timing code just operates on the actual "do physics work" part of the code, and I pre-allocated all of the variables I needed.
About Rust style, I do sometimes wish I had algebraic datatypes and match statements in Go. That's one thing that is really useful for modelling your data correctly. But right now I'm trying to figure out if it's the best way for modelling the data in all cases. I think sometimes I think I want to use an ADT, but I forget the impact of having to do the unwrapping of the inner type (especially when your enum scales to start having more and more members). I'm not really sure of the right cutoff for when to use an ADT/Enum and when to build things with interfaces. Once I eventually struggle and figure out a nice interface for my type, I'm usually pretty happy and stable though.
I do have maybe one controversial opinion about Rust (which I definitely might be wrong on!). I feel like they have added too many new language features, which makes it less and less accessible for someone like me to pick it up. In my personal case, I'd prefer to have better developer ergonomics, even if it cost me some cycles on the CPU. I also sometimes feel like Rust library developers err on the side of "Let's model this API with the same amount of entropy as the underlying problem" whereas Go developers err on the side of "Let's model like 90% of this underlying problem in a really simple way". Because I more often than not fall into the 90% of developers, I'd rather have a bunch of simple APIs rather than one perfect, but complex one. All that said, I have only used a handful of packages, but the handful I used I found most to be somewhat hard to grasp.
Anyways, I appreciate your comment. Sorry for the rant. I really think Rust is a cool language and when I was learning it I tossed and turned in bed many times thinking "Man, should I switch from Go?"
At 1:20 you say 35ms and 18ms but the graph shows 350ms and 180ms. The same for other graphs... seems what you say is right, just messed up with Y axis.
Ah yeah, you're right. Thanks for catching, that's a mistake. Sorry about that!
Great video, super informative!
Thanks! Glad you liked it!
Rust also does bounds checking. You can use the non-bounds checking api directly. But if you indexed directly into arrays in Rust, you are using bounds checks everywhere.
Bounds check elimination in Rust can also be done like he described in the video. Since Rust uses LLVM as its backend, it is insanely good at optimizing away redundant computation like bounds checking.
However it is still difficult since there are often cases in which it actually isn't possible for the compiler to optimize such things out.
Yeah there are probably similar tricks you can do in Rust to remove bounds checks inside loops by adding some prechecks. I didn't look into how much LLVM will automagically optimize that sort of thing though. In Go there are a few situations where it couldn't automatically detect and perform BCE, so I had to make sure I got it to understand that it was safe to do BCE, mostly by comparing array lengths before a loop.
@The Silenced If I had to guess, I'd think that regardless of how you loop over a single array (either with an iterator or for idx in 0..container.len()), the bounds checking would be implicit in the loop itself (i.e. both loops guarantee we won't go out of bounds, so bounds checks can be eliminated). The problem I was having in my Go code was that I'd be looping over multiple arrays that were the same length. In the for loop, I loop over a single array but then I index into multiple other arrays. Each of the "multiple other arrays" now requires bounds checks because the compiler can't prove that the index will stay in bounds for them throughout the whole loop. Thanks for watching, and thanks for the comment!
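A rough sketch of that "precheck before the loop" idea in Go. The function name `integrate` and the two slices are made up for illustration and are not taken from the video's code. Whether the check actually disappears depends on the Go version; one way to look is building with `-gcflags=-d=ssa/check_bce/debug=1`, which reports the bounds checks that remain.

```go
package main

import "fmt"

// integrate adds vel*dt to pos, element by element.
// pos and vel are assumed to have the same length.
func integrate(pos, vel []float64, dt float64) {
	// Precheck: reslice vel to len(pos) once, before the loop.
	// This panics up front if vel is too short, and it gives the
	// compiler a fact (len(vel) == len(pos)) that it can use to
	// drop the per-iteration bounds check on vel[i].
	vel = vel[:len(pos)]
	for i := range pos {
		pos[i] += vel[i] * dt
	}
}

func main() {
	pos := []float64{0, 1, 2}
	vel := []float64{1, 1, 1}
	integrate(pos, vel, 0.5)
	fmt.Println(pos) // [0.5 1.5 2.5]
}
```

The same pattern extends to more than two slices: reslice each "other" slice to the length of the one being ranged over.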
This is awesome content! It's pretty cool seeing how well Go did, and I really was expecting to see a bigger difference between the two. I guess in the case of this particular simulation it makes sense that GC wouldn't be too big of an issue.
I've been meaning to try my hand at learning Go, because I think it could complement Rust pretty well and this video may just be the motivation I needed to get over that hill lol
Thanks! Glad you liked it!
Lol Yeah I think Rust and Go are not so different in the types of problems that you can solve with them. I think Rust will often be slower to write but faster to run and Go will be the exact opposite of that. So depending on the exact problem you're solving, Rust or Go might be a better choice. I had been meaning to try my hand at learning Rust, so this video was also my motivation for doing that :)
Best of luck with your Go journey!
@@UnitOfTimeYT Thanks and same to you!
From what I’ve seen, Go is nice because it includes the batteries. It’s got a lot more concrete functionality in its standard library and it’s easier to write in. However, the main thing I feel Rust has going for it (aside from its longevity and stability-oriented design that gives users the ability to extend the language) is its flexibility. With a bit more work, Rust can compile to almost any target, including microcontrollers and WebAssembly, without resorting to a similar-but-not-quite language (like cgo). In Rust, the only thing you lose is the standard library, most of which is actually re-exported from the core library; you just need to bring your own allocator (this is usually done by a controller-specific crate in your dependencies)
@@sploders1019 Yeah there's definitely some nice things in Rust that I wish Go had. Go seems much more targeted for things like webapps and less targeted for embedded. In my game engine I actually compile my Go code to run on wasm though and it seems to work pretty well!
In my experience with these 4 languages, Go feels like what would come out if people experienced in Python tried to remake C, and Rust feels like people experienced in Haskell trying to remake C++. Pretty wild stuff, both with their own different utility
Btw Rust also does bounds checking on arrays/slices
Oh so it's this man right here who needed iterators in GO 3:45
I like to imagine that they added it just for me lol. Though funnily enough, I still haven't ported my ECS over to use the new iterators...
Near native performance iterators would be so damn nice in go but I can kind of understand why they’re resistant to adding them. Their whole mantra is simplicity and sometimes performance gets sacrificed to it. And for most things people use go for it’s actually a pretty nice trade off
Yeah I think I saw they are in preview for Go 1.22, and maybe released soon after that. I guess we will have to see!
Go's mantra is not simplicity. It is whatever the Go team decided that Wednesday. Go is like Turkmenistan and the Go team is like the Grand Turkman that issues fatwas for renaming months.
Awesome video! But I was expecting the difference to be much larger though, maybe you haven't compiled the Rust code in release mode? cargo run --release?
Glad you liked it! Don't worry, I did run the code in --release mode.
One thing to keep in mind is that this is a physics performance benchmark. All of the memory is pre-allocated. Normally, one of the performance advantages that Rust has over Go is that there is no GC slowing it down, but in this case there are only like 5 really big arrays, which means the GC doesn't have very much work to do (each array represents one GC item, so only 5 checks are needed).
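To make the "handful of big pre-allocated arrays" layout concrete, here is a minimal Go sketch. The `World` type and its field names are hypothetical, not the benchmark's actual code. The slices hold plain floats with no pointers inside, so the GC has nothing to trace within them, and the step function allocates nothing.

```go
package main

import "fmt"

// World is a hypothetical struct-of-arrays layout: five big slices
// of plain floats, allocated once up front.
type World struct {
	X, Y, VX, VY, Health []float64
}

// NewWorld does all of the allocation before the simulation starts.
func NewWorld(n int) *World {
	return &World{
		X:      make([]float64, n),
		Y:      make([]float64, n),
		VX:     make([]float64, n),
		VY:     make([]float64, n),
		Health: make([]float64, n),
	}
}

// Step advances positions in place and performs no allocations.
func (w *World) Step(dt float64) {
	for i := range w.X {
		w.X[i] += w.VX[i] * dt
		w.Y[i] += w.VY[i] * dt
	}
}

func main() {
	w := NewWorld(3)
	for i := range w.VX {
		w.VX[i] = 1
	}
	w.Step(0.5)
	w.Step(0.5)
	fmt.Println(w.X) // [1 1 1]
}
```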
Ultimately the biggest thing we measured in the Native Go Vs Rust benchmark was: "How good is the compiler at optimizing this physics looping code". Go has a custom compiler and rust uses LLVM so that's at least one difference. The other difference (that I mentioned in the video) is that I'm much more used to writing fast Go code, so I might have done some things that I didn't realize were slow when writing my Rust benchmark.
Overall I don't think Rust is *slower* than Go. I don't even think that Go is *slower* than rust. I think that someone can write a slow program in either and someone could write a fast program in either. I also think that the average Rust program will probably be faster than the average Go program, mostly because of the GC being slower than static frees, but as I mentioned that wasn't a feature of this video.
Glad you enjoyed!
@@UnitOfTimeYT Excellent explanation! That makes a lot more sense to me. All in all, programming languages are just tools for the job; it all depends on the person using the tool correctly and not on how blazingly fast a tool claims to be
I think Go is generally considered to be slower than Rust just because most people don't use it as efficiently as you do, which tends to make people assume that Rust is the fastest even though that isn't always the case
@@UnitOfTimeYT you also did some compiler flag trickery, Rust can also do a bunch of that. The funniest imo is the one that sets the number of codegen units to 1 (codegen-units = 1), which makes the compiler extremely slow but hey, it can do some extra optimizations
Oh interesting. I guess some optimizations aren't possible with parallel builds? Yeah this was my first time using compiler flags to check things like BCE and inlining. It was super useful for finding the slow points of this weird experiment.
@@UnitOfTimeYT for Rust, I once tried setting LTO (link time optimization) to true. The default only does "thin local" LTO within each crate, which compiles faster, but isn't good at inlining code between dependencies. Turning it on can make the binary size smaller and increase performance.
damn your mic got so much better since I've seen your "building the slowest ECS framework" video
haha. Yeah I had to upgrade for the sake of your ears. I feel a little bad about the mic quality back then, when it was me with a $15 mic zip tied to a phone mount. Now I have a bit more professional mic setup lol
Don't know if someone else suggested this yet, but you should try to rearrange things to not require the invalid entity check in the iteration loop. It's a great feature in an ECS to let you make certain assumptions to skip those kinds of tests.
Internally I have to make a tradeoff of what to do if an entity gets deleted from my component arrays. I can either
1. Mark the slot as a "hole" -> Fastest delete method, but requires extra check in the iteration loop
2. Move the components from the last index of the array into the deleted array index
I kind of do a combination of 1 and 2 where: I mark holes up to a point, but then repack if there are too many holes. Option 1 saves me work for things like particle systems where I'm constantly adding and deleting particle entities. At this point though, the majority of my processing time is spent outside the core ECS and inside major systems like rendering or collision detection.
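A small Go sketch of the two delete strategies and the "mark holes, then repack" hybrid. Everything here is hypothetical (the `Store` type, the `alive` flags, and the "repack when more than half the slots are holes" threshold are assumptions for illustration, not the real ECS internals). A real ECS would also have to update its entity-to-index lookup whenever slots move.

```go
package main

import "fmt"

// swapRemove is option 2 on a plain slice: overwrite slot i with the
// last element and shrink by one. No holes, but order is not preserved.
func swapRemove(xs []float64, i int) []float64 {
	last := len(xs) - 1
	xs[i] = xs[last]
	return xs[:last]
}

// Store is a hypothetical component array that uses option 1 (holes)
// and repacks once too many holes pile up.
type Store struct {
	pos   []float64
	alive []bool // alive[i] == false marks slot i as a hole
	holes int
}

// Delete marks slot i as a hole in O(1), then repacks if more than
// half of the slots are holes (an arbitrary threshold).
func (s *Store) Delete(i int) {
	if !s.alive[i] {
		return
	}
	s.alive[i] = false
	s.holes++
	if s.holes*2 > len(s.pos) {
		s.repack()
	}
}

// repack compacts the live slots to the front and drops the holes.
func (s *Store) repack() {
	n := 0
	for i := range s.pos {
		if s.alive[i] {
			s.pos[n] = s.pos[i]
			s.alive[n] = true
			n++
		}
	}
	s.pos, s.alive, s.holes = s.pos[:n], s.alive[:n], 0
}

// Sum shows the cost of option 1: every iteration needs the hole check.
func (s *Store) Sum() float64 {
	total := 0.0
	for i, p := range s.pos {
		if !s.alive[i] {
			continue
		}
		total += p
	}
	return total
}

func main() {
	s := &Store{pos: []float64{1, 2, 3, 4}, alive: []bool{true, true, true, true}}
	s.Delete(1)
	fmt.Println(len(s.pos), s.Sum()) // 4 8 (one hole, no repack yet)
	s.Delete(2)
	s.Delete(0)
	fmt.Println(len(s.pos), s.Sum()) // 1 4 (third hole triggered a repack)
	fmt.Println(swapRemove([]float64{1, 2, 3, 4}, 1)) // [1 4 3]
}
```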
Thanks for watching!
thanks for the helpful comparison
Damn! This is so cool!!!
Thanks! Glad you enjoyed it!
I just realized the whole video was a marp
Haha - Marp is great. I definitely used it in my other ECS video. I can't remember if I used it on this one tho.
@@UnitOfTimeYT Used marp a lot so i'd recognize the theme always hehe!!!!
3:40 "I think I will have to switch to for-loop based iterators..." Can anyone explain what was the initial issue with the original "range for"?
Sorry for the confusion. There's nothing wrong with the for i := range slice. The problem comes with passing the function to be executed *into* the Map function, which puts it inside the for loop. Because the function is dynamically passed in, it's really hard for compilers to inline, so Go can't inline the function inside the for loop. This removes a lot of optimization opportunities, making the code quite a lot slower. In other languages they'll write "iterators" which replicate the for loop logic in the outer function so inlining is more achievable. These sorts of optimizations are fairly specific to my problem set (looping over an enormous amount of objects). Most of the time dynamically passing functions in is good enough performance wise. Thanks for watching!
I don’t understand how iterators in other languages help. You still pass in, say, a custom lambda to a map function and the map will pass an iterator inside your lambda, right? What do you mean it can “replicate the loop logic in an outer function”? Specifically, what is outer and is there an inner function for your case?
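A tiny Go illustration of the two shapes being compared. `Map` and `scaleLoop` are made-up names and this is not the ECS code from the video. Both produce the same result; the difference is that in `Map` the loop body is a function value that is only known at run time, so unless the compiler manages to inline `Map` into its caller, every element pays for an indirect call. Building with `-gcflags=-m` prints the compiler's inlining decisions if you want to check a specific case.

```go
package main

import "fmt"

// Map calls f on every element. The loop lives inside Map, and f is
// an opaque function value from Map's point of view.
func Map(xs []float64, f func(*float64)) {
	for i := range xs {
		f(&xs[i])
	}
}

// scaleLoop is the hand-written equivalent: the loop and the loop
// body sit in the same function, so there is no call per element.
func scaleLoop(xs []float64, k float64) {
	for i := range xs {
		xs[i] *= k
	}
}

func main() {
	a := []float64{1, 2, 3}
	b := []float64{1, 2, 3}
	k := 2.0
	Map(a, func(x *float64) { *x *= k })
	scaleLoop(b, k)
	fmt.Println(a, b) // [2 4 6] [2 4 6]
}
```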
@@ChuPang This is what I mean. Sorry if there's any errors in it. I didn't test any of the code: gist.github.com/unitoftime/a95003111e5bacd47d9b4a7d71197a1d
The problem is more: I couldn't find a way to write iterators (like other languages have) in Go without losing performance equivalent to just using a map function.
how is the compiler so smart
I think you could make the algorithm significantly faster by optimizing the collision-check loop at 0:34. You check every distance twice there (once from A to B and again from B to A). I'm assuming that the distance function is quite expensive. You could either cache (commutatively) the results of the distance function or keep track of which positions have already been compared and skip.
Of course, it's not really relevant to this video, if you used the same algorithm in both languages.
Yeah there's definitely some algorithmic optimization opportunities in the benchmarks. Probably the simplest would be to compare each distance only once by having the inner loop go from the outer loop position + 1 (instead of starting from 0 each time).
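A quick Go sketch of that change, using a made-up `Circle` type and overlap test rather than the benchmark's real code. The naive version visits every ordered pair, so each overlapping pair is found twice; starting the inner loop at `i + 1` visits each unordered pair once.

```go
package main

import "fmt"

// Circle is a hypothetical collider for this example.
type Circle struct{ X, Y, R float64 }

// overlaps compares squared distances, so no square root is needed.
func overlaps(a, b Circle) bool {
	dx, dy := a.X-b.X, a.Y-b.Y
	r := a.R + b.R
	return dx*dx+dy*dy < r*r
}

// countNaive checks A against B and later B against A.
func countNaive(cs []Circle) int {
	n := 0
	for i := range cs {
		for j := range cs {
			if i != j && overlaps(cs[i], cs[j]) {
				n++
			}
		}
	}
	return n
}

// countOnce starts the inner loop at i+1, so each pair is checked once.
func countOnce(cs []Circle) int {
	n := 0
	for i := 0; i < len(cs); i++ {
		for j := i + 1; j < len(cs); j++ {
			if overlaps(cs[i], cs[j]) {
				n++
			}
		}
	}
	return n
}

func main() {
	cs := []Circle{{0, 0, 1}, {1, 0, 1}, {10, 10, 1}}
	fmt.Println(countNaive(cs), countOnce(cs)) // 2 1
}
```

This roughly halves the number of distance checks, but the loop is still O(n^2).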
There's also some spatial lookup optimizations you can do such as quadtrees, KD-trees, spatial hashes, or as suggested by another comment: bounding volume hierarchies.
But you're right, I was mostly interested in the looping performance and not the algorithms for this particular video.
Thanks for the comment and thanks for watching!
@@UnitOfTimeYT Incrementing the start value of the inner loop sounds elegant. I didn't even think of that.
And the other optimization options you mentioned, I've never heard of. Time to do some research :-) Thx for your reply!
@@askassk Of course! I hope you enjoy the Wikipedia rabbit hole of spatial optimizations :P
@@UnitOfTimeYT maybe interesting: for bevy, i made a plugin called bevy_spatial intended to allow easy drop-in usage of kd-tree (and for static objects, r*-tree)
@@laundmo Oh very cool. If I ever optimize my n^2 collision loop and want to compare to Rust, I'll definitely try yours out!
Well Go has iterators now since 1.23 released.
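For reference, a sketch of what such an iterator looks like. The `Alive` function and its hole-skipping logic are invented for this example. The returned function has the same shape as `iter.Seq2[int, float64]` from Go 1.23, but it is called directly here so the snippet also builds on older toolchains.

```go
package main

import "fmt"

// Alive returns a push-style iterator over the slots that are not holes.
// The iterator calls yield for each live slot and stops early if yield
// returns false.
func Alive(pos []float64, alive []bool) func(yield func(int, float64) bool) {
	return func(yield func(int, float64) bool) {
		for i, p := range pos {
			if !alive[i] {
				continue
			}
			if !yield(i, p) {
				return // consumer asked to stop early
			}
		}
	}
}

func main() {
	pos := []float64{1, 2, 3, 4}
	alive := []bool{true, false, true, true}
	sum := 0.0
	// On Go 1.23+ the same loop can be written as:
	//   for i, p := range Alive(pos, alive) { ... }
	// with `break` taking the place of `return false`.
	Alive(pos, alive)(func(i int, p float64) bool {
		sum += p
		return i < 2 // stop once index 2 has been visited
	})
	fmt.Println(sum) // 4
}
```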
If you like using assembly etc there's some optimization guides you might want to look at?
I don't know anything about golang but it'll tell you about how long different instructions take on different architectures, how many pipelines they have etc...
That's a good suggestion. Yeah I try to not get too far down in the weeds unless I think it's important, but reading a guide would probably help me find potential optimization targets faster.
For x86 assembly try avo; just like PeachPy, it provides an easier way to write assembly code.
Have you implemented spatial access (i.e. only iterating through objects that are within a certain distance from yourself)? If you have, I wonder if that explains the performance difference between the Rust and Go code once you added back in the finite health bounds, as I'm sure you didn't consider it in the Rust implementation. Then again, it seems the Go and Rust implementations follow the same curve, just scaled differently, so perhaps you didn't consider it in the Go implementation either. In Rust, one option to do this is with the bevy_spatial plugin for Bevy, which allows you to perform spatial queries for performance (otherwise operations will be O(n^2)). This might be an interesting and rewarding API implementation to pursue in your own engine. Also, I feel I should point out (tangentially to Errata No. 1) that in Rust, type aliases are considered to refer to the same type, so they will share monomorphizations in a similar way to how you described stenciling. Also, we do have dynamic dispatch in Rust by way of so-called "trait objects" using the dyn keyword.
For this video, I didn't do any spatial access optimizations in any of the benchmarks. I just did the really basic, naive n^2 collision checks. KD trees and spatial hashes are two optimization targets I'm going to look at in the future (and when I do I want to make a video about it hopefully). One thing I regrettably didn't cover in the video was the delta between Bevy vs my Go Ecs. A few things might contribute to this (I'll put this in the errata section too, now that I know at least one person reads it :)):
I think number 1 and 2 are probably the biggest contributors to Bevy running slow.
1. Bevy has A LOT more features than my ECS, each of those features might contribute to a global optimization for most cases, but in my benchmark contribute to a slowdown. I expect that most people aren't disabling multithreading and the system scheduler and manually executing every game loop.
2. Bevy has a few different layers at which you can use the ECS, I'm not a bevy expert, so I may have written the code in an abnormal way where bevy ran slower than it usually would have. Very specifically, Bevy seemed to slow down the more I started adding and deleting components, I'm not sure the limitation there, but I thought it was notable.
3. Bevy isn't the fastest Rust ECS framework according to this: github.com/rust-gamedev/ecs_bench_suite - I learned this actually kind of recently after all my code was written for bevy
On the Rust monomorphization note you made, that's interesting. I didn't know that. The unfortunate part (if you're interested, read the planetscale article in the description) about the Go generics implementation, is that *every pointer is the same GC shape* which means that *every pointer shares the same set of generic functions* and Go has to box/unbox these on every function call (which really adds up in critical code paths).
@@UnitOfTimeYT That's an interesting and somewhat unfortunate note about Go generics. I really wonder why they didn't simply choose to go the monomorphization path - the reason that monomorphization takes up compile time in Rust is primarily because of the much more complex and advanced type system that Rust has and the potentially huge numbers of monomorphizations that might need to be generated. I assume - or at least hope - that the decision to use GC stenciling was informed by extensive profiling and consideration before standardization, but at a glance it seems a very strange decision to me, as I don't see any parallel features that Go has with Rust that would cause monomorphized generics to inflate compile times in the same manner.
@@nekodjin Yeah I'm not really sure either. I think that the typical Go applications might have been more inclined towards the GC stenciled version of generics. I think that most people aren't writing the kind of Go code that I'm writing, which might fit into Go's narrative of solving the majority of the problem. I guess we will have to wait and see how it turns out in a few versions, they might have some long-term optimization plan for it that we don't know about (Fingers crossed) :)
@@UnitOfTimeYT I would add the "rayon" crate in the "check_collision" function to parallelize the collision detection.
You can use the par_iter_mut method provided by the ParallelIterator trait, which returns a parallel iterator over the elements of a collection. You can then use the for_each method on the parallel iterator to apply a function to each element in parallel.
@@romanstingler435 Thanks for the info, for this particular comparison I was doing single threaded performance. But it is cool how easy it is to parallelize code when the data is laid out this way :)
Hey guys, isn’t it possible to use some kdtree instead of the second loop? It of course has nothing to do with rust vs go comparison, but a quick check for collisions looks like a problem solvable by a better data structure.
Yeah definitely. From what I've read online, it seems like KD-trees and spatial hashes are good optimization strategies to reduce the number of comparisons you have to do in the world. I'm hoping to make a video about it one day. I also wonder what a hybrid structure of coarse-grained spatial hashing + fine-grained KD-trees would look like. So I want to look into that too.
@@UnitOfTimeYT I found myself in this situation, and BVHs (bounding volume hierarchies) are easier to implement and their construction looks like a KD-tree construction. The magic of a BVH is that you can update an existing BVH iteratively for most frames, so no full rebuild every frame.
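As a rough idea of what the spatial hash side could look like, here is a minimal uniform-grid sketch in Go. The `Grid` type, its methods, and the cell size are all assumptions for illustration; nothing here comes from the video's engine. Points are bucketed by cell, and a query only looks at the 3x3 block of cells around the query point. As long as the cell size is at least the query radius, that block contains every point within the radius, so the exact distance test only has to run on those candidates.

```go
package main

import (
	"fmt"
	"math"
)

// Point is a 2D position.
type Point struct{ X, Y float64 }

// cell identifies one grid square by its integer coordinates.
type cell struct{ cx, cy int }

// Grid is a hypothetical uniform spatial hash: cell -> point indices.
type Grid struct {
	size  float64
	cells map[cell][]int
}

// NewGrid creates an empty grid with square cells of the given size.
func NewGrid(size float64) *Grid {
	return &Grid{size: size, cells: make(map[cell][]int)}
}

// key maps a position to the cell that contains it.
func (g *Grid) key(p Point) cell {
	return cell{int(math.Floor(p.X / g.size)), int(math.Floor(p.Y / g.size))}
}

// Insert records point index i in the cell that contains p.
func (g *Grid) Insert(i int, p Point) {
	k := g.key(p)
	g.cells[k] = append(g.cells[k], i)
}

// Near returns the indices stored in the 3x3 block of cells around p.
// These are candidates only; callers still run the exact distance check.
func (g *Grid) Near(p Point) []int {
	k := g.key(p)
	var out []int
	for dx := -1; dx <= 1; dx++ {
		for dy := -1; dy <= 1; dy++ {
			out = append(out, g.cells[cell{k.cx + dx, k.cy + dy}]...)
		}
	}
	return out
}

func main() {
	pts := []Point{{0.5, 0.5}, {1.2, 0.4}, {9, 9}}
	g := NewGrid(2)
	for i, p := range pts {
		g.Insert(i, p)
	}
	fmt.Println(g.Near(pts[0])) // [0 1]
	fmt.Println(g.Near(pts[2])) // [2]
}
```

For moving objects the grid has to be rebuilt or updated every frame, which is part of the tradeoff against tree structures.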
@@erickweil4580 Oh wow I had no idea about those. Thanks for letting me know! Really really appreciate it!
I've seen other comparisons like this, where C code that's been optimized over the last several decades and looks unlike a natural writing of the concept... is compared to Rust code that's 10 minutes old. If you give both projects equal time that's one way, but I think equal in-depth knowledge would be a better comparison. I.e. if for Go you have to understand how the inlining system works you need to give Rust a similar advantage and go panics/auto-fail if something can't be learned about Rust of equal value.
Yeah that's a valid point. I'm sure there are some things I did which aren't idiomatic Rust because I'm unfortunately not a rust veteran. That said, the code is fairly simple in the "native benchmark" and didn't include any of the optimizations mentioned in the video. For the ECS library comparisons which did contain my optimizations (ie my ECS vs Bevy) - my assumption is that the Bevy authors have already done their own set of optimizations, so in that sense Bevy probably has had more time than my ECS. I left an errata at the end of the description which contains a few notes about why Bevy performed a bit slower than my ECS. The biggest contributor (I think) is that Bevy just simply has more features than my library. There might be optimization in Bevy which improves performance for general cases (probably some thread safety features), but causes a slowdown in my specific case. Or just convenience features that add up to a net slowdown.
Thanks for watching! Hope you enjoyed!
The comparison in this video is correct. Bevy took enough time to improve. And don't forget the llvm improvements too
@@baxiry. That's impossible, rust hasn't existed for several decades to have improvements over time.
@@cheako91155
Go 1.0 was released in 2012
LLVM was first released in 2003
The development of Go began in 2007
Development of Rust began in 2006
The Rust Bevy library is older than the owner's video code.
All conditions are biased in Rust's favour. Go's good performance in these conditions deserves recognition
@@baxiry. I was talking about comparisons to C code, obviously. For Rust to generate optimized assembly might take the same level of hackery we see in C code today, but those tricks haven't been discovered. Rust needs at least a few decades to be mastered to a comparable level.
Rust with Go runtime (green threads + GC) would be my dream general purpose language.
Yeah I totally think there is some language sitting out there with a perfect balance of Rust and Go features. I know very little about building languages, but I'm itching to figure that one out lol
@@UnitOfTimeYT I don't think that such a language exists, but you can always create your own! :)
As a Go programmer learning Rust for the first time, what things stand out to you about Rust, and what things do you have in Go that you miss in Rust? What are your favorite features or properties of Go?
Oddly enough what you are describing is basically rust before 1.0. It had green threads and GC in its earlier development.
@@nekodjin haha. If I had infinite time in my life then I might try to make a language. Maybe one day when I've tried a few more langs I'll try to build one. The things I like in Rust are: race-free concurrency, ADTs/Enums, and the fact that generics are always monomorphized. I'd probably like macros, but I don't understand those well enough to really have an opinion on them. I like the general idea of having a way to write code that writes code though. For the inverse, I like that Go is really simple, and that I can focus more on getting my algorithms to work rather than focus on figuring out what language feature I need to use in what location. I like that Go builds are really fast. And not that I've done much with async, but I have a feeling I won't like it. I tried the JavaScript version of async and I wasn't much of a fan. I like how Go's concurrency is modeled (maybe there is a Rust equivalent though). Obviously, lots of opinions, most of which might change as I learn more :P
@@dynfoxx Interesting, thanks for pointing out. I genuinely had no idea!
cool video. New to the channel. How long have you been coding and how long have you been doing game dev?
Glad you liked the video! I started programming maybe like 13 years ago, and I originally got into it because I wanted to make video games. So I've been failing at making games for about 13 years, with some breaks in the middle :)
I'm very recently doing Go and Gamedev much more full time and trying to be a little bit more organized and efficient with how I make things.
it's really awesome!
What command did you use to compile rust? -C opt-level=3?
Just checked my build script, because it's been a while. I used `--release` which I believe uses a profile with opt-level=3
I decided to follow because you included the code.
Haha glad to know that was the turning point!
@@UnitOfTimeYT There have been a number of times where I wouldn't sub because some creator didn't have code. Reading code helps me a ton.
@@wjrasmussen666 Thanks for letting me know. I'll keep trying to do it where I can! Most of my code is open source MIT.
If you used Rust iterators instead of an index based for-loop, would it still generate bounds checks?
In the bevy example I used the bevy iterators. In the native example, I did regular for loops but with all the arrays being preallocated the same size at the start I assume LLVM can do BCE already. Though I did not check. I made that assumption because in the Go native example, the compiler was able to automatically do BCE.
Is there a way to integrate arenas? Would love to see the benchmark. I just really hope it comes out of the experiment tag
Yeah I want to try that and also try the new profile guided optimization stuff. Lots of cool new features this release!
Stumbled onto this video - what is the rendering engine for these code tests? Browser? Unity?
I wrote my own opengl rendering library and I just used that for the visualizations.
nice video - i really like the ergonomics of developing with ECS and it's nice to see implementations in various languages.
I have some thoughts on your "why is bevy slower" description
a) That's probably the largest point - while you won't get much out of parallel queries (can't parallelize mutable access to the same data), Bevy can parallelize even the for loops inside a system, by using methods such as par_for_each(_mut)
b) by default, bevy uses what it calls "Table" storage for components - this is fast to iterate but slow to update. In use cases where you might want to add/remove components a lot, you can switch to SparseSet storage on a per-component basis.
c) The ECS benchmark you mentioned is outdated by 2 years, and was also recently archived to represent this. Bevy has had multiple optimizations since then. I see you using bevy_ecs 0.7 - in 0.9 there were some decent performance improvements which would be interesting to see. Sadly, i'm too lazy to run the benchmarks myself.
Thanks for the thoughts! Glad you liked the video. Yeah I also like organizing my games into an ECS, the structure feels a lot less spaghetti to me than anything else I've used.
a) Yeah I agree
b) Oh interesting, I didn't know this, but it does align with what I saw where Bevy slowed down once I started deleting entities every frame. Let me add this note to my errata.
c) I'll add this one too. I always thought it'd be cool to make a video about how the performance of some thing (probably a language) improves over the years by running the same benchmark on every version. I feel like it's cool to see how software improves over time.
Thanks!
Did you run with optimisations enabled? / Rust release mode
Hey, yes I ran all the rust benchmarks with release enabled. Thanks for checking!
Why delete the circle when you can just reset its position?
I wanted to simulate the situation where entities were constantly being added/removed and that seemed like a good way to do it :)
cool video!
compared to Dart, which one is fastest?
nice idea
Definitely rust
so useful and interesting!.
Thanks! Glad you liked it!
Ah, these small go brains
I never knew go was memory safe
yes. go zig haskell erlang etc.. are all safer than rust, but humbly, and without hustle
Yeah, relatively speaking it's safer than say C or C++. You can still have null pointers and things like that, but Go will turn invalid memory accesses into panics AFAIK. As an aside, you can still write memory-unsafe programs with race conditions: research.swtch.com/gorace
Not to start any language wars, but while making my rust benchmarks I learned that Rust will provide memory safety across threads (which is really cool!) because the borrow checker will prevent data races altogether. So I think Rust might actually be considered a bit safer than Go. Of course I do still love Go! :)
@@baxiry. why would you go to the internet just to spread lies ?
I understand how high emotions can get in a video comparing two of everyone's favorite languages, but let's not be rude. :P
Good video. Makes me sad so many Rust programmers can't see past "ohh Go has GC so it must be slower than Rust". Especially on something like this where garbage collection isn't particularly common.
Yeah, I was kind of surprised myself about Go GC being little to no impact. I was going to have a section of the video about "The impact of the Go GC", but I disabled the Go GC and it didn't speed anything up. I think the lesson is that, even in a Garbage collected language, you can still kind of "manage your memory" in somewhat manual ways to get better performance.
It's almost like the guy who invented the B programming language and was at Bell Labs inventing modern computing knows a thing or two
@@UnitOfTimeYT Go's GC is much more lightweight than Java, that much I am sure of. From your comment it seems that Go's GC only adds overhead when a novice is writing the code and doesn't manage memory properly. If the quality of code is like Rust, there will be no impact of Go's GC. But the problem is that you never know the impact of GC unless you have something to compare to.
I think it's more about choice, you can have GC in Rust(it was before 1.0) but different projects need different things.
Rust allows(strives) for GC, single allocation, group allocation or no allocation.
In some instances a GC can kill a program. In others GC vs non GC is meaningless. In some GC is even better than non GC. It's just about knowing your data and testing to know what will work best for you.
@@dynfoxx You're talking about the runtime for managing green threads which was required for IO bound apps. It is now known as tokio. It is not a GC.
There are unofficial GC implementations if you want to use, but the only official paradigm which comes remotely close to doing what GC does is Reference Counter and it is not a GC.
By no allocation, do you mean lazy execution? The resource will be allocated only when condition is fulfilled? Even so it is not GC. Check once_cell crate, everything is checked at compile time.
I’d like people to compare tools that are used for the same context; Rust & Go aren’t used to solve similar problems. I’d rather see a Zig vs Rust comparison, that would be very interesting.
I don't really know Zig, but from what I've read it seems to have some nice features. I do think that Rust and Go is a valid comparison, IMO they are both fairly high-level systems programming languages, but with different philosophies on how to abstract the developer from hard things such as concurrency, memory safety, package management, etc.
I didnt understand one bit just came for a Rust Vs Go Fight saw the graphs and liked the video 👍 Wish I were as knowledgeable as you 😅😅Too Hard for my brain to get these things 🙏 I should have read the word "physics" in the title and stayed away 😆
Hahaha glad you liked the video. Sorry it didn't make sense lol
Hey, great video! What did you use to create your animations?
Thanks! For the circles moving around I did a custom animation in my game engine. Then for the graphs and stuff like that I used Manim
Generics may run slower, but I don't think it is worth my time to rewrite the same code 49 times.
How do you compile?
Hey, yeah for the rust code I ran in release mode, which as I understand it will enable all optimizations. And then for Go there is really only one build mode. Cheers!
@@UnitOfTimeYT
You need to build a bot specialized in answering this question. so funny hhh
Please tell me you disabled the rust debugger and ran in release mode with O3! Otherwise useless stats.
haha yeah. I actually tried to do an experiment to see the performance delta between debug builds and --release builds, but debug build took so long I couldn't stomach waiting for results and gave up. O3 appears to be the default opt-level for release builds, which is what I used.
@@UnitOfTimeYT ah ok, nice :)
I've seen countless posts on r/rust being removed because they were so embarrassed for using debug mode :') glad someone asked this
Can you compare the best one with Julia?
Haha sorry. I don't know much about Julia, but if I ever have time to learn then I might!
@@UnitOfTimeYT I know it may sound crazy, so for you and other readers to understand why I'm suggesting it: Julia has been shown to be comparably fast to C, while being as easy as Python.
It achieves performance by means of a combination of monomorphization and LLVM optimizations. It’s becoming popular in the scientific community which needs to do, among other things, particle simulations similar to this video.
It has a REPL and commands (macros actually) that spit out the intermediate (LLVM) and final assembly code.
And it's backed by MIT.
If the focus of this channel is purely game development, Julia can be at most a curiosity as it lacks the libraries and integrations for game development.
But high-performance numerical computing and data visualization are some of the very reasons for its existence… 🙂👋
@@a0um Very interesting. Yeah I will have to take a look into that one! Thanks for the information :)
What is your hardware?
Good question. I have an i7-8700K CPU @ 3.70GHz and two 8 GB sticks of DDR set at 2133 MT/s.
@@UnitOfTimeYT sorry for asking instead of researching, but is your program single-threaded or multithreaded?
No Problem! This is all single threaded performance. I'm hoping that once I support multithreading I can do a similar comparison type video but with multiple threads.
@@UnitOfTimeYT yeah, i asked because bevy is multithreaded by default
While you build Rome I'll use Unity's ECS and build games. Lol.
haha definitely the more popular option!
Yeah you go with clickbait just by calling your channel what it is. Bear in mind there is no such thing as time, there is only speed. Time is a concept of how much speed of a human, in relativity of position in space to the rotation of the earth and its relative position in space, has passed. Time is just a thing made up to give context to the brain that is unable to wrap its mind around this law, to make it more accessible :)
sorry that you don't like my username. I came up with it because I thought "Jiffy" was a funny unit of time from the Linux kernel.
@@UnitOfTimeYT don't sweat it, thanks for the answer. You know how quickly it could escalate in the comment sections. Have a fine day and keep on providing your videos. :)
My main takeaway from this video: A naive code in non GC based languages will always outperform naive code in GC based languages.
I code in Rust, and didn't have any idea that Go code can be optimized this way. You have gone to such lengths to reduce GC checks in Go, it's mind blowing. Reminded me of Just-JS web framework.
I want to know your take on reducing GC overhead in highly concurrent Go applications. By GC overhead I mean the periodic cleanup of useless items taking up resources, as Rust has no GC to clean up.
Haha thanks! ECS is a fairly well established pattern for game development, which does have the side effect of reducing GC load (because memory is more manually organized), but the majority of the optimizations that I spent time on were around inlining and bounds check elimination (BCE). Also, of course I had to write the ECS framework to begin with :)
Thanks for watching!
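For readers who haven't run into bounds check elimination before, here is a minimal Go sketch of the general idea. It is not code from the ECS in the video; the names sumNaive and sumHinted are made up for illustration. The assumption is that reslicing one slice to the length of the other gives the compiler enough information to drop the per-iteration check, which is a commonly used hint but still depends on the compiler version.

```go
package main

import "fmt"

// sumNaive indexes two slices in one loop. The loop condition proves a[i]
// is in range, but says nothing about len(b), so b[i] generally keeps
// its bounds check on every iteration.
func sumNaive(a, b []float64) float64 {
	total := 0.0
	for i := 0; i < len(a); i++ {
		total += a[i] * b[i]
	}
	return total
}

// sumHinted reslices b to len(a) before the loop. That one slicing
// operation panics up front if b is too short; afterwards the compiler
// can see len(b) == len(a) and should be able to skip the check on b[i].
func sumHinted(a, b []float64) float64 {
	b = b[:len(a)]
	total := 0.0
	for i := range a {
		total += a[i] * b[i]
	}
	return total
}

func main() {
	a := []float64{1, 2, 3}
	b := []float64{4, 5, 6}
	fmt.Println(sumNaive(a, b), sumHinted(a, b))
}
```

To see what the compiler actually does on a given Go version, building with -gcflags=-d=ssa/check_bce/debug=1 reports the bounds checks that remain, and -gcflags=-m reports inlining decisions.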
I don't think that it's accurate to say that GC makes up all of the difference here. In some situations, GC can be more efficient than reference counting. I think the primary difference in performance comes from such facts as the inefficient way that Go implements generics, as well as the fact that the Rust compiler is just outright better at optimizing code than the Go compiler by a wide margin.
@@nekodjin Note that in general Rust doesn't even do reference counting unless you specifically tell it to.
You took away something false, it depends on what you mean by "naive" implementation and the specifics of the problem.
@@CheaterCodes You need to do reference counting to do anything that even remotely resembles a real program, so reference counting is really just the default Rust way.
Performance? C++
I didn't do a C++ comparison here unfortunately, but I have made a video in the past called "Go OpenGL performance" which compares C++ and Go in terms of how many sprites they can draw on a screen each frame. C++ is better, but Go was close enough behind that I wasn't too bothered.
you had to compile it in release mode.
Don't worry. I compiled in release mode!
gcc vs llvm
gc (goCompiler) vs llvm here
I 👍
I stand by my previous statement
Hahaha Typical...
great video, mate. I sense loads of rusters will start attacking this video shortly 🤣🤣🤣🤣🤣🤣
Haha thanks. So far everyone has been super friendly :) I do think Rust has some cool language features though!
No matter what you do, Rusties will glorify their language like they are in a cult. The marketing was for sure done correctly by big corporations for political reasons, not engineering ones. It's kind of sad to read the comments and see how people discovering our tech industry are thinking.
Meanwhile Go and Zig are trying to take their time, as we should do in an engineering world.
If you want to have fun, you could have a native layer with Zig then bind your game logic in Go.
You will have remarkable results by using a binding of FLECS in Zig.
Go is not meant for video games due to slower native calls.
Zig and Go work well together though.
Take your time to create useful, meaningful and timeless solutions.
I haven't played with Zig too much, but I've heard good things about it. I actually made a video about OpenGL performance in Go (Basically hoping to measure the CGO impact, by comparing it to a similar C++ implementation). I was somewhat pleased with the results. It turns out the CGO impact doesn't really bother me. That said, I mostly make pixel art games. If you had to write a game which had an enormous amount of OpenGL calls you'd definitely see some more impact than me.
While some people are obsessed with Rust, I don't think it's more than any Go or C++ fanatics.
What companies pushed rust? As far as I know rust is basically a community driven language(has been for many years)
In what way is Rust going too fast?
Not trying to be confrontational just think it's good to talk to people with different ideas.
@@dynfoxx I've commented three times already and YouTube refuses to let me post things....
This is the most reasonable comment I've read. Thanks
Performance is nearly identical, but Go is definitely easier and more fun to write than Rust. Go wins
yes
You performed benchmarks without creating a rust release build … a more accurate comparison would require you to build your own ECS in rust like you did in Go lol.
Oh man, If I have to write another ECS after this one, I'll never finish making a game!
Don't worry I did a release build! :)
Rust is the future, this benchmark is silly... Rust performance will always be better than Go, except in compile speed and binary size
I do think Rust is a cool language. Sorry you didn't like the benchmark, but I hope you enjoyed the Video.
He literally said he was worse at programming in Rust than in Go...
Stop fighting with numbers, that's not cool and very childish. The way you write your programs affects how they will perform, no matter if it is C#, C, Go, Rust or even pure assembly.
Rust is the past
Performance is not everything. Rust is a great language, but it seems that those that evangelize it almost never know why it is so great.
yes preach that shit my nigga. PREACH!
Great video! It was very nice following along the progress and finally seeing it all together was very interesting.
My ecs still beats yours :elbowcough:
Hahaha thanks Jomy. One day I'll take you down :P also it's not fair when you have 7 ECS libraries and I only have 1...
You should use goto instead of inlining. It is why FORTRAN is so good for a huge computation - GOTO.
I actually only recently learned that Go had a `goto` statement. So you're saying that people dance around inlining by putting everything in the same function and using `goto` to switch between different methods? I assume you can't `goto` some code in another method.
@@UnitOfTimeYT You can't jump from one method to another using goto, unless you switch to assembly. Gotos were very useful in complex numeric methods for increasing performance, but they required incredible programming skills which a very limited number of people have.
Interesting. I had no idea, I'll have to look more into that
@@UnitOfTimeYT My classical advice would be to avoid using gotos unless it is absolutely positively necessary. Gotos, as a general rule, make programs much more logically complex and messy, making them harder to understand, harder to maintain, and harder to debug. The situation in which it is impossible to get a result that is "good enough" without using goto is vanishingly rare.
@@nekodjin Yeah I generally agree. I just thought it was interesting that it was a method by which people coded optimizations in their Fortran code.
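For anyone curious what the `goto` discussion above looks like in Go, here is a small sketch. firstNegative is a hypothetical helper written for this illustration, not code from the video. It shows the restriction mentioned in the thread: the label has to live in the same function as the `goto`, and Go also rejects jumps into a block or over a variable declaration.

```go
package main

import "fmt"

// firstNegative scans a grid and uses goto to leave both loops at once.
// The named results are declared in the signature, so the jump to "done"
// does not skip over any variable declaration.
func firstNegative(grid [][]int) (row, col int, found bool) {
	for row = 0; row < len(grid); row++ {
		for col = 0; col < len(grid[row]); col++ {
			if grid[row][col] < 0 {
				found = true
				goto done // jump to the label in this same function
			}
		}
	}
done:
	return row, col, found
}

func main() {
	r, c, ok := firstNegative([][]int{{1, 2}, {3, -4}})
	fmt.Println(r, c, ok)
}
```

In practice a labeled `break` does the same job here and is usually considered easier to read, which fits the advice above to reach for `goto` only when nothing else works.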