I like how this channel gives a clear statement on what will be answered in a moment. Example: (1) At 1:12 Nick says, "I'm going to put all the ways to iterate over a list here" (2) This allows the user to pause the video and try and think of ways. (3) Then click play and view the 7 different ways. Its great that something like "Feel free to pause the video and try...." as most viewers know this and would not do it anyway. Side note - I was able to only think of three ways. (for loop, and forearch loop, ToArray().Select(x=>x) )
I've been a C# software engineer for a few years now and until this video I've never heard of a Span (outside the context of html lol). Going to be looking that up, great vid!
Great benchmark. Good that you explained that the parallel versions are most likely faster when you actually do work inside the iterations. It always distracts me when you say "half the speed" when you mean "half the time" (ie. double the speed). I know it might be a language thingy, but it is really confusing at times.
Great video as always! I also did some collection benchmarks back in April 2022, and i found something different than your benchmarks: for loops were considerably faster than foreach loops. The problem in your benchmark code might be that you access the private field instead of using a variable. foreach will automatically inline the field access to a variable, but for does not do that. In my benchmarks, foreach loops on List were 50% slower than for loops.
Yep, made similar benchmark tests and in dotnet 6 for 100k iterations and foreach took 75 us, for took 37us and Span Foreach 24us. Nice to see that dotnet 7 has many hidden performance boosts!
@@MaxReble Yep, can confirm, too: I reran my tests with .NET 7 RC1 and for and foreach loops are now nearly identitcal in speed. Still nearly twice as slow as iterating over arrays, spans, or ImmutableArray.
Should be parallel unless future iterations/loops are impacted by previous ones IMO. Going for the fastest synchronous/sequential approach is a great exercise, but independent iterations are the perfect Use Case to be done in parallel.
Your videos are the best and I have learned so much from watching your videos. Keep up the fantastic work! I've been coding for over 30years and I'm still learning new tricks, thanks!
Didn’t know about Span(), never would have thought to look, foreach was already heaven, thank you. I was taught never add or remove items during a for loop but dd it anyway, then fast forward to writing multithreaded applications and foreach and Span() throwing an exception is a useful indicator of faulty design.
Getting a local scoped reference to the array allows JIT to optimize away the range checks in the loop (technically also unroll but I don't think it does that). It's not possible for a list since there is no guarantee that other code somewhere wont change the length during our looping. But if you have an array, then length is fixed, and you can do a single if-check pre-looping instead of checking the bounds every iteration.
Interesting, this should also be possible for any foreach since the collection isn't allowed to change during the loop, but I guess .NET does not implement that optimization (yet).
As an old guy, I want to add that, if you don't care about the order of iteration (and all the parallel examples illustrate this is the case here), you can run the index backwards, which avoids calling Count, Size, or Length on each iteration: var asSpan = CollectionsMarshal.AsSpan(_items); for (int i = asSpan.Length; --i >= 0; ) { var item = asSpan[i]; } Note that I explicitly declared i as a signed type so the loop termination condition can be satisfied. Many of the other examples (not using Span) also fail if the collection is changed during the iteration. The difference is that with the cases that use an enumerator you deterministically get a specific exception, whereas with the Span you just get mysterious behaviour (which is also true for the direct indexing loop).
Just a couple of days ago, I implemented a backward for loop similar to your example and it used a var for the index variable. I wonder if is it really worth using --i >= 0 ?
@@alexintel8029 It depends on the actual processor, but on the ones typically used nowadays comparing the result of a computation with zero is faster because no compare instruction is required. The instruction for the computation (in this case likely a decrement instruction) will set flags in the processor indicating if the result was zero, negative, or the computation produced signed or unsigned carry/borrow/overflow, so it can be immediately followed by a conditional jump. If you separate the decrement from the compare, a decent optimizing compiler should be able to relocate them so the conditional jump is still right after the compare, for instance treating: for (int i = x; i >= 0; --i) {...} as: int i = x; if (i >= 0) do {...} while (--i >= 0); // which can again decrement and conditionally jump with no compare instead of the more direct int i = x; while (i >= 0) { ...; --i; } However if your exit condition compares with a value other than constant zero, as in for (int i = 0; i
@@kevinmartin7760 Thanks for the explanation Kevin. I realise the beauty of your original example for (int i = asSpan.Length; --i >= 0; ) {...} It helps a) prevent accessing asSpan[asSpan.Length] which would lead to index out-of-bound error b) decrement the loop c) test for exit condition
Probably the absolute winner: on Stackoverflow I found an example of doing parallel work on a Span, but it involves unsafe pointers, since the Span type itself cannot escape to the heap and therefore Parallel calls on it are normally not possible. I searched for this because I was curious why this ultimate combination was not metioned in the video.
You can actually combine Parallel.ForEach and Span without pointers. I did some benchmarks and it was faster than all of Nick's implementations @ 1 million elements. The trick is using Partitioner.Create(0, list.Count) and passing it to Parallel.ForEach along with a closure around list that takes a Tuple as parameter and marshalls list to Span, then slices it using the tuple and finally iterates over it.
@@pedroferreiramorais9773 Oh great :), I really have to dive into that to understand how that works. Let's say that a very simple solution does at least not exist yet.
@@jongeduard well, you can create an extension method to encapsulate all the logic, but it loses much of the performance gain. It can still be better than sequential span iteration of very large lists/arrays, but if performance is the main concern, you often have to get your hands dirty.
A nice extension method is in order :D public static void FastForEach(this List source, Action action) { var span = CollectionsMarshal.AsSpan(source); foreach (var t in span) { action(t); } }
You're invoking an Action. Invoking an Action is the main difference between the for loop and List.Foreach (the *slowest* on the largest data set). I didn't benchmark anything, but you definitely should before you use this.
Interesting. I could use this to speed some things up. Seems pretty niche for most of what I do tho, but definitely an improvement where improvements can be made
ForEach is list method, not linq method. Span forbids only adding and removing of items, but not assigning. And also I don't like benchmarks that don't use data. Some optimizer can remove more than expected. You've shown IL, but it doesn't garantee jit won't change smth. 9:26 How can 1 byte be allocated?
1 byte is allocated because the JIT allocates, and the benchmark picks it up. That allocation is removed in .Net 7. You can read more about it on the benchmarkdotnet repo. It's actually more than 1 byte allocated, but the benchmark divides it by how many iterations were ran. [Edit] Actually, the rogue allocation still seems to be showing up in Net 7, but the cause hasn't been looked into yet.
Extra way: walk via unconditional (!) "for" loop, exit via catching "OutOfBounds" exception. Removes double-checking of bounds, but introduces overhead from exception handling. May outperform if the list is extremely huge (throw cost is constant and doesn't scale with items count).
I rarely work with List, would have liked to see benchmarks with lists of classes. I feel like the results would be very different. Also please add disclaimers to your performance videos. Some might get the idea they should be using span loops everywhere.
I ran the test and on my machine results are as followed (with dotnet 7) List Iterate_ForEach | 49.71 us | 0.330 us | 0.276 us | - | Iterate_For | 40.59 us | 0.241 us | 0.213 us | - | Iterate_ForEach_AsSpan | 24.64 us | 0.129 us | 0.121 us | - | List Iterate_ForEach | 52.27 us | 0.634 us | 0.562 us | - | Iterate_For | 41.18 us | 0.477 us | 0.398 us | - | Iterate_ForEach_AsSpan | 25.33 us | 0.480 us | 0.426 us | - | So, I see a bigger difference between for and foreach as nick does, but the delta of for and foreach_asspan does not change.
foreach statement cannot operate on enumerators of type 'Span.Enumerator' in async or iterator methods because 'Span.Enumerator' is a ref struct. Has that changed in .Net 7?
@@RahulSingh-il1xk Obviously not on values that you access during the iteration. What I am interested in is what happens if I mutate values, that are not accessed during the iteration.
You cant update collection while doing regular foreach loop either. How come in this case it is not 'unsafe'? How the safety is different when compared to span?
The span will not throw an exception that the collection is modified. Example: var list = Enumerable.Range(1, 5).ToList(); foreach (var item in CollectionsMarshal.AsSpan(list)) { if (item == 1) { list.RemoveAt(3); } Console.Write(item); } Will print 12355, without giving an exception.
It also depends how you modify the list in question. If you replace items in the list, it's not a big deal; you'll see the changes, and nothing will explode. CollectionMarshal even gives you a ReadWrite span, this is an expected and intended use case. If you remove things, you'll get odd and inconsistent results like Stian has demonstrated; In his example you are reading data that is out of range of normal indexers. This data is garbage; it's either stale, or null (depends on whether references are involved) and the span will not be aware of the new range. Now if you do add or insert, you'll be fine until the backing store has to expand. You'll see the inserted data (up to the original size) until that expansion, but once that expansion happens I have no idea what the results will be. It depends on whether Span keeps the original backing store alive, and that I'm not sure of... if it does, you'll stop seeing changes in your span, if it doesn't, then it'll either explode in your face, or expose you to C style bugs where you end up reading memory you're not supposed to; the latter is very bad since it leads to issues like OpenSSL's heartbleed. The short of it is: if you do ANYTHING that can affect the size of the backing store, you are playing with fire. While you'd normally use an array with Array.Fill for this, this example is completely above board. public static void ResetCounters(List counters) { var span = CollectionsMarshal.AsSpan(counters); foreach (ref var counter in span) counter = 0; }
On ForEach you are unable to, it would raise an Exception. With a Span, you are able to (as you are with for(int i=0; ...), but you shouldn't cause you can cause a crash. You shall not anyway, but the damage if you do , is different. That's why Nick shouts out louder. The compiler or the runtime is not hindering you, you have to do it yourself. That's the message.
There is one more way to iterate - using SIMD (Vector), which has lowest readability, but probably will have the best performance, because it's accelerated on hw level and CPU takes n items in processing at one CPU clock.
What if my foreach loop is inside an async method? I'm getting this error: foreach statement cannot operate on enumerators of type Span.Enumerator in async or iterator methods because 'Span.Enumerator' is a ref struct.
I'm not an expert, but I think it's because a Span can only be accessed on the stack (a ref struck physically cannot exist on the heap), but different threads have different stacks so a span on one thread literally doesn't exist for any other threads.
When a program's performance tests are below client's expectations, would you always go for a Span refactorization? ...assuming that no major blunder was made like a bad algorithmic complexity.
This is so much nicer than my in a pinch method: Compiled reflection accessing the array, then use that and the count to get a span. Probably a smidgeon faster too... Spans are amazing....
Looking at the implementation of List.ForEach I don't understand why this one is so much slower than just the standard for. Is invoking an Action really that slow of an operation?
I am only 2 weeks into the learning how to code and actually learning c#. What you talking about and showing looks great and fun but it's to much for my brain is right now. Your videos are cool and very informative but just to advanced or geared to seasoned coders. But nice videos, one day I'll fully understand all the things you explain in your vids.
This is super interesting! I never learned about "spans" in any school, I don't even know what it is! I will eventually google it but could you elaborate quickly in case my laziness gets the better of me? 😅Thanks!
Calling it before watching the video: CollectionsMarshal.AsSpan() Edit: Called it. I would've also liked to see a benchmark which uses the list cast to an IEnumerable, then the IEnumerator would be an interface variable instead of a stack struct which would slow things down because of dynamic dispatch.
@@nickchapsas Oh yeah I didn't have time to watch the whole video from start to finish, I used UA-cam's 10 second skips to go through the most relevant parts, but only after I made the comment :D Nice vid for sure
as per the video, The collectionmarshal provides details that the list should not be modified when iterating as span. So it makes sense to NOT have it as default bcoz lots of use cases requires iterating through a list and possibly updating the data in the list .. consumers of the list should have the flexibility to choose what they want to do with the list whilst enumerating over it. pls correct me if i am wrong
Because the span might point to the old version of the backing array when the list needs to resize its capacity when adding more items. At this point when the list is updated, only the new backing array is affected. The span is unaware of this because it’s still pointing to the old one. You’d also have to reassign the span to point to the current backing array to counter this, but the list may be updated at anytime and anywhere else.
I will give you a little tip, on the List as Span method you can actually use it to modify the items with ref, like: var listSpan = CollectionsMarshal.AsSpan(list); foreach(ref var item in listSpan) { // You can modify 'item' here even if it is a struct }
@@nickchapsas No, if they are structs you will surely need ref, and if you want to "modify" immutable records too. Think in things like: foreach(ref var itemStats in item.Stats) { itemStats = itemStats with { Speed = 10 }; } You cannot do anything like that without refs.
I did not expect to wake up this morning and be surprised by a genuinely faster way to iterate a list, one of the most common tasks in software development.
How much time did you train to talk so fast and clearly at the end of the video? I really thought that I knew how to iterate... thanks for always taking our code to the next level
I'm pretty surprised, so they optimised out the allocation of the foreach enumerator? And somehow also eliminated the overhead of its constant 'version' checks (to stop changes mid iteration)? I wonder if this is a consequence of that 'sealed' keyword you mentioned the other day, or if other work also went into it
I have a question regarding the iteration with span. Is it ok to modify the objects in the array as long as I don’t add or remove? Eg changing or updating a property? Thanks in advance
Hello! At my job, we had an ague with colleagues what is better if you need to represent a collection in your DTO: List or Array. I think array is better. I had my own bunch of arguments about this, but would love to listen to your opinion about this question. Is there any chance you will make a video with close topic?
depends on the usage. For DTOs (since there is no mutations) I prefer to use IEnumerable (most of the time) because it's implemented on almost all collections (Arrays, Lists ..etc). This makes things much easier to work with, and to avoid adding more memory overheads. If you want to only read, then just iterate. If you want to mutate, then copy it to any desired collection type (ToList, ToArray, ToHashSet ..etc.).
So only way to grantee the collection is not mutated while using CollectionMarshal is to lock it? Or would it be safe to use if I wanted, let's say get a list of Entities I wanted to write out on a page? I mean, what collections couldn't be mutated at any given time when you haven't locked it?
Locking doesn't prevent from mutation. If you make a contract with yourself to lock it before use in any place, you prevent it from beeing used simultanously. If you make a contract with yourself to lock it before modify, you can modify it only in one place at a time. Each simple foreach enumerator will throw exception, if you modify during iteration. The new thing is, a span will not raise an exception, it could crash. That's why the extra care.
Windows firewall was blocking my benchmark that used `Unsafe.Add(...)`. I was able to get it to stop doing that, but now i'm seeing `AccessViolationExceptions`. can't figure out why that would be the case. shouldn't the for loops `i < span.Length` check prevent an overrun?
@@nickchapsas That does make sense. The only time I've seen memory corruption like that was in an overrun scenario, so that's what I was thinking this was. I'll look at it again, maybe this is user error. I WAS pretty tired when I was working on this lol Edit: turns out this comment isn't even on the right video lol.
What if I had a list of objects and used it with AsSpan method? Would I be able to mutate the object (ie. change its internals), because from the collection standpoint the reference stays the same?
I think you can. I think it is unsafe to change the length of the list (add or remove) but you can change the data itself even if is of type int (or not objects)
@@nickchapsas Yeah, that was an oversimplification from me. What I meant specifically is calling CollectionsMarshal.AsSpan(list) where list would be a List where Foo is a class defined by me.
@@nickchapsas there will be no performance difference iterating Array vs Span, and why would there be - both represent contiguous memory with minimum overhead. List is a different story because it has some additional logic, and you are paying the (small) price for it.
@@krccmsitp2884 I didn't knew that.. I was thinking about the old calculator trick we used to do in schools during mid 90s, when we first got to use them.
Once you had the span view of your list, you could then divide the span into sub-spans based on how many threads you want to spawn, then apply the same foreach method but from within a Task. So yes you could, but the specifics would require you to understand your problem to the point where you'd be better optimizing at a higher level.
I just tested removing items, but didn't got any error for "for loop" over CollectionsMarshal.AsSpan(items); but foreach without CollectionsMarshal.AsSpan(items); also throw error if collection modified. its common behavior not because CollectionsMarshal. Or I'm I missing some thing? var spanitems = CollectionsMarshal.AsSpan(items); for (int i = 0; i < spanitems.Length;i++) { var item = items[i]; items.Remove(item); }
Great video! I'm curious, is Span still faster if there's heavy processing of the data done inside the loops? What if (as a separate question) you were to perform async operations like retrieving data from a DB inside your parallel loop, implemented with the Parallel.ForEachAsync method? How does it compare when just using Span?
I use Parallel when the tasks are small in number and heavy. I have an app that does six similar tasks, each taking about 500 ms, and it's great. If you needed 3000 ms instead by doing 3000 tasks that each take 1 ms, the overhead of creating each instance makes it a close call. If it's 3,000,000 jobs that each take 1 us, then I definitely would not parallel.
But that's kind of missuse. It's easy to use, but to do exactly 6 things in parallel, you could take async/await, or ThreadPool.queueuserworkitem. Cause you very likely have one thread per task. Parallel Library is made for the situation, when you have more items than threads, it partions the items to the threads.
You can't mutate the collection in any foreach anyway so there's no downside to using the span method it would seem to me. So if you need to mutate the list you're using for instead of foreach anyway and just can't use the span. Very cool!
It's limited to List though, wished it would have worked for IEnumerable or IReadOnlyCollection. Very rarely do I need to iterate a List without doing some work on the data, then I would use a IReadOnlyCollection or IEnumerable.
@@lnagy88 I'm just responding to the idea that "there's no downside". Using a forEach isn't thread safe just because it can't modify the data, you still have to make sure nothing else is modifying it.
@Nick, I'm curious if you've ever encounter a Duff's Device in C# and if you would consider it to benchmark it as well. I would be delighted to see it compared to the span. Thanks for this video in any case.
Cool video as usual Nick! Could a rule of thumb be something like: - few items with no mutations: span - few items with mutations: for/foreach -many items with few operations: span/for - many items many operations: parallel ?
i get that i can't add or remove a object from the list i'm iterating with span, but i can iterate a list and add each object from that list to another list with no problems, right?
With the last example, the comment says not to add or remove items while the span is in use. My question is, can we change existing items? For example, if it's a list of Person which has FirstName and LastName, can we change the FirstName/LastName as long as we don't create a new Person?
Just taking a guess but I would think that as the span references the internal array, that internal array would just be made up of pointers to each allocated object on the heap. So changing the objects themselves wouldn't cause an issue since the that internal array of pointers hasn't changed, they're all still pointing to the same memory locations.
What is the different between for(.... List.Count(), i++) in each for loop instead just store the list length in a temporary variable which I always do this way?
Let's start with the fact that List.Count and List.Count() are different things (second one is a LINQ method) so you literally should access the property. What about the speed when stored in a separate variable, it shouldn't really matter. But you can always perform your own benchmarks if you want :)
@@zhongwang3599 This is micro-optimisation and you shouldn't do this. You assume the compiler is doing, what you note down. But in fact, the pattern for(int i=0;i
@@holger_p I tested. Code it yourself was still little faster at that time. I don't trust the compiler though, and I think that one extra line code effort is not a big deal.
.ForEach is not a LINQ method it is actually defined on List
Nick should get a shirt with "I Stan the Span" printed on it
Coming on a t-shirt near you twitter.com/nickchapsas/status/1523025560774987777
@@nickchapsas Coming on a t-shirt??? 😳😳
I would buy that
WriteLine'd on it*
I like how this channel gives a clear statement on what will be answered in a moment. Example: (1) At 1:12 Nick says, "I'm going to put all the ways to iterate over a list here" (2) This allows the user to pause the video and try and think of ways. (3) Then click play and view the 7 different ways. Its great that something like "Feel free to pause the video and try...." as most viewers know this and would not do it anyway. Side note - I was able to only think of three ways. (for loop, and forearch loop, ToArray().Select(x=>x) )
I've been a C# software engineer for a few years now and until this video I've never heard of a Span (outside the context of html lol). Going to be looking that up, great vid!
Nick is a legend. Integrating "easter eggs" like 69, 1337 and 80085 always makes me smile
I love the humor of using 80085 as your seed and I'm not sure if anyone else caught this old calculator dirty humor
Exactly
Great benchmark.
Good that you explained that the parallel versions are most likely faster when you actually do work inside the iterations.
It always distracts me when you say "half the speed" when you mean "half the time" (ie. double the speed).
I know it might be a language thingy, but it is really confusing at times.
Oh damn you are right. I was thinking it in my head in Greek. In English it doesn't really make sense.
@@nickchapsas Oh wait,so you are actually Greek huh?
@@ZeroSleap Yeap
He a bit geeky, he a bit Greeky
I did not notice that actually because in russian that means the same Nick meant.
Can't express enough how amazing and educational your videos are. Keep doing what you do!!
Great video as always! I also did some collection benchmarks back in April 2022, and i found something different than your benchmarks: for loops were considerably faster than foreach loops. The problem in your benchmark code might be that you access the private field instead of using a variable. foreach will automatically inline the field access to a variable, but for does not do that. In my benchmarks, foreach loops on List were 50% slower than for loops.
It looks like this got optimized in .NET 7
Yep, made similar benchmark tests and in dotnet 6 for 100k iterations and foreach took 75 us, for took 37us and Span Foreach 24us. Nice to see that dotnet 7 has many hidden performance boosts!
@@MaxReble Yep, can confirm, too: I reran my tests with .NET 7 RC1 and for and foreach loops are now nearly identitcal in speed. Still nearly twice as slow as iterating over arrays, spans, or ImmutableArray.
Should be parallel unless future iterations/loops are impacted by previous ones IMO. Going for the fastest synchronous/sequential approach is a great exercise, but independent iterations are the perfect Use Case to be done in parallel.
It might be a good idea to add a benchmark consumer type to actually consume the iteration result to make sure nothing gets optimized away
LINQ doesn't have a ForEach extension method. What's being used in the video looks like the ForEach method defined by List.
Your videos are the best and I have learned so much from watching your videos. Keep up the fantastic work! I've been coding for over 30years and I'm still learning new tricks, thanks!
Didn’t know about Span(), never would have thought to look, foreach was already heaven, thank you.
I was taught never add or remove items during a for loop but dd it anyway, then fast forward to writing multithreaded applications and foreach and Span() throwing an exception is a useful indicator of faulty design.
This was a great video. I love your little deep dives into the C# language.
Very detailed and helpful showing you the results of the typical options Andrew ones I had not seen before
Getting a local scoped reference to the array allows JIT to optimize away the range checks in the loop (technically also unroll but I don't think it does that). It's not possible for a list since there is no guarantee that other code somewhere wont change the length during our looping. But if you have an array, then length is fixed, and you can do a single if-check pre-looping instead of checking the bounds every iteration.
Interesting, this should also be possible for any foreach since the collection isn't allowed to change during the loop, but I guess .NET does not implement that optimization (yet).
As an old guy, I want to add that, if you don't care about the order of iteration (and all the parallel examples illustrate this is the case here), you can run the index backwards, which avoids calling Count, Size, or Length on each iteration:
var asSpan = CollectionsMarshal.AsSpan(_items);
for (int i = asSpan.Length; --i >= 0; )
{
var item = asSpan[i];
}
Note that I explicitly declared i as a signed type so the loop termination condition can be satisfied.
Many of the other examples (not using Span) also fail if the collection is changed during the iteration. The difference is that with the cases that use an enumerator you deterministically get a specific exception, whereas with the Span you just get mysterious behaviour (which is also true for the direct indexing loop).
Just a couple of days ago, I implemented a backward for loop similar to your example and it used a var for the index variable. I wonder if is it really worth using --i >= 0 ?
@@alexintel8029 It depends on the actual processor, but on the ones typically used nowadays comparing the result of a computation with zero is faster because no compare instruction is required. The instruction for the computation (in this case likely a decrement instruction) will set flags in the processor indicating if the result was zero, negative, or the computation produced signed or unsigned carry/borrow/overflow, so it can be immediately followed by a conditional jump.
If you separate the decrement from the compare, a decent optimizing compiler should be able to relocate them so the conditional jump is still right after the compare, for instance treating:
for (int i = x; i >= 0; --i) {...}
as:
int i = x;
if (i >= 0)
do {...} while (--i >= 0); // which can again decrement and conditionally jump with no compare
instead of the more direct
int i = x;
while (i >= 0) { ...; --i; }
However if your exit condition compares with a value other than constant zero, as in
for (int i = 0; i
@@kevinmartin7760 Thanks for the explanation Kevin.
I realise the beauty of your original example
for (int i = asSpan.Length; --i >= 0; ) {...}
It helps
a) prevent accessing asSpan[asSpan.Length] which would lead to index out-of-bound error
b) decrement the loop
c) test for exit condition
Probably the absolute winner: on Stackoverflow I found an example of doing parallel work on a Span, but it involves unsafe pointers, since the Span type itself cannot escape to the heap and therefore Parallel calls on it are normally not possible.
I searched for this because I was curious why this ultimate combination was not metioned in the video.
You can actually combine Parallel.ForEach and Span without pointers. I did some benchmarks and it was faster than all of Nick's implementations @ 1 million elements.
The trick is using Partitioner.Create(0, list.Count) and passing it to Parallel.ForEach along with a closure around list that takes a Tuple as parameter and marshalls list to Span, then slices it using the tuple and finally iterates over it.
@@pedroferreiramorais9773 Oh great :), I really have to dive into that to understand how that works.
Let's say that a very simple solution does at least not exist yet.
@@jongeduard well, you can create an extension method to encapsulate all the logic, but it loses much of the performance gain. It can still be better than sequential span iteration of very large lists/arrays, but if performance is the main concern, you often have to get your hands dirty.
A nice extension method is in order :D
public static void FastForEach(this List source, Action action)
{
var span = CollectionsMarshal.AsSpan(source);
foreach (var t in span)
{
action(t);
}
}
You're invoking an Action.
Invoking an Action is the main difference between the for loop and List.Foreach (the *slowest* on the largest data set).
I didn't benchmark anything, but you definitely should before you use this.
Interesting. I could use this to speed some things up. Seems pretty niche for most of what I do tho, but definitely an improvement where improvements can be made
That final console output is really good resume to keep in mind the use of iterations in c#. Very useful 👏👏
ForEach is list method, not linq method.
Span forbids only adding and removing of items, but not assigning.
And also I don't like benchmarks that don't use data. Some optimizer can remove more than expected. You've shown IL, but it doesn't garantee jit won't change smth.
9:26 How can 1 byte be allocated?
I assume there are cases when compiler (jit) can infer that it's safe to use span for list traversal.
malloc(1)
1 byte is allocated because the JIT allocates, and the benchmark picks it up. That allocation is removed in .Net 7. You can read more about it on the benchmarkdotnet repo.
It's actually more than 1 byte allocated, but the benchmark divides it by how many iterations were ran.
[Edit] Actually, the rogue allocation still seems to be showing up in Net 7, but the cause hasn't been looked into yet.
@@protox4 Thanks for pointing out is is averaged.
It was as expected. Nothing beats the standard WHILE loop except doing things in parallel - which comes with a lot of downsides.
Extra way: walk via unconditional (!) "for" loop, exit via catching "OutOfBounds" exception. Removes double-checking of bounds, but introduces overhead from exception handling. May outperform if the list is extremely huge (throw cost is constant and doesn't scale with items count).
I rarely work with List, would have liked to see benchmarks with lists of classes. I feel like the results would be very different.
Also please add disclaimers to your performance videos. Some might get the idea they should be using span loops everywhere.
There is a similar performance improvement with other objects. The video does have a disclaimer too
I ran the test and on my machine results are as followed (with dotnet 7)
List
Iterate_ForEach | 49.71 us | 0.330 us | 0.276 us | - |
Iterate_For | 40.59 us | 0.241 us | 0.213 us | - |
Iterate_ForEach_AsSpan | 24.64 us | 0.129 us | 0.121 us | - |
List
Iterate_ForEach | 52.27 us | 0.634 us | 0.562 us | - |
Iterate_For | 41.18 us | 0.477 us | 0.398 us | - |
Iterate_ForEach_AsSpan | 25.33 us | 0.480 us | 0.426 us | - |
So, I see a bigger difference between for and foreach as nick does, but the delta of for and foreach_asspan does not change.
Tak!
foreach statement cannot operate on enumerators of type 'Span.Enumerator' in async or iterator methods because 'Span.Enumerator' is a ref struct. Has that changed in .Net 7?
No, and it does not make sense to combine this. You cannot store a span in any object, to run anything parallel.
I'm watching this for fun, it's legit fun to watch. Hat's off
What a great video and information for iterating a List !! I think that this works with a List of object that has plenty of properties...
i think the unsafe part is fine and expected, foreach breaks if you add/remove elements during a loop as well.
That's true. But what if we mutate objects of the list. Say, a person from List while looping. Foreach allows this - will this span approach too?
@@RahulSingh-il1xk Obviously not on values that you access during the iteration. What I am interested in is what happens if I mutate values, that are not accessed during the iteration.
Guys I think you can mutate objects :) its not a readonly span, just a span
because foreach breaks if the list is mutated then they should just make foreach compile into the unsafe part
You cant update collection while doing regular foreach loop either. How come in this case it is not 'unsafe'? How the safety is different when compared to span?
If you have a list items and you wanna foreach it with mutate just do foreach( var item in items.ToList()) and then you can mutate
But you shouldnt mutate a list while your iterate over it anyways
The span will not throw an exception that the collection is modified.
Example:
var list = Enumerable.Range(1, 5).ToList();
foreach (var item in CollectionsMarshal.AsSpan(list))
{
if (item == 1)
{
list.RemoveAt(3);
}
Console.Write(item);
}
Will print 12355, without giving an exception.
It also depends how you modify the list in question.
If you replace items in the list, it's not a big deal; you'll see the changes, and nothing will explode.
CollectionMarshal even gives you a ReadWrite span, this is an expected and intended use case.
If you remove things, you'll get odd and inconsistent results like Stian has demonstrated; In his example you are reading data that is out of range of normal indexers.
This data is garbage; it's either stale, or null (depends on whether references are involved) and the span will not be aware of the new range.
Now if you do add or insert, you'll be fine until the backing store has to expand.
You'll see the inserted data (up to the original size) until that expansion, but once that expansion happens I have no idea what the results will be.
It depends on whether Span keeps the original backing store alive, and that I'm not sure of...
if it does, you'll stop seeing changes in your span, if it doesn't, then it'll either explode in your face, or expose you to C style bugs where you end up reading memory you're not supposed to; the latter is very bad since it leads to issues like OpenSSL's heartbleed.
The short of it is: if you do ANYTHING that can affect the size of the backing store, you are playing with fire.
While you'd normally use an array with Array.Fill for this, this example is completely above board.
public static void ResetCounters(List counters)
{
var span = CollectionsMarshal.AsSpan(counters);
foreach (ref var counter in span)
counter = 0;
}
On ForEach you are unable to, it would raise an Exception.
With a Span, you are able to (as you are with for(int i=0; ...), but you shouldn't cause you can cause a crash.
You shall not anyway, but the damage if you do , is different. That's why Nick shouts out louder.
The compiler or the runtime is not hindering you, you have to do it yourself. That's the message.
There is one more way to iterate - using SIMD (Vector), which has lowest readability, but probably will have the best performance, because it's accelerated on hw level and CPU takes n items in processing at one CPU clock.
This is the best free software Ive seen. Respect.
Watching your videos makes is very humbling and makes me realise that I'm absolutely shit at my job!
Nah trust me you don’t need to know 99% of the stuff I show to be good at your job
What if my foreach loop is inside an async method? I'm getting this error: foreach statement cannot operate on enumerators of type Span.Enumerator in async or iterator methods because 'Span.Enumerator' is a ref struct.
I'm getting same error
I'm not an expert, but I think it's because a Span can only be accessed on the stack (a ref struck physically cannot exist on the heap), but different threads have different stacks so a span on one thread literally doesn't exist for any other threads.
Foreach also has the same limitation, can't add or remove from the list, then we can replace each "foreach" with span version?
Oh wow! didn't know about this. Thank you so much
When a program's performance tests are below client's expectations, would you always go for a Span refactorization?
...assuming that no major blunder was made like a bad algorithmic complexity.
Great video! I certainly could have used this on previous projects but will make sure it gives me the benefits I need for my current ones.
This is so much nicer than my in a pinch method:
Compiled reflection accessing the array, then use that and the count to get a span. Probably a smidgeon faster too...
Spans are amazing....
Love these kind of videos, and so well explained!
Looking at the implementation of List.ForEach I don't understand why this one is so much slower than just the standard for. Is invoking an Action really that slow of an operation?
Yes and it is also prone to closures that can make the problem even worse
@@nickchapsas what if you make the lambda static
I am only 2 weeks into the learning how to code and actually learning c#. What you talking about and showing looks great and fun but it's to much for my brain is right now. Your videos are cool and very informative but just to advanced or geared to seasoned coders. But nice videos, one day I'll fully understand all the things you explain in your vids.
I’ve been programming for ten years and I have never heard of a span before
@@matthewjohnson3656 Same here (25 years!) and just heard of a span a month ago (that isn't )!
Hi, great video! its so usefull. Thanks. I have a question, how can i implement it in async method? what is the best practices in async?
This is super interesting! I never learned about "spans" in any school, I don't even know what it is! I will eventually google it but could you elaborate quickly in case my laziness gets the better of me? 😅Thanks!
This was really interesting. I've never seen the AsSpan methods before. I honestly had to look up what a Span was lmao.
Very interesting to learn this, I could use this in my project. Thanks Nick
Calling it before watching the video: CollectionsMarshal.AsSpan()
Edit: Called it. I would've also liked to see a benchmark which uses the list cast to an IEnumerable, then the IEnumerator would be an interface variable instead of a stack struct which would slow things down because of dynamic dispatch.
Hey I can see that you skipped forward 👀
@@nickchapsas Oh yeah I didn't have time to watch the whole video from start to finish, I used UA-cam's 10 second skips to go through the most relevant parts, but only after I made the comment :D
Nice vid for sure
@@nickchapsas you haven't shown foreach on IEnumerable in the video.
Any reasons to why the span way of doing things is not the default way? (why is the default way not lowered the same way as the span way is lowered)
I assume because getting access to the array backing a list is not a very safe operation
because span version is un-safe. and there are a lot of scenarios when you update data when iterating through it
as per the video, The collectionmarshal provides details that the list should not be modified when iterating as span.
So it makes sense to NOT have it as default bcoz lots of use cases requires iterating through a list and possibly updating the data in the list .. consumers of the list should have the flexibility to choose what they want to do with the list whilst enumerating over it.
pls correct me if i am wrong
@@antonmartyniuk You can update the list while using a normal foreach loop? I think thats not possible either...
Because the span might point to the old version of the backing array when the list needs to resize its capacity when adding more items. At this point when the list is updated, only the new backing array is affected. The span is unaware of this because it’s still pointing to the old one.
You’d also have to reassign the span to point to the current backing array to counter this, but the list may be updated at anytime and anywhere else.
As always , thanks Nick
That is awesome! Thanks Nick
"I'm going to add a seed here." - casually puts 80085 (BOOBS) LOL. Thanks for the small laugh.
I will give you a little tip, on the List as Span method you can actually use it to modify the items with ref, like:
var listSpan = CollectionsMarshal.AsSpan(list);
foreach(ref var item in listSpan)
{
// You can modify 'item' here even if it is a struct
}
You don’t need ref to modify the items. You can just modify them. They are still references
@@nickchapsas
No, if they are structs you will surely need ref, and if you want to "modify" immutable records too.
Think in things like:
foreach(ref var itemStats in item.Stats)
{
itemStats = itemStats with { Speed = 10 };
}
You cannot do anything like that without refs.
Great video. could you make one video on Partitioner?
Fantastic video, great info and very clear. Learned a lot. Subscribed.
I did not expect to wake up this morning and be surprised by a genuinely faster way to iterate a list, one of the most common tasks in software development.
Thanks! I wasn't aware of CollectionsMarshal!
What about Parallel.ForEach & AsSpan?
How much time did you train to talk so fast and clearly at the end of the video? I really thought that I knew how to iterate... thanks for always taking our code to the next level
It's nice that you check the IL code, but what about the JIT?
Very interesting. Thanks for this!
Always great info man.
To span a loop into a 13 minute video, that's what I call a great job! lol
I'm pretty surprised, so they optimised out the allocation of the foreach enumerator?
And somehow also eliminated the overhead of its constant 'version' checks (to stop changes mid iteration)?
I wonder if this is a consequence of that 'sealed' keyword you mentioned the other day, or if other work also went into it
They did optimise it in .NET 7
I have a question regarding the iteration with span. Is it ok to modify the objects in the array as long as I don’t add or remove? Eg changing or updating a property? Thanks in advance
Yeah that’s fine
Hello! At my job, we had an ague with colleagues what is better if you need to represent a collection in your DTO: List or Array. I think array is better. I had my own bunch of arguments about this, but would love to listen to your opinion about this question. Is there any chance you will make a video with close topic?
depends on the usage. For DTOs (since there is no mutations) I prefer to use IEnumerable (most of the time) because it's implemented on almost all collections (Arrays, Lists ..etc). This makes things much easier to work with, and to avoid adding more memory overheads. If you want to only read, then just iterate. If you want to mutate, then copy it to any desired collection type (ToList, ToArray, ToHashSet ..etc.).
Great video, glad I found you.
hi, great video as usual. Do you take topic suggestions? parallel.foreach vs parallel.foreachasync pls .. :)
So only way to grantee the collection is not mutated while using CollectionMarshal is to lock it? Or would it be safe to use if I wanted, let's say get a list of Entities I wanted to write out on a page? I mean, what collections couldn't be mutated at any given time when you haven't locked it?
Locking doesn't prevent from mutation. If you make a contract with yourself to lock it before use in any place, you prevent it from beeing used simultanously. If you make a contract with yourself to lock it before modify, you can modify it only in one place at a time.
Each simple foreach enumerator will throw exception, if you modify during iteration.
The new thing is, a span will not raise an exception, it could crash. That's why the extra care.
Windows firewall was blocking my benchmark that used `Unsafe.Add(...)`. I was able to get it to stop doing that, but now i'm seeing `AccessViolationExceptions`. can't figure out why that would be the case. shouldn't the for loops `i < span.Length` check prevent an overrun?
The position right after the span still belongs to the list/array so it is not considered an overrun
@@nickchapsas That does make sense. The only time I've seen memory corruption like that was in an overrun scenario, so that's what I was thinking this was. I'll look at it again, maybe this is user error. I WAS pretty tired when I was working on this lol
Edit: turns out this comment isn't even on the right video lol.
This was user error. Make sure you don't forget the `ref`s on your `searchSpace`.
Would this be the same for readonly lists?
What if I had a list of objects and used it with AsSpan method? Would I be able to mutate the object (ie. change its internals), because from the collection standpoint the reference stays the same?
You cannot use AsSpan on a List
I think you can. I think it is unsafe to change the length of the list (add or remove) but you can change the data itself even if is of type int (or not objects)
@@nickchapsas Yeah, that was an oversimplification from me. What I meant specifically is calling CollectionsMarshal.AsSpan(list) where list would be a List where Foo is a class defined by me.
Nice video.
I am also interested if there are any actual difference between Span and an Array, which in my opinion should perform more like the same.
There is a difference indeed. I’ve covered this in the dedicated span video if I remember correctly
@@nickchapsas there will be no performance difference iterating Array vs Span, and why would there be - both represent contiguous memory with minimum overhead. List is a different story because it has some additional logic, and you are paying the (small) price for it.
@@МаксФедотов-т7з Actually there is. I also have noticed that
80085, I see a man of refined culture. 😅☺️
The zip code of the Simpsons' home town Springfield.
@@krccmsitp2884 I didn't knew that.. I was thinking about the old calculator trick we used to do in schools during mid 90s, when we first got to use them.
@@AbhinavKulshreshtha Well, that's the other meaning. I know that trick too and what you wanted to indicate. :-)
Awesome video. Always learn something new.
Is there a way to parallelise the span as well for even faster iteration of large lists ?
Once you had the span view of your list, you could then divide the span into sub-spans based on how many threads you want to spawn, then apply the same foreach method but from within a Task.
So yes you could, but the specifics would require you to understand your problem to the point where you'd be better optimizing at a higher level.
This is really amazing Nick! Even in the things that seem so basic in simple, we are finding hidden gems!
how about parallel foreach span? :D
It was funny when I rolled my eyes before you said “Of course it would be a span” and started giggling 😂😂
For_Span backward cannot delete items from list?
I just tested removing items, but didn't got any error for "for loop" over CollectionsMarshal.AsSpan(items);
but foreach without CollectionsMarshal.AsSpan(items); also throw error if collection modified. its common behavior not because CollectionsMarshal.
Or I'm I missing some thing?
var spanitems = CollectionsMarshal.AsSpan(items);
for (int i = 0; i < spanitems.Length;i++)
{
var item = items[i];
items.Remove(item);
}
That is the problem with Span - you will silently get garbage if mutation happens, it relies on programmer discipline.
Yeah that’s why it’s unsafe with span. Because you won’t get the error that you would normally get
Great video! I'm curious, is Span still faster if there's heavy processing of the data done inside the loops? What if (as a separate question) you were to perform async operations like retrieving data from a DB inside your parallel loop, implemented with the Parallel.ForEachAsync method? How does it compare when just using Span?
You can’t use await in a method that uses spans
Excellent find. Thank you
did you try release build?
I use Parallel when the tasks are small in number and heavy. I have an app that does six similar tasks, each taking about 500 ms, and it's great. If you needed 3000 ms instead by doing 3000 tasks that each take 1 ms, the overhead of creating each instance makes it a close call. If it's 3,000,000 jobs that each take 1 us, then I definitely would not parallel.
But that's kind of missuse. It's easy to use, but to do exactly 6 things in parallel, you could take async/await, or ThreadPool.queueuserworkitem. Cause you very likely have one thread per task. Parallel Library is made for the situation, when you have more items than threads, it partions the items to the threads.
What would be speed regarding list of data set. Like array of objects or multi-field lists.
You can't mutate the collection in any foreach anyway so there's no downside to using the span method it would seem to me. So if you need to mutate the list you're using for instead of foreach anyway and just can't use the span.
Very cool!
If you're running parallel code something else might be mutating the collection.
@@jasonx7803 Then maybe you shouldn't use a list and try something more thread safe.
It's limited to List though, wished it would have worked for IEnumerable or IReadOnlyCollection. Very rarely do I need to iterate a List without doing some work on the data, then I would use a IReadOnlyCollection or IEnumerable.
@@lnagy88 I'm just responding to the idea that "there's no downside". Using a forEach isn't thread safe just because it can't modify the data, you still have to make sure nothing else is modifying it.
@Nick, I'm curious if you've ever encounter a Duff's Device in C# and if you would consider it to benchmark it as well. I would be delighted to see it compared to the span.
Thanks for this video in any case.
C# Syntax is much more restrictive than C, it won't allow the original Duff's Device. And, tbh, I'm rather happy about that. 🙂
Cool video as usual Nick! Could a rule of thumb be something like:
- few items with no mutations: span
- few items with mutations: for/foreach
-many items with few operations: span/for
- many items many operations: parallel
?
Which IDE are you using?
I wonder how these results compare to Unity's Job and Collection systems using NativeList or NativeArray or even Span.
Outstanding comparison. 5 stars
i get that i can't add or remove a object from the list i'm iterating with span, but i can iterate a list and add each object from that list to another list with no problems, right?
What if you cache the value of Items.Count and use it in the for loop?
Good point man. Thanks for the video.
If you are algorithm is loop heavy, you can use loop unrolling for more speed.
Thank you, I literally didn't even know about spans.
With the last example, the comment says not to add or remove items while the span is in use. My question is, can we change existing items? For example, if it's a list of Person which has FirstName and LastName, can we change the FirstName/LastName as long as we don't create a new Person?
Sure you can
you can't make any changes for any of your items in the collection while the collection been iterated whit the span method
Just taking a guess but I would think that as the span references the internal array, that internal array would just be made up of pointers to each allocated object on the heap. So changing the objects themselves wouldn't cause an issue since the that internal array of pointers hasn't changed, they're all still pointing to the same memory locations.
@@veselinvalkanov why? Seems like you can change anything including asssigning items unless the size of the list changes.
This helped a lot thank you
What is the different between for(.... List.Count(), i++) in each for loop instead just store the list length in a temporary variable which I always do this way?
Let's start with the fact that List.Count and List.Count() are different things (second one is a LINQ method) so you literally should access the property. What about the speed when stored in a separate variable, it shouldn't really matter. But you can always perform your own benchmarks if you want :)
@@volan4ik.
For_ListCount | 1000000 | 517.7 us
For_len | 1000000 | 513.0 us
For_Span_len | 1000000 | 258.9 us
@@zhongwang3599 This is micro-optimisation and you shouldn't do this. You assume the compiler is doing, what you note down. But in fact, the pattern for(int i=0;i
@@holger_p I tested. Code it yourself was still little faster at that time. I don't trust the compiler though, and I think that one extra line code effort is not a big deal.
Where can I learn the techniques you use like [Benchmark] [Params] [GlobalSetup]?