People don't seem to understand that O(1) is not FASTER than O(n), it just SCALES better; there's a breakpoint. Imagine a "hashcode as a service" rest call vs doing it in memory: it'd still scale better, but it'd take an awfully large 'n' to get there. I see people missing this in PRs, streaming/looping an array is often faster with small arrays than something like a `HashSet`. And in practice, its really unlikely either makes a tangible difference in most use cases.
Yeah, I’ve started picking and choosing what I optimize a bit more. I don’t need to optimize a dictionary that maps an enum to a value when that enum has 30 elements, and the iteration happens once a cycle outside of the heavy loops. I need to optimize the heavy loops
That's true for any O() btw. O()s are the simplification of a function to represent the shape of its curve. Depending on the full function you can get a O(n²) that's faster in low levels than a O(n), all we can say is that eventually O(n) will be quicker but knowing this breakpoint requires testing and depending on application is pointless.
yes thats why you say O is for worst-case, when n gets big. But for small n the bigger O could very well perform better. That's when you need Omega() of your runtime
In functional programming, singly linked lists play a special role of being immutable, and being able to be prepended to without modifying the existing list. Arrays are virtually impossible to implement as anything other than an array of lists, which throws away most of the performance benefits that one would get from using an array in an ordinary imperative language.
1) Yes. 2) Arrays are not impossible, Haskell has efficient arrays, they are just clumsy to work with. Linear Haskell can improve this a bit I think. 3) Immutable data structures exist and they play with cache pretty well. 4) Linked lists are cool because you can have multiple heads share portions of the list.
This is an absolutely classic talk that every software developer must see. I remember writing a lot of linked list code back in the days of DOS and it worked wonderfully... but computers changed a lot since 1990.
Everyone knows that if you want to store a list you store them as bucket on S3 and then if you want to count the size of it you use Spark.This is all nonsense that assumes users know how to download RAM when we all know now that it's in the cloud.
memory caching architectures screwed up all that wonderful 1970s/1980s intuitive simplicity but all of our Comp-Sci data structures 101 courses still get tought as though that were the reality the one place I find I can return to that simplicity, though, is FPGA programming where digital circuits do their thing on a clock cycle - but even there, there is signal propagation latency to contend with, but the design and simulation tools help coping with that
There are some cases where linked lists are highly efficient. One real-world example is dynamically allocating sound channels. When you need a new channel, grab the head of the free list. When freeing a channel, link it onto the head of the free list. Both constant-time operations. And you have to traverse the active list every update, so you have your prev/next pointers to unlink if a channel has ended. And because it's all management of elements within a single array, you don't need pointers, or even to store the prev link. Just a single next index. Another is overhead RPG sprites. They need to be rendered in order of Y position, and you can do an insertion sort every frame to relink them in proper order. Normally an insertion sort is O(n2) because you have to traverse the list to find the insertion point, but because the list is still mostly sorted from last frame, you end up linking most entries to the head of the new list so it's O(n).
Store data in a dynamic array, add a unique ID to each entry and shuffle it every time you add an entry. Every time you need to access an entry, search for the index by randomizing the current index and check if the ID matches the ID of the entry you are looking for.
_Back in my day_ cache friendly code wasn't a thing; now these newfangled caches make the case for arrays stronger than ever. Having that said, solving the problem correctly before measuring performance and making the case for struct of arrays is super important or else you're just gonna have a lot of arrays that aren't doing you any good.
"Cache friendly" was always or never a thing depending on the circles you moved in. Go ask the OG's in the demo scene, image processing and others the lengths they'd go to make sure the performance critical sections of code COULD fit in the cache. The only difference between then and now is that before you had to do some work with limited resources and now you have to A LOT MORE work with (a bit more) resources. Think VGA vs 4K, it's 29x more work just for the basic pixel count, now add "new shiny stuff", and it quickly becomes "any optimization counts". The key take is obviously "performance critical", and most people don't tread those nasty paths, unless they have to.
what do you mean? spatial locality and temporal locality have always been a thing, that's kinda the whole point of caches and why engineers calculated cache hits and misses to find the sweet spot of how much do you actually need for them to start making a difference in performance.
@@ErazerPTn the very old days, CPUs were in general much slower than memory access latency, so cache efficiency didn't matter, in fact, most of these CPUs didn't have caches because memory was fast enough. Of course there were other tricky optimizations specific to each CPU that often had to do with memory layout, for example some CPUs may be extremely picky with regards to alignment, but cache efficiency in particular is a relatively new idea in the history of computing.
I saw this a while ago and it blew my mind. It just shows that when you move theory to practice things may not work how you expect since there's a lot of complexity such as caching, predictable usage patterns, compactness, number of pointer redirections, etc. that significantly impacts performance. It's very difficult to correctly optimize a piece of code without first measuring and understanding where the bottle necks unless you already have experience in tuning.
"Premature optimization is the root of all evil" -Donald Knuth That's not to say, optimized code is bad, but that many programmers will attempt to optimize what doesn't matter to the detriment of maintainability, while ignoring the very small section of code that matters most. Don't optimize 90% of your code before you optimize where your code spends 90% of its time.
This is a good take. I think it's good to educate people that if they need to search for a node to do something with it, then it will be expensive. I personally rarely use linked lists when I have to search for the node I want to do things to, but you don't always need to actually search for the node - when you don't need to search is when linked lists are (more likely to be) useful.
I would presume, it really depends on what kind of linked lists, with what kind of data inside we are talking about. The mentioned example may hold true for the primitives, especially if they can be stored and processed on the stack, but I'm not sure it would be the same when applied to, for example, a list of pointers to the attributes of objects stored on the heap, where addition/deletion is performed based on the logic that requires accessing their values.
A trivial example of where a linked list shines is when implementing algorithms that require queues which can grow and shrink with ease like BFS. Again, very trivial example.
The linked lists gain a lot for some special purposes when the elements themselves have prev and next pointers and you never have to search. But even then not always - contiguous memory really makes a difference.
Exactly. And that's the only case I would use it :) I'm mainly a C# programmer and I never used the "LinkedList" class but I have implemented my own linked list structures here and there where it made sense.
Linked lists aren't good for containers, because of the linear search problem mentioned here. Linked lists are good when they're built into the data type itself. That way, if you have a reference to an object, you can just call something like appendAfter() efficiently, without searching for the insertion point
Exactly. The cursors are a critical part of the linked list data structure. Linked lists without cursors are like vectors with encrypted indices. There's just no reason to do that to yourself.
I like to use a combination of both. Make chunks of max 4K elements (or whatever size is suitable for your cache), store these chunks in a linked list. Insertion and deletion with a chunk is fast since you're shifting stuff already in the cache. Split a chunk if it gets near 4K elements into 2 chunks of 2K elements, or merge 2 chunks if they fall below 2K elements. Should be as fast as arrays, with the flexibility of linked lists.
And if you build a level of the same structure, with each element pointing to one of those chunks, and another up, until you have a single chunk parenting all chunks, that’s a b-tree. Literally THE fundamental index structure in RDBMS.
This was great. I remember watching Bjarne's presentation a few years ago and was blown away by how simple the problem was after he explained it but yet so damning for Linked-Lists. My favourite language is C++ and I miss it dearly (been stuck doing Kotlin for several years now).
A skip list is a clever and interesting method of speeding up lookups in sorted linked lists. It's possible, though, that cache misses still give it a disadvantage to vectors.
one additional thing is the impact of linked list and structures with lots of pointers is heap fragmentation and compaction. At some point, your system may need to compact the memory to get better performance. A lot of programmers don't think about heap fragmentation and how much if affects performance.
I'm glad that our professor emphasized the linear search bottleneck in the DS course. I still use linked list a lot, but the cases are almost always about some other abstract data structures like a stack or a hashtable (for each slot, not the whole mapping, of course). If something by design only uses head/tail removal or requires linear search anyway, linked list is the way to go. When sorting comes into the play, the first things that go through my mind would be an array, a heap or a RB tree. You'd better retake the whole data structure and algorithm curriculum if you come up with a linked list first.
If you're doing only head or only tail removal/addition I'd go to a vector or a vector in reversed order. If you're doing both, I'd go to an array deque which is essentially just a vector treated as if wrapped in a circle. I try to avoid linked lists whenever possible.
If you're not order dependent you can always do a swap remove on the vector, get O(1) there too. i.e. Swap with the last element of the vector and shrink the vector by 1.
I'd say deque is an optimized linked list. I'm not entirely sure what a vector wrapped in a circle would mean from a memory point of view. From a usage point of view it makes more sense, but usage of deque can be thought of as just being a linked list also
@@shiinondogewalker2809 an array deque is a specific implementation of a deque that is the same idea as a ring buffer. The head and tail 'pointers' are actually indexes into the array. If you were to remove several elements from the head, it would increment the head index by however many, leaving empty slots at the start of the array. Then if you keep adding elements to the end, the tail will eventually increment past the end of the array, so what you do is wrap back around to index 0 where that empty space is. If you remove elements from the tail, it will decrement below index 0 so then you wrap around the other way back to the end of the array (i.e. N-1). Adding and removing elements from the array deque's head can wrap the same way. Sometimes it's easier to visualize what's happening by bending the array into a circle so that elements 0 and N-1 touch. It's a wonderfully clever way to keep the lower overhead and contiguous layout of an array without having to shuffle the existing contents about if all you're doing is working with the head and tail.
a doubly linked list will be much faster if you're doing lots of head-insertions or concatenation. Honestly speaking, I can't remember the last time I've removed random elements from my data structure.
It’s not even just cache misses and prefetching like he mentions, but it’s also a linear chain of data dependencies. When traversing a linked list, the out of order CPU can’t do anything for you because it doesn’t know where the next node is until it gets the result of the memory load of the next pointer for the current node. So the CPU is just stuck waiting around for stuff to do while memory loads. Basically, linked lists are the worst case data structure for allowing your CPU to do what it’s good at. Hybrid data structures (lists with fat nodes that contain compact arrays) can give many benefits of both, though.
If the items aren’t sorted, you can just swap the to-be-deleted item with the last item in the vector and then pop that. Super fast. No linked list will beat that :)
Thinking about this.. Useful if you don't have external references to items in the list, but probably difficult if you do. I use linked lists in my game engine in some cases for this reason, deletions only occur via external pointers, no searching required, and because they're only iterated once a frame at most, (and because it runs on an uncached cpu).
But you have just created the exact problem that made linked lists slow. Every single time you want to find something in your vector you need to traverse through the entire vector first.
It is all about use case. There is no best data structure to store multiple elements. If you only need to store multiple elements and dont care about order or finding specific elements, say a game where you have a list of objects to update each frame (note, not render order) then an array where you delete by replacing with the last element makes an amazing data structure.@@christopherpepin6059
@@christopherpepin6059 No? He did not. He's assuming you already know the position of the item, the index, and if you know the time complexity is constant. With linked lists even indexes need to be traversed in order to get to the correct element, and this is not what make them slower than normal lists. What makes them slower is memory spreading and being prone to cache misses, in this sense a normal vector would perform better even if you need to iterate it entirely first.
Just gonna say my thoughts as I watch: -Okay the thing about the linear search dominating makes sense at first blush but that's just bad usage of a linked list. Generally you will couple it with a hashtable from some key to the node. Any simple LRU cache would work like this. -The nodes don't have to be malloc'd randomly. You could malloc sizeof(Node)*1024 and then dole them out. -Linear search is especially bad for linked lists because with an array it can prefetch elements that it knows you are going to touch. Not so with a linked list because it's all through pointers so who knows what you're going to need next. Again, use a hashtable if you need lookups into your linked list. Linked lists are good at insertion and removal, not at random access. Nothing new there. -"Vectors are more compact than [linked] lists." No, see above about mallocing a pool of nodes. -"Always use a vector unless you can prove a map is faster." For a trivial case of searching for a given integer in a vector of integers, a map becomes faster after about 100 elements (from my testing a while ago. Also depends on the implementation of those data structures. I did my test in C# (List vs Dictionary so take that with a grain of salt). Linked lists are very powerful when you start using them how they are meant to be used. They are not merely "a worse vector." They afford different things.
I think there is a small implied assumption when stating that insertion/removal is faster in a list, namely that you already have a pointer to the desired location and linear search is not an issue. Just like with an array, you don't usually search the element to be removed, you just index into the array by keeping track of the "interesting" indices, with a smart use of a linked list you usually keep track of pointers to the "interesting" nodes. In this circumstance, the list is better.
A good example of when a linked list beats an array is if you get the data served randomly and you want to sort it, and you also have a HEAD that keeps track of the current Node's position. Then if you want to sort the random data in ascending order all you do is: Move head forwards if current node is smaller, move head backwards if current node is bigger. Insert when value before is smaller and value after is bigger. (Which is O(1) Now at worst you're doing a linear scan, but usually you're only moving a few nodes at a time depending on distribution of numbers. If you did an array however, even if you did this HEAD approach, you would need to to first find the number, which is a linear scan in the worst case and on average only a few array cells. But even if that type of perf is similar, once you get to actually inserting you are now having to shuffle n cells to accomodate the new element.
Very good to pick up this topic. Honestly data being randomly all over the place makes these JS/TS languages even more slow than the slowdown that GC uses... Btw I have a data structure that uses a locally linked liist with all kinds of trickery to make it still continous in memory. However you skimmed over whatt he talks about having multiple bytes wasted for the forward and back pointers... In my case the data locality is not so shit - I totally try to make linked data close, yet I had to fight with the pointer overhead too! That is for example instead of using pointers I started using 32 bit indices when there is a jump - just because its smaller. Imagine this data structure templatized with x,y,z float triplet (like Vector3), which is small enough that 32 vs 64 bit size wasted for the pointer counts HUGE - not just in memory use, but in cache use. also people seem to think "not used memory is wasted memory" however memory in use also means cache collisions more often... winblows people never understand why its better to have minimalism, than the OS trying to keep everything in memory for "what if the next moment it is needed". Page table handling becomes harder, addresses for cache lines that mix with each grow, all kinds of subtle issues. TL;DR: pOOP style really hurts performance...
@@marcossidoruk8033 Thta is good advice, was toying with the idea after seeing that in JohnBlows language supposedly supporting it on language level. However actually yesterday night I did a more nasty thing: I literally found a way to overwrite parts of the "data" as "key" and mark a topmost bit on that position: this makes me need a preprocessing step in my data structure that now I separate based on that key first into two of these local lists. This only works because I provide accessors that "undo" this operation on the fly. It also has a lot of pros and cons - will see how it fares and real use cases not just in local tests, but microbenchmark-way it looks plausible "hack".
The main reason for avoiding linked lists and preferring array-based structures is that the machine memory since dr. von Neumann and dr. Turing is an array, not a linked list.
TLDR and for grug: Make linked list inside array. Advantage of both linked list and array, less disadvantage. The magic of data structures is that you can combine them. If you allocate your linked lists with an arena allocator(pool allocator, bump allocator), then the point Stroustrup makes doesn't make sense. Linked lists aren't defined as having to use malloc (even if the do most of the time). Arrays(vectors) vs Linked lists doesn't make sense in all algorithms, as some necessitate one or the other. If you do what I suggest (arena allocator) there is cache locality, no memory fragmentation, the only draw back that remains is the memory overhead of the linked list structure.
@@ea_naseer more like store the list inside a growable array (vector in c++) so when you need to allocate a new list you grow the array and store it in the next cell.
If the size of the thing you are putting in each linked list node is comparable to or larger than 8 kB, linked lists start performing better again (though trees will do even better since for large nodes extra pointers per node are cheap). But the default page size in virtual memory is 8 kB, so fetching the next element during a linear scan will be a cache miss anyway even if you use a vector That 50-100x ratio is basically the ratio of the node size to the memory page size. Also, Rusts btreemap is _incredibly_ underrated if you plan on iterating through your keys or look up closely related objects often
There is no guarantee an array is consecutive in memory. Not in virtual memory, not in physical memory and not on disk. If you look into the malloc functionality you will discover this. You will also encounter how they store long lists of things to manage the always moving, allocating, deallocating and even resizing of the allocated page list.
Almost always vector is the rule. Unless it's C#, because C# List is equivalent to C++'s vector. The only reasons to ever use a list for storage are if it's intrusive (like intrusively linking slabs inside an allocator) or if you have a collection that's constantly changing and your objects have a highly complex move.
I think what he is missing is vertical scaling. Make the stored type 10 or 50 bytes and the overhead of the pointers will go away and the copy issue of the array will get a lot worse. Would like to see the performance graphs with larger types.
The list would probably still be worse because each node would get copied into cache with every cache miss, which is every access. The vector would get the benefits of prefetching whereas the list would stall on every indirection.
50 bytes is nowhere near big enough, it has to be at least a kilobyte. Linked lists suck because of cache misses and data localisation. Because of this the stored data in each element needs to be comparable to the page size. At that point you are probably storing objects when a vector of pointers to an object array/vector is faster.
I remember seeing this video around 10 years ago when I was in university, and having my mind blown. Very satisfying seeing the same look on Prime's face.
You could also put a linked list into an array, where you store the next and the previous index in each element. And the removed elements, which are still inside the array also form a linked list in the same array. No custom allocation, pretty cache friendly, and almost all operations are O(1) until you resize. And sorting doesn't need to copy the elements itself, but only the indices.
Doesn't support insertion between elements (or it does, but then you need to waste "tombstone" elements between each real element and also you can only do a fixed number of inserations in a position).
@@nonamehere9658 It does support insertions between elements in O(1). It only has empty elements where elements have been removed. And I only can do one insertion per index, or multiple insertions before and after indices. Indices are not related to the position in the list, though.
I suspect this is still slower than vectors. You still have to store two pointers per element which makes the dataset larger. Binary search is impossible and linear search will randomly jump through the array if it's unordered. And if the values (without pointers/indices) are only integers then sorting is guaranteed to be much slower compared to a vector. And this isn't just about cache efficiency, if you do a linear search through memory without indirection the compiler and CPU can do further optimizations.
If you do head or tail removal, you would leverage a dequeue or other FIFO like structure made for that, to still maximize cache hits, not a linked list. Pick the structure that minimize the services you need of it :) I would also say, off the record that it is bold to go and comment on a Bjarne Stroustrup presentation !
4:07, 6:48, that's bull, just keep existing allocations even if they're removed from the main list, like this: buff = all head = ... kept = NULL when a link is removed just move it to the "kept" linked list instead, whenever you need a new link you check to see if "kept" is NULL, if not take from there, otherwise you make an allocation. The links are instead offsets from the current link determined from a simple link->next = a - b which in turn is used for link = link + link->next, the "head" and "kept" will always be an offset from buff, this method gives the speed of buffer allocations + the speed of linked list management
In specific use cases, you can make using a vector even more efficient. For example, let's say you are first populating your vector with values that occur in some set of data, and then after that you are deleting from that vector any of those same values that occur in a second set of data. So after you populate the vector, then you sort it, and then when you go into the deletion procedure you do NOT move any items in the vector at all, but merely NULL those entries. No moving necessary. And then, if you want, you can create a second vector, traverse the first vector and populate the second vector with non-null values, and then deallocate the first vector, and just use the second vector.
BRAVO 👏 EXCELLENT use of raw C pointers. You DID mean to say you are populating the vector with the addresses of the values contained in the first data set correct? You said "values" but I didn't want to be a d*** and assume you were trying to assign NULL to stack values. The sort is probably better saved until after the pointers/values are copied into vector 2 though- as vector 2 should be smaller.
@@malusmundus-9605 You're correct that the vector contains addresses to the values. When you sort is a use case. Typically I'm using a binary search on the vector, then checking for matching values, which if found means it's a duplicate - and since I'm looking for unique values, I null the dupes. A different usage may not require the sort up front. (My context here is string values.) I save processing time because nothing is moved until the end when I copy the "values" (addresses) into the second vector, skipping the nulls.
@@malusmundus-9605 You're know, thinking back over the years, I think I did actually arrive at that kind of method when coding in C. But then I've used it in C++, Perl, Java, and C#.
@@steveg1961 yes I do this sort of thing when I code in C and C++. I don't why but ever since I started coding in C and using pointers I think of copy/move operations as being "fat" and I tend to avoid them whenever possible... even though in reality the difference is negligible on most modern machines. Hey how do you like Perl though? I know it's kind of an old language but I was thinking of picking it up for fun. Seems like it would be nice to have in my repertoire for parsing strings. (I like Python for this but it's really slow)
@@malusmundus-9605 Perl is great for string parsing/matching. (The regular expression system is a dream.) But the one thing about Perl that irritates me to this day is how you pass parameters to functions - klunkiest method I've ever experienced. Other than that, Perl feels concise and elegant to me. Note that I haven't used any version beyond Perl 5.
Where LLs are used, like in Linux kernel one does not do list traversal. Things have pointer right to the node they care about and add, remove/add it/other nodes right there.
I've seen the weirdest linked list interview challenges... putting numbers in a linked list digit by digit in reverse order and then "adding" the linked lists and turning the result back to a number. Never in my career did I encounter a use case for linked lists. Now I realize I've been doing math wrong the hole time...
As a general rule of thumb, if you have to look for the element in your list, linked list are usually beat by some kind of array construct, because the search is still the same assymtotic speed as the more expensive operations on array constructs, so it comes down to the exact cost inside that, and linked list can easily end up with a lot of overhead. Now, one can then try and come up with some specific cases where linked lists should be better, but in practise I do not recall an instance where a well set up array construct could not handle it more efficiently. In the end, the main use of linked lists are as a basic concept of nodes and graphs, which can then be used to construct more advanced datastructures, like trees, that do contain some more advantages. Also, it can often be useful to fist think of a problem in terms of a node datastructure, and then figure out a way to effectively turn it into an array construct of some kind, for instance you can handle a lot of cache-misses by simply implementing your linked list with indexes in a larger array instead of in general memory. In reality, most of structures like this are not used with the purpose of performance, but because they can be more readable, maintainable, and easier to work with, especially once you build several layers of such things together. In most cases I use a hashmap of some kind, it is not because I really need the special hashmap performance (hint: hashing can be expensive, and hashing cheats a bit to appear to have constant operations, by just starting with large enough constants that you would not reasonably escape to the part where it stops being constant), but because I want something to be stored based on just some hashable value, usually either because it is convinient for the problem or because it makes it more readable or quicker to write. In practise, you often get a fairly performant result by structuring the inner part of the logic in array constructs of some kind, and then you can build the other structure through nodes, maps and other convinient data-structures. This works, because a lot of the advantages of array constructs fall off once the content of said array are pointers, and directly and fully building larger problems with multiple levels combined into efficient array structures is a nightmare in maintainablity.
There appears to be two types of linked lists taught: one where a node is a separate object, and one where the node is a composite member. The latter I have only seen done with kernel programming in C, because constructs like CONTAINING_RECORD cannot be done in high-level languages such as Javascript. For example, if you have a process object (struct task_struct on linux or KPROCESS on windows), removing that process from the list does not require iterating all processes because you can directly access the list_entry member. Whereas for generic containers, the node is external and must be searched.
Funny thing is, the first linked lists I saw was when studying source code for MUDs, and they use a lot of those "intrinsic" linked lists. I thought that was how all linked lists were done. I have never used the linked lists with separate nodes in my career, and have a few use cases for the intrinsic linked lists, but mostly for gamedev.
I've done some work of trying to optimize algorithms, and this is my opinion at the moment: - The tighter you can pack your problem in memory the faster your algorithm will run. - The CPU is fast, memory is slow - It may be best to think of memory as linear-access rather than random-access. The number of passes through memory your algorithm needs determines a lot about performance. - there isn't much that can substitute for just testing different implementations of your problem. (Though the specifics of your system may influence the results a lot)
That was kinda the same argument for Unity engine to introduce ECS, they explained how OO code was slower, and using arrays was faster because objects were contiguous in memory, benefiting from CPU caches
We were taught the same thing about using lists, but the condition on these algorithms was always that there was no space limit aswell as allocation and other things that are done in the background were not considered/ irgnored, to fully focus on the efficency of the actual algorithm you'd write
Maybe I missed it, but when doing the insertion on the linked list you have the pointer to the last thing you inserted. if the next random number to insert is bigger than the last you don't start at beginning of linked list, you start at last item you inserted. I haven't thought this out, but it seems quicker
To me linked lists are pretty cool as a concept. Since they are just nodes floating about, with easily rearrangeable connections, you can avoid linear thinking that comes with arrays. One cool example is a structure with O(√(n)/2) random access time, and since it's a linked list the deletion/insertion has the same scaling time complexity.
These criticisms of linked lists are usually based on certain assumptions of bad surrounding programming practices which make linked lists perform badly. Insert IQ bell curve meme. The first assumption is that nodes will be individually dynamically allocated with e.g. malloc/free. In fact, allocating linked lists as a static array or on something like an arena allocator - potentially all at once, instead of individually per element - can eliminate not only tons of allocation overhead, but cache misses as well. If list elements are linearly allocated, cache coherence is very high, and you only have to worry about fragmentation as elements are shuffled around, contingent on your usage. The next big assumption is that pointer traversal is inherently slow. Not so. As above, cache misses are the main cause of slow pointer traversal on modern hardware. Following a pointer to a location in cache is very fast. There are a couple other related assumptions I'll point out - namely, the assumption linked lists are being used as generic containers, as non-intrusive linked lists, and that the element size is small. Generic containers are indeed not always an ideal use case for linked lists. Intrusive linked lists are also often more desirable. And the overhead is amortized somewhat when you are using larger elements, such as game entities, versus having a generic linked list of small elements like integers. If you keep all of this in mind and use linked lists tailored to your specific use case, with allocation patterns that aren't stupid, you will get dramatically better performance out of them. Even fragmentation, which slows them down significantly due to cache decoherence, usually indicates usage patterns that vectors/arrays are very poor at (lots of insertion and removal from the middle). Here is a good writeup by Ryan Fleury of the case for linked lists: open.substack.com/pub/ryanfleury/p/in-defense-of-linked-lists
I've read this article and it's a whole lot of nothing… Basically, the author says all standard implementations of linked lists are bad, lists a lot of troubles with linked lists, and says you can solve them (at least some of them like cache misses) by making them less of a linked list by clumping multiple data in a single node. Another warning flag for lined lists not being all that cool is that he advises to handcraft an implementation of each linked list based on the circumstances, use a bunch of tricky optimizations, and all that without any confirmation in a form of benchmarks that it's actually faster in a real world scenario. Apparently rather than citing sources or producing working examples, it's easier to criticize other people for not being scientifically rigorous, even though all they do is express very reasonable skepticism, e.g. to the "there is also no guarantee that nodes have any locality in memory" the author responds with "this is not a well-formed nor meaningful criticism of linked lists, because it is assuming concrete information (namely, how linked list nodes are allocated)" - NO, IT IS NOT ASSUMING! It doesn't say there is no locality, it questions it's reliability. That example is probably the biggest offense to logical reasoning I found in this article, and I wonder how much more absurd is hidden there in the convoluted reasoning, that I didn't care to completely unwrap - nor did the author care to thoroughly explain, even though he aims this to be a defense towards the beginners, who are allegedly mislead by linked-lists-critics. All in all, a whole lot of mental masturbation all just to unconsciously DISPROVE one's own thesis. Sorry for the hateful attitude, but with the little actual merit of the article and unjustifiably confident tone, it could as well be some GPT generated text.
There are two more aspects that didn't get mentioned: (1) In order to end up with a neat compact array/vector you either need to know the total amount of elements, or you'll end up having to increase your capacity if you run out of free slots. And at the very end you either have to compact your array or live with having unused slots. (2) Arrays are easy if the only thing you have to worry about are elements of the exact same type - a.k.a elements that have the exact same size. That however is usually not the case with random objects. That's the same issue relational databases have with text-based columns. You either restrict the application to strings of fixed size and by that waste unused space, or you use pointers to text content of arbitrary size.
["If list elements are linearly allocated, cache coherence is very high"] Do you perceive the stupidity of this, right? If the point is to keep the memory contiguous then linked lists , they render to be useless. You use a linked list because of the constant time insertions/removals so you will still end up in the same problems as you have with normal heap-based linked lists.
A lot depends on the language, compiler version, operating system, architecture and usage patterns. Always benchmark and look at your metrics before jumping to conclusions. At first prefer the simplest, most readable and most idiomatic version of the code. Optimize when absolutely necessary.
yeah head insertion/deletion, or things that can be effectively reduced to head insertion/deletion. if the problem you're solving is amenable to having a "cursor" (or multiple "cursors") linked list would still be better because it avoids the search you'd need for true random access. has the side benefit of letting you do thread safe access with fewer locks too, although i think in most practical cases you don't actually need every thread to be able to randomly access any element in the list at any time. if your access pattern is random you'd probably want to divide your list up among the worker threads instead, or have some kind of batch update scheme.
Cool topic and I admit that I didn't think about it like that, either... but I always arrange my data in arrays if I can. That includes the way I define objects. If "some thing" has to do the same function a lot of times, I allocate arrays for the data. The main reason why I do this is memory layout. By pre-allocating a fixed memory block and checking range all kinds of nasty dynamic bugs can be avoided because the system always knows how much memory it has left.
Every language construct is a tool. Use the appropriate tool for the job. If it's a short list and you don't expect it to grow you can get away with an array. Push comes to shove you can always realloc if you need to grow the array, and using a logarithmic scale for memory allocation can help. The advantage of a linked list or deque is insertion and deletion times are fast, you don't need to worry about indexing (if you shuffle an array any indices you had previously are now invalid). Sorting can be more efficient in a linked list than in an array (and I would say if you need fast sorting, just go with a self balancing tree structure). Disadvantages are that it takes more memory, and searching is O(n) vs O(1). If you have less than a hundred items though O(n) and O(1) are pretty close in terms of magnitude. Memory ends up more fragmented, which can be an advantage on smaller systems with less contiguous memory available. At the end of the day they are just tools, and just like you don't use a hammer to turn a screw (you could always brute force it, but that never turns out good), you should use the appropriate tool for the job.
I was hired as a 'systems administrator' and I was looking over the shoulder of the guy who was working on a custom MySQL table engine that was key to what the company did. I pointed out an indirection in a data structure the guy was using and told him that it would be a lot faster if he removed it. He did it, and I was right, and he was a little surprised and rather pleased. 🙂 I also dissected the file structure of a proprietary b-tree and wrote code in Python to traverse it and highlighted a way in which they were doing the comparisons wrong in their own code to manipulate this structure that would result in an inconsistent b-tree.. They fired me after I worked there for three months, and after they had begged me to switch from being a contractor to being an employee... it was all pretty obnoxious.
The one place I see linked lists pretty commonly in the Real World is in allocators, for example if you're managing your own free list. Since you can't control what malloc actually *gives* you as far as a memory address, it can be helpful (and way faster) to keep track of the pointers that you *are* given and their size, and serve those up first when a new allocation is requested, before asking for more memory. This is mostly a video game code thing, and this saves cycles, so while it may seem like we're splitting hairs over fractions of a second....we are. We try to avoid *actual* allocation and deallocation at all costs.
It is a really good discussion, we tend to overlook how much more performant contiguous memory can be. However, if I did not miss it, shifting of the array indices is not mentioned here. I believe it is convenient to point out that for operations that add/remove an element towards the start, list can be more performant as there will be no index shifting. If same operation is done for an array, it takes O(n) to modify the indices of elements in the case where the first element is removed. It is not easy to tell at which point array starts to compensate for shifting operations by providing more performance via contiguous memory. However, for tasks where we are more interested in first or last elements of a list, it would be wiser to work with stacks or queues. Considering this, if we are generally comparing list and arrays for manipulating elements of mostly mid-indices, array seems to be the way. Even if it performs worse in some cases, it is more measurable and predictable as stated in the video.
Except even in cases you mentioned, it's literally faster to shift thousands of elements than it is to allocate yet another memory block and follow thousands of pointers to where it's supposed to be inserted. "Can" be is irrelevant, because it's practically never. C++ has std::deque which is most likely linked list of bigger memory blocks than 1 element, and even that is often outperformed by std::vector even if inserting into the beginning of the vector. If you measure everything except std::list and it's still not fast enough, only then you check std::list, except it's more often issue with what you're doing and not choice of data structure.
@@shinobuoshino5066 I almost totally agree with this and can see following pointers would be much slower compared to arrays. I was just curious about when we generally insert/delete items at the start or end, where we are concerned with only 1 pointer. If you are neglecting this assumption, "can" be is indeed irrelevant, but only if you neglect it.
@@shinobuoshino5066 You dont need that smartass tone, especially when I already told you that I generally agree with you. I am aware this can be naively tested with some timers in the code as a quick setup, provided that there are sufficient number of elements to make timer execution negligible. I will probably do it and see myself.
I had a similar argument about 20 years ago. My "you need to find the thing before you do the thing, so it'll be trash" ran right into "it's computer science!" I quickly gave up.
I would try making a parallel index of pointers, in an array, ordered. The objects would be stored in a linked list because of faster allocations. But for search, an ordered array of pointers without rearrange would be good, depending on the size of elements.
this is so cool. i'm self teaching rn and learned about the different cpu components yesterday. abusing the cache with a simple array over a more complex structure like a tree or list with nodes sounds super interesting. i'll start to code up a hybrid in c with different node sizes where each node is an array that fits in cache. gotta do some more research tho how the diffrent cache levels interact when loading a node like that on my intel cpu. on an amd something like this could also have insane multi core performance when the infinity fabric is abused with each core dealing with a different array(node) with, tho i did not study amd cpus yet and don't have one
In Haskell, linked lists are used more than anything else, and it's common for a list to be produced by one function while it's being consumed by another, so the whole list is never in memory. For updating/inserting/deleting elements at random positions in a container, I'd use a sequence, since if you use an array, you have to copy the entire array every time you update one element.
Linked list is taught as an practical alternative to arrays to fledgling programmers, but raw linked list is actually only useful as specialized data structure.
Sometimes, it's okay to increase indirection if it makes the overall structure of your code more manageable, testable, and maintainable. If you're making a game or working in high-frequency trading, go ahead and be particular with how you allocate memory and pass data around, but if you're working with enterprise microservices handling thousands of API calls per second, a few cache misses is negligible compared to network latency, load balancing, and visibility of issues. Software is full of tradeoffs, and part of being an engineer is knowing what things to value in each situation.
this is all very valid. C/C++/Rust etc people think about this stuff all day long, and those language are designed to give you this level of control (or ability to shoot yourself in the foot). However, if you are trying to manipulate your interpreted / jit compiled dynamically typed language into using efficient memory layouts... just don't... use a different language for the bit that needs this level of performance.
No they don't. Cniles implement linked lists because they're too braindead to write a proper malloc + realloc sequence correctly. C++ programmers just use std::vector and never care again because std::vector practically never shows up in profiler. Rust has no programmers.
Linked list is better when we need to insert or remove elements in the middle of lists and we also already have a reference to element next to it, which is kinda very specific. Removing head element, like chat said, is not really strong reason (but still good one) to use linked list, since vector with ring buffer (or something like deque that usually implemented using vector) is still better. Real use case is modified linked list that usually used by functional language to do "immutable" list with less memory footprint (less copied), which is understandable.
@@ThePrimeTimeagenyeah, that sort of thing is basically the main/only way to efficiently use memory in a Pure FP/immutable paradigm - you're copying everything, but because everything is immutable that's just passing a pointer in practice, and you structure your data so that when you do change things you only have to allocate the bits that have changed. I've seen this video before and to me it seemed like a long and unfocused way of saying "use the right tool for the job and be aware of how your structures exist in memory" idk. 🤷♂️
good take! some more examples where linked lists shine are retained mode guis (like the dom), intermediate representations in compilers (like ssa), and language runtime stuff (memory allocators, sometimes work queues). (incidentally, those are the things i work on primarily. so i could definitely see linked lists being useful in other areas too!)
This also gets much faster when you’re talking about caching. Data locality is king and keeping your data compact like this enables stupid fast lookup because the memory will always be in CPU cache. This is literally the idea behind ECS (entity component system). Essentially create a compact list for each component of an entity rather than Malloc a bunch of objects with the components attached. Because the components are compacted and destructed in this way, you get more cache hits and improved performance for your simulation
When I was learning about linked lists using Java, I made them manually. Each element (data + pointers) was its own object. While debugging I dumped the pointers of the objects to the screen, and was horrified to discover that objects were scattered around the heap. The compiler knows how big each object is, why not pack them together--which would at least give you some cache hits. But that's not the way Java works. I've never tested C++ to see of the compiler randomizes the location of things in the heap. Perhaps I should. Anyway, I believe Stroustrup. Doing a linear search of a linked list is far slower than doing a linear search of a vector (or an array). With the list you not only get to fool around with pointers + data, rather than just data with the vector, but if the compiler randomizes the location of the elements you are guaranteed a cache miss for every access. Of course that'd be the compiler's fault.
if you have a pointer, or an iterator/IENumerable depending on your language, it's fast to insert around it. It's only slower if you use it the same way as you use vectors, WHICH THEY ARE NOT MEANT TO. There's a reason there are both vectors and linked lists. I sped up a lot of my programs by using linked lists. I remember having a competitive task where I had to insert N numbers before Mth character. I took the iterator to the Mth character, and just inserted to the list from that iterator. IRL applications are queues, dequeues and stacks. If I know how much I need to insert into memory, it's the array/vector that's the best. But if I have teoretically unlimited insertions, vectors quickly start to have bottlenecks, and also the memory taken by it, since there is a massive memory padding in huge vectors. (it is worth noting the built in stack, queue and dequeue are all using vectors and not linked lists tho)
If you need to keep the order of insertion, a LL with a hashmap matching values to their node (or ancestor node, in a singly LL), works just fine. I know, not the most common use case ever, but, still...
This is why you really want some kind of tree of arrays, with the nodes annotated with data that lets you look up by index rapidly. Ideally each of the nodes sits in a page, including the node annotations.
You have to do a linear search on an array if you are looking for a specific thing within the array....the only difference is the reference look-in and the cache misses that cause a slowdown, but it's still considered O(n). The bigger point is when you have data that stores a pointer to within the linked list for fast referencing, and when that object is destroyed, it can quickly delete itself from the list without having to move all other elements at a O(1) speed to search. These are also more thread safe and can be handled in multitasking efforts unlike arrays where elements can change mid-search (a linked list that changes other elements within it won't affect the read because the list doesn't shift). Linked lists can also be sorted in-place without ever moving memory position. A bridge can be formed in order to move any active throughput points into a new set without effort, which makes sorting also thread-able. Objects can move between linked lists in-place and thereby anything that has a reference to it will not have to change its reference, retaining all other data across the move. So...basically....arrays are great for....quicksort stuff? I honestly don't know the point of an array unless it is literally a math vector or a place to store sync or roundrobin data....linked lists are better, at just about everything. I'm sure you can inch out a few ms of speed, but I prefer flexibility and expandability over perceived speed (which can be improved using several links in an octree fashion anyway)
Just for reference, default Haskell strings are linked lists of 32-bit char values. The first thing you do to improve performance is replace the default string type with one from the text library. These are byte arrays containing (as of v2) utf8. That one change can give a 20x speedup in string manipulating code. If even the weird hair-shirt pure functional language can benefit from arrays over lists... how are you confused about the utility of arrays?
@@okuno54 you are comparing static to a dynamic and then saying it is the reason for its use case.....yes, as I stated in my comment 5 months ago, "unless it is vectors or sync/roundrobin" (static) then there is no need for an array. Majority of programmers will be quick to use an array because that's what they are taught, and then they start performing moves, insertions, deletions and sorts....then it becomes a dynamic type, at which point the array becomes useless in comparison. Also, bad implementation is still bad implementation....giving one as an example does not set the rule either.
To be fair, linear search insertion is not the only method. There are many contexts where you would just pass the node directly before or after where you want to insert. But these are probably not as common as needing to do the linear search, so definitely a valid point.
To my understanding of the question, if you have 100k elements and need to remove the one at 702, you can index directly to that element with a vector and it just requires a memory copy to move the elements down and that memory copy is amazingly quick because the data is contiguous in the cache. With a list, it's never going to be contiguous and possibly in a different numa and you have to traverse the list from the start to the desired node to remove the element.
Only time a linked list made sense to me was when I was storing really large state objects in a ring buffer (2000 simulation steps "forward") for visualising orbits. It was basically a huge FIFO queue and using a linked list meant I didn't have to reorder anything
even then... a properly allocated ring buffer (an array) with a head / tail pointer can work for this thing too! (maybe, i don't know the exact details)
@@ThePrimeTimeagenthis is true but you'd still have a ring array of pointers to heap allocated objects since they're so large. In the end it would probably be the same speed since you're still looking at a pointer dereference per 20kb block. Within each state block the rendering is iterating over super compact data oriented position states. So uh. Overall. Performance is the same lol
B-trees have an advantage for searching, so groups of linked lists should allow fast searching, then you have the advantage of inserting / deleting on a linked list. Like having a filing cabinet with three drawers, ages 0-25, ages 26-60, ages 61+
I don't get where the idea to use a linked list comes from for this example. Sometimes you already have a pointer to the insertion/deletion point, I feel like that's the only interesting use case to compare. Like "userptr" in some library, a dictionary matching UI element to a linked list node or something like that, no linear search involved.
Just as I expected, it all boils down to the bizzare effects of modern cached processors. I wonder if Stroustrup is still right when it comes to a list of numbers that far outstrips the CPU's cache. It could be that the CPU's loading of cache-sized chunks improves the shift-by-one process so much, that this is even true for arbitrarily-long arrays/lists. But it's also likely to be affected by concurrency since we use shared-cache multicore processors instead of true multiprocessors where each CPU has its own cache. So a completely external event such as a user clicking over to a UA-cam video while your program is running could possibly change whether vectors or linked lists are faster. In today's environment, the only way to know for sure what's faster than what is direct experimentation in realistic conditions.
I kinda thought the optimal would be somewhere in the middle, where you would map to vectors which are the size of a cache block. So you load vectors of like 10000 elements into the cache, then go to the heap periodically to fetch the next block.
It seems like what's really happening is the list algorithm doesn't use as much information as the vector algorithm.
Vector memory allocation algorithm:
- Understand how much data is expected.
- Allocate contiguous memory for needs.
List memory allocation algorithm:
- Don't care about how much data is expected.
- Allocate memory one chunk at a time with no demands for contiguity.
I'm probably going to spend an hour dabbling with a different approach for lists: use the number of entries needed for the list to malloc a single contiguous chunk of list nodes and initialize that contiguous set of nodes as the initial list object that will have elements randomly inserted & deleted. My hunch is that this will even the playing field until the number of deletions grows large enough to disorganize the initial list memory block. Will update with results if I get around to it.
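A minimal sketch of that pre-allocated approach, assuming a trivial int payload and made-up names; the pointers stay valid only as long as the backing vector is never resized:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node type; the payload is just an int here.
struct Node {
    int value;
    Node* next;
};

// Allocate n nodes as one contiguous block and pre-link them in order, so
// the initial list is laid out exactly like an array.
std::vector<Node> make_contiguous_list(std::size_t n) {
    std::vector<Node> pool(n);
    for (std::size_t i = 0; i < n; ++i) {
        pool[i].value = static_cast<int>(i);
        pool[i].next = (i + 1 < n) ? &pool[i + 1] : nullptr;
    }
    return pool; // pool[0] is the head; a full traversal walks forward in memory
}
```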
That's interesting. I read a blog post a couple of weeks ago by a guy at Naver (a Korean company), who implemented a Lisp based on vectors instead of lists. The title was: LispE isn't exactly your granny's Lisp. And basically the guy was stating the same facts as well.
Vectors with generational indexes. You get O(1) access, O(1) insertion, O(1) deletion and retain almost all of the cache locality advantages of a regular vector.
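For readers who haven't met the idea, here's a rough sketch of a generational-index vector; the handle layout and names are assumptions, not any particular library's API:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// A handle is an index plus the generation it was issued under, so a reused
// slot never validates a stale handle.
struct Handle { std::uint32_t index; std::uint32_t generation; };

template <typename T>
class GenVec {
    struct Slot { T value{}; std::uint32_t generation = 0; bool alive = false; };
    std::vector<Slot> slots_;
    std::vector<std::uint32_t> free_;   // indices of dead slots, ready for reuse
public:
    Handle insert(T value) {
        std::uint32_t i;
        if (!free_.empty()) { i = free_.back(); free_.pop_back(); }
        else { i = static_cast<std::uint32_t>(slots_.size()); slots_.emplace_back(); }
        slots_[i].value = std::move(value);
        slots_[i].alive = true;
        return {i, slots_[i].generation};
    }
    void erase(Handle h) {
        if (h.index < slots_.size() && slots_[h.index].alive &&
            slots_[h.index].generation == h.generation) {
            slots_[h.index].alive = false;
            ++slots_[h.index].generation;   // invalidates any outstanding handles
            free_.push_back(h.index);
        }
    }
    T* get(Handle h) {
        if (h.index < slots_.size() && slots_[h.index].alive &&
            slots_[h.index].generation == h.generation)
            return &slots_[h.index].value;
        return nullptr;
    }
};
```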
People don't seem to understand that O(1) is not FASTER than O(n), it just SCALES better; there's a breakpoint. Imagine a "hashcode as a service" rest call vs doing it in memory: it'd still scale better, but it'd take an awfully large 'n' to get there.
I see people missing this in PRs, streaming/looping an array is often faster with small arrays than something like a `HashSet`. And in practice, its really unlikely either makes a tangible difference in most use cases.
^-- yaya
Yeah, I’ve started picking and choosing what I optimize a bit more. I don’t need to optimize a dictionary that maps an enum to a value when that enum has 30 elements, and the iteration happens once a cycle outside of the heavy loops. I need to optimize the heavy loops
That's true for any O() btw. O()s are the simplification of a function to represent the shape of its curve. Depending on the full function you can get a O(n²) that's faster in low levels than a O(n), all we can say is that eventually O(n) will be quicker but knowing this breakpoint requires testing and depending on application is pointless.
yes thats why you say O is for worst-case, when n gets big. But for small n the bigger O could very well perform better. That's when you need Omega() of your runtime
Fun fact: Brute force attacks on encrypted text are O(1). But the constant is quite large, by design.
Haskell programmers: Linked lists all the way and back.
In functional programming, singly linked lists play a special role of being immutable, and being able to be prepended to without modifying the existing list. Arrays are virtually impossible to implement as anything other than an array of lists, which throws away most of the performance benefits that one would get from using an array in an ordinary imperative language.
Lisp too, the entire language is linked lists lol
@@demolazer depends on the dialect. Common Lisp hews imperative over functional and supports arrays just fine.
Like how your pfp is just "C"
1) Yes.
2) Arrays are not impossible, Haskell has efficient arrays, they are just clumsy to work with. Linear Haskell can improve this a bit I think.
3) Immutable data structures exist and they play with cache pretty well.
4) Linked lists are cool because you can have multiple heads share portions of the list.
This is an absolutely classic talk that every software developer must see. I remember writing a lot of linked list code back in the days of DOS and it worked wonderfully... but computers changed a lot since 1990.
Also depends on what you use them for, they are quite handy when writing certain allocators
Also fun to write
Everyone knows that if you want to store a list you store them as bucket on S3 and then if you want to count the size of it you use Spark.This is all nonsense that assumes users know how to download RAM when we all know now that it's in the cloud.
My hardware is weather independent and doesn't have cache .....
memory caching architectures screwed up all that wonderful 1970s/1980s intuitive simplicity
but all of our Comp-Sci data structures 101 courses still get tought as though that were the reality
the one place I find I can return to that simplicity, though, is FPGA programming where digital circuits do their thing on a clock cycle - but even there, there is signal propagation latency to contend with, but the design and simulation tools help coping with that
There are some cases where linked lists are highly efficient. One real-world example is dynamically allocating sound channels. When you need a new channel, grab the head of the free list. When freeing a channel, link it onto the head of the free list. Both constant-time operations. And you have to traverse the active list every update, so you have your prev/next pointers to unlink if a channel has ended. And because it's all management of elements within a single array, you don't need pointers, or even to store the prev link. Just a single next index.
Another is overhead RPG sprites. They need to be rendered in order of Y position, and you can do an insertion sort every frame to relink them in proper order. Normally an insertion sort is O(n2) because you have to traverse the list to find the insertion point, but because the list is still mostly sorted from last frame, you end up linking most entries to the head of the new list so it's O(n).
Store data in a dynamic array, add a unique ID to each entry and shuffle it every time you add an entry.
Every time you need to access an entry, search for the index by randomizing the current index and check if the ID matches the ID of the entry you are looking for.
The dumb quicksort
Is your name Tom, by chance? That sounds like a Tom-level of genius-ness
Bogus-IO?
@@catholic_zoomer_bro Bogosort, right?
Critsort
_Back in my day_ cache friendly code wasn't a thing; now these newfangled caches make the case for arrays stronger than ever. Having that said, solving the problem correctly before measuring performance and making the case for struct of arrays is super important or else you're just gonna have a lot of arrays that aren't doing you any good.
"Cache friendly" was always or never a thing depending on the circles you moved in. Go ask the OG's in the demo scene, image processing and others the lengths they'd go to make sure the performance critical sections of code COULD fit in the cache. The only difference between then and now is that before you had to do some work with limited resources and now you have to A LOT MORE work with (a bit more) resources. Think VGA vs 4K, it's 29x more work just for the basic pixel count, now add "new shiny stuff", and it quickly becomes "any optimization counts". The key take is obviously "performance critical", and most people don't tread those nasty paths, unless they have to.
@@ErazerPT depends on the demo scene. 8 bit scene could always expect constant time access to RAM.
What do you mean? Spatial locality and temporal locality have always been a thing; that's kinda the whole point of caches, and why engineers calculated cache hits and misses to find the sweet spot of how much cache you actually need for it to start making a difference in performance.
@@ErazerPT In the very old days, CPUs were in general much slower than memory, so cache efficiency didn't matter; in fact, most of these CPUs didn't have caches at all because memory was fast enough.
Of course there were other tricky optimizations specific to each CPU that often had to do with memory layout, for example some CPUs may be extremely picky with regards to alignment, but cache efficiency in particular is a relatively new idea in the history of computing.
@@donkeyy8331 That is incorrect.
I saw this a while ago and it blew my mind. It just shows that when you move from theory to practice, things may not work how you expect, since there's a lot of complexity such as caching, predictable usage patterns, compactness, number of pointer redirections, etc. that significantly impacts performance. It's very difficult to correctly optimize a piece of code without first measuring and understanding where the bottlenecks are, unless you already have experience in tuning.
"Premature optimization is the root of all evil" -Donald Knuth
That's not to say, optimized code is bad, but that many programmers will attempt to optimize what doesn't matter to the detriment of maintainability, while ignoring the very small section of code that matters most.
Don't optimize 90% of your code before you optimize where your code spends 90% of its time.
C++ not censored? I thought this was a family friendly channel
much sorry
@@ThePrimeTimeagen hey, some people are into uncensored hardcore C++
This is a good take. I think it's good to educate people that if they need to search for a node to do something with it, then it will be expensive.
I personally rarely use linked lists when I have to search for the node I want to do things to, but you don't always need to actually search for the node - when you don't need to search is when linked lists are (more likely to be) useful.
I would presume, it really depends on what kind of linked lists, with what kind of data inside we are talking about. The mentioned example may hold true for the primitives, especially if they can be stored and processed on the stack, but I'm not sure it would be the same when applied to, for example, a list of pointers to the attributes of objects stored on the heap, where addition/deletion is performed based on the logic that requires accessing their values.
when you don't need to search?
I think searching or accessing is one of the common things one does.
A trivial example of where a linked list shines is when implementing algorithms that require queues which can grow and shrink with ease like BFS. Again, very trivial example.
access != search. You can access the head of a linked list like a queue, no search required
But how would you know that you won't need to search in the future?
The linked lists gain a lot for some special purposes when the elements themselves have prev and next pointers and you never have to search. But even then not always - contiguous memory really makes a difference.
Exactly. And that's the only case I would use it :) I'm mainly a C# programmer and I never used the "LinkedList" class but I have implemented my own linked list structures here and there where it made sense.
I love listening to Stroustrup talk. His pacing and tone are just fantastic when he's explaining _anything_ :)
Love that the heap is referred to as "The Garbage Collection"
Linked lists aren't good for containers, because of the linear search problem mentioned here. Linked lists are good when they're built into the data type itself. That way, if you have a reference to an object, you can just call something like appendAfter() efficiently, without searching for the insertion point
Exactly. The cursors are a critical part of the linked list data structure. Linked lists without cursors are like vectors with encrypted indices. There's just no reason to do that to yourself.
I like to use a combination of both. Make chunks of max 4K elements (or whatever size is suitable for your cache), store these chunks in a linked list. Insertion and deletion with a chunk is fast since you're shifting stuff already in the cache. Split a chunk if it gets near 4K elements into 2 chunks of 2K elements, or merge 2 chunks if they fall below 2K elements. Should be as fast as arrays, with the flexibility of linked lists.
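A rough sketch of one such chunk (an "unrolled linked list" node); the int payload, capacity, and split-in-half policy here are just placeholder assumptions:

```cpp
#include <array>
#include <cstddef>
#include <memory>

constexpr std::size_t kChunkCap = 4096;   // tune to your cache size

struct Chunk {
    std::array<int, kChunkCap> data;      // elements live contiguously
    std::size_t count = 0;                // slots currently in use
    std::unique_ptr<Chunk> next;          // link to the next chunk
};

// Insert at a chunk-local position; split the chunk in half when it is full.
void insert_in_chunk(Chunk& c, std::size_t pos, int value) {
    if (c.count == kChunkCap) {                       // chunk is full: split it
        auto fresh = std::make_unique<Chunk>();
        const std::size_t half = kChunkCap / 2;
        for (std::size_t i = half; i < kChunkCap; ++i)
            fresh->data[i - half] = c.data[i];        // move upper half to the new chunk
        fresh->count = kChunkCap - half;
        c.count = half;
        fresh->next = std::move(c.next);
        c.next = std::move(fresh);
        if (pos > c.count) {                          // target now lives in the new chunk
            insert_in_chunk(*c.next, pos - c.count, value);
            return;
        }
    }
    for (std::size_t i = c.count; i > pos; --i)       // shift inside one cache-friendly block
        c.data[i] = c.data[i - 1];
    c.data[pos] = value;
    ++c.count;
}
```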
I think that's how std::deque is implemented
en.wikipedia.org/wiki/Unrolled_linked_list
You're a genius you idiot!
@@kebien6020 I think it's also close to std::colony, which uses a skip list in contrast.
And if you build a level of the same structure on top, with each element pointing to one of those chunks, and another level above that, until you have a single chunk parenting all chunks, that's a B-tree. Literally THE fundamental index structure in an RDBMS.
This was great. I remember watching Bjarne's presentation a few years ago and was blown away by how simple the problem was after he explained it but yet so damning for Linked-Lists. My favourite language is C++ and I miss it dearly (been stuck doing Kotlin for several years now).
A skip list is a clever and interesting method of speeding up lookups in sorted linked lists. It's possible, though, that cache misses still give it a disadvantage to vectors.
Most devs have dead brains. Skip lists are too difficult for them.
One additional thing: an impact of linked lists and structures with lots of pointers is heap fragmentation and compaction. At some point, your system may need to compact the memory to get better performance. A lot of programmers don't think about heap fragmentation and how much it affects performance.
I'm glad that our professor emphasized the linear search bottleneck in the DS course. I still use linked list a lot, but the cases are almost always about some other abstract data structures like a stack or a hashtable (for each slot, not the whole mapping, of course). If something by design only uses head/tail removal or requires linear search anyway, linked list is the way to go. When sorting comes into the play, the first things that go through my mind would be an array, a heap or a RB tree. You'd better retake the whole data structure and algorithm curriculum if you come up with a linked list first.
If you're doing only head or only tail removal/addition I'd go to a vector or a vector in reversed order.
If you're doing both, I'd go to an array deque which is essentially just a vector treated as if wrapped in a circle. I try to avoid linked lists whenever possible.
If you're not order dependent you can always do a swap remove on the vector, get O(1) there too.
i.e. Swap with the last element of the vector and shrink the vector by 1.
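That swap-remove trick is small enough to show inline; a minimal sketch, assuming element order doesn't matter:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// O(1) removal when ordering is irrelevant: overwrite the removed slot with
// the last element, then shrink by one.
template <typename T>
void swap_remove(std::vector<T>& v, std::size_t i) {
    if (i + 1 != v.size())
        v[i] = std::move(v.back());   // clobber the removed slot with the last element
    v.pop_back();                     // ordering is not preserved
}
```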
I'd say deque is an optimized linked list. I'm not entirely sure what a vector wrapped in a circle would mean from a memory point of view. From a usage point of view it makes more sense, but usage of deque can be thought of as just being a linked list also
@@shiinondogewalker2809 an array deque is a specific implementation of a deque that is the same idea as a ring buffer. The head and tail 'pointers' are actually indexes into the array. If you were to remove several elements from the head, it would increment the head index by however many, leaving empty slots at the start of the array. Then if you keep adding elements to the end, the tail will eventually increment past the end of the array, so what you do is wrap back around to index 0 where that empty space is. If you remove elements from the tail, it will decrement below index 0 so then you wrap around the other way back to the end of the array (i.e. N-1). Adding and removing elements from the array deque's head can wrap the same way.
Sometimes it's easier to visualize what's happening by bending the array into a circle so that elements 0 and N-1 touch.
It's a wonderfully clever way to keep the lower overhead and contiguous layout of an array without having to shuffle the existing contents about if all you're doing is working with the head and tail.
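A toy fixed-capacity version of that array deque, just to make the index wrapping concrete (no growth logic, names invented):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

template <typename T>
class ArrayDeque {
    std::vector<T> buf_;
    std::size_t head_ = 0;   // index of the first element
    std::size_t size_ = 0;
public:
    explicit ArrayDeque(std::size_t capacity) : buf_(capacity) {}
    bool push_back(T v) {
        if (size_ == buf_.size()) return false;             // full: this sketch never grows
        buf_[(head_ + size_) % buf_.size()] = std::move(v);
        ++size_;
        return true;
    }
    bool push_front(T v) {
        if (size_ == buf_.size()) return false;
        head_ = (head_ + buf_.size() - 1) % buf_.size();     // wrap backwards past index 0
        buf_[head_] = std::move(v);
        ++size_;
        return true;
    }
    bool pop_front(T& out) {
        if (size_ == 0) return false;
        out = std::move(buf_[head_]);
        head_ = (head_ + 1) % buf_.size();                   // wrap forwards past the end
        --size_;
        return true;
    }
    bool pop_back(T& out) {
        if (size_ == 0) return false;
        out = std::move(buf_[(head_ + size_ - 1) % buf_.size()]);
        --size_;
        return true;
    }
};
```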
a doubly linked list will be much faster if you're doing lots of head-insertions or concatenation. Honestly speaking, I can't remember the last time I've removed random elements from my data structure.
It’s not even just cache misses and prefetching like he mentions, but it’s also a linear chain of data dependencies. When traversing a linked list, the out of order CPU can’t do anything for you because it doesn’t know where the next node is until it gets the result of the memory load of the next pointer for the current node. So the CPU is just stuck waiting around for stuff to do while memory loads.
Basically, linked lists are the worst case data structure for allowing your CPU to do what it’s good at.
Hybrid data structures (lists with fat nodes that contain compact arrays) can give many benefits of both, though.
If the items aren’t sorted, you can just swap the to-be-deleted item with the last item in the vector and then pop that. Super fast. No linked list will beat that :)
Thinking about this.. Useful if you don't have external references to items in the list, but probably difficult if you do. I use linked lists in my game engine in some cases for this reason, deletions only occur via external pointers, no searching required, and because they're only iterated once a frame at most, (and because it runs on an uncached cpu).
This is useful specifically for unordered data structures, which in my experience... yeah haven't used unordered lists like ever lol
But you have just created the exact problem that made linked lists slow. Every single time you want to find something in your vector you need to traverse through the entire vector first.
It is all about the use case. There is no best data structure to store multiple elements. If you only need to store multiple elements and don't care about order or finding specific elements, say a game where you have a list of objects to update each frame (note, not render order), then an array where you delete by replacing with the last element makes an amazing data structure. @@christopherpepin6059
@@christopherpepin6059
No? He did not. He's assuming you already know the position of the item, the index, and if you know it, the time complexity is constant. With linked lists even indexed access needs to traverse the list to get to the correct element, and that is not even what makes them slower than vectors. What makes them slower is memory spreading and being prone to cache misses; in this sense a normal vector would perform better even if you need to iterate it entirely first.
Just gonna say my thoughts as I watch:
-Okay the thing about the linear search dominating makes sense at first blush but that's just bad usage of a linked list. Generally you will couple it with a hashtable from some key to the node. Any simple LRU cache would work like this.
-The nodes don't have to be malloc'd randomly. You could malloc sizeof(Node)*1024 and then dole them out.
-Linear search is especially bad for linked lists because with an array it can prefetch elements that it knows you are going to touch. Not so with a linked list because it's all through pointers so who knows what you're going to need next. Again, use a hashtable if you need lookups into your linked list. Linked lists are good at insertion and removal, not at random access. Nothing new there.
-"Vectors are more compact than [linked] lists." No, see above about mallocing a pool of nodes.
-"Always use a vector unless you can prove a map is faster." For a trivial case of searching for a given integer in a vector of integers, a map becomes faster after about 100 elements (from my testing a while ago. Also depends on the implementation of those data structures. I did my test in C# (List vs Dictionary so take that with a grain of salt).
Linked lists are very powerful when you start using them how they are meant to be used. They are not merely "a worse vector." They afford different things.
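As an illustration of that "hashtable into the list" pairing, here's a rough LRU-cache sketch using the standard containers; the key/value types are arbitrary choices:

```cpp
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// The map gives O(1) lookup, the list gives O(1) move-to-front and eviction,
// so no linear search ever happens.
class LruCache {
    std::size_t capacity_;
    std::list<std::pair<int, std::string>> items_;   // most recently used at the front
    std::unordered_map<int, std::list<std::pair<int, std::string>>::iterator> index_;
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}
    void put(int key, std::string value) {
        auto it = index_.find(key);
        if (it != index_.end()) items_.erase(it->second);   // drop the old entry
        items_.emplace_front(key, std::move(value));
        index_[key] = items_.begin();
        if (items_.size() > capacity_) {                    // evict the least recently used
            index_.erase(items_.back().first);
            items_.pop_back();
        }
    }
    std::optional<std::string> get(int key) {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        items_.splice(items_.begin(), items_, it->second);  // O(1) move-to-front
        return it->second->second;
    }
};
```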
I think there is a small implied assumption when stating that insertion/removal is faster in a list, namely that you already have a pointer to the desired location and linear search is not an issue. Just like with an array, you don't usually search the element to be removed, you just index into the array by keeping track of the "interesting" indices, with a smart use of a linked list you usually keep track of pointers to the "interesting" nodes. In this circumstance, the list is better.
A good example of when a linked list beats an array is if you get the data served randomly and you want to sort it, and you also have a HEAD that keeps track of the current Node's position.
Then if you want to sort the random data in ascending order all you do is:
Move the head forwards if the current node is smaller, move the head backwards if the current node is bigger. Insert when the value before is smaller and the value after is bigger (which is O(1)).
Now at worst you're doing a linear scan, but usually you're only moving a few nodes at a time depending on distribution of numbers.
If you did an array however, even if you did this HEAD approach, you would need to first find the number, which is a linear scan in the worst case and on average only a few array cells. But even if that type of perf is similar, once you get to actually inserting you are now having to shuffle n cells to accommodate the new element.
Exactly.
Very good to pick up this topic. Honestly, data being scattered randomly all over the place makes these JS/TS languages even slower, on top of the slowdown the GC already adds...
Btw I have a data structure that uses a locally linked list with all kinds of trickery to make it still contiguous in memory. However you skimmed over what he says about having multiple bytes wasted for the forward and back pointers... In my case the data locality is not so shit - I totally try to keep linked data close - yet I had to fight with the pointer overhead too! For example, instead of using pointers I started using 32 bit indices where there is a jump, just because it's smaller. Imagine this data structure templatized with an x,y,z float triplet (like Vector3), which is small enough that 32 vs 64 bits wasted on the pointer counts HUGE - not just in memory use, but in cache use.
Also, people seem to think "not used memory is wasted memory"; however, memory in use also means cache collisions more often... Winblows people never understand why it's better to have minimalism than the OS trying to keep everything in memory for "what if it is needed the next moment". Page table handling becomes harder, addresses for cache lines that mix with each other grow, all kinds of subtle issues.
TL;DR: pOOP style really hurts performance...
You can maybe use relative pointers of some kind to reduce the overhead even further.
@@marcossidoruk8033 That is good advice; I was toying with the idea after seeing that Jon Blow's language supposedly supports it at the language level. However, last night I actually did a nastier thing: I literally found a way to overwrite parts of the "data" as the "key" and mark the topmost bit at that position. This means my data structure now needs a preprocessing step where I separate elements based on that key into two of these local lists. It only works because I provide accessors that "undo" the operation on the fly. It also has a lot of pros and cons - we'll see how it fares in real use cases and not just local tests, but microbenchmark-wise it looks like a plausible "hack".
+
there's a good video by computerphile on this topic, called "arrays vs linked lists", in which they compare this on computers with and without cache
The main reason for avoiding linked lists and preferring array-based structures is that the machine memory since dr. von Neumann and dr. Turing is an array, not a linked list.
@10:48 "If you don't understand what's happening here, this is really really good!" SUPER HELPFUL
TLDR and for grug: Make linked list inside array. Advantage of both linked list and array, less disadvantage.
The magic of data structures is that you can combine them. If you allocate your linked list nodes with an arena allocator (pool allocator, bump allocator), then the point Stroustrup makes doesn't make sense. Linked lists aren't defined as having to use malloc (even if they do most of the time). Arrays (vectors) vs linked lists doesn't make sense in all algorithms, as some necessitate one or the other. If you do what I suggest (arena allocator) there is cache locality and no memory fragmentation; the only drawback that remains is the memory overhead of the linked list structure.
so an array cell that stores a list where at the end of the list it points to the next index of the array? Is that correct
@@ea_naseer more like store the list inside a growable array (vector in c++) so when you need to allocate a new list you grow the array and store it in the next cell.
I’ve done exactly this. It works super well. You can get access to any point in the list, and then traverse directly. It’s awesome.
cache locality is worse when iterating through it still
New to data structures what's an arena allocator? Do you mean having two arrays, one for the data and one for the pointers?
If the size of the thing you are putting in each linked list node is comparable to or larger than 8 kB, linked lists start performing better again (though trees will do even better since for large nodes extra pointers per node are cheap). But the default page size in virtual memory is 8 kB, so fetching the next element during a linear scan will be a cache miss anyway even if you use a vector
That 50-100x ratio is basically the ratio of the node size to the memory page size. Also, Rust's BTreeMap is _incredibly_ underrated if you plan on iterating through your keys or looking up closely related objects often
If your elements are large, then you use a vector of structs, each struct containing the key value and a pointer to the bulk data.
There is no guarantee an array is consecutive in physical memory or on disk - only in virtual address space. If you look into how malloc works you will discover this. You will also encounter how allocators store long lists of things to manage the constantly moving, allocating, deallocating and even resizing of the allocated page list.
I don't like "Avoid " videos, every DS has its use, for something like unordered data or streams LL are the only option.
Almost always vector is the rule. Unless it's C#, because C# List is equivalent to C++'s vector.
The only reasons to ever use a list for storage are if it's intrusive (like intrusively linking slabs inside an allocator) or if you have a collection that's constantly changing and your objects have a highly complex move.
I think what he is missing is vertical scaling. Make the stored type 10 or 50 bytes and the overhead of the pointers will go away and the copy issue of the array will get a lot worse. Would like to see the performance graphs with larger types.
wouldn't you just store a pointer for larger types? would kill the array's cache locality but apart from that it's probably best
vector
The list would probably still be worse because each node would get copied into cache with every cache miss, which is every access. The vector would get the benefits of prefetching whereas the list would stall on every indirection.
50 bytes is nowhere near big enough, it has to be at least a kilobyte. Linked lists suck because of cache misses and data localisation. Because of this the stored data in each element needs to be comparable to the page size.
At that point you are probably storing objects when a vector of pointers to an object array/vector is faster.
I remember seeing this video around 10 years ago when I was in university, and having my mind blown.
Very satisfying seeing the same look on Prime's face.
You could also put a linked list into an array, where you store the next and the previous index in each element. And the removed elements, which are still inside the array also form a linked list in the same array.
No custom allocation, pretty cache friendly, and almost all operations are O(1) until you resize. And sorting doesn't need to copy the elements itself, but only the indices.
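Roughly what that looks like in code; a small sketch with an int payload and made-up names, using indices instead of pointers and threading a free list through the same array:

```cpp
#include <cstdint>
#include <vector>

struct SlotList {
    static constexpr std::int32_t npos = -1;
    struct Slot { int value; std::int32_t prev, next; };
    std::vector<Slot> slots;
    std::int32_t head = npos, free_head = npos;

    // Insert at the front of the live list, reusing a freed slot if there is one.
    std::int32_t push_front(int value) {
        std::int32_t i;
        if (free_head != npos) { i = free_head; free_head = slots[i].next; }
        else { i = static_cast<std::int32_t>(slots.size()); slots.push_back({}); }
        slots[i] = {value, npos, head};
        if (head != npos) slots[head].prev = i;
        head = i;
        return i;
    }
    // Unlink a slot in O(1) and push it onto the free list for reuse.
    void erase(std::int32_t i) {
        if (slots[i].prev != npos) slots[slots[i].prev].next = slots[i].next;
        else head = slots[i].next;
        if (slots[i].next != npos) slots[slots[i].next].prev = slots[i].prev;
        slots[i] = {0, npos, free_head};
        free_head = i;
    }
};
```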
Doesn't support insertion between elements (or it does, but then you need to waste "tombstone" elements between each real element, and you can only do a fixed number of insertions at a position).
@@nonamehere9658 It does support insertions between elements in O(1). It only has empty elements where elements have been removed.
And I can only do one insertion per index, or multiple insertions before and after indices.
Indices are not related to the position in the list, though.
I suspect this is still slower than vectors. You still have to store two pointers per element which makes the dataset larger. Binary search is impossible and linear search will randomly jump through the array if it's unordered. And if the values (without pointers/indices) are only integers then sorting is guaranteed to be much slower compared to a vector. And this isn't just about cache efficiency, if you do a linear search through memory without indirection the compiler and CPU can do further optimizations.
Just saying that I'm watching his course on Frontend Masters and it's phenomenal! Such awesome work, Prime! I'm loving it!
If you do head or tail removal, you would use a deque or another FIFO-like structure made for that, to still maximize cache hits, not a linked list. Pick the structure that minimizes the services you need of it :) I would also say, off the record, that it is bold to go and comment on a Bjarne Stroustrup presentation!
Linked lists - did a TON of them in University, never have I ever ONCE used one in my professional career.
4:07, 6:48, that's bull, just keep existing allocations even if they're removed from the main list, like this:
buff = all
head = ...
kept = NULL
When a link is removed, just move it to the "kept" linked list instead. Whenever you need a new link, check whether "kept" is NULL; if not, take from there, otherwise make an allocation. The links are instead offsets from the current link, determined from a simple link->next = a - b, which in turn is used for link = link + link->next; the "head" and "kept" will always be offsets from buff. This method gives the speed of buffer allocations plus the speed of linked list management.
Thank you for the free algo course Mr. Prime! I greatly appreciate it! I really enjoy your teaching style :-))
In specific use cases, you can make using a vector even more efficient. For example, let's say you are first populating your vector with values that occur in some set of data, and then after that you are deleting from that vector any of those same values that occur in a second set of data. So after you populate the vector, then you sort it, and then when you go into the deletion procedure you do NOT move any items in the vector at all, but merely NULL those entries. No moving necessary. And then, if you want, you can create a second vector, traverse the first vector and populate the second vector with non-null values, and then deallocate the first vector, and just use the second vector.
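A bare-bones sketch of that null-then-compact pass, assuming the vector holds pointers to the values; the string type and the linear find are placeholders (the commenter sorts and binary-searches instead):

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Deletion just nulls a slot; one pass at the end builds a dense second vector.
void delete_then_compact(std::vector<const std::string*>& entries,
                         const std::vector<std::string>& second_set) {
    for (const std::string& doomed : second_set) {
        auto it = std::find_if(entries.begin(), entries.end(),
                               [&](const std::string* p) { return p && *p == doomed; });
        if (it != entries.end()) *it = nullptr;   // "delete" without moving anything
    }
    std::vector<const std::string*> survivors;
    survivors.reserve(entries.size());
    std::copy_if(entries.begin(), entries.end(), std::back_inserter(survivors),
                 [](const std::string* p) { return p != nullptr; });
    entries.swap(survivors);   // keep only the non-null entries
}
```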
BRAVO 👏 EXCELLENT use of raw C pointers. You DID mean to say you are populating the vector with the addresses of the values contained in the first data set correct? You said "values" but I didn't want to be a d*** and assume you were trying to assign NULL to stack values.
The sort is probably better saved until after the pointers/values are copied into vector 2 though- as vector 2 should be smaller.
@@malusmundus-9605 You're correct that the vector contains addresses to the values. When you sort is a use case. Typically I'm using a binary search on the vector, then checking for matching values, which if found means it's a duplicate - and since I'm looking for unique values, I null the dupes. A different usage may not require the sort up front.
(My context here is string values.)
I save processing time because nothing is moved until the end when I copy the "values" (addresses) into the second vector, skipping the nulls.
@@malusmundus-9605 You're know, thinking back over the years, I think I did actually arrive at that kind of method when coding in C. But then I've used it in C++, Perl, Java, and C#.
@@steveg1961 yes I do this sort of thing when I code in C and C++. I don't why but ever since I started coding in C and using pointers I think of copy/move operations as being "fat" and I tend to avoid them whenever possible... even though in reality the difference is negligible on most modern machines.
Hey how do you like Perl though? I know it's kind of an old language but I was thinking of picking it up for fun. Seems like it would be nice to have in my repertoire for parsing strings. (I like Python for this but it's really slow)
@@malusmundus-9605 Perl is great for string parsing/matching. (The regular expression system is a dream.) But the one thing about Perl that irritates me to this day is how you pass parameters to functions - klunkiest method I've ever experienced. Other than that, Perl feels concise and elegant to me. Note that I haven't used any version beyond Perl 5.
Wow I saw Mr Stroustrup during lunch at a conference years ago, he looked to be in his 40s. He was talking about his Tetris scores. Time sure flies 😮
I like these types of videos. Like pair learning.
Didn't realize linked list was so slow compared to arrays, but it makes sense.
Where LLs are used, like in the Linux kernel, one does not do list traversal. Things have a pointer right to the node they care about and add or remove it (or other nodes) right there.
I've seen the weirdest linked list interview challenges... putting numbers into a linked list digit by digit in reverse order and then "adding" the linked lists and turning the result back into a number. Never in my career did I encounter a use case for linked lists. Now I realize I've been doing math wrong the whole time...
post your comment in Chatgpt and then ask: give me a solution to one of these questions
It's longer than I was hoping lol
As a general rule of thumb, if you have to look for the element in your list, linked lists are usually beaten by some kind of array construct, because the search is the same asymptotic cost as the more expensive operations on array constructs, so it comes down to the exact constants involved, and linked lists can easily end up with a lot of overhead. Now, one can try to come up with specific cases where linked lists should be better, but in practice I do not recall an instance where a well set up array construct could not handle it more efficiently. In the end, the main use of linked lists is as a basic concept of nodes and graphs, which can then be used to construct more advanced data structures, like trees, that do carry some real advantages. Also, it can often be useful to first think of a problem in terms of a node data structure, and then figure out a way to effectively turn it into an array construct of some kind; for instance, you can avoid a lot of cache misses by simply implementing your linked list with indexes into a larger array instead of in general memory.
In reality, most structures like this are not used for performance, but because they can be more readable, maintainable, and easier to work with, especially once you build several layers of such things together. In most cases when I use a hashmap of some kind, it is not because I really need the hashmap's performance (hint: hashing can be expensive, and hashing cheats a bit to appear to have constant-time operations by starting with large enough constants that you won't reasonably escape into the part where it stops being constant), but because I want something to be stored based on just some hashable value, usually either because it is convenient for the problem or because it makes the code more readable or quicker to write. In practice, you often get a fairly performant result by structuring the inner part of the logic as array constructs of some kind, and then building the outer structure through nodes, maps and other convenient data structures. This works because a lot of the advantages of array constructs fall off once the contents of said array are pointers, and directly building larger problems with multiple levels combined into efficient array structures is a nightmare for maintainability.
There appears to be two types of linked lists taught: one where a node is a separate object, and one where the node is a composite member. The latter I have only seen done with kernel programming in C, because constructs like CONTAINING_RECORD cannot be done in high-level languages such as Javascript. For example, if you have a process object (struct task_struct on linux or KPROCESS on windows), removing that process from the list does not require iterating all processes because you can directly access the list_entry member. Whereas for generic containers, the node is external and must be searched.
Funny thing is, the first linked lists I saw was when studying source code for MUDs, and they use a lot of those "intrinsic" linked lists. I thought that was how all linked lists were done.
I have never used the linked lists with separate nodes in my career, and have a few use cases for the intrinsic linked lists, but mostly for gamedev.
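A tiny sketch of that "composite member" (intrusive) style in C++; the Process type and field names are made up, not from any kernel:

```cpp
#include <string>

// The links live inside the object itself, so unlinking a known object never
// requires searching the list.
struct Process {                 // hypothetical payload type
    std::string name;
    Process* prev = nullptr;     // intrusive links
    Process* next = nullptr;
};

struct ProcessList {
    Process* head = nullptr;
    void push_front(Process* p) {
        p->prev = nullptr;
        p->next = head;
        if (head) head->prev = p;
        head = p;
    }
    // O(1): the caller already holds the object, no traversal needed.
    void unlink(Process* p) {
        if (p->prev) p->prev->next = p->next; else head = p->next;
        if (p->next) p->next->prev = p->prev;
        p->prev = p->next = nullptr;
    }
};
```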
I've done some work of trying to optimize algorithms, and this is my opinion at the moment:
- The tighter you can pack your problem in memory the faster your algorithm will run.
- The CPU is fast, memory is slow
- It may be best to think of memory as linear-access rather than random-access. The number of passes through memory your algorithm needs determines a lot about performance.
- there isn't much that can substitute for just testing different implementations of your problem. (Though the specifics of your system may influence the results a lot)
That was kinda the same argument for Unity engine to introduce ECS, they explained how OO code was slower, and using arrays was faster because objects were contiguous in memory, benefiting from CPU caches
yep
the difference of an array of structs or a struct of arrays
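For anyone who hasn't seen the two layouts side by side, a small sketch; the particle fields are arbitrary examples:

```cpp
#include <cstddef>
#include <vector>

// Array of structs: every element drags all its fields through the cache.
struct ParticleAoS { float x, y, z; float mass; int id; };

// Struct of arrays: each field is its own contiguous array.
struct ParticlesSoA {
    std::vector<float> x, y, z;
    std::vector<float> mass;
    std::vector<int> id;
};

void advance_aos(std::vector<ParticleAoS>& ps, float dx) {
    for (auto& p : ps) p.x += dx;           // pulls mass/id into cache as well
}

void advance_soa(ParticlesSoA& ps, float dx) {
    for (std::size_t i = 0; i < ps.x.size(); ++i)
        ps.x[i] += dx;                       // streams only the bytes the loop needs
}
```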
We were taught the same thing about using lists, but the condition on these algorithms was always that there was no space limit, and that allocation and other things done in the background were not considered/ignored, to fully focus on the efficiency of the actual algorithm you'd write
I’ve seen a node server handle 7000 concurrent connections. Need better engineers :)
Making your memory layout more cache friendly is the whole point of Data Oriented Programming; I feel like more people should know this exists.
Is there a book or other learning resource you would recommend on Data Oriented Programming?
Linked list vs vector is a good example of why you can't just look at Big-O.
Lispchads I don't feel so good!
Maybe I missed it, but when doing the insertion on the linked list you have the pointer to the last thing you inserted. If the next random number to insert is bigger than the last, you don't start at the beginning of the linked list, you start at the last item you inserted. I haven't thought this out, but it seems quicker.
To me linked lists are pretty cool as a concept.
Since they are just nodes floating about, with easily rearrangeable connections, you can avoid linear thinking that comes with arrays.
One cool example is a structure with O(√n) random access time, and since it's a linked list the deletion/insertion has the same scaling time complexity.
These criticisms of linked lists are usually based on certain assumptions of bad surrounding programming practices which make linked lists perform badly. Insert IQ bell curve meme.
The first assumption is that nodes will be individually dynamically allocated with e.g. malloc/free. In fact, allocating linked lists as a static array or on something like an arena allocator - potentially all at once, instead of individually per element - can eliminate not only tons of allocation overhead, but cache misses as well. If list elements are linearly allocated, cache coherence is very high, and you only have to worry about fragmentation as elements are shuffled around, contingent on your usage.
The next big assumption is that pointer traversal is inherently slow. Not so. As above, cache misses are the main cause of slow pointer traversal on modern hardware. Following a pointer to a location in cache is very fast.
There are a couple other related assumptions I'll point out - namely, the assumption linked lists are being used as generic containers, as non-intrusive linked lists, and that the element size is small. Generic containers are indeed not always an ideal use case for linked lists. Intrusive linked lists are also often more desirable. And the overhead is amortized somewhat when you are using larger elements, such as game entities, versus having a generic linked list of small elements like integers.
If you keep all of this in mind and use linked lists tailored to your specific use case, with allocation patterns that aren't stupid, you will get dramatically better performance out of them. Even fragmentation, which slows them down significantly due to cache decoherence, usually indicates usage patterns that vectors/arrays are very poor at (lots of insertion and removal from the middle).
Here is a good writeup by Ryan Fleury of the case for linked lists:
open.substack.com/pub/ryanfleury/p/in-defense-of-linked-lists
I've read this article and it's a whole lot of nothing… Basically, the author says all standard implementations of linked lists are bad, lists a lot of troubles with linked lists, and says you can solve them (at least some of them, like cache misses) by making them less of a linked list by clumping multiple data items into a single node. Another warning flag for linked lists not being all that cool is that he advises handcrafting an implementation of each linked list based on the circumstances, using a bunch of tricky optimizations, all without any confirmation in the form of benchmarks that it's actually faster in a real world scenario. Apparently, rather than citing sources or producing working examples, it's easier to criticize other people for not being scientifically rigorous, even though all they do is express very reasonable skepticism, e.g. to "there is also no guarantee that nodes have any locality in memory" the author responds with "this is not a well-formed nor meaningful criticism of linked lists, because it is assuming concrete information (namely, how linked list nodes are allocated)" - NO, IT IS NOT ASSUMING! It doesn't say there is no locality, it questions its reliability. That example is probably the biggest offense to logical reasoning I found in this article, and I wonder how much more absurdity is hidden in the convoluted reasoning that I didn't care to completely unwrap - nor did the author care to thoroughly explain, even though he aims this to be a defense of beginners, who are allegedly misled by linked-list critics.
All in all, a whole lot of mental masturbation all just to unconsciously DISPROVE one's own thesis. Sorry for the hateful attitude, but with the little actual merit of the article and unjustifiably confident tone, it could as well be some GPT generated text.
There are two more aspects that didn't get mentioned:
(1) In order to end up with a neat compact array/vector you either need to know the total amount of elements, or you'll end up having to increase your capacity if you run out of free slots. And at the very end you either have to compact your array or live with having unused slots.
(2) Arrays are easy if the only thing you have to worry about are elements of the exact same type - a.k.a elements that have the exact same size. That however is usually not the case with random objects. That's the same issue relational databases have with text-based columns. You either restrict the application to strings of fixed size and by that waste unused space, or you use pointers to text content of arbitrary size.
["If list elements are linearly allocated, cache coherence is very high"]
You do perceive the stupidity of this, right? If the point is to keep the memory contiguous, then linked lists render themselves useless. You use a linked list for the constant time insertions/removals, so you will still end up with the same problems you have with normal heap-based linked lists.
A lot depends on the language, compiler version, operating system, architecture and usage patterns. Always benchmark and look at your metrics before jumping to conclusions. At first prefer the simplest, most readable and most idiomatic version of the code. Optimize when absolutely necessary.
yeah head insertion/deletion, or things that can be effectively reduced to head insertion/deletion. if the problem you're solving is amenable to having a "cursor" (or multiple "cursors") linked list would still be better because it avoids the search you'd need for true random access.
has the side benefit of letting you do thread safe access with fewer locks too, although i think in most practical cases you don't actually need every thread to be able to randomly access any element in the list at any time. if your access pattern is random you'd probably want to divide your list up among the worker threads instead, or have some kind of batch update scheme.
Cool topic and I admit that I didn't think about it like that, either... but I always arrange my data in arrays if I can. That includes the way I define objects. If "some thing" has to do the same function a lot of times, I allocate arrays for the data. The main reason why I do this is memory layout. By pre-allocating a fixed memory block and checking range all kinds of nasty dynamic bugs can be avoided because the system always knows how much memory it has left.
Every language construct is a tool. Use the appropriate tool for the job. If it's a short list and you don't expect it to grow, you can get away with an array. Push comes to shove, you can always realloc if you need to grow the array, and growing the allocation geometrically can help. The advantage of a linked list or deque is that insertion and deletion are fast and you don't need to worry about indexing (if you shuffle an array, any indices you had previously are now invalid). Sorting can be more efficient in a linked list than in an array (and I would say if you need fast sorting, just go with a self balancing tree structure). Disadvantages are that it takes more memory, and indexed access is O(n) vs O(1). If you have less than a hundred items though, O(n) and O(1) are pretty close in terms of magnitude. Memory ends up more fragmented, which can be an advantage on smaller systems with less contiguous memory available. At the end of the day they are just tools, and just like you don't use a hammer to turn a screw (you could always brute force it, but that never turns out good), you should use the appropriate tool for the job.
I call a linked list, a leaked list.
One mistake and you leaked memory
to be fair, if you use linked lists in the real world you probably didn't code them yourself
This is why it's good to implement singly linked lists with unique pointers and doubly linked lists with shared pointers and weak pointers
@@embedded_software And then you blow the stack on the destructor
I was hired as a 'systems administrator' and I was looking over the shoulder of the guy who was working on a custom MySQL table engine that was key to what the company did. I pointed out an indirection in a data structure the guy was using and told him that it would be a lot faster if he removed it. He did it, and I was right, and he was a little surprised and rather pleased. 🙂
I also dissected the file structure of a proprietary b-tree and wrote code in Python to traverse it and highlighted a way in which they were doing the comparisons wrong in their own code to manipulate this structure that would result in an inconsistent b-tree..
They fired me after I worked there for three months, and after they had begged me to switch from being a contractor to being an employee... it was all pretty obnoxious.
The one place I see linked lists pretty commonly in the Real World is in allocators, for example if you're managing your own free list. Since you can't control what malloc actually *gives* you as far as a memory address, it can be helpful (and way faster) to keep track of the pointers that you *are* given and their size, and serve those up first when a new allocation is requested, before asking for more memory.
This is mostly a video game code thing, and this saves cycles, so while it may seem like we're splitting hairs over fractions of a second....we are. We try to avoid *actual* allocation and deallocation at all costs.
It is a really good discussion, we tend to overlook how much more performant contiguous memory can be. However, if I did not miss it, shifting of the array indices is not mentioned here. I believe it is convenient to point out that for operations that add/remove an element towards the start, list can be more performant as there will be no index shifting. If same operation is done for an array, it takes O(n) to modify the indices of elements in the case where the first element is removed. It is not easy to tell at which point array starts to compensate for shifting operations by providing more performance via contiguous memory. However, for tasks where we are more interested in first or last elements of a list, it would be wiser to work with stacks or queues. Considering this, if we are generally comparing list and arrays for manipulating elements of mostly mid-indices, array seems to be the way. Even if it performs worse in some cases, it is more measurable and predictable as stated in the video.
Except even in cases you mentioned, it's literally faster to shift thousands of elements than it is to allocate yet another memory block and follow thousands of pointers to where it's supposed to be inserted.
"Can" be is irrelevant, because it's practically never. C++ has std::deque which is most likely linked list of bigger memory blocks than 1 element, and even that is often outperformed by std::vector even if inserting into the beginning of the vector.
If you measure everything except std::list and it's still not fast enough, only then you check std::list, except it's more often issue with what you're doing and not choice of data structure.
@@shinobuoshino5066 I almost totally agree with this and can see following pointers would be much slower compared to arrays. I was just curious about when we generally insert/delete items at the start or end, where we are concerned with only 1 pointer. If you are neglecting this assumption, "can" be is indeed irrelevant, but only if you neglect it.
@@cankuter inserting to vector at the beginning is still faster on average, you know that you can just time this in your own projects right?
@@shinobuoshino5066 You dont need that smartass tone, especially when I already told you that I generally agree with you. I am aware this can be naively tested with some timers in the code as a quick setup, provided that there are sufficient number of elements to make timer execution negligible. I will probably do it and see myself.
I had a similar argument about 20 years ago. My "you need to find the thing before you do the thing, so it'll be trash" ran right into "it's computer science!" I quickly gave up.
If you'd have shown me a picture of him and asked why I thought he was famous, I would have said he was that German cannibal.
I would try making a parallel index of pointers, in an array, kept ordered. The objects themselves would be stored in a linked list because of faster allocations. But for searching, an ordered array of pointers with no rearranging of the objects would be good, depending on the size of the elements.
This is so cool. I'm self teaching right now and learned about the different CPU components yesterday. Abusing the cache with a simple array over a more complex structure like a tree or list with nodes sounds super interesting. I'll start to code up a hybrid in C with different node sizes where each node is an array that fits in cache. Gotta do some more research tho on how the different cache levels interact when loading a node like that on my Intel CPU. On an AMD, something like this could also have insane multi core performance when the Infinity Fabric is abused, with each core dealing with a different array (node), tho I did not study AMD CPUs yet and don't have one
I just took a class at NYU (DSA) and they went into all of this and wayyy deeper, but this was a nice refresher lol
In Haskell, linked lists are used more than anything else, and it's common for a list to be produced by one function while it's being consumed by another, so the whole list is never in memory. For updating/inserting/deleting elements at random positions in a container, I'd use a sequence, since if you use an array, you have to copy the entire array every time you update one element.
Linked lists are taught as a practical alternative to arrays to fledgling programmers, but a raw linked list is actually only useful as a specialized data structure.
Sometimes, it's okay to increase indirection if it makes the overall structure of your code more manageable, testable, and maintainable. If you're making a game or working in high-frequency trading, go ahead and be particular with how you allocate memory and pass data around, but if you're working with enterprise microservices handling thousands of API calls per second, a few cache misses is negligible compared to network latency, load balancing, and visibility of issues. Software is full of tradeoffs, and part of being an engineer is knowing what things to value in each situation.
This is all very valid. C/C++/Rust etc. people think about this stuff all day long, and those languages are designed to give you this level of control (or the ability to shoot yourself in the foot).
However, if you are trying to manipulate your interpreted / jit compiled dynamically typed language into using efficient memory layouts...
just don't...
use a different language for the bit that needs this level of performance.
No they don't. Cniles implement linked lists because they're too braindead to write a proper malloc + realloc sequence correctly. C++ programmers just use std::vector and never care again because std::vector practically never shows up in profiler. Rust has no programmers.
Linked list is better when we need to insert or remove elements in the middle of lists and we also already have a reference to element next to it, which is kinda very specific.
Removing the head element, like chat said, is not really a strong reason (though still a good one) to use a linked list, since a vector used as a ring buffer (or something like a deque, which is usually implemented on top of arrays) is still better.
The real use case is the modified linked list usually used by functional languages to implement "immutable" lists with a smaller memory footprint (less copying), which is understandable.
what you are describing is effectively LRU implementations (map -> node, node cut out and put on front / back depending on the impl)
@@ThePrimeTimeagen yeah, that sort of thing is basically the main/only way to efficiently use memory in a pure FP/immutable paradigm - you're copying everything, but because everything is immutable that's just passing a pointer in practice, and you structure your data so that when you do change things you only have to allocate the bits that have changed.
I've seen this video before and to me it seemed like a long and unfocused way of saying "use the right tool for the job and be aware of how your structures exist in memory" idk. 🤷♂️
There are only three reasons i'd use a std::list.
Stability
Extremely large elements
Expensive to copy/move elements
good take!
some more examples where linked lists shine are retained mode guis (like the dom), intermediate representations in compilers (like ssa), and language runtime stuff (memory allocators, sometimes work queues).
(incidentally, those are the things i work on primarily. so i could definitely see linked lists being useful in other areas too!)
I thought that lists were made because it was hard to have long arrays, and because if you had to move the array to expand it, that was costly
Goodness! Reminds me of CS classes a couple of decades ago. A bit of bitter nostalgia.
This also gets much faster when you’re talking about caching. Data locality is king and keeping your data compact like this enables stupid fast lookup because the memory will always be in CPU cache.
This is literally the idea behind ECS (entity component system). Essentially create a compact list for each component of an entity rather than Malloc a bunch of objects with the components attached. Because the components are compacted and destructed in this way, you get more cache hits and improved performance for your simulation
When I was learning about linked lists in Java, I built them manually. Each element (data + pointers) was its own object. While debugging I dumped the objects' addresses to the screen and was horrified to discover that the objects were scattered all over the heap. The runtime knows how big each object is, so why not pack them together, which would at least give you some cache hits? But that's not the way Java works.
I've never tested C++ to see whether the allocator scatters things around the heap like that. Perhaps I should.
Anyway, I believe Stroustrup. Doing a linear search of a linked list is far slower than doing a linear search of a vector (or an array). With the list you're hauling around pointers + data rather than just data, and if the allocator scatters the elements you're practically guaranteed a cache miss on every access. Of course, that would be the allocator's fault.
And with a C++ std::vector you can preallocate space up front, so adding new elements will be very quick (up to a certain point).
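For reference, a trivial sketch of that with reserve() (the size is arbitrary):

```cpp
#include <vector>

int main() {
    std::vector<int> v;
    v.reserve(100000);                 // one allocation up front
    for (int i = 0; i < 100000; ++i)
        v.push_back(i);                // no reallocation until capacity is exceeded
}
```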
if you have a pointer, or an iterator/IEnumerable depending on your language, it's fast to insert around it. It's only slower if you use it the same way you use vectors, WHICH THEY ARE NOT MEANT TO BE. There's a reason both vectors and linked lists exist. I sped up a lot of my programs by using linked lists. I remember a competitive task where I had to insert N numbers before the Mth element. I took the iterator to the Mth element and just inserted into the list from that iterator. Real-life applications are queues, deques and stacks. If I know up front how much I need to store, then an array/vector is best. But if I have a theoretically unlimited number of insertions, vectors quickly start to hit bottlenecks, and the memory they take grows too, since huge vectors carry a lot of spare capacity. (It is worth noting that the built-in stack, queue and deque are all backed by arrays, not linked lists, though.)
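A minimal sketch of the pattern described above (the container contents and counts are made up): pay for the walk to the Mth element once, keep the iterator, and every insertion at that point is O(1) with nothing else moving.

```cpp
#include <iterator>
#include <list>

int main() {
    std::list<int> data = {10, 20, 30, 40, 50};

    // Pay the O(M) walk once...
    auto pos = data.begin();
    std::advance(pos, 3);              // iterator to the element at index 3 (value 40)

    // ...then every insert before that element is O(1) and nothing else moves.
    for (int i = 0; i < 1000; ++i)
        data.insert(pos, i);           // inserts before *pos; pos stays valid
}
```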
If you need to keep insertion order, an LL with a hashmap mapping values to their node (or to the predecessor node, in a singly linked list) works just fine.
I know, not the most common use case ever, but, still...
This is why you really want some kind of tree of arrays, with the nodes annotated with data that lets you look up by index rapidly. Ideally each of the nodes sits in a page, including the node annotations.
I love his remark about people thinking 1000 elements makes a large list. Performance experts tell you to measure everything for a reason.
You have to do a linear search on an array too if you are looking for a specific thing within it; the only difference is the pointer dereferences and the cache misses that cause the slowdown, but it's still O(n). The bigger point is when you have other data that stores a pointer straight into the linked list: when that object is destroyed, it can unlink itself from the list in O(1), without searching and without shifting any other elements. Lists are also easier to use from multiple threads than arrays, where elements can change position mid-search (a linked list that has other elements changed within it won't break the read, because the list doesn't shift). Linked lists can also be sorted in place without ever moving anything in memory. You can build a bridge to move any active traversal points into a new ordering without much effort, which makes sorting threadable too. Objects can move between linked lists in place, so anything holding a reference to them doesn't have to update that reference, and all other data survives the move.
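A bare-bones sketch of the "unlink yourself in O(1)" point, using an intrusive doubly linked node (my own illustration): removal only touches the two neighbors and never shifts anything.

```cpp
// Intrusive doubly linked list node: the object owns its own links,
// so anything holding a pointer to it can unlink it in O(1).
struct Node {
    Node* prev = nullptr;
    Node* next = nullptr;

    void unlink() {
        if (prev) prev->next = next;   // patch the left neighbor
        if (next) next->prev = prev;   // patch the right neighbor
        prev = next = nullptr;
        // (a real list would also patch its head/tail pointers; omitted here)
    }
};
```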
So... basically... arrays are great for... quicksort stuff? I honestly don't see the point of an array unless it is literally a math vector or a place to store sync or round-robin data... linked lists are better at just about everything. I'm sure you can eke out a few ms of speed, but I prefer flexibility and expandability over perceived speed (which can be improved by using several links in an octree fashion anyway).
Just for reference, default Haskell strings are linked lists of 32-bit Char values. The first thing you do to improve performance is replace the default String type with the one from the text library, which is a byte array containing (as of v2) UTF-8. That one change alone can give a 20x speedup in string-manipulating code.
If even the weird hair-shirt pure functional language can benefit from arrays over lists... how are you confused about the utility of arrays?
@okuno54 you are comparing a static use case to a dynamic one and then calling that the justification for arrays... yes, as I stated in my comment 5 months ago, "unless it is vectors or sync/round-robin" (static), there is no need for an array. The majority of programmers reach for an array first because that's what they are taught, and then they start performing moves, insertions, deletions and sorts... at that point it becomes a dynamic collection, and the array becomes useless in comparison.
Also, a bad implementation is still a bad implementation... giving one as an example does not set the rule either.
To be fair, linear search insertion is not the only method. There are many contexts where you would just pass the node directly before or after where you want to insert. But these are probably not as common as needing to do the linear search, so definitely a valid point.
To my understanding of the question, if you have 100k elements and need to remove the one at index 702, with a vector you can index directly to that element and it just takes a memory copy to move the later elements down, and that copy is amazingly quick because the data is contiguous and cache-friendly. With a list, the nodes are never contiguous (possibly even on a different NUMA node), and you have to traverse the list from the start to the node you want to remove.
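In code, that's roughly the difference between one indexed erase and a node-by-node walk (a toy sketch; 702 is just the index from the comment, and both containers are assumed to hold more than 702 elements):

```cpp
#include <iterator>
#include <list>
#include <vector>

void remove_702(std::vector<int>& v, std::list<int>& l) {
    // Vector: jump straight to the slot, then one contiguous memmove-style shift.
    v.erase(v.begin() + 702);

    // List: no random access, so walk 702 nodes first (likely cache misses),
    // then relink the neighbors.
    auto it = l.begin();
    std::advance(it, 702);
    l.erase(it);
}
```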
Only time a linked list made sense to me was when I was storing really large state objects in a ring buffer (2000 simulation steps "forward") for visualising orbits. It was basically a huge FIFO queue and using a linked list meant I didn't have to reorder anything
even then... a properly allocated ring buffer (an array) with a head / tail pointer can work for this thing too! (maybe, I don't know the exact details)
@ThePrimeTimeagen this is true, but you'd still have a ring array of pointers to heap-allocated objects since they're so large. In the end it would probably be the same speed, since you're still looking at one pointer dereference per 20 KB block. Within each state block, the rendering is iterating over super compact, data-oriented position states. So uh. Overall. Performance is the same lol
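For what it's worth, a stripped-down sketch of the array-backed ring buffer being discussed (fixed capacity, indices instead of links; the names are invented):

```cpp
#include <array>
#include <cstddef>

// Fixed-capacity FIFO over a plain array: push at the tail, pop at the head,
// both O(1), no relinking and no per-element allocation.
template <typename T, std::size_t N>
class RingBuffer {
public:
    bool push(const T& value) {
        if (count_ == N) return false;           // full
        buf_[tail_] = value;
        tail_ = (tail_ + 1) % N;
        ++count_;
        return true;
    }

    bool pop(T& out) {
        if (count_ == 0) return false;           // empty
        out = buf_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }

private:
    std::array<T, N> buf_{};
    std::size_t head_ = 0, tail_ = 0, count_ = 0;
};
```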
My dude, your course is pure gold. I very nearly spit out my coffee when you randomly chose 69420
is this the guy responsible for making cpp
shhh... don't say that out loud
B-trees have an advantage for searching, so groups of linked lists should allow fast searching while keeping the linked list's advantage for inserting / deleting. Like having a filing cabinet with three drawers: ages 0-25, ages 26-60, ages 61+.
I don't get where the idea to use a linked list comes from for this example. Sometimes you already have a pointer to the insertion/deletion point, and I feel like that's the only interesting case to compare. Like a "userptr" in some library, or a dictionary mapping a UI element to a linked-list node, something like that; no linear search involved.
Just as I expected, it all boils down to the bizarre effects of modern cached processors. I wonder if Stroustrup is still right for a list of numbers that far outstrips the CPU's cache. It could be that the CPU's loading of cache-line-sized chunks speeds up the shift-by-one so much that this holds even for arbitrarily long arrays/lists. But it's also likely to be affected by concurrency, since we use shared-cache multicore processors instead of true multiprocessors where each CPU has its own cache. So a completely external event, such as a user clicking over to a YouTube video while your program is running, could conceivably change whether vectors or linked lists are faster. In today's environment, the only way to know for sure what's faster is direct experimentation under realistic conditions.
At 5:36 I started screaming "B+ trees, goddammit!". It is an unrolled linked list with a tree as a search-index assistant.
I kinda thought the optimum would be somewhere in the middle, where you map to vectors that are the size of a cache block. So you load chunks of, like, 10000 elements into the cache, then go back to the heap periodically to fetch the next block.
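That "somewhere in the middle" idea is roughly an unrolled linked list. A very stripped-down sketch (the chunk size and split policy are arbitrary choices here):

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Unrolled linked list: each node holds a small contiguous chunk of elements,
// so traversal gets cache-friendly runs while an insertion only shifts one chunk.
struct Chunk {
    static constexpr std::size_t kCapacity = 64;   // arbitrary; tune to cache line / page
    std::vector<int> items;                        // kept at <= kCapacity elements
    std::unique_ptr<Chunk> next;
};

// Walk chunk by chunk to locate the global index, then insert within that chunk.
void insert_at(Chunk* head, std::size_t index, int value) {
    Chunk* c = head;
    while (c && index > c->items.size()) {
        index -= c->items.size();
        c = c->next.get();
    }
    if (!c) return;                                // out of range in this sketch
    c->items.insert(c->items.begin() + index, value);
    if (c->items.size() > Chunk::kCapacity) {      // split an over-full chunk in half
        auto tail = std::make_unique<Chunk>();
        tail->items.assign(c->items.begin() + Chunk::kCapacity / 2, c->items.end());
        c->items.erase(c->items.begin() + Chunk::kCapacity / 2, c->items.end());
        tail->next = std::move(c->next);
        c->next = std::move(tail);
    }
}
```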
"TheHeapagen" has to be my favorite. Thanks Bjarne.
7:39 I love how excited Prime became
It seems like what's really happening is the list algorithm doesn't use as much information as the vector algorithm:
Vector memory allocation algorithm:
- Understand how much data is expected.
- Allocate contiguous memory for needs.
List memory allocation algorithm:
- Don't care about how much data is expected.
- Allocate memory one chunk at a time with no demands for contiguity.
I'm probably going to spend an hour dabbling with a different approach for lists:
Use the number of entries needed for the list to malloc a single contiguous chunk of list nodes and initialize that contiguous set of nodes as the initial list object that will have elements randomly inserted & deleted.
My hunch is that this will even the playing field until the number of deletions grows large enough to disorganize the initial list memory block. Will update with results if I get around to it.
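A minimal sketch of the kind of experiment described above (not the commenter's actual code; the sizes are made up): allocate all the nodes in one contiguous block, link them in order, and the initial traversal walks memory almost like an array.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int value = 0;
    Node* next = nullptr;
};

int main() {
    constexpr std::size_t n = 100000;

    // One contiguous allocation for every node in the list.
    std::vector<Node> pool(n);

    // Link them in order: initially "next" is literally the adjacent element,
    // so a traversal walks memory sequentially, just like an array would.
    for (std::size_t i = 0; i + 1 < n; ++i)
        pool[i].next = &pool[i + 1];
    Node* head = &pool[0];

    // Random insertions/deletions only relink pointers inside this block;
    // the locality slowly degrades as the logical order diverges from the
    // physical order, which is exactly what the experiment would measure.
    (void)head;
}
```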
That's interesting. I read a blog post a couple of weeks ago by a guy at Naver (a Korean company) who implemented a Lisp based on vectors instead of lists. The title was "LispE isn't exactly your granny's Lisp", and basically he was making the same points.
Vectors with generational indexes. You get O(1) access, O(1) insertion, O(1) deletion and retain almost all of the cache locality advantages of a regular vector.
It still has the additional cost of keeping track of the generational indexes tho, but they are quite powerful
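A bare-bones sketch of the generational-index idea, often called a slot map (all names here are invented): the handle carries (index, generation), and a stale handle fails the generation check instead of silently aliasing whatever reused the slot.

```cpp
#include <cstdint>
#include <vector>

struct Handle { std::uint32_t index; std::uint32_t generation; };

template <typename T>
class SlotVector {
public:
    Handle insert(T value) {
        std::uint32_t idx;
        if (!free_.empty()) { idx = free_.back(); free_.pop_back(); }  // reuse a slot
        else { idx = static_cast<std::uint32_t>(slots_.size()); slots_.push_back({}); }
        slots_[idx].value = std::move(value);
        slots_[idx].alive = true;
        return {idx, slots_[idx].generation};
    }

    T* get(Handle h) {
        if (h.index >= slots_.size()) return nullptr;
        Slot& s = slots_[h.index];
        // A stale handle fails the generation check instead of aliasing new data.
        if (!s.alive || s.generation != h.generation) return nullptr;
        return &s.value;
    }

    void erase(Handle h) {
        if (get(h) != nullptr) {
            Slot& s = slots_[h.index];
            s.alive = false;
            ++s.generation;              // invalidate all outstanding handles
            free_.push_back(h.index);    // slot gets reused by a later insert
        }
    }

private:
    struct Slot { T value{}; std::uint32_t generation = 0; bool alive = false; };
    std::vector<Slot> slots_;            // contiguous, so iteration stays cache-friendly
    std::vector<std::uint32_t> free_;
};
```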
These are the videos I like, where he gets out the chalk board