I improved the string class a while after making this video. Now it uses an optimization called SSO, which is short for "small string optimization." This makes the code much nicer, and small/empty strings are now much less costly to create. The worst thing about the code from this video is that even empty strings allocate space on the heap. SSO fixes this.
This video also gets the award for my worst thumbnail ever.
Thanks for coming to my TED Talk.
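For readers who have not seen SSO before, here is a minimal sketch of the idea, not the code from the video or its repository. `SsoString`, `kInlineCapacity`, and the 15-character threshold are all made up for illustration. Real implementations usually overlap the inline buffer with the heap pointer in a union to save space; this version keeps them separate so it is easier to read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical class for illustration only; not the video author's string.
class SsoString {
public:
    // Assumed inline capacity; real libraries pick their own value.
    static constexpr std::size_t kInlineCapacity = 15;

    // An empty string lives entirely inside the object: no heap allocation.
    SsoString() noexcept { inline_[0] = '\0'; }

    explicit SsoString(const char* s) {
        assert(s != nullptr);
        size_ = std::strlen(s);
        if (size_ <= kInlineCapacity) {
            // Small string: copy into the inline buffer (terminator included).
            std::memcpy(inline_, s, size_ + 1);
        } else {
            // Large string: fall back to the heap.
            heap_ = static_cast<char*>(std::malloc(size_ + 1));
            if (heap_ == nullptr) throw std::bad_alloc();
            std::memcpy(heap_, s, size_ + 1);
        }
    }

    // Copying and moving are left out to keep the sketch short.
    SsoString(const SsoString&) = delete;
    SsoString& operator=(const SsoString&) = delete;

    ~SsoString() { std::free(heap_); }  // free(nullptr) is a no-op

    const char* c_str() const noexcept { return heap_ ? heap_ : inline_; }
    std::size_t size() const noexcept { return size_; }
    bool is_inline() const noexcept { return heap_ == nullptr; }

private:
    std::size_t size_ = 0;
    char* heap_ = nullptr;              // null while the inline buffer is in use
    char inline_[kInlineCapacity + 1];  // +1 for the terminating '\0'
};
```

With this layout, `SsoString{}` and `SsoString("hello")` never touch the allocator; only strings longer than `kInlineCapacity` do.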
You cannot rely on realloc for good performance: you may still get worst-case quadratic performance when building up a string from smaller pieces. Your benchmarks may show an improvement over std::string because your heap isn't fragmented yet. In a long-running process with lots of malloc/free, realloc will need to frequently copy the string around. To avoid that, you need to track the capacity separately from the length and increase it exponentially -- which is what std::string does.
I was also going to comment about SSO, but looking at your GitHub code you already figured it out yourself; now your string also has 32 bytes of overhead :)
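A rough sketch of the capacity-tracking approach described above, assuming a hypothetical `GrowBuffer` class (the name, the starting capacity of 16, and the growth factor of 2 are illustrative choices, not taken from the video or from any particular std::string implementation). The point is that `realloc` is only called when the capacity runs out, and each call at least doubles it, so the total copying work stays linear in the final length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical append-only buffer; illustrative names and constants.
class GrowBuffer {
public:
    GrowBuffer() = default;
    GrowBuffer(const GrowBuffer&) = delete;
    GrowBuffer& operator=(const GrowBuffer&) = delete;
    ~GrowBuffer() { std::free(data_); }

    // Appends n bytes from s. s must not point into this buffer,
    // because realloc may move the storage.
    void append(const char* s, std::size_t n) {
        assert(s != nullptr);
        const std::size_t needed = size_ + n + 1;  // +1 for the terminator
        if (needed > capacity_) {
            // Grow geometrically instead of to the exact size needed.
            std::size_t new_cap = capacity_ ? capacity_ * 2 : 16;
            if (new_cap < needed) new_cap = needed;
            char* p = static_cast<char*>(std::realloc(data_, new_cap));
            if (p == nullptr) throw std::bad_alloc();  // data_ is still valid
            data_ = p;
            capacity_ = new_cap;
            ++reallocations_;
        }
        std::memcpy(data_ + size_, s, n);
        size_ += n;
        data_[size_] = '\0';
    }

    const char* c_str() const noexcept { return data_ ? data_ : ""; }
    std::size_t size() const noexcept { return size_; }
    std::size_t capacity() const noexcept { return capacity_; }
    std::size_t reallocations() const noexcept { return reallocations_; }

private:
    char* data_ = nullptr;
    std::size_t size_ = 0;           // characters stored, excluding '\0'
    std::size_t capacity_ = 0;       // bytes allocated, including '\0'
    std::size_t reallocations_ = 0;  // instrumentation for the example only
};
```

With these constants, 1000 one-byte appends call `realloc` 7 times (capacities 16, 32, ..., 1024), whereas resizing to the exact length on every append would call it 1000 times and could copy the whole string each time.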
Absolutely. Loading and running a search tree or even a string-based SVT would make for a better benchmark.
Smart Pointers are the future!
This is some of the most interesting content I've seen in quite some time, thanks and keep it up
That was very interesting. One question though: why is the standard library string class the way it is? I mean, is it made for maximum safety or just backwards compatibility?
I think it's backwards compatibility, and some of those features came after the original version of the standard library string, and now it's too late...
5:14 There's no temporary there since C++17
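The code at 5:14 is not reproduced in this thread, so the following is only a guess at the kind of expression meant: initializing an object from a prvalue such as `T x = T(...)` or `return T(...)`. Since C++17 the language guarantees that no temporary is materialized in those cases, so neither the copy nor the move constructor runs. `Tracked`, `make_tracked`, and the counters are hypothetical names for this sketch.

```cpp
#include <cassert>

// Global counters so the example can observe constructor calls.
int g_copies = 0;
int g_moves = 0;

// Hypothetical type that records every copy and move.
struct Tracked {
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& o) : value(o.value) { ++g_copies; }
    Tracked(Tracked&& o) noexcept : value(o.value) { ++g_moves; }
};

// Returning a prvalue: the caller's object is initialized directly.
Tracked make_tracked(int v) {
    return Tracked(v);
}

void elision_demo() {
    g_copies = 0;
    g_moves = 0;
    Tracked a = Tracked(1);       // no temporary since C++17
    Tracked b = make_tracked(2);  // no temporary since C++17
    assert(a.value == 1 && b.value == 2);
    assert(g_copies == 0 && g_moves == 0);
}
```

Before C++17 compilers were allowed, but not required, to skip these copies, which is why older explanations often describe a temporary here.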
this is why i use c#
Do you think you could make string concatenation do nothing for empty strings, for constexprs and such?
Don’t mind the Simon up in here
He’s just gonna
uh
_yoink_
???
@jfk_the_second means he's gonna steal the code
Copy and move constructors give me headaches....
Thanks! I have my own CWString class because std::wstring can't be safely used across exe/dll boundaries. I'll have a look at your code to see tips on how to improve my own code.
(strcpy isn't safe)
It's as safe as the programmer. Always check for errors and assume your function's callers will pass garbage.
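One way to read "assume your callers will pass garbage" is a thin wrapper that validates its arguments before copying. The sketch below uses a hypothetical helper, `copy_checked`, which is not a standard function. It rejects null pointers and sources that do not fit, and reports the failure instead of overflowing the destination.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dst only if it fits, terminator included.
// Returns false on null arguments, a zero-sized buffer, or a source that is too long.
bool copy_checked(char* dst, std::size_t dst_size, const char* src) {
    if (dst == nullptr || src == nullptr || dst_size == 0) {
        return false;
    }
    const std::size_t len = std::strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';  // leave dst as a valid empty string
        return false;
    }
    std::memcpy(dst, src, len + 1);  // +1 copies the terminating '\0'
    return true;
}

void copy_checked_demo() {
    char buf[8];
    assert(copy_checked(buf, sizeof buf, "hello"));          // fits
    assert(!copy_checked(buf, sizeof buf, "far too long"));  // rejected
    assert(!copy_checked(nullptr, 8, "x"));                  // rejected
}
```

The caller still has to check the return value, which is the commenter's point: the safety comes from the checks, not from the copy routine itself.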
10th comment! 😂