Skipping through this, I didn't realize I know nothing about the simplest things
I am curious if you have a histogram of arity usage. If it were me I would be fine with there being a performance cliff at some point, perhaps at two standard deviations above the mean arity. The cliff serves as a warning to the app developer that they probably should be using a different approach. There has to be a limit to how much effort to spend on making bad code perform well; put that effort into making the good and average code even faster.
I don't have any histograms to share, but most apps I've looked at have a lot of low-arity concats then a long tail. But then there are those apps that generate massive concat expressions on the fly and skew the picture. 😅
I think the now integrated implementation strikes a good balance. Small expressions inline and optimize *very* well, then as things get larger regular inlining heuristics in the JIT will gracefully degrade performance without ever really falling off a cliff. More a gentle slope.
Thank you!
Interesting talk, thanks. If I'm generating Java code that needs to target JVMs before 23, and might generate a complex concatenation, what's a reasonable threshold to swap out the concatenation operator in favor of emitting a StringBuilder expression?
20? 😊 I'd suggest measuring. Appropriate thresholds might be different on some older JDK versions and on non-HotSpot VMs, so making it configurable is probably good.
@ClaesRedestad Thanks, will definitely measure, but my guess was pretty close and it's nice to confirm you're seeing the blowup about where I thought it might be.
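One way to make that threshold configurable in a code generator is sketched below. This is a minimal sketch, not anything from the talk: `ConcatEmitter`, its `emit` method, and the default of 20 (taken from the ballpark guess above) are hypothetical, and operands are assumed to be simple expressions with a `String` among the first two, so that `+` means concatenation.

```java
import java.util.List;

/** Hypothetical helper for a Java source generator; the names are illustrative only. */
final class ConcatEmitter {
    // Assumed default, based on the "20?" ballpark above. Measure on your target JVMs and tune.
    static final int DEFAULT_THRESHOLD = 20;

    private final int threshold;

    ConcatEmitter(int threshold) {
        this.threshold = threshold;
    }

    /** Returns Java source text that concatenates the given operand expressions. */
    String emit(List<String> operands) {
        if (operands.isEmpty()) {
            return "\"\"";
        }
        if (operands.size() <= threshold) {
            // Low arity: emit a plain '+' chain and let the compiler and JIT handle it.
            return String.join(" + ", operands);
        }
        // High arity: emit an explicit StringBuilder chain instead.
        StringBuilder src = new StringBuilder("new StringBuilder()");
        for (String operand : operands) {
            src.append(".append(").append(operand).append(')');
        }
        return src.append(".toString()").toString();
    }
}
```

With a threshold of 2, `emit(List.of("a", "b"))` yields `a + b`, while `emit(List.of("a", "b", "c"))` yields `new StringBuilder().append(a).append(b).append(c).toString()`.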
I'm a fan of this language, it's my favorite.
After years, String is still a pain to work with in Java (not counting text blocks)
Which languages are better?
@zappini The interpolation in .NET looks amazing
Hope some kind of string interpolation comes to Java soon.
Look up string templates in Java 21.
I assume that String interpolation would use this "low-level" String concatenation under the hood anyway.
@_SG_1 yes, this was briefly mentioned near the end of this talk. While the templates feature has been pulled out for now, if/when we redo it, it will benefit from the work we presented here, increasing confidence in the runtime side of it.
One of the things I miss in Java is the ability to get immutable views of a string that don't incur an allocation. If I get a substring of an already immutable string, why does it have to allocate? So, in this sense, Rust's mutable and immutable string types might be a good inspiration, though arguably requiring a better API for Java.
Allocation by the JVM is surprisingly fast already and doing substring this way would probably prevent garbage collection of the original string, so idk...
The substring method used to point to the original String. This caused a memory leak, as the substring kept the original String referenced. This was fixed in JDK 1.7. Basically, the behavior you describe is a bug to be fixed, not a desired behaviour.
Search "substring memory leak".
@vasiliigulevich9202 funny how what you're arguing is a bug is a language feature in other languages. This is a false equivalence, as the bug in the previous implementation isn't necessarily the only solution to the problem, so you shouldn't consider the problem as "a bug to be fixed".
I'm advocating for a string view, which could be a different class, instead of reusing the same class with an added responsibility.
@sebastianb7496 it really depends on what you're doing and the order of magnitude we're talking about. Operations that happen tens or hundreds of times per request can significantly impact request handling through many tiny allocations and increased GC pressure.
@hkupty A string view has the same problem if it holds a strong reference. Arguably, the language could intercept the garbage collection event and make a copy of the substring whenever the original is collected, but that would degrade GC performance. Languages with strict lifetime control can afford non-owning references at the cost of dangerous or complex lifetime management.
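To make the trade-off in this thread concrete, here is a minimal sketch of a separate view class. `StringView` is hypothetical, not a JDK class. It avoids copying characters, but it holds a strong reference to the source string, so the whole source stays reachable while any view is alive, and creating the view still allocates a small wrapper object unless the JIT manages to eliminate it.

```java
import java.util.Objects;

/** Hypothetical read-only view over a range of a String; not part of the JDK. */
final class StringView implements CharSequence {
    private final String source; // strong reference: keeps the entire source string alive
    private final int start;     // inclusive
    private final int end;       // exclusive

    StringView(String source, int start, int end) {
        Objects.checkFromToIndex(start, end, source.length());
        this.source = source;
        this.start = start;
        this.end = end;
    }

    @Override
    public int length() {
        return end - start;
    }

    @Override
    public char charAt(int index) {
        Objects.checkIndex(index, length());
        return source.charAt(start + index);
    }

    @Override
    public CharSequence subSequence(int from, int to) {
        Objects.checkFromToIndex(from, to, length());
        // Narrowing a view creates another view over the same source, with no character copy.
        return new StringView(source, start + from, start + to);
    }

    @Override
    public String toString() {
        // Materializing a String copies the characters, like String.substring does today.
        return source.substring(start, end);
    }
}
```

For example, `new StringView("hello world", 6, 11)` has length 5 and `toString()` returns `"world"`; only that last call copies characters.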
We want String interpolation like every other cool language
Hello
I hated JEP 280 from the moment I heard about it. And now I know who to blame for it. It's SO OBVIOUSLY an absolutely grotesque abuse of dynamic class generation, SO OBVIOUSLY doomed to bog down startup and waste memory and clog the code cache. It has no redeeming value whatsoever and should be rolled back as the bug it is. Invokedynamic is the hammer that makes everything look like a nail. It prompted me to MANUALLY use StringBuilder more often, knowing that trusting `+` for concatenation is going to suck. Oh what's that, they had to install a workaround in the VM to use StringBuilder after all? No surprise. PATHETIC.
Ouch! I had (almost) nothing to do with JEP 280 initially, but have picked it up and worked on the implementation after it got integrated, reducing the footprint and runtime impact over the releases. I admit I've thought on a number of occasions - and probably argued internally - that we should roll this back entirely or make it opt-in. Alas. Still, a number of indirect benefits have come out of persevering - including a great number of optimizations in several related and unrelated areas.
The recent StringBuilder workaround in JDK 23 is ugly, yes, and prompted the prototype rework that this talk concludes with. (Now integrated in the main OpenJDK repo and slated for JDK 24).
On the flip side, we now also have a very straightforward path to link- and assembly-time static code generation that will allow pre-generating any string concats while safely getting the peak performance benefits.
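For context on what JEP 280 changed: since JDK 9, javac by default compiles `+` on strings to an `invokedynamic` call site bootstrapped by `java.lang.invoke.StringConcatFactory`, rather than emitting a `StringBuilder` chain directly. The sketch below calls that bootstrap method by hand to show the shape of what gets linked. `IndyConcatDemo` and its method names are made up for illustration, and real call sites are linked by the JVM, not by user code like this.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

/** Illustrative only: invokes the JEP 280 bootstrap method manually. */
final class IndyConcatDemo {

    /** Builds a handle equivalent to: "Hello, " + name + "! You are " + age + "." */
    static MethodHandle greeter() throws Exception {
        CallSite site = StringConcatFactory.makeConcatWithConstants(
                MethodHandles.lookup(),
                "makeConcatWithConstants",
                // Call site type: takes (String, int) and returns String.
                MethodType.methodType(String.class, String.class, int.class),
                // Recipe: each U+0001 character marks a slot for the next argument.
                "Hello, \u0001! You are \u0001.");
        return site.getTarget();
    }

    static String greet(String name, int age) {
        try {
            // invokeExact requires the exact (String, int) -> String shape declared above.
            return (String) greeter().invokeExact(name, age);
        } catch (Throwable t) {
            throw new IllegalStateException("concat linkage failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(greet("Ada", 36)); // prints "Hello, Ada! You are 36."
    }
}
```

The arity discussed earlier in the thread corresponds to the number of argument slots in the recipe; this example has two.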