Splendid presentation.
I totally agree that shipping a language v1.0 with optimizations that make it this hostile to the programmer is unacceptable.
Personally, I'm completely okay with not having parameter reference optimization. Programmers should be aware of their allocations and copies, even if it's on the stack.
I find this idea of secretly eliding a copy antithetical to Zig's idea of simple, explicit, allocation-aware programs.
I'm interested to see how the Zig core team addresses this.
Referenced youtube videos:
Understanding compiler optimization: ua-cam.com/video/FnGCDLhaxKU/v-deo.html
LLVM Optimization Remarks: ua-cam.com/video/qmEsx4MbKoc/v-deo.html
Tuning C++: ua-cam.com/video/nXaxk27zwlk/v-deo.html
Rust BTW
LLVM Optimization Remarks ua-cam.com/video/qmEsx4MbKoc/v-deo.html how rust solves the problem.
TOP G
"No hidden control flow" - until now!
This is my favorite heavy metal bands going into country music all over again.
Fantastic talk. I think opt-in is a good option (I'm firmly in the "explicit is better than implicit" camp for most things, even if I acknowledge the convenience of implicit optimizations) but I have no way of knowing how much that affects the language, compiler, and optimizations. It's a tricky thing and I appreciate this talk for exposing the problem.
Answer seems obvious: remove parameter reference optimization.
Yeah, I'm really confused. I didn't even realize it existed.
The examples shown are only a "bug" once you realize that the compiler will turn a T into a *const T, which changes the semantic model you have.
If the language allows taking parameters as pointers, then when you don't take a parameter by pointer your assumption is that it is a copy.
It sounds like saying "I'm creating this big chunk of memory and moving it here and... oops... ohh, that's wasteful"... uhhh, yeah, so if you don't want to do that... then don't do it...
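To make the "T silently becomes *const T" point concrete, here is a minimal C++ sketch in which the choice is spelled out by hand rather than made by the compiler. The `Vec2` type and both functions are hypothetical and only illustrate why the two signatures are not interchangeable when the argument aliases the output.

```cpp
#include <cassert>

struct Vec2 { int x; int y; };

// By value: `v` is a private copy, so writes through `out` cannot change it.
void swap_axes_by_value(Vec2 v, Vec2& out) {
    out.x = v.y;
    out.y = v.x;
}

// By const reference: `v` may refer to the same object as `out`.
void swap_axes_by_ref(const Vec2& v, Vec2& out) {
    out.x = v.y;  // if v aliases out, this also overwrites v.x
    out.y = v.x;  // ...so this reads the value that was just written
}

// Both helpers pass the same object as input and output.
Vec2 run_by_value() {
    Vec2 p{1, 2};
    swap_axes_by_value(p, p);
    return p;  // {2, 1}: the axes were swapped
}

Vec2 run_by_ref() {
    Vec2 p{1, 2};
    swap_axes_by_ref(p, p);
    return p;  // {2, 2}: the original x was lost
}
```

The function bodies are identical; only the parameter type differs, and that is the kind of change the comment says happens without the programmer asking for it.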
Thanks, very honest talk. I now see why zig is absolutely not production ready, thus auf is a 100% no go.
But love the transparency, hoping for the "how we fought off the attack of the killer features" talk.
These features feel out of place in Zig. If I pass a value, I expect to get a copy. I would prefer for these features to be removed.
Trust the developer to optimize their code when copying by value becomes a performance problem. Maybe add a section to the docs explaining how copying values can be slow, show how to optimize manually and add a warning about aliasing. That would be the explicit way of dealing with that problem IMO.
Edit: of course if there is a solution that keeps the simple syntax and never creates unchecked undefined behavior, then that's great, but it seems there is no such solution.
I agree. It seems antithetical to Zig's ethos of 'no hidden control flow'. I know those are different concepts, but this is hidden functionality. If it's causing problems, it's the programmer's fault.
I tend to agree, yes. One of my favorite things in Zig is the "no hidden" approach, and this goes contrary to that. It's well intentioned but it is also contradictory to my expectations, for sure.
Yes it does seem opposed to how Zig markets and talks about itself to have potentially-damaging optimization in place that the programmer is expected to just already know about separately from explicit language semantics. It's perfectly fine behavior but it should be made obvious to the programmer in some way. I like the suggestion of Go's no PRO but have a compiler or linter warning along the lines of "hey are you sure you want to pass a 2kB object by value? This could impact performance. Maybe pass this by reference instead" instead of doing this behind my back when it has a non-zero chance of blowing up in my face.
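The "warn me instead of rewriting my code" idea can be sketched as a compile-time check. This is only an illustration: the 2048-byte threshold mirrors the "2kB" in the comment above, and `kLargeParamBytes`, `is_large_by_value`, and `take_by_value` are hypothetical names, not part of any compiler or linter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical threshold, taken from the "2kB object" wording above.
constexpr std::size_t kLargeParamBytes = 2048;

// True when copying a T into a parameter is likely to be expensive.
template <typename T>
constexpr bool is_large_by_value() {
    return sizeof(T) >= kLargeParamBytes;
}

struct Small { double x; double y; };
struct Big { unsigned char bytes[2048]; };

// Rejects large types at compile time instead of changing how they are passed.
template <typename T>
void take_by_value(T value) {
    static_assert(!is_large_by_value<T>(),
                  "large object passed by value; consider passing by reference");
    (void)value;
}
```

Calling `take_by_value(Small{1.0, 2.0})` compiles, while `take_by_value(Big{})` stops the build with the message, which leaves the decision with the programmer.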
On a modern desktop CPU, there are so many registers that passing stuff by value is cheap for things that are even 4-16 cache lines in size. A struct of, say, a couple of doubles costs the same to pass by value in registers as it does to pass by reference, except that at the time of use, loading via the reference will eat cycles for nothing. Copies via registers are always cheaper. The only time when references make sense is when the stuff is large enough that loading it to copy would start evicting useful data from the data cache. So we don’t want memcpy, and we don’t want to load things that are more than a couple of cache lines in size if they are in memory and unlikely to be in the cache, and if the function would only access a small part of the copy in terms of cache lines. But passing by value in vector registers is cheap on platforms that support it if the memory roundtrips can be avoided. So passing large structs around by value will work as long as there’s locality of reference and optimizations are global.
@27:27 Did you notice how heavy RLS/PRO is? It left a dent in the desk surface 😮
I’m on #58 of ziglings..been fun so far. I feel like zig is what we actually wanted from rust, and the successor to C
odin is much much fun
@@_slier imagine finding Odin content and letting happy Odin users know that Zig is much much fun... Why?!
I'm very hopeful that Zig can be the successor to C that it wants to be.
Having written more C than I ever planned to, I can say without a doubt that Zig's approach of making what would be UB in C opt in and explicit is easily the most exciting feature of the language for me (tied with comptime :) ). It removes the dread and paranoia that I experience as a product of valid syntax and seemingly safe code producing the strangest bugs that waste my time, and erode my sanity.
It makes me feel safe to pull out more stops to optimize my code when necessary instead of trying to always walk the defined behavior tightrope to avoid getting stabbed by the UB demon clinging to my ceiling that's been eyeing me ever since I opened the .c file I'm working in.
@@kanji_nakamoto imagine caring :), seriously though they're both fantastic and there is room for both
I enjoy writing in Zig, but I think Rust is still the go-to lang for me.
Thank you so much for this! It's answering some of the questions I had regarding struct return semantics!
I suppose the biggest question to come out of this for me is: What degree of Zig's big performance vs other languages is due to Parameter Reference Optimization, and would removing it cause serious negative impacts to projects like Zap that are impressive for performance? It seems like the obvious thing is to remove it as an automatic optimization, make pass-by-reference the only way to get this behavior, but the presenter seemed like he had brought this up and encountered pushback.
This was a great talk to know more on how the language works under the hood. I wonder how the maintainers will find a fix to this issue.
But I find this kind of funny @7:05.
There really is no bug until you realize this "optimization feature" exists.
If you think about it, the code before was semantically correct until you reveal the fact that the compiler changed it to a pointer from under your feet.
This means that when you are writing Zig code you can't just look at the code and understand what's happening without thinking about the optimization the compiler is going to make.
This talk scared the s**t out of me tbh. There's no way I want to think about these details all the time as a developer...
*EDIT*. I want to make it clear that we Software Engineers _obviously_ need to think about some implementation details, at some point this is inevitable. But having a language abstract you away from a very important detail, and then offer you _very_ bad surprises when it's too late - that's the scary part.
Yes, it goes against everything that Zig promises to be.
Tbh it's not worse than C/C++ UB. If you just have to remember that one detail to be effective in the language, that's OK to me. I'd much rather have one thing to remember than 800 subtleties you have to know by heart if you don't want to waste hours debugging C/C++: all those weird integer promotion rules, the std behaviour, which functions allocate, etc., plus the mental load of macros and templates, and keywords that have to be interpreted according to their location in the code. I still think this is not great and they should definitely fix it, but honestly I would take these UBs any day of the week.
@@pierreollivier1 I do feel it's worse than UB because these issues are (apart from the async part where RLS is needed) purely caused by optimizations that are not as safe as they appeared, that also violate the idea of having the language be explicit - if I say a value should be passed by value and rely on that inside the function, this can magically mess me up and I might not notice. A lot of UB has well-defined rules when it applies, and there's already tons of tools to detect and deal with a large class of it, both statically and dynamically, but more importantly: The fixes are often simple.
The list append example in the talk? My only idea on fixing this would be an explicit clone to a temporary inside the function, before growing, because I cannot trust the type system that by-value means by-value when it actually matters.
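The "explicit clone to a temporary before growing" fix described above can be sketched in C++, where the by-reference parameter is written out explicitly. `IntList` is a hypothetical hand-rolled list, not the container from the talk; the only point is the ordering inside `append`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

struct IntList {
    std::unique_ptr<int[]> data;
    std::size_t len = 0;
    std::size_t cap = 0;

    // Moves the elements into a larger buffer; the old buffer is freed.
    void grow() {
        std::size_t new_cap = (cap == 0) ? 1 : cap * 2;
        std::unique_ptr<int[]> bigger(new int[new_cap]);
        for (std::size_t i = 0; i < len; ++i) bigger[i] = data[i];
        data = std::move(bigger);
        cap = new_cap;
    }

    // `item` may refer to an element of this list, so copy it BEFORE growing.
    // Reading `item` after grow() would touch freed memory in that case.
    void append(const int& item) {
        const int tmp = item;
        if (len == cap) grow();
        data[len++] = tmp;
    }
};

// Appends the list's own first element while the list is full.
int append_first_element_to_itself() {
    IntList list;
    list.append(42);            // len == 1, cap == 1
    list.append(list.data[0]);  // forces grow() while `item` aliases the buffer
    return list.data[1];
}
```

Without the `tmp` copy the second `append` would read through a dangling reference, which is the failure mode the comment is worried about when by-value silently becomes by-reference.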
@@pierreollivier1 If you genuinely run into UB this often while using C++ and C, you are doing something very wrong. This feature is bad enough to kill zig entirely, and needs to be removed from the language ASAP.
@@Adowrath I have to disagree. My intuition is that it would be easier to remember one unpredictable but dramatic UB than tens of very well documented ones. Think about it: how many times have you encountered some really nasty bug in C just to realise it's the implicit integer promotion rules, or that your machine does a logical shift instead of an arithmetic one, or that a std function sends back -1 but you forgot you are assigning it to an unsigned int, so you never catch that -1 (I know this one is very silly)? The truth is that we are all human, and despite knowing about probably all of C's UB I still make those mistakes sometimes, and remembering and searching for where it's coming from is always an annoying experience. I believe they will be able to solve it in the future, but even if for some reason they can't, in my opinion 1 very annoying UB is better than 40 small UBs.
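The "-1 assigned to an unsigned int" slip mentioned above looks roughly like this. `find_index` is a hypothetical lookup function standing in for any API that returns -1 on failure. Strictly speaking the conversion is well defined (it wraps modulo 2^N) rather than undefined behavior, which is part of why it compiles quietly.

```cpp
#include <cassert>
#include <climits>

// Hypothetical lookup: returns the index of `wanted`, or -1 if absent.
int find_index(const int* items, int count, int wanted) {
    for (int i = 0; i < count; ++i) {
        if (items[i] == wanted) return i;
    }
    return -1;
}

// Stores the result in an unsigned variable, as in the comment above.
unsigned int lookup_missing_as_unsigned() {
    const int items[] = {3, 5, 8};
    unsigned int idx = find_index(items, 3, 7);  // -1 wraps to UINT_MAX
    return idx;
}
```

A later check such as `idx < 0` can never be true for an unsigned value, so the error path is skipped even though the lookup failed.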
when would these optimizations actually be useful? like for vectors or something?
Result location semantics have been present in C++ for a long time under the name of Named Return Value Optimization (NRVO). Compilers have been doing it for some time, but it became mandatory in C++17 IIRC. It's not as problematic in C++, I think, so maybe the Zig designers can look into that.
NRVO does not apply to the braced initializers in the type of example where it becomes problematic in Zig. The same example in C++ would be a copy initializer and incur the cost of the copy.
That was literally said in the video
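For reference on the C++ side of this sub-thread: as far as I know, what C++17 guarantees is elision when an unnamed temporary (a prvalue) such as `return Tracked{v};` initializes the result, while NRVO for named locals remains an optional optimization. A minimal sketch, using a hypothetical `Tracked` type that counts copy-constructor calls:

```cpp
#include <cassert>

struct Tracked {
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// Returns an unnamed temporary; under C++17 it is constructed directly
// in the caller's object, so the copy constructor is never called.
Tracked make_tracked(int v) {
    return Tracked{v};
}

// Counts how many copies happen when initializing from the returned value.
int copies_for_prvalue_return() {
    Tracked::copies = 0;
    Tracked t = make_tracked(7);
    (void)t;
    return Tracked::copies;
}
```

This assumes the code is built as C++17 or later; older standards allow the same elision but do not require it.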
@22:40 if the function ( such as init ) only uses the single pointer is it automatically noalias?
A fantastic talk. It is a cause to appreciate the strict approach to being explicit about aliasing, moving, copying and (infamously) lifetimes taken in the language that shall not be named (I'm assuming that's what you call it, I'm only a visitor to the Zig community TBH)
Oh my gidh!
A lot of regressions ?
My dight !😢🎉
Zig dream...
Wasn’t Zig's selling point that there is nothing going on which you don’t see in the code? Now they tell me that it is full of footguns? Dangling pointers in the stdlib in a language that was invented AFTER Rust?
Zig isn't a product, and you didn't pay anything for it. So fine, go get your money back I guess. It's in development. It has bugs. These are bugs.
M... I hope the language designers can find an elegant solution without adding too much complexity.
I don't like parameter reference optimization, and it shouldn't be in any language. The check that would trigger it could be a warning. Return value optimization could be opt-in at the call site.
jeez I love zig but goddamn this was depressing.
It'll get fixed before 1.0 (ETA: at least 5 years™)
Nobody said it would be easy!
I hope in 2050 zig will be production ready.
Can't PRO go the other way around? instead of optimizing copies into references for big enough types, optimize references into copies for small enough ones?..
Not really. Pointers add semantics on top of values, whereas values (ideally) do not add semantics on top of pointers. People don't just return things by pointer to be efficient. They often do it so that they can signal some non-local change via the state associated with it. Generally people return things by value because they are only interested in the value, which you have access to whether it's a pointer or a value.
why dumb ads, why
Comptime, inline loops and errdefer are all old ideas. Comptime is a bit different in Zig than comparable concepts are in other languages, but not really to such an extent that it makes a huge difference. And inline loops and errdefer have existed under other names in other programming languages for decades: in C++ "inline for" is recursive template expansion, in D it's just "static foreach"; meanwhile "defer/errdefer" in either is just RAII (you can write defer/errdefer in compliant C++98, although C++11 and up allow for more convenient syntax).
RAII is like defer tied to value lifetime; it's not the same. defer/errdefer don't care about value lifetimes and can be used for purposes other than destructors.
Who said Zig is doing groundbreaking stuff? It is just a simple language, and because of that it can't do background magic like other programming languages do.
No individual feature makes a tool worth using. It's generally about the interplay between features, and agreement with the intent and priorities of the design and execution as a whole.
Whatever the feature is, Lisp more than likely got there first anyhow. So why doesn't anyone use it?
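Both sides of this exchange can be seen in a small scope guard. The sketch below is one common way to get defer-like behavior in C++11 and later; `Defer` is a hypothetical helper, not a standard library type. It runs an arbitrary callable at scope exit, but it does so through the lifetime of a guard object, which is the distinction the reply draws.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Runs the stored callable when the guard object goes out of scope.
template <typename F>
class Defer {
public:
    explicit Defer(F fn) : fn_(std::move(fn)) {}
    ~Defer() { fn_(); }
    Defer(const Defer&) = delete;
    Defer& operator=(const Defer&) = delete;

private:
    F fn_;
};

// Records the order in which the body and the deferred action run.
std::string run_with_defer() {
    std::string log;
    {
        auto cleanup = [&log] { log += "cleanup;"; };
        Defer<decltype(cleanup)> guard(cleanup);
        log += "body;";
    }  // guard is destroyed here, so the cleanup lambda runs here
    return log;
}
```

The deferred action does not have to be a destructor-style release, yet it still fires only because `guard` is destroyed, so the mechanism remains tied to an object's lifetime.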
And then we realized we were trying to recreate Rust... ;)
I don't know enough about Rust to say for certain but it seems like...
In a sense the problem is that they didn't create Rust. Rust is very explicit about copying, moving, aliasing, "borrowing" and of course (infamously) lifetimes.
@@maninalift what I had in mind with that joke is that, in pursuit of optimization, they reached a place two steps behind the borrow checker 😉 People trying to save C++ have taken a similar route many times, but chasing safety instead 😉
I keep my fingers crossed for Zig, which has a guaranteed front-row seat as a C replacement for a big start and can evolve further.
Rust is... different, a totally new world with new laws of "programmatics". At first it seems familiar, but it is a trap 😅
The more I know, the more I believe in OCaml with a low-level access metaphor...
And then we realized that... removing garbage syntax is not enough to outsmart C++ compiler authors ;)
Can you explain how many issues this creates, and which ones?
@@shrin210 issues? I was thinking about the fact that it is not C++ syntax that prevents optimization, but the nature of some things, or sets of things, that are allowed.
But I strongly keep my fingers crossed for Zig anyway.
It is a great language with clear rules, and its integration with the C/C++ compiler/linker ecosystem, with cross-compilation in mind as well, is incredible, almost unbelievable, and will surely have a place in CS books.
But given its goals, I clearly see Zig as a few orders of magnitude better yet compatible replacement in embedded systems, especially in cars or planes (the no-hidden-allocation rule could save a few lives)