Rust Tip - Into String as Function Arguments
- Published Oct 14, 2024
- Here is a quick Rust Programming tutorial showing how to type function arguments to accept &str, &String, and String (which will be moved). This is useful when the body of the function needs the full String no matter what is passed.
Help support this channel: / jeremychone
When &str or &String is passed as an argument, new string allocation occurs on .into()
When String is passed, no new allocation occurs on .into(). The String is moved to the function.
So, in a way, this technique gives allocation control to the caller.
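A minimal sketch of the three cases above. The function name `do_stuff` and its body are placeholders for illustration, not the exact code from the video. The pointer comparison only shows that a passed String keeps its heap buffer, which suggests no new allocation took place.

```rust
// Hypothetical function: the body needs an owned String no matter what is passed.
fn do_stuff(val: impl Into<String>) -> String {
    // &str and &String: .into() allocates a new String.
    // String: .into() is the identity conversion, so the value is just moved.
    let owned: String = val.into();
    owned
}

fn main() {
    let slice: &str = "hello";
    let name = String::from("world");

    let a = do_stuff(slice); // &str: new allocation
    let b = do_stuff(&name); // &String: new allocation, `name` is still usable

    // Remember where the String's buffer lives before moving it in.
    let ptr_before = name.as_ptr();
    let c = do_stuff(name); // String: moved, `name` can no longer be used

    // Same buffer after the call, so nothing was copied.
    assert_eq!(ptr_before, c.as_ptr());
    println!("{a} {b} {c}");
}
```

The caller picks the cost: pass a reference to keep the original (and pay for a copy), or pass the String itself to hand it over.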
Notes for this pattern:
1. Do not use this pattern if the function body does not need to own the String and can get away with &String or &str. Use &str in this case.
2. Some might think this pattern causes unnecessary allocations, but this is not the case (see video and points above). An allocation happens only when &str or &String is passed (which would have happened anyway), and not when String is passed. The caller has the control.
3. Downsides:
3.1 Binary size: As noted in a great comment from @JeffHanke below, the compiler will duplicate the function for each unique type passed (monomorphization). See (and upvote) the comment for the binary size mitigation. For most app code, the cost should be negligible compared to the ergonomic value, even more so if mitigated (watch out for compiler inlining, but this can be mitigated as well). For embedded programming, this needs a bit more scrutiny, as do many other aspects of the code design.
3.2 Function ergonomics could be a discussion, but it is subjective. I personally like this pattern as the method signature is clear, flexible, and efficient, as allocation will only occur if needed. But do what works for you and your team.
3.3 Unnecessary discussions about unnecessary allocations (see point 2 above).
Related external resources:
www.philipdani... (Scroll down to Into section)
rust-lang.gith... (more generic but related "caller-decides" topic, see generic paragraphs)
Jeremy Chone:
Twitter - / jeremychone
Discord On Rust - / discord
AWESOME-APP - awesomeapp.dev - Rust Templates for building awesome applications.
Patreon - / jeremychone - Any help is a big help (for Rust educational content)
Other popular Rust Programming videos:
Quick Start Code Layout - • Rust - Simple Code Lay...
AWESOME-APP Full Overview - Rust template for building Awesome Desktop Application: • Building Awesome Deskt...
Tauri Quick Introduction (Desktop App with Rust Programming): • Rust Tauri 1.0 - Quick...
Rust Web App tutorials series: • Rust Web App - 1/3 - D...
Rust Bevy Full Tutorial - Game Development with Rust: • Rust Bevy Full Tutoria...
Rust for Java Developers series: • Rust for Java Develope...
Playlists:
Rust For Desktop App: • JC - Rust Programming ...
Everything Rust Programming - Tutorials, Courses, Tips, Examples: • JC - Rust - Everything...
Rust Programming for Web Development: • JC - Rust Programming ...
Rust Courses: • Rust Course 2021 by th...
Rust for Java Developers: • Rust for Java Developers
AWESOME-APP ➜ awesomeapp.dev - Rust Templates for building awesome applications.
Rust AWESOME-APP GitHub - github.com/org...
Other notes:
Tool used to do the green lines. ScreenBrush on Mac App Store (Gromit seems to be the equivalent on Linux)
Edited with Davinci Resolve.
#rustprogramming #tutorial #rustlang
One thing to be careful with when accepting trait arguments is that Rust will duplicate the entire function when it specializes it for each concrete type. If you care about binary size, your function is big, and you're going to call it with lots of different types, you probably want to factor out most of the logic into a helper function and just do the .into() (or other conversion) call in the main function.
Yes, binary size is a tradeoff when using impl Trait as arguments in general.
I have not done the math, but for low argument functions, I think the tradeoff is worthwhile. But this is a fair point to consider if binary size is critical.
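The helper-function mitigation described in the comment above could be sketched like this. The names `register_user` and `register_user_inner` are made up for illustration, and the body stands in for a larger function.

```rust
// Thin generic wrapper: this is the only part that gets duplicated
// for each concrete argument type (monomorphization).
pub fn register_user(name: impl Into<String>) -> String {
    register_user_inner(name.into())
}

// Non-generic helper: compiled once, holds the bulk of the logic.
fn register_user_inner(name: String) -> String {
    // Imagine a long function body here.
    let mut message = name;
    message.insert_str(0, "Registered: ");
    message
}

fn main() {
    // Two different argument types, but only the small wrapper is instantiated twice.
    println!("{}", register_user("ann"));
    println!("{}", register_user(String::from("bob")));
}
```

Whether this is worth the extra function depends on how large the body is and how many distinct types are passed. The compiler may still inline the helper, so this is a sketch of the idea rather than a guarantee about binary size.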
@@JeremyChone Is there a different way to do this without such an expensive operation? Also, it is not clear whether this can be a general approach for everything that takes a string?
Putting the binary size cost aside as I think this can be negligible in most cases.
From an allocation point of view, assuming your function needs the full String anyway, this impl Into<String> is cheaper, as it allows the caller to move the full value if it can.
So, rather than doing a .to_string() no matter what (if the arg is &str), or requiring String no matter what (with arg: String), impl Into<String> allows the caller to pass the full value if it can, to avoid an unnecessary allocation.
Now, if your function does not need the ownership of the String to do its work, then it's probably better to accept &str.
Did I answer your question?
@@JeremyChone Yes, you did. Thank you!
@@dragonmax2000 just to add to Jeremy's answer, you can use basically the same implementation as in the video with `impl AsRef<str>` if you don't need to own the data.
This is common with functions taking an argument that implements AsRef<Path>, where you don't necessarily need to own a PathBuf, but instead just need to reference something "path-like" (for example, to read or write to a file). This would accept strings, string slices, PathBuf, Path, etc.
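A small sketch of the AsRef variant mentioned in the comment above, for functions that only need to read the value. The function names `char_count` and `has_extension` are hypothetical.

```rust
use std::path::{Path, PathBuf};

// Only needs to read the text, so it borrows it as &str.
fn char_count(text: impl AsRef<str>) -> usize {
    text.as_ref().chars().count()
}

// Accepts anything "path-like": &str, String, &Path, PathBuf, ...
fn has_extension(path: impl AsRef<Path>, ext: &str) -> bool {
    path.as_ref().extension().and_then(|e| e.to_str()) == Some(ext)
}

fn main() {
    let owned = String::from("hello");

    // No String is allocated inside either function.
    println!("{}", char_count("hi"));
    println!("{}", char_count(&owned));
    println!("{}", has_extension("notes.txt", "txt"));
    println!("{}", has_extension(PathBuf::from("src/main.rs"), "txt"));
}
```

Note that the same monomorphization tradeoff applies here as with impl Into<String>.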
It's important to know that Into<String> will copy the string if it is not already owned. If you can afford a lifetime, it's better to use &str and work with references.
At the call site, prefixing the argument with & lets any type that derefs to str (such as String) be coerced to &str.
Yes, if you pass a &str or &String, the .into() will create a new String; if you pass a String, it will not.
So, in a way, the caller has the control of new allocation or not.
`Into<String>` may be less efficient because it requires converting the value into a `String` before processing it, which may involve allocating memory on the heap. However, it may be more convenient to use because it allows you to pass any type that can be converted into a `String`. I heard that using `AsRef<str>` is more efficient because it can work with a `&str` slice directly, without requiring any additional memory allocation.
There is no extra conversion when the argument is a String. So, no extra work.
The only downside is the binary size, as the function gets duplicated for each distinct argument type. See the description for the binary size mitigation.
This issue is addressed by `maybe-owned-trait` crate that I published a while ago.
It's important to note that you need to be very careful when specifying traits with impl arguments because that will make the trait not object safe.
Good point.
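To illustrate the object-safety point above: a trait method taking impl Into<String> is generic, so it cannot be called through a trait object. One common workaround, sketched here with a made-up `Named` trait and `User` type, is to mark the generic method with `where Self: Sized` and keep a non-generic method for `dyn` use.

```rust
trait Named {
    // Non-generic method: callable through `dyn Named`.
    fn set_name_string(&mut self, name: String);

    // Generic convenience method. `where Self: Sized` keeps it out of the
    // trait object, so the trait as a whole can still be used as `dyn Named`.
    fn set_name(&mut self, name: impl Into<String>)
    where
        Self: Sized,
    {
        self.set_name_string(name.into());
    }

    fn name(&self) -> &str;
}

// Hypothetical implementor for illustration.
struct User {
    name: String,
}

impl Named for User {
    fn set_name_string(&mut self, name: String) {
        self.name = name;
    }

    fn name(&self) -> &str {
        &self.name
    }
}

fn main() {
    let mut user = User { name: String::new() };
    user.set_name("ann"); // generic method on the concrete type

    let dynamic: &mut dyn Named = &mut user;
    dynamic.set_name_string(String::from("bob")); // only the non-generic method here
    println!("{}", dynamic.name());
}
```

Without the `where Self: Sized` bound, the `&mut dyn Named` line would not compile.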
NICE! Although I had to rewatch it a few times to grasp the catch. Thanks
I think the equivalent thing in C++ is universal references, where the function parameter might be an l-value reference (that would need to be copied into an owned type) or an r-value reference (that can directly be moved into another owned type), depending on the actual usage.
For example, the equivalent C++ code for the code in the video is:
#include <string>
#include <utility>

template <typename T>
auto do_stuff(T&& val) {
    // std::forward keeps r-values as r-values, so they are moved rather than copied.
    std::string owned_val = std::forward<T>(val);
}
Really powerful ! Thanks a lot !
what is the difference between `Into` and `AsRef`? both work the same in your example
AsRef<str> can only give you a reference (&'_ str), whereas Into<String> allows you to get an owned String.
Into<String> does unnecessary heap allocations in the cases where you only need a &str. Use Into<String> only when you need an owned String.
Correct, this is an important point. This technique works well when your function body needs the full String of the argument. In short, it gives control to the caller over whether the string should be moved or cloned.
Otherwise, use &str or AsRef, if you just need the reference.
i love. much more like this shorts
Thank you Jeremy.
Excellent as always.
How about `impl ToString` or `&impl ToString` as a function/method argument?
You would actually need to `impl Display`, and then you can call `to_string()`.
That approach works too, and in fact, it can take more types, but the downside is that this will require calling `to_string()`, which will always do a new string allocation.
`impl Into<String>` will not generate a new string on `.into()` if a String is passed.
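A rough comparison of the two signatures discussed in this thread. The function names are placeholders, and the pointer checks are only a simple way to observe whether the original buffer was reused.

```rust
// Accepts anything that implements Display (ToString comes for free),
// but always builds a new String.
fn via_to_string(val: impl ToString) -> String {
    val.to_string()
}

// Accepts fewer types, but a passed String is moved as-is.
fn via_into(val: impl Into<String>) -> String {
    val.into()
}

fn main() {
    let first = String::from("hello");
    let first_ptr = first.as_ptr();
    let out = via_to_string(first);
    // A different buffer: to_string() created a new String.
    assert_ne!(first_ptr, out.as_ptr());

    let second = String::from("hello");
    let second_ptr = second.as_ptr();
    let out = via_into(second);
    // The same buffer: the String was moved, not copied.
    assert_eq!(second_ptr, out.as_ptr());

    // ToString takes more types, for example integers.
    println!("{}", via_to_string(42));
}
```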
@@JeremyChone Thank you for the quick response.
How about no_std context and use of heapless::String ? Is it the same ?
Ha, I do not know the answer to this one. If anybody wants to chime in.
Thank you, it seems like overloading from C++
I generally discourage this pattern due to code multiplication. Passing "...".into() expresses the intention better and avoids multiple unnecessary code copies.
Only use this where it's extremely necessary.
I think we can just do_stuff(val:String) // no Into needed?
Yes, you can, but as shown in the video, the callers then have to call .to_string() if they have a reference or a string slice. By accepting impl Into<String>, the callers pass what they have, as long as it implements Into<String>.
If you want the Into trait on one of your types, make sure to implement From (and not Into).
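As a sketch of that last point: implementing From for your own type gives you the matching Into for free through the standard library's blanket impl. The `UserId` type and `label` function are hypothetical.

```rust
// Hypothetical custom type.
struct UserId(u32);

// Implement From, not Into. The blanket impl in std then provides
// `Into<String> for UserId` automatically.
impl From<UserId> for String {
    fn from(id: UserId) -> String {
        format!("user-{}", id.0)
    }
}

fn label(val: impl Into<String>) -> String {
    val.into()
}

fn main() {
    // UserId is now accepted anywhere impl Into<String> is expected.
    println!("{}", label(UserId(7)));
    println!("{}", label("plain text"));
}
```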
Why not &str
First, if the function body does not need to own String, yes, &str should be used.
Now, if the function body must own the String, then using impl Into<String> has the following advantages:
- When &str or &String is passed as an argument, new string allocation occurs on .into()
- When String is passed, no new allocation occurs on .into(). The String is moved to the function.
This way, the caller does not have to call .to_string() when it wants to clone the String, but can pass the full value when it is ok to be moved.
why not do so?
fn do_stuff<T: Into<String>>(val: T) {}
Is your variant just syntactic sugar?
Yes, it’s the same. Both are good.
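For reference, here are three ways to spell the same bound, side by side. The function names are made up. One practical difference: with a named type parameter the caller can spell out the type with a turbofish, which is not possible for an impl Trait argument.

```rust
// Argument-position impl Trait.
fn with_impl(val: impl Into<String>) -> String {
    val.into()
}

// Named type parameter with an inline bound.
fn with_generic<T: Into<String>>(val: T) -> String {
    val.into()
}

// Named type parameter with a where clause.
fn with_where<T>(val: T) -> String
where
    T: Into<String>,
{
    val.into()
}

fn main() {
    println!("{}", with_impl("a"));
    println!("{}", with_generic("b"));
    println!("{}", with_where(String::from("c")));

    // Only the named-parameter forms allow naming the type explicitly.
    println!("{}", with_generic::<&str>("d"));
}
```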
Nice! I will sleep tonight less idiot! 😁
oui
ah yes, “hello”.to_string()
What the hell
This literally looks like chaos… why does a string need to be a reference type??
it doesnt have to be, thats just one way to do it. Sometimes having the tools to precisely describe your intent to the compiler is very useful. Can you not imagine a situation where you want to reference a string?
Im going to add on to this because I'm a few months into Rust and about a year into self taught programming in general. Teaching builds comprehension
Basically, T (A generic symbol to represent a type), &T, and &mut T are all different types. If I own a car, and you want to take a picture of it (as a reference), you don't own my car. If you want to store your reference, it will obviously take much less space than storing the owned version that I have (the physical car). If the storage company is managed by the compiler, saying "I'm dropping off my car later" just isn't enough information. If I decide to paint my car, it invalidates the picture you have. We can get around this by writing a contract that says "Your picture will match my car until the end of this contract's lifetime".
There is still a ton of information that can be done with your reference, you can even perform some similar actions on it as the physical car. You could say get_car_model and get the same result.
The better terms of the analogy would be if your reference is the address at which the car is stored, and the contract being a guarantee the identical car will be stored there until its lifetime expires.
Not having the option to define intent as "reference to T" forces assumptions... Assumptions suck if you want a safe and efficient runtime. That is why a string literal needs to be a reference type with a static lifetime (&'static str). That is my understanding of it at least.
Because a string literal is copied by the compiler into (read-only?) data memory of the executable, and therefore is best represented as a reference, unless you want to allocate memory (which you control/own) and copy that literal over so you can now modify the characters in the copy (which .to_string() does).
I suppose order might seem like chaos to those ignorant of the choices made for them in other languages.
This is just wrong. What you did can be considered one of the worst examples to showcase the `From/Into` traits.
Can you elaborate? “impl Into<String>” as a function argument is a well-known pattern in Rust. Nothing new here.
Btw, this was not about showcasing the ‘From/Into’ traits. It was about showcasing the value of impl Into<String> as a function argument.