An Optimization That Is Impossible In Rust

  • Published 1 Jan 2025

COMMENTS • 570

  • @SaHaRaSquad
    @SaHaRaSquad 3 months ago +257

    There are at least two Rust libraries with this exact optimization, for strings and vectors respectively. Also Rust lets you embed assembly directly in the code, in case you need one of the few niche things not supported by unsafe blocks. Some things may not lead to "clean" code but anyone who says something isn't possible in Rust probably never checked.

    • @Ornithopter470
      @Ornithopter470 3 months ago +22

      All computable problems can be handled by Rust. Same is true of JavaScript. And every Turing complete language.

    • @ClimateChangeDoesntBargain
      @ClimateChangeDoesntBargain 3 months ago

      @@Ornithopter470 try doing the same with javascript

    • @krumbergify
      @krumbergify 3 months ago +10

      Exactly. You can write a super-optimized datastructure using inline assembly or regular unsafe code and give it a safe API.
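A minimal sketch of the "unsafe inside, safe API outside" pattern described above. The type `SmallStack` and its methods are hypothetical names invented for illustration; the point is only that callers never need to write `unsafe` themselves.

```rust
use std::mem::MaybeUninit;

/// Hypothetical fixed-capacity stack that keeps its items inline.
pub struct SmallStack<T, const N: usize> {
    slots: [MaybeUninit<T>; N],
    len: usize, // invariant: slots[..len] are initialized
}

impl<T, const N: usize> SmallStack<T, N> {
    pub fn new() -> Self {
        SmallStack {
            slots: std::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    /// Returns the value back to the caller when the stack is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.slots[self.len].write(value);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // SAFETY: every slot below the old `len` was initialized by `push`,
        // and decrementing `len` first ensures the slot is read only once.
        Some(unsafe { self.slots[self.len].assume_init_read() })
    }
}

impl<T, const N: usize> Drop for SmallStack<T, N> {
    fn drop(&mut self) {
        // Pop the remaining items so their destructors run.
        while self.pop().is_some() {}
    }
}
```

The single `unsafe` block is guarded by the `len` invariant, which only the methods of the type can change.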

    • @comradepeter87
      @comradepeter87 3 months ago +53

      @@Ornithopter470 I agree, but this isn't a computability problem, this is a memory layout problem. Unless, of course, you first simulate memory in your language and then do this; then it is possible in all languages. (Example: you can't do this in JavaScript, simply because you can't manipulate memory.)

    • @零云-u7e
      @零云-u7e 3 months ago

      @@comradepeter87 ya, this is what I thought about strings. It was an immediate "oh" feeling, getting limited. I have a pointers-in-C book. Didn't stop to nail it down, but it's the feature set of a language that decides this advantage vs. disadvantage, and I cringe at having to do code tricks or pull in a secondary library. I wanted to do this thing, not realizing what I had stepped into because I'm still learning Rust, but I'm annoyed by it, by strings in general. Now it has me thinking about embedded optimizations.

  • @rumplstiltztinkerstein
    @rumplstiltztinkerstein 3 months ago +120

    Hey Prime. A PhantomData is simply a type marker for the compiler. Let's say we have a trait that has functions that work with a certain generic type D. We want to apply that trait to a new struct we created, but that struct doesn't store anything of type D. Like this:
    pub struct Something<D> {
    ...
    }
    The struct has the generic parameter D, but none of its fields use D. The compiler rejects this, because it cannot tell how the struct relates to D. In that case we add a PhantomData field that mentions D:
    pub struct Something<D> {
    marker: PhantomData<D>,
    ....
    }
    PhantomData is just there to fulfill the rules of the compiler; it is zero-sized and will not exist in the compiled binary.
    Edit: Here is an example of a use for this. I once created a Worker struct for running custom functions in multi-threaded environments. It requires a trait for the parameter of the function, and another for the function it is going to run. It stores the parameters of the function, so that is okay. It doesn't store any field that implements the trait of the function it is going to execute. Due to that, I needed to add a PhantomData for the function it is going to execute.
    Edit: For those unfamiliar with Rust: traits are interfaces, or, in simpler terms, a "class" with only functions. A generic is a type that implements ("inherits") a certain trait ("interface").
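A rough sketch of the Worker situation described in this comment, not the commenter's actual code. The names `Job`, `Worker`, and `Double` are hypothetical. The worker stores the parameter `P` but never stores a value of the job type `J`, so it needs a `PhantomData<J>` field.

```rust
use std::marker::PhantomData;

/// Hypothetical trait for "the function the worker is going to run".
pub trait Job<P> {
    fn run(param: &P) -> String;
}

/// `P` is stored, `J` is not. Without the `_job` field the compiler
/// rejects this struct with error E0392 (type parameter `J` is never used),
/// because appearing in a bound does not count as a use.
pub struct Worker<P, J: Job<P>> {
    param: P,
    _job: PhantomData<J>,
}

impl<P, J: Job<P>> Worker<P, J> {
    pub fn new(param: P) -> Self {
        Worker { param, _job: PhantomData }
    }

    pub fn execute(&self) -> String {
        // The job is selected purely at the type level.
        J::run(&self.param)
    }
}

/// Example job: doubles an integer and formats it.
pub struct Double;

impl Job<i32> for Double {
    fn run(param: &i32) -> String {
        (param * 2).to_string()
    }
}
```

With these definitions, `Worker::<i32, Double>::new(21).execute()` runs `Double::run` on the stored parameter.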

    • @williamdrum9899
      @williamdrum9899 3 months ago +2

      I don't get it

    • @StingSting844
      @StingSting844 3 months ago +18

      @@rumplstiltztinkerstein rust will never beat the complexity allegations

    • @rumplstiltztinkerstein
      @rumplstiltztinkerstein 3 months ago +7

      @@williamdrum9899 understanding traits and generics is required to understand PhantomData 😕

    • @LtdJorge
      @LtdJorge 3 months ago +16

      @@williamdrum9899 it doesn't have a size or anything. It's just used because you want to put a generic bound in a function or trait definition, but Rust doesn't let you declare a type parameter and then not use that type. So if you define struct Vec<T> { ... }, you have to use T inside the struct. That's why you put in a field like phantom: PhantomData<T>, and now T is in use.
      The phantom field is a zero-sized type; you just initialize it like phantom: PhantomData, and that's it. It has no use apart from giving use to the parameter.
      There are more complex things that can be done with PhantomData, like using a PhantomData of a raw pointer type (for example PhantomData<*const T>) to denote that the struct shouldn't be Send, like Rc. But PhantomData always has a size of 0.
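The zero-size claim is easy to check. The sketch below uses hypothetical types `Id` and `NotSend`; the `PhantomData<*const ()>` field is one common way to opt out of `Send` and `Sync`, shown here as an assumption about what the comment had in mind.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical ID type tagged with the kind of thing it identifies.
pub struct Id<T> {
    raw: u32,
    _kind: PhantomData<T>,
}

impl<T> Id<T> {
    pub fn new(raw: u32) -> Self {
        // The field is initialized with the bare value `PhantomData`.
        Id { raw, _kind: PhantomData }
    }

    pub fn raw(&self) -> u32 {
        self.raw
    }
}

/// Raw pointers are neither Send nor Sync, and PhantomData<X> inherits
/// those auto traits from X, so this struct is neither as well.
pub struct NotSend {
    _no_send: PhantomData<*const ()>,
}

/// Size of the marker on its own.
pub fn marker_size() -> usize {
    size_of::<PhantomData<String>>()
}

/// Size of a struct that contains the marker next to a u32.
pub fn id_size() -> usize {
    size_of::<Id<String>>()
}
```

`marker_size()` is 0 and `id_size()` equals the size of the `u32` alone, so the marker costs nothing at run time.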

    • @rumplstiltztinkerstein
      @rumplstiltztinkerstein 3 months ago +1

      @@LtdJorge 👏👏👏

  • @Aras14
    @Aras14 3 months ago +7

    In this case PhantomData is there mostly because of variance: the struct is now covariant in T because PhantomData<T> is naturally covariant in T. To explain variance:
    There exist three types of variance.
    Covariant<T>: 'a if T: 'a // a restriction in T leads to a restriction in Covariant
    Invariant<T> if T: 'a // a restriction in T does nothing to Invariant
    'a: Contravariant<T> if T: 'a // a restriction in T loosens restrictions in Contravariant
    Examples on how to make those types:
    struct Covariant<T>(T);
    struct Invariant<T>(*mut T);
    struct Contravariant<T>(fn(T));
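One way to see variance in action is to ask the compiler to convert between lifetimes. This is a sketch with hypothetical function names `shorten` and `lengthen`; it assumes the three wrapper structs are defined with a type parameter as `(T)`, `(*mut T)`, and `(fn(T))`.

```rust
/// Covariant in T: holds a T directly.
pub struct Covariant<T>(pub T);

/// Invariant in T: `*mut T` allows no lifetime conversion at all.
pub struct Invariant<T>(pub *mut T);

/// Contravariant in T: T appears only as a function argument.
pub struct Contravariant<T>(pub fn(T));

/// Accepted because Covariant<&'long str> is a subtype of
/// Covariant<&'short str> whenever 'long outlives 'short.
pub fn shorten<'short, 'long: 'short>(
    c: Covariant<&'long str>,
) -> Covariant<&'short str> {
    c
}

/// Accepted in the opposite direction: a function that can handle any
/// short-lived &str can certainly handle a longer-lived one.
pub fn lengthen<'short, 'long: 'short>(
    c: Contravariant<&'short str>,
) -> Contravariant<&'long str> {
    c
}

// The equivalent function for Invariant does not compile in either
// direction, which is exactly what invariance means:
// fn convert<'s, 'l: 's>(c: Invariant<&'l str>) -> Invariant<&'s str> { c }
```

Swapping the lifetimes in `shorten` or `lengthen` makes the compiler reject them, which is a quick way to test the variance of any wrapper type.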

  • @CjqNslXUcM
    @CjqNslXUcM 3 months ago +61

    Liking the return keyword is such a procedural-brained opinion. "Everything is an expression" is so clean. We urgently need to put Prime in another OCaml reeducation camp.

    • @yjlom
      @yjlom 3 months ago +2

      right? return is just goto except it doesn't look as ugly to most people
      (to be clear I think goto can be useful, but it should be the last option you think about)

    • @stretch8390
      @stretch8390 3 months ago

      @@yjlom interesting, had never thought of return being like goto.

    • @lightning_11
      @lightning_11 3 months ago

      Looks like the only solution that pleases everyone is to automatically return the result of the final statement in a function to prevent the need for return.

    • @realtimberstalker
      @realtimberstalker 3 months ago +1

      @@yjlom Return isn’t like goto, it is goto. It's just that the destination is set to the code immediately after the function call, instead of being dynamic.

    • @MrMeltdown
      @MrMeltdown 3 months ago

      @@stretch8390 it’s computed from the caller… so it also indicates when defer or local stack variables get destroyed… so it’s not really like goto at all…

  • @cmilkau
    @cmilkau 3 months ago +4

    It's kind of funny that a German has to tell you this :P There is no need for a bit to discern the two different layouts for short and long strings, because both layouts share the length field, and the length of the string determines the layout. If length ≤ 12, it's the short string layout, else it's the long string layout.
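A sketch of the layout rule from this comment: the length field alone selects between the short and the long representation, so no tag bit is needed. `GermanStr` and `Tail` are hypothetical names, and this version simply owns its heap buffer; it does not attempt the reference counting discussed in the article.

```rust
/// Last 8 bytes of the 16-byte value: inline bytes or a heap pointer.
#[repr(C)]
union Tail {
    inline: [u8; 8], // bytes 4..12 of a short string
    ptr: *mut u8,    // heap buffer holding all bytes of a long string
}

/// Hypothetical 16-byte string; `len` alone selects the layout.
#[repr(C)]
pub struct GermanStr {
    len: u32,
    prefix: [u8; 4],
    tail: Tail,
}

impl GermanStr {
    pub fn new(s: &str) -> Self {
        let bytes = s.as_bytes();
        assert!(bytes.len() <= u32::MAX as usize, "string too long");
        let mut prefix = [0u8; 4];
        let n = bytes.len().min(4);
        prefix[..n].copy_from_slice(&bytes[..n]);
        let tail = if bytes.len() <= 12 {
            // Short layout: the remaining bytes live inside the value.
            let mut inline = [0u8; 8];
            if bytes.len() > 4 {
                inline[..bytes.len() - 4].copy_from_slice(&bytes[4..]);
            }
            Tail { inline }
        } else {
            // Long layout: the whole string lives on the heap.
            let boxed: Box<[u8]> = bytes.into();
            Tail { ptr: Box::into_raw(boxed) as *mut u8 }
        };
        GermanStr { len: bytes.len() as u32, prefix, tail }
    }

    pub fn is_inline(&self) -> bool {
        self.len <= 12
    }

    /// Copies the bytes out; enough to show both layouts round-trip.
    pub fn to_vec(&self) -> Vec<u8> {
        let len = self.len as usize;
        if self.is_inline() {
            let mut out = Vec::with_capacity(len);
            out.extend_from_slice(&self.prefix[..len.min(4)]);
            if len > 4 {
                // SAFETY: `len <= 12` means `inline` is the active field.
                let inline = unsafe { self.tail.inline };
                out.extend_from_slice(&inline[..len - 4]);
            }
            out
        } else {
            // SAFETY: `len > 12` means `ptr` points at `len` heap bytes.
            unsafe { std::slice::from_raw_parts(self.tail.ptr, len).to_vec() }
        }
    }
}

impl Drop for GermanStr {
    fn drop(&mut self) {
        if !self.is_inline() {
            // SAFETY: long strings came from a Box<[u8]> of exactly `len` bytes.
            unsafe {
                let raw = std::ptr::slice_from_raw_parts_mut(self.tail.ptr, self.len as usize);
                drop(Box::from_raw(raw));
            }
        }
    }
}
```

On a 64-bit target the struct is 16 bytes: a 4-byte length, a 4-byte prefix, and an 8-byte tail.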

  • @Baltasarmk
    @Baltasarmk 3 months ago +162

    Primogen: I do not comment on politics
    Also Primogen: German strings!

    • @highdefinist9697
      @highdefinist9697 3 months ago +4

      I can understand it somewhat in the case of some more complicated names, but "Neumann-Freitag-Strings" would have been pronounceable even for your average clichéd lazy American. No need to call it "German strings" (although to be fair, it is also kind of funny to call them that, and might help with clickbaiting and all that).

    • @Nicoder6884
      @Nicoder6884 3 months ago +6

      @@highdefinist9697 The article called them German strings. Not prime's fault.

    • @highdefinist9697
      @highdefinist9697 3 months ago

      @@Nicoder6884 Well... not entirely at least, yeah.

    • @snippletrap
      @snippletrap 3 months ago +8

      How is that political

    • @pattyspanker8955
      @pattyspanker8955 3 months ago

      @@snippletrap N word? But not that n word. The fascist kind.

  • @robinmoussu
    @robinmoussu 3 months ago +30

    Compared to implementing short string optimisation in C++, it doesn’t seem that much more complicated. There is some noise because of PhantomData, and because C++ pointers are thin by default while Rust slice pointers are fat by default, which requires a bit more casting. In C++ you still need to take care of the constructors and assignment operators (copy, move, default), raw unions, pointer casting and all that. I do agree that here Rust is a bit more complicated, but not by much.

    • @stdprocedure
      @stdprocedure 3 months ago

      I agree. Both have trade offs

    • @stevenhe3462
      @stevenhe3462 3 months ago +3

      The PhantomData asserts that the struct is covariant in its generic argument, so it does give more type-level guarantees.
      The casting is unnecessary: you can just use raw pointers and allocate bytes directly. It does give better type safety, though.

  • @bearwolffish
    @bearwolffish 3 months ago +24

    Shit got C++ template-ish real quick.
    Also both C++ and Rust are really good languages.

    • @asdf1616
      @asdf1616 3 months ago +7

      C++ templates are awful. Rust generics have parametric polymorphism like ML languages. C++ templates are more similar to Rust macros, because they are expanded for each type that they are applied to and they only fail when expanded.

    • @ITSecNEO
      @ITSecNEO 3 months ago +1

      @@asdf1616 You forgot to add that C++ is awful in general. Indeed, it's such a messy language that nearly every company has its own style guide with its own C++ subset. Google, for example, has not released a Rust style guide yet; the answer from a staff member was that there is no need for one since Clippy and rustfmt are already there.

    • @KayOScode
      @KayOScode 3 months ago +11

      @@asdf1616 How are templates awful? They're extraordinarily powerful.

    • @TheSulross
      @TheSulross 3 months ago +31

      Saying positive remarks about other programming languages is at the top of the no-no list in the Rust Zealotry handbook

    • @TheSulross
      @TheSulross 3 months ago +15

      @@asdf1616 the way C++ templates work (vs, say, type erasure style generics) allows compiler visibility into all the code in respect to the types and can do deeper optimization. And template features allow for more optimal coding such as, say, constructing objects in place when adding said object to a container.

  • @eemanemm
    @eemanemm 3 months ago +11

    Would be nice to see the asm generated by this, compared to a C implementation of the same...

    • @ElementaryWatson-123
      @ElementaryWatson-123 3 months ago +2

      Assembler is generated by the back end. LLVM is written in C++, so as long as front end isn't screwing up, the result should be the same.

    • @LtdJorge
      @LtdJorge 3 months ago +6

      @@ElementaryWatson-123 LLVM being written in C++ doesn't have much to do with that. The important thing is not the language it's written in, but what it emits as output. As you said, if both frontends emit the same IR, then LLVM will emit the same machine code.

  • @monad_tcp
    @monad_tcp 3 months ago +15

    27:46 that's why it's German, IT'S OVERENGINEERED

  • @zactron1997
    @zactron1997 3 months ago +29

    Here's the interesting thing about Rust compared to Zig, C, C++, etc.: you can be very productive in Rust without ever learning how to do the hard things. I didn't need to use the unsafe keyword until my third year of working in Rust, after shipping several pieces of tooling to my team.
    So yeah, if you're the kind of 10× Savant that makes everything from scratch I could see why you might hate Rust, since it does put down more rules than other languages. But as a user of Rust who only works on a handful of libraries, it's amazing. As Prime has said before, writing Axum sucks, but using Axum is amazing. Likewise for Serde, Rust Iterators, etc. etc.

    • @stevenhe3462
      @stevenhe3462 3 months ago +1

      Unsafe is not the hard thing. The hard thing is Trait, which enables OOP, which enables building abstraction hell.

    • @zactron1997
      @zactron1997 3 months ago +4

      @@stevenhe3462 Rust doesn't give you full OOP for abstraction hell, since there's no inheritance. All you have is composition (structs in structs, same as C) and traits (interfaces). Look at the Clone trait for example, there's no abstraction hell here, it's just a label and consistent function name for deeply cloning a piece of data. In fact, most traits are so simple and explicit in how they work that the compiler can just implement them for you via derive.

    • @josephmellor7641
      @josephmellor7641 3 months ago +13

      The big problem with C++ is that everyone throws the hard things at you immediately because it's taught like a mix of C and Java. I've written entire projects where I don't do any manual memory management because I can just use vector, unordered_map, etc.

    • @zactron1997
      @zactron1997 3 months ago

      @@josephmellor7641 Which is also why I find it so frustrating that C++ has such massive footguns built into its type system. Mutex locks not being a container, use-after-move being valid, etc.
      The concept behind C++ is great, that's why I like Rust so much. It's what I like from C++ with modern restrictions. Totally reasonable to not have a borrow checker invented in the 80s, but it's 2024 now, our standards should be higher.

    • @stretch8390
      @stretch8390 3 months ago

      @@josephmellor7641 even if you don't manually manage memory in C++, it's still so dang easy to segfault in that language that it pains me.

  • @haniyasu8236
    @haniyasu8236 3 months ago

    Imma be honest, I rolled my eyes a little bit when I saw the guy say this was impossible in Rust. Not only is it possible, it's easy, and the end result fits into the language way better than it would have in C or C++. The only reason it got close to bad was because this wasn't _just_ strings, it was pretty much a re-implementation of Arc _on top of_ the small-string optimization.

    • @anotherelvis
      @anotherelvis 3 months ago

      The blog post describes how he solved the problem with unsafe Rust code and published a library.

  • @guckstift1081
    @guckstift1081 3 months ago +3

    I thought German strings are like: "Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz"

  • @Yotanido
    @Yotanido 3 months ago +16

    If you add a reference to a struct, the struct gets a lifetime, and you suddenly need lifetimes all over the place. It's true.
    I'm hesitant to say this is a problem, though. Every single time I thought it was a good idea to add a reference to a struct, it turned out to be a bad idea and I undid everything.
    There are exceptions, of course, but unless the struct is designed with a lifetime in mind from the start (for example tree sitter's TreeCursor) it's probably not the correct approach in the first place.

    • @sigmundwong2489
      @sigmundwong2489 3 months ago

      You may be interested to know that Graydon Hoare originally envisioned & as a "second-class parameter passing mode", and "not a first-class part of the type system" (paraphrasing). That would mean that you could not even put it in a struct if you wanted. Also, AFAICT, Mojo does things exactly this way.

    • @christopher8641
      @christopher8641 3 months ago

      Just want everyone to know that a struct that is generic over T can hold a reference. Notice how the definition of a Vec doesn't have lifetimes. But, for some reason I can have a Vec of references. I feel like everyone with strong opinions about rust has just literally never used it
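To illustrate the point about generic structs holding references, here is a small sketch. `Stack` and `last_word` are hypothetical names; the struct definition has no lifetime parameter, yet it can be instantiated with `T = &str`.

```rust
/// A struct generic over T, with no lifetime parameter in its definition.
pub struct Stack<T> {
    items: Vec<T>,
}

impl<T> Stack<T> {
    pub fn new() -> Self {
        Stack { items: Vec::new() }
    }

    pub fn push(&mut self, item: T) {
        self.items.push(item);
    }

    pub fn top(&self) -> Option<&T> {
        self.items.last()
    }
}

/// Here T is `&str`: the borrow's lifetime travels inside the type
/// argument, so `Stack` itself never has to mention it.
pub fn last_word(text: &str) -> Option<&str> {
    let mut words: Stack<&str> = Stack::new();
    for word in text.split_whitespace() {
        words.push(word);
    }
    // Copy the inner &str out so the result borrows from `text`,
    // not from the local `words`.
    words.top().copied()
}
```

The borrow checker still tracks the lifetime of every `&str` stored in the stack; it is simply part of `T` rather than part of the struct's signature.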

    • @GoodWill-s8j
      @GoodWill-s8j 3 months ago

      You're wasting time and brain. Just use C++!

    • @ElementaryWatson-123
      @ElementaryWatson-123 3 months ago

      References in structs can be very useful. As an example, tie() in C++ creates a tuple of references, so one line, "std::tie(a,b,c) = std::tie(d,e,f);", provides an assignment of multiple variables.

  • @porky1118
    @porky1118 3 months ago +2

    34:30 Lifetimes work well in Rust. The trick is just not to use them at all if possible. Or only in structs.
    And when you use a lifetime in a struct, there has to be a good reason for it.
    The best reason I came across is when you want to create some temporary type T, which has mutable access to some other type O, forbidding the user to access the value of type O as long as a value of type T exists.
    In my case it was some edit struct T, which enables editing O safely. Because after editing, the states have to be updated.
    So the safe way to edit is using a value of type T. And the drop method of T updates the states of O. And only after T has been dropped, the user can access O directly again, which will be in a safe state again.
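A sketch of the edit-guard pattern described above, with hypothetical names: `Document` plays the role of O and `Edit` the role of T. While an `Edit` exists it holds the only mutable borrow, and its `Drop` impl brings the derived state back in sync.

```rust
/// The "O" type: `line_count` is derived state that must match `lines`.
pub struct Document {
    lines: Vec<String>,
    line_count: usize,
}

/// The "T" type: a temporary editor holding exclusive access.
pub struct Edit<'a> {
    doc: &'a mut Document,
}

impl Document {
    pub fn new() -> Self {
        Document { lines: Vec::new(), line_count: 0 }
    }

    /// While the returned guard is alive, the borrow checker forbids
    /// any other access to this document.
    pub fn edit<'a>(&'a mut self) -> Edit<'a> {
        Edit { doc: self }
    }

    pub fn line_count(&self) -> usize {
        self.line_count
    }
}

impl<'a> Edit<'a> {
    /// Edits may leave the derived state stale until the guard is dropped.
    pub fn push_line(&mut self, line: &str) {
        self.doc.lines.push(line.to_string());
    }
}

impl<'a> Drop for Edit<'a> {
    fn drop(&mut self) {
        // Restore the invariant before the document becomes reachable again.
        self.doc.line_count = self.doc.lines.len();
    }
}
```

Calling `doc.line_count()` while the guard is still in scope is a compile error, which is the guarantee the comment is after.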

  • @whotherelack-yi5uw
    @whotherelack-yi5uw 3 months ago +10

    The whole article is just kind of wrestling with the semantics of the language lol

    • @Reydriel
      @Reydriel 2 days ago

      If you dive into the implementation code of the various data types you'd commonly use in std::collections, it will all look like this (even worse actually) lol
      It's what you have to do in Rust if you want to make a library that is as generic as possible with as many useful traits as possible

  • @pif5023
    @pif5023 3 months ago +21

    This is the kind of stuff I want to mess with in low-level langs. I am so tempted to switch to Zig or back to C instead of continuing with advanced Rust. The indecision is excruciating. I guess how I feel coming back to Rust in a few months will tell me what to do.

    • @funkdefied1
      @funkdefied1 3 months ago +30

      Clickbait title (the article’s). This optimization is possible. It just doesn’t happen to be implemented in the standard library

    • @Gigasharik5
      @Gigasharik5 3 months ago +4

      I love zig but it never ceases to bother me that zig today is a very immature language. They have yet to rewrite the backend without LLVM

    • @hanifarroisimukhlis5989
      @hanifarroisimukhlis5989 3 months ago +2

      @@Gigasharik5 Yep, I bet it'll take at least another year for Zig to be production-ready. Especially with async and proper functional templates rather than imperative comptime.

    • @sbdnsngdsnsns31312
      @sbdnsngdsnsns31312 3 months ago

      All of those libraries use unsafe.

    • @anotherelvis
      @anotherelvis 3 months ago +1

      @@SaHaRaSquad The blogpost author also wrote a crate that solves the problem.

  • @hanifarroisimukhlis5989
    @hanifarroisimukhlis5989 3 months ago +3

    Cargo geiger must be beeping harder than Chernobyl lmao.
    Also there's a nightly feature for better fat/thin pointer shenanigans.

  • @ANONAAAAAAAAA
    @ANONAAAAAAAAA 3 months ago +61

    The more feature-rich and abstract a language gets, the more difficult it becomes to micromanage hardware.
    This is a fundamental trade-off no language can escape.

    • @ITSecNEO
      @ITSecNEO 3 months ago +23

      @@ANONAAAAAAAAA Said no one ever (besides you ofc).
      C lacks features and abstraction, so C devs just develop their own abstractions with macros. This is just bad, because everyone does macros in a different way. So your comment is just wrong. Have you ever looked at C hardware code? It's a pain to work with; a language with more abstraction would solve some C problems directly. Rust is perfectly suited for this. It's so refreshing that the compiler tells me about nearly all my bugs at compile time, and also about memory issues. This is sooooo nice, especially when dealing with custom syscalls or shared memory between two privilege modes in general.

    • @rocapbg9518
      @rocapbg9518 3 months ago +16

      @@ITSecNEO What the fuck are you talking about?

    • @ITSecNEO
      @ITSecNEO 3 months ago

      @@rocapbg9518 If I have to guess, I talk about things that lil kids like you obviously don't understand. Right?

    • @jhinseng5271
      @jhinseng5271 3 months ago +17

      @@ITSecNEO seriously though, what the fuck are you talking about?

    • @enricosevenfoldism
      @enricosevenfoldism 3 months ago +4

      @@ITSecNEO the fuck you talking about?

  • @-syn9
    @-syn9 3 months ago +2

    I think what they really mean is that you can't make a string where the pointer points to an internal buffer, because all data structures in Rust must be movable via memcpy.

    • @asdfghyter
      @asdfghyter 3 months ago

      but that’s not used in german strings as far as i understood?

    • @anotherelvis
      @anotherelvis 3 months ago

      The author of the blogpost wrote a library that solves the problem in Rust. The blog title is just click bait, but Prime fell for it.

  • @yearswriter
    @yearswriter 3 months ago +2

    I really like those type of videos.

  • @MrMeltdown
    @MrMeltdown 3 months ago +3

    24:19 I mean you could just use string from c++ standard lib which has this built in… it’s unsafe but so is this rust but the rust code has been hand rolled…
    Any way… I’ll just keep writing Perl…

  • @SJohnTrombley
    @SJohnTrombley 3 months ago +9

    Thanks for making me waste a day thinking about that stupid resistor grid problem.

    • @MrMeltdown
      @MrMeltdown 3 months ago +2

      QED

    • @asdfghyter
      @asdfghyter 3 months ago

      congrats on getting nerd sniped!

  • @williamdrum9899
    @williamdrum9899 3 months ago +3

    And this whole time I thought a "fat pointer" was a segment:offset pairing, x86-16 style.

    • @niless3528
      @niless3528 3 months ago

      Thank you for your service.

  • @CjqNslXUcM
    @CjqNslXUcM 3 months ago +4

    What? Polars, the Python library, is written in Rust. They are not two separate things.

  • @rnts08
    @rnts08 3 months ago +28

    Look at what they have to do just to emulate a fraction of .

    • @corvoworldbuilding
      @corvoworldbuilding 3 months ago +9

      I read this with Tsoding's voice.

    • @braineaterzombie3981
      @braineaterzombie3981 3 months ago

      Python

    • @RustIsWinning
      @RustIsWinning 3 months ago

      Good one. Rust is still better tho

    • @johndoe2-ns6tf
      @johndoe2-ns6tf 3 months ago

      @@RustIsWinning fanatic cultist. take your veganism arrogance and stick it where the sun doesn't shine.

  • @eemanemm
    @eemanemm 3 months ago +4

    Looking at this makes me love C and C++...

  • @stevenhe3462
    @stevenhe3462 3 months ago +1

    You can implement these things in Rust in a C way and it would be easier. Just use *mut and transmute everywhere. Doing it the way the article did is much safer due to the type system guarantees.

  • @afinewhitehorse
    @afinewhitehorse 3 months ago +6

    Reminds me of strings in Pascal

  • @szirsp
    @szirsp 2 months ago

    I agree that it is interesting that this is possible. And I am certain that it has some use cases, otherwise why would anyone go through this?
    But this seems like a super domain-specific optimization to me.
    Yeah, it's nice to have the 4-byte prefix for string comparisons (except for authentication and password hash comparison; don't do that, don't leak information to side-channel attacks), but you could also have an array of prefixes where the memory layout is much denser, so you can look up strings and compare many strings in parallel... and it's much easier to change how long you want the prefixes to be.

  • @theevilcottonball
    @theevilcottonball 3 months ago +6

    Spoiler: The article title is wrong.

  • @redcrafterlppa303
    @redcrafterlppa303 3 months ago

    17:45 I'm not sure about some optimizations, but from an assembly POV requiring a fixed size on the stack is completely arbitrary. Subtracting a register value from the stack pointer is a valid operation.
    The only problem I see is that this makes stack overflows easier, and accessing something below the dynamic stack object might be more expensive, as the length of the dynamic object needs to be calculated to get the address of the next variable. With only constant-sized types, every offset is known at compile time.

  • @DebFaith-q9y
    @DebFaith-q9y 2 months ago

    Please do more rundowns like this with implementations such as TCP, UDP, quick memory management, and others such as storage code, because they are very informative talks.

  • @JFRA24
    @JFRA24 3 months ago +1

    gotta optimize that Rust weapon handling

  • @isaacyonemoto
    @isaacyonemoto 3 months ago +6

    Zig has many strings. []u8, []const u8, [:0]u8, [n]u8, [n:0]u8, [n:0]u8, []u16 (WTF-16), [:0]u16, comptime [n:0]u8, etc etc etc

    • @brod515
      @brod515 3 months ago +2

      zig has no strings

    • @pierreollivier1
      @pierreollivier1 3 months ago +7

      Technically Zig has no strings; the types you mentioned are slices of bytes or arrays of bytes, nothing more than that.

    • @RustIsWinning
      @RustIsWinning 3 months ago

      Interesting. Now can you explain what each of these types mean?

    • @brod515
      @brod515 3 months ago +9

      @@RustIsWinning we can but we can tell from your name... you are a troll.

    • @RustIsWinning
      @RustIsWinning 3 months ago +1

      @@brod515 I'm genuinely asking that's why I said "Interesting". Just because I like one language doesn't mean I won't accept others. You think I have a job in Rust? lol

  • @jamlie977
    @jamlie977 3 months ago +93

    This video is not approved by the Rust Foundation

    • @alexander53
      @alexander53 3 months ago +7

      watch the vid first

    • @jamlie977
      @jamlie977 3 months ago +2

      @@alexander53 i know lol

    • @2xsaiko
      @2xsaiko 3 months ago +5

      This video is approved by the DreamBerd Foundation

    • @ITSecNEO
      @ITSecNEO 3 months ago +6

      @@jamlie977 this artificial Rust hate is funny. I wonder if people realize that nearly every language of the past had some drama involved. Everyone who decides against a language because of some language drama is just a trash dev lol. Grow up

    • @jamlie977
      @jamlie977 3 months ago +7

      @@ITSecNEO bro i love rust, i use it sometimes when i want to have fun
      maybe know satire better and grow up

  • @rainbain5474
    @rainbain5474 3 months ago +1

    It's funny how everything in programming has some weird name. I've been making German strings for years but I did not know their official name.
    You can do the same trick for many things that are not strings, too.

  • @thenightcorecrafter
    @thenightcorecrafter 2 months ago

    Nerds snipe each other but nobody dares sniping the nerds

  • @ronakmehta8106
    @ronakmehta8106 3 months ago

    @ThePrimeTime how would this check equality if the same big string lives behind two separate pointers? Doesn't it check whether the ptr is the same and, if not, say they are not equal? Or am I missing some detail here?

  • @tekneinINC
    @tekneinINC 3 months ago +1

    13:02 ObjC mentioned!? 😂😂😂

  • @nescafezos4265
    @nescafezos4265 3 months ago

    I hope we see a reaction video about (c/c++) arena memory allocation method (which Casey was talking about) ^^

  • @rhbvkleef
    @rhbvkleef 3 months ago

    The polars you are talking about are both the same. It's a data library for rust, but also has (excellent) bindings for Python

  • @porky1118
    @porky1118 3 months ago

    28:24 I just disabled this lint altogether. I guess it isn't even enabled by default when using clippy.

  • @swannie1503
    @swannie1503 3 months ago +11

    Hard truth: the lifetimes are always there. Just because they’re elided most of the time doesn’t mean they aren’t there. You just don’t like having to define explicit lifetimes when you have more than one in a scope.
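A short sketch of lifetime elision, using hypothetical function names. The first two functions have identical signatures as far as the compiler is concerned; the third has two reference inputs, so the relationship to the output has to be written out.

```rust
/// Elided form: one reference in, so the output borrows from it.
pub fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

/// The same signature with the lifetime written out explicitly.
pub fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

/// Two reference inputs: elision cannot pick one, so the lifetime
/// must be named for this to compile.
pub fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}
```

Removing `'a` from `longer` produces a "missing lifetime specifier" error, which is the case the comment refers to.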

    • @CyberDork34
      @CyberDork34 3 months ago +1

      Actually though, I'd rather be told by the compiler that I'm doing a potentially dangerous pattern (using an iterator/view/reference after the underlying object is modified, returning a pointer to a stack variable, using an object after it's been moved) so I can abort and do something else than be forced to manually annotate how long every variable lives. If I'm fighting the language so much to do, like, "just store a pointer or some small number of bytes" that I know I could do correctly in C or C++, I'd almost rather just write it, and then fix the memory issues with a sanitizer. I'd probably be done faster.
      C++ has a lifetime profile that works this way. Only issue is that the tooling for it is still really new and not yet available. I think clang has parts of it and Visual Studio has most of it.

    • @swannie1503
      @swannie1503 3 months ago

      @@CyberDork34 then use C++ if you want to write code like that lol I’m just pointing out a fact about lifetime elision.

    • @Turalcar
      @Turalcar 3 months ago

      @@CyberDork34 It ends up not being faster. Maintaining the ostensibly "done" C++ code adds up.

    • @blueghost3649
      @blueghost3649 3 months ago

      @@CyberDork34 then don’t use Rust? Some of us would rather not deal with the pain of not knowing statically whether what we’re doing is correct and having to rely on memory sanitizers.

    • @CyberDork34
      @CyberDork34 3 months ago

      @@blueghost3649 I don't

  • @Treviath
    @Treviath 3 months ago

    Are German strings similar to how python stores int values up to 256 in the pointer to that value?

  • @jankymcjangles3817
    @jankymcjangles3817 3 months ago +1

    Liked because of the gstring joke.

  • @Spartan322
    @Spartan322 3 months ago

    I really want to see Prime check out C3, not necessarily use it as a big thing yet, as it's still way more alpha than Zig, but I think, given Prime's position on Zig, he'll love C3.

  • @lancemax857
    @lancemax857 3 months ago +10

    If the reason why you code rust is because it prevents leaks. Just go back to garbage-collector land.

    • @RustIsWinning
      @RustIsWinning 3 months ago +3

      @@lancemax857 what?

    • @deltamico
      @deltamico 1 month ago

      Isn't the difference that Rust checks for leaks at compile time, so no runtime is wasted?

    • @RustIsWinning
      @RustIsWinning 1 month ago

      @@deltamico Yes resources are dropped automatically once they go out of scope but this does not prevent you from leaking resources that still live longer than it was intended. However, even a GC would not be able to help. People really have to learn how memory leaks can happen in any language with and without GC :/
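A minimal sketch of a leak in entirely safe Rust, to back up the reply above. The `Node` type and `make_cycle` function are hypothetical; the technique is a plain `Rc` reference cycle.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A node that may hold a strong reference to another node.
pub struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds a two-node cycle and returns the strong count of the first
/// node just before the local handles go out of scope.
pub fn make_cycle() -> usize {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    // Close the cycle: a -> b -> a.
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    // Two owners of `a`: the local variable and the handle inside `b`.
    Rc::strong_count(&a)
    // When `a` and `b` are dropped here, each count only falls to 1,
    // so neither node is ever freed. No `unsafe` was involved.
}
```

Replacing one of the links with `std::rc::Weak` breaks the cycle and lets both nodes be freed.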

  • @Karurosagu
    @Karurosagu 3 months ago +6

    10:59 Software engineering 500 years later: the minimum size of strings is 512MB and they are AI-powered by Skynet 😂😂😂

  • @awesomedavid2012
    @awesomedavid2012 3 months ago +112

    You don't need to optimize rust, it's already perfect and as performant as physically possible

    • @romangeneral23
      @romangeneral23 3 months ago +74

      I'll have what he's having!!!!

    • @StingSting844
      @StingSting844 3 months ago +3

      Whatchu talkin bout David

    • @rnts08
      @rnts08 3 months ago +8

      Puff puff pass buddy

    • @andguy
      @andguy 3 months ago +28

      Sarcasm doesn’t translate well over the internet - but I get you

    • @PixelThorn
      @PixelThorn 3 months ago +7

      @@andguy it does if you append with /s

  • @Wolfeur
    @Wolfeur 3 months ago

    I'm a bit confused by the prefix thingy. Considering characters could range from 1 to 4 bytes and that basically any non-ASCII character is 2 bytes long, there's a non-negligible chance that the prefix truncates in the middle of a character. How does the string compare the characters then? I assume it can't sort with any actual collation, as characters are not sorted purely by their code numbers.

    • @defeqel6537
      @defeqel6537 3 місяці тому +3

      I don't see the problem. AFAIK it's just used for a quick (in)equality comparison; if the prefixes are equal, you need to compare the full string, which contains the rest of the bytes for that character. I also suspect this is mostly used in ASCII land.
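The point in this reply can be sketched in a few lines. This assumes a 4-byte, zero-padded prefix as discussed in the video; `prefix4` and `eq_with_prefix` are hypothetical helper names, not the API of any real crate. The prefix is compared as raw bytes, so it does not matter if it cuts a multi-byte character in half: a mismatch proves the strings differ, and a match falls through to the full comparison.

```rust
/// Copies up to the first four bytes of `s` into a fixed-size, zero-padded
/// prefix. The cut may land in the middle of a multi-byte UTF-8 character.
fn prefix4(s: &str) -> [u8; 4] {
    let mut prefix = [0u8; 4];
    let bytes = s.as_bytes();
    let n = bytes.len().min(4);
    prefix[..n].copy_from_slice(&bytes[..n]);
    prefix
}

/// Equality check that rejects early on a prefix mismatch and only
/// compares the full strings when the prefixes agree.
fn eq_with_prefix(a: &str, b: &str) -> bool {
    if prefix4(a) != prefix4(b) {
        return false; // different prefixes: the strings cannot be equal
    }
    a == b // equal prefixes prove nothing yet: compare everything
}
```

For example, "abcé" and "abcè" share the prefix `a b c 0xC3` (the first byte of both accented letters), so the prefix check passes and the full comparison is what tells them apart.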

  • @JohnDoe-np7do
    @JohnDoe-np7do 3 місяці тому +2

    Yeah []const u8 is just the best string abstraction 😂😂😂

  • @Luclecool123
    @Luclecool123 3 місяці тому +1

    LLMs are just the new ctrl-f, and that's a tweet for ya ;)

  • @RandomGeometryDashStuff
    @RandomGeometryDashStuff 3 місяці тому

    08:59 diagonal screen tearing?

  • @monad_tcp
    @monad_tcp 3 місяці тому +1

    33:45 lack of dependent types, that research never gets done and put into actual use to solve this nasty problem

    • @sigmundwong2489
      @sigmundwong2489 3 місяці тому

      Ah, a fellow fan of dependent types. :) They are underappreciated for sure!

  • @alphaomega154
    @alphaomega154 3 місяці тому

    I'm not a coder, but it sounded to me like this is just a code validation procedure. So what's the difference between that and simply hashing all code running in the OS, so that, for example, only code that passes the checksum can execute? Why must it be Rust?
    Does this mean that if hackers figure out the unique pointers needed for a certain string in a vital root process, they could make their own pointer that aliases the existing code, so long as it doesn't violate the length? It sounded like it.

    • @tomtravis858
      @tomtravis858 3 місяці тому

      You would need basic understanding in CPU architecture, heap, stack, etc to get a firm grasp on what the problem is.

  • @creativecraving
    @creativecraving 3 місяці тому

    22:43 I have totally dereferenced pointers in safe Rust. Try comparing a String to an &str some time. You can dereference them both to str, and the types match for the comparison.

    • @Betacak3
      @Betacak3 3 місяці тому +4

      He's talking about what's often called a "raw pointer" in Rust. A &str is not a raw pointer.
      Try turning a reference to an int into a *const i32 with `let ptr = &my_int as *const i32` and then dereferencing it back to an int with `let my_int2 = *ptr`, and you will get a compiler error unless you wrap the second statement in an unsafe block.
      You could even create a pointer out of thin air with `let ptr = 123456 as *const i32;` and then try to dereference it into an i32. That is undefined behavior and will most likely give you a segmentation fault, which is precisely why raw pointer dereferencing is unsafe.
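The distinction drawn in this reply can be written out as a small sketch (`read_through_raw` is a hypothetical name). Creating the raw pointer is allowed in safe code; only the dereference needs an `unsafe` block.

```rust
/// Reads an `i32` through a raw pointer derived from a reference.
fn read_through_raw(value: &i32) -> i32 {
    // Casting a reference to a raw pointer is safe on its own.
    let ptr = value as *const i32;
    // Writing `*ptr` here without `unsafe` is rejected by the compiler.
    // SAFETY: `ptr` comes from a live reference, so it is non-null,
    // aligned, and points at an initialized `i32` for the whole call.
    unsafe { *ptr }
}
```

The made-up address from the reply (`123456 as *const i32`) compiles just as well, but dereferencing it breaks the safety contract above, which is exactly what the `unsafe` keyword asks the programmer to check.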

  • @redyau_
    @redyau_ 3 місяці тому +1

    Please move the chat back to the top!
    It always covers up the text you're reading.

    • @redyau_
      @redyau_ 3 місяці тому +2

      14:30 Oh wow, didn't know this was a live comment section ✨😅

  • @danser_theplayer01
    @danser_theplayer01 3 місяці тому

    I got nerd sniped to build my own, possibly bigger BigInt in javascript, that would probably trade calculation time for bigger number length in digits.
    I have some ideas but haven't started making it cause I have 10 other things I want to do at the same time.
    I got some sorta project ADHD.

  • @yapet
    @yapet 3 місяці тому +2

    Wasn’t hard to follow at all. In fact I had to do the exact same small string optimization in rust when writing my lua interpreter

  • @dfjab
    @dfjab 3 місяці тому

    Swift compiler does this for you

  • @dylan_the_wizard
    @dylan_the_wizard 3 місяці тому

    I've seen this tweet before, it's just a guy named German String

  • @aaronpolichar7936
    @aaronpolichar7936 3 місяці тому

    Is far pointer to thin anything like Fatboy Slim?

  • @Kitulous
    @Kitulous Місяць тому

    5:35 everything came from xkcd

  • @akemrir
    @akemrir 3 місяці тому

    Hmm, ok. It's kind of nice to be able to do it. Did he benchmark it?
    Unsafe is unsafe for some kind of reason, right?

    • @MatthijsvanDuin
      @MatthijsvanDuin 3 місяці тому +2

      unsafe just means you have the responsibility of making sure what you're doing is memory-safe (same as in C/C++), instead of being able to rely on the type checker. It's not an inherently bad thing, and may be necessary in some cases, but its use should generally be minimized.

  • @eclipse6859
    @eclipse6859 3 місяці тому

    Can you not just make a [u8; const N: usize] on the stack and just make sure each byte is valid utf8?

  • @JMurph2015
    @JMurph2015 3 місяці тому +8

    The most annoying thing about Rust is that traits have a bad habit of spreading all over your codebase. They get "viral": changing one thing to be generic ends up pushing everything else to be generic too, and pretty soon you have Rust graphics library stupidity where just the trait bounds are like a dozen lines by themselves.

  • @mateusvmv
    @mateusvmv 3 місяці тому +1

    wdym impossible, there's a popular crate for small strings

    • @anotherelvis
      @anotherelvis 3 місяці тому

      The author of the blog post also wrote a library.

  • @Summanis
    @Summanis 3 місяці тому +2

    One day I hope Prime learns how to read 🙏🏻🙏🏻

  • @AK-vx4dy
    @AK-vx4dy 3 місяці тому

    I have a problem with this German string...
    A UTF-8 char can have 1 to 4 bytes, so it is possible to check partial equality, but ordering is impossible IMHO.

    • @CjqNslXUcM
      @CjqNslXUcM 3 місяці тому +2

      Why not? The length encoding is the amount of 1s (bits) before the first 0. Codepoints that are larger are (obviously) longer. It will order as correctly as it can.
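A property worth adding to this reply: UTF-8 is designed so that comparing encoded bytes gives the same order as comparing code points, which is why a byte prefix can be used for ordering by code point (though not for locale-aware collation). A small sketch to check this on sample strings; `bytewise_matches_codepoint_order` is a hypothetical helper name.

```rust
/// Returns true when comparing two strings byte by byte gives the same
/// result as comparing them code point by code point.
fn bytewise_matches_codepoint_order(a: &str, b: &str) -> bool {
    let by_bytes = a.as_bytes().cmp(b.as_bytes());
    let by_code_points = a.chars().cmp(b.chars());
    by_bytes == by_code_points
}
```

Trying it on characters of different encoded lengths, such as "z" (1 byte), "é" (2 bytes), "€" (3 bytes) and "😀" (4 bytes), shows the two orderings agree, because a longer encoding always starts with a larger lead byte.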

  • @pinklife4310
    @pinklife4310 3 місяці тому +1

    That was impressive and I'm nowhere near that level of skill.

  • @AliceTurner-q3f
    @AliceTurner-q3f 3 місяці тому +20

    I've been down this route in Rust before. It's difficult but it can be done. Unsafe Rust is also far better than C once you get used to it. That's the thing with Rust: it is very similar to C++ in that it has an underlying language, C in C++'s case and unsafe Rust in Rust's. The difference is that Rust still requires thought about how the memory is being managed at all times, whereas C will happily let you do anything, including unsafe stuff, and compile. Unsafe Rust isn't perfect, and some bugs are easier to create in it than in safe Rust, but it is still safer than C.
    A bit of a tangent here: I got intrigued by your comments on Zig being simpler than Rust and safer than C. What I found was a poor experience on Windows, as the compiler didn't output errors explaining why, it just crashed. I figured Zig needs more time to cook (as you say lol). So I flirted with C again, but what I dislike about C isn't the safety stuff: on Windows and most operating systems, all memory created by a process is automatically freed at the end of the program, so in that sense memory is always reclaimed on a modern OS. But Rust helps with efficiency at runtime.
    Anyway, long story short, I prefer Rust after my journeys, even down to the error handling and verbose pattern matching. I seldom use the '?' operator, in favour of `if let Err(error) = (something that errors) . . . { return Err(error); }`. I've learned to avoid the syntactic sugar I previously used, simply because it helps with code readability. Sure, if you know what the syntactic sugar does, it's fairly simple to follow, but plain old match arms and if patterns are enough for me.
    Others may not value it like I do, but with the added memory efficiency, hygienic macros, error handling and pattern matching, Rust is my language of choice. If I'm honest, if I wasn't using Rust I'd probably use C.
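For readers unfamiliar with the trade-off described in this comment, here is a hedged side-by-side sketch. Both function names are hypothetical, and the explicit version uses a `match` rather than the commenter's exact `if let` form. The two functions behave the same when the error types match; `?` additionally converts the error through `From`, which the explicit version does not.

```rust
use std::num::ParseIntError;

/// Explicit style: the early return on error is spelled out.
fn double_explicit(input: &str) -> Result<i32, ParseIntError> {
    let n = match input.trim().parse::<i32>() {
        Ok(n) => n,
        Err(error) => return Err(error),
    };
    Ok(n * 2)
}

/// Sugared style: `?` returns early on `Err` and unwraps on `Ok`.
fn double_with_question_mark(input: &str) -> Result<i32, ParseIntError> {
    let n = input.trim().parse::<i32>()?;
    Ok(n * 2)
}
```

Which one reads better is a matter of taste, as the comment says; the behavior on inputs like "21" or "not a number" is identical.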

    • @cryptonative
      @cryptonative 3 місяці тому +2

      You summarize Rust vs Zig pretty well.
      Rust: Throws errors and message means nothing
      Zig: Throws error with no message

    • @omduggineni
      @omduggineni 3 місяці тому +2

      Yeah, the thing is unsafe rust doesn't have as much undefined behavior as C lol

    • @maleldil1
      @maleldil1 3 місяці тому +8

      I don't know why people say Zig is safer than C. They're basically the same thing: buffer overflows, use after free, memory leaks... All things that C has always done and we've recognised as problems and that Rust (mostly) avoids (leaking is safe in Rust, but it has to be done intentionally). Zig is a much more ergonomic C, but it's not safer.

    • @pierreollivier1
      @pierreollivier1 3 місяці тому +2

      As a longtime Zig user, I've never used it on Windows, but I have to agree with you on everything except memory efficiency. While Rust, like any compiled language, can be used to write very memory-efficient code, there is often a high skill barrier in front of it, and the lack of a stabilized allocator_api certainly doesn't help Rust. From my experience as mainly a C developer, I would say that Zig is a great middle ground for high-performance software. Rust can be too, but sometimes the added development time and the additional complexity required to prove to the compiler that everything is safe aren't worth the safety benefits. A lot of software doesn't need to be airtight; software that does should obviously try Rust first. As @maleldil1 said, technically both C and Zig offer the same level of "safety", but from experience I've built a strong opinion that the main reason C is so problematic is that it lacks a lot of the ergonomics of modern languages, and the tooling certainly doesn't help either.
      Zig, just like Rust, is trying to improve on that, with a slightly different approach. Rust enforces correctness with a combination of a strong type system, strong static analysis tools, and airtight semantics meant to prevent developers from compiling erroneous code. Zig does, I would say, 50% of that, and the rest requires the developer to be extra explicit, which I find to be a great middle ground: a lot of "unsafe" patterns in Zig are verbose on purpose, to make it clear to the reader that this code may need more attention. This makes fixing bugs so much easier than in C, because it's really easy to spot a block containing tons of casts and memory accesses, and it focuses the attention.
      On top of that, Zig offers a wide array of great tools to help you catch bugs early: the testing framework, the debug allocators that help catch use-after-free and double free, the optional semantics, and 99% of C UB replaced by Zig panics with a stack trace. Technically you can enable all of that in C too, but the point is that it is not the default, and C still relies on UB unless asked otherwise. Plus, even with everything enabled, the tooling is still inferior and the debuggability is really not good.
      But I think you are wise to keep using Rust; it should definitely be used over C. As a friendly suggestion, you might want to wait a few years and revisit Zig once it's a bit more stable :)

    • @complexity5545
      @complexity5545 3 місяці тому +1

      This is why C and Rust are needed. C and C++ for prototyping quick deadlines. Rust for cementing permanent gains.

  • @1337cookie
    @1337cookie 3 місяці тому

    Can you put a transparent background on your chat so the text isn't fucked everywhere?

  • @ericmyrs
    @ericmyrs 3 місяці тому

    I got nerd sniped before watching this video by an ascii tic tac toe board.
    I guess I'm writing a cute little game in python now.

  • @Amipotsophspond
    @Amipotsophspond 3 місяці тому +1

    OK, this is an optimization video, so time it: C vs. Rust vs. unoptimized C vs. unoptimized Rust. Let's see how optimized it is in different situations.

    • @sigmundwong2489
      @sigmundwong2489 3 місяці тому +1

      Good point: if you claim "I optimized," you should be able to support it with some empirical data.

  • @krisavi
    @krisavi 3 місяці тому

    What really annoyed me was how the words were split between rows. There are some rules, like the splitting is done between syllables. Leaves impression that whoever wrote this article hasn't been properly educated.

  • @CEOofGameDev
    @CEOofGameDev 3 місяці тому

    If it's impossible in Rust, it has a memory bug in there somewhere, that's for certain.

  • @dmitriidemenev5258
    @dmitriidemenev5258 3 місяці тому

    Short string optimization in Rust is not implemented by choice.

  • @echoexplore4190
    @echoexplore4190 24 дні тому

    kinda sad that prime is done with Rust im just now getting into Rust! 😇

  • @hellolk77
    @hellolk77 3 місяці тому

    Do you guys know that in Rust, every time you return a struct on the stack, it gets copied into the caller's stack, regardless of whether it's copyable or not?

    • @RustIsWinning
      @RustIsWinning 3 місяці тому

      What?

    • @DrGeoxion
      @DrGeoxion 3 місяці тому

      That's half true... IIRC, what you're describing is what Rust instructs LLVM to do. But then LLVM is often able to optimize that away. So there's a lot of returns happening by e.g. registers for small values

    • @rusi6219
      @rusi6219 3 місяці тому

      @@DrGeoxion Rust still emits the instruction; your coping doesn't help you in any way

    • @DrGeoxion
      @DrGeoxion 3 місяці тому +1

      @@rusi6219 What do you mean? Rustc is the combination of the Rust frontend and the LLVM backend. What is or is not optimized in the middle of the compilation process is not important. Only the end result matters.
      Or do you depend on the IR output of Rustc?
      In that case you can still call on LLVM manually to run its optimization passes on it and get it to output its optimized IR.

    • @rusi6219
      @rusi6219 3 місяці тому

      @@DrGeoxion rustcels always get other people to do the job for them just like now they're trying to get Linux kernel maintainers to do the job for them same with leveraging LLVM to make their language operable

  • @ferdynandkiepski5026
    @ferdynandkiepski5026 3 місяці тому

    Doesn't glibc do this?

  • @Fleebee.
    @Fleebee. 3 місяці тому +1

    I have the GPT API rechecking itself after some data analysis.
    The problem, though, is that, like when you use the desktop version, it can keep giving you the wrong answer indefinitely, even after it apologises and tells you it's correcting itself.

  • @deezydoezeet
    @deezydoezeet 3 місяці тому

    Actually bonkers!

  • @LubosMudrak
    @LubosMudrak 3 місяці тому

    Using unsafe in Rust basically means "trust me bro".

    • @anotherelvis
      @anotherelvis 3 місяці тому

      The goal is to encapsulate the unsafe behavior in small well tested libraries.
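What "encapsulating unsafe" looks like in practice can be shown with a deliberately tiny sketch (`first_byte` is a hypothetical function, and real code would simply use `bytes.first().copied()`). The `unsafe` block relies on a check made right next to it, so callers get a safe signature that cannot be misused.

```rust
/// Returns the first byte of `bytes`, or `None` if the slice is empty.
/// Callers never see `unsafe`; the invariant is checked inside.
fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // SAFETY: the slice was just checked to be non-empty, so index 0 is
    // in bounds and `get_unchecked` cannot read past the end.
    Some(unsafe { *bytes.get_unchecked(0) })
}
```

The "trust me bro" part is confined to the one line with the `SAFETY` comment, which is the only place a reviewer has to audit.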

  • @Reydriel
    @Reydriel 2 дні тому

    You can tell who hasn't actually watched the video with some of the comments here lol

  • @constantinefedotov8394
    @constantinefedotov8394 3 місяці тому +3

    Lock-free structures appear extremely hard to implement

    • @skeetskeet9403
      @skeetskeet9403 3 місяці тому +2

      How would they be any different from doing it in any other language? Rust has the same atomics, same pointers etc. You will just need to use unsafe for the parts that cannot be statically proven within safe Rust.

    • @sbdnsngdsnsns31312
      @sbdnsngdsnsns31312 3 місяці тому +2

      Unsafe rust is horrible to use safely though. The undefined behavior problem is real. If rust wants to use unsafe as an excuse for not supporting more complex lifetime models, then it needs to fix unsafe to be equivalently easy to write as the C version, and ideally as easy as the Zig version.

    • @skeetskeet9403
      @skeetskeet9403 3 місяці тому +2

      @@sbdnsngdsnsns31312 Unsafe Rust isn't horrible to use, you just need to uphold the basic requirements of the Rust memory model. Yes, they're different from the requirements of the C and Zig memory models, but they're not actually impossible to follow. They are more restrictive than the requirements placed on you by the C or Zig memory models, sure. But once you know them, working around them is typically trivial.

    • @taragnor
      @taragnor 3 місяці тому +1

      Lock-free programming is really difficult in general. There's a reason using locks is generally the preferred solution unless you really need the extra performance. If a language makes lock-free programming seem easy, be very careful, because it's probably just letting you do a lot of things that are going to cause problems down the line.
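As a small illustration of the building blocks this thread is talking about, here is a sketch of a lock-free "update to maximum" built on a compare-and-swap retry loop. `fetch_max_cas` is a hypothetical name, and the standard library already offers `AtomicUsize::fetch_max`, so this is for illustration only. Note that it needs no `unsafe`; the hard part of real lock-free structures is memory reclamation and ordering, not the atomics themselves.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Raises `cell` to `candidate` if `candidate` is larger, without a lock.
/// Returns the value observed before the update (or the current value if
/// no update was needed).
fn fetch_max_cas(cell: &AtomicUsize, candidate: usize) -> usize {
    let mut current = cell.load(Ordering::Relaxed);
    loop {
        if candidate <= current {
            return current; // nothing to do: stored value is already larger
        }
        // Try to swap in `candidate`; this fails if another thread changed
        // the value since we last read it (or spuriously, for `_weak`).
        match cell.compare_exchange_weak(
            current,
            candidate,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            Ok(previous) => return previous,
            Err(actual) => current = actual, // retry with the fresh value
        }
    }
}
```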

  • @Flynn-lk8im
    @Flynn-lk8im 3 місяці тому +7

    What is rust?

    • @anar0gk158
      @anar0gk158 3 місяці тому +9

      The thing that rots you slowly.

    • @raidensama1511
      @raidensama1511 3 місяці тому +21

      An interesting computer language that many people get emotional over; one way or another.

    • @yohannnihalani5079
      @yohannnihalani5079 3 місяці тому

      low-level programming language with memory safety and a borrow checker

    • @cezarhg2007
      @cezarhg2007 3 місяці тому +1

      blazingly fast

    • @rnts08
      @rnts08 3 місяці тому +9

      Oxidized iron

  • @sjoer
    @sjoer 3 місяці тому

    Making assumptions based on what you read is not dyslexia.

  • @lackofsubtlety6688
    @lackofsubtlety6688 3 місяці тому

    You don't optimize Rust, Rust optimizes you.

  • @kamertonaudiophileplayer847
    @kamertonaudiophileplayer847 3 місяці тому

    No, ptr points to the string, len and other crap before the actual data. Want the len? *(ptr-4)

  • @edmundas919
    @edmundas919 3 місяці тому +6

    1 minute gang

  • @andrewdunbar828
    @andrewdunbar828 3 місяці тому

    comparisons are fast

  • @mike200017
    @mike200017 3 місяці тому +1

    So, is this called a German string because the prefix is always equal to "uber"?

  • @James2210
    @James2210 3 місяці тому

    Tail calls are impossible in Java

  • @necauqua
    @necauqua 3 місяці тому

    Bro made things more complicated with the inlined arc - and even mentioned that at the end - only for you to stop reading right before the sentence where he said that, smh

  • @redcrafterlppa303
    @redcrafterlppa303 3 місяці тому

    I love rust but it has some strange decisions and limitations that every time I use it I have a train of thought and think "and now I can do that" but then some obscure limitation gives me an error and I can't write the code I want. It's similar to how in java generics are extremely shallow.
    That's why I'm working on a hobby project writing a language that combines Java and Rust having the runtime capabilities of Java and the compile time speed of rust. Meaning rust with easier lifetimes less restrictions and reflection and easy subtyping capabilities.
    Edit 33:10 lifetimes are planned to be more conservative and implicit in my language. Struct coloring is done 100% by the compiler and functions have a more straightforward rule when they need lifetimes and they are almost always inferred at callsite.
    For example the basic rule for requiring a lifetime on a function is the case where you create a reference that will outlive the current function. Creating a lifetime and marking the return type or arguments with it will bind the referenced element to this lifetime delaying the drop until the lifetime is assigned to the current function higher up in the call chain. You can even return references to local variables changing their lifetime from the current function to the lifetime of the lifetime.

  • @galnart5246
    @galnart5246 3 місяці тому

    Rust that is impossible in optimization

  • @DerSolinski
    @DerSolinski 2 місяці тому

    I'm German...
    And I'm not sure if I should feel honored or offended...
    Is that a homage to German efficiency...
    or bashing on stereotypes...

  • @salsaman
    @salsaman 3 місяці тому

    Short string optimisations - been there, done that... Define data_struct, get size_of(data_struct). Pad up to the next multiple of $cacheline_size. Redefine data_struct including the padding buffer. Place it immediately before the pointer to the key in data_struct. Now for the fun part: for long strings, the padding is filled with 0s and the pointer points to the allocated string. For short strings, since the padding is immediately before the pointer, we can use the combined space to hold a short string. Then simply check whether padding[0] is zero or non-zero.
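The scheme in this comment overlaps padding and pointer by hand; a rough Rust analogue of the same idea, using an enum tag instead of the `padding[0]` check, is sketched below. It is not the layout from the blog post or from any particular crate: `SmallStr` and `INLINE_CAP` are hypothetical, and the 15-byte capacity is an arbitrary choice.

```rust
/// Maximum number of bytes stored inline (arbitrary for this sketch).
const INLINE_CAP: usize = 15;

/// A string that lives inline when short and on the heap when long.
enum SmallStr {
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(Box<str>),
}

impl SmallStr {
    fn new(s: &str) -> Self {
        if s.len() <= INLINE_CAP {
            let mut buf = [0u8; INLINE_CAP];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SmallStr::Inline { len: s.len() as u8, buf }
        } else {
            SmallStr::Heap(Box::from(s))
        }
    }

    fn as_str(&self) -> &str {
        match self {
            // The bytes were copied from a valid `&str`, so this re-check
            // always succeeds; it keeps the sketch free of `unsafe`.
            SmallStr::Inline { len, buf } => std::str::from_utf8(&buf[..*len as usize])
                .expect("inline bytes were copied from a valid &str"),
            SmallStr::Heap(s) => &**s,
        }
    }

    fn is_inline(&self) -> bool {
        matches!(self, SmallStr::Inline { .. })
    }
}
```

A production version would avoid the UTF-8 re-validation and squeeze the tag into spare bits, which is where the `unsafe` code discussed in the video comes in.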

  • @derendohoda3891
    @derendohoda3891 3 місяці тому +4

    naming strings after nationalities violates rust code of conduct

    • @techpriest4787
      @techpriest4787 3 місяці тому +3

      I am not sure what you mean. Such rules only apply to the Rust Foundation, not to everybody who uses Rust. So such naming just may not appear in the std lib.