Speeding up Rust Code

  • Published 11 Sep 2024

COMMENTS • 75

  • @TsodingDaily  1 year ago +121

    Fewer timestamps today because I'm starting to hit the description size limit of 5000 characters.

  • @adrianjdelgado  1 year ago +60

    That buffering gotcha tho. I think the reasoning is that it wants to give you the greatest control of syscalls possible. Buffering in some cases would not be zero cost (and it needs heap allocations).

    • @vasiliigulevich9202  1 year ago +11

      I think the API is designed to separate concerns. The raw stream is intentionally lacking buffering to make its API simple and orthogonal to the buffer layer.

    • @ayoubbelatrous9914  1 year ago +1

      i think because rust creator was interested in embedded systems maybe the unbuffered readers/writers got added first.
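      The buffering point in this thread can be illustrated with a minimal sketch. `CountingSink` is a hypothetical stand-in for a file or socket (it is not from the stream), where each `write` call would correspond to one syscall; `BufWriter` is the standard library's opt-in buffering layer.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that counts `write` calls. With a real file or
/// socket, each of these calls would roughly be one syscall.
struct CountingSink {
    calls: usize,
    bytes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.calls += 1;
        self.bytes += buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `n` short lines into any writer.
fn write_lines<W: Write>(out: &mut W, n: usize) -> io::Result<()> {
    for i in 0..n {
        writeln!(out, "line {}", i)?;
    }
    Ok(())
}

/// Returns (write calls, bytes written) when writing straight to the sink.
fn unbuffered_stats(n: usize) -> (usize, usize) {
    let mut sink = CountingSink { calls: 0, bytes: 0 };
    write_lines(&mut sink, n).unwrap();
    (sink.calls, sink.bytes)
}

/// Returns (write calls, bytes written) when the sink is wrapped in BufWriter.
fn buffered_stats(n: usize) -> (usize, usize) {
    let mut writer = BufWriter::new(CountingSink { calls: 0, bytes: 0 });
    write_lines(&mut writer, n).unwrap();
    // Push whatever is still in the buffer down to the sink before counting.
    writer.flush().unwrap();
    let sink = writer.get_ref();
    (sink.calls, sink.bytes)
}
```

      Both functions deliver the same bytes; the buffered version simply reaches the sink in far fewer calls, which is the cost the stream ran into when it was left out.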

  • @__aj2000__  1 year ago +20

    more rust sessions please...

  • @alexpyattaev  1 year ago +10

    Buffering is not necessarily a sensible default. Interactive applications behave really strangely with buffering. In Python it takes some effort to disable buffering on IO streams every time you do interactive stuff.

  • @neunmalelf  1 year ago +2

    I like that you take the time to index your streams. Mostly I watch them in one go, but every now and then it helps to find an important part. 👍🙏

  • @user-zh5ef7hd8r  1 year ago +8

    0:53
    "Tudey-syudey" (a pun on the Russian "tuda-syuda", "back and forth"), today, segodnya ("today" in Russian) ))

  • @bartpelle3460  1 year ago +12

    I think the cynical frowning upon Rust's "defaults" is really misplaced because the examples you gave are also not the defaults in the C APIs; if you did fopen & fread surely you're also not surprised that it's not buffering for you behind the scenes? :thonk:

    • @alexpyattaev  1 year ago +2

      If you code in C you obviously know exactly what every stdlib function is doing. Also, since C programmers are naturally smarter than all other developers, this is not an issue:)

    • @bartpelle3460  1 year ago +8

      @@alexpyattaev shiiiiii, do I take the bait?

    • @teenageoperator7246  2 months ago

      agreed!

  • @julkiewicz  1 year ago +6

    Forcing the program to go a certain path is usually called a unit test

  • @hsider  1 year ago +4

    I would suppress the noise from the beginning, that would reduce the stored data too since indexed data will be eventually stored to a DB. Nice content of course.

  • @naytivlostlastname7632  9 months ago

    1:12:00 - this feeling is precisely what is at the heart of mathematics and why it is studied in its pure form at all. I think it'd be really cool to do a stream where you explore a higher-level pure mathematics concept and work on implementing it, the same way you often do for a lot of higher-level development ideas. I believe there's a lot of value in seeing how the (now) two different studies really aren't as different as they look.

  • @Momoyon  1 year ago +2

    Thanks for the timestamps mr.tsoding

  • @josedejesuslopezdiaz  1 year ago +7

    4ms + all the http overhead is impressive indeed

  • @movization  1 year ago +3

    Actually, I find that random "perdoc" a kinda funny word too. Thank you for giving me a little laugh.

  • @ujjawalsinha8968  7 months ago +1

    I do use LSP for highlighting issues with my code, and I really like it, although knowing this universal way of compiler driven refactoring is also nice, at least I now know what to do if I have to program just using a notepad and a command prompt lolz.

  • @naturallyinterested7569  1 year ago +5

    Fuzzy, live searching would definitely be interesting. Although for fuzzy you'd probably need a separate list of all terms.

    • @javierflores09  1 year ago

      this is already fuzzy search what do you mean

    • @naturallyinterested7569  1 year ago

      @@javierflores09 Oh is it already? Sry I assumed from this that it's just direct matching between search terms and the index. I must have missed him implementing fuzzy term selection in a previous stream.

    • @javierflores09  1 year ago

      @@naturallyinterested7569 ah, no it wasn't streamed yet I believe, he implemented stemming off camera or so it seems. I too thought I missed it when I looked at the repo but after watching this stream I see that it isn't the case

  • @user-dl6uc7vn6w  4 months ago

    13:45 I think this monologue should be put on the main page of the rust website

  • @michaelmueller9635  1 year ago +4

    Patterns are symmetry; you want to continue the pattern, because it's symmetrical and being symmetrical is easy. The edge of symmetry and breaking symmetry leads to new discoveries. Physicians do that all the time.

    • @ezg5221  1 year ago +1

      Physicians are medical experts, actually. Shout out to Group Theory and Category Theory tho!

    • @hanswoast7  1 year ago

      @@ezg5221 I think he meant physicists :)

  • @havocthehobbit  7 months ago

    I was so proud of myself, being a Rust beginner who only started 2 weeks ago. When he did a count for an iterator, around 21:13, I was screaming "how is he making it work without defining it as mutable?" The compiler's thugs beat me up when I do that. But then he compiled and got an error. I now know I am not alone in this world.

  • @9SMTM6  1 year ago +22

    Oh wow, someone soured on Rust.
    I mean, I get that this situation is suboptimal. But just having buffering always enabled also is suboptimal, I've seen that cause a lot of issues in other situations. Which is probably why it's off by default, also finding a way, that doesn't end up causing issues, to opt out would not be easy or even possible.
    It's also not as if other languages are always buffered. They buffer in some instances, don't in others, good luck finding out when what is happening.
    Rust's design isn't intended to make someone feel stupid. It is meant to make you aware of what is happening, to be unsurprising and very tunable, and generally it does succeed with that. In many instances other than the one here it'll find a way to make you aware of these gotchas. In this case it failed, I'll give you that, and yes, in MANY cases buffered IO will work better.
    But saying that the reasons it was designed as it was is to make one feel stupid is, sorry, but just wrong and IMO not a very good approach. Rather say that you are fine with not being able to tune some things in the exceptions, that the added ease of use in most other situations is worth it, and that thus you prefer language X that "just does the right thing" in most cases.

    • @javierflores09  1 year ago +2

      That's a lot of words just to say you are being rude to my favorite language

    • @9SMTM6  1 year ago +9

      @@javierflores09 that's because that isn't what I intend to say.
      Rather this is a specialized version of "blindly being rude ain't productive".
      Rust is indeed my favorite language, but it sure ain't perfect. The root cause behind what he's saying is indeed problematic, but sadly hard or perhaps impossible to avoid, and what he's saying isn't exactly actionable or close to the root issue.

    • @bartpelle3460  1 year ago +3

      @@javierflores09 oh wow, someone soured on someone soured on someone soured on Rust. Better go pretend they're a zealot!

  • @abujessica  1 year ago +4

    13:45 edge lord mode activated ⚡

  • @blu35cr3w  1 year ago

    I like your approach. I'm just starting to write Rust code and I will use seroost for other docs. I'm pretty sure of that. Maybe the search page should get some love... Awesome stuff and you've got a new subscription to your channel.

  • @josephattia6040  1 year ago +2

    Looks like we're live! Hello TsCoding!

  • @SuperKombain  1 year ago +3

    Could you just move search token loop higher than document loop to calc idf once per search token?
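    The loop reordering suggested here could look roughly like the sketch below. The names `compute_tf` and `compute_idf` are borrowed from other comments on this page, but their bodies, the `TermFreq` alias, the `rank` function, and the clamping of the document frequency to 1 are all assumptions made for illustration, not the code from the stream.

```rust
use std::collections::HashMap;

/// Term -> number of occurrences within one document (assumed layout).
type TermFreq = HashMap<String, usize>;

/// Share of the document's tokens that are `term`.
fn compute_tf(term: &str, tf: &TermFreq) -> f32 {
    let total: usize = tf.values().sum();
    if total == 0 {
        return 0.0;
    }
    *tf.get(term).unwrap_or(&0) as f32 / total as f32
}

/// ln(total docs / docs containing `term`). The document frequency is
/// clamped to 1 to avoid dividing by zero (an assumption of this sketch).
fn compute_idf(term: &str, docs: &HashMap<String, TermFreq>) -> f32 {
    let n = docs.len() as f32;
    let df = docs
        .values()
        .filter(|tf| tf.contains_key(term))
        .count()
        .max(1) as f32;
    (n / df).ln()
}

/// Ranks documents for a query, best match first.
fn rank(query: &[&str], docs: &HashMap<String, TermFreq>) -> Vec<(String, f32)> {
    let mut scores: HashMap<&str, f32> =
        docs.keys().map(|name| (name.as_str(), 0.0)).collect();

    // Outer loop over query terms: idf depends only on the term,
    // so it is computed once here instead of once per document.
    for &term in query {
        let idf = compute_idf(term, docs);
        for (name, tf) in docs {
            if let Some(score) = scores.get_mut(name.as_str()) {
                *score += compute_tf(term, tf) * idf;
            }
        }
    }

    let mut ranked: Vec<(String, f32)> = scores
        .into_iter()
        .map(|(name, score)| (name.to_string(), score))
        .collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
}
```

    With Q query terms and D documents this computes idf Q times rather than Q × D times; the scores themselves are unchanged.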

  • @ilovepeaceandplaying8917  1 year ago +2

    can your google do that?

  • @dickpiano1802  1 year ago +1

    14:00 what a guy

  • @simonfarre4907  1 year ago +2

    Why use `PathBuf` as the key for the hashmap in the TermFreqPerDoc? Why not use the inode instead? The inode is just a u64. Then store an inode -> path mapping separately and retrieve the relevant data when needed.

    • @user-yo6xb6ud6d  11 months ago

      Well for one, using inodes would make the code non-cross-platform.
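      One way to keep the "small integer key, path stored separately" idea from this thread while staying cross-platform is to hand out sequential ids instead of reading inodes. The sketch below is only an illustration of that variant; `DocId` and `PathInterner` are hypothetical names that do not appear in the stream.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical document id: a sequential number, not an inode.
type DocId = u64;

/// Maps each distinct path to a small integer id and back.
#[derive(Default)]
struct PathInterner {
    ids: HashMap<PathBuf, DocId>,
    paths: Vec<PathBuf>,
}

impl PathInterner {
    /// Returns the id for `path`, assigning the next free one on first sight.
    fn intern(&mut self, path: &Path) -> DocId {
        if let Some(&id) = self.ids.get(path) {
            return id;
        }
        let id = self.paths.len() as DocId;
        self.paths.push(path.to_path_buf());
        self.ids.insert(path.to_path_buf(), id);
        id
    }

    /// Looks the path back up when a result has to be shown to the user.
    fn resolve(&self, id: DocId) -> Option<&Path> {
        self.paths.get(id as usize).map(|p| p.as_path())
    }
}
```

      An index keyed by `DocId` then hashes a single integer per lookup, and the full path is only touched when results are displayed.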

  • @user-dl6uc7vn6w  4 months ago

    The god of debagging

  • @dwightk.schrute8696  1 year ago +1

    wasn't pagerank basically tf-idf over a graph of tf-idfs?

  • @RuslanKovtun  1 year ago

    simple & easy = "prosto i legko" in Russian. You really went overboard about the differences.

  • @Anhar001  1 year ago +1

    Hey man did you do any stop words and word stemming algorithm?

  • @imhugofonseca  1 year ago

    Keep it up 👍

  • @Someniatko  6 months ago +1

    My goal is to generate as many compilation errors as possible. (c) Tsoding

  • @Neuer_Alias_erstellen  1 year ago +2

    lets play RUST the videogame

  • @gzoechi  1 year ago +21

    "pub" as default would be really bad.
    Requiring the developer to be explicit about everything instead of lots of "magic" is one of the big advantages of Rust. If that's not what you want, Python or something like that is probably a better choice.

    • @nataestanislaubastos7637  1 year ago +2

      What you are saying doesn't make sense.

    • @nataestanislaubastos7637  1 year ago +4

      It literally has no connection between those two things.

    • @gzoechi  1 year ago +10

      @@nataestanislaubastos7637 That you do not understand it, doesn't mean it makes no sense.

    • @punkystone  1 year ago +1

      are you the guy that i often see on flutter stackoverflow answers?

    • @nataestanislaubastos7637  1 year ago

      @@gzoechi It is just a matter of which option is the default. You are explicit about public or private either way; only the default differs. It doesn't make one more explicit than the other.

  • @badstep495  1 year ago +1

    You are inserting a 1 in the global frequency table when the term does not exist; this is wrong I believe, instead it should be the term count from the given document which can exceed 1.

    • @mingiux92  1 year ago +1

      I may be wrong, but I think the right thing is to use a 0 in the unwrap_or and resolve the division-by-0 issue with the log rule log(n/m) = -log(m/n), as n is always > 0. I think this would let compute_tf and compute_idf be easily factored.

  • @justinpeter5752  1 year ago

    better to index all the words and calculate tf-idf for each word with meta data. the words and their metadata could be hashed and retrieved in O(1) time during a search.

  • @PouriyaJamshidi  1 year ago +2

    00:13:45 so damn true!

  • @SimGunther  1 year ago +4

    13:25 If Stockholm Syndrome were a language...

  • @harrybilsonia  1 year ago +1

    Lmao per-doc!!

  • @kibels894  1 year ago

    Crate devs love feature switching way too much. ifdef compilation switches are bad practice in C for a reason, giving them a different syntax doesn't make them a good idea.

  • @kibels894  1 year ago +1

    Serde serialized a struct as an array? wtf? Is that an option you can change in the derive macro? I guess an array is more efficient but it's throwing away information about the name of the struct fields.

    • @anafabula  1 year ago +7

      No, it didn't. It serialized the tuple as an array, which doesn't have field names anyway. It made the struct into an object.

  • @Satoshinork  1 year ago +1

    What's up! Build a package manager for C or C++.

  • @glebbash  1 year ago

    PerDoc 🤣🤣

  • @neunmalelf  1 year ago

    The Rust Language behaves to a programmer like its creators ... 😉