New Go Billion Row Challenge w/ Great Optimizations | Prime Reacts

  • Published 22 Mar 2024
  • Recorded live on Twitch, GET IN
    / theprimeagen
    Become a backend engineer. It's my favorite site
    boot.dev/?promo=PRIMEYT
    The best way to support me is to support yourself in becoming a better backend engineer.
    Article link: r2p.dev/b/2024-03-18-1brc-go/...
    By: Renato Pereira
    MY MAIN YT CHANNEL: Has well edited engineering videos
    / theprimeagen
    Discord
    / discord
    Have something for me to read or react to?: / theprimeagenreact
    Kinesis Advantage 360: bit.ly/Prime-Kinesis
    Hey I am sponsored by Turso, an edge database. I think they are pretty neat. Give them a try for free, and if you want you can get a decent amount off (the free tier is the best (better than PlanetScale or any other))
    turso.tech/deeznuts
  • Science & Technology

COMMENTS • 207

  • @ivanovcharov7534 2 months ago +265

    OMG ITS MY FAVOURITE PROFESSIONAL YAPPER!

    • @yaaaayeet745 2 months ago

      5 DOLLARS A MONTH 🗣🗣🗣🗣🗣✋✋✋✋✋

    • @apexdude105 2 months ago +29

      "professional yapper" what a good job description for a streamer lmao

    • @charlesyoung601 2 months ago +3

      nl clears

    • @oat1000 2 months ago

      nl my goat ​@@charlesyoung601

    • @jostasizzi818 2 months ago +1

      Why do I feel this is every so-called tech YouTuber right now

  • @neruneri 2 months ago +78

    Asking Flip to take something out seems like the most reliable way to ensure that it absolutely does not get taken out.

  • @R4ngeR4pidz 2 months ago +171

    Narrator:
    Flip did, in fact, not take that out (16:00)

    • @teejaded 2 months ago +4

      Flip. Take this anti-flip propaganda out.

    • @flipmediaprod 2 months ago +9

      I stand against the establishment

    • @Kannatron 2 months ago +3

      @@flipmediaprod truly an upstanding and forward-thinking editor. You kept it in for the people, 👏🤯🤯🤯

    • @GermanClaus 2 months ago

      He sounds like he is begging :D

  • @Sw3d15h_F1s4 2 months ago +45

    the JDSL implementation would be 10x faster. Tom's a genius!

    • @jerichaux9219 2 months ago +4

      JDSL would have melted the CPU from how fast it would be parsing those rows.

  • @strangnet 2 months ago +23

    Wow: a 4.7HGz with 6000mhz memory. Those millihertz come in handy with the HenryGigaz processor...

  • @MHarris021 2 months ago +9

    Tip for remembering stalagmites and stalactites. "Stalagmites have a g for ground and stalactites have a c for ceiling", it's how I remember which is which. It was a tip in a Xanth novel by Piers Anthony. I think it was "Man from Mundania", but I'm not sure because I haven't read them in 20+ years. Gosh, that makes me feel old. :)

    • @retropaganda8442 2 months ago +1

      Ahaha, the true mnemonic is actually just the etymology of the word. I don't know if it's Latin or Greek, but for example, in French it's m for monte (rise) and t for tombe (fall). Simple.

    • @collinstasiak4994 2 months ago

      Stalagmite sounds like dynamite, and you don't want to put that on the ceiling, is how I've always remembered it

    • @Eutropios 2 months ago

      Stalactites stick tight to the ceiling. Stalagmites might grow upwards

  • @hierax49 2 months ago +71

    The author has a Brazilian name. Brazil mentioned

    • @rawallon 2 months ago +5

      Dev at fallenzão's Gamers Club (2:06)

    • @Thoer 2 months ago

      let's go!!!

    • @user-zg2bx4oz2p 2 months ago

      It is also a Portuguese name

    • @microcolonel 2 months ago

      Nobody lives in Portugal 😂​@@user-zg2bx4oz2p

  • @MrDadidou 2 months ago +19

    French gang:
    Stalag-mite (M like "monter" in French, to go UP)
    Stalac-tite (T like "tomber", to fall)

    • @_kostant 2 months ago +2

      Always remembered it from the C in stalactite being “ceiling” lol.

    • @OnStageLighting 2 months ago +2

      'might go up, tights come down.'

    • @microcolonel 2 months ago +2

      @@OnStageLighting giggity

    • @itsthesteve 2 months ago +1

      Stalag (ground), Stalac (ceiling)

  • @metropolis10 2 months ago +13

    Primeagen's reactions in this video: "wow that's a lot slower than I would have thought... well I GUESS it is a BILLION items" x1 billion

  • @andyvisser 2 months ago +4

    My guess on the read buffer and diminishing returns: I bet you get max performance when the buffer size aligns with the underlying hardware's size. Like it's best when you read a sector at a time (or however SSDs are addressed/broken down in firmware).

    • @TehKarmalizer 2 months ago +1

      Or file system block size. Typically reading in multiples of the block size is most efficient.

  • @rapzid3536 2 months ago +3

    mmap
    Split the memory space into the number of Cores
    Hand out pointers start/end to threads
    Walk all but the first pointer start forward until after the next new line or EOF.
    Start ripping from there.
    Profit.
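
The chunking steps listed above can be sketched in Go as follows. This is a minimal illustration, not the article's implementation: `chunk` and `splitOnNewlines` are hypothetical names, and a plain byte slice stands in for the mmap'd region, since memory mapping is platform specific.

```go
package main

import (
	"bytes"
	"fmt"
)

// chunk is a half-open byte range [start, end) within the data.
type chunk struct{ start, end int }

// splitOnNewlines divides data into at most n chunks, moving every chunk
// boundary forward to just after the next '\n' so that no line is split.
func splitOnNewlines(data []byte, n int) []chunk {
	if n < 1 {
		n = 1
	}
	var chunks []chunk
	size := len(data) / n
	start := 0
	for i := 1; i <= n && start < len(data); i++ {
		end := len(data)
		if i < n {
			end = i * size
			if end < start {
				end = start
			}
			// Walk forward to the byte after the next newline, or to EOF.
			if j := bytes.IndexByte(data[end:], '\n'); j >= 0 {
				end += j + 1
			} else {
				end = len(data)
			}
		}
		if end > start {
			chunks = append(chunks, chunk{start, end})
		}
		start = end
	}
	return chunks
}

func main() {
	// data stands in for the mmap'd file contents.
	data := []byte("Porto;12.3\nLisbon;18.1\nBraga;9.4\nFaro;21.0\n")
	for _, c := range splitOnNewlines(data, 3) {
		fmt.Printf("[%d,%d) %q\n", c.start, c.end, data[c.start:c.end])
	}
}
```

Each resulting range could then be handed to its own goroutine, with the per-range results merged at the end.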

  • @michealkinney6205 2 months ago +4

    "Managers be like push it to prod! We're done... Good enough!" @ 20:16. Lol, like every non-technical manager ever.

  • @jackevansevo 2 months ago +2

    I love these posts, there's a lot of tidbits of information to learn.

  • @Olodus 2 months ago +6

    Dammit, now I feel like I will have to do this in Zig or something... But great article. Really shows the experimentation and learning process.

  • @i_sometimes_leave_comments 1 month ago +1

    4:35 Assuming Go's `map` is a self-growing (via reallocation) array (like C++ `vector` or C# `List`), then as the `map` grows you'd have to memcpy the whole underlying array, and a bunch of pointers would be way cheaper to copy than a `struct`

    • @anon1963 1 month ago

      you can do vec.reserve(n) in c++. eliminates need for expensive reallocation

  • @sanderbos4243 2 months ago +3

    It drives me bonkers how they used 10 instead of '\n', and even went so far as to describe the magic integers at 28:21 with comments like "if b == 45 { // 45 == '-' signal"
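
To illustrate the point about magic integers, here is a hedged Go sketch that parses a reading using character literals such as '-' and '.' in place of 45 and 46. `parseTenths` is a hypothetical helper, not the article's function, and it assumes the challenge's number format: an optional minus sign, one or two integer digits, a decimal point, and exactly one fractional digit.

```go
package main

import "fmt"

// parseTenths converts a reading such as "-12.3" into tenths of a degree
// (-123). It assumes well-formed input in the format described above and
// does no validation.
func parseTenths(b []byte) int {
	neg := false
	n := 0
	for _, c := range b {
		switch {
		case c == '-': // same byte value as 45, but self-describing
			neg = true
		case c >= '0' && c <= '9':
			n = n*10 + int(c-'0')
		case c == '.':
			// Skip the decimal point; the value stays in tenths.
		}
	}
	if neg {
		return -n
	}
	return n
}

func main() {
	for _, s := range []string{"-12.3", "0.0", "99.9"} {
		fmt.Println(s, "->", parseTenths([]byte(s)))
	}
}
```

Keeping the value as an integer number of tenths also avoids floating-point parsing entirely; dividing by 10 is only needed when printing.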

  • @kodekata 27 days ago

    A goroutine is Go's syntax for Tony Hoare's Communicating Sequential Processes (CSP, not to be confused with the browser's CSP). Fun fact: the creator of Go had made several previous languages, all with CSP baked in. In Clojure[Script], the simple syntax for CSP was enabled via a library.
    CSP has been implemented in JS via generators, but there are implementations with more usage (e.g. for Clojure).
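
As a rough illustration of the CSP style described here, the Go sketch below has each goroutine send its result over a channel instead of writing to shared state. `sumChunks` is a hypothetical example function and is not taken from the article.

```go
package main

import (
	"fmt"
	"sync"
)

// sumChunks starts one goroutine per chunk. Each goroutine computes a
// partial sum and communicates it over the results channel.
func sumChunks(chunks [][]int) int {
	results := make(chan int)
	var wg sync.WaitGroup
	for _, c := range chunks {
		wg.Add(1)
		go func(c []int) {
			defer wg.Done()
			s := 0
			for _, v := range c {
				s += v
			}
			results <- s
		}(c)
	}
	// Close the channel once every worker has sent its result, so the
	// receiving loop below terminates.
	go func() {
		wg.Wait()
		close(results)
	}()
	total := 0
	for s := range results {
		total += s
	}
	return total
}

func main() {
	fmt.Println(sumChunks([][]int{{1, 2}, {3}, {4, 5, 6}}))
}
```

The only synchronization visible to the workers is the channel send, which is the property the CSP model is built around.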

  • @SimonBuchanNz 2 months ago +3

    I did some basic aggregation with Node on a 2 GB ini file: from memory, with a bunch of work I got it down from 40s done somewhat naturally to about 7s done by a crazy person. The dumb 10-line Rust code took 3s or something.

  • @danielmccann2979 2 months ago +8

    For one second I read that as millihertz of RAM and was like, why is your RAM only going 6 Hz, are you manually clocking that thing

  • @PhilipAlexanderHassialis 2 months ago

    I like how it's from 95s to 1.96s whilst inside the article a sub-second result is mentioned.

  • @hinzster 2 months ago +9

    Oh damn, for-loops are now considered boomer loops? What about while(true)/break loops? Are those dinosaur loops?

    • @hinzster 2 months ago +3

      Also, back when I was doing that obscure shift organizer program for hospitals, I used my own fixed-point package to optimize stuff. Everything had a single digit of precision anyway, so I just worked with ints representing tenths of hours (another "problem"). Worked well, fast, and didn't use as much space as those pesky floats. I did this before the FP coprocessor was included in Intel processors (i.e. before the 486). My actual development machine was an original IBM PC XT, running an 8088 at 4.77MHz! I needed all the speed I could get.

    • @weakspirit_ 2 months ago +1

      nah, the dinosaur loops are the asm branch loops 🦖

    • @FakeDumbDummy 2 months ago

      Well, Go doesn't have while loops, so yes, dinosaur loop for me

    • @SandraWantsCoke 1 month ago

      Those are biblical times loops

  • @Thorarin 9 days ago

    FYI: Buffer size of 1024 is terrible, because most modern disks use 4kB sectors nowadays. So some multiple of 4kB is immediately better.

  • @evergreen- 2 months ago +5

    This video gives me huge flashbacks

  • @valhalla_dev 1 month ago

    "I have very little experience in these kinds of investigations"
    Me: Oh, word, he and I will be talking on the same level
    ...
    Me: Oh, shit, I understand none of this

  • @StrengthOfADragon13 2 months ago

    Can't wait for the "what is your 1 billion row challenge time" question in interviews. (Actually though, taking a legit stab at the challenge for myself sounds super fun and I really wanna see if work will greenlight letting me work on it as part of my training hours)

  • @SimonBuchanNz 2 months ago +13

    "mutex is a spin lock": technically, mutex is just the semantics, not an implementation, and there are a few ways to do it, with different trade-offs.
    They generally *start* with a spin lock, but that's just an optimization assuming the lock time is short. They then need a way to put the acquiring thread to sleep, and there's a bunch of ways to implement that. You can do it in user space with just thread sleep and wake functions, which can be good for "fair" locks, but you can also use events or explicit kernel mutexes, which might be better for thread residency.

    • @Kane0123 2 months ago +3

      I’m going to give you a like based purely on the amount of text. I’m happy for you though, or sorry that happened.

    • @rawallon 2 months ago

      technically, anything is just the semantics

  • @MikePaixao 2 months ago +1

    I remember having to parse 600TB databases in the gamedev industry. I ended up using Python and the Windows copy buffer to just snapshot the file into memory

    • @dv_xl 2 months ago

      Interesting, I have a few questions.
      Obviously it can't load 600TB into memory at once; did you chunk your reads or were the underlying DB files split up naturally?
      Were you using a network file system?
      Did you run multiple processes and map/reduce or just a single process? I'm curious how long it took in either case

    • @MikePaixao 2 months ago

      @dv_xl The first layer was using Perforce, so any previous work or code could compare against a cached version of all unchanged files synced locally.
      Next you need to break up parallel loops based on file types. ASCII files are super easy to write regex logic for (think file mirroring). I would quickly build a list of all file dependencies (if I was parsing a game map, I listed all the models; if it was a 3D model, it connected what maps and textures used it, etc.).
      Now for the copy trick: depending on file size, when having to parse through larger 1 GB+ files you can choose to copy either an entire folder or individual files, and for binary formats you need to do the painful thing of writing a custom binary parser for the data now copied into memory.
      I remember back on Wolfenstein a couple of times having to check out the entire repo because German lawyers were like "nein! You cannot have any file names with verboten naming on disk", and when you need to edit file names across an entire project that is weeks away from gold master... not a lot of wiggle room :P

    • @MikePaixao 2 months ago

      @@dv_xl So the data was all stored in Perforce, so I would store a snapshot with a Perforce timestamp, so I could choose a cached or fresh mapping.
      Depending on folder/file size, sometimes you could copy entire folders to parse through larger files... it really depended on file types, or single files at a time with a custom binary interpreter, so you could skip entire sections of files and pull out relevant info (I was tracking all assets, where they showed up in the engine or in a map, and then all the related textures, models, audio, etc.). It was a reflection system across data formats :P
      All done in parallel, and a weird reason to do batches of folders and not file by file is the limited number of threads Python would spin up before hitting some arbitrary per-machine number of threads Windows can keep track of :P (also, early exits everywhere: I don't need to parse a 3D model's vertices, or the animation sequence in a skeleton!)
      At some point I was checking out the entire project because German lawyers were like "Nein! Verboten! You cannot have Nazi-named file folders on the shipped disc"
      "But it's Wolfenstein?" -> glad I added the "find and replace" option so I could do mass edits while it was parsing through :D
      Timing: I had it under around a few seconds, under 1s if the Perforce cache existed (the DB was stored as a SQL file with no read/write locks in Perforce)

  • @jhk940 2 months ago +4

    I must have missed something. The SSD (Kingston SSD SV300S37A/120G) has a maximum read rate of 450MB/s, so reading the 13GB should take 28.88 seconds minimum. wat. Can someone explain?

    • @jhk940 2 months ago

      Well, I guess the complete 13GB file is cached in RAM by Windows.

    • @TurtleKwitty 2 months ago

      @@jhk940 Yup, every OS keeps hot files in RAM; the Java one actually had a final implementation with a RAM disk instead, so the SSD overhead didn't matter

  • @mikejohnstonbob935 2 months ago

    Devin's out there taking notes. This whole article is honestly like an AI overtraining on a specific dataset. Its language capabilities even degrade as it reaches its max context window

  • @hosseines276 2 months ago

    whoa! really enjoyed!

  • @TurtleKwitty 2 months ago

    The mighty stalags rise, while the other stalags hold tight is my way of remembering which is which hahah

  • @fuzzy-02 2 months ago

    Renato Pereira alone sounds like a cool secret agent driving a very fast classical car

  • @thekwoka4707 2 months ago

    Probably could do pretty fast with Bun. Bun.file has some good ability to read file partials, so you could see how big the file is, spawn a ton of threads and handle only the parts for each....
    JavaScript does also have cool things like SharedArrayBuffers that could enable some more low level style memory control...

    • @marcomassa84 1 month ago

      I got the 1BRC down to 5.5 sec with Node.js. Bun has a bug with the highWaterMark option that makes it less performant than Node (at least in my test)

    • @anon1963 1 month ago

      remember about Amdahl's law

  • @CipovPeter 2 months ago

    I am wondering why you need a mutex when reading from the file. Why not open the file x times for reading and use seek to start reading from the right position? The right positions can be computed in the main thread at the beginning, as a sort of index. I did not test it, but I suppose it would remove a lot of the merge logic from the end of the article

    • @Jesse_Carl 2 months ago

      I was also wondering this

  • @caedenw 2 months ago +1

    I can’t believe I have to point this out but his SSD can’t do 13GBps and so this is all coming from his page cache in RAM. Don’t expect anything close to these results if you flush the cache. In light of that, he should be seeing a much better score if implemented correctly since he has so many threads.

  • @rogerdinhelm4671 2 months ago

    Current top Java implementation reaches 300ms, but measurements are done on reference hardware (32 cores / 64 threads), and thus might differ from wherever the Go guy was running it.

  • @9remi 2 months ago +1

    16:00 flip did NOT take that out

  • @KaydotOrigin 2 months ago

    Would be awesome to see you do it in ts/js

  • @absurd0000 2 months ago +2

    Flip, more like Slip, cuz he be slipppppin

  • @willembeltman 2 months ago

    8:00 reason is the buffersize of your hdd/ssd.

  • @parikshitpatil1421 2 months ago +3

    I guess the best Java solution used mmap.

  • @burkskurk82 2 months ago

    Prime, what about Redis changing its licensing model and Garnet (by Microsoft), written in C#, outperforming Redis, which is written in C? Help us make sense of it.

  • @ReedoTV 27 days ago

    They should have used their "4.7HGz" PC to run a spell checker

  • @retropaganda8442 2 months ago +1

    I just clicked on the first search engine result for the one billion row challenge in C, and that guy's result beats the "official" Java winner.
    Not surprised.

    • @morosis82 1 hour ago

      Not that surprising, the first result is likely to be the best linked (highest ranked) when everyone is talking about fastest implementation in language X.

  • @JackDespero 2 months ago +6

    I am sorry, but you are wrong.
    Boomer loops are GOTO and CONTINUE loops.
    The simulation code that we use at work was written in modern FORTRAN (FORTRAN 77, not 66) and is full of
    GOTO 1000
    Do stuff
    1000 CONTINUE

  • @MichaelSalaverry 2 months ago +11

    One billion comments, let's go!

  • @weakspirit_ 2 months ago +1

    i'm calling it, multithread/multiprocess overhead is going to show that his single process/thread solution is actually faster

  • @dand4485 2 months ago

    I'm thinking one way to convert the temp (float) is have a hash map for all 100 possible different values i.e. map("99.9") simply return 99.9....

    • @imaymakesomevids 2 months ago

      There are 2000 values, cos of the decimals.
      The hash and lookup would be a lot slower than just parsing the numbers directly.

    • @retropaganda8442 2 months ago +1

      Don't hash it! Just make a 2000 element array, use the raw bits as an index, and it's gonna be fast.

  • @GermanClaus 2 months ago

    KOTLIN mentioned!!!

  • @bluecup25 2 months ago

    Prime, do it. Just do it.

  • @michelvandermeiren8661 2 months ago +5

    Java has proven to be the fastest lang on earth with this challenge ! No other lang can compete

    • @dv_xl 2 months ago

      Firstly this statement is inherently false, it can never be as fast as the fastest asm or c. But more importantly, where did you get that idea? I looked up the results for Java from the test and they were 6 seconds. It's not clear what the hardware used for the testing was, but it doesn't look to me like there's a good cross language comparison table anywhere

    • @michelvandermeiren8661 2 months ago

      @@dv_xl fastest java took 1.4 sec

  • @MrWalrus3451 2 months ago

    Flip ain't taking it out brother.

  • @thatmg 2 months ago +1

    PORTO MENTIONED!

  • @pylotlight 2 months ago +3

    does flip even watch the videos or just use the markers seeing he misses every cut request ;p

  • @rezyadlf 2 months ago

    2 business days got me)))

  • @Sw3d15h_F1s4 2 months ago +2

    someone should do the 1 billion row challenge using vim

  • @RenThraysk 2 months ago +3

    Unfortunately produces corrupt data. If you run it multiple times over the same 13GB dataset, it'll produce a different result each time. Some temperature values end up in the tens of thousands, and new locations also appear. Signs of race/memory corruption issues.

    • @anon1963 1 month ago

      What? Your program or the program in the video?

    • @RenThraysk 1 month ago

      @@anon1963 The solution in the video.

    • @anon1963 1 month ago

      @@RenThraysk ah ye, they probably ran the finished program once and were like: "good enough!"

  • @issacwessing4945 2 months ago +1

    I'm having some problems solving this in HTML

  • @retropaganda8442 2 months ago

    The word "buffer" CRIES underoptimised implementation, with data being copied between kernel memory and user-space process memory.
    I think I'd start by doing an mmap of the whole data on disk, assuming it's already in the fs cache.

  • @Tony-dp1rl 2 months ago

    forEach, map, etc. are the devil in JS

  • @ytdlgandalf 2 months ago +3

    These times are too good to be true. Heavy caching through the page cache. He should flush the page cache before every try. 13GB in 1.96 s ≈ 6.5GB per second. No way in hell with the mentioned SSD. Flushing the cache for honest numbers on the same system is benchmarking 101. Did he ever run the Java implementation on his own system to set a baseline, or did he just take the other benchmarker's results? Do people even know how to benchmark?

    • @arden6725 2 months ago

      why would you want a software optimization benchmark to be limited by your disk speed, that’s literally pointless

    • @ytdlgandalf 2 months ago +2

      @@arden6725 Why? For reproducibility. His results could now easily be skewed from run to run if, for example, Chrome is having a bad day and is filling his memory and thereby flushing his page cache during some runs but not others. If you are unaware of this, you draw the wrong conclusions about which changes made your program faster. If you want to take it out of the equation, then the benchmark should have stated to use a RAM disk or generate the data in-process

    • @javierflores09 2 months ago

      @@ytdlgandalf this kind of code isn't meant to be run on a workstation but on a server, meaning it'd be able to take full advantage of the machine. When it comes to a workstation, all of these low-level implementations will fall short of the general implementation, because there's no way to predict the amount of resources the environment is willing to give the program in order to complete it in the fastest time possible.

    • @ytdlgandalf 2 months ago

      @javierflores09 this is about reproducibility. Doesn't matter if it's your workstation or a "server".

  • @soggy_dev 2 months ago

    I actually prefer specific syntax for multiple return parameters 🤷‍♂️ The language is almost certainly creating an anonymous struct under the hood anyway, so I'd rather it be more obvious they're connected/contiguous. Plus you have the option of passing around the entire tuple or destructuring into the components depending on what's the most convenient which just seems objectively better to me. I love go but that's up there with lack of sum types on the list of things that bother me

    • @aurele2989 1 month ago +1

      we do a little struct { int a, b, c; } fn(int in) { /* ... */ return (typeof(fn(0))){ a, b, c }; }

  • @lskywalker5 2 months ago

    GOD DAMN IT FLIP

  • @birdbrid9391 2 months ago

    flip did not cut it out

  • @truehighs7845 2 months ago

    2 business days: from Friday to Monday.

  • @Alguem387 2 months ago +1

    MMAP?

  • @thekwoka4707 2 months ago +1

    forEach is faster than boomer loops in newer versions of node and in bun.
    Pretty wacky, but true.

    • @ThePrimeTimeagen 2 months ago +1

      Actually not true
      This test was done in 20.x, 18.x, and 16.x
      By the very definition they cannot be faster. They can be of equal speed if extremely clever compiler stuff happens.
      This would require jit to take place as well

    • @lucsoft 2 months ago

      @@ThePrimeTimeagen Mmmh, I tested Node.js 21 and actually found it was faster:
      const array = Array.from({ length: 1_000_000 }).fill(1);
      let time = performance.now(); array.forEach((e) => e); console.log(performance.now() - time);
      // runs took between 10 and 14 ms
      compared with
      time = performance.now(); for (const e of array) { e; } console.log(performance.now() - time);
      // runs took between 14 and 20 ms
      Wonder why it's faster

  • @olhoTron 2 months ago

    Before even watching the video I'll guess the biggest gains will come from reducing allocations

  • @Wielorybkek 2 months ago +1

    I don't get it, the File Read Buffer took only 0.98 s!!!! Why everyone is ignoring it!!!

  • @michaelgreenberg6344 2 months ago +6

    On his hardware, he's I/O bound and any optimization is useless.
    Dude has 32 gigs of RAM. Meaning that, on an idle enough system, most of that memory will be used for file system cache, into which a file with the size of 13GB fits quite neatly.
    I will probably not be too exaggerating if I say that he only read the file from disk once - the first time he ran his program. If not once, then by the fifth run, the entire file would be up in RAM for sure. All the rest of the "I/O" tests were performed against the memory, which just checked how fast memory copy in chunks of different sizes and multiples of allocations can be performed. Had he been performing actual I/O, there's no way he'd be getting >13GB/s (which a time of ~0.98s suggests.)
    In fact, his drive is rated at 497MB/s (manufacturer spec), so on that hardware, it's useless to play with the buffer size, since you won't be reading the file faster than ~27 seconds, as the first file read test with the buffer size of 1024 would suggest. 13*1024/497=26.78, and I'm pretty sure that all the allocations were done during iowait, so it's safe to assume the file size is not exactly 13GB, but more around 13.3-13.5 :D
    This article is written by someone who probably doesn't understand storage or operating systems too well (using windows for development - first hint... jk,) but it's a nice experiment to see how well you can optimize such an algorithm if your disk bandwidth is infinite.
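
The arithmetic in this comment can be reproduced with a short Go sketch. The figures (13 GB file, 497 MB/s rated drive speed, 0.98 s run time) are the ones quoted in the comment; `minReadSeconds` and `impliedGBps` are hypothetical helper names used only for illustration.

```go
package main

import "fmt"

// minReadSeconds returns the lower bound on the time needed to read
// sizeMB megabytes from a device sustaining rateMBps megabytes per second.
func minReadSeconds(sizeMB, rateMBps float64) float64 {
	return sizeMB / rateMBps
}

// impliedGBps returns the throughput implied by processing sizeGB
// gigabytes in the given number of seconds.
func impliedGBps(sizeGB, seconds float64) float64 {
	return sizeGB / seconds
}

func main() {
	// 13 GB at the drive's rated 497 MB/s.
	fmt.Printf("%.2f s\n", minReadSeconds(13*1024, 497)) // prints "26.78 s"
	// 13 GB processed in 0.98 s.
	fmt.Printf("%.2f GB/s\n", impliedGBps(13, 0.98)) // prints "13.27 GB/s"
}
```

The gap between the two numbers is what points to the file being served from the OS cache in RAM instead of from the SSD.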

  • @Kane0123 2 months ago

    I’m waiting for a cloud vendor to suggest just running all billion in serverless - scale up to what you need to scale down when you’re done bro, e.z.

  • @amjad-se 2 months ago

    Could you please do a video on Pocketbase?

  • @JackClawson 2 months ago

    Boomer loops sounds like a great cereal, now with fiber.

  • @Tony-dp1rl 2 months ago +1

    I still don't understand how these BILLION row challenges are not entirely IO limited ... I mean even in JS, how do you spend more CPU time than it takes to read that much data? :/

    • @Tresla 2 months ago

      This is my question. How are they getting millisecond solutions? What are they running on? My NVMe drive tops out at around 1500MBps, so I couldn't even process the file in less than 10 seconds...

  • @sedrakpc 2 months ago

    How is it done in Java in 1.5 seconds? Now you have to read the Java version)

    • @lazyh0rse 2 months ago +2

      they used native GraalVM, it compiles Java to machine code

    • @javierflores09 2 months ago

      @@lazyh0rse this wasn't the only reason. Sure, it reduced the time by removing the startup cost, but there are many tricks that led to the 1.5 seconds (and even 323ms when using all 32 cores of the test machine instead of just 8). There is a great blog post by QuestDB that explains the tricks used in the top solutions in detail.

  • @rasalas91 2 months ago

    flip did not take that out

  • @mechmaverick 2 months ago

    I just found your channel and you're the Dr Disrespect of software, get some sunglasses

  • @sebastianwapniarski2077 2 months ago +1

    Can anyone suggest a streamer that is as good with SWE but on the other side of the spectrum - TEMPERAMENTwise. I'm more of an Uncle Bob kind of guy.

  • @avalagum7957 2 months ago

    That Go person used tabs (8 spaces)?

    • @Yawhatnever 2 months ago

      All Go code uses tabs. The reason it looked excessive was because the default browser styling for the tab-size property is 8 spaces, and apparently they didn't change it with css.

    • @avalagum7957 2 months ago

      @@Yawhatnever Oh, thank you. I didn't know that.

  • @bhuvya11 1 month ago

    I want someone to try this in javascript 😂😂😂

  • @b0nes95 2 months ago

    how can you read 13GB from disk in 1.5 seconds even :/ I need to watch the rest of the video lol, the timer must've been started while the 13GB was in mem

    • @Tresla 2 months ago

      RAM disk possibly?

  • @user-jw9iw2zy1k 2 months ago

    13GB in one second? I think the ssd couldn't even be that fast, right?

  • @FaZekiller-qe3uf 2 months ago

    Joelang

  • @qazarify 2 months ago

    This cannot be true, the Kingston SSD SV300S37A is not capable of transferring 13GB/sec

    • @Yawhatnever 2 months ago

      Windows caches file reads in RAM when it can, so it's plausible that not all of the reads are hitting the disk

  • @viktorhugo1715 2 months ago

    Renato Pereira is a Brazilian name soooooo...
    BRAZIL MENTIONED LWSGOOOOOOOOOO BRAZIL!!11!1!1!1!!1!1!1!11!1!1!1!1!!!!1!!1!1!1!!1!

  • @bluecup25 2 months ago

    15:55 - Ignored

  • @pantsoff 2 months ago

    Flip didn't take it out

  • @ytdlgandalf 2 months ago +1

    nobody is wondering how he can read 13GB in under a second? Really?

  • @FrederikSchumacher 2 months ago

    Gopoutine

  • @ismbks 2 months ago +1

    the one guy in your chat spamming "hardly know her" jokes

  • @havokgames8297 2 months ago

    Stalagmite - *might* reach the ceiling one day
    Stalactite - holding on *tight* so it doesn't fall

  • @sebastianwapniarski2077 2 months ago

    There are two kinds of great professionals who show off their skills: 1) those who will inspire you, 2) those who will throw you into despair. For me Prime is the second kind. But he's funny, I'll give him that. And him boasting about how he ruined everyone's day when he finished that calc test way ahead of the others back in his uni days is just proof of this.

  • @jazzochannel 2 months ago

    how can i insert a yomoma joke here, or an insult involving your mom?

  • @TheRadischen 2 months ago +1

    2

  • @selimpy8105 2 months ago +1

    damm so early

  • @truehighs7845 2 months ago

    Why windows, that's gotta count for half the slow down, you want to optimise, get rid of windows.

  • @chasep9440 2 months ago

    Or you could just code in Elixir because it's just straight better.

  • @himbo754 2 months ago

    32 GB RAM? So laughably small...

  • @mkvalor 2 months ago

    Ain't no way you're a Boomer. _Maybe_ Gen X.

  • @dave4148 2 months ago

    please do this in javascript!