I've learned more by watching tsoding streams than studying CS for 5 years
He thinks like a true engineer and teaches to do the same
Tsoding has a really good teaching style for people who learn by doing/seeing in practice which is usually neglected in CS education. Love him!
studying CS for 5 years enabled you to understand the tsoding streams
me too
damn sounds like you went to a terrible school
I just wanted you to know that almost 10 years after abandoning programming (because of... reasons), your videos made me realize I want to learn and start practicing again. I had forgotten that programming was supposed to be fun.
May the force be with you
Have you programmed the past 6 months?
I love how people say, under any programming "teaching" style of video, that they learned more from said video than in 5 years of uni, or in 3 semesters of X discipline.
One of my theories is that you learned a lot during those times but after the video it solidified a certain concept because of how the video is put together.
I liked your idea and agree with you.
When I'm bored at work, I usually open the docs and source code of our main dependencies. Sometimes, I get great insights reading them
Thank you Zozin, you taught me how to write code and how to think way more properly than I used to, you're such a genius!
welcome to yet another ą̵͚͓͉̣̩̳͗͗̊̓̐͑͂̈́̿̓̚ͅz̶̡͈̞͚̞̼̭̰̬̝͙̮͋̌͛̌̎̽̓͠͝ó̶̡̨̧̙̰͇̻̲̩͔̳̤̠̝̜̊̇́͊̐̌̂̀͘͠ž̴͉̓͑͐ͅï̸̢̬̮̠̭̱̈́̓͐́͑̂̕̕n̸̨̪̜̰̠̹͂̏̿̈́͛̒̍͐̈́̍̈́͝ session
ZALGO HE COMES
Some optimization tips:
- Since you're lexing the file anyway, you can simply skip tags. Skip everything between '<' and '>'. That way you can remove the XML parsing.
- I believe at least with western languages (like English), words of 2 characters or less are often simply omitted for TF/IDF indexing. Cuts down on the has table size and generally doesn't lose any information.
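The two tips above can be sketched roughly like this. This is only an illustration, not the code from the video: `tokenize` and `min_len` are made-up names, and the naive tag skipping assumes that '<' and '>' never appear inside attribute values, comments or CDATA.

```rust
/// Splits tag-laden text into lowercase word tokens.
/// Everything between '<' and '>' is skipped without real XML parsing,
/// and tokens shorter than `min_len` characters are dropped.
/// (Hypothetical helper for illustration only.)
fn tokenize(text: &str, min_len: usize) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut in_tag = false;

    for ch in text.chars() {
        if in_tag {
            // Inside a tag: ignore everything up to the closing '>'.
            if ch == '>' {
                in_tag = false;
            }
            continue;
        }
        if ch == '<' {
            // A tag starts here; it also ends the current token (handled below).
            in_tag = true;
        }
        if ch.is_alphanumeric() {
            current.extend(ch.to_lowercase());
        } else if !current.is_empty() {
            // Token boundary: keep the token only if it is long enough.
            if current.chars().count() >= min_len {
                tokens.push(std::mem::take(&mut current));
            } else {
                current.clear();
            }
        }
    }
    // Flush a trailing token that was not followed by a separator.
    if !current.is_empty() && current.chars().count() >= min_len {
        tokens.push(current);
    }
    tokens
}
```

Calling `tokenize(text, 3)` keeps only words of 3 characters or more, which matches the "2 characters or less are omitted" rule of thumb from the comment.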
As the absolute beginner I am, do I understand your comment correctly?
Lexing: the process of reading the file
Has table size: amount of entries in a "list" of some kind that contains everything that is supposed to be filtered out
Thanks in advance
@@datpudding5338 I haven't watched the video (yet), but I think "has" was supposed to be "hash", becoming "hash table size.." which would make more sense to me, unless it's terminology over my head.
@@datpudding5338 the lexer splits the document into specific chunks that are easier to work with than the whole string. here it basically just splits the text contained in the XML along the spaces. but i think the term comes from parsing where it refers to splitting a string into predefined tokens like keywords and variable names.
a hash table is a specific data structure that's very useful to quickly look up data in. but your extrapolation is correct, it's just a very fancy list with special properties.
"If you're watching this on YouTube I'm probably going to cut this" *snaps fingers and proceeds not to cut it* 😆😆😆
I am just glad that you're still alive and well. Hope parents are doing good too.
do u know him ?
25:50 Absolutely fricking loved what you said about turning off the internet and becoming one with the project you’re working on. It’s something called flow state, explained beautifully in the book Flow by Mihaly Csikszentmihalyi.
Tsoding streams are so philosophic
HolyFUck. There are no materials on how to implement a search engine on YouTube.
And here you are, and not in some shitlang, but in Rust!!
I have very big expectations. Because a whole playlist with several 2-hour videos does NOT look like bullshit to me.
It is serious SHIT.
THANKS
It's so cool watching Tsoding both years ago and now
Watch some from his old channel 😮
@@NoOne-ev3jn Which channel are you talking about?
what a nice, slightly noticeable Russian accent the author has, thanks for the useful video, I really liked it)
Came for Rust code, got philosophical thoughts about Internet. I was not disappointed :D
And yeah, Rust is very well built and easy to use once you understand how it works. Traits are amazing.
The story about computer speed and how programmers added delay to make people trust it, and how now it's the opposite lol made me laugh hard. I like these kinda videos, I miss my C years :/
I was feeling so unmotivated to code these days, this is getting me both motivated and teaches a lot of stuff 😭
This was the first time I saw a 2h long video! Now I want to build something too, I may try building it with another language
loving these rust series, you are an angel god
That rant at ~25 minutes is really interesting. Efficiency isn't always better for our brain!
I learn something new and useful every time I watch a stream. If it’s not about the main subject itself it’s usually just some random bit of bash or another arg for a commonly used command. How bout that
Was it the git clone --depth flag by any chance?:)
@@myt436 this time it was yes
Same! I like this channel so much
Please continue with this, i loved this one!
One reason your program might be slow is because I don't think those functions that take in readers and writers do any buffering so you might get a decent speedup by wrapping your files into std::io::BufReader and std::io::BufWriter respectively
Agreed. Unbuffered IO is slow as shit. At least in rust.
Fixed in commit e2ab6f5 three weeks ago. Us youtube frogs will always be a step behind.
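For anyone wondering what the buffering suggestion in this thread looks like in practice, here is a minimal sketch. It is not the code from the commit mentioned above; `copy_uppercase` is a made-up function that only shows where `BufReader` and `BufWriter` go.

```rust
use std::io::{self, BufRead, BufReader, BufWriter, Read, Write};

/// Copies `input` to `output` line by line, uppercasing each line.
/// Both sides are wrapped in buffers so the underlying reader and writer
/// see a few large reads/writes instead of many tiny ones.
/// (Hypothetical example function; returns the number of lines copied.)
fn copy_uppercase<R: Read, W: Write>(input: R, output: W) -> io::Result<usize> {
    let reader = BufReader::new(input);
    let mut writer = BufWriter::new(output);
    let mut lines = 0;

    for line in reader.lines() {
        let line = line?;
        writeln!(writer, "{}", line.to_uppercase())?;
        lines += 1;
    }
    // Flush explicitly so write errors are reported instead of being
    // silently dropped when the BufWriter goes out of scope.
    writer.flush()?;
    Ok(lines)
}
```

With real files you would pass `File::open(path)?` as the input and `File::create(path)?` as the output; the function is generic so it also works with in-memory buffers.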
55:26 - Well, I recently ported a legacy PHP image generator to JS (offloading the work on the client browser instead of the server). And I actually hardcoded a 1 or 2 seconds delay into the presentation of the result to actually indicate to the user, that work is indeed being done. 😂
Hey, Brazilian folks of the channel!! I noticed many Brazilian people were subbing to you this stream
DF and IDF are the same statistics used by elastic search I guess to power their search functionality
I completely agree that rust is for smart enthusiastic people who like spending their time making really complicated abstractions. The problem with rust is that you waste your time building these abstractions. You spend your time debugging your programming language knowledge rather than debugging the application.
45:25 After learning about the term "nerd sniping", I searched for the xkcd comic and spent about an hour on that website before returning to the video xD is there also a term for that?^^
going down an XKCD rabbit hole
54:05 my grandma worked as an accountant, and she started with an abacus.
When the first mechanical calculators appeared in their bureau, she was double checking every calculation with an abacus.
When the first electronic calculators started appearing in their bureau, she was double checking them with the mechanical calculating machine.
When the first PCs were installed she started double checking the calculations on a calculator.
Hey Tsoding, i love your skills and videos, just one request: when you do a video can you please add some study materials and references that we can look up beforehand, and the basic prerequisites we need to study to easily follow your chain of thought.
Also some kind of announcement of your next video topic so that we can read up on it and be better prepared
tsoding: "The youtube people can't hear us"
YouTube people: 😅
There is no serde in the post apocalypse
Great video, but what I really want to know is how you zoom your whole screen like that live? 1:13:22
it's his own project called boomer
Windows has something similar (but not quite as good) if you hit Win key and Plus key at the same time (get out of it with Win + Esc), if you're on Windows
@@areshaistg thanks! I'll try it out
1:43:30 yeah, trying to serialize it to JSON is hard. If I were you I'd store it in an SQLite file. It'd be easy to write the query to add each entry in `tf_index` to the db and also straightforward to parse it later
Really interesting video, hope to see more of this project.
27:00 Disconnecting from the internet for some time is quite brilliant. I was disconnected recently but not by my own will, rather through circumstances. All I could do was work on my old projects and I couldn't actually download any libraries.
Better name proposition: CrustacIndex
I think that Google Search is becoming shitty because nowadays there are lots of AI generated articles and people who specifically make it their JOB to bait Google results into their articles with SEO. But yeah, old Google was way better and I'm just waiting for a better search engine to come along.
so basically it's not that google is shitty now, it's just that the internet is shitty now
@@matroqueta6825 I mean there has always been shitty stuff, it's Google's job to filter that out
@@matroqueta6825 it's just goodhart's law; "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes."
7:54 they only recently started using those because they are easier to game with bots. social media cannibalized the entire link graph
29:27 Be google on a small scale. God damn it, you're right again.
Lord Tsoding is goated
how about that ? how about that ? i love this
he does rust too?? this is just amazing what
Any advice that helped you learn rust on your developer journey? Thanks
subscriptions goes BRRRRRRRR
I love to work from tty because I can feel the lag of the terminal when i'm using X11.
1:45 the itoa crate is just faster than the one in std, because the std one uses Formatter and supports advanced options and formatting. the itoa crate is just a lot faster
How hard was Rust for you as a C programmer? And is it easier to program in rust than in C?
I personally found it to be not THAT bad. It took me a while to become productive in rust since there is such a large shift in how you implement things. Certain approaches used in C are very hard and/or bad in rust. I would consistently spend my time googling how I can work with certain data structures and then realize there already was a trait that gives me what I want etc.
I think the greatest realization was when I finally understood that rust has a different paradigm to C (Rust is declarative compared to the imperative C). Truly understanding how to work with the borrow checker came with experience.
My knowledge in C did help however since rust is sensitive about datatype sizes (see sized and unsized traits) and the knowledge from C and assembly carried over into it.
Nowadays rust is clearly my favourite programming language, I've largely replaced C and C++ with Rust, Go and Python.
@@yeetusdeleetus Thank you for all the info. Another question: Do you think you will miss a lot of C++ libraries while working in Rust? Or does Rust have most of the necessary libraries?
21:36 The youtube people can hear you very well!
I heard my guy here say two sentences and knew I was in for a good time.
28:18 Now I think you are a character from South Park.
Simplicity with security…. **Go entered the chat**
Missed Opportunity to call it RustyEngine
1:51:40
your emacs didn't, but my neovim surely did die
Result has a method called "expect" that unwraps it, but if there's an error, it panics with a message you give it, which is basically what you were doing with unwrap_or_else but in less code. If you want formatting, just call .expect(&format!("ERROR: ..."));
If I need an error object I can't get it from expect, I need unwrap_or_else in this case
yea the only thing that sucks about rust is the fanatics and the foundation... I handle errors like that in zig as well
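The difference discussed in this thread can be sketched like this. Both function names are made up for illustration, and parsing an integer simply stands in for whatever fallible call the video uses. Note that `expect` takes a `&str`, so a formatted message has to be passed as `&format!(...)`.

```rust
use std::num::ParseIntError;

/// `expect` panics with the given message (plus the error's Debug output)
/// if the Result is an Err; the caller never gets to touch the error value.
/// (Hypothetical helper for illustration.)
fn parse_with_expect(input: &str) -> i32 {
    input
        .parse::<i32>()
        .expect(&format!("ERROR: could not parse {input:?}"))
}

/// `unwrap_or_else` hands the error object to a closure, so it can be
/// logged or inspected before choosing what to do. Here we record a
/// message and fall back to 0 instead of exiting.
/// (Hypothetical helper for illustration.)
fn parse_with_fallback(input: &str, log: &mut Vec<String>) -> i32 {
    input.parse::<i32>().unwrap_or_else(|err: ParseIntError| {
        log.push(format!("ERROR: could not parse {input:?}: {err}"));
        0
    })
}
```

So `expect` is shorter when a panic with a message is all you want, while `unwrap_or_else` is the one to reach for when you need the error value itself, as the reply points out.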
"I know this is not traditional in the rust community, but this is my project and i like to do things my way"
👏👏👏👏👏👏👏
i wholeheartedly agree! one thing i struggle with is feeling like i MUST do things exactly 1 specific way or i'm absolutely wrong even though it might still work.
I understand the benefits of having everyone sticking to one established way of coding but Rust compiler is already strict enough, for good reason, don't restrict me even more on top of that.
Would you say it's possible to do a go-along with the tutorial and make it in c++ (or even c or java)?
7:27 yes but it was more about pagerank and authority
That was amazing! Thanks a lot!!
pog❤
I really like the color scheme he is using. Is it a custom scheme or can i find it somewhere ?
Any chance you try rust analyzer sir?
Is the second video uploaded? I couldn't find it.
youtube people cannot hear us lmao
Oops! Forgot to cut it out! :D
>YouTube people cannot hear us
Lmao
like first, watch next
Love this!!
57:00 - That's a great reason to use more threads.
you are good, i like it!
Can you do JS/ts?
Anyone know what font he uses ? thanks
maybe Liberation
You mean you want to do something like spotlight or alfred?
Ryan Dahl created Node.js in C and then Deno in Go and rewrote it in Rust - and he said he will never ever start a new project in C.
So here is the work ... and the opinion of a productive creator. Also I think there is a phase in every language where libs and frameworks etc. have to be created by smart enthusiastic people. Like in PHP - nobody would care if you started a project like Symfony or Laravel now. But back then it was huge and important.
Fire 🔥🔥🔥 awesome content 🔥
that's legendary
It's kinda wild how long taking text to hash map takes in rust...
In python it'd be this:
Counter(text.replace(", "," ").split())
Then
json.dump
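For comparison, the counting step is not much longer in Rust using only the standard library. This is a rough sketch, not the code from the video, and `term_frequencies` is a made-up name. Dumping the map to JSON would need an extra crate or hand-written output, which is left out here.

```rust
use std::collections::HashMap;

/// Counts how often each word occurs in `text`.
/// Words are split on anything that is not alphanumeric and lowercased.
/// (Hypothetical helper for illustration.)
fn term_frequencies(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
    {
        // `entry` inserts 0 on first sight, then we bump the counter.
        *counts.entry(word.to_lowercase()).or_insert(0) += 1;
    }
    counts
}
```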
Tf idf is pretty good. U can use a bit of extra stuff to make it ridiculously good.
So bm25 changes the formula by a bit but it's the same core idea
Modern stuff uses ML embeddings but for a large query what u would usually wanna do is run bm25 then take those results and rank them by the embeddings.
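To make the "same core idea, slightly different formula" point concrete, here is a sketch of the per-term score in the commonly cited Okapi BM25 form. The function names are made up, and the constants `K1 = 1.2` and `B = 0.75` are just frequently used defaults, not something taken from the comment or the video.

```rust
/// Smoothed inverse document frequency, as used in the common BM25 variant.
/// (Hypothetical helper; counts are passed as f64 for simplicity.)
fn bm25_idf(total_docs: f64, docs_with_term: f64) -> f64 {
    ((total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5) + 1.0).ln()
}

/// BM25 score contribution of one query term for one document.
/// Compared to plain TF-IDF, the term frequency saturates (controlled by K1)
/// and is normalized by document length (controlled by B).
fn bm25_term_score(
    term_freq: f64,
    doc_len: f64,
    avg_doc_len: f64,
    total_docs: f64,
    docs_with_term: f64,
) -> f64 {
    const K1: f64 = 1.2; // assumed default: term frequency saturation
    const B: f64 = 0.75; // assumed default: strength of length normalization
    let length_norm = 1.0 - B + B * doc_len / avg_doc_len;
    bm25_idf(total_docs, docs_with_term) * term_freq * (K1 + 1.0)
        / (term_freq + K1 * length_norm)
}
```

The score of a whole query is then the sum of `bm25_term_score` over its terms, and the top results can be handed to a second stage that re-ranks them, which is the two-step setup the comment describes.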
I wanna make a lang that is like c but safe and functional.
Like variables are functions that return stuff.
And if u need weird stuff u have something like a class or something where u can modify the internal variables of the function
lua?
@@bhavyakukkar not quite, it's dynamic and GCed
Find text via regex: NAH
FZF to reduce context and search easily: _YEAH_
The problem with google is that it tries to know everything about everything
"Can we have a safe programming language without sacrificing the simplicity?"
Yes, we can! It's called Zig ! ;)
@@andrewdunbar828 When performance is not that critical, go is the better language. You can finish the work 5x faster.
use handcam to check your hand movement! Also music is distracting ; if you could lower down or delete it ; would be very cool.
You forgot to cut the video ) 21:40
fuck, this is what I am looking for, craftman :)
Why do you need to put something behind your camera?
So that he knows what part of the screen the viewers can't see
@@stewartzayat7526 thanks for the reply, that's really smart
Online detoxification is awesome.
Straight up thought this guy died. He hasn't posted on his original channel in years right?
interesting
Not interesting it's really interesting.
@@ItzSwiftyBoy_Gaming aren't really interesting things interesting?
HUH non-idiomatic rust ???? you clearly didn't read da roulex , you now must shall be updating index for as much years as many clippy pedantic lints you have violated
cargo clippy --all -- -W clippy::all -W clippy::pedantic -W clippy::nursery -D warnings 2>&1 >/dev/null | rg error | wc -l on current commit gives 29 so ...
also if you don't want to wait too long use sparse-registry with the -Z sparse-registry flag or config in .cargo (will be in version 1.68 in March). can't send links but it's on the Rust blog
dude how to be cool like you! Please tell me in one line!
you gotta beat your mom weekly
get gud
@@mustafazakiassagaf1757 🤣🤣
Years of learning and practice
nice
how old are u?
His uptime is 31 years and counting
21:36 😈
I asked AI about the color theme of your IDE:
As an AI language model, I do not have access to real-time information about the current state of Tsoding's IDE or any other developer's IDE. However, based on some of Tsoding's past videos, he has used different color themes for his IDE, including a dark theme with a black background, white text, and colorful syntax highlighting for different programming languages. In some of his videos, he has also used a light theme with a white background and dark text. It is possible that he may have changed his IDE color scheme since then, but without any specific information, I cannot say for certain.
That is true, Google has become shit. And not only search but other Google services too
"c is much simpler than rust"
"look how easy it is to read a folder" (it would take much more shitty c code to read the contents of a folder)
No, it wouldn't. Just as easy to read in C.
@@ElPikacupacabra If you ignore all the possible ways to blow your leg off by doing the "simple" way in C, sure.
@@ArisenDrake What's so hard about fopen and checking/handling the error? It's what you do in Rust anyway. Do you think unwrapping is some magical statement that solves errors?
@@ElPikacupacabra The difference is simple: Rust ENFORCES error handling, while in C, someone could forget to do it. Humans make mistakes.
Also, while this is subjective, Rust's API for this just looks nicer.
@@ArisenDrake Rust doesn't really enforce error handling. You can simply ignore errors and the program will panic. That's no different from C erroring out.
Regarding your desire for syntactic sugar: sure, the dedicated mechanism makes it nicer, but you also pay with complexity. IMO, it's not worth it. It doesn't solve a significant engineering problem.
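For reference, this is roughly what reading a folder with explicit error propagation looks like in Rust. It is a sketch, not the code from the video, and `list_dir` is a made-up name. It also shows what both sides of the argument are pointing at: the `Result` has to be dealt with somehow before you can get at the entries (`?`, `match`, `unwrap`, ...), but nothing stops you from choosing `unwrap` and getting a panic.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the sorted file names inside `dir`.
/// Every fallible step uses `?`, so an error is passed up to the caller
/// instead of being silently ignored or turned into a panic.
/// (Hypothetical helper for illustration.)
fn list_dir(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        // Each directory entry can fail on its own as well.
        let entry = entry?;
        names.push(entry.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok(names)
}
```

Swapping each `?` for `.unwrap()` compiles just as well and turns every failure into a panic, which is the behavior the last reply is describing.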
Where’s the simplicity? Where’s the actual f**cking work? Tsoding was exclaiming, while having a working Rust program in 5 minutes.
Yepic
he didnt cut HAHA