After watching so much "clean code" and "good standards" and "being production ready" on YouTube, this feels refreshing. A down-to-earth coding session for a fun project.
nice profile foto 🤣
props to him for writing code that actually does something.
So-called "clean" code is often sparse in terms of actual things being done.
I think the function peak has to be renamed to peek in both files, since peak means the top and peek means to look ahead.
I was confused why he named it that way!!
u right
I miss the peek and poke intrinsics from C64 basic! 😅
@@henrikholst7490 that is OLD!
I wasn't born in those years of Commodore 64 and old computers but I'm fascinated by them!!
@@pixeled-yt how r u using ubuntu on windows
2:35 This does work, however `&&` in bash will only execute the second command (your echo) if the exit code of the first program is `0`. In your case the exit code is `2`. For this use case, you want to use a semicolon (`;`) instead of `&&` to chain the commands, to have your `echo` run regardless of the first program's exit code.
To be honest this is a great video. I have a ton of experience from software development but I never tried to make my own compiler. This guy just goes at it. "it's just another program" love the mentality that nothing is too hard to do.
this is peek content
Glad it peeked my interest.
I See what you did there...
19:18 Bjarne once said: "There are only two kinds of languages: the ones people complain about and the ones nobody uses"
I really hope you continue this series! its been so fun to stumble along with you
This is one of the most interesting series I've seen on YouTube. Just perfectly paced, understandable, great presentation. Thank you.
This series is so much fun and so interesting. It makes me feel so smart
The fun to come in parsing is that infix/prefix expressions (why would anyone do postfix in current year?) are done via Pratt parsing while statements are parsed via recursive descent or GLR.
Wish you good luck and don't let the modern C++ features bring you down!
ua-cam.com/video/8QP2fDBIxjM/v-deo.html
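For anyone wondering how precedence gets handled in expression parsing, here is a minimal sketch. It uses precedence climbing, a close relative of Pratt parsing, and evaluates integers directly instead of building AST nodes to stay short. `ExprParser` and all of its members are made-up names for illustration, not code from the video.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical mini-parser: evaluates integer expressions with + - * /
// (no parentheses, no error handling) to show how precedence is resolved.
class ExprParser {
public:
    explicit ExprParser(std::string_view src) : m_src(src) {}

    // Parse an expression made of operators that bind at least as
    // tightly as min_prec.
    int parse_expr(int min_prec = 0) {
        int lhs = parse_number();
        while (true) {
            skip_spaces();
            if (m_pos >= m_src.size()) break;
            char op = m_src[m_pos];
            int prec = precedence(op);
            if (prec < 0 || prec < min_prec) break;
            ++m_pos;  // consume the operator
            // prec + 1 makes operators of equal precedence left-associative.
            int rhs = parse_expr(prec + 1);
            lhs = apply(op, lhs, rhs);
        }
        return lhs;
    }

private:
    static int precedence(char op) {
        switch (op) {
            case '+': case '-': return 1;
            case '*': case '/': return 2;
            default: return -1;  // not a binary operator
        }
    }

    static int apply(char op, int a, int b) {
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            default:  return a / b;  // '/'; the sketch does not guard against zero
        }
    }

    void skip_spaces() {
        while (m_pos < m_src.size() && m_src[m_pos] == ' ') ++m_pos;
    }

    int parse_number() {
        skip_spaces();
        int value = 0;
        while (m_pos < m_src.size() &&
               std::isdigit(static_cast<unsigned char>(m_src[m_pos]))) {
            value = value * 10 + (m_src[m_pos] - '0');
            ++m_pos;
        }
        return value;
    }

    std::string_view m_src;
    std::size_t m_pos = 0;
};
```

With this, `ExprParser("1 + 2 * 3").parse_expr()` groups the multiplication first, and `10 - 4 - 3` is grouped as `(10 - 4) - 3`. A real parser would return a binary-expression node where this sketch calls `apply`.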
In your Peek() functions, you're not using the 'ahead' value when retrieving the character/token using the .at() function. This could lead to some problems down the road, if it isn't already. Also, afaik from previous attempts at doing this sort of thing, it's a good practice to make sure there's a new-line at the end of the file when you load it (or just add one to the loaded text) so your tokenizer doesn't miss the final token. Could be worth looking into. Loving these videos though!
Also, the ahead value should default to 0, to keep the behaviour that is currently expected of the return value when it is not none. Also, the > should be a >= when comparing index+ahead with the length, again to preserve correct behaviour.
Yeah, I was surprised this flew under the radar but I'm guessing trying to keep a train of thought while talking is making it harder for him
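A minimal sketch of the peek fix described in this thread, assuming a tokenizer that holds the source string and an index. The class and member names here are illustrative guesses, not the actual code from the video.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Hypothetical minimal tokenizer core.
class Tokenizer {
public:
    explicit Tokenizer(std::string src) : m_src(std::move(src)) {}

    // Look at the character `ahead` positions past the current index
    // without consuming it. ahead = 0 means "the current character".
    [[nodiscard]] std::optional<char> peek(std::size_t ahead = 0) const {
        // >= rather than >: index == length is already one past the end.
        if (m_index + ahead >= m_src.length()) {
            return std::nullopt;
        }
        // The offset is actually used in the lookup.
        return m_src.at(m_index + ahead);
    }

    // Return the current character and advance past it.
    char consume() { return m_src.at(m_index++); }

private:
    std::string m_src;
    std::size_t m_index = 0;
};
```

With the default of 0 and the `>=` check, `peek()` returns the current character and only returns nothing once the index has really run off the end, so no trailing newline should be needed to catch the final token.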
This Video series is a great source of learning. Thanks for making. Keep Uploading More Videos like this
I love these vids. i just finished the last one. it was the perfect video to watch after taking the final for computer systems and architecture. biggest part of our final project was writing an assembler, so your vid felt perfect to watch next
this is a great series, i just finished the last video, excited to watch this one
This is *peak* () entertainment
commenting second part lessgo
didn't find any other sussy timecodes
44:20 malloc & free
29:40 lisp langs look like a perfect AST example, they don't even need much explanation: (display (+ 1 (* 69 (- 10 2))))
11:50 void type exists for no return
10:43 yay my suggestion from part 1 was mentioned
8:40 you can return '\0' char
I am just binge watching a dude going insane 😂 love the videos on the compiler
Lol "an abstract syntax snake". Great video, but if one is serious about wanting to learn to write programming languages I recommend learning to compile to C or LLVM. Both of those are cross platform and will generate fast code. Assembly is cool, but really niche at this point and creates countless platform headaches. Btw for those who are scared of C++ one can still write a compiler in python or javascript, and when that is done one can make the compiler self-compile. The best languages IMO for doing this are F# and Rust; both have advanced pattern matching and great debugging and testing frameworks.
I find Rust scary. C++ looks more scary than it need be due to the feature bloat added in recent years.
@@toby9999 I agree. Rust is really nice, but it takes some time to get used to.
Hey! Nice explanation dude! These vids should be rated higher!
2:40 possibly the main reason it did nothing is that the && operator only executes the next command when the left command ends successfully (exit code 0), but you are using all sorts of numbers to test it, which is basically an error as far as the shell is concerned, so it does not call echo at all.
Nice video:) never tried doing something like this but might just try myself sometime. Some small things I noticed;
Modern C++ prefers stringviews over passing const string references. This prevents any copies being made.
And since this is a video about writing a compiler… splitting your header files and implementation (CLion can do that quickly for you) can speed up your compilation of your compiler dramatically 😛
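A small sketch of the string view suggestion. `count_digits` is a made-up helper, not something from the video. As far as I know a `const std::string&` parameter does not copy an existing `std::string` either; the gain from `std::string_view` is mainly that literals and substrings can be passed without building a temporary `std::string`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: counts the decimal digits in some source text.
// std::string_view accepts std::string, string literals and substrings
// without allocating.
std::size_t count_digits(std::string_view text) {
    std::size_t count = 0;
    for (char c : text) {
        if (c >= '0' && c <= '9') ++count;
    }
    return count;
}
```

One caveat: a string view does not own its characters, so it should not outlive the string it points into.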
Loving these videos, as someone who enjoys tinkering and making their own strange languages it's fun! I would say, do you not think for the nodes instead of prefixing with Node it would look better post-fixed with Node? Such as ExitNode, ExprNode. Also, exit in C is defined in cstdlib and is a function (that the compiler does funny things to when it is using due to there being no way to actually express an exit). Maybe lexing and parsing a function call would be better in the long run here. FnCallNode could contain a vector and then it scales a bit. Then you can just hardcode checks for the function name for now to mess. Anyway, good video bro keep em coming
What do you mean "then it scales a bit"?
No doubt that having a type for builtin functions that take a vararg of "values" and return a single "value" result makes so much more sense than these raw keywords pointing to an ExitNode or SysCallNode.
@@SimGunther Well because they have just begun, I thought that having baseline functionality for functions would make life easier in the long run. Just a simple construct that takes arguments and optionally contains a return value. Then it scales because they can implement more functions without much boilerplate.
@@keykeyjean2003 That's a similar train of thought I had for that construct. There's a call expression I evaluate to see if the function name belongs to a builtin/intrinsic before calling it as a regular function and not an evaluation of the vector of statements with an environment localized to the non builtin function.
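A rough sketch of the generic call node idea from this thread, with the builtin check hardcoded on the function name. `FnCallNode` is the name suggested above; `Builtin`, `lookup_builtin` and the integer-only arguments are assumptions made to keep the example small.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical set of builtins; only exit exists so far.
enum class Builtin { Exit };

// One generic call node instead of a dedicated node type per keyword.
struct FnCallNode {
    std::string name;
    std::vector<int> args;  // integer literals only, for brevity
};

// Hardcoded name check: returns the builtin a call refers to, or nothing
// if it should be treated as a regular (user-defined) function.
std::optional<Builtin> lookup_builtin(const FnCallNode& call) {
    if (call.name == "exit" && call.args.size() == 1) {
        return Builtin::Exit;
    }
    return std::nullopt;
}
```

Adding another builtin would then mean one more enum value and one more name check, rather than a new token type, node type and parser branch.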
yo this series is awesome. i love programming
This guy makes my adderall sleepy
Maybe the "exit" check should have a space? To be "exit " example of issue
String brexit13;
Would cause a crash.
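One way to sidestep the "brexit13" problem without matching on "exit " with a trailing space is to read the whole word first and only then compare it against the keyword. A minimal sketch, where `TokenType`, `Token` and `read_word` are hypothetical names:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

enum class TokenType { Exit, Ident };

struct Token {
    TokenType type;
    std::string text;
};

// Reads one alphanumeric word starting at `pos` and classifies it.
// `pos` is advanced past the word.
Token read_word(std::string_view src, std::size_t& pos) {
    const std::size_t start = pos;
    while (pos < src.size() &&
           std::isalnum(static_cast<unsigned char>(src[pos]))) {
        ++pos;
    }
    std::string word(src.substr(start, pos - start));
    // Compare the complete word, so "brexit13" or "exits" never match "exit".
    if (word == "exit") {
        return {TokenType::Exit, word};
    }
    return {TokenType::Ident, word};
}
```

This way `exit;` with no space still tokenizes as the keyword, while `brexit13` comes out as an ordinary identifier.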
finally a youtuber that listens. ggs
The first breaking change, let's go to version 1.0 🎉
loving this series so far
You missed the joke about the ASS - Abstract Syntax Snake
9k views with
Such a well done series, looking forward to the upcoming parts. Will you ever use the ahead parameter or your peak (peek) functions?
28:40 when you got to this point shouldn't you have written tests to make sure your compiler stays in a working condition?
You can define the functions without inline. All inline does is tell the compiler to inline the function's contents directly into the place where you call it.
The only problem will appear if you compile separate object files that have the same thing defined. Use a single cpp or include them like this. You could even error out if the same cpp file is included twice. Avoid header files as they are there just for the problems we want to avoid - multiple compilation units that share code. 😅
A better more suitable keyword would be "static" which makes the function local to the compile unit. I think. It has been ages since I touched C/C++
The inline keyword actually has very little to no effect on the compiler's decision to inline for the major compilers. It's more for allowing multiple definitions without violating ODR, therefore allowing implementations of non-template functions in headers. It also has novel uses in static variable initialization
@@henrikholst7490 Tbh everything you said here tells me you never knew how to write C/C++ properly in the first place.
Had a project to make a language in Java while in school, so this is going to be interesting to follow to see what choices you make! btw, "./out ; echo $?" should work for a one-liner. it'll run the first command before the second rather than together.
&& will only run the second command, if the first one ran successfully (returned 0).
Because his program returned 20, the && didn't run the echo.
@@Kiwi-tq2fy Accurate, better than my late night explanation :)
I'm seriously sure you get bounds checking with array lookup of an stl vector, ie using []. It's the same thing.
Hey, great series of videos, looking forward for this compiler series.
What's your CLION theme?
Keep up the great work, and informative series!!!😀
And what font do u use btw?
Font: Iosevka
Theme: One Dark
this is peek content (bah dum tss)
Why has no one pointed out that the off by one in peek was because you're checking one ahead for end of string/vector while actually peeking the current character/token. The amount to peek should default to 0 and you should return the character/token of index+amount. Now the amount does nothing and your peek is only accidentally complete when you check if peek+amount is larger than length/size.
Again, good video. This can be such a good series!
Such a great series ! ❤
46:08 in VSCode it's Alt+Shift+up/down arrows to move lines and Ctrl+L to select multiple lines before moving. Not sure if it's the same in CLion.
12:00 A "const" method could have side effects such as modifying a "mutable" member.
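A tiny sketch of that point about `const` and `mutable`. `Source` and its members are invented for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical class: a const method that still changes observable state
// through a `mutable` member.
class Source {
public:
    explicit Source(std::string text) : m_text(std::move(text)) {}

    // Declared const, yet every call bumps the read counter.
    std::size_t length() const {
        ++m_reads;
        return m_text.length();
    }

    std::size_t reads() const { return m_reads; }

private:
    std::string m_text;
    mutable std::size_t m_reads = 0;  // may be modified even on a const object
};
```

Calling `length()` twice on a `const Source` leaves `reads()` at 2, so `const` on a method is a promise about the non-mutable members only.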
@12:05 "Hey it's C++, you know how it works!" - nope
Making a snake from a pe*is made my day
I think this stream would be 2x if he did this with pair programming. Feel free to steal my ideas.
33:10 yeah, and we thought C++ was bad, it's just a mini-boss.
just finished watching pt1. lol
Dumb question, but what's that font for code called? I see it sometimes but forget it
Iosevka
Do all high level languages generate Parse Trees? I once was in a group project writing a compiler for Prolog in Haskell and we also had to implement sld-resolution, so that's why I wonder?
Not necessarily, most c-style languages do. Might not be 100% necessary depending on the syntax of the language like a functional or stack-based one
@@pixeled-yt Thanks :)
I had the same "is never used" warning when I used CLion. It was always annoying, and there doesn't seem to be a way to fix it.
Ah yes, the Abstract Syntax Snake®, or the ASS for short.
./out && echo $?
fails because the return code of ./out is non zero.
(It doesn't execute following commands if the command fails)
What I tend to do is to put them in a bash script and just run the bash script.
He is I guess the first person I ever watched that uses #pragma once, clean code standards are truly something for him..
FYI, if you want to do ./out and echo $? on the same line, use a semicolon instead of &&, like this:
./out; echo $?
you shouldn't need any of the inlines, methods defined directly in the class body are implicitly inline and shouldn't cause any ODR violations
tsoding from wish kinda goes hard ngl.
tsoding from wish lol
2:37 Using && only runs the right hand side if the left hand side was 0. You can run both on one line the way you want by using a semicolon instead, like this: ./out; echo $?
1:03:42 i feel like not inlining everything in headers could've avoided this :^) you'd only need to include the bare minimum in the headers themselves, and then can include everything you need in just the .cpp files
Cant you just include everything in a precompiled header and include the pch everywhere (ofc while keeping pragma once)?
What exactly will be the product that will emerge at the end of these videos?
17:40 petition to rename unpeek to regurgitate.
hey there i have got a doubt in making parsers because how do they get their binaryexprs and whats the order of precedence
part 3 when?
hey your compiler playlist is backwards
Fixed, thanks for letting me know
every time there is a big change in the code i get about 30 errors and i never know what it wants cuz my compiler is broken
when you use [] if the item in the vector/array/list doesnt exist, itll add it to the vector/array/list. So .at is just a lot better.
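For `std::vector` specifically, as far as I know `operator[]` neither inserts a missing element (that is how `std::map` behaves) nor checks bounds: an out-of-range index is simply undefined behaviour. `.at()` is the one that checks and throws `std::out_of_range`. A small sketch of the checked side, where `at_throws` is a made-up helper:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if v.at(index) throws std::out_of_range,
// false if the access is in bounds.
bool at_throws(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);  // bounds-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

The sketch deliberately does not demonstrate `v[index]` out of range, since that has no defined result to show.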
@Pixeled When you edit the test.hy file, why did you need to recompile?
I am pretty sure that peak in your context, is actually spelled peek btw
Lol, you're right
@TigranK115 lmao
Making your member functions inline when you define them directly in the class is redundant, they're already inline. And inline is not a performance related specifier, it's not about inlining, it's about making something avoid the ODR.
Inlining is also about performance if it eliminates the function call overheads.
@@toby9999 their point is that 'inline' has little to no effect on compiler inlining for the major compilers. It's all about ODR (one definition rule - allowing multiple definitions i.e. definitions in header). And the OP is right that member functions should be implicitly 'inline' already IIRC
so... I have been refreshing the channel page every 20 mins on average since i watched this video. just for pt. 3
I have pt. 1 & 2 done before and the explanation was good. JUST the last step is missing for me. How I get the tree to do stuff...
EDIT: I do not need _optimized_ code, i need _explained_ code.
This is well explained (with some very minor hiccups, : D so cute).
[+ minor edits on the phrasing]
Check tomorrow morning ;)
@@pixeled-yt : O
°(^_^)°
really loving this series ! 🔥
I would write my compiler in JavaScript, since then no one would have to bother with C-Make and all that other stuff. There are a lot of drawbacks to using JS, but at least it is a lot simpler to run.
Been doing C++ application development on Windows for 25 years. I never use CMake. Must be a Linux thing?
10:42 couldn't figure out the answer for what subscript is called and the answer is quite literally right there in front of him on the screen XD.
don't worry i've done similar quite often.
Great video, Tsoding!
Great one. Pls keep up thnx 🥰
Why do you use else after a continue at 1:05?
2:38
./out; echo $?
55:34 another compiler of course
Anyone, please help me. I coded it like the video said and it works like a charm. But when I use a terminal command, it opens an external application. For example, when I type ./hydro, another application outside CLion opens to execute that command. Same with ./out: it opens an external app, and I cannot echo the return value.
Edit: I know where it went wrong. I was running the Windows terminal and just needed to switch to Linux.
What clion theme?
One Dark theme
I would make way more utility functions, like `consumeWord` or `consumeDigits`
Yeah the code is really dirty
By the way, that's not what the inline keyword means in c++.
I'm back for more!
You also don't have to recompile after editing the .hy file 😅
it's "peek" not "peak"
Awesome ❤
Bro i think i am too noob. Still i will make my own compiler
Wonderful! 👍
The sequence of includes is a higher intelligence, trying to tell you, that you could as well have a Tokenizer instance in your parser and you tokenize lazily instead of using that std::vector.
As a Common Lisp fan, the classical "Dragon Books" approach to building compilers looks just wrong. If you have a homoiconic syntax, you do not need to change the grammar and the parser and lexer each time, you add a new idiom to your language. Maybe one day in the future, you will find it useful to try it the Lisp way...
Have you heard about two programs called "yacc" and "lex"? :|
This is just an overview of how the compiler works.
I don't think he will create it for future use.
32:48 “…and we’re talking about a formal grammar here, not just your, you know, your simple English grammar like what even is that; English is all confusing. Are there even rules to it at this point there are so many exceptions…”
I LMAO’ed when I heard this cause I’ve thought about how terrible English is as well (sometimes I text using parentheses to group the words to prevent confusion or include sub-notes like what I’m doing now). Lol
Nice video, but
You messed up!
when I see him using c++ I realize that java would be perfect if it compiled to an executable... 😅
I hate java. It's not the best at anything in my opinion.
@@toby9999 at least you can know what the "FileInputStream" class does
for a guy who's using c++ he seems to really hate c++
that’s just anybody who uses c++
Ur so cute
create a makefile to make things easier
That is what CMake is for 🥰
makefile is still much easier
"simple" English grammar :) it's a whole lot more complex than a grammar for a programming language. The so-called exceptions are really rules nobody cared to explain in school.
“This will be inline too just because I don’t like writing things in separate files” I don’t think that keyword means what you are implying it means 🤔
It allows me to not have to put the declaration in a header file and implementation in a cpp file. The inline keyword allows you to include the header file (with function implementations) in multiple cpp files without the compiler complaining about a "multiple definition" error
@@pixeled-yt You don’t need to use the inline keyword if you implement methods in a header file. It will be inline implicitly, tho if it’s actually inlined is still up to the compiler.
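To make the two cases in this thread concrete, a short sketch of what would sit in a header. `add_one` and `Counter` are made-up names, and the file is shown as a single listing where a real project would put it in something like a `util.hpp` included from several .cpp files.

```cpp
// Free function defined in a header: it needs `inline` so that several
// .cpp files can include the header without a "multiple definition"
// error at link time.
inline int add_one(int x) { return x + 1; }

struct Counter {
    // Member function defined inside the class body: implicitly inline,
    // so spelling out `inline` here would be redundant.
    int next() { return ++value; }

    int value = 0;
};
```

In both cases the keyword is about the one definition rule; whether the call itself gets inlined into the caller is still the compiler's decision.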
Great video, but if you really want to bridge the gap to the metal, why not directly emit x86 opcodes/operand bytes with an InstructionBuilder abstraction or something and construct an ELF file around that? That removes the last bit of magic imo. Also whenever your compiler stringbuilds instead of going to an IR (or in the final step machine code), you are probably doing something wrong. The only people who have a license to construct compilers THAT horrible are the ML compiler people writing everything in python 🙃🙃🙃🙃
This is great
28:13 almost fixed it 🙃
Hi