Making a calculator from scratch -

Поділитися
Вставка
  • Опубліковано 10 чер 2024
  • Disclaimer: This video is rather programming heavy at points. You are welcome to skip these parts, but do note you might miss some small detail somewhere. These sections have the "Code:" prefix in the timestamps.
    Evaluating math expressions is not an easy problem to solve, despite seeming extremely simple.
    This video is a step by step guide to doing so, taking you through the motivations behind these steps as well.
    This is my entry for the Summer of Math Exposition, hosted by @LeiosLabs and @3blue1brown
    some.3b1b.co/
    Timestamps:
    0:00 - Intro
    1:15 - Lexer/Tokenizer
    3:04 - Code: Lexer Implementation
    4:33 - Beyond Lexing
    5:46 - Structuring Math Expressions
    7:03 - Code: Representing AST Nodes
    7:46 - Evaluating an AST
    9:44 - Code: Basic Parsing Structure
    11:02 - Order of Operations
    12:12 - A general expression
    13:57 - Colouring expressions
    16:19 - Code: Parse Expression function logic
    19:14 - Parenthesized Expressions
    19:57 - Code: Parsing "Terminal" expressions
    21:51 - Finished the basic pipeline
    22:06 - Code Bonus: Implicit Multiplication
    23:14 - Recap and outro
    Links:
    Tagged Unions - en.wikipedia.org/wiki/Tagged_...
    Or possibly, a nicer alternative - blog.ryanmartin.me/tagged-unions
    Code for this project in C - github.com/PixelRifts/math-ex...
    Discord - / discord
    Thanks to:
    Garklein
    Allen4th
    IdkRandomTry
    s m p l
    Asher
    GamesWithGabe
    for your feedback
    Music:
    Somnia I - Reed Mathis
    Interplanetary Alignment - NoMBe
    Creeping Up on You - Godmode
    #SoME3 #C #calculator
    #MadeWithMotionCanvas

КОМЕНТАРІ • 81

  • @voxelrifts
    @voxelrifts  10 місяців тому +59

    Please note, this is *not* supposed to be the easiest way to make a calculator like this. As many have mentioned there are algorithms specifically for evaluating math expressions like shunting yard, which would probably be easier to implement. However this way of generating an AST has its own benefits:
    1) This can be easily extended into an actual programming language
    2) Since we get the full AST at the end, we can do some cool things with it if we just add an 'x' variable node. A few examples can be simplifying an AST by walking the tree and applying rules, or even creating an AST for the derivative of the given expression by analyzing the generated tree.
    Really the opportunities are endless here.

    • @jacobophoven90
      @jacobophoven90 9 місяців тому +1

      I have a question: so in the parse_terminal_expr it does if self.current.node_type == NodeType.NUMBER but then after that it also does if (self.current.node_type == NodeType.NUMBER or self.current.node_type == NodeType.OPEN_PARENTHESIS): which would be always be true if it passed thru the first one. but then in that if statement it does parse_expr which calls parse_terminal_expr and it just keeps going in a loop. I am trying to code your C code in python. Please tell me what I did wrong or how your C code doesnt go in a loop

    • @voxelrifts
      @voxelrifts  9 місяців тому

      @@jacobophoven90 there's a few more conditions above the second condition you mentioned. Namely for unary plus and minus, which you can't implicit multiply with. Which is why they aren't in the list

  • @tomiivaswort6921
    @tomiivaswort6921 9 місяців тому +39

    I love how the word "AST", so as Abstract Syntax Tree, as the german word "Ast" translates to branch, wich is exactly what an AST is made out of

  • @angeldude101
    @angeldude101 10 місяців тому +52

    In a way, this entire calculator is essentially an interpreter for a very simple (non-Turing complete) programming language, so it's completely natural that it'd be similar to other interpreters. It's also possible to argue that the lexer and parser are essentially simple compilers in their own right, translating a source string into more usable data, and then transforming it again into _more_ usable data. Really, the only difference between a compiler and an interpreter is that interpreters reduce their input to the final output on its own while compilers serialize the data it transformed the source to into a format the CPU and OS can understand. Most modern compilers just transform the input into something a more established compiler framework (usually LLVM) can understand and then pass that to said framework, at which point it can apply its own transformations to simplify the data so that the final serialization is smaller and/or faster. (You could even argue that a CPU is just an interpreter implemented in hardware. Modern CPUs also have hardware implemented _compilers_ too to transform the machine code even further into micro-operations that _then_ get interpreted.)
    I remember implementing a simple calculator with the Shunting Yard algorithm outputing RPN, which is basically just a flattened representation of the syntax tree, in, _of all things, DCPU-16 assembly. Gosh_ that was a long time ago...

  • @cynical5062
    @cynical5062 9 місяців тому +24

    As a compiler author, I really enjoyed this. Great video!

    • @tfr
      @tfr 9 місяців тому +4

      A compiler author. The holy messiah amongst us all

  • @EmKonstantin
    @EmKonstantin 9 місяців тому +12

    The "loading" (or "waiting") animation on the tree nodes while evaluation works really well ! So visually simple, yet it clearly shows the order of operations.

  • @nevanjohn
    @nevanjohn 10 місяців тому +40

    Thank you for taking the time to make a video on this topic :D I've been trying to get into low level programming and though I don't fully understand all the code and concepts in this video (on my first viewing), I hope to look into the stuff mentioned here and comeback later to help me implement my own version of this.

    • @AmCanTech
      @AmCanTech 10 місяців тому

      As in assembly?

    • @robkojabko
      @robkojabko 9 місяців тому

      @@AmCanTech even C and C++ are low level

  • @albachteng
    @albachteng 10 місяців тому +16

    This was a fantastic video. Well put together, clear and thorough without holding hands and filled with intuition. Great job!

  • @odomobo
    @odomobo 9 місяців тому +14

    I was expecting you to talk about left- vs right-associativity, since exponentiation is right-associative. I think it's worth a mention at least.

    • @rosuav
      @rosuav 7 місяців тому +2

      Yeah, ditto - I actually thought the inclusion of exponentiation was deliberate as an opportunity to talk about associativity. It isn't too hard; it just changes whether "case 2" is handled like "case 1" or "case 3". But it does mean you have to track a bit more information about the operators.

  • @brummi9869
    @brummi9869 10 місяців тому +6

    I feel your pain, I've been trying off and on for half a year now to program my own calculator/CAS program, and my last iteration (before i stupidly decided to restart from scratch) somehow manages to evaluate Sums, integrals and derivatives analytically (so not just approximations with Riemann sums) and other stuff, but shits itself and dies (crashes my arduino) if it is asked to add 3 decimal numbers. It is very difficult to avoid technical debt, and program functions... in such a general way that you don't have to manually program in every interaction, but so concretely that they still do what you want them to

    • @voxelrifts
      @voxelrifts  10 місяців тому +4

      I have yet to try it but it should definitely be possible (easy?) to analyze the AST and generate a new one for the derivative of a function. But calling anything easy is always famous last words

  • @Isaac-zy5do
    @Isaac-zy5do 10 місяців тому +13

    Nice! I think this approach (pratt parser) is the best way for building up towards writing a conventional programming language, but if you wanted to make a stack based language like dc or forth, or a s-expression lang like lisp, you could get a similar level of usability with a much simpler parser (don't need operator precedence in those languages). A neat trick for implementing operator precedence if you need it that i saw on wikipedia is to add brackets around the operators based on their precedence level, e.g. + -> ))+(( , * -> )*( , (-> (((, )-> ))) and surround the whole expression in (()). Apparently this was done in early fortran compilers.

    • @palpytine
      @palpytine 10 місяців тому

      You should look up "reverse polish"

  • @gameofpj3286
    @gameofpj3286 9 місяців тому +2

    This is explained really well! I really like the part about the recursion and coloring :D Really makes this click for me!

  • @Apple-vm5gc
    @Apple-vm5gc 10 місяців тому +4

    This video is very useful for the compiler design course.

  • @NStripleseven
    @NStripleseven 10 місяців тому +6

    This is halfway to just straight-up being a programming language interpreter. Actually, it kind of already is, in a way.

  • @palpytine
    @palpytine 10 місяців тому +5

    For a simple expression language like this, parsing straight to reverse polish notation is simpler, faster, and uses significantly less memory. The AST approach is better suited to a more complex language with multiple expressions where you're also adding a symbol table into the mix. It arguably does a better job of reporting errors in the expression.

    • @voxelrifts
      @voxelrifts  10 місяців тому +4

      That is indeed true. Infact I cut part of the video that got into parsing function calls and other things because it was just getting far too long and tedious and the concepts were already explained.

  • @laujimmy9282
    @laujimmy9282 9 місяців тому +2

    I really learned a lot from this video. I have no idea that's how a calculator works, now i will look into it. ❤❤❤

  • @SimGunther
    @SimGunther 9 місяців тому +1

    Chidi Williams has neat writeups on parsing where he uses recursive descent for statements while pratt parsing is used for expressions. The code he uses is in TypeScript, but the concepts apply nicely in C as well.

  • @fantasypvp
    @fantasypvp 7 місяців тому

    I did this in rust a few months ago and it was such a fun project, recently I've been working on adding support for functions so you can use sin(), ln() and other stuff

  • @vitorschroederdosanjos6539
    @vitorschroederdosanjos6539 9 місяців тому

    That sounds very interesting for working in non usual fields (rings, groups) pretty nice!!

  • @abdigex8255
    @abdigex8255 9 місяців тому

    You did this so radically different to me. Who knew there were so many ways to make a calculator.

  • @ahmadshami5847
    @ahmadshami5847 9 місяців тому +1

    sheesh that really makes me appreciate that 15$ casio calculator even more now

  • @wChris_
    @wChris_ 9 місяців тому +1

    Take a look at reverse polish notation! Transforming a token stream into this format makes evaluation trivial, as you can use a stack. Numbers will be pushed onto it and operators will take the top 2 numbers and apply the operation and push the result on the stack. The value on the stack after everything has been evaluated is the result.

    • @rosuav
      @rosuav 7 місяців тому

      Heh, I once made something that basically worked like that, but IMO it would have made a terrible explainer. It worked just fine (and I think that code is still in use in one of my old programs somewhere), but ultimately, the "convert to RPN" and "evaluate RPN" steps are very similar to a simplified version of "convert to AST" and "evaluate AST", with the tree being represented by two stacks (one for numbers, one for lower-precedence operators - in effect, the operator stack is for converting to RPN and the number stack is for evaluating).
      But this way is far far better at explaining how you would really want to think about this.

  • @lazergenix
    @lazergenix 10 місяців тому +1

    This video reminds me, I need to get back to work on my compiler I'm making. Need to start writing the type checker 😭😭

  • @isaacmccracken5892
    @isaacmccracken5892 10 місяців тому +2

    Next you should talk about general parsing for functions and structures like you did in your compiler

  • @kasugaryuichi9767
    @kasugaryuichi9767 10 місяців тому

    Thank you!

  • @chennebicken372
    @chennebicken372 8 місяців тому

    My Simple Regex Parser was failing for a long time on unary operators. Also the fact that, they can take Nothing as an argument, made it very difficult: Such as (a|), which means that a or Nothing can come next.

  • @AmCanTech
    @AmCanTech 10 місяців тому +2

    Reverse hungarian notation and shunting yard?
    Interesting stuff, we used a stsck and did this in c++ ... gets quite nested and lots of if statements
    When i had white space i said +0 ...

  • @yash1152
    @yash1152 9 місяців тому

    wow, awesome.

  • @VioletJewel1729
    @VioletJewel1729 10 місяців тому

    shunting yard

  • @philtoa334
    @philtoa334 10 місяців тому

    Nice.

  • @AK-vx4dy
    @AK-vx4dy 9 місяців тому

    When i was young...
    I wrote this from scratch without knowing any proper algorithms... it kinda worked by i stuck on minus followed by unary minus...

  • @codealonewithGod
    @codealonewithGod 10 місяців тому

    First to grab this information 🎉

  • @9remi
    @9remi 9 місяців тому

    5:19 f-bomb. i don't care at all; but i was kinda shocked. calm and informative video, and then in your face. unexpected. i can imagine some kid watching this video in front of strict parents etc and boom. now THAT kid's in trouble, because they clicked on YOUR video. (the reason i'm even commenting about this is because this happened to me many times, i've been that kid.) ranting aside, great video, +1 thumbs up, you've earned a sub.

    • @voxelrifts
      @voxelrifts  9 місяців тому +1

      That's fair xD, i doubt people younger than like... 16 would watch this video, but I will try to see whether I can in place edit it out :P

    • @ciso
      @ciso 9 місяців тому

      ​@@voxelriftsOh no you actually cut it :(

    • @voxelrifts
      @voxelrifts  9 місяців тому +1

      @@ciso xD yeah I did, better safe than sorry

  • @yourfutureself4327
    @yourfutureself4327 9 місяців тому

    💙

  • @lilyzheng2322
    @lilyzheng2322 9 місяців тому

    this video understand me

  • @Xdetonando
    @Xdetonando 9 місяців тому

    I asked chatgpt for an medium exercise and he gave me this lol, im struggling with operator precedence, where did you find all the information needed to solve this problem?

    • @voxelrifts
      @voxelrifts  8 місяців тому +1

      Combination of craftinginterpreters.com, Doing my own projects like a compiler and a math graphing game and also a bunch of tweaking and exploring.

  • @titaniumtomato7247
    @titaniumtomato7247 9 місяців тому

    I did my best to replicate the program however the expression always gets evaluated to 0, more precisely it doesn't know what to do when it reaches the end of file token, how are you supposed to handle that?

    • @voxelrifts
      @voxelrifts  9 місяців тому

      There isn't anything special you need to do to handle eof (only make sure it is mapped to 0 precedence automatically or manually). When the parser reaches end of file, as long as there is no error in the expression, it will be looking for an operator there. So the parser can look into the precedences array/map with the eof tokentype which should return 0. Since this precedence is lower than every other precedence till that moment, the parser will just return out of all parse expression calls and return you an ast of the entire expression

  • @Dg47PRO
    @Dg47PRO 6 місяців тому

    how do you animate these videos with code?

  • @JohnDlugosz
    @JohnDlugosz 10 місяців тому +1

    2:29 Why didn't you just make TokenType the enumerated type?
    E.g.
    enum TokenType { TT_EOF, TT_Error, TT_Ident .... };
    You're defining the enumerated constants in an anonymous enumerated type, and then using them as numbers in a plain integer, losing the type checking. It's like you're missing the point of having an enumerated type, and you just ported code that had a bunch of #define's.

    • @voxelrifts
      @voxelrifts  10 місяців тому +2

      True, but that's really just the style I use for consistency. I do a lot of flags with enums so I use a typedef to typedef the name to an u32 then use an anonymous enum. It doesn't defeat the point of using the enum though, even if typechecking is lost there, the enum is still auto setting values with an increment. It also makes it possible to control the enum's base type. Also since this Token type enum is something that is being set on a token *once*, losing typechecking there is hardly a problem

  • @thanhlongtran9163
    @thanhlongtran9163 10 місяців тому

    The way your parser works looks a lot like the way Jon Blow said in this video: ua-cam.com/video/MnctEW1oL-E/v-deo.html
    Is that where you learned about the idea. Have you tried any other techniques, and what are your thoughts on those?

    • @voxelrifts
      @voxelrifts  10 місяців тому +3

      No this is not where I learned about recursive descent parsing. I actually learned about it through crafting interpreters. The problem with that code is that it's encoded In a dispatch table which makes it hard to intuit what goes on. So what I did was unrolled that into switches and tried to figure out some cool things from it, like how the callstack mimics the expression, etc

    • @thanhlongtran9163
      @thanhlongtran9163 10 місяців тому

      @@voxelrifts -I was talking about the way your expression parser generates the correct tree order (recursive descent parsing is irrelevant)- So turns out this parser is also called recursive descent parser. That is what I think makes expression parsing different from other types of parsing. There are other algorithms like "shunting yard" or fixing up your tree order before/after returning. Have you tried any other techniques?

    • @voxelrifts
      @voxelrifts  10 місяців тому +1

      ​@@thanhlongtran9163 I've heard of shunting yard but never implemented it myself yet. I did do the "fix up order after generating an improper tree" technique, and I much prefer this method honestly, but it's subjective. This method is nice because it's pretty much easily "embeddable" within a programming language parser.

    • @thanhlongtran9163
      @thanhlongtran9163 10 місяців тому

      @@voxelrifts What do you mean by "embeddable"? Is this the best method that you have tried so far? Also, unrelated, but I think this method is sometimes called Pratt parser, Precedence climbing, or Top-down parser.

    • @voxelrifts
      @voxelrifts  10 місяців тому +1

      @@thanhlongtran9163 by embeddable I mean it can easily made part of a full programming language parser. Also yes this is called a Pratt parser (I mentioned the name in the video).
      This is definitely the most intuitive technique I've found till now though.

  • @unLinuxeroMas
    @unLinuxeroMas 8 місяців тому

    can you make a tutorial? would be nice to have a more detailed explanation step by step I mean

    • @voxelrifts
      @voxelrifts  8 місяців тому

      Is this not a tutorial? Definitely not super handholdy, but I do think I covered everything

    • @unLinuxeroMas
      @unLinuxeroMas 8 місяців тому

      @@voxelrifts thanks for the answer dude, and first of all is a incredible video maybe I am too newbie in programming
      to understand it to the fullest.I was referring to a full tutorial showing coding in real time

    • @voxelrifts
      @voxelrifts  8 місяців тому

      @@unLinuxeroMas Oh I don't plan on doing that sort of thing sorry. I don't think that helps people understand topics at all so I actively try to avoid showing code being typed line-by-line.

    • @unLinuxeroMas
      @unLinuxeroMas 7 місяців тому

      @@voxelrifts that is actually true , thanks for the answer again, I was asking for that kind of tutorial becose I want to see how is the program structured in the files , and what is that code that you put in a part of the screen when you are explaining the code? is that an separated file ? another function that you coded?, kinda dummy question but engish is not my first language that is why I may loose a bit of information that is said in the video XD

    • @voxelrifts
      @voxelrifts  7 місяців тому +1

      @@unLinuxeroMas that's understandable. I've put a github link in the description if you want to see how the thing is structured

  • @HoSza1
    @HoSza1 10 місяців тому +2

    Professionally you'd rarely code up your parsing from scratch, as there are quite a few standard tools well suited for a wide range of types of grammars. Anyhow this is a nice intro video for anyone unfamiliar with the topic.

    • @NStripleseven
      @NStripleseven 10 місяців тому +2

      Yes, but I imagine the choice was made not to use one of those here because this is an educational video.

    • @HoSza1
      @HoSza1 9 місяців тому +1

      @@NStripleseven That's perfectly fine, and my addition here is to add a "where to go from here" section, which is customary in this genre.

  • @beanieteamie7435
    @beanieteamie7435 9 місяців тому +1

    If you feel the need to explain what your abbreviated variable "ident" means. It probably shouldn't be abbreviated in the first place.

    • @voxelrifts
      @voxelrifts  9 місяців тому +1

      You're not wrong, but ident is quite a standard abbreviation which many compilers use, which is why I explained what it meant

    • @beanieteamie7435
      @beanieteamie7435 9 місяців тому +1

      @@voxelrifts That's completely fair. Also, I'd like to say that I really enjoyed your video! I hope my previous comment didn't come off as hostile or anything 😅

  • @jermainex364
    @jermainex364 10 місяців тому

    Promo-SM

  • @schweinmachtbree1013
    @schweinmachtbree1013 8 місяців тому

    why did you submit this to SoME?? this isn't mathematics.

    • @voxelrifts
      @voxelrifts  8 місяців тому

      All subjects allied to Mathematics are allowed as declared on the website (Especially since the topic is math related). I also specifically asked on the discord to make sure it is valid :)

  • @tylerduncan5908
    @tylerduncan5908 10 місяців тому +1

    Really good video, however yoy may want to find a different way to phrase the thing at 5:19 to be more family friendly if you want the most people to see this.

    • @azmah1999
      @azmah1999 10 місяців тому +3

      There's only one F bomb so it's PG13. I don't think people younger than 13 would be interested in the video anyway 😂

  • @neuvx
    @neuvx 9 місяців тому

    harmful