Making a New Compiler

  • Published 20 Jan 2025

COMMENTS • 47

  • @Otakutaru
    @Otakutaru 1 month ago +9

    Aye, I like your approach for a compiler. You can get very, very far by using solid building blocks. There will be a point where you need to dispel the concept of these blocks and start mixing and reorganizing instructions, especially for optimization purposes. But that's way in the distance and to get there, the approach you have is perfect, both for development and for the videos.

    • @ModernRetroDev
      @ModernRetroDev 1 month ago +2

      Thanks! Glad to hear that I might be on the right track.

  • @zehalmeida
    @zehalmeida 1 month ago +1

    I find your project idea great. I myself took up the challenge to build something out of 6502 assembly and, in my case, I built a C# object-oriented emulator.
    It is just like you said, you get very excited about things working out even though you avoided it because you knew how much effort you would end up investing.
    And it's exactly that, an investment.
    From what I can see in this video, you are actually doing quite well. I had to pause the video and go through the description, and I am very sad I couldn't find a git link for any of this!
    I am super interested in how this will work out, it is something I always wanted to try and your video only fuels my wish more.
    Please continue, I really want to see more and how this will evolve in the future!
    Thank you!

    • @ModernRetroDev
      @ModernRetroDev 1 month ago +2

      I already promised one of the other viewers that I would open source the code at some point, probably with an MIT license. Right now though, it's in such rough shape it would be embarrassing to share. Also, seeing as I had to break this video up into effectively two parts, posting the code as it is now would amount to giving away a huge spoiler for the next part.
      That said, I'm really glad to hear that you are enjoying this project! Hearing that from other people certainly makes me feel like making these videos is well worth my time. Thanks!

    • @zehalmeida
      @zehalmeida 1 month ago

      @ModernRetroDev Hey, don't believe for a second you should be embarrassed by your code, dude. It is a work in progress, it is a foundation for what you are building and, probably most important, it is working.
      Yeah, it may be rough around the edges, maybe you have some "creative" work-arounds here and there, but hey, you got to solve the problem first and then refactor.
      You should be proud: you're building something people are interested in and are keen to contribute to, even if it's just a comment here and there.
      And yeah, I agree about the spoiler part, I think I would rather wait for the reveal, at this point!

    • @stevetodd7383
      @stevetodd7383 16 days ago

      C# is an odd one in that you can make .NET compile it for you on the fly using the Roslyn APIs. You can, for example, take a C# text file, convert it to an in-memory binary on the fly and execute it on the spot. No emulation required.

  • @wChris_
    @wChris_ 1 month ago +2

    Usually your intermediate level language isn't quite as high level as what you have shown. The IL is usually represented in a graph-like structure where each node is a basic block that contains a sequence of instructions that must be executed in order, without branches in the middle. You would have all the type information from the semantic analysis, which you didn't do and therefore have to resort to hacks like embedding the type information in the operations. The IL is great for performing lots of optimizations. After having performed optimizations you would start generating code, where each IL instruction would ideally map to one, sometimes more, assembly instructions, like what you did with the subroutines, but embedded in the compiler. Code generation is also responsible for allocating registers, which is very hard on 6502-like systems, unless you use the zero page for registers as well.
    Edit: If you want to learn more about making a compiler I can highly recommend "Programming Language Pragmatics, 4th Edition" by Michael Scott
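A minimal sketch of the basic-block idea described above, assuming a made-up IL stored as Python tuples. The op names `label`, `goto`, `ifnot`, `cmp_eq` and `call` are hypothetical and not taken from the video's compiler. A new block starts at every label and ends after every branch, so control can only enter at the top of a block and leave at the bottom.

```python
def split_basic_blocks(instrs):
    """Partition a linear instruction list into basic blocks.

    Assumes each instruction is a tuple whose first element is the op name.
    """
    blocks, current = [], []
    for ins in instrs:
        op = ins[0]
        # A label is a possible jump target, so it must start a new block.
        if op == "label" and current:
            blocks.append(current)
            current = []
        current.append(ins)
        # A branch ends the block it appears in.
        if op in ("goto", "ifnot"):
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


# Hypothetical program: if a == b then println(x) else println(y)
program = [
    ("cmp_eq", "v1", "a", "b"),
    ("ifnot", "v1", "else_label"),
    ("call", "println", "x"),
    ("goto", "endif_label"),
    ("label", "else_label"),
    ("call", "println", "y"),
    ("label", "endif_label"),
]
blocks = split_basic_blocks(program)
for number, block in enumerate(blocks):
    print(number, block)
```

The edges of the graph would then come from the branch at the end of each block plus fall-through to the next block; that step is left out here.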

    • @danilolr
      @danilolr 7 days ago

      I agree with @wChris_.
      What I have read so far says the operations should be simple.
      So I think you are on the right path with your add, sub, mult and div operations.
      The += should not be in the low level, but of course it is fine in the high level.
      All math operations must be in the form dest = var_or_constant_1 operation var_or_constant_2 (en.wikipedia.org/wiki/Three-address_code).
      Maybe println should not be a primitive in the IR. You can have a generic 'call' primitive to call functions in your standard library. In my project I use a concept like the asmsub in Prog8; you can take a look.
      And maybe you should not have 'if' and 'loop' control flow in it, but labels and conditional 'goto' primitives instead.
      So not this:
        if a = b
          # do this
        else
          # do this
        end if
        ...
      You will have:
        v1 := a = b
        ifnot v1 goto else_label
        # do this (if block)
        goto endif_label
        :else_label
        # do this (else block)
        :endif_label
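The lowering above can be sketched as a small function. This is only an illustration under assumptions: the tuple encoding, the op names `assign`, `ifnot`, `goto`, `label`, and the `uid` parameter used to keep label names unique are all invented for the example.

```python
def lower_if(cond, then_body, else_body, uid):
    """Lower a structured if/else into labels and conditional gotos.

    cond is (lhs, op, rhs); then_body and else_body are lists of
    already-lowered instructions. uid makes the temp and labels unique.
    """
    lhs, op, rhs = cond
    tmp = f"v{uid}"
    else_label = f"else_{uid}"
    end_label = f"endif_{uid}"
    return [
        ("assign", tmp, lhs, op, rhs),   # v1 := a = b
        ("ifnot", tmp, else_label),      # ifnot v1 goto else_1
        *then_body,                      # if block
        ("goto", end_label),             # skip over the else block
        ("label", else_label),
        *else_body,                      # else block
        ("label", end_label),
    ]


code = lower_if(("a", "=", "b"), [("call", "do_this")], [("call", "do_that")], uid=1)
for ins in code:
    print(ins)
```

Nested ifs would call `lower_if` on the inner statement first and pass the result in as part of `then_body` or `else_body`, each with its own `uid`.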

  • @adixperience
    @adixperience 1 month ago +1

    I love your content

  • @wWvwvV
    @wWvwvV 1 month ago +1

    I don't understand. Why do you apologize for implementing a += operator function? Is it about not inlining it?

  • @AndrewErwin73
    @AndrewErwin73 1 month ago +2

    very nice project... looking forward to seeing where you go! are you going to open source the code?

    • @ModernRetroDev
      @ModernRetroDev 1 month ago +1

      Thank you! Yeah, I wouldn't mind eventually releasing the compiler as open source code... probably under an MIT license. However, I'd like to get a bit further along before I do that, as the code is pretty messy at this point as a result of the project itself being a learning exercise.

  • @hughdavenport1
    @hughdavenport1 1 month ago +2

    I hadn't thought of doing 10* using bitshifts. I knew 1 bitshift is 2*, so it looks like this is 8*A + 2*A = 10*A. Voila!!
    Excited to see your next video :-)
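The arithmetic above, 8*A + 2*A = 10*A, can be checked with a short sketch. The function name `mul10_u8` and the 8-bit masking are assumptions for illustration; the mask mimics how a single 6502 accumulator byte would wrap, and a real routine would need to handle the carry for larger results.

```python
def mul10_u8(a):
    """Multiply an 8-bit value by 10 using shifts and one add.

    One left shift gives 2*A, three left shifts give 8*A, and their
    sum is 10*A. The result is masked to 8 bits, so it wraps past 255.
    """
    times2 = (a << 1) & 0xFF
    times8 = (a << 3) & 0xFF
    return (times8 + times2) & 0xFF


print(mul10_u8(7))   # 70
print(mul10_u8(25))  # 250
print(mul10_u8(26))  # 4, because 260 wraps around in 8 bits
```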

    • @ModernRetroDev
      @ModernRetroDev 1 month ago +3

      Like I was saying, there are some really fun and rewarding things about learning to write code in assembly. For me this was one of those cases.
      Also, I'm really glad to hear that other people found this to be interesting as well. This makes me think that perhaps my plan to cover projects a bit in breadth and a bit in depth is a good one.
      Thanks for watching!

    • @hughdavenport1
      @hughdavenport1 1 month ago

      @ModernRetroDev As someone who does similar projects (just more long form videos), I just do what I want. Usually people stick around. Certain projects do get more views (compilers especially haha). It's just hard sometimes to get engagement like comments!

    • @CallousCoder
      @CallousCoder 1 month ago

      This is how we always approached multiplication within CPUs that had no multiply instruction.
      Even on 8086 I would use it because often it was faster than the MUL instruction.

    • @hughdavenport1
      @hughdavenport1 1 month ago

      @CallousCoder I recall at uni memorising a fancy way to divide things, as div is slow. Tbh I assume you could do the same trick here to div/8+div/2 if div ever comes up

    • @CallousCoder
      @CallousCoder 1 month ago

      @hughdavenport1 Yes, div (on integers) is exactly the same: shift right and subtract.
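One common form of shift-and-subtract integer division is the bit-by-bit "restoring" method, sketched below. This is a generic textbook variant, not necessarily the exact routine the commenters had in mind; the function name and the `bits` parameter are assumptions for the example.

```python
def divmod_shift_subtract(dividend, divisor, bits=8):
    """Unsigned division using only shifts, compares and subtracts.

    Walks the dividend from its top bit down, shifting each bit into a
    running remainder and subtracting the divisor whenever it fits.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient, remainder = 0, 0
    for i in range(bits - 1, -1, -1):
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


print(divmod_shift_subtract(200, 10))  # (20, 0)
print(divmod_shift_subtract(255, 7))   # (36, 3)
```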

  • @AssassinC01
    @AssassinC01 1 month ago +1

    Would it be possible for you to share some source code or links to resources you found useful? I'd love to dive deeper if possible. I have been loving the video so far, so keep up the fantastic work.

    • @ModernRetroDev
      @ModernRetroDev 1 month ago

      So I am planning to post my source code on GitHub at some point (once it is in a less horrible state), but at the moment it's in a pretty rough state.
      Regarding resources that I find useful... it's mostly stuff on the 65XX assembly front, as I am very comfortable with programming in general, just new to assembly. That said, let me share a few of those. First off, I printed up a page with this on it: mimuma.pl/opcodes/opcodesA4_final.png (this is great for reference). Next, I've been reading through the book "Programming the 65816" by David Eyes. This is a good book which goes into incredible detail about these processors, but is not a particularly fun or easy read. Most recently I got a copy of the brand new book "6502 for beginners" by John Dale. While I haven't yet finished reading this book (I'm only a couple of chapters in), it's a very easy read and does a great job of explaining/showing the various memory loading/indexing operations. So far, I would give this book a very strong recommendation.
      Finally, the first book I started reading on the topic of retro computing was "Retro Game Dev (C64 Edition)" by Derek Morris. While I don't think I'd recommend that book for learning assembly, I suppose I can credit the book with getting me more interested in developing for retro systems.

  • @OneWingedShark
    @OneWingedShark 1 month ago +1

    For the low-level, might I suggest you take a look at Forth? It's really quite good in that arena, being a concatenative stack-based language which has as its core the definition of Word being: _"A sequence of words to execute, or else a chunk of machine-code to execute."_
    For the high-level, allow me to suggest looking at Ada, in particular the facilities for generics.

    • @CallousCoder
      @CallousCoder 1 month ago +1

      @OneWingedShark I just made a video on writing a Forth interpreter in Zig. It doesn’t compile words to assembly, as that’s a hassle in a memory-managed environment. But it’s a fun language.

    • @OneWingedShark
      @OneWingedShark 1 month ago +1

      @CallousCoder -- I "cheated" with my implementation in Ada: a variant-record that contains either a list of words or else an access-to-procedure [with the 'VM' as an IN OUT parameter]. I then wrote all the core words in that form, compiled, and let the Ada compiler handle generating the machine-code for me.

    • @CallousCoder
      @CallousCoder 1 month ago

      @ Aah, I had a similar thought to inline words and compile a new binary (didn’t do it) but that’s a similar approach I guess.
      I miss the old days where you could just do self-modifying code without a lot of OS memory management hassle. Memory protection is so overrated 😉🤣😜
      I actually hope to have some more time and do this on the PiPico or a bare metal RISC-V that will actually generate binary code to jump to and execute and not just reinterpret.

    • @OneWingedShark
      @OneWingedShark 1 month ago +1

      @CallousCoder Maybe they're similar approaches. The word "reinterpret" makes me a bit unsure. The technique I was describing is essentially:
        type Operation is not null access procedure (Object : in out VM); -- VM: the Forth stacks, collectively, plus the dictionary [populated w/ core on init].
        procedure ADD (Object : in out VM) is
        begin
           Push (Object, Pop (Object) + Pop (Object));
        end ADD;
      Since it's native Ada, the compiler generates the binary for the procedure and the VM's initialization associates that with the dictionary entry for "ADD" - because the Forth is meant to run on the same machine I'm compiling for, the access to the procedure _is_ the "chunk of machine-code" to execute - and so I can "hijack" the compiler's codegen so I never _have_ to touch assembly by-hand.
      But, you're right that sometimes the OS steps on your toes inconveniently.
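The "word is either a list of words or a chunk of native code" idea from this thread can be sketched in Python, with a plain callable standing in for the Ada access-to-procedure. The class `VM`, its `words` dictionary and the word names `DUP` and `DOUBLE` are hypothetical and only serve the illustration.

```python
class VM:
    """Tiny Forth-like machine: a data stack plus a dictionary of words."""

    def __init__(self):
        self.stack = []
        # name -> callable (native word) or list of names (defined word)
        self.words = {}

    def execute(self, name):
        word = self.words[name]
        if callable(word):
            word(self)              # native word: run host-language code
        else:
            for inner in word:      # defined word: run each word in turn
                self.execute(inner)


def add(vm):
    vm.stack.append(vm.stack.pop() + vm.stack.pop())


def dup(vm):
    vm.stack.append(vm.stack[-1])


vm = VM()
vm.words["ADD"] = add
vm.words["DUP"] = dup
vm.words["DOUBLE"] = ["DUP", "ADD"]   # defined purely in terms of other words

vm.stack.append(21)
vm.execute("DOUBLE")
print(vm.stack)  # [42]
```

As in the Ada version described above, the host language supplies the native code for the core words, so no hand-written assembly is involved.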

  • @rudelude
    @rudelude 1 month ago

    Hey, I came across this video randomly. When the Commander X16 was first announced, this was something I was hoping to see eventually: a modern language designed for retro computers.
    Have you looked into LLVM? Depending on how complex you want this to be, you would be able to reuse most of their tools and use their framework to define your high level language. And you could basically port their low level layer to 65816 (if someone hasn’t already). It may be a good option for you if this gets overly complex but you still like the language you made

    • @ModernRetroDev
      @ModernRetroDev 1 month ago

      These are some great comments and questions, and I think I might have some great answers as well.
      First off, I am far from the first person to attempt a modern language for retro computers. There's another guy who is active in the X16 community who is way farther along on one and it is awesome by the way (take a look at this: prog8.readthedocs.io/en/stable/ ). If you just want to get started dipping your toe into X16 development, Prog8 is a great place to start.
      Regarding LLVM... that is actually one of the paradigms I was thinking of as I contemplated how best to approach this task. I think fundamentally it has a great design which enables supporting multiple higher level languages. I suspect it is less prone to regressions as a result of this design as well.
      Regarding the potential of using the LLVM framework within my project: while this could be possible, I suspect the framework would be so big that this would prevent me from ever being able to make a compiler which could run natively on a 6502 system. Also, there is another cool project which directly leverages LLVM to compile C code for 6502 targets (see: llvm-mos.org/wiki/Welcome ). So that's another similar and fun project.

    • @knaar13
      @knaar13 19 days ago

      I would not recommend LLVM. It's extremely heavy-weight. If this is your full time job, then it's a no brainer, spending several months learning LLVM and getting productive is fine, and you'll want to stand on the shoulders of giants. The company you're working for likely would settle for nothing less. For hobby stuff, ain't nobody got time for that. Keep it simple.

  • @TheGabrielMoon
    @TheGabrielMoon 1 month ago

    what books do you recommend about compilers?

  • @anon_y_mousse
    @anon_y_mousse 1 month ago

    I applaud the effort, especially given how many systems use the 6502 which vastly complicates code generation for function calls and variable use.
    However, if you're not going to embrace the C style of using braces to encapsulate blocks of code, how about the sort of Bash style of using inverted keywords to match block ends? (Sort of because it's really just for two constructs.) For instance, you could start a function definition with `fun` and end with `nuf`. It even kind of sounds like you're saying "enough" if you pronounce it. Then you could have `if` and `fi`, a shortened "else if" as `elif` and instead of while you could use `do` and `od` and have either a `while` or `until` after the initial `do` or after the final `od` for cases like in C with do/while.
    Also, if you want to maintain a simplified syntax, Python's operator overloading is a great example. It's far easier to implement than the C++ style of operator overloading.
    Also also, if you're just getting off the ground, try targeting C as a compilation target rather than either assembly or machine code. It's far easier to do and a great way to verify that your compiler works because you can feed its output to an existing compiler that works like cc65.
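The inverted-keyword block endings suggested above (`fun`/`nuf`, `if`/`fi`, `do`/`od`) are easy to check with a stack. The sketch below assumes the source has already been split into whitespace-separated tokens; the function name `check_blocks` is made up for the example.

```python
# Each closing keyword maps to the opener it must match.
CLOSERS = {"nuf": "fun", "fi": "if", "od": "do"}
OPENERS = set(CLOSERS.values())


def check_blocks(tokens):
    """Return True if every opener is closed by its reversed keyword, in order."""
    stack = []
    for tok in tokens:
        if tok in OPENERS:
            stack.append(tok)
        elif tok in CLOSERS:
            # A closer must match the most recently opened block.
            if not stack or stack.pop() != CLOSERS[tok]:
                return False
    return not stack  # anything left on the stack was never closed


print(check_blocks("fun main if x do y od fi nuf".split()))  # True
print(check_blocks("fun main if x nuf fi".split()))          # False
```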

    • @ModernRetroDev
      @ModernRetroDev 1 month ago +1

      Regarding code style: thus far I have been focusing on the low-level language interface. Earlier versions of my compiler (prior to my making of this video) did infer the actual types on either side of mathematical operations, however as I described in the video, this was proving to be error prone and was removed from the low-level implementation in favor of being fully explicit. Both this and the other stylistic comment are likely more relevant for the higher-level interface. To be honest, I haven't thought too much yet about what that higher level interface should look like, but once the lower-level parts are more complete I will start contemplating on the matter.
      Regarding C as a compilation target. I think that's a great idea, and was already something that I was considering, but not so that I can compile for 6502 systems. Rather so that I can easily compile code from this novel language onto fully modern Linux/Windows/OSX systems. Also, I am already using cc65 in this project as I am feeding the assembly I generate into its assembler.

    • @anon_y_mousse
      @anon_y_mousse 1 month ago

      @ModernRetroDev As far as syntax decisions go and using type inference, I took inspiration in my own language from QBasic. You can use a statement to guide the compiler towards a "natural" default. Maybe it sounds silly to take inspiration from such a language, but it was an idea ahead of its time as far as clarifying type inference for the users. If you're using a lot of floating point types it can be annoying to constantly type whatever typename you use for it, so in my language I can use a directive and just #default type=float32, and for all the code after it and until a new directive is encountered every variable that uses type inference will just start with the assumption that it's floating point. Obviously incompatible types will either cause errors or the user will have to explicitly type the variable. That way you could just assign a value to a variable and if it can be converted by the compiler or interpreted as floating point the compiler does it.
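A rough sketch of how a `#default type=...` directive like the one described above might steer inference. Everything beyond the directive text itself is an assumption: the one-assignment-per-line input format, the starting default of `int`, and the `CONVERTERS` table are invented for illustration and say nothing about how the commenter's language works internally.

```python
# Hypothetical table of type names and a check that a literal fits the type.
CONVERTERS = {"int": int, "float32": float}


def infer_types(lines, default="int"):
    """Give each `name = literal` line the default type currently in effect."""
    types = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#default"):
            # The directive changes the default for all following lines.
            default = line.split("type=", 1)[1].strip()
            continue
        name, _, literal = line.partition("=")
        try:
            CONVERTERS[default](literal.strip())
        except ValueError:
            raise TypeError(
                f"{name.strip()}: {literal.strip()!r} is not a valid {default}; "
                "give it an explicit type"
            )
        types[name.strip()] = default
    return types


source = ["a = 3", "#default type=float32", "b = 3", "c = 1.5"]
print(infer_types(source))
```

Here `b = 3` becomes a float32 because the literal can be converted, while a line such as `x = 1.5` under the `int` default would be rejected and need an explicit type.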

  • @stevetodd7383
    @stevetodd7383 16 days ago

    I do not think that byte code means what you think it means. Byte code refers to a low level but NOT pure binary code that requires further work before the CPU can handle it (.NET or Java use this idea, they have a virtual CPU design that would execute this code directly. They then need either an interpreter or compiler to convert this code to native machine instructions).
    State of the art compilation these days compiles to an intermediate virtual machine with infinite registers and then, during code generation for the target CPU, simplifies this code and looks for optimisations. Using JSR to common code is one practice that is only used if highly memory constrained, in-lining is more efficient in terms of machine cycles. Loop unrolling is another more memory intensive trick that would likely gain you performance. With these 8 bit (or 8/16 in the case of the 65816) machines you need to carefully balance one against the other (maybe set a target memory constraint, try compiling with all optimisations turned on and successively turn them off if the results don’t fit inside the memory space desired).
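The "turn everything on, then switch optimisations off until it fits" strategy described above can be sketched as a simple loop. The optimisation names and byte counts below are invented, and the sketch assumes each optimisation adds a fixed, independent number of bytes, which a real compiler would have to measure by actually recompiling.

```python
def fit_to_budget(base_size, optimisations, budget):
    """Pick which size-increasing optimisations to keep within a memory budget.

    optimisations is a list of (name, extra_bytes) ordered by priority:
    the ones at the end of the list are switched off first.
    """
    enabled = list(optimisations)
    while enabled and base_size + sum(cost for _, cost in enabled) > budget:
        enabled.pop()  # drop the lowest-priority optimisation and retry
    return [name for name, _ in enabled]


# Hypothetical numbers, for illustration only.
candidates = [("inline_small_subs", 300), ("unroll_loops", 900)]
print(fit_to_budget(1000, candidates, budget=1500))  # ['inline_small_subs']
print(fit_to_budget(1000, candidates, budget=4000))  # both kept
```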

  • @SaffatUllah
    @SaffatUllah 1 month ago

    Boy have I stumbled into a gold mine🫨❤️ Good content bro

  • @kittyboiyt
    @kittyboiyt 1 month ago

    Did you accidentally get the audio messed up in this video or is it just like that?

    • @ModernRetroDev
      @ModernRetroDev 1 month ago +1

      I've been constantly working to improve my production quality, but I am still very new at producing videos and as such, I still have a lot of room for improvement. In fact, for this video I switched over to a different camera than I was using before. Due to my lack of familiarity with the new camera, I ended up routing the microphone input directly through my computer instead of through the camera. So it might have been that, it might have been improper gain levels, or perhaps an overly aggressive noise suppression algorithm. Beyond that I do audio compression and normalization steps as well.
      Long story short, making the audio within a video sound decent is a bit of work, not particularly straightforward, and I am inexperienced in this area. I'll try and do better with the audio next time, but as with everything, I'm teaching myself this stuff and learning as I go. Audio quality aside, I do hope you enjoyed the video.

  • @markuskluever4059
    @markuskluever4059 1 month ago

    nice!

  • @mrinalyadav4261
    @mrinalyadav4261 1 month ago

    you better keep going

  • @wjffhfgj7045
    @wjffhfgj7045 1 month ago

    please make an orchestrator for docker containers from scratch 🎉🎉

    • @ModernRetroDev
      @ModernRetroDev 1 month ago +2

      I'm not going to lie. I totally had to search the internet for the term "orchestrator for docker containers". Unfortunately, that is way outside of my expertise and also, not really within the umbrella of topics that would fit within this channel. That said, I do hope that you'll enjoy some of the things I talk about here. Thanks for watching!

    • @toby9999
      @toby9999 26 days ago

      Docker?... noooooooo....

  • @weinihao3632
    @weinihao3632 1 month ago

    Compiler design is a classical topic of theoretical computer science. The code you struggled with is usually not written by a human, but by a compiler compiler. The human specifies the grammar of the new programming language in abstract form. You might want to consult the first edition of the Dragon book (also featured in the very accurate movie "Hackers") for further details.

    • @ModernRetroDev
      @ModernRetroDev 1 month ago

      Very interesting. Thanks for sharing that book recommendation, but good golly is it expensive. That reminds me of being back in college and buying from the campus bookstore.

    • @weinihao3632
      @weinihao3632 1 month ago

      @ModernRetroDev Wow, you are right. The prices for the new book are ridiculous! But luckily on eBay the edition from 1986 is offered for ~$20.

    • @knaar13
      @knaar13 19 days ago

      I don't agree. Write a simple recursive descent parser by hand. It won't take long and will have much better error handling than a compiler compiler's generated garbage. Besides, the hard part about a compiler is not the parser. It is type checking. And the dragon book basically has just a handful of high level paragraphs on type checking and spends almost the entire rest of the book on parsing, which is so easy you can write software to do it for you. Waste of money unless you're really interested in parsing.
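To give a sense of scale for a hand-written recursive descent parser, here is a minimal sketch for integer arithmetic with `+ - * /` and parentheses. The grammar, the tuple-based tree and all names are assumptions chosen for the example; it is not the parser from the video. Each grammar rule becomes one method, and error messages can be written by hand at exactly the point where something unexpected shows up.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input near column {pos}: {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {tok!r}")
        self.i += 1
        return tok

    def expr(self):    # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            node = (self.take(), node, self.term())
        return node

    def term(self):    # term := factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            node = (self.take(), node, self.factor())
        return node

    def factor(self):  # factor := NUMBER | '(' expr ')'
        tok = self.take()
        if tok == "(":
            node = self.expr()
            self.take(")")
            return node
        if tok.isdigit():
            return int(tok)
        raise SyntaxError(f"expected a number or '(', got {tok!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree


print(parse("1 + 2 * (3 - 4)"))  # ('+', 1, ('*', 2, ('-', 3, 4)))
```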

  • @EclipseCat
    @EclipseCat 24 days ago

    :)