Poor Man's Conditional Jump - Superscalar 8-Bit CPU #36

  • Published 27 Jun 2024
  • My homebrew CPU has recently gained four glorious arithmetic flags in its ALU. In this video, we are going to use these flags and the improved control over when registers are updated to define compare, test, and flag swap instructions. An updated assembler with support for these instructions will then allow us to write a fresh new program that puts the flags through their paces and even implements a poor man's conditional jump to compute a multiplication.
    This video series explores the concepts and techniques that make modern computer processors so incredibly fast and powerful. I build my very own 8-bit processor from individual logic gates and gradually evolve it to become a superscalar out-of-order machine. Along the way, we take a deep dive into contemporary computer architecture in a hands-on fashion and rediscover some of the foundations of modern computing.
    Previous Video: • Adding Flags to My CPU...
    Series Playlist: • Build a Superscalar CPU
    ALU Playlist: • Homebrew Arithmetic Lo...
    GitHub Repository: github.com/fabianschuiki/supe...
    00:00 - Intro
    00:44 - FSWAP/CMP/TEST Instructions
    04:21 - Assembler Updates
    06:06 - Using FSWAP and the Flags
    10:47 - Using CMP
    14:23 - Using TEST
    17:15 - A First Conditional Jump
    24:13 - Testing the Zero Flag
    24:51 - Testing FSWAP
    26:08 - Testing the Sign Flag
    26:35 - Testing the Overflow Flag
    29:39 - Testing CMP
    32:52 - Testing TEST
    35:39 - Testing the Multiplication
    42:40 - Outro
    #alu #homebrew #8bit #breadboard #superscalar #computer
  • Science & Technology

COMMENTS • 24

  • @KeesJanLogemann 1 month ago +1

    I was ready to see you implement the support for labels in the assembler.py assembler as you used "loop:" in the source file.
    Spoiler? Coming in the next video?

    • @fabianschuiki 1 month ago

      Definitely coming soon 🙂 Computing those labels manually is getting old pretty fast 😅
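
For readers wondering what label support might involve, a common approach is a two-pass assembler: the first pass records the address of each label, the second replaces label operands with offsets. The sketch below is not the author's `assembler.py`; the fixed 2-byte instruction size, the `#` comment character, the mnemonics, and the choice to compute offsets relative to the branch's own address are all assumptions made for illustration.

```python
def resolve_labels(lines: list[str], instr_size: int = 2) -> list[str]:
    """Replace label operands with PC-relative offsets (two-pass sketch)."""
    labels: dict[str, int] = {}
    instrs: list[tuple[int, str]] = []
    address = 0

    # Pass 1: strip comments, record label addresses, collect instructions.
    for raw in lines:
        line = raw.split("#")[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = address
            continue
        instrs.append((address, line))
        address += instr_size  # assumption: fixed-size instructions

    # Pass 2: substitute each label operand with (label address - own address).
    resolved = []
    for addr, line in instrs:
        mnemonic, *rest = line.split(None, 1)
        operands = [op.strip() for op in rest[0].split(",")] if rest else []
        operands = [str(labels[op] - addr) if op in labels else op
                    for op in operands]
        resolved.append(f"{mnemonic} {', '.join(operands)}".strip())
    return resolved
```

With a hypothetical three-instruction loop, `resolve_labels(["loop:", "addi r1, r1, 1", "subi r2, r2, 1", "jrel loop"])` turns the last line into `jrel -4`.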

  • @janhofmann3499 2 months ago +1

    Great as always and the multiplication was the sugar on top. The HUD overlay is fantastic but looks like a lot of work. Can you at least somehow automate/script/.. this process or is it a click orgy in your editing software?

    • @fabianschuiki 2 months ago

      Thanks 🙂! The animations are pretty straightforward with Manim. It's just a Python script that assembles the animations. So for these overlays, I can put the list of instructions into an array and then let the script more or less simulate the CPU and update the overlay. Not very clean, but gets the job done. Manim has been fantastic. It's annoying for schematics, but for anything that is regular and repetitive, like updating a state overlay, it's brilliant.
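
The "simulate the CPU and update the overlay" idea can be sketched without any Manim code: step through a list of instructions and record a register snapshot after each one, so that an animation script could then draw one overlay frame per snapshot. The tuple instruction format and the `snapshots` helper below are hypothetical, and only two operations are modeled.

```python
def snapshots(program, regs):
    """Return the register state before the program and after each instruction."""
    state = dict(regs)
    frames = [dict(state)]  # frame 0: initial state
    for op, rd, rs1, rs2 in program:
        if op == "add":
            state[rd] = (state[rs1] + state[rs2]) & 0xFF
        elif op == "sub":
            state[rd] = (state[rs1] - state[rs2]) & 0xFF
        else:
            raise ValueError(f"unsupported op: {op}")
        frames.append(dict(state))  # copy, so earlier frames stay unchanged
    return frames
```

Each entry in the returned list is an independent copy, which keeps earlier frames stable while later instructions keep mutating the working state.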

  • @sarge2742 2 months ago +1

    This is progressing really nicely, I'm personally quite interested to see what you have in mind for 'proper instruction decoding' that you've mentioned a few times.

    • @fabianschuiki 2 months ago +1

      🙂 I've been kicking that can down the road for quite some time now. I'm probably going to do something similar to James Sharman and Ben Eater, but with less of a centralized control vibe. If you look at modern CPUs, the instruction decoding stage is mainly responsible for figuring out which registers get read or written, and at what general functional unit in the processor you can throw the instruction. But the decoder often doesn't care about the details too much. I'm inclined to do something similar: have the decoder mainly figure out which registers (and flags) are read and written. This will be necessary for the hazard detection and prevention mechanism down the line, once instructions can complete out of order. The decoder will have to know if all register operands are already available, or if it has to stall and wait for their computation to finish. Then later on, reservation stations can allow the decoder to dispatch the instructions with incomplete data, and have them sit and wait at the corresponding functional unit.
      So long story short: the decoder will likely focus on general register interaction, and leave detailed decoding up to the corresponding functional units, like the decoder in the ALU.
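
The stall-or-issue decision described in this reply can be sketched as a small scoreboard: the decoder only knows which registers and flags an instruction reads and writes, and holds it back while any of them is still being produced. This is a generic illustration of the technique, not the planned hardware; `Decoded` and `Scoreboard` are hypothetical names, and reservation stations are not modeled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Decoded:
    reads: frozenset[str]   # registers/flags the instruction consumes
    writes: frozenset[str]  # registers/flags the instruction produces


@dataclass
class Scoreboard:
    pending: set[str] = field(default_factory=set)  # writes still in flight

    def can_issue(self, instr: Decoded) -> bool:
        # Stall on read-after-write and write-after-write hazards.
        return not (self.pending & (instr.reads | instr.writes))

    def issue(self, instr: Decoded) -> None:
        self.pending |= instr.writes

    def complete(self, instr: Decoded) -> None:
        self.pending -= instr.writes
```

For example, an add that writes `r3` and `flags` blocks a following compare that reads `r3` until `complete` is called for the add.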

  • @lawrencemanning 11 days ago +1

    I've never seen someone use the flags register state to directly calculate a branch offset. Well done. :)
    Question: computed branches seem a bit unusual. Computed jumps, sure, many uses. But branching through an offset held in a register? How many ISAs have that?

    • @fabianschuiki 11 days ago +1

      None that I am aware of 😃. Usually you would just load a base address into a register and add the offset onto that yourself. But since this is an 8-bit machine I thought it might come in handy. And it was basically free hardware-wise. Immediates and the rs2 operand are on the same wires 🙂
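
To make the "branch offset computed from the flags" idea concrete, here is a toy Python model of a multiplication by repeated addition in which the loop exit is chosen purely by arithmetic on the zero flag. It is not the program from the video: the four-step layout, the position of the zero flag in bit 0, and the offsets of 1 and 3 are assumptions picked to keep the sketch short.

```python
ZERO_BIT = 0x01  # assumption: the zero flag sits in bit 0 of the flags byte


def multiply(a: int, b: int) -> int:
    """Multiply two 8-bit values by repeated addition, truncated to 8 bits."""
    a &= 0xFF
    b &= 0xFF
    acc = 0
    pc = 0  # index into the four-step toy program below
    while True:
        if pc == 0:
            # TEST-like step: derive the zero flag from the loop counter b.
            flags = ZERO_BIT if b == 0 else 0
            z = flags & ZERO_BIT
            # Computed branch: step by 1 (enter the body) when z == 0,
            # step by 3 (skip to the exit) when z == 1. The branch itself
            # evaluates no condition; the offset is plain arithmetic.
            pc += 1 + (z << 1)
        elif pc == 1:
            acc = (acc + a) & 0xFF
            pc += 1
        elif pc == 2:
            b = (b - 1) & 0xFF
            pc -= 2  # unconditional relative jump back to the test
        else:
            return acc
```

The `if`/`elif` chain only dispatches the simulated instructions; within the simulated program, the sole "decision" is the offset `1 + (z << 1)` fed to a register-relative branch.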

  • @OscarSommerbo 2 months ago +1

    This series, along with James Sharman's, inspired me to design my own architecture. I am nowhere near skilled enough to create the physical CPU/computer (it is a big project), but Fabian gives me ideas on how to do some interesting things. Once I get more ideas nailed down I might do some videos about it.

    • @fabianschuiki 2 months ago

      That sounds fantastic! I'd love to see your CPU design documented in video form. Glad to hear that 🙂

    • @OscarSommerbo 2 months ago +1

      @fabianschuiki As a teaser, a split bus. One data bus and one instruction bus. That is the starting point, everything else kinda flows from that.

    • @fabianschuiki 2 months ago

      @OscarSommerbo That's a great idea! Harvard architectures are very popular for microcontrollers and safety-critical systems. Are you planning on strictly separating storage for data and instructions, or just splitting the buses early on and having some caching for data and instructions before they access a joint memory?

    • @OscarSommerbo 2 months ago +1

      @fabianschuiki I didn't know of the Harvard architecture before I started mapping out the system, but I found out from ChatGPT. I use the chatbot to organize my ideas and to get inspiration.
      My current plan is for strict separation, but I am starting to see the flaws in that. So I will massage that concept some more.
      The big bonus with having a clean instruction bus is that adding in co-processors and/or ASIC accelerators is fairly trivial and comes with minimal overhead. With a blitter-type chip, the main CPU can offload RAM-to-RAM transfers, and a crypto ASIC could decrypt and encrypt RAM and keep strict ACLs for processes, all on its own.
      Most of the instant upsides have been security related, mainly because security has been a major failing of CPUs recently. But I am thinking about how a GPU could be hooked in: the blitter copies out RAM regions to a double/triple frame buffer, and the GPU is more of a specialized math coprocessor.
      There I go, spilling the beans.

    • @fabianschuiki 2 months ago

      This sounds very exciting. Do you plan to build an ASIC?

  • @JaenEngineering 2 months ago +1

    This is really starting to come along. If I remember correctly, can't we already alter the step size in the program counter? If so, couldn't we use some logic to either step to the next instruction, which would be a relative jump back to the start of the loop, or double step past the relative jump to exit the loop, depending on the flag status? Seems like another good use for our pal the PAL!😅

    • @fabianschuiki 2 months ago +2

      Haha great point 🙂 the PAL pal does have a lot of uses. Using the step size for skipping instructions is a nice idea! I was thinking about using the flags to derive condition codes for jumps, and then feeding that into the select signals of the PC: if the condition holds, do a relative jump, and if it doesn't hold, do a regular step. That would allow you to write things like `breli.z -16` to branch backwards by 16 bytes if the zero flag is set.
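
The two ideas in this thread both come down to choosing the next program counter value from the flags. A minimal Python sketch of each follows; the 2-byte instruction size, the 16-bit program counter, the zero flag mask, and the convention that the offset is relative to the branch's own address are all assumptions, and the function names are hypothetical.

```python
ZERO = 0x01       # assumption: bit mask of the zero flag
STEP = 2          # assumption: every instruction is 2 bytes wide
PC_MASK = 0xFFFF  # assumption: 16-bit program counter


def next_pc_branch(pc: int, flags: int, cond_mask: int, offset: int) -> int:
    """Conditional relative branch: jump by offset if the flag is set, else step."""
    taken = bool(flags & cond_mask)
    return (pc + (offset if taken else STEP)) & PC_MASK


def next_pc_skip(pc: int, flags: int, cond_mask: int) -> int:
    """Skip variant: double step past the next instruction if the flag is set."""
    skip = bool(flags & cond_mask)
    return (pc + (2 * STEP if skip else STEP)) & PC_MASK
```

Under these assumptions, something like `breli.z -16` at address 0x40 would correspond to `next_pc_branch(0x40, flags, ZERO, -16)`, which yields 0x30 when the zero flag is set and 0x42 otherwise.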

  • @andrewwatts1997 2 months ago +1

    Amazing progress! One step closer to a great CPU.
    Do you have any specific tasks you want it to perform when it's done?

    • @fabianschuiki 2 months ago +2

      Thanks! 🙂 I think I want to be able to write a simple operating system to run on it. Something with a little bit of virtual memory, some form of user space vs. kernel space separation, and a trivial form of multithreading. I don't think the CPU needs a lot of features for that. But it would be cool to have something like an 80s era homebrew CPU with a modern twist 😃

    • @andrewwatts1997 2 months ago +1

      @fabianschuiki That sounds like a small version of Linux ;)

    • @fabianschuiki 2 months ago +1

      @andrewwatts1997 That doesn't sound like a terrible thing 😁 Well, it would be a very tiny version of it. But in the spirit of exploring the fundamentals of how modern CPUs do their things, it doesn't sound too bad to toy around with the basics of an OS 😏