Fabian Schuiki
Adding Labels to My Assembler – Superscalar 8-Bit CPU #39
The assembler I wrote from scratch is a pretty convenient tool to build programs for my homebrew CPU. Manually computing address offsets for jump instructions isn't fun though. In this video, I extend my assembler to support the usual syntax for declaring jump labels and using them in actual jump instructions.
This video series explores the concepts and techniques that make modern computer processors so incredibly fast and powerful. I build my very own 8-bit processor from individual logic gates and gradually evolve it to become a superscalar out-of-order machine. Along the way, we take a deep dive into contemporary computer architecture in a hands-on fashion and rediscover some of the foundations of modern computing.
Previous Video: ua-cam.com/video/mWuuduwPqNo/v-deo.html
Series Playlist: ua-cam.com/play/PLyR4neQXqQo5nPdEiMbaEJxWiy_UuyNN4.html
Assembler Playlist: ua-cam.com/play/PLyR4neQXqQo6zHnja5rwdyrByXFyZHSul.html
GitHub Repository: github.com/fabianschuiki/superscalar-cpu
- Assembly Language: en.wikipedia.org/wiki/Assembly_language
00:00 - Intro
02:16 - Label Declarations
05:56 - Labels As Operands
10:10 - Encoding Labels as Jump Offset
12:42 - Computing Label Offsets
14:39 - Resolving Label Names
22:58 - Using Labels in Programs
24:41 - Outro
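The chapters above (declaring labels, using them as operands, computing offsets, resolving names) map naturally onto a two-pass scheme. The following is only a minimal sketch of that idea, not the code from the video: the function name `resolve_labels`, the 2-byte instruction size, the `#` comment character, and the choice to measure offsets relative to the jump instruction itself are all assumptions made for illustration.

```python
# Hypothetical two-pass label resolution; not the assembler from the video.
INSTRUCTION_SIZE = 2  # assumption: every instruction occupies 2 bytes


def resolve_labels(lines):
    """Return (address, mnemonic, operand) tuples with label operands
    replaced by offsets relative to the instruction that uses them."""
    labels = {}
    instructions = []
    address = 0

    # Pass 1: record the address of each label and lay out the instructions.
    for line in lines:
        text = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not text:
            continue
        if text.endswith(":"):
            name = text[:-1].strip()
            if name in labels:
                raise ValueError(f"label {name!r} declared twice")
            labels[name] = address
            continue
        mnemonic, _, operand = text.partition(" ")
        instructions.append((address, mnemonic, operand.strip()))
        address += INSTRUCTION_SIZE

    # Pass 2: every label is known now, so forward references resolve too.
    resolved = []
    for address, mnemonic, operand in instructions:
        if operand in labels:
            operand = labels[operand] - address
        resolved.append((address, mnemonic, operand))
    return resolved


program = [
    "loop:",
    "  addi 1",
    "  jmp done",
    "  jmp loop",
    "done:",
    "  halt",
]
for entry in resolve_labels(program):
    print(entry)
```

With these assumptions, `jmp done` at address 2 resolves to an offset of 4 and `jmp loop` at address 4 resolves to -4. Collecting all labels before resolving any operand is what allows a jump to refer to a label declared later in the file.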
#assembler #compiler #homebrew #computer
Views: 1,722

Videos

Adding Conditional Moves to My CPU - Superscalar 8-Bit CPU #38
1.5K views · 1 month ago
Based on the new condition code checking circuitry in the ALU, we add a new conditional move instruction to my homebrew 8 bit CPU. We then teach the assembler how to deal with CMV and write a fancy new program that makes ample use of this new instruction. And as a final test, we crank up the clock speed to see just how fast we can compute the largest 16 bit Fibonacci number on my CPU. This vide...
How Condition Codes Work - Superscalar 8-Bit CPU #37
1.1K views · 1 month ago
The ALU of my homebrew CPU has four arithmetic flags. In this video, we are going to figure out how to make sense of these flags by defining a list of interesting condition codes. We then extend the ALU with three additional chips that allow us to evaluate the condition codes and determine which flags are set, if the operands to the previous comparison were equal or not, or which operand was le...
Poor Man's Conditional Jump - Superscalar 8-Bit CPU #36
1.3K views · 2 months ago
My homebrew CPU has recently gained four glorious arithmetic flags in its ALU. In this video, we are going to use these flags and the improved control over when registers are updated to define compare, test, and flag swap instructions. An updated assembler with support for these instructions will then allow us to write a fresh new program that puts the flags through their paces and even impleme...
Adding Flags to My CPU - Superscalar 8-Bit CPU #35
1.5K views · 2 months ago
Arithmetic flags are a powerful tool to make the most out of the limited compute resources offered by simple processors. My homebrew 8 bit CPU currently only has a carry flag, which has proven useful for performing wide arithmetic and to implement bit reversal and popcount functions. In this video, we are going to extend the ALU with a proper zero, sign, and overflow flag. To make flags managem...
Doing Boolean Algebra on My CPU - Superscalar 8-Bit CPU #34
1.7K views · 2 months ago
The big missing piece of my 8 bit ALU is the ability to perform bitwise boolean logic operations. In this video, we are going to extend the ALU with a bit of circuitry that can perform any 2-input logic operation and replace the current XOR gates that conditionally invert the right-hand side for subtraction. By hooking this new circuitry up to the ALU operation decoder, we can define new AND, O...
Decoding ALU Micro-Ops - Superscalar 8-Bit CPU #33
1.5K views · 3 months ago
My homebrew CPU can already do a lot of interesting instructions with its ALU: ADD, ADDC, SUB, SUBC, NEG, SHLL, SHLC, SHRL, and SHRC. But not all potential ALU operations are accessible as instructions. Due to the limited number of instruction bits we have available, operations like NOT or SHRA are not accessible from a program. In this video, we are going to fix this issue by introducing a dec...
Adding Bit Shift Instructions - Superscalar 8-Bit CPU #32
1.1K views · 3 months ago
The ALU of my homebrew CPU can do simple bit shifts, but the CPU does not have shift instructions yet! In this video we hook the modified ALU back up to the CPU, define new SHLL, SHRL, SHLC, SHRC, and NEG instructions, and extend the assembler to support them. These instructions allow us to write a new program that performs negation of a number, 16 bit left and right shifts, reverses the bits o...
Shifting Bits in my ALU - Superscalar 8-Bit CPU #31
1.1K views · 4 months ago
Many interesting algorithms require the bits in a number to be shifted left or right. My homebrew CPU can add and subtract numbers, accounting for the carry flag, but it does not have any bit manipulation capabilities yet. In this video we extend the ALU with a few "shift operation" multiplexers that allow us to shift the bits in a number to the left or right. This shift makes one bit of the nu...
Carry Flag - Superscalar 8-Bit CPU #30
1.4K views · 8 months ago
My homebrew CPU can add and subtract 8 bit numbers. That is nice if we only need to work with numbers between 0 and 255. But what if we need larger numbers? In this video we add a "carry flag" register and extend the ALU with a "carry operation" multiplexer. Based on this we define new ADDC and SUBC instructions and add support for them to the assembler. These instructions allow us to write a n...
How Computers Fake Negative Numbers - Superscalar 8-Bit CPU #29
1.1K views · 8 months ago
My homebrew CPU can add numbers. That’s nice, but it would be fantastic if it could also subtract numbers! The solution is surprisingly simple. We don't even need a subtraction circuit. In fact, modern computers have no idea what a negative number is, or how to do subtraction. In this video we take a close look at modulo arithmetic, the fact that overflows in the adder cause numbers to loop bac...
Teaching My CPU Addition - Superscalar 8-Bit CPU #28
1.5K views · 9 months ago
It's time I teach my CPU how to add numbers. We take the first step towards an Arithmetic Logic Unit by wiring up an adder and integrating it into my processor. To make this work, we introduce a write-back bus such that instructions can decide to write the adder result back to the register file. We then extend the Instruction Set Architecture with new ADD and ADDI instructions, and add support ...
How Computers Add Numbers - Superscalar 8-Bit CPU #27
1.5K views · 9 months ago
Computers feel like they can do anything: run video games, play music and movies, or browse the internet. But if you look closer, all these seemingly complicated things boil down to only a few basic operations, chief among them addition. In this video, we build up the fundamental circuits that allow a computer to add numbers. We synthesize the gates for half and full adders from a basic truth t...
Output Binaries from My Assembler - Superscalar 8-Bit CPU #26
1.2K views · 1 year ago
To finally make the assembler for my 8 bit CPU useful it needs to produce an output file. In this video we make the assembler collect all the encoded instructions into one binary blob, print the blob in a hexadecimal format that is easy to read, and write the binary to an output file that the CPU can directly execute. This video series explores the concepts and techniques that make modern compu...
Encode Instructions in My Assembler - Superscalar 8-Bit CPU #25
1.4K views · 1 year ago
The assembler for my 8 bit CPU can already read a program in human-readable assembly and compute the exact addresses of every instruction. In this video we teach the assembler how to compute the exact encoding for each instruction, that is, the exact bit-pattern that the CPU can understand and execute. This video series explores the concepts and techniques that make modern computer processors s...
Layout Programs in My Assembler - Superscalar 8-Bit CPU #24
1.6K views · 1 year ago
Parse Instructions into My Assembler - Superscalar 8-Bit CPU #23
5K views · 1 year ago
My CPU Does 16 Bit Jumps Now - Superscalar 8-Bit CPU #22
1.4K views · 1 year ago
Expanding the Register File - Superscalar 8-Bit CPU #21
1.6K views · 1 year ago
Register PCB - Superscalar 8-Bit CPU #20
2.2K views · 1 year ago
Circuit Testing with Python - Superscalar 8-Bit CPU #19
3.8K views · 1 year ago
Build Your Own Logic Analyzer - Superscalar 8-Bit CPU #18
6K views · 1 year ago
How to Access Registers - Superscalar 8-Bit CPU #17
2.2K views · 1 year ago
How a Register File Works - Superscalar 8-Bit CPU #16
4.4K views · 1 year ago
Program Counter PCB - Superscalar 8-Bit CPU #15
2.4K views · 1 year ago
Upgrading to 16 Bit Instructions - Superscalar 8-Bit CPU #14
2K views · 2 years ago
How to Fetch 16 Bits from an 8 Bit Memory - Superscalar 8-Bit CPU #13
2.9K views · 2 years ago
Defining First Instructions - Superscalar 8-Bit CPU #12
2.1K views · 2 years ago
Program Memory - Superscalar 8-Bit CPU #11
2.3K views · 2 years ago
Resetting the Program Counter - Superscalar 8-Bit CPU #10
1.7K views · 2 years ago

COMMENTS

  • @anon_y_mousse
    @anon_y_mousse 5 hours ago

    You might want to consider '.' as a valid label character as well. Not that prior examples really matter here, but `gcc` does use '.L' as a label prefix, and I'm not sure if it has the same semantics as it does for `nasm`, but in `nasm` it's used for local labels so you could have a .L1 in each function and it would still work. Just 0xf00d for thought.

    • @fabianschuiki
      @fabianschuiki 1 hour ago

      That's an excellent point! I like the idea of giving special meaning to labels starting with a `.` and treating them as relative/scoped to the previous label that did not have a leading `.`. Very elegant.
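The scoping rule discussed in this thread (a label starting with `.` belongs to the most recent label without a leading `.`) could be sketched roughly as follows. This is an illustration under assumptions, not an existing feature of the assembler: `qualify_labels` is a hypothetical helper, and joining the two names as `global.local` is just one possible convention.

```python
# Hypothetical sketch of dot-prefixed local labels; not the actual assembler.
def qualify_labels(declarations):
    """Map each declared label to a full name, scoping '.x' labels
    to the most recent label that has no leading dot."""
    scope = None
    qualified = []
    for name in declarations:
        if name.startswith("."):
            if scope is None:
                raise ValueError(f"local label {name!r} has no enclosing label")
            qualified.append(scope + name)
        else:
            scope = name
            qualified.append(name)
    return qualified


print(qualify_labels(["fib", ".loop", ".done", "popcount", ".loop"]))
# ['fib', 'fib.loop', 'fib.done', 'popcount', 'popcount.loop']
```

Because both `.loop` labels end up with different full names, each routine can reuse short local names without clashing.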

  • @milesrout
    @milesrout 12 hours ago

    The way you add onto strings within a loop (buffer += bytes(...)) is accidentally quadratic. Every time you append a new thing to the end of a string, you have to allocate a whole new string. This will start to bite with bigger programs. You should use bytearray or a list which you b"".join() at the end.

    • @fabianschuiki
      @fabianschuiki 6 hours ago

      Excellent point! 👍 I'll fix that 🙂
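The two alternatives suggested in this thread might look like the sketch below. The list `encoded_words` and the 16-bit little-endian layout are placeholders chosen for illustration; they are not taken from the assembler's source.

```python
# Illustrative only: encoded_words stands in for the assembler's encoded instructions.
encoded_words = [0x1234, 0xABCD, 0x00FF]

# Option 1: a mutable bytearray is extended in place instead of being rebuilt.
buffer = bytearray()
for word in encoded_words:
    buffer += word.to_bytes(2, "little")  # assumption: 16-bit little-endian words
binary = bytes(buffer)

# Option 2: collect the pieces in a list and join them once at the end.
chunks = [word.to_bytes(2, "little") for word in encoded_words]
assert b"".join(chunks) == binary

print(binary.hex())  # 3412cdabff00
```

Both variants produce the same bytes as repeated `bytes` concatenation; they differ only in how much copying happens along the way.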

  • @dmoisset
    @dmoisset 6 days ago

    This channel is great, i love the editing and presentation style. I'm curious, what software do you use to create your animations? Those look amazing and add a lot to the production quality and communication

    • @fabianschuiki
      @fabianschuiki 6 days ago

      Thank you very much! 😃 I use Manim by the fantastic Grant Sanderson @3blue1brown

    • @dmoisset
      @dmoisset 6 days ago

      Oh, that's cool! I've seen his videos, I should have recognized the style 😊

    • @fabianschuiki
      @fabianschuiki 6 days ago

      @dmoisset 😃 It has a very distinct way of animating things. I love the style of his videos and how much effort and attention to detail he puts into it. And the fact that it's a programmatic way of animating helps a lot with the more systematic animations, like showing the CPU's internal state.

  • @milesrout
    @milesrout 8 days ago

    Good stuff. Pity the syntax highlighting of '1f' and '1b' isn't the same in your text editor though. Custom syntax highlighting is tedious but so nice once you do it. It's pretty easy in Sublime Text too if I remember correctly (I haven't used Sublime in over 10 years!). Just a few regular expressions. I wrote an assembler and emulator during the 0x10c/DCPU-16 craze. I'm pretty sure more assemblers were written and published online as free software in the space of a month than have ever been written in any other calendar month in history.

    • @fabianschuiki
      @fabianschuiki 8 days ago

      😃 I'll definitely go and fix the syntax highlighting at some point!

  • @lawrencemanning
    @lawrencemanning 8 days ago

    I cheated and just borrowed ARM’s 4 bit codes. 😂 Except I made 0000 “always” as it bugged me otherwise. Edit: and on a previous build I had instruction bits to burn so just had a nybble for “cares” flags and a nybble for what value the cared for bits had to be. It works fine and the programmer can dream up nonsensical tests (eg. Zero and negative) if they want, but it is wasteful.

    • @fabianschuiki
      @fabianschuiki 8 days ago

      Haha, that's definitely a good approach! 😃 How come you had bits to spare?

    • @lawrencemanning
      @lawrencemanning 8 days ago

      @fabianschuiki my very first softcore processor was a 16 bit address and data multi cycle. Many instructions took trailing 16 bit immediates, including branching. Was quite pleased with it (got as far as programming Snake; video on my channel if you are interested), but in retrospect it wasn’t great. Latest 32 bit core is a mostly RISC like 2 stage pipeline with embedded immediates. It’s not quite as friendly on the assembly programmer, but more interesting technically. I’ve implemented Boulderdash on that. Yes I build processors to play 80s computer games! 🤣

    • @fabianschuiki
      @fabianschuiki 8 days ago

      @lawrencemanning That's really nice 😃🤓!
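The "cares" nibble plus "value" nibble encoding described at the top of this thread can be modeled in a few lines. This is a sketch of the general idea only: the flag bit positions and the name `condition_met` are assumptions and do not describe either commenter's hardware.

```python
# Hypothetical flag bit positions; the real hardware layout may differ.
ZERO, SIGN, CARRY, OVERFLOW = 0b0001, 0b0010, 0b0100, 0b1000


def condition_met(flags, care, value):
    """True if every flag selected by `care` matches the same bit of `value`."""
    return (flags & care) == (value & care)


flags = ZERO | CARRY
print(condition_met(flags, care=ZERO, value=ZERO))                # True
print(condition_met(flags, care=CARRY, value=0))                  # False
print(condition_met(flags, care=ZERO | SIGN, value=ZERO | SIGN))  # False
print(condition_met(flags, care=0, value=0))                      # True
```

A `care` nibble of zero acts as "always", and combinations such as "zero and negative" can be expressed even though they never become true after a normal comparison, which is the wastefulness mentioned in the comment.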

  • @lawrencemanning
    @lawrencemanning 8 days ago

    I've never seen someone use the flags register state to directly calculate a branch offset. Well done. :) Question: computed branches seem a bit unusual. Computed jumps, sure, many uses. But branching through an offset held in a register? How many ISAs have that?

    • @fabianschuiki
      @fabianschuiki 8 days ago

      None that I am aware of 😃. Usually you would just load a base address into a register and add the offset onto that yourself. But since this is an 8 bit machine I thought it might come in handy. And it was basically free hardware-wise. Immediates and the rs2 operand are on the same wires 🙂

  • @lawrencemanning
    @lawrencemanning 8 days ago

    Have you done any tests to determine the current fMax? I’m curious. 😊 Very neat design, though the downside of GALs (etc) is they obscure what’s going on. I’ll let you off with the Boolean ops. 😂 I liked James’s solution too and it would have been a bit repetitive to solve it the same way!

    • @fabianschuiki
      @fabianschuiki 8 days ago

      I haven't done any tests yet. It might be worth statically computing the timing of the CPU, comparing it to the actual hardware, and then figuring out where to place registers to cut the long paths.

  • @lawrencemanning
    @lawrencemanning 9 days ago

    So you didn’t fancy implementing a zero page mode then. 😂

  • @lawrencemanning
    @lawrencemanning 9 days ago

    On decoupling caps: my preferred solution is to put them (and other passives) on the back. Few folks do this and I’m not sure why. The only drawback is you need mounting posts on the corners to bring the board off the table when attaching the parts to the back, with suitable holes for the screws holding the posts in in your stencil if you really need one, but it works well: place the (decent size) cap in the centre of the back of the IC with some fat traces going through a via to the power pads on the front.

    • @fabianschuiki
      @fabianschuiki 9 days ago

      It would be fantastic to have components on the back 🥳! I haven't figured out how to make that reliable during soldering though. You probably have to go solder the caps manually afterwards because the hot air would heat the PCB to a point where the front parts would fall off again 🤔 Not sure how this is done industrially. I've seen glue being used to hold components in place occasionally -- not sure if that's the answer though.

    • @lawrencemanning
      @lawrencemanning 9 days ago

      @fabianschuiki I’ve never had a problem with caps falling off. I usually solder the caps etc on the back with hand placed paste, then flip the board over and solder the front parts the same way. I used PCB posts on all corners on both the front and back by screwing posts into posts in a stack. That way the board is always flat on the desk. I do not usually solder a whole board in one sitting with a stencil as my boards are bigger, but I have done it before. I don’t believe enough heat will travel through the board to make components fall off, but don’t hold me to it. I have been doing that for a few years now though with my projects and never had a problem. Also, your caps are disproportionately small compared to the SOIC? 50mil packages. In my opinion anyway. 1206 or 0805 would probably be the size I’d go for.

  • @lawrencemanning
    @lawrencemanning 9 days ago

    I’ll never like this style of schematic. Far better is to use symbols that don’t resemble the physical parts. Do you not have busses in that schematic capture tool? Case in point: what if you suddenly had to switch to a different package type for the IC? This would be much easier if logical symbols were used. It really goes against the principle of what a schematic is for; they are more than a means to design a PCB, they are for explaining how the circuit works. Also how come you didn’t use KiCAD, which is far more friendly for people building their own? Sorry for coming across as negative. I love what you’re doing.

    • @fabianschuiki
      @fabianschuiki 9 days ago

      It's pretty much down to limitations in EasyEDA. The neat thing about it was that it's incredibly easy to get going, you can do all the PCB ordering right in the program, and it has pretty much every LCSC part already in the library. KiCAD will definitely be my tool of choice for future projects 🙂

    • @lawrencemanning
      @lawrencemanning 9 days ago

      @fabianschuiki I’m glad you are going to be looking at KiCAD soon. You’ll like it. The workflow for getting a board made is really not that scary: export as gerbers (defaults are almost certainly sufficient), export the drill file, zip it up in a directory and upload it to wherever you like. I actually use JLC at the moment, and they will decode the zip, work out what layer is which for you, figure out what layers are internal, silkscreen, mask etc, and tell you how much it costs. It’s pretty simple. I wouldn’t be surprised if someone has written a KiCAD plugin to make it even easier actually. FYI the pathological example of logical vs physical symbols is the use of individual gates, which become nearly indecipherable when a physical representation of the package is used, compared to its logical symbol. Consider also the realistic possibility that you might decide you only need one NAND gate on a board and could swap out that quad ‘00 for a single gate package. Check out some old computer schematics for a nice illustration of what to aim for. Someone “digitised” the Amiga schematics and they are like art. 😊

  • @lawrencemanning
    @lawrencemanning 9 days ago

    I think you could have done with some explanation as to why you went for a wider instruction word vs a fully encoded instruction format like the Z80. Presumably you didn’t discuss this because it opens up some bigger questions around instruction sequencing and microcode etc. Also 2 ROMs would have been an option that would have been interesting to discuss. That would have been my preferred solution as it avoids some interesting timing considerations. Anyways, loving this series. 😊

    • @simontillson482
      @simontillson482 9 days ago

      Seems to me he’s going for a more RISC style instruction set, which lends itself to a more hardwired instruction opcode decoding. I too was wondering about the timing issues - surely, toggling the latch enable at the same time as the address LSB will cause something of a race. I was a bit surprised that it seems to work so well! (Actually, he does explain why it works further down this thread - it seems the ROM is way slower than the latch, so the latch has effectively stored the old value before the ROM’s data lines begin to transition to the new value. Cool.)

    • @fabianschuiki
      @fabianschuiki 9 days ago

      I wanted a setting where instructions have a fixed length and somewhat rigid layout. For operation at 1 or 2 instructions per cycle, you need to be able to decode the instruction in a single cycle, including all operands. Variable length encoding makes this much more annoying, especially when you decode two instructions in parallel. (Although it's a very attractive thing in "subscalar" designs where you have multiple clock cycles available per instruction.) The wide instruction word is just to have enough registers and opcode space. If I had only four registers, I could have at most 16 different two-operand 8 bit instructions. With eight registers it would be down to 4 instructions. So the 16 bits were a necessity.
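The opcode-space numbers in the reply above follow from simple bit counting: each register operand needs enough bits to select one register, and whatever is left of the instruction word selects the opcode. A small sketch of that arithmetic, where `opcode_count` is a name made up for this example:

```python
def opcode_count(instruction_bits, registers, operands=2):
    """Opcodes left after reserving bits for the register operands."""
    bits_per_register = (registers - 1).bit_length()
    opcode_bits = instruction_bits - operands * bits_per_register
    if opcode_bits < 0:
        raise ValueError("operands do not fit into the instruction")
    return 2 ** opcode_bits


print(opcode_count(8, registers=4))  # 16
print(opcode_count(8, registers=8))  # 4
```

With four registers, two operands take 2 × 2 = 4 bits and leave 4 opcode bits (16 instructions); with eight registers they take 2 × 3 = 6 bits and leave only 2 opcode bits (4 instructions), matching the figures quoted in the comment.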

    • @fabianschuiki
      @fabianschuiki 9 days ago

      @simontillson482 The timing feels pretty wonky at first 😃. In synchronous digital design you want to have all your signal toggles launched exactly at the clock though. This works here as well since the hold time of the latch (time after the clock that the data needs to stay stable) is around 1-2ns, but the contamination delay of the ROM (time between changing an input and seeing first changes at an output) is somewhere around 40-70ns. I've seen a little bit of instability in the ALU though that is likely related to the latch not holding its data for long enough after it becomes transparent (propagation delay faster than the hold time in the ALU flags register).

  • @lawrencemanning
    @lawrencemanning 10 days ago

    Nice to see you using minipro and not the awful Windows software. 😊

  • @lawrencemanning
    @lawrencemanning 10 days ago

    Can’t believe I haven’t found your videos before! Very very cool. 5:29 surprised not to see basic pipelining (one instruction issue per clock, multiple clocks to complete an instruction) here, since it’s the logical next step after concluding your clock rate won’t scale with every instruction needing to execute in a single clock cycle. Anyway, great stuff!

    • @fabianschuiki
      @fabianschuiki 10 days ago

      Thanks! 😃 Basic pipelining will definitely come into the picture.

  • @perkyelixir2254
    @perkyelixir2254 10 days ago

    i realize this might be a bit much to ask, but if you would make a video on writing an llvm backend (or some other high level language thing) at some point in the future, that would be great

    • @fabianschuiki
      @fabianschuiki 10 days ago

      That's a great idea! I've been toying with the thought of taking the assembler and building out a simple IR and register allocation, to conceptually explore what LLVM and other compiler backends do. Adapting LLVM for my CPU would be a very nice thing to do 😃

  • @fab4key
    @fab4key 10 days ago

    You can add more things to the assembler, such as .align, .text and .data

    • @fabianschuiki
      @fabianschuiki 10 days ago

      Absolutely! Once the assembler supports expression evaluation, a lot of cool other features get unlocked in a sense. I like the idea of having segments like `text` and `data`, and allowing the user to lay those out in memory. Also, having the ability to plop down data and strings would be really handy.

  • @fab4key
    @fab4key 10 days ago

    Cool video! Your channel actually helped me a lot with my own assembler written in Lua. Can I ask what ':=' in an if statement means?

    • @fabianschuiki
      @fabianschuiki 10 days ago

      Thanks! 🙂 `:=` assigns the value on the right to the variable on the left, and also returns the value on the right. It's a nice way to check if a value is not none in an if, and then have the value available in a variable inside the if block.

    • @DavidLatham-productiondave
      @DavidLatham-productiondave 10 days ago

      It's called the walrus operator. Which is kinda cute.

    • @fabianschuiki
      @fabianschuiki 10 days ago

      @DavidLatham-productiondave Haha, I love that name 😂
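As a small illustration of the `:=` pattern explained in this thread (available since Python 3.8), the sketch below binds a regex match inside the `if` condition and uses it in the body. The pattern `LABEL_RE` and the function `parse_label` are hypothetical examples and are not taken from the assembler.

```python
import re

# Hypothetical pattern for a label declaration such as "loop:".
LABEL_RE = re.compile(r"([A-Za-z_][A-Za-z0-9_]*):")


def parse_label(line):
    """Return the label name declared on this line, or None if there is none."""
    if (match := LABEL_RE.fullmatch(line.strip())) is not None:
        # `match` was assigned by := and is available inside the if block.
        return match.group(1)
    return None


print(parse_label("loop:"))   # loop
print(parse_label("addi 1"))  # None
```

Without the walrus operator the same logic needs a separate assignment line before the `if`.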

  • @lawrencemanning
    @lawrencemanning 10 days ago

    I use CustomASM for my softcore. It’s far from perfect, but it’s pretty good. Have you played with it? Not that there’s anything wrong with doing it yourself, even if CustomASM meets all your needs. 😊

    • @fabianschuiki
      @fabianschuiki 10 days ago

      I haven't really played around with it. The prospect of writing an assembler totally from scratch was too exciting 😅

    • @lawrencemanning
      @lawrencemanning 10 days ago

      @fabianschuiki yes indeed! I will probably look at it eventually. One of the things CustomASM can’t do (AFAIK) is generate linkable objects, so you end up with includes to bring in your “modules”. It works, but it’s not nice. I shall have a look at your other videos later; I’ve never been brave enough to build a processor on breadboard and have massive respect for folks who take this on!

  • @OscarSommerbo
    @OscarSommerbo 10 days ago

    Very nice and tight video. Pacing was just right.

    • @fabianschuiki
      @fabianschuiki 10 days ago

      Thanks! 😃 Trying to be a bit more on-point 😉

  • @mekafinchi
    @mekafinchi 10 days ago

    One feature I'd strongly recommend are local labels - where using a prefix (usually '.') prepends the most recent normal label to the logical name of a label. This lets you have descriptive names without global scope rather than being limited to globals or numbers. Local labels could also be accessible from any context by using the full logical name e.g. "normal.local" referring to ".local" in "normal" even outside normal's block

    • @fabianschuiki
      @fabianschuiki 10 days ago

      That is a fantastic suggestion, thanks a lot! Will definitely add those 🙂👍

    • @DavidLatham-productiondave
      @DavidLatham-productiondave 10 days ago

      I also use unnamed labels in ca65. An unnamed label is a : by itself. Then you can branch to a count of unnamed labels in forwards or backwards direction. Unnamed labels are scoped from the previous normal label until immediately before the next normal label. E.g. (6502):
      ```
      count_to_65536:
        ; here unnamed labels are scoped to count_to_65536
        ldx 0
      : ldy 0
      : iny
        beq :-
        inx
        beq :--
      next_label:
        ; here unnamed labels are scoped to next_label
      ```

    • @fabianschuiki
      @fabianschuiki 10 days ago

      @DavidLatham-productiondave That's a pretty neat approach! I like how this doesn't clutter the text at all. The relative labels I have implemented tend to be just `1:` and `2:` in practice... Makes sense to leverage that and provide more compact syntax! 👍

  • @OscarSommerbo
    @OscarSommerbo 10 days ago

    YES!! Labels! Finally, something I would have added way earlier. But then I come from a C background and I learned assembler backwards by looking at the compiled byte code, and labels are essential. Of course, you can do what Fabian has been doing so far, calculating his own jump offsets, but when you have an incredibly powerful calculator why not use it. 😊 This will be a fun episode I bet.

    • @fabianschuiki
      @fabianschuiki 10 days ago

      Yeah, it was about time to add those 😁

  • @0ffGridTechClub
    @0ffGridTechClub 13 days ago

    This is the coolest thing I've ever seen ! I'm just starting to learn PCB design as well as i2c design and debugging.

    • @fabianschuiki
      @fabianschuiki 12 days ago

      Thanks! 🙂 It comes in pretty handy for testing more complicated behaviors of your PCBs.

  • @calculus7
    @calculus7 15 days ago

    I assume your CPU will also be pipelined like James Sharman's? I agree that his ALU design is quite elegant, but since I’m designing a stepped processor, I’m thinking of not running everything through the adder. In James’ design, I believe this requires two pipeline stages before an ALU result can be obtained…not a problem for a pipelined processor. I’m not quite sure why two stages are needed but I’m thinking for my stepped processor design it might be better to keep each calculation circuit separate (add, shift, logic ops, etc) and only choose which one to apply at the final output of the ALU. Hope that makes sense. My CPU, though a stepped processor, currently takes three cycles for most operations. That’s why I’d like to minimize the number of cycles needed in the ALU.

    • @fabianschuiki
      @fabianschuiki 12 days ago

      Yes, at a later point I'll start adding pipeline stages in the places where they make sense. It's a good idea though to quickly sketch out on paper what kinds of chips you have feeding into each other, and then sum up the propagation delay along those paths. The adder is pretty fast, probably 1/4th of the time a ROM-based decoder would take to produce an output. AND/OR/XOR and multiplexers are going to be even faster. So you can probably rack up quite a few adders or logic chips in your ALU before they start making your CPU slower (because they overtake the decoder as the critical path). Keeping the ALU paths separate for the different functions is a nice idea! That would likely take quite a few additional chips because you have to replicate some work (XOR for subtraction in parallel to logic ops, maybe some redundancy in the logic ops?) but you should be able to get a bit more speed out of it 👍

  • @notabeneenterprises4210
    @notabeneenterprises4210 25 days ago

    Hi Fabian. Absolutely loving this series. Thank you so much. (And kudos for recognizing James Sharman’s excellent work too.) One question: What font are you using in Sublime? Looks really cool and I’d like to try it in my IDE.

    • @fabianschuiki
      @fabianschuiki 25 days ago

      Hey, thanks for the kind words! 😃 The font is Iosevka.

  • @somethingnonsense5389
    @somethingnonsense5389 26 days ago

    great and extensive work on explaining and showing what's going on, haven't seen anything close except the well-known ben eater series. But i was wondering if it would be possible to also have through-hole pcb's? I would love to build along, but i can't do SMD soldering myself, but through-hole i can.

    • @fabianschuiki
      @fabianschuiki 26 days ago

      Thank you for the kind words! 🙂 You can definitely replace the ICs with through-hole variants on the PCB. Things are going to be a whole lot bigger, which may become a nuisance later on. Especially with superscalar execution, things like the registers are going to require quite a few additional chips. You could completely change the buildup and switch to a backplane into which you plug the PCBs with a card edge style connector. That could be really cool, and it buys you a lot of space for through-hole ICs 😃🥳

    • @lawrencemanning
      @lawrencemanning 8 days ago

      As someone who stuck rigidly to throughhole for probably a decade, I can strongly recommend you give SMT a crack. It is not nearly as hard as it looks, nor is it expensive in terms of equipment. A hot air station and basic standalone microscope is all you need to learn. Stick to 50mil SO and SOIC parts, with 1206 resistors and you are good. You will not regret it. I laughed at one of Fabian’s comments when he said that he really hoped the orientation on some parts was correct in a build; swapping say an LED is a 10 second job vs many minutes, if you are fortunate and own all the tools, on a throughhole LED.

  • @davidrosset4457
    @davidrosset4457 28 днів тому

    davidr@raspberrypi:~/Documents $ python tester.s
    Check write of 0x00
    inputs: WD=00000000 WE=1 RE1=0 RE2=0 RE3=0
    clock
    inputs: WD=00000000 WE=0 RE1=1 RE2=0 RE3=0
    outputs: RD1=00000000 (00000000) RD2=00000000 (ZZZZZZZZ) RD3=00000000 (ZZZZZZZZ)
    inputs: WD=00000000 WE=0 RE1=0 RE2=1 RE3=0
    outputs: RD1=00000000 (ZZZZZZZZ) RD2=00000000 (00000000) RD3=00000000 (ZZZZZZZZ)
    inputs: WD=00000000 WE=0 RE1=0 RE2=0 RE3=1
    outputs: RD1=00000000 (ZZZZZZZZ) RD2=00000000 (ZZZZZZZZ) RD3=00000000 (00000000)
    Check write of 0xFF
    inputs: WD=11111111 WE=1 RE1=0 RE2=0 RE3=0
    clock
    inputs: WD=00000000 WE=0 RE1=1 RE2=1 RE3=1
    outputs: RD1=00000000 (11111111) RD2=00000000 (11111111) RD3=00000000 (11111111)
    inputs: WD=00000000 WE=0 RE1=0 RE2=0 RE3=0
    FAILED: 9 mismatches
    davidr@raspberrypi:~/Documents $

    Here's the error I get from minute 40:51... I'm pretty sure the code is matched, what can this be?

  • @PewDiePie777
    @PewDiePie777 29 днів тому

    I love this channel.

  • @AbelShields
    @AbelShields Місяць тому

    Damn, this is so underrated - discovered this series on Friday and binged it over the weekend, now I'm hooked and eagerly awaiting more!

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Thanks for the kind words! 🙂 Lots of PCB work incoming to make room for more pipeline fun 👍

    • @AbelShields
      @AbelShields 29 днів тому

      @@fabianschuiki I genuinely can't wait, I've seen several circuit board processors but they all tend to be simple. I wrote a Gameboy emulator in C++ and while the simple/early processors can help you understand the basics of modern ones, it doesn't give you a real feel for how modern out-of-order superscalar processors actually got this fast, so seeing someone actually implement some of these features could be really useful, for me and for anyone else interested in CPU design or low-level programming :)

    • @fabianschuiki
      @fabianschuiki 27 днів тому

      😃 That is my hope for this project. There are a lot of pitfalls and ways for the complexity with discrete ICs to get out of hand though. It's quite a balancing act.

  • @PewDiePie777
    @PewDiePie777 Місяць тому

    Finally, the content I am willing to watch.

  • @AbelShields
    @AbelShields Місяць тому

    Plain try/except in Python is often discouraged; if you're going to catch a Ctrl-C, you should be explicit and use `except KeyboardInterrupt`.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      You're totally right, that's a great point! 👍 Without the explicit check for KeyboardInterrupt I'm silently stopping on any kind of exception, with no useful error message. Thanks! 🙂

    • @dzhong111
      @dzhong111 29 днів тому

      I would even leave out the 'except'. This is a good use case for 'finally' which is used to clean up regardless of whether an exception was raised or caught.

    • @fabianschuiki
      @fabianschuiki 29 днів тому

      @dzhong111 Ah that's a great point! Thanks 👍
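
A minimal sketch of the `finally` suggestion from this thread. The names `run_with_cleanup`, `step`, and `cleanup` are hypothetical stand-ins and are not taken from the project's scripts; the point is only that the cleanup runs on every exit path while the exception itself still propagates with its full error message.

```python
def run_with_cleanup(step, cleanup, ticks):
    """Call step() `ticks` times and always call cleanup() afterwards.

    `step` and `cleanup` are hypothetical callables standing in for
    "drive the hardware once" and "release the hardware".
    """
    done = 0
    try:
        for _ in range(ticks):
            step()
            done += 1
    finally:
        # Runs on normal completion, on Ctrl-C (KeyboardInterrupt), and on
        # any other exception. Nothing is swallowed: the exception keeps
        # propagating after cleanup() returns.
        cleanup()
    return done
```

If Ctrl-C should end the script quietly, an explicit `except KeyboardInterrupt:` clause can sit between `try` and `finally`; every other exception would still surface normally.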

  • @bradywb98
    @bradywb98 Місяць тому

    y'all smell that? i think we're about to be cooking with gas.

  • @quazillionaire
    @quazillionaire Місяць тому

    It's really starting to come together into a real CPU, very cool! I'm very much looking forward to seeing all the wires turn into pretty PCBs. Awesome work!

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Thanks! 🙂 Yeah I also can't wait to see that. Finally some order to the chaos. 😅

    • @KeesJanLogemann
      @KeesJanLogemann Місяць тому

      @@fabianschuiki I hope you won't need bog-wires... 😮😅

    • @fabianschuiki
      @fabianschuiki Місяць тому

      First time right? Fingers crossed 😃

  • @Patrick1985McMahon
    @Patrick1985McMahon Місяць тому

    can someone explain the capacitors on the power rail? These videos are great but there's still lots of small things I wish were explained.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      They are considered good practice on the supply rails because of how digital circuits operate: when the clock signal goes from 0 to 1, all chips start to toggle and change at the same time, until all values have been computed and the changes have died down. Each toggling chip requires energy that it draws from the power rail. So when the clock edge arrives and the activity starts, there's a sudden huge rush of current being pulled from the power rails. Especially on breadboards and through long power cables, the power rail can't immediately serve that current, because the wiring has a certain inductance that causes the current to ramp up slowly (like trying to abruptly move a heavy object). The capacitors add a local reservoir of energy that is close to the chips, which can supply current over a much shorter wire with much less inductance. This evens out the power drawn over the main power cable.
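
To put rough numbers on the explanation above, here is a back-of-the-envelope sketch. Every value in it is an assumption picked for illustration (wire inductances, current step, rise time, capacitor size); none of them was measured on the actual build. It uses the two textbook relations V = L · ΔI/Δt for an inductor and ΔV = I · Δt / C for a capacitor.

```python
def inductive_drop(inductance_h, delta_i_a, delta_t_s):
    """Voltage across a wire inductance for a current step: V = L * dI/dt."""
    return inductance_h * delta_i_a / delta_t_s


def capacitor_droop(capacitance_f, current_a, duration_s):
    """Voltage lost by a capacitor supplying a constant current: dV = I * t / C."""
    return current_a * duration_s / capacitance_f


# All values below are assumed for illustration only.
LONG_WIRE_H = 100e-9    # long supply lead, assumed 100 nH
SHORT_WIRE_H = 5e-9     # short path to a nearby capacitor, assumed 5 nH
CURRENT_STEP_A = 0.1    # assumed 100 mA surge at the clock edge
RISE_TIME_S = 5e-9      # assumed 5 ns for the surge to build up
DECOUPLING_F = 100e-9   # assumed 100 nF ceramic capacitor

drop_without_cap = inductive_drop(LONG_WIRE_H, CURRENT_STEP_A, RISE_TIME_S)
drop_with_cap = inductive_drop(SHORT_WIRE_H, CURRENT_STEP_A, RISE_TIME_S)
droop_of_cap = capacitor_droop(DECOUPLING_F, CURRENT_STEP_A, RISE_TIME_S)
```

Under these assumed values the long lead alone would drop a couple of volts during the surge, while the short path to a local capacitor drops on the order of a tenth of a volt and the capacitor itself sags only by millivolts.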

  • @alexw5093
    @alexw5093 Місяць тому

    You continue to inspire me. I’m learning how to script, plan, and shoot videos like this so I can create content like this in future. Can’t wait to continue to see this project unfold.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Thank you very much! 🙂 I'm glad the series is useful and motivates you to do a series too. I'd totally watch that 😃

  • @KeesJanLogemann
    @KeesJanLogemann Місяць тому

    How do you intend to use labels as in the source examples? You only have the jrelr to use, where you (conditionally) change the content of the register you use to jump, because the jreli uses the upper 8 bits as the immediate to jump. Or are you planning something like a conditional jreli (cjreli?), but then where would you store the immediate value in the instruction?

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Great point! 🙂 On the one hand I plan to have conditional jumps as you say, with jump offsets encoded as immediates. (With proper instruction decoding, jumps can use the rd operand bits as condition code, which leaves the upper 8 bits for the immediate.) But you should also be able to compute the offsets with labels in the assembler, for example as `ldi r4, (exit2-loop2-2)`.

    • @KeesJanLogemann
      @KeesJanLogemann Місяць тому

      @fabianschuiki the value of the address of the label is available in the Layouter part. So you need an Operand kind "Expression" and "Label" that can be resolved in the 2nd iteration of Layouter (first iteration is getting the value of the addresses with "label opcode" like "loop:" and "exit:" , 2nd iteration is resolving the "label operands" and "expression operands" where label values are used). For each Opcode where a Label operand is used, in the Layouter you resolve the "Imm operand". For example if inst.opcode == Opcode.JRELI, change the "Label operand" for the "Imm operand" with the value of Label address - current_address, which gives the proper Immediate jump value (ahead or back, + or -).
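
The two-iteration scheme described in this comment could be sketched roughly as follows. This is not the code from the repository: the `Line` record, the `layout` function, the 2-byte instruction size, and the signed 8-bit range check are all assumptions made for illustration. The offset is computed as label address minus the address of the jump itself, as proposed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    """One source line: either a label declaration or an instruction (hypothetical)."""
    label: Optional[str] = None    # set for declarations such as "loop:"
    opcode: Optional[str] = None   # set for instructions such as "jreli"
    target: Optional[str] = None   # label name used as an operand, if any
    imm: Optional[int] = None      # filled in by the second pass


def layout(lines, inst_size=2):
    """Resolve label operands to relative offsets in two passes.

    inst_size=2 assumes every instruction occupies two bytes.
    """
    # Pass 1: walk the program and record the address of each label.
    labels = {}
    address = 0
    for line in lines:
        if line.label is not None:
            if line.label in labels:
                raise ValueError(f"label {line.label!r} declared twice")
            labels[line.label] = address
        else:
            address += inst_size

    # Pass 2: replace each label operand with (label address - own address).
    address = 0
    for line in lines:
        if line.label is not None:
            continue
        if line.target is not None:
            if line.target not in labels:
                raise ValueError(f"unknown label {line.target!r}")
            offset = labels[line.target] - address
            if not -128 <= offset <= 127:
                raise ValueError(f"jump to {line.target!r} out of 8-bit range")
            line.imm = offset
        address += inst_size
    return labels
```

Because all labels are known before any operand is resolved, forward references such as a jump to `exit` ahead of its declaration need no special handling.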

    • @fabianschuiki
      @fabianschuiki Місяць тому

      @KeesJanLogemann Yeah, this sounds great! 😃 Having at least a limited form of expressions available is going to be very useful 👍

    • @KeesJanLogemann
      @KeesJanLogemann Місяць тому

      @@fabianschuiki It took me some posts to realize that YouTube comments with links are destroyed 😢 To use "plain labels" for the jreli opcode, I forked your GitHub superscalar-cpu to mine (kjlogic) and added label support. Find the repo at / kjlogic / superscalar-cpu / assembler. Use at your convenience 😊

    • @fabianschuiki
      @fabianschuiki Місяць тому

      @KeesJanLogemann That's awesome! 🥳 Thanks for the pointer 😃 Very convenient 😏

  • @TavishMcEwen
    @TavishMcEwen Місяць тому

    Pog :3

  • @costa_marco
    @costa_marco Місяць тому

    Did you validate the cmp -128, 127 and cmp 127, -128 cases? I was expecting at least a passing comment on them in the video.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Not yet unfortunately 😕. I recorded the testing and the ALU build in the last episode in one recording session without time in between to react.

  • @PhilBoswell
    @PhilBoswell Місяць тому

    I'm wondering how many instructions you could conceivably add conditions to: I have a dim memory of a RISC architecture that made any instruction potentially conditional, with the default being "TRUE" 🤪 I'm also wondering whether you might be able to have multiple ALUs, etc, for more parallelism 🤔 This is really fun and intriguing, thank you so much‼

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Thanks 🙂! If I remember correctly, the Arm instruction set had a feature that sounds like what you describe. There were some conditional prefixes you could attach to instructions (or they might have been part of the instruction itself), which allowed you to only execute them under certain circumstances. I'm not sure if that applied to all instructions, or whether that was limited to a certain subset.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      And I think there's definitely room for multiple ALUs. Once we move to superscalar execution, we'll want at least two ALUs to showcase how that can help with performance. If all goes well, I'll end up with a reservation station for the ALUs that collects instructions to be executed, and then you'd have two ALUs that work on executing those instructions once they are ready.

  • @schrodingerscat1863
    @schrodingerscat1863 Місяць тому

    This channel should have way more subscribers, easily one of the best explanations of how processors work and how to construct one. Everything explained in great detail making it a great resource for anyone interested in the subject.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Thank you very much for the kind words! 🙂

  • @OscarSommerbo
    @OscarSommerbo Місяць тому

    Very nice, however the testing was overly thorough in my opinion. While testing all cases are fine for circuit validation in a video, it gets a bit repetitive and tedious. But that is just my opinion and I might not represent that average viewer.

    • @schrodingerscat1863
      @schrodingerscat1863 Місяць тому

      For a microprocessor implementation, testing and validation are absolutely key to success. You have to test every permutation because you never know when an operation on the edge of timing conditions is going to start randomly failing. This project is following best practice in this respect, and as it becomes more complex and clock speeds get ramped up this will really be invaluable. The reason ARM is so successful is precisely because their testing and validation systems are so extensive and complete. Testing is essentially most of the value of that company, and it is extremely time-consuming and expensive to implement, which is why so many companies license the ARM architecture rather than creating their own.

    • @OscarSommerbo
      @OscarSommerbo Місяць тому

      @@schrodingerscat1863 Did I say or indicate that he shouldn't test? I merely suggested that doing it on camera makes for rather drawn out segment. I agree that comprehensive testing of each function is mandatory, just that it doesn't make for great content. And I certainly didn't suggest that my view was any more valid than anyone else's.

    • @schrodingerscat1863
      @schrodingerscat1863 Місяць тому

      @@OscarSommerbo Your OP didn't really make that clear, seemed to suggest that thorough testing was time wasting in my reading of it. I think showing the full testing process gives a realistic representation of how hardware design is done and how long testing takes compared to knocking up the circuit to test.

    • @OscarSommerbo
      @OscarSommerbo Місяць тому

      @@schrodingerscat1863 I might have misplaced a comma. My opinion is: Testing is essential, showing testing can get boring. "Can get", different viewers want different things, I was merely stating a preference, and since I am not making these videos I have no actual input.

    • @schrodingerscat1863
      @schrodingerscat1863 Місяць тому

      @@OscarSommerbo It's a good point, the editing on the presentation could be tighter.

  • @JaenEngineering
    @JaenEngineering Місяць тому

    This is really starting to come along. I feel for the poor legs on that program ROM though! I think you definitely need to treat it to a ZIF socket when you get it onto a PCB. And now that we have a PC Register and Flags Register, I'm wondering if it's worth setting up a Special Purpose Register file to complement the GPR? Could also add a Counter Register so we're not constantly hitting the ALU for simple "do this x number of times" operations.

    • @Artentus
      @Artentus Місяць тому

      From a hardware design perspective, it's much easier to just add more GPRs than design SPRs. If you want to offload counting from the ALU, since this is supposed to become a superscalar, the way to go would be to add an increment/decrement unit that can operate independently.

    • @michaczerwonka8720
      @michaczerwonka8720 Місяць тому

      @@Artentus or more ALUs

    • @fabianschuiki
      @fabianschuiki Місяць тому

      😂 Oh yeah, I definitely need a ZIF socket. The sad part is that I had a whole box of ZIF sockets sitting around for two years, but their legs don't mate with the breadboard properly. So yeah, moving to PCBs is going to be fantastic for the ROM chip's legs 😁 SPA day! I will definitely be adding a set of special purpose registers, but more along the lines of the Control and Status Registers (CSRs) in RISC-V: basically an input/output feature of the CPU to break out of the tightly-controlled and timed out-of-order and superscalar pipeline, and get a simpler interface with just address, data, and a read/write strobe. That will allow you to have non-critical registers such as cycle counters, an LCD display, UART, and more, in a way that is still tightly coupled to the CPU, but outside the tight constraints of the pipeline. A counter register like you mentioned would be pretty interesting. I remember seeing that on other CPU architectures, mainly Digital Signal Processors, as "hardware loops". They do interfere with the instruction fetching and superscalar execution in potentially annoying ways -- I'll need to think about whether something like that can be integrated. It could be a nice frontend-only piece of hardware, but it adds state to the pipeline which is very tricky to handle during context switches and interrupts.

    • @JaenEngineering
      @JaenEngineering Місяць тому

      My thinking for a counter register was to use 74HC191 presettable up/down counters. You load it with the value you want to count to, and every time you enable it, it counts down by one and then raises a flag once you get to zero, which you can check with a conditional jump instruction. The advantage is that you can have the ALU speculatively executing what would happen if the flag isn't raised.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      @JaenEngineering Yeah that sounds very neat! And in case an interrupt happens or you context-switch into the operating system, you would read the current count and store that as part of the state, and then later restore it when you resume execution of the program. That might work! 😃
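
The counter register discussed in this thread, including the save/restore idea for interrupts and context switches, could be modeled in software roughly like this. It is a behavioral sketch only: the class name `LoopCounter`, its method names, and the 8-bit width are assumptions, and it does not model the pins or timing of a real 74HC191.

```python
class LoopCounter:
    """Behavioral model of a presettable down counter with a zero flag."""

    def __init__(self, width=8):
        self.mask = (1 << width) - 1   # assumed 8-bit counter by default
        self.value = 0

    def load(self, value):
        """Preset the counter with the number of iterations to run."""
        self.value = value & self.mask

    def count_down(self):
        """Decrement once (never below zero); return the zero flag."""
        if self.value > 0:
            self.value -= 1
        return self.value == 0

    def save(self):
        """Read the current count, e.g. when an interrupt is taken."""
        return self.value

    def restore(self, value):
        """Write a previously saved count back when the program resumes."""
        self.load(value)
```

A loop body would call `count_down()` once per iteration and branch on the returned flag, while an interrupt handler would bracket its work with `save()` and `restore()`.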

  • @reinoud6377
    @reinoud6377 Місяць тому

    I wonder why you didn't use a programmable IC as you did before with the expressions; this might save a few ICs on the PCB. Heck, maybe even a small LUT.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Yeah that would have been a pretty nice idea! 🙂 You could probably fit the entire condition matching circuitry into a single 16V8 PLD. I'm trying to avoid using PLDs for everything because they could essentially absorb almost all logic in the build. Maybe you could have a maximum-PLD build that tries to use them to their greatest potential. They do also have a few downsides however, especially in terms of power consumption. I haven't really figured out a rule for myself as to when a PLD is okay, and when I'd rather use discrete chips. But for something like the circuitry in this episode, where almost all gates in the chips are actually utilized, having discrete chips instead of a PLD feels okay. But your point definitely stands: this could have been a PLD 🙂

  • @akkudakkupl
    @akkudakkupl Місяць тому

    The purpose of the relative base is to have eg. stack relative programs? Could do that in a RISCy way too I think - sum the base with the offset into another register, use that register as the absolute jump location. Doable as pseudo instruction in the assembler if you declare that some registers are for "assembler use" - so you can use them for variables in function calls or some values that you need in the nearest future in the program and don't care about being overwritten. This really makes me want to draw up some "MIPSy" 16-bit CPU ;D

    • @fabianschuiki
      @fabianschuiki Місяць тому

      The relative base is for future versions of the CPU where the instruction fetch is pipelined: the PC would keep advancing to fetch instructions from the ROM, which may take a couple of cycles to arrive at the instruction decoder and to be decoded. If it's a relative jump, the base address of the jump has to be the PC of the decoded instruction, and not what's currently in the PC, since the PC has already continued on in order to fill the pipeline. A MIPSy 16-bit CPU definitely sounds like a YouTube series worth watching 😏

  • @akkudakkupl
    @akkudakkupl Місяць тому

    You could have used 2 to 1 muxes to select normal steps/relative addressing - add either the normal offset to the current address or add the sign extended relative offset to get eg. -128/+127 step jumps. Then another set of 2 to 1 muxes would select this or absolute jump address source to put into the register.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Agreed, that would have been a nice design point as well 🙂. My thinking here was that I wanted to always have the address of the next instruction available, to have the option of storing it to a return address register. If I use only one adder and select between step/relative offset with a mux, I no longer have access to that address.

    • @akkudakkupl
      @akkudakkupl Місяць тому

      @@fabianschuiki If you want to grab the absolute value to return then you can do so from the adder - just set it so it would point to the next instruction after the jump, you grab it synchronously with doing the absolute PC load ;-) You want to go back to the next instruction after the jump anyway, right? BTW I'm really enjoying these series :-)

    • @fabianschuiki
      @fabianschuiki Місяць тому

      @akkudakkupl Thanks! 😃 It's great that you share your insights, ideas, and suggestions 🥳. Regarding the absolute value of the return address: I think you're right if you're only ever doing calls to absolute addresses. Which, to be fair, is probably a sensible limitation. I was trying to give the PC the capability to perform a relative jump while at the same time producing the instruction address after the jump as the return value. For this I need two adders: one to compute the next/return address, and one to compute the relative jump target.
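
The two-adder arrangement described in this reply can be written down as a small behavioral sketch. The function name `pc_step`, the 16-bit PC width, and the step size of 2 used in the example are assumptions for illustration and are not taken from the schematic.

```python
PC_MASK = 0xFFFF  # assumed 16-bit program counter


def pc_step(pc, step, offset, take_jump):
    """Return (new_pc, return_address) for one instruction.

    Two independent additions model the two adders:
    - adder 1 computes the address of the next instruction, which also
      serves as the return address of a call;
    - adder 2 computes the relative jump target from a signed offset.
    A final 2-to-1 selection picks which of the two goes into the PC.
    """
    next_addr = (pc + step) & PC_MASK      # adder 1: next/return address
    jump_addr = (pc + offset) & PC_MASK    # adder 2: relative jump target
    new_pc = jump_addr if take_jump else next_addr
    return new_pc, next_addr
```

With a single adder behind a mux, `next_addr` would no longer be available in the cycle where the jump is taken, which is the trade-off discussed above.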

  • @smallduck1001001
    @smallduck1001001 Місяць тому

    It's not a "hashtag" character, it's a hash or pound sign. A word that begins with that character is a hashtag, i.e. when someone says "hashtag foo" this doesn't mean "hashtag character, foo", it means "this is a hashtag: foo".

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Yes you're right! Misspoke there in the heat of the battle 🙂

    • @KushLemon
      @KushLemon 25 днів тому

      Peasant! That's an "octothorpe". 😤

    • @fabianschuiki
      @fabianschuiki 25 днів тому

      😂

  • @costa_marco
    @costa_marco Місяць тому

    Is comparing against -128 allowed in your implementation? From my understanding, SF will always be different from OF when comparing to -128, on either operand.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Hmmm, I think it should work like any other signed number 🤔 I'll have to recheck that carefully though. Thanks for the pointer!
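
One way to settle the -128 question is to model the comparison in software and check every operand pair. The sketch below assumes that `cmp a, b` computes the 8-bit subtraction a - b and that "signed less than" is derived as SF != OF, which is the usual convention; whether the hardware in the video behaves identically is not confirmed here. The function names are hypothetical.

```python
def cmp_flags(a, b):
    """Model `cmp a, b` as the 8-bit subtraction a - b; return (ZF, SF, OF).

    a and b are signed values in the range -128..127.
    """
    exact = a - b                      # mathematically exact difference
    result = exact & 0xFF              # the 8 bits an ALU would keep
    zf = result == 0                   # zero flag
    sf = bool(result & 0x80)           # sign flag: bit 7 of the result
    of = exact < -128 or exact > 127   # overflow: exact value does not fit
    return zf, sf, of


def signed_less_than(a, b):
    """The usual 'signed less than' condition: SF differs from OF."""
    _, sf, of = cmp_flags(a, b)
    return sf != of


# Exhaustive check over all 65536 operand pairs, including -128.
mismatches = [
    (a, b)
    for a in range(-128, 128)
    for b in range(-128, 128)
    if signed_less_than(a, b) != (a < b)
]
```

In this model `mismatches` comes out empty. For example, `cmp 0, -128` sets both SF and OF, so the two flags agree and the comparison correctly reports "not less than".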

  • @akkudakkupl
    @akkudakkupl Місяць тому

    Very nice! What you might want next is labels (for neater jumps) and macros (so you don't have to repeat yourself all the time). But maybe a slightly different approach to the assembler would make it easier to expand. As it is now, you have to declare new operands, arguments, etc. And after that you need to write how to parse those things, how to print them, and how to encode them. Maybe instead of matching a string like "if self.consume_identifier("ldi"): do_stuff" you could take the string as an argument and use a dictionary? E.g. "mv": [symbol, argument_type, ...]. Data in this dictionary could then be interpreted by functions (to parse, print, encode) to automagically do the things that you have to add manually. This way, instead of updating your code in several places when adding new opcodes, you would just expand the LUT and the rest of the code would stay as it was. Maybe it would even parse larger files faster in the future because it might end up with fewer if statements to check? I'm not a programmer (just a PLC wrangler), not a Python guru, and certainly not a parsing expert, so take it with a grain of salt, but I believe this might be a good improvement.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      That's a great suggestion! The parsing, printing, and encoding is highly repetitive, and there are only 3 or 4 distinct instruction formats. I'll definitely move to a table-based approach in the future 🙂
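
A table-based parser along the lines suggested here might look like the sketch below. The table contents, the operand kinds `"reg"` and `"imm"`, and the function names are hypothetical; the real assembler's classes and instruction formats may differ.

```python
# Mnemonic -> tuple of operand kinds. Adding an instruction means adding
# one entry here; the parsing code below stays unchanged.
# The entries are illustrative assumptions, not the project's real table.
INSTRUCTIONS = {
    "nop": (),
    "mv": ("reg", "reg"),
    "ldi": ("reg", "imm"),
}


def parse_operand(kind, text):
    """Parse a single operand according to its declared kind."""
    if kind == "reg":
        if not (text.startswith("r") and text[1:].isdigit()):
            raise ValueError(f"expected register, got {text!r}")
        return int(text[1:])
    if kind == "imm":
        return int(text, 0)  # base 0 accepts decimal, 0x.., and 0b..
    raise ValueError(f"unknown operand kind {kind!r}")


def parse_line(line):
    """Split a line into mnemonic and operands and validate it via the table."""
    mnemonic, _, rest = line.strip().partition(" ")
    if mnemonic not in INSTRUCTIONS:
        raise ValueError(f"unknown instruction {mnemonic!r}")
    kinds = INSTRUCTIONS[mnemonic]
    texts = [t.strip() for t in rest.split(",")] if rest.strip() else []
    if len(texts) != len(kinds):
        raise ValueError(f"{mnemonic} expects {len(kinds)} operand(s)")
    operands = tuple(parse_operand(k, t) for k, t in zip(kinds, texts))
    return mnemonic, operands
```

The same table could drive printing and encoding as well, so each new opcode would touch one place instead of three.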

  • @akkudakkupl
    @akkudakkupl Місяць тому

    Those floating unused inputs are bugging me ;D

    • @fabianschuiki
      @fabianschuiki Місяць тому

      😃 Yeah they definitely need to be tied off.

    • @akkudakkupl
      @akkudakkupl Місяць тому

      @@fabianschuiki I left a comment on your assembler video (last part), IDK if you get notifications on old videos. Very nice watch after the first two, I must say. Skimmed the second two a bit because I had an idea that might make your life easier and just had to comment. I'm certainly going to follow this like the James Sharman CPU series :-)

    • @fabianschuiki
      @fabianschuiki Місяць тому

      @akkudakkupl Thanks 🙂! Labels and a table-based approach are going to make the assembler a lot nicer to work with and extend 🥳

  • @naikrovek
    @naikrovek Місяць тому

    register your copy of sublime text lol

    • @fabianschuiki
      @fabianschuiki Місяць тому

      😃 Yeah I should. I think I have a license for it lying around somewhere.

  • @OscarSommerbo
    @OscarSommerbo Місяць тому

    This video answered a question I always had about the various conditional jump instructions: why are there so many, even in simple systems? Because you have to have a few, and those few can trivially be combined to make the more specialized conditional tests. Great video to learn from.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Thanks! 🙂 I'm learning a lot about x86 and its humble beginnings by going through these circuits. I did most of my work on more modern RISC-style architectures, where flags are mostly absent. It's great to work with them for a change! Although I can totally see why they will become very annoying once you're trying to do out-of-order execution 😬

    • @janhofmann3499
      @janhofmann3499 Місяць тому

      @@fabianschuiki I thought that the flags register in OoO CPUs gets renamed like any other architectural register. Microbenchmarks, e.g. on the Firestorm cores in Apple's A14/M1, suggested that it has a flags register file of 128 entries. On the other hand, it's astonishing that all the needed logic can be implemented with so few components.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      @janhofmann3499 Yeah I think when you move to OoO execution, you promote the flags register to just yet another register. And all your ALU instructions have an implicit flags register operand that is read and/or written. That makes it almost just a compression scheme in the instruction set which makes certain register operands implicit. It's kind of fun to think about, but I also get why more modern ISAs like RISC-V skip flags entirely 🙂

  • @DavidLatham-productiondave
    @DavidLatham-productiondave Місяць тому

    I was trying to figure out why you didn't use the inverting output of the multiplexer. But then I realized you needed a way to select the inverted output or not. Which would have required another multiplexer. So your design choices make sense to me now. I know this comment is pointless, but who knows. Maybe someone else will be wondering the same thing.

    • @fabianschuiki
      @fabianschuiki Місяць тому

      Yeah I felt bad about not being able to use that output. It's already there! 🙂 But the XOR gate was already there, so I didn't even have to add another chip. 😃
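
The mux-plus-XOR idea from this thread can be sketched in software as follows. The set of base conditions and their select numbering are assumptions chosen for illustration; the actual encoding used in the CPU is not given in these comments.

```python
def condition_met(zf, cf, sf, of, select, invert):
    """Pick one base condition (the multiplexer), then optionally flip it (the XOR).

    The base conditions and their numbering are hypothetical.
    """
    base = (
        zf,                 # 0: equal
        cf,                 # 1: carry
        sf != of,           # 2: signed less than
        (sf != of) or zf,   # 3: signed less than or equal
    )
    # For booleans, `!=` is exactly XOR: invert=True flips the chosen value.
    return base[select] != invert
```

Four base conditions and one invert bit give eight tests in total, for example "not equal" (select 0, inverted) or "signed greater than" (select 3, inverted), which is how a few flags turn into many conditional jumps.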