How Condition Codes Work - Superscalar 8-Bit CPU #37

Fabian Schuiki

Published 15 Jun 2024
The ALU of my homebrew CPU has four arithmetic flags. In this video, we are going to figure out how to make sense of these flags by defining a list of interesting condition codes. We then extend the ALU with three additional chips that evaluate the condition codes. They determine which flags are set, whether the operands of the previous comparison were equal, and whether one operand was less than, less than or equal to, greater than, or greater than or equal to the other, in both signed and unsigned arithmetic. This condition-checking hardware paves the way for powerful future instructions that execute only under certain conditions, such as conditional moves and conditional jumps.
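As a rough sketch of the idea (not the circuit from the video), the following Python model derives the usual comparison conditions from four flags after an 8-bit subtraction. The flag names `Z`, `C`, `S`, `O`, the x86-style "carry means borrow" convention, and the condition names are assumptions made for illustration; the actual CPU may use different names and a different carry polarity.

```python
def sub_flags(a: int, b: int) -> dict:
    """Flags for the 8-bit comparison a - b (a and b are bit patterns 0..255)."""
    diff = (a - b) & 0xFF
    return {
        "Z": diff == 0,                          # zero: operands were equal
        "C": a < b,                              # carry as borrow (assumed x86 convention)
        "S": bool(diff & 0x80),                  # sign: bit 7 of the result
        "O": bool((a ^ b) & (a ^ diff) & 0x80),  # signed overflow of the subtraction
    }


def conditions(f: dict) -> dict:
    """Map the four flags to equality plus unsigned and signed orderings."""
    lt_signed = f["S"] != f["O"]
    return {
        "eq": f["Z"],
        "ne": not f["Z"],
        "ult": f["C"],
        "ule": f["C"] or f["Z"],
        "ugt": not (f["C"] or f["Z"]),
        "uge": not f["C"],
        "slt": lt_signed,
        "sle": lt_signed or f["Z"],
        "sgt": not (lt_signed or f["Z"]),
        "sge": not lt_signed,
    }


# 0xFF is 255 unsigned but -1 signed, so the two orderings disagree.
c = conditions(sub_flags(0xFF, 0x01))
print(c["ult"], c["slt"])  # False True
```

Note that every "greater" condition is just the inverse of a "less" condition, which is why a small amount of logic can cover the whole list.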
This video series explores the concepts and techniques that make modern computer processors so incredibly fast and powerful. I build my very own 8-bit processor from individual logic gates and gradually evolve it to become a superscalar out-of-order machine. Along the way, we take a deep dive into contemporary computer architecture in a hands-on fashion and rediscover some of the foundations of modern computing.
Previous Video: • Poor Man's Conditional...
Series Playlist: • Build a Superscalar CPU
ALU Playlist: • Homebrew Arithmetic Lo...
GitHub Repository: github.com/fabianschuiki/supe...
- Jump Instructions on x86: www.unixwiz.net/techtips/x86-j...
Chips:
- 74HC151 (MUX8): www.ti.com/lit/ds/symlink/cd7...
- 74HC32 (OR2): www.ti.com/lit/ds/symlink/cd7...
- 74HC86 (XOR2): www.ti.com/lit/ds/symlink/cd7...
00:00 Intro
00:41 Condition Codes
12:29 Condition Checking in Hardware
17:16 Breadboard Build
24:16 Updating the Decoder
28:20 Testing
33:31 Outro
#alu #homebrew #8bit #breadboard #superscalar #computer
Science & Technology

COMMENTS • 28

@DavidLatham-productiondave 1 month ago ⁺³
I was trying to figure out why you didn't use the inverting output of the multiplexer. But then I realized you needed a way to select the inverted output or not. Which would have required another multiplexer. So your design choices make sense to me now. I know this comment is pointless, but who knows. Maybe someone else will be wondering the same thing.
@fabianschuiki 1 month ago ⁺¹
Yeah I felt bad about not being able to use that output. It's already there! 🙂 But the XOR gate was already there, so I didn't even have to add another chip. 😃
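The mux-plus-XOR arrangement discussed in this thread could be modeled roughly as follows. This is only a sketch: the 4-bit condition code layout (bit 3 inverts, bits 2..0 select) and the order of the multiplexer inputs are hypothetical, not the assignment used in the video.

```python
def check_condition(code: int, z: bool, c: bool, s: bool, o: bool) -> bool:
    """Evaluate a hypothetical 4-bit condition code against the four flags."""
    lt = s ^ o  # XOR gate: signed less-than
    mux_inputs = [
        True,     # 0: always
        z,        # 1: equal
        c,        # 2: unsigned less-than (carry as borrow, assumed)
        c or z,   # 3: unsigned less-or-equal (OR gate)
        lt,       # 4: signed less-than
        lt or z,  # 5: signed less-or-equal (OR gate)
        s,        # 6: sign flag
        o,        # 7: overflow flag
    ]
    selected = mux_inputs[code & 0b111]  # 8-to-1 multiplexer
    invert = bool(code & 0b1000)         # top bit requests the opposite condition
    return selected ^ invert             # XOR gate acting as optional inverter


# Code 1 tests "equal"; code 9 is the same mux input, inverted ("not equal").
print(check_condition(0b0001, z=True, c=False, s=False, o=False))  # True
print(check_condition(0b1001, z=True, c=False, s=False, o=False))  # False
```

Because the invert bit is part of the condition code, the multiplexer's fixed inverting output cannot replace the XOR: the inversion has to be selectable per instruction.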
@OscarSommerbo 29 days ago ⁺¹
This video answered a question I always had about the various conditional jump instructions: why are there so many, even in simple systems? Because you have to have a few, and those few can trivially be combined to make the more specialized conditional tests. Great video to learn from.
@fabianschuiki 29 days ago ⁺¹
Thanks! 🙂 I'm learning a lot about x86 and its humble beginnings by going through these circuits. I did most of my work on more modern RISC-style architectures, where flags are mostly absent. It's great to work with them for a change! Although I can totally see why they will become very annoying once you're trying to do out-of-order execution 😬
@janhofmann3499 29 days ago ⁺¹
@fabianschuiki I thought that the flags register in OoO CPUs gets renamed like any other architectural register. Microbenchmarks, e.g. on the Firestorm cores in Apple's A14/M1, suggested that it has a flags register file of 128 entries. On the other hand, it's astonishing that all the needed logic can be implemented with so few components.
@fabianschuiki 29 days ago
@janhofmann3499 Yeah I think when you move to OoO execution, you promote the flags register to just yet another register. And all your ALU instructions have an implicit flags register operand that is read and/or written. That makes it almost just a compression scheme in the instruction set which makes certain register operands implicit. It's kind of fun to think about, but I also get why more modern ISAs like RISC-V skip flags entirely 🙂
@Artentus 1 month ago ⁺³
Conditional move instructions are under-appreciated but I love them. Whenever you can express a branch in terms of conditional moves it avoids so much pipeline stalling.
One thing to note here is that technically you don't need a separate move instruction anymore because it's equivalent to a conditional move with condition set to always.
Although that might be slightly sub-optimal if you ever get to superscalar execution because it uses the ALU while the normal move does not. That could also be solved in the decoder tho.
@fabianschuiki 1 month ago ⁺³
Yeah I couldn't agree more! If your CPU already goes through the trouble of having flags, the conditional moves are often almost free. And as you said, they don't mess with the instruction fetch pipeline at all 🥳.
And you're right, the always condition would allow you to use a conditional move as a regular move. My current idea for out-of-order execution is to make the registers store IDs when an instruction is still in flight. With a dedicated move as I have it now, you can just copy that ID into another register, and then have both registers store the result when it becomes available. If the move ran through the ALU, you'd occupy a reservation as you suggested.
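The register-ID idea described here could look roughly like the sketch below. Everything in it is hypothetical (the `RegisterFile` class, the method names, and the integer tags); it only illustrates how a dedicated move can copy a pending ID instead of occupying the ALU.

```python
class RegisterFile:
    """Registers hold either a value or the ID (tag) of an in-flight instruction."""

    def __init__(self, count: int) -> None:
        self.value = [0] * count
        self.tag = [None] * count  # None means the stored value is valid

    def issue(self, dst: int, tag: int) -> None:
        """An instruction with this ID will eventually write register dst."""
        self.tag[dst] = tag

    def move(self, dst: int, src: int) -> None:
        """Dedicated move: copy the value and any pending ID, no ALU needed."""
        self.value[dst] = self.value[src]
        self.tag[dst] = self.tag[src]

    def complete(self, tag: int, result: int) -> None:
        """Broadcast a finished result to every register waiting on this ID."""
        for reg, pending in enumerate(self.tag):
            if pending == tag:
                self.value[reg] = result
                self.tag[reg] = None


rf = RegisterFile(4)
rf.issue(0, tag=7)   # r0 will be written by in-flight instruction 7
rf.move(1, 0)        # r1 now waits on the same instruction
rf.complete(7, 42)   # both registers receive the result
print(rf.value[:2])  # [42, 42]
```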
@alexloktionoff6833 24 days ago ⁺¹
@fabianschuiki But having ALU flags makes all commands interdependent, so out-of-order execution becomes very complicated. Why have a flags register in your superscalar design at all? You don't have a legacy to be backward compatible with. Why not go the Alpha, MIPS and RISC-V way of using register values themselves for conditionals, and just reorder commands based on the registers they use?
@fabianschuiki 24 days ago
@alexloktionoff6833 Yes, you're definitely right! All instructions that interact with the flags become dependent in one form or another. And I agree, other ISAs like RISC-V have a more elegant approach that feels cleaner and more modern. One thing to keep in mind though is that these generally are wider CPUs with wider instructions, 32 bit for RISC-V for example. This means that you only need a single register operand to provide a jump offset, or you have plenty of bits in the instruction for immediate jump offsets. That allows you to have conditional branches like `blt r0, r1, label`, which compares r0 and r1 and, if r0 is less than r1, does a relative jump to the label. For narrow instructions like my 16 bits, it's difficult to encode two register operands plus a reasonably large jump offset in a 16 bit instruction. This is also why flags tend to remain popular on CPUs with narrow datapaths, e.g. 8 bits like here, or narrow instructions, like 8 or 16 bits. As soon as you clear the 32 bit hurdle, you're in territory where addresses start to fit into single registers, and you have enough instruction bits to encode enough data to no longer need the flags crutch.
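The bit-budget argument can be made concrete with a little arithmetic. In the sketch below, the RISC-V numbers follow its 32-bit branch format (7-bit opcode, 3-bit funct3, two 5-bit register fields). The two 16-bit layouts are purely hypothetical and are not the encoding used by this CPU.

```python
def offset_bits(instr_bits: int, opcode_bits: int, reg_operands: int, reg_bits: int) -> int:
    """Bits left for a jump offset after opcode and register fields."""
    return instr_bits - opcode_bits - reg_operands * reg_bits


def signed_range(bits: int) -> tuple:
    """Range of a two's-complement field with the given width."""
    return (-(1 << (bits - 1)), (1 << (bits - 1)) - 1)


# RISC-V branch: 7-bit opcode + 3-bit funct3, two 5-bit register operands.
print(offset_bits(32, 10, 2, 5))  # 12

# Hypothetical 16-bit compare-and-branch: 4-bit opcode, two 3-bit registers.
print(signed_range(offset_bits(16, 4, 2, 3)))  # (-32, 31)

# Hypothetical 16-bit flag-based jump: 4-bit opcode + 4-bit condition code.
print(signed_range(offset_bits(16, 8, 0, 0)))  # (-128, 127)
```

Under these assumed field widths, moving the comparison into a separate instruction frees enough bits to roughly quadruple the reach of a relative jump.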
@alexloktionoff6833 24 days ago ⁺¹
@fabianschuiki But there are ways to avoid storing an offset in jump instructions: use a conditional SKIP instruction or a conditional RET instruction with register numbers, plus a simple dumb jump with only a constant offset.
@fabianschuiki 23 days ago ⁺¹
@alexloktionoff6833 Yes, you're totally right, there are definitely ways to avoid most of the mess! A flag-less 8 bit CPU would definitely be worth exploring 🙂
@reinoud6377 25 days ago ⁺¹
I wonder why you didn't use a programmable IC with expressions, as you did before. That might save a few ICs on the PCB, heck, maybe even a small LUT.
@fabianschuiki 24 days ago ⁺¹
Yeah that would have been a pretty nice idea! 🙂 You could probably fit the entire condition matching circuitry into a single 16V8 PLD. I'm trying to avoid using PLDs for everything because they could essentially absorb almost all logic in the build. Maybe you could have a maximum-PLD build that tries to use them to their greatest potential. They do also have a few downsides however, especially in terms of power consumption. I haven't really figured out a rule for myself as to when a PLD is okay, and when I'd rather use discrete chips. But for something like the circuitry in this episode, where almost all gates in the chips are actually utilized, having discrete chips instead of a PLD feels okay. But your point definitely stands: this could have been a PLD 🙂
@costa_marco 28 days ago ⁺²
Is comparing against -128 allowed in your implementation? From my understanding, SF will always be different from OF when comparing to -128, on either operand.
@fabianschuiki 28 days ago
Hmmm, I think it should work like any other signed number 🤔 I'll have to recheck that carefully though. Thanks for the pointer!
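One way to recheck this is a brute-force model over all 8-bit operand pairs. The sketch below assumes the conventional definitions of the sign and overflow flags for a subtraction `a - b`; the helper names are made up for illustration, and the real hardware would still need to be tested separately.

```python
def to_signed(x: int) -> int:
    """Interpret an 8-bit pattern as a two's-complement number."""
    return x - 256 if x & 0x80 else x


def sf_of(a: int, b: int) -> tuple:
    """Sign and overflow flags for the 8-bit subtraction a - b."""
    diff = (a - b) & 0xFF
    sf = bool(diff & 0x80)
    of = bool((a ^ b) & (a ^ diff) & 0x80)
    return sf, of


# Pairs where "SF != OF" disagrees with the true signed comparison.
mismatches = [
    (a, b)
    for a in range(256)
    for b in range(256)
    if (sf_of(a, b)[0] != sf_of(a, b)[1]) != (to_signed(a) < to_signed(b))
]
print(len(mismatches))  # 0

# With -128 (0x80) as the second operand, SF equals OF for every first operand,
# which matches the fact that nothing is less than -128.
print(all(sf_of(a, 0x80)[0] == sf_of(a, 0x80)[1] for a in range(256)))  # True
```

In this model, -128 behaves like any other signed operand on either side of the comparison.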
@akkudakkupl 29 days ago ⁺¹
Those floating unused inputs are bugging me ;D
@fabianschuiki 29 days ago
😃 Yeah they definitely need to be tied off.
@akkudakkupl 29 days ago ⁺¹
@fabianschuiki I left a comment on your assembler video (last part), IDK if you get notifications on old videos. Very nice watch after the first two, I must say. Skimmed the second two a bit because I had an idea that might make your life easier and just had to comment.
I'm certainly going to follow this like the James Sharman CPU series :-)
@fabianschuiki 28 days ago ⁺¹
@akkudakkupl Thanks 🙂! Labels and a table-based approach are going to make the assembler a lot nicer to work with and extend 🥳
@naikrovek 29 days ago ⁺¹
register your copy of sublime text lol
@fabianschuiki 29 days ago
😃 Yeah I should. I think I have a license for it lying around somewhere.
