The Madness of Z80 I/O

Noel's Retro Lab


Published 28 May 2024
This episode is sponsored by PCBWay (www.pcbway.com). Do you know what exactly OUT (C),A does? Are you sure? Prepare for a deep dive into the madness that is Z80 I/O and some sneaky ways to get around it.
Support Noel's Retro Lab on Patreon: / noelsretrolab
You can also support Noel's Retro Lab on YouTube by joining this channel:
/ @noelsretrolab
🛠 Tools used in this video (affiliate links):
Signal generator amzn.to/3O2DMMB
Oscilloscope amzn.to/48OnYoB
Power supply amzn.to/3RYPrNA
Other tools I use noelsretrolab.com/tools.html
Chapters:
00:00 Intro
00:26 Memory-mapped IO
01:38 Dedicated IO lines
02:53 OUT
05:28 Measuring
08:33 What about B?
13:08 Amstrad CPC
17:18 Alternative to OUT
Connect with Noel's Retro Lab:
Discord ➤ / discord
Facebook ➤ / noelsretrolab
Twitter ➤ / noelsretrolab
Instagram ➤ / noelsretrolab
Mailing list ➤ noelsretrolab.com
Science & Technology

COMMENTS • 449

@bread8070 4 months ago ⁺¹²⁵
To understand why the Z80 has separate memory and I/O spaces you have to look at its history. The Z80 is an enhanced 8080. The 8080 was an enhanced 8008. The 8008 was a single chip version of the Datapoint 2200. The Datapoint 2200 had a processor made from over 100 TTL logic chips. It used (in the original version) serial memory and shift registers, and had a 1-bit bus and 1-bit ALU.
Accessing I/O was very different to accessing the memory so it needed separate instructions for doing that. The 8008, 8080, Z80, and even 8086 and Pentium have just inherited that separate I/O space for backwards compatibility.
BTW accessing memory on the Datapoint was very similar to accessing registers. That’s why the instruction set bundles memory reads and writes into the same instructions as register reads and writes - as in LD r,(HL) and LD (HL),r in Z80 mnemonics. It’s fascinating that these quirks are still present in Pentiums 50 years later.
On to Amstrads: I hate being pedantic, but the gate array doesn’t control RAM banking (that’s a personal soap box), but it does control ROM enables, hence why it needs access to A15 and A14.
Also, the gate array port address and current settings of register 2 are permanently cached in the BC’ registers. (Register 2 controls video mode and ROM enables). Thus when calling into a ROM, or otherwise changing ROM enables, it just needs to swap to the alt registers, modify C’ and output the new value. This saves it having to read and write such values to/from memory and makes such operations much faster.
But that’s enough waffle. Thank you for the always excellent videos.
@OscarSommerbo 4 months ago ⁺¹⁰
The distinction of banking and ROM enable is a crucial one, I think your soapbox is entirely justified.
@fr_schmidlin 4 months ago ⁺¹²
Bonus advantage: The separate I/O space also simplified memory cache implementation years later.
@gcewing 4 months ago ⁺⁶
It was quite common in early computer architectures for memory access and I/O to be done very differently. The DEC PDP-8 and Data General Eclipse come to mind. As pointed out, it does have the advantage that you can use all of the address space for memory, which was an important consideration in those days when address spaces were very small by today's standards.
@gcewing 4 months ago ⁺¹¹
Using "LD" as the mnemonic for almost all instructions that move data around is a feature of the Z80 assembly language only. The standard 8080 assembly language wasn't like that -- it used "MOV" for movements between registers, and various mnemonics starting with "LD" and "ST" for memory access. It was quite quirky in various ways, e.g. it confusingly referred to the 16-bit BC, DE and HL registers as just "B", "D", and "H". The Z80 assembly language did an excellent job of cleaning all that up.
I guess they didn't unify the mnemonics for IN and OUT because English doesn't really have a single word that works for both directions.
@Curt_Sampson 4 months ago ⁺⁸
The separate I/O address space isn't just for backwards compatibility; it's also convenient in that you can generally use less decoding hardware than you need with memory mapped I/O systems.
@VikOlliver 4 months ago ⁺⁵²
Amstrad dev here. It wasn't just a cost issue driving the weird I/O mapping. We also had to consider the number of logic gates available in commercial gate arrays. Side note: The prototype hardware used discrete logic to produce a Gate Array Simulator with the same pinout as the final chip, called the "GAS Board." This also held the EEPROMs that eventually became the ROM. Later we would hack off the EEPROM sections to use for ROM development and called these bits "Small GAS Board" which evolved into us calling them Smorgasbords. You're welcome.
@ollerich32 3 months ago ⁺¹
Awesome insider view, thanks a lot!
@MarceloSilva-lh9mh 4 months ago ⁺⁹⁸
Hey, big Z80 fan here, since ZX-80 times (TK-83 in Brazil). I wrote a good amount of assembly code back in the day, just for fun. Thank you for addressing a question that has boggled my mind for 40 years. The reason for the OTIR (rather than OUTIR) mnemonic is that the instruction mnemonics in ZX-80 Assembly language had to be kept under 4 characters, for display formatting reasons, when the code had to be shown as mnemonics (Disassembly programs).
@fr_schmidlin 4 months ago ⁺⁶
Exactly. Memory had a very high cost back then, so they had to limit the mnemonics to 4 characters.
@stephenamor7762 3 months ago ⁺¹
And, that is exactly why I'm useless at "meaningful" variable names!
@EssArrB 4 months ago ⁺⁶⁹
Coming soon, Noel has a total meltdown over Z80 Interrupt Mode 2 vectored interrupts with Zilog's peripheral chips (CTC,DMA,PIO,SIO) !
@ncot_tech 4 months ago ⁺⁷
Here be dragons! Especially if you try that on a ZX Spectrum.
@lasersimonjohnson 4 months ago ⁺⁸
Mode 2 interrupts worked fantastically. At the time though, I barely understood them 😂
@TheEulerID 4 months ago ⁺³
I used all those apart from DMA in writing a race track control and timing system complete with pre-emptive multi tasking in what was an embedded system. I was used to writing OS code for IBM mainframes, so the approach of separating memory and I/O address spaces came naturally, albeit the programmability of Z80 peripheral chips was very limited compared to writing channel control programs for the mainframe.
It was, of course, all mode 2 interrupts. However, the engineering of that sort of system is very different to mass market domestic computers.
@tvalenca 4 months ago
@EssArrB that was what I was expecting when I saw the thumbnail/title.
@ncot_tech Especially on any platform that isn't CPC.
@smf3472 4 months ago ⁺²
@gppsoftware The interrupt table couldn't be in contended RAM and had to be 257 bytes long. In IM 2 the value on the bus is a byte offset into a word table, and because that value was essentially random (no hardware in the Spectrum put a byte on the bus during an interrupt acknowledge cycle), you had to make sure the CPU could fetch a valid word for every possible byte. So, for example, you would put your code at 0xd1d1, then have 257 bytes of 0xd1 in RAM at 0xd000 and set I to 0xd0.
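The 257-byte table trick described in the comment above can be checked with a small model. This is a minimal sketch in Python rather than Z80 assembly, assuming the layout given in that comment (I = 0xD0, table at 0xD000 filled with 0xD1, handler at 0xD1D1); the names `im2_vector`, `memory` and `handlers` are illustrative only.

```python
# Model of the IM 2 vector fetch: the CPU forms a pointer from I (high byte)
# and the byte found on the data bus (low byte), then reads a 16-bit
# little-endian handler address from that pointer.

I_REG = 0xD0                     # value loaded into the I register
TABLE_BASE = I_REG << 8          # 0xD000
FILL = 0xD1                      # every table byte is 0xD1

memory = bytearray(0x10000)
for offset in range(257):        # 257 bytes: bus value 0xFF reads 0xD0FF and 0xD100
    memory[TABLE_BASE + offset] = FILL


def im2_vector(bus_byte: int) -> int:
    """Return the handler address fetched for a given byte on the data bus."""
    pointer = (I_REG << 8) | (bus_byte & 0xFF)
    low = memory[pointer]
    high = memory[(pointer + 1) & 0xFFFF]
    return (high << 8) | low


# Whatever byte happens to be on the bus, odd or even, the vector is the same.
handlers = {im2_vector(bus_byte) for bus_byte in range(256)}
```

With only 256 table bytes, a bus value of 0xFF would read its high byte from outside the table, which is why the extra 257th byte is needed.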
@static-san 4 months ago ⁺²
I remember learning years ago that OUT (C), A was really OUT (BC), A. But I also heard that Zilog didn't really intend to do that; as someone mentioned, masking out B would've been more complexity they couldn't initially fit on the chip. So the first Zilog manuals didn't mention it!
How the Amstrad took advantage of this reminds me of how other home computers had partial decoding, too. TI did that in their 99/4a. In the 4a, some hardware was memory mapped and some was in the 9900's version of I/O ports. But neither were completely decoded, so all the hardware was accessible at multiple addresses.
@johanderek3383 4 months ago ⁺¹⁶
I always assumed that the reason why BC is output to the address lines instead of just C is because of how the register pairs are internally wired to the address bus. You would need additional logic for the IN/OUT instructions to mask B instead of just reusing whatever is being used for HL, IX, and IY. When designing a system it would have been better to just stick to the chip designer's intent of having just 256 ports addressable by an 8-bit register.
PS. Personally I think the IN, OUT, and LD mnemonics were well chosen. Especially the OUT instruction makes it quite clear that you are now messing with the state of an external device out there. I always thought the Intel MOV mnemonic is rubbish because you're copying, not moving. But it's all just mnemonics anyway; one could potentially fork one's favourite assembler on GitHub and come up with better mnemonics.
@smf3472 4 months ago ⁺¹
I agree, it's the 8080 OUT (d3)/IN (db) instructions that made no sense when Zilog implemented them on the Z80. On the 8080 the instruction took an 8 bit port and used it for both bits 0-7 and bits 8-15. Zilog put the A register on bits 8-15, but the instructions use A for the data bus as well.
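The two behaviours discussed in this thread can be written down as a tiny model of what appears on the address and data buses. This is a sketch in Python, not an emulator; the function names `out_c_a` and `out_n_a` are made up for illustration.

```python
def out_c_a(a: int, b: int, c: int) -> tuple[int, int]:
    """OUT (C),A: the whole BC pair drives A0-A15, A drives the data bus."""
    address = ((b & 0xFF) << 8) | (c & 0xFF)
    return address, a & 0xFF


def out_n_a(a: int, n: int) -> tuple[int, int]:
    """OUT (n),A: n drives A0-A7, A drives A8-A15 and also the data bus."""
    address = ((a & 0xFF) << 8) | (n & 0xFF)
    return address, a & 0xFF


# With OUT (C),A the high address byte is independent of the data byte.
bus_bc = out_c_a(a=0x12, b=0x7F, c=0x00)

# With OUT (n),A the high address byte always equals the data byte.
bus_n = out_n_a(a=0x12, n=0xFE)
```

The second case is the one the comment calls useless for OUT: hardware that decodes A8-A15 can only ever be sent the value that selects it.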
@markevans2294 4 months ago
@smf3472 I suspect that the behavior of A8-A15 with these two opcodes (D3 & DB) is an unintended side effect, which happens to be different between the I8080/8085 and the Z80.
Zilog, on the other hand, intended the likes of IN A,(C)/OUT (C),A to be 16-bit I/O instructions, but for some reason didn't document that the port value is BC rather than just C. They also left ED70 & ED71 undocumented, since neither IN (HL),(C) nor OUT (C),(HL) makes any sense.
IIRC the INI, OUTI, etc instructions also use the BC register pair to specify the port. But also use the B register as a counter. Effectively making these 8 bit IO instructions.
Things would have been easier if Zilog had used DE instead of BC for 16 bit IO. Possibly that was too difficult since the instruction EX DE,HL (EB) renames/remaps the registers. With there being an additional level of remapping for the index registers using the DD & FD prefixes.
@Keldor314 4 months ago
@markevans2294 B is a special purpose register in that it's the only one that can be used as a counter in autodecrement instructions, such as OUTI, INI, DJNZ, LDI, and so forth. So BC was the only choice that would give them the option of DMA style burst transfers with OTIR and INIR.
OUT (n),A is also an interesting beast.
@undercoveragent9889 3 months ago
Well, I always thought of the register pairs as 'HL' being the 'source' of data, DE being the 'destination' of the data and BC being a 'Binary Counter' that acted like a For/Next counter.
I loved the Spectrum. _And_ the Z-80.
@seankayll9017 4 months ago ⁺¹²
I loved LDIR and used it a lot when Z80 asm programming back in the early 80s.
@fr_schmidlin 4 months ago ⁺⁴
Yes! Incredible how many people forgot that LDIR/LDDR/OTIR/OTDR were a cheap alternative to DMA memory transfers, and advertised as such. DMA was horribly expensive back then.
@shaunhw 6 days ago
LDIR was slower per byte transferred than using long strings of LDI instructions, for example using 32 LDIs to copy a line of screen pixels on a Sinclair ZX Spectrum.
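The difference is easy to put numbers on. A rough sketch in Python, using the documented T-state counts (LDI: 16; LDIR: 21 per repeated byte, 16 for the final byte) and ignoring memory contention, which on a real Spectrum would add to both figures; the function names are illustrative.

```python
T_LDI = 16           # one LDI instruction
T_LDIR_REPEAT = 21   # LDIR iteration while BC != 0 afterwards
T_LDIR_LAST = 16     # final LDIR iteration, when BC reaches 0


def ldir_tstates(byte_count: int) -> int:
    """T-states for one LDIR copying byte_count bytes (byte_count >= 1)."""
    return (byte_count - 1) * T_LDIR_REPEAT + T_LDIR_LAST


def unrolled_ldi_tstates(byte_count: int) -> int:
    """T-states for byte_count LDI instructions written out back to back."""
    return byte_count * T_LDI


line_bytes = 32  # one 256-pixel screen line
ldir_cost = ldir_tstates(line_bytes)
ldi_cost = unrolled_ldi_tstates(line_bytes)
```

The unrolled version trades code size for speed: 32 LDIs occupy 64 bytes, against 2 bytes for a single LDIR.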
@rastersoft 4 months ago ⁺²²
Just two details: the Sinclair Spectrum also uses the upper 8 bits of the address bus for addressing the keyboard's semirows. It also uses the same trick of using one bit for each device: A0 for the ULA, A1 for the ZX Printer, A2 for the memory pagination/AY-3-8912 in the 128K, and A3-A4 for the Interface 1/Microdrives, which also leaves only three bits for other devices (like A5 for the Kempston joystick).
@fr_schmidlin 4 months ago
So that's where the insane idea came from! 😅
Thanks for the explanation.
@Aeroman66 4 months ago ⁺¹
Thank God I'm not the only one who noticed this.
@herrbonk3635 4 months ago ⁺¹
Same with the ZX81 and ZX80, regarding keyboard scanning.
@herrbonk3635 4 months ago ⁺²
@fr_schmidlin How is it "insane"? You cannot have seen many hardware designs 😉 Most cheap computers used unorthodox methods and/or (more or less) undocumented aspects of processors or other components to save cost.
@TheEulerID 4 months ago ⁺³
There is nothing mad about Z80 I/O that I can think of. That is from experience, having used the architecture for a fully interrupt driven embedded application complete with what was a pre-emptive multi-tasking operating system. That included programming CTC, PIO and SIO as well as a couple of backplane-connected vdu cards. It was all mode-2 interrupts, and the nature of the application was that it did not need DMA as it was low data rate, but a requirement for very fast interrupt handling as it was used for timing on a race track.
It drove two printers, a giant 7 segment display, a single operator console, three timing beams, three car identification systems, start lights, jumped start and car redirection lights. A bit of an oddity, as the original commercial model failed, but the circuit remains, less the (mid 80s) control system, now being a rather large kart track in Milton Keynes, England.
As the code all had to fit within a 32k ROM, it was compact and was in no way a general purpose OS, and all tasks (7 of them, including an "idle" task) had to be assembled and burned into the ROM with what was the OS core. Tasks were not re-entrant (no need), but there was a large number of shared utility and system routines, all of which were re-entrant and were also shared with the single level interrupt routines (nice on the Z80 with its partial alternative register set). Code that had to be single threaded was simply done under non-interruptible conditions which could be nested. I/O and inter-task communication was done via ring-buffers, although I/O could be direct, as was done in startup via redirecting system routines. The buffers could also be configured as to purpose during startup, although redirecting dot matrix output to a screen would lead to odd results.
The serial port drivers for various peripherals were driven at (dedicated) task level, using wait states, with the core only handling the interrupt status.
In short, you could do some very powerful things with Z80 peripheral chips, and some aspects reminded me of what I did as a day job, which was writing operating system code for IBM architecture machines.
In any event, I was extremely pleased with what I could cram into the Z80s address space, without having to resort to page registers and memory banks. Of the I/O system, at least with standard Z80 peripheral chips, I was very happy.
As for LDIR and the like, then I was used to SS - store, store instructions on the IBM, and it is a great way of reducing code size, although anathema to the load-store RISC crowd.
@AndrewRump 4 months ago ⁺⁵
You forgot one important detail!
On Amstrad, I/O devices must also monitor the memory access pin and, if it is active at the same time as the I/O access pin, ignore the I/O access pin, because the CPU is doing memory banking!
I was nearly bitten by this when I made an interface for Lego Mindstorms version 0.
Luckily, when I was about to give up because my board would be loaded with random values from time to time (when doing memory banking), I read about the feature in an Amstrad book.
Fortunately I had connected the I/O access pin to both pins on a NOR gate. I just had to remove the jumper and add the memory access pin to the chip and everything worked!!! 🎉
@fr_schmidlin 4 months ago ⁺¹
Man, and I thought the CPC architecture was hacky enough with this video.
An honest thank you for giving more details!
@SimonEllwood 4 months ago ⁺²²
Z80 is a superset of the Intel 8080/8085. Those only had 256 IO addresses.
@wearwolf2500 4 months ago ⁺²
That gets sent out twice on the address bus (top 8 bits and bottom 8 bits) when doing an IN/OUT instruction.
@Torbjorn.Lindgren 4 months ago ⁺¹⁰
Yeah, trying to understand the Z80 after only experiencing 68xx/65xx and without first at least reading up on the 8080 is going to be mindbending. Zilog basically threw the kitchen sink at the already existing 8080 design which is the "why" for so many weird corners - they had to work around an existing design. The 16-bit I/O address is a good example, it had to be 100% compatible, but they saw an opportunity to allow the hardware designer more freedom IF necessary for their design.
@fr_schmidlin 4 months ago ⁺²
@Torbjorn.Lindgren You had some of the few sensible/non-fanboyish explanations here.
@herrbonk3635 4 months ago
@Torbjorn.Lindgren It was their own design though!
Faggin and Shima designed the 8080 at Intel as well as the Z80 at Zilog.
@etchedpixels 4 months ago ⁺⁵
Pedantically 8080 but not 8085. 8085 added official different instructions (SIM/RIM) and hidden ones discovered later (LHLX, SHLX, LDSI etc) that were designed for compilers and added a load of 16bit and stack ops. Why Intel hid them nobody seems to know - perhaps to avoid 8086 competition because the 8085 with those instructions ran C and other high level code several times faster than an 8080
@bryede 4 months ago ⁺⁹
Old NMOS chips use the capacitance of their transistor gates like a free register from clock cycle to clock cycle. So, the same effect that creates DRAM cells also works to move data through the chip as long as your cycles aren't so far apart that the transistor discharges. The later CMOS versions are built differently and can be halted completely.
@dfs-comedy 4 months ago ⁺¹
Some CMOS chips still use dynamic logic, so can't be completely halted. This is done to improve circuit density.
@markrosenthal9108 4 months ago ⁺⁵
Nostalgia time again.
I wrote terminal emulators on the Z80 and the IN, OUT instructions were never any problem. The way they were implemented was with a separate 256 byte address space for I/O. They would typically occur only once each in any program, wrapped in assembly subroutines I would name "inchar" and "outchar". Very simple for async serial I/O.
The real complexity challenge, especially for data communication, was setting up and responding to hardware interrupts from whatever communication controller chip the implementation used - for a Z80 typically a Z8530. The capabilities of the Z8530 SCC still impress me today.
LDIR is one of my favorite instructions. Used it frequently for block moves with memory mapped video and such.
Where OUTI and INI came into play was with "block mode" terminals or emulators and/or synchronous communication (again, the Z8530 as an example). For example, HP3000 mini computers could do block mode terminal I/O and had an application programming support layer for that called VPlus. And don't forget "green screen" terminal I/O from IBM. A defining characteristic of these terminals was the hidden data transfer followed by instantaneous refresh of the entire display (LDIR again on the Z80).
@richardkelsch3640 4 months ago ⁺¹²
The ZX Spectrum uses the full BC address for scanning keys.
@theALFEST 4 months ago
But the Spectrum can use the 'IN A,(FE)' instruction with A holding the high byte of the port address. That is impossible on the Amstrad CPC.
@qno-oj3py 4 months ago ⁺¹
Ah, yes. The ZX Spectrum. I used to have one in the 80s. Built an EPROM programmer with it to copy drum computer sounds. I remember the keyboard was in the way. Had to use high addresses to not get data garbled. I used a couple of 374 8 bit buffers and a 273 (from memory). Address decoder was a 138. Hardware debugging with a single channel oscilloscope.
@michaelhaardt5988 4 months ago ⁺³⁰
For true madness, you could have included the hidden 1 bit output port by manually loading bit 7 in R to latch it during refresh. :) The Z80 just suffers from 8080 compatibility at some places.
@NoelsRetroLab 4 months ago ⁺²
Ohhh, I didn't know about that!
@andyhu9542 4 months ago ⁺¹
And an additional 8-bit port with the I register if you don't use Zilog's interrupt system!
@greenaum 4 months ago
Are you talking about output ports or register space?
@smf3472 4 months ago ⁺¹
Nintendo's Popeye arcade PCB uses 1 bit of the I register for acknowledging vblank NMI (and for enabling/disabling vblank NMI). Until you look at the schematics and realize it's latching the bit during a refresh cycle, then it is kinda confusing trying to work out why it's changing the I register.
OUT (c) makes sense as the 8080 had an OUT instruction that only took an 8 bit port, which was then used for bits 0-7 & also for bits 8-15 on the address bus & always outputs A register on the data bus. What really doesn't make sense is that zilog made this instruction output the A register on bits 8-15, which is OK for IN but quite useless for OUT. I don't know what they were thinking
@andyhu9542 4 months ago ⁺³
@greenaum If you latch A8~A15 during the memory refresh cycle, you get the contents of the I register. Therefore, this is like a super easy to use output port that can be loaded with a LD I,A instruction.
@dhpbear2 4 months ago ⁺³
12:38 - Easy, the 'B' register is sent to A8-A15 for those who need to decode more than 256 I/O addresses! :)
@rustandmagic 4 months ago ⁺¹⁷
I like the Z80 assembly language, I find it logical, the Intel 8080 rubbish on the other hand is madness, I mean "MOV dest,source"!!!!, "LD dest,source" is much more logical, but I guess it is a matter of taste but ok, they could have done the IO instructions better ;)
@edgeeffect 4 months ago ⁺⁴
Yeah, I did my first low-level programming on the Z80... when I moved to CP/M and everything was 8080 based (even though most of the actual hardware was Z80) I hated all those "horrible" mnemonics in 8080 assembler.
@LarryRobinsonintothefog 3 months ago
The change to the 80x86 assembly language was different, but you had more memory to access.
@deang5622 3 months ago
Clearly you never used the 6800, 6502, 6809. Far more elegant instruction sets.
@rustandmagic 3 months ago
@deang5622 Used all of them and I agree, but we were talking about Z80s.
@LarryRobinsonintothefog 3 months ago
@deang5622 I've programmed the 6800 and 6809, not the 6502. The 6809 had a MUL instruction; on the Z80 you had to do that manually.
@GodmanchesterGoblin 4 months ago ⁺⁴
I love the LDIR instruction. It's great for fast block copying. Less well known is that if the two blocks overlap and begin only one memory location apart, it's possible to clear a large block of memory just by zeroing the first byte and then using LDIR to copy to the next byte repeatedly until the entire block is initialised.
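The overlapping-copy trick works because each byte LDIR reads was written by the previous iteration. A minimal Python model of LDIR's register behaviour shows it; the buffer address and length below are arbitrary values chosen for the example, and `ldir` is a simplified stand-in, not an emulator core.

```python
def ldir(memory: bytearray, hl: int, de: int, bc: int) -> tuple[int, int, int]:
    """Copy (HL) to (DE), increment both, decrement BC, repeat until BC == 0."""
    while True:
        memory[de] = memory[hl]
        hl = (hl + 1) & 0xFFFF
        de = (de + 1) & 0xFFFF
        bc = (bc - 1) & 0xFFFF
        if bc == 0:
            break
    return hl, de, bc


START, LENGTH = 0x4000, 0x1800      # example block, chosen arbitrarily

memory = bytearray(0x10000)
for i in range(LENGTH):
    memory[START + i] = 0xAA        # pretend the block holds old data
memory[START + LENGTH] = 0x55       # sentinel just past the block

memory[START] = 0x00                # zero the first byte by hand...
ldir(memory, hl=START, de=START + 1, bc=LENGTH - 1)   # ...then smear it forward

block = memory[START:START + LENGTH]
```

Note that BC is the block length minus one, since the first byte is written separately.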
@jeromethiel4323 4 months ago ⁺³
Yep. This is exactly how the TRS-80 cleared video memory. You can also use LDIR to shift the video memory one character up, down, left, or right. It's all in how you set up the pointers. Way back in the day i wrote a small machine code routine that piggy backed off of the DOS CMD command (i didn't have a disk system, so no DOS), and it could fill the screen with a character, move the screen as i noted above. Used it for writing games back in the day.
@TheEulerID 4 months ago
I was used to SS (storage-storage) instructions on IBM mainframes, so I wasn't outraged by the existence of LDIR on the Z80, although the IBM MVC instruction encodes a fixed length. For variable lengths that iterate using registers, you have MVCL, which is closer to how LDIR works.
There is another way, using an EX instruction to execute those SS instructions and, essentially, temporarily over-write the length encoded in the second byte of the instruction (where the length-1 is encoded on an SS format instruction), but I fear that would cause explosive outrage...
@flatfingertuning727 4 months ago
If one wants to fill a bunch of 256-byte chunks of memory with a 4-byte pattern, using a sequence like LP1: LD (HL),A / INC L / LD (HL),C / INC L / LD (HL),D / INC L / LD (HL),E / INC L / JP NZ,LP1 / INC H / DEC B / JP NZ,LP1 will be a fair bit faster (about 13.5 cycles/byte) than LDIR. LDIR is handy, but it's tragically slow because of how much time it spends using a 4-bit ALU to update BC.
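The 13.5 cycles/byte figure can be reproduced from the documented T-state counts. A short arithmetic sketch in Python, assuming the loop exactly as written in the comment and no wait states or contention:

```python
# Documented Z80 T-state counts for the instructions in the loop.
T_LD_HL_R = 7        # LD (HL),r
T_INC_R = 4          # INC L / INC H
T_DEC_R = 4          # DEC B
T_JP_CC = 10         # JP NZ,nn (same cost whether taken or not)
T_LDIR_REPEAT = 21   # LDIR, per byte while repeating

# One pass of the inner loop stores 4 bytes.
inner_pass = 4 * (T_LD_HL_R + T_INC_R) + T_JP_CC
inner_per_byte = inner_pass / 4

# A 256-byte page is 64 inner passes plus INC H / DEC B / JP NZ once.
page = 64 * inner_pass + T_INC_R + T_DEC_R + T_JP_CC
page_per_byte = page / 256
```

So the inner loop alone costs 13.5 T-states per byte, and the per-page overhead only nudges that up slightly, still well under LDIR's 21.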
@mikafoxx2717 4 months ago
LDIR is almost like a hacked-in CPU bound DMA. It's very handy though, and makes for small binaries that can move a lot of data, even if not the fastest way to do it.
@PS-bp4ju 4 months ago
@TheEulerID Both LDIR and MVCL are 2-byte instructions and both ... are slow as hell. In places where performance matters, MVCL is changed to a bunch of MVCs and LDIR to LDIs. But anyway, the fastest way to fill/copy data on the Z80 is via the stack.
What's also funny, MVCL is also using more registers than specified in the instruction.
@drgusman 4 months ago ⁺¹⁸
To me it makes sense. If you think in addressing a register-based external IC you need a /CE signal, an address to select a register and a value to set to that register. Then, with what Zilog did you can use an address decoder connected to the low 8 bits and /IOREQ to drive /CE, the high 8 bits to select the register and the 8 data lines for the value. With this you can have data tables for all the registers of the IC and a simple OTIR will load all of them, or read the full state of the chip with INIR :)
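The scheme above can be sketched as a small simulation. It is written in Python for illustration and assumes a hypothetical chip (`RegisterChip`) that is selected by the low address byte and uses the high address byte as its register index. It also relies on the documented OTIR behaviour that B is decremented before BC is placed on the address bus, so the table has to be stored in descending register order.

```python
class RegisterChip:
    """Hypothetical peripheral: A0-A7 select the chip, A8-A15 select a register."""

    def __init__(self, port: int, register_count: int) -> None:
        self.port = port
        self.regs = [0] * register_count

    def write(self, address: int, value: int) -> None:
        if (address & 0xFF) != self.port:
            return                      # chip enable not asserted
        index = address >> 8
        if index < len(self.regs):
            self.regs[index] = value


def otir(memory: bytearray, hl: int, b: int, c: int, write_port) -> tuple[int, int]:
    """Model of OTIR: read (HL), decrement B, output to port BC, repeat until B == 0."""
    while True:
        value = memory[hl]
        b = (b - 1) & 0xFF              # B is decremented before it reaches A8-A15
        write_port((b << 8) | c, value)
        hl = (hl + 1) & 0xFFFF
        if b == 0:
            break
    return hl, b


chip = RegisterChip(port=0x40, register_count=4)
memory = bytearray(0x10000)
TABLE = 0x8000
memory[TABLE:TABLE + 4] = bytes([0x13, 0x12, 0x11, 0x10])   # values for R3, R2, R1, R0

otir(memory, hl=TABLE, b=4, c=0x40, write_port=chip.write)
```

One OTIR with B = 4 therefore loads registers 3 down to 0 from a four-byte table.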
@smf3472 4 months ago
But you also need an address decoder for the RAM and ROM, so it doesn't particularly help. I don't believe Zilog ever considered how simple it would be to hookup. The decision had been made on the 8008 & that had been based on a TTL design that used a delay line instead of RAM. So you couldn't have memory mapped I/O. The intel 4004 also had a strange way of accessing RAM that would make memory mapped I/O impractical. Hence Intel went for non memory mapped I/O
@lasersimonjohnson 4 months ago ⁺⁵
LDIR is a fantastic instruction. Used it many times in my past 😂
@fr_schmidlin 4 months ago ⁺⁴
Yes! Incredible how many people forgot that LDIR/LDDR/OTIR/OTDR were a cheap alternative to DMA memory transfers, and advertised as such. DMA was horribly expensive back then.
@disdroid 3 months ago
Fewer clocks if you use a block of LDI and jump into the right position. LDIR increments the IP and does a compare then conditionally decrements it again, adding an additional cycle.
@korsibat 4 months ago ⁺⁶
As this video is not about Z80 I/O being mad but instead as you state in the video it's the Amstrad CPC hardware being the odd
@donaldcongdon9095 4 months ago ⁺³
That was deeply fascinating Noel! Reminds me why I always tell people that assembly language is the coolest way to program. Keep those deep dives coming. Z80 forever! Thanks.
@rty1955 4 months ago ⁺¹
It sure is. I've been programming in assembly for almost 60 yrs. Began on IBM 1401 then onto 360, 370, 4300 series, s/390, PDP, Data General, CDC and way too many micros to name here. I can do things in assembly that other programmers can't even dream of.
I was an expert at reading core dumps as well. I see hex and think in binary
@etmax1 4 months ago ⁺⁴
This sort of operation was done by every IC manufacturer of the time in one way or another. They didn't do VHDL synthesis and all that nice stuff we do now, and transistor counts were limited to less than 10000 around this time, so if an instruction had an artefact then so be it. What's important is that the things they say happen, actually happen. If you look at the 6502, it has stacks of undocumented features/instructions that came about by a chance of sorts.
@herrbonk3635 4 months ago
Exactly, well put.
@Zeal8bit 4 months ago ⁺¹⁴
Great video! I love the Z80 and videos that dive deep in its instructions set😍
There is one instruction that you have not talked about, which is very important in my opinion: OUT (n), A
Instead of using BC, it uses an immediate value as the port. On the Amstrad, this instruction seems to be unusable since the upper address bits are also taken from A register.
If we considered the I/O bus as a 16-bit bus, this instruction would need to be of the form OUT (NN), A and that would take even more clock cycle and code space since there is one more byte to fetch.
Regarding the instruction table, does it also show the undocumented instructions? There are plenty of them, so some "empty" cells in the table would in fact result in undesired behavior on real hardware 😅
@TheUtuber999 4 months ago ⁺⁴
6:18 You can absolutely single-step a Z-80. Just use static RAM instead of dynamic RAM in your implementation. There is a video from five years ago on my channel as one example. Cheers.
@melkiorwiseman5234 4 months ago ⁺⁵
I think what he's saying is that just like the 6502, some early Z80s used CMOS dynamic RAM for their internal registers, meaning that a certain minimum clock speed needed to be maintained in order for the registers to be refreshed. That's something I didn't know about. I thought that all Z80 CPUs always used static RAM for their internal registers.
@johncochran8497 4 months ago ⁺¹
You can't single step the original Z80 design.
The original Z80 introduced in 1976 used a dynamic design in NMOS and had a minimum clock frequency of 250kHz. Going any slower could cause a loss of state and a crash.
The CMOS version of the Z80, which was introduced in 1985, had a static design and could be single stepped.
@deang5622 3 months ago
I wouldn't use SRAM over DRAM just so I could single step through the code. That is a very poor engineering decision. SRAM is a lot more expensive per byte compared to DRAM and not necessarily a good choice in a design where the production cost is sensitive.
Just use a decent in-circuit emulator.
Use the proper debugging tools.
@TheUtuber999 3 months ago ⁺¹
@deang5622 You're kidding, right? A single chip with 64 KB of SRAM (the maximum addressable amount of memory for a Z-80) costs about $3.95 USD.
@captainboing 4 months ago ⁺³
Long term professional Z80 programmer here. I prefer to think of things the other way around. Very simply the Z80 had 16bit IO addressing. The CPC approach capitalized on this properly and it allowed whole pages of IO to be used for your own hardware. Used this by decoding page FF as enable for existing 8bit addressed IO. Everything else that ignored the top 8bits was either sloppy or following the 8080/8085 method... Unless you were being clever and getting 16bits of data in your OUT (C),A by grabbing the B register from A8-A15.
@smf3472 4 months ago
The z80 design doesn't really scream 16 bit i/o addressing to me. They messed up the IN/OUT instructions by putting the A register on the top 8 bits, which is fine for IN but for OUT it means the data lines and top 8 address lines have to have the same value. In reality the z80 has 8 bit I/O addresses, but an implementation detail means that you can achieve 16 bit if you write the software in a specific way.
@captainboing 4 months ago ⁺¹
@smf3472 That "certain way" meant choosing which you were going to use before you started designing the system. Not the word salad it is in Zaks or the datasheet, but just think of it that way and it works. If the documentation simply said BC appears on the address bus and A on the data bus, it is very simple - and defo 16 bit IO addressing. That is what actually happens, AND... With 65,000 IO addresses it did encourage "rationalised" designs resulting in lots of mirroring of addresses - but that's what you get when designing to a price-point. We used to "gut" 6128s and use them as embedded controllers in weather RADAR equipment with some older IO hardware using legacy 8 bit addressing. Used IO page 0FFxxh with 8 diodes and some simple logic to pick the top byte as the enable for existing hardware. A tiddly little daughter board was all that was needed for upgrading from legacy.
IN A,x and OUT x,A were legacy commands and provided code/operability continuity with the intel 8080/85 designs that were the main target of Zilog at the time. You are right that the A register appearing on A8-A15 is a bit odd - I am sure there must have been some reason for it - did any design ever use it?
There are historically a lot of areas where the Z80 could have been a BIG improvement over the 8080... many of the complex instructions often do not provide code-size or execution time advantages (SET, RES etc) and things like LDIR can be done much quicker by clever use of SP - albeit not as arbitrarily. The 4 bit ALU was a major area they could have spent time on, the addition of a MUL command would have been so easy and a real coup over Intel, and 16bit relative addressing for CALLs, JUMPs and IX/Y. There were great improvements over the 8080 but a lot were seldom used because they didn't go far enough. In Zilog's defence, the actual deployment of the RISC philosophy was a decade away, even though the 6502 performed very well with a slower clock and only three registers. Motorola's 6800 series was even better.
@Randrew 4 місяці тому ⁺²
Before watching your video, I'm gonna say: August 30, 2006 I edited the wiki Z80 page to add "undocumented 16-bit I/O-addressing" information. Actually the documentation was in the Z80 Hardware Reference Manual all along (still got it around here somewhere) but isn't too clear about it. It shows the BC register being asserted on the full 16-bit address bus in the OUT (C),A and inverse instructions. I actually took advantage of this feature in the late '80s when interfacing some 16-bit (address) IO cards in a Z80 system.
Now, on to see what you have to say ;)
@ErazerPT 4 місяці тому ⁺³
Fun video. As for the last part, that sounds a damn lot like what the Graffiti "display adapter" for the Amiga did. It just "kept state" with the graphics data being outputted, and if it received a very specific setup that made very little sense in normal use, bam, it was not bitplane data but "chunky" data. Sure, it's ugly, but when you're repurposing existing stuff for new purposes, that's how you go...
p.s. in the software side, it's the equivalent of when you have to keep using some data structure that already exists without changing it, you abuse it a bit on the set up side, and on the other end if it's set in the "right way", you know that the content is not straight "regular data" but "data that needs further decoding". The amount of abuse string data gets with "json/xml inside" kludges makes these hardware shenanigans look tame ;)
@cjh0751 4 місяці тому ⁺⁸
Hi Noel, I was just watching your video about the Fujitsu FM-7 last night. I think the UA-cam algorithm suggested it because of the British Post Office Scandal with the faulty Fujitsu Horizon Post office software. Anyway it's always great to see a new video from you.
@NoelsRetroLab 4 місяці тому ⁺³
Haha, that's hilarious if that's the reason it's pushing that video. I should look to see if there's a recent bump :-)
@HenkvanHoek 4 місяці тому ⁺¹
I was also watching the postoffice episode. Funny how the UA-cam algorithm works.
@disdroid 3 місяці тому
Maybe Fujitsu forgot about the upper address line in the software?
@lordsmeagol3390 4 місяці тому ⁺¹
On LDIR/LDDR if you need a bit more speed, use several LDI/LDD instructions and loop them; LDIR is 21 cycles per iteration, LDI is 16.
On INIR/OTIR, I think they can be useful for a RAM disk: using 256-byte 'sectors', the B countdown with INIR/OTIR can be used to address each byte of the sector, making the read and write code very small and fast.
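To put numbers on the LDIR versus LDI point, here is a rough sketch of the arithmetic in Python. It assumes the usual documented T-state counts with no wait states, and a hypothetical loop made of `unroll` LDI instructions followed by `JP PE,loop` (LDI leaves P/V set while BC is non-zero). The function names are made up for illustration.

```python
# T-state costs from the standard Z80 timing tables (no wait states assumed).
LDIR_REPEAT = 21  # one LDIR iteration that repeats (BC still non-zero)
LDIR_LAST = 16    # the final LDIR iteration
LDI = 16          # a single LDI
JP_CC = 10        # JP cc,nn costs 10 T-states whether or not it jumps


def ldir_cycles(n):
    """T-states for LDIR copying n bytes (n >= 1)."""
    return LDIR_REPEAT * (n - 1) + LDIR_LAST


def unrolled_ldi_cycles(n, unroll):
    """T-states for a hypothetical loop of `unroll` LDIs plus JP PE,loop.

    For simplicity n must be a multiple of `unroll`.
    """
    if n % unroll:
        raise ValueError("n must be a multiple of unroll in this sketch")
    return n * LDI + (n // unroll) * JP_CC


if __name__ == "__main__":
    print(ldir_cycles(256), unrolled_ldi_cycles(256, 16))
```

Under these assumptions a 256-byte copy costs 5371 T-states with LDIR and 4256 with sixteen LDIs per loop pass, at the price of 32 extra bytes of code.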
@jeroentaverne8232 4 місяці тому ⁺¹
Sinclair ZX80,ZX81 and Spectrum actually used A15..A8 as row selector for the keyboard matrix when executing an IN instruction to read the columns. Just to save costs for a real output register to select the row.
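A small Python model may make the keyboard trick easier to picture. It is only a sketch for the ZX Spectrum layout (the ZX80/ZX81 matrix differs in detail), it ignores the ULA's decoding of A0 and the upper data bits, and the function name and key labels are invented for the example. Each of A8..A15 selects one half-row of five keys when driven low, and a pressed key pulls its data bit low.

```python
# Spectrum half-rows, indexed by the address line (A8 + index) that selects them.
# Column 0 is data bit D0, column 4 is D4.
HALF_ROWS = [
    ["CAPS", "Z", "X", "C", "V"],      # A8  low -> port 0xFEFE
    ["A", "S", "D", "F", "G"],         # A9  low -> port 0xFDFE
    ["Q", "W", "E", "R", "T"],         # A10 low -> port 0xFBFE
    ["1", "2", "3", "4", "5"],         # A11 low -> port 0xF7FE
    ["0", "9", "8", "7", "6"],         # A12 low -> port 0xEFFE
    ["P", "O", "I", "U", "Y"],         # A13 low -> port 0xDFFE
    ["ENTER", "L", "K", "J", "H"],     # A14 low -> port 0xBFFE
    ["SPACE", "SYM", "M", "N", "B"],   # A15 low -> port 0x7FFE
]


def read_keyboard(port, pressed):
    """Return data bits D0..D4 for an IN from the 16-bit `port` address.

    `pressed` is a set of key labels. Bits are active low: 0 means pressed.
    """
    high = (port >> 8) & 0xFF
    bits = 0x1F
    for row, keys in enumerate(HALF_ROWS):
        if high & (1 << row) == 0:          # half-row selected by a low address line
            for col, key in enumerate(keys):
                if key in pressed:
                    bits &= ~(1 << col)
    return bits & 0x1F


if __name__ == "__main__":
    print(bin(read_keyboard(0xFEFE, {"Z"})))
```

Loading B with 0xFE and C with 0xFE before `IN A,(C)` corresponds to `read_keyboard(0xFEFE, ...)`; setting the high byte to 0x00 selects every half-row at once, which is a common way to ask "is any key down?".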
@Mistasparkaru 4 місяці тому ⁺⁵
All the Z80s I've used were quite happy running at hertz speeds, i.e. one switch press per clock tick. I've only tested ~5 CPUs though.
@isaacmarinobavaresco7397 4 місяці тому ⁺¹⁰
CMOS Z80s run OK at zero clock, but for the old NMOS devices, the minimum speed is about 125 kHz.
@____________________________.x 4 місяці тому
@@isaacmarinobavaresco7397 "the Clock Pulse Width (Low) maximum is 2000 nsecs (2 usecs). so for a square-wave clock with a 50% duty cycle, the minimum clock frequency is 250 KHz" - I thought it was around 500 kHz myself, but that's what the net says. hth
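The arithmetic behind that quoted figure is short enough to write down. This Python sketch only restates the quote's reasoning: the 2000 ns limit comes from the quote, not from a datasheet checked here, and the function name is made up.

```python
def min_clock_hz(max_low_ns, low_fraction=0.5):
    """Lowest clock frequency that keeps the low phase within max_low_ns.

    low_fraction is the share of each period the clock spends low
    (0.5 for a symmetric square wave).
    """
    period_ns = max_low_ns / low_fraction
    return 1e9 / period_ns


if __name__ == "__main__":
    print(min_clock_hz(2000))
```

With a 2000 ns maximum low time and a 50% duty cycle, the period is 4000 ns, which gives the 250 kHz quoted above.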
@cokesandwich1668 2 дні тому
I'm pretty sure mine ran on +5 VDC only. I presume that means it was CMOS, not NMOS?
In any case I had trouble when I first built the system. It ran all janky.
So I replaced the 2.5 MHz clock with something around 2 Hz and it worked fine.
So I pulled out the scope and found crazy high noise on the +5 V rail.
I cleaned that up and things ran just fine at 2.5 MHz.
@0cgw 4 місяці тому ⁺²
Yep, having converted assembler to hex codes before I got a proper assembler for my ZX Spectrum back in the early 80s, I immediately shouted at the screen #C9 = ret. 😄
@Foersom_ 4 місяці тому
I did the same but using Spectravideo and MSX.
@MK-jo1gi 4 місяці тому ⁺⁶
LDIR is not an eyesore! I used it to copy blocks around in my Schneider 6128 between memory banks. It was very handy, not having to write the loop. :)
@fr_schmidlin 4 місяці тому ⁺³
Yes! It's incredible how many people forgot that LDIR/LDDR/OTIR/OTDR were a cheap alternative to DMA memory transfers, and advertised as such. DMA was horribly expensive back then.
@GodmanchesterGoblin 4 місяці тому ⁺²
Yes. I modded my Spectrum back in 1982-83 up to 80k (16k with two 32k banks) and used LDIR as a blitter to create some pretty slick animations by pre-drawing multiple images in RAM and then repeated LDIR to drop them into the active screen buffer.
@fghsgh 4 місяці тому ⁺³
5:16: BC is output because it reuses the same circuitry that LD (BC),A uses. It isn't part of the mnemonic because hardware is _supposed_ to ignore that. OUT (C),A is actually a Z80 extension. The Intel 8080 only had OUT (imm8),A and IN A,(imm8) (using Z80 syntax). Note the 8-bit port number. The Z80 added a variable-port any-register variant to its extended (ED) instruction block.
10:49: You can actually reach much faster speeds than LDIR (or even than an unrolled LDI loop) using an overengineered PUSH/POP loop, down to 12-13 clock cycles per byte copied, as opposed to 16 and 21 for LDI and LDIR respectively (assuming no wait states). The reason LDI is so slow is that the ED opcode fetch slows it down; the reason LDIR is even slower is that, instead of having the loop built in, the Z80 just decrements PC twice to go back to the start of the instruction on every iteration, which is a very inefficient operation. (EDIT correction: it may actually be doing a 16-bit+8-bit addition like JR uses. i need to look at the die shot.)
21:13: FD FD is not empty, it is undocumented. A standalone FD byte actually does the following things:
1. set a flag that disables all interrupts (including NMI) for one instruction. Other instructions that do this are all the other prefixes, and also EI.
2. set a flag that replaces any instance of H or L with IY for one instruction, and if (HL) is accessed, read another byte for an index offset
This means that every instruction that is valid in the base instruction set is also valid in the FD extension. Only some of these are useful though, because a lot of them won't be any different from the base instruction set.
Therefore, when the Z80 encounters the second FD, it doesn't ignore it and keep reading, it ignores the _first_ FD. The second FD overwrites the first one.
Another device that uses a Z80 and has it wired up to use all 16 bits is the Texas Instruments 84 Plus C Silver Edition graphing calculator. Except it actually uses all of the ports separately rather than just wiring up some bits to some chips. And it has an ASIC with an integrated CMOS Z80 rather than an original standalone NMOS Z80. Actually, it does 19:50 too: it has a special sequence of 5 instructions that needs to be executed from a "privileged" rom page before letting you access certain hardware ports. However, no check is performed to see if the bytes are executed. You could just read them as regular memory accesses, which means there are a few variants that use undocumented instructions, that can e.g. avoid the IM 1 that is part of the sequence.
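The "last prefix wins" behaviour described above can be sketched as a tiny decoder in Python. This is an illustration, not a full Z80 decoder: it only handles DD/FD in front of main-page opcodes (no ED or CB pages), it does not model the interrupt-inhibit side effect, and the function names are invented for the example.

```python
PREFIX_INDEX = {0xDD: "IX", 0xFD: "IY"}


def takes_displacement(opcode):
    """True for main-page opcodes that have (HL) as an operand (HALT excluded)."""
    if opcode in (0x34, 0x35, 0x36):            # INC (HL), DEC (HL), LD (HL),n
        return True
    if 0x40 <= opcode <= 0x7F:                  # LD r,r' block
        if opcode == 0x76:                      # HALT sits where LD (HL),(HL) would be
            return False
        return (opcode & 0x07) == 6 or ((opcode >> 3) & 0x07) == 6
    if 0x80 <= opcode <= 0xBF:                  # 8-bit ALU block
        return (opcode & 0x07) == 6
    return False


def decode_prefixes(code):
    """Return (index_register, opcode, has_displacement) for a byte sequence.

    Every DD/FD simply replaces the effect of the one before it, so only the
    last prefix in a run matters.
    """
    index = "HL"
    pos = 0
    while code[pos] in PREFIX_INDEX:
        index = PREFIX_INDEX[code[pos]]
        pos += 1
    opcode = code[pos]
    has_disp = index != "HL" and takes_displacement(opcode)
    return index, opcode, has_disp


if __name__ == "__main__":
    print(decode_prefixes(bytes([0xFD, 0xFD, 0x70, 0x05])))
```

For example, `FD FD 70 05` decodes the same way as `FD 70 05` (an LD (IY+5),B), while `FD 07` has nothing for the prefix to act on and behaves as a plain RLCA.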
@amidarius 4 місяці тому
Any example of "overengineered PUSH/POP loop" please ? 😃
@fghsgh 4 місяці тому ⁺³
@@amidarius Uh, alright, so, for a general purpose one:
loop:
ld hl,#
add hl,sp
ld sp,hl
pop af
pop bc
pop de
pop hl
exx
ex af,af'
pop af
pop bc
pop de
pop hl
pop ix
pop iy
ld sp,#
push iy
push ix
push hl
push de
push bc
push af
exx
ex af,af'
push hl
push de
push bc
push af
ld (#),sp
ld a,r
jp nz,loop
It uses _all_ registers (aside from I, i guess), and uses R as a loop counter. Interrupts (obviously) have to be disabled. The first # is the difference between source and destination, the second # should be initialised to the address of the byte after the destination block, the third # is the address of the second #.
As the loop increments R 37 times and R wraps around after 128 (which is coprime with 37), every number of iterations up to 128 can be achieved in this manner.
Of course, the setup for using this method is very expensive, and even then it would copy at 15.9cc per byte (assuming no wait states), which is only marginally better than an unrolled LDI loop at 16cc per byte. And of course you also need to jump into an unrolled LDI loop to cover the remainder after the 20-byte blocks this loop copies. This is a lot of calculations, especially on a Z80 that can't do multiplication or division, but there are definitely cases where several of these values can be hard-coded.
(i did say it was overengineered)
However, where this method really shines, is if you know source & destination at assembly time, and are willing to unroll completely:
ld sp,#
pop af
pop bc
pop de
pop hl
exx
pop bc
pop de
pop hl
ld sp,#
push hl
push de
push bc
exx
push hl
push de
push bc
push af
This can do 12.5cc per byte(!!), in 14-byte multiples. (and it doesn't use AF' because that would actually slow it down to 12.75cc per byte)
LDI is slow because it's an extended instruction :(. PUSH/POP are crazy efficient like this because they're one-byte instructions that can do two memory accesses. So copying one byte takes 3 total memory accesses, instead of LDI's 4.
It's even faster if you just need to zero out a block of memory or something:
ld hl,0
ld b,l
ld sp,block+4096
loop:
push hl
push hl
push hl
push hl
push hl
push hl
push hl
push hl
djnz loop
This clears a 4096-byte block starting at `block`, taking 25875cc in total, for a rate of 6.32cc per byte(!!!!!).
But remember to save&restore SP. And remember you can use SMC for that too to save another 10cc :p.
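For anyone who wants to check the cycle figures quoted in this comment, the sums can be reproduced with a short Python script. It assumes the standard documented T-state counts and no wait states; the table keys and variable names are just labels chosen for this sketch.

```python
# Documented T-state counts for the instructions used in the three listings.
T = {
    "ld rr,nn": 10, "ld r,r'": 4, "add hl,sp": 11, "ld sp,hl": 6,
    "pop rr": 10, "push rr": 11, "pop ix/iy": 14, "push ix/iy": 15,
    "exx": 4, "ex af,af'": 4, "ld (nn),sp": 20, "ld a,r": 9,
    "jp cc,nn": 10, "djnz taken": 13, "djnz not taken": 8,
}


def total(counts):
    """Sum T-states for a {instruction: how many times} mapping."""
    return sum(T[name] * n for name, n in counts.items())


# General-purpose loop: 20 bytes copied per pass.
general = total({
    "ld rr,nn": 2, "add hl,sp": 1, "ld sp,hl": 1,
    "pop rr": 8, "pop ix/iy": 2, "push ix/iy": 2, "push rr": 8,
    "exx": 2, "ex af,af'": 2, "ld (nn),sp": 1, "ld a,r": 1, "jp cc,nn": 1,
})

# Fully unrolled block: 14 bytes copied per block.
unrolled = total({"ld rr,nn": 2, "pop rr": 7, "push rr": 7, "exx": 2})

# Clearing 4096 bytes: 3 setup instructions, then 256 passes of 8 PUSHes + DJNZ.
clear = (total({"ld rr,nn": 2, "ld r,r'": 1})
         + 256 * total({"push rr": 8})
         + 255 * T["djnz taken"] + T["djnz not taken"])

if __name__ == "__main__":
    print(general, general / 20)      # T-states per pass, per byte
    print(unrolled, unrolled / 14)
    print(clear, round(clear / 4096, 2))
```

The totals come out at 318 T-states per 20 bytes (15.9 per byte), 175 per 14 bytes (12.5 per byte) and 25875 for the 4096-byte clear (about 6.32 per byte), matching the numbers in the comment.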
@fghsgh 4 місяці тому ⁺¹
Okay i did some math to figure out more about the initialisation. The number of iterations is BC/20. The number of bytes to be copied outside of the loop is BC%20, so the JR offset into a block of unrolled LDIs is (BC-1)%20*2. To find the number to set R to, multiply the number of iterations by 45, add 36, and AND with 127. Or something other than 36 depending on where during the initialisation you're setting it. However, this method only lets you copy up to 2560 bytes (+ 19 for the unrolled LDI loop), so you can save at most 256cc, which is less than the initialisation would probably take. Unless you added an outer loop counter too, or unrolled the inner loop some.
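The property that makes R usable as a loop counter here (37 steps per pass, coprime with 128) can be checked with a few lines of Python. This is a simplified model, not the exact constants from the comment: it assumes R is seeded so that it holds `r0` exactly one full pass before the first `LD A,R` sample, and that bit 7 of R is zero. The names are made up for the sketch.

```python
from math import gcd

R_STEPS_PER_PASS = 37   # opcode fetches (M1 cycles) in one pass of the loop
R_MODULUS = 128         # only the low 7 bits of R count


def initial_r(iterations):
    """Seed value so that R reads zero on exactly the given pass (1..128)."""
    return (-R_STEPS_PER_PASS * iterations) % R_MODULUS


def passes_until_zero(r0):
    """Simulate the loop: count passes until LD A,R would see zero."""
    r = r0
    passes = 0
    while True:
        r = (r + R_STEPS_PER_PASS) % R_MODULUS
        passes += 1
        if r == 0:
            return passes


if __name__ == "__main__":
    assert gcd(R_STEPS_PER_PASS, R_MODULUS) == 1
    for n in (1, 2, 20, 128):
        print(n, initial_r(n), passes_until_zero(initial_r(n)))
```

Because 37 and 128 share no common factor, every pass count from 1 to 128 has exactly one seed value, which is what the 128-pass (2560-byte) ceiling mentioned above follows from.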
@amidarius 4 місяці тому
@@fghsgh Yes, the first method is really overengineered. 😄 I knew the second one (with 8-byte multiples, without exx, it was already fast enough at 13cc/byte) and the third one. Anyway, nice post for all Z80-every-cc-counts newcomers. 😂
@francescosacco4969 4 місяці тому ⁺³
Incredible! I loved how in depth you explained and electronically showed the behavior of the Z80. I would love to see more episodes like this!
@OscarSommerbo 4 місяці тому ⁺⁵
This video all but convinced me to learn z80(e) assembler, so much fun stuff in the Z80, compared to the 6510, which I programmed for the last time in the 80s.
@smf3472 4 місяці тому ⁺¹
I have done both 6502 and Z80 in recent years, I find 6502 vastly more enjoyable than Z80.
@ojonasar 4 місяці тому ⁺³
The ZX Spectrum uses the full 16 bits when reading the keyboard.
@bendertherobot910 4 місяці тому ⁺¹
Clearly, your channel is one of the best on these topics on UA-cam, as are your editing and storytelling skills. Thanks!!
@jrkorman 4 місяці тому ⁺⁴
Zaks' "Programming The Z80" was my bible when I got started back in 1980. Hand assembling my code for probably the first 6 months, as well as hand disassembling! This was on the Radio Shack Model 1 computer. You really LEARN when you're doing it that way!
The "problem" probably came from Zilog "overloading" existing instruction "code". In later years I learned to be very cautious about overloading/extending code because of weird stuff in the base code.
@Plons0Nard 4 місяці тому ⁺¹
Rodney Zaks was quite active in those days. I had two books of him for the 6502. 🤝
@retrozmachine1189 4 місяці тому
I had both the Z80 and 6502 books but only have the Z80 one still. I think mine is a 1982 edition, packed in a box somewhere so it's not at hand.
@DouglasFish 4 місяці тому ⁺⁴
I think Ben would still be proud
@thek3743 4 місяці тому
No. Ben's videos are much easier to follow.
@semibiotic 4 місяці тому ⁺²
The 16-bit bus extension is also implicitly used in the legacy IN A,(NN) / OUT (NN),A instructions.
@johncochran8497 4 місяці тому
Not really. For the "in a,(n)" and "out (n),a" opcodes, the upper 8 bits of the address bus hold the contents of the "A" register at the start of the operation. So you can sorta get the behavior you want using the "in a,(n)" operation by loading A with the desired upper 8 bits prior to the IN opcode. But unfortunately, that doesn't apply to the OUT operation: if you attempt to use the upper 8 bits, you simply get a set of 256 port addresses, each of which can only ever receive a single possible value.
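The difference between the three instruction forms is easy to see if you model what ends up on the address and data buses. This is a minimal Python sketch of that behaviour as described in this thread; the function names are invented for the example.

```python
def out_n_a(a, n):
    """OUT (n),A: returns (16-bit address, data byte). A drives A8..A15."""
    return ((a & 0xFF) << 8) | (n & 0xFF), a & 0xFF


def in_a_n_address(a, n):
    """IN A,(n): returns the 16-bit address. A (before the read) drives A8..A15."""
    return ((a & 0xFF) << 8) | (n & 0xFF)


def out_c_r(b, c, r):
    """OUT (C),r: returns (16-bit address, data byte). BC drives the whole bus."""
    return ((b & 0xFF) << 8) | (c & 0xFF), r & 0xFF


if __name__ == "__main__":
    print([hex(v) for v in out_n_a(0x12, 0xFE)])
    print([hex(v) for v in out_c_r(0x7F, 0xFD, 0x10)])
```

With OUT (n),A the high address byte and the data byte are always the same value, which is the limitation described above; with OUT (C),r the port number (BC) and the data (r) are independent.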
@semibiotic 4 місяці тому
@@johncochran8497 I didn't say that it is always a useful extension.
@laser31415 3 місяці тому
i wish some of these videos were available in 1984, it would have saved me many many hours of trial and error. Every now and then I still code my old z80 based TS2068. A few years ago I breadboarded (TTL logic only) my own dot matrix printer interface for it. That had been in my bucket list for 37 years.
@cbmeeks 4 місяці тому ⁺⁷
Man, the scratches on that Z80. That CPU has some stories to tell. lol
Great video! (and that's coming from a hardcore 6502 fan)
@Flashy7 4 місяці тому ⁺²
Those might be from Noel's fingernails while figuring it out :)
@mikafoxx2717 4 місяці тому
6502 definitely seems a bit more elegant.. z80 might be easier to program, though. But it has some real wonky things. Shadow registers, output bus shenanigans, 8080 backwards compatibility you have to keep in mind if you want to run your software on both, like CPM software usually did..
@ChrisWalshZX 3 місяці тому
Fantastic Video.
I'm a ZX Spectrum software developer and that's the best explanation I've seen for IN (C) and OUT (C) not having B included in the mnemonic. As a software developer, I often use LDI, LDIR, LDD, LDDR, but rarely have I ever used OUTI, OTIR, INI, INIR, just the regular IN/OUT, and so the root reason behind the counter versions hadn't really occurred to me.
I'm glad modern hardware does full address decoding. :-)
The Amstrad CPC info was new to me. Thanks.
@scottlarson1548 4 місяці тому ⁺²
I think one of the epiphanies of understanding my computer as a kid was figuring out how a chip like the 6850 serial port was at a certain memory address. I assumed it was something clever and elegant and I remember being disappointed when I saw it required a whole bunch of logic chips connected to *all* of the address lines just to enable the chip at that address.
@flatfingertuning727 4 місяці тому
If one is using e.g. 8Kx8 RAM and ROM chips, and can afford 8K of address spacing for eight I/O devices, one would use one 74LS138 to select the RAM chips, ROM chips, and the second 74LS138 to select a particular I/O device. When using separate I/O space, one would still need two 74LS138 chips.
@scottlarson1548 4 місяці тому
@@flatfingertuning727 You mean have the 6850 registers filling up an entire 8K block of memory space? 😬
@flatfingertuning727 4 місяці тому
@@scottlarson1548 The second 74LS138 would partition an 8K chunk of address space into eight 1K chunks, one for each I/O device. That would reduce the amount of directly-addressable memory from 64K to 56K, but despite some people's excessive aversion to bank switching, having 64K of linear RAM address space is for many tasks not the most useful means for a system to support 64K of RAM.
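The two-decoder arrangement described here can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the top 8K block (0xE000-0xFFFF) is the one handed to the second 74LS138; the comment does not say which block is used, and the function names are made up.

```python
IO_BLOCK = 7  # assumption: the last 8K block is reserved for I/O devices


def ls138(select, enabled=True):
    """Model a 3-to-8 decoder: index of the active (low) output, or None."""
    return (select & 7) if enabled else None


def chip_select(address):
    """Return ("mem", block) or ("io", device) for a 16-bit address."""
    block = ls138(address >> 13)                 # A15..A13 pick one of eight 8K blocks
    io_enabled = block == IO_BLOCK               # that output enables the second decoder
    device = ls138(address >> 10, enabled=io_enabled)  # A12..A10 pick a 1K chunk
    if device is None:
        return ("mem", block)
    return ("io", device)


if __name__ == "__main__":
    for addr in (0x0000, 0xDFFF, 0xE000, 0xE400, 0xFFFF):
        print(hex(addr), chip_select(addr))
```

Seven 8K blocks (56K) remain for RAM and ROM, and each of the eight devices gets a 1K window, in which a two-register chip like the 6850 is mirrored many times over.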
@stinchjack 4 місяці тому ⁺³
Yes Zilog could have picked a better mnemonic than "out (C),r". But I believe on the Intel 8080 "push b" was used instead of "push bc", so Zilog did improve some of the mnemonics. A lot of the rest of the complaints are Amstrad-specific.
More interesting/annoying for me is that the Z80 adds an extra wait state to I/O cycles compared with RAM cycles. Devices I have connected to the Z80 either don't need it at all, or need much longer delays than 1 cycle.
I am glad for the Z80's separate I/O space; it means that I/O selection can be done easily with a 74LS138 (address decoder chip).
@taxessux 4 місяці тому
The wait states were necessary for many parts in the day. Memory speed was all important.
@mikafoxx2717 4 місяці тому
Wasn't the wait state arbitrarily long, depending on when the ready was received? Checking would at least eat a cycle, though, even if it's instant.
@stinchjack 4 місяці тому
the Z80 doesnt have ready pin ...
@mikafoxx2717 4 місяці тому
@@stinchjack Ah, must be thinking of another processor
@stinchjack 4 місяці тому
@@mikafoxx2717 I have an 8253, an 8251, SAA1099, 58167 RTC , HT6542, and YM3812 connected to my project fine at 6MHz without needing additional wait states. LCD screen will only go as far as 1.5 Mhz tho.
Im not sure which chips would have needed extra wait states at 3.5MHz/4Mhz ?
@rebeccaabraham8652 3 місяці тому
Oh gods - this is a trip down memory lane! Used to work with the Hitachi 64180 - an enhanced Z80 - in assembly language - writing disk utilities…. Great fun - even if the company wasn’t up to much - and I then moved to GEC and started playing with unix workstations; but I’ll always have fond memories of the Z80 days!
@tmbarral664 4 місяці тому
Brilliant explanation, full of details ! Kudos!
@lister_of_smeg6545 4 місяці тому ⁺²
FD 70 xx is LD (IY+x), B
FD 07 xx will do RLCA, followed by whatever instruction xx is.
@NoelsRetroLab 4 місяці тому
Oh crap, did I reverse the opcode? Oops! You get what I meant though.
@lister_of_smeg6545 4 місяці тому
@@NoelsRetroLab Yep :) It's quite clever how they've chosen instructions that will put a particular register on the data bus, which can be snooped and read as a parameter by the Dandanator.
@fr_schmidlin 4 місяці тому ⁺⁶
Holy mother of cr*p! This video is not about the madness of the Z80 I/O, it's about the madness of the CPC I/O! And I thought the SMS already had some very unwise shortcuts. 😱
And decoding I/O like "special" opcodes on instruction fetch states is not only an immense detour to workaround a nearsighted architecture decision, but it's very costly hardware-wise. To the point of being insanely expensive if it was to be implemented in the 80s. Much more than the damn single 74LS138 that Amstrad omitted from the design, and could have avoided this whole mess.
At times like this it becomes clear how the MSX was a real computer architecture, instead of just a bunch of hackily wired together chips. And how clever, flexible and expansible its architecture was.
@Foersom_ 4 місяці тому ⁺¹
Well said.
@MarkOfBitcoin 4 місяці тому ⁺¹
That is the deepest dive on one instruction I’ve ever seen! Well done 😃
@abyssal-space65 4 місяці тому ⁺³
Running a Z80 on a breadboard at 10 MHz without issues, even doing hardware serial through a Z80 SIO chip... otherwise - very nice overview of IORQ :D
@NoelsRetroLab 4 місяці тому
That's surprising. I never tried, but I would have thought that it would have all sorts of interference at that rate. Heck, look at my IORQ signal. It looks horrible already! :-)
@herrbonk3635 4 місяці тому ⁺¹
Me too, although partly on veroboard, but using a vanilla 4 MHz rated Z80 (some manufacturers' "4 MHz" Z80s work at that speed, some don't). The same design was stable at 12 MHz too, when using a 6 MHz rated Z80. (Not really at 16 MHz though, that one was sensitive to glitches.) It very much comes down to using fast enough RAM and ROM!
@Leahi84 4 місяці тому ⁺²
Happy new year! Great to see a new video.
@NoelsRetroLab 4 місяці тому ⁺¹
Thanks! Happy new year too!
@john2001plus 3 місяці тому
I have programmed Z80 in 4 different decades. First on a TRS80, and finally on Gameboy Color. I didn't know any of this stuff. Thank you.
@chainq68k 4 місяці тому ⁺⁶
I'm not a Z80 guy, but I know the I/O space from old school, pre-protected mode x86 programming, and this video triggered my PTSD. ... And made me remember why I'm a Motorola 68k fan. I mean, all retro computers are fun in their own way, but in some cases, their creators were so busy to determine if they could, they did not stop for a moment to think if they should. :) Still, we love them for their faults too.
@andrewclegg9501 4 місяці тому ⁺²
I’ve always found z80 and x86 clunky, probably as i started with 6502 then 68k
@fnjesusfreak 4 місяці тому ⁺²
x86 has common origins with Z80 - that's why their ASM look so similar.
@ArneChristianRosenfeldt 4 місяці тому ⁺²
@@andrewclegg9501 I hate this overflow flag pin hack on 6502. And IO range in zero page on later derivatives. Give me a 16 bit z address register instead to look for tape data at the normal location!
@herrbonk3635 4 місяці тому ⁺³
I don't understand your point really. What's wrong with a separate I/O-address space? To me, it feels pretty natural. And if you dislike complexity I cannot really see how you can be a 68000 fan... 😉
@melkiorwiseman5234 4 місяці тому ⁺¹
On the subject of protected mode, it occurred to me a year or two back that you could implement a "protected" mode on almost any CPU (but in particular on the Z80) by using bank switched RAM and some hardware to detect which banks were switched in and what addresses and ports were being accessed by what instructions and at what memory locations.
Specifically, you could implement a NMI to force the program to return to the OS if the running program attempted to directly access ports, memory bank switching to any bank not assigned to it by the OS, or a direct call to an OS routine without going through the correct call address. Access to all of the above would have to be done by the program setting up data in registers and then jumping to a particular address (a-la CP/M) in order to perform the function via the OS.
I wonder how often that was actually done IRL and how effective it was?
@adilsongoliveira 4 місяці тому ⁺¹
Yeah, new Noel Labs video. Great way to close my Friday :)
@ojonasar 4 місяці тому ⁺¹
Many many years ago my older brother and I built a Z80 system for a Geiger counter logging device that output to a thermal printer. It used a single EPROM, 1 I/O input port and 1 I/O output port - no RAM as there we sufficient registers to not need it, even using one of them to substitute for the stack.
@mogwaay 4 місяці тому ⁺²
Great video Noel, always here for a good nerdy deep dive! I feel like the 8088 does something similar with the high address lines with its I/O OUT command, hmm, might look into that some time... Cheers!
@timhill9039 3 місяці тому
Another peculiarity of the Z80 is the added IX and IY registers. The Intel 8080 (which the Z80 was designed to be a superset of) had various registers (BC, DE, HL) that could be used as 16-bit registers, but each could also be used as two 8-bit registers (B,C,D,E,H,L). However, the new IX and IY registers could only be 16 bits, not 2x8 bits. In most cases, anywhere in the Z80 instruction set you could use the HL register, you could instead use the IX and IY registers (very useful). It didn't take too long to notice that the opcodes for these IX and IY instructions were the SAME as the equivalent HL instruction, but with one of two opcode prefix bytes that changed the meaning of the next opcode to use either IX or IY instead. Armed with that, hackers (like me) experimented and discovered that if you added those opcode prefixes to instructions that accessed either the 8-bit H or L registers, the CPU would access the upper or lower 8 bits of the IX or IY registers. Again, very useful, and to this day I don't know why Zilog didn't document this feature (no doubt the instruction set designers didn't realize that this was how the chip design ended up operating).
@klausmoritzpeitzsch690 4 місяці тому ⁺¹
Thx for your great content! I was already lost after a couple of minutes since I am still in the breadboard phase of my Z80 build and am still busy understanding the architecture as such. Therefore, good to know about the I/O tricks.
@ingmarm8858 4 місяці тому ⁺²
I love LDIR, have since the 1970's 🙂
@the-pink-hacker 4 місяці тому
This makes me so thankful that the TI-84 (EZ80) doesn't use the IN or OUT instructions at all. Good ol' memory mapped hardware.
@cthutu 3 місяці тому
If I recall, IN A,(nn) and OUT (nn),A instructions put the A register on the upper 8 bits along with nn for lower 8 bits.
@taxessux 4 місяці тому
You also need to take into account that I/O instructions made it into the x86 realm, so far that I/O cycles were a special cycle on the PCI bus. The limited amount of I/O addresses were a constant annoyance, mostly because a large swath was generally taken out of each page to account for aliasing of old ISA cards. The ability to have so many devices on the PCIe bus would cause legacy devices to run out of allocated I/O areas. I think that happened somewhere around 10 ethernet controllers.
Fortunately, when NOT using legacy access modes, we were able to scrounge up more I/O ranges. It was amazing to me how long the I/O ranges lasted in x86 CPUs. As a BIOS programmer, it was hell.
@boptillyouflop 4 місяці тому
The fact that the PC survived so many massive changes (protected mode, 32bits, out of order, 64bits, PCI, GUI, the insane variety in SVGA cards and sound cards, 3d acceleration, multi-core, VGA, paging) is definitely some kind of miracle.
4 місяці тому
Sinclair did the same. There are ports actively being used in ZX Spectrum 128 and Amstrad produced Spectrums as well. Port #1FFD (+2A, +3x mem ctl), #7FFD (128 mem ctl), #BFFD (AY port), #FFFD (AY port).
Another usable method is memory-mapped IO. For example, on the ZX Spectrum there's a ROM in the first 16K of address space. By pulling up the ROMCS signal (it has the required resistor already), the ROM chip will not get selected and the whole 0-$4000 range is now available for IO. Or external ROM/RAM/whatever. It can even be triggered by a memory write to a location where the ROM resides. This is how most of the extensions for the Spectrum work.
@etchedpixels 4 місяці тому ⁺²
The out behaviour for A and the 256 port side also comes from the 8080 which the Z80 was trying to be some level of compatible with. This is also why the IN A and OUT A instructions don't affect flags but the (C) versions do.
Guess what the 8080A did when you did an OUT instruction ? It put the contents of A on both the upper and lower halves of the address bus. This is why the Z80 has that behaviour, to be compatible with hardware that relied upon this (eg by decoding some I/O off each half to the bus to avoid the usual problem with fan-out limits on the low bits)
The choice of having an actual I/O space with IN and OUT goes back to the 8008 and so presumably to the board of discrete logic in the terminal that it was supposed to replace.
@smf3472 4 місяці тому ⁺²
Close, 8080 OUT/IN instructions took a byte which is the 8 bit port and that gets put on both bits 0-7 and 8-15 of the address bus, OUT puts the A register on the data bus and IN sets the A register from the value put on the data bus by the hardware. Zilog versions of those instructions are different, the 8 bit port goes on bits 0-7 and the A register is put on bits 8-15, which is pretty useless for the OUT instruction as it's also the value that will be put on the data bus.
@PebblesChan 4 місяці тому ⁺²
The microbee computer also uses A[8:15] as supplementary data bus for 16-bit I/O instructions and the IN instructions for some output port & register functionality. 😊
@herrbonk3635 4 місяці тому
Yes, and many embedded systems too (including some of my own).
@Bunny99s 3 місяці тому
As far as I remember the classical gameboy also used a slightly modified Z80 which didn't use the in / out instructions at all. Everything in the gameboy was memory mapped. Input, synthesiser, graphics output and even the "network link". I once wrote a simple assembler / disassembler to read and write gameboy roms in Delphi back in the days. I toyed around a bit with a gameboy emulator but nothing serious. Though it was a fun time. Since I'm kind of a "data-messi" I'm sure I still have the assembler somewhere as well as the GB emulator (I think it was "rew") ^^.
@domramsey 4 місяці тому ⁺¹
Literally no idea what you just said. But I watched It all, of course.
@fluiditynz 4 місяці тому
I had an Amstrad CPC664 many years ago and rolled my own DIY expansion for it. Looking at the data book for the Z80, I expected to be able to use interrupt mode 3 with the most features (I connected a UART, which I wrote a 68HC11 cross assembler for to program 68HC11E2 chips, and inputs to scan a DIY mouse). Unfortunately, Amstrad chairman Alan Sugar had chosen interrupt mode 1? (it was a long time ago), which did not support real-time interrupts. As a consequence, I dropped down to polled encoder inputs and my mouse only worked very slowly. Early days!😆 I remember writing assembler routines accessed via Amstrad Locomotive BASIC and setting up BASIC-configurable screen colour redirects and an automatically cycling palette, then writing maths art 3D Z-scaled algorithms with some modulo cycling through the palette. So much fun, and my pot-head flatmates were awed by the psychedelic results! Writing the 68HC11 was a great experience. Took me a month of my spare time and I was so pleased the assembler mnemonics were not too CISC; Motorola's micros had elegant mnemonics.
@emesde 3 місяці тому
On MSX you have a choice: you can use I/O ports or memory-mapped I/O. It has a slot select mechanism where you can choose out of 16 slots for 4 memory locations (16KB pages). There is also some other I/O addressing in the standard which makes it possible to go beyond 256 I/O ports. All these things were standardized. I think that is probably why there are so many extensions for this system.
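As a rough illustration of the slot mechanism, the Python sketch below decodes the MSX primary slot select register (I/O port 0xA8), which holds two bits per 16KB page. It models only the four primary slots; the sub-slot expansion that brings the total to 16 is left out, and the function names are invented for the example.

```python
def page_slots(slot_register):
    """Return the primary slot (0-3) selected for pages 0..3.

    Bits 1-0 select the slot for page 0 (0x0000-0x3FFF), bits 3-2 for page 1,
    bits 5-4 for page 2 and bits 7-6 for page 3 (0xC000-0xFFFF).
    """
    return [(slot_register >> (2 * page)) & 0x03 for page in range(4)]


def slot_for_address(slot_register, address):
    """Primary slot that responds to a 16-bit memory address."""
    return page_slots(slot_register)[(address >> 14) & 0x03]


if __name__ == "__main__":
    print(page_slots(0b11100100))
```

Writing a new value to the register with a single OUT remaps which cartridge or internal slot appears in each 16KB page, which is how memory-mapped expansions can coexist with the port-based ones.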
@michaelcrisp562 3 місяці тому
Hi Noel, great trip down memory lane, thanks. I cut my teeth on Z80 assembly back in the day, just like you I used a switch to toggle the clock line. Pretty sure it worked without issue ie totally static design. I think the original Zilog data even mentioned this in the clock specifications dc to 2MHz, I remember being impressed by this. keep up the good work 😊
@liontuga155 4 місяці тому
Oh, the memories... Tried to make a 1 bit sound sampler of sorts with INIR back in the 80’s on my Spectrum. Good times! :D
@zxborg9681 4 місяці тому ⁺²
Very interesting analysis. I always treated IO as still fundamentally based on the 256 byte space of the 8080/8085, and saw the extra B register output on the address MSB as just a half-implemented idea for an abortive 16-bit IO space architecture idea. But the Amstrad explanation is interesting, how they made use of it after all. Thanks for the deep dive.
@WacKEDmaN 4 місяці тому
Excellent stuff Noel... now i understand why CPCs OUT instruction is handled different to the rest of them!.. and why it doesnt like the LDIR (and other loop) instructions in certain circumstances..
@gaku8108 4 місяці тому
In the case of OUTI, B is decremented first.
OUTI: B←B-1 (C)←(HL) HL←HL+1
INI: (HL)←(C) B←B-1 HL←HL+1
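The ordering matters because B is the upper half of the port address. Here is a small Python model of the two sequences exactly as listed above (flags are not modelled, and the function names and the dict-based memory are just conveniences for the sketch).

```python
def outi(b, c, hl, memory):
    """OUTI: B is decremented first, so the port sees the NEW B on A8..A15.

    Returns (port_address, data, new_b, new_hl).
    """
    data = memory[hl]
    b = (b - 1) & 0xFF
    address = (b << 8) | c
    hl = (hl + 1) & 0xFFFF
    return address, data, b, hl


def ini(b, c, hl, memory, port_read):
    """INI: the port is read with the OLD B on A8..A15, then B is decremented.

    port_read is a callable taking the 16-bit port address.
    Returns (port_address, new_b, new_hl).
    """
    address = (b << 8) | c
    memory[hl] = port_read(address) & 0xFF
    b = (b - 1) & 0xFF
    hl = (hl + 1) & 0xFFFF
    return address, b, hl


if __name__ == "__main__":
    mem = {0x8000: 0xAB}
    print([hex(v) for v in outi(0x10, 0xFE, 0x8000, mem)])
    print([hex(v) for v in ini(0x10, 0xFE, 0x9000, mem, lambda port: port >> 8)])
```

Starting from B = 0x10 and C = 0xFE, OUTI addresses port 0x0FFE while INI addresses 0x10FE, so hardware that decodes the upper byte sees different ports for the same register contents.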
@ncot_tech 4 місяці тому ⁺¹
OK this is comforting to know and that the documentation is randomly inconsistent, and that when I was doing some Z80 on my Agon Light I wasn't in fact going insane when things were acting strangely. Also I guess they called it OTIR to keep the mnemonic as four characters?
@michaelmoorrees3585 3 місяці тому
I spent the bulk of the 1980s using the Z80 almost exclusively, with a little deviation using the 8086. But I used mostly Intel peripheral chips (8251, 8253, 8255, & 8259) with that Z80. Yeah, a little extra TTL "glue" to mate the signaling. Into the 90s, I went to mostly x86 when a full processor was needed, and microcontrollers when it was simpler. Starting with the HC05, since it was a joy to code in assembly. Moved over to the AVR line, with a transient pass through a few 8051 projects.
Dollar-wise, the Z80 was still the biggest-selling processor well into the 1990s. This was when an individual Z80 sold for under a buck, compared to x86, including early Pentiums, selling in the $200 range each!
@bendertherobot910 4 months ago ⁺¹
Oh, sorry. I'm not sure about this, but reading new Zilog Z80 manuals I realized that these have a lot of mistakes. I prefer to read old manuals (like those stored on the Bitsavers website). By the way: Happy New Year, Noel!! Thanks for the awesome video (as always)!!!
@scsirob 4 months ago ⁺²
Great coverage of the subject, thanks! Using 'blank' or 'undocumented' opcodes is indeed a clever hack, but not without risk. There are Z-80 compatible CPUs that have additional opcodes in the unused space, such as the Hitachi HD64180. A program may auto-detect the use of that CPU by attempting an opcode sequence that behaves like you described on a genuine Z80, but has a different result on a Hitachi chip. Granted, the chances of running into such software on an Amstrad aren't that high ;)
@greenaum 4 months ago ⁺¹
The Z80 includes hardware to automatically refresh DRAM, where other CPUs needed extra hardware to do that. So it sends out the odd request to RAM that's not in the code it's running. If you were to put IO there, it might end up being triggered by those refresh cycles. Of course, a ROM chip won't mind an attempt to refresh it, it'll just do nothing. So it makes sense there's an extra IN / OUT instruction that just raises the right pins so some device can respond appropriately.
@ehsnils 4 months ago
I remember the fact that the Z80 couldn't be single stepped, which caused me some headaches in debugging once. Took a second or two before the processor went socks up and a restart was needed. I realized pretty soon what the problem was, but it still was a headache.
Since I started with the Z80 processor the I/O it has seems normal to me.
A more interesting aspect is to use "IN (BC),A" (sorry if I borked it a bit), but then you could use the instruction to create a matrix decoder of a keypad.
@Blitterbug 4 months ago
Interesting take on what seemed perfectly reasonable to me back in 1982! In fact I remember feeling that the 6502 was crippled by comparison, until I realised how much more work it could do per clock, and became somewhat of a 6502 convert. Z80 code still seems sensible, tbh.
@LarryRobinsonintothefog 3 months ago
Never used OUTI, but LDIR was my favorite, though I would have to PUSH and POP registers. Didn't get to memory-mapped ports till the 6809, as memory was gradually becoming more available. Used to have the red book on the Z-80.
@briannebeker2119 4 months ago ⁺¹
LD instructions always read and write data; it is just a matter of what the source and destination are. That is why there is not a separate read and write instruction.
The IO instructions are much more limited, as they were an add-on to the instruction set and don't need numerous addressing modes like the LD instruction, so they are targeted at what the function is. IO instructions were meant to use 8-bit addresses, but there are 16 address bits and they need to be set to something. Using B as the upper address means you can use 16-bit addresses for IO if you choose.
@mikeh_nz 4 months ago
Love it!
@ian_b 4 months ago
This has puzzled me since 1982 or so. Thanks for addressing it!
@jtsiomb 4 months ago ⁺¹
Using the high address byte during I/O is far from a unique oddity. The ZX Spectrum, which you mentioned as a counter-example, in fact does use the high byte as a key matrix row select. IN A,(C) with C being FEh will load the state of the row selected by B into A.
In fact if you think the CPC is mad, you'll find the Spectrum utterly insane, with its resistor-split bus, garbage on the data bus during interrupts forcing you to fill all 256 and A HALF slots of the vector table with the same address, and that address has to have both its bytes identical... the ULA using a single bit to decode accesses, thus grabbing all the even I/O address space.... it's so bad ... but it was my first computer....
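A rough Python sketch of the two Spectrum decoding quirks mentioned above: the ULA responding to any port with address bit 0 low, and the upper address byte selecting a keyboard half-row by holding one of A8-A15 low. The helper names and the 0-7 row numbering are assumptions made for illustration, not anything taken from Sinclair documentation.

```python
def ula_selected(port):
    """The ULA decodes only A0, so every even port address selects it."""
    return (port & 1) == 0


def keyboard_port(row):
    """16-bit port for keyboard half-row `row` (0-7, numbering assumed here).

    The low byte is 0xFE; the high byte has exactly one bit cleared,
    which is the address line that selects the half-row.
    """
    if not 0 <= row <= 7:
        raise ValueError("row must be in 0..7")
    high = 0xFF & ~(1 << row)        # one of A8-A15 driven low
    return (high << 8) | 0xFE


# The value loaded into B before IN A,(C) would be the high byte of each port.
for row in range(8):
    print(row, hex(keyboard_port(row)))
```

Because only A0 is decoded, every one of these keyboard ports is also, by construction, an even address that selects the ULA.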
@NoelsRetroLab 4 months ago
Maybe that will have to be part 2 😃
@timhill9039 3 months ago
Welcome to the world of backward compatibility. The Z80 was designed to be instruction-set compatible with the Intel 8080 (a strict superset). Intel had only designed the I/O space of the 8080 with an 8-bit address space (256 I/O ports), and so the 8080 instructions used only an 8-bit port number to generate the I/O port address. By the time the Z80 came along it was clear this was a mistake, as more I/O space was needed. Zilog's rather clever solution was to simply allow the B register to supply the extra 8 bits, expanding the I/O space to 64K without having to add any new instructions. This means the Z80 worked fine in existing 8080 designs (since they ignored the top 8 bits entirely), but new designs that were Z80-aware could use the expanded 16-bit I/O space. The only oddity was the retention of the existing assembler syntax for the I/O instructions. In fact, very early Z80 documentation DIDN'T mention that the B register appeared on the address bus, and I suspect that this was not originally intentional; it was just cheaper to let the whole BC register appear on the bus (as it did with other instructions). Only later did Zilog document this as a "feature", once the hacker community had noticed (and used) this behavior in the wild.
@gasparinizuzzurro6306 20 days ago
Noel, there is another thing you could explain: what does the OUT (0xF0),A instruction do? OK, it places 0xF0 on the lower A0-A7 bits, but guess which register's content goes to A8-A15?
@antoninkolouch5161 4 months ago
The standard Sinclair Interface-1 was also using M1 to capture rst #8 to swap shadow ROM with a different code to handle it. The rst #8 was called by the standard BASIC interpreter in any case of syntax error so it allows easy expansion of commands.
@cbmeeks 4 months ago ⁺⁴
OK, now I really know why I prefer the 6502. I designed a 6502 SBC a few years ago and it was pretty simple. In the 6502, you can get 32K of RAM and 32K of ROM with only 256 bytes of IO with THREE TTL chips. So you don't have to waste a lot of RAM for IO.
@retrozmachine1189 4 months ago ⁺²
Neither do you need to waste a lot of memory space for IO if you decide this is the IO scheme you want on a Z80 either rather than port based. Guess what, you can do the decode with a low number of TTLs too. Not saying your like of the 6502 is flawed of course, but your errr logic, is.
@fr_schmidlin 4 months ago ⁺²
Please (*1) don't mistake the CPC nonsense with the Z80 architecture. If you want something more sensible, take a look at the SMS architecture or, even better, the MSX architecture. (The SMS still had some cheap shortcuts, since it was a legacy from earlier cheaper designs of the SG-1000)
*1: Honest remark, no offense meant
@cbmeeks 4 months ago ⁺²
@@retrozmachine1189 not sure what you meant by my logic being flawed. I wasn't putting the Z80 architecture down. I was simply stating you don't have to waste a lot of memory space for IO on a 6502 design.
@cbmeeks 4 months ago ⁺¹
@@fr_schmidlin Yeah, I figured this was more a strangeness of the CPC. No offense taken.
@smf3472 4 months ago
@@gppsoftware The 6502 was designed to compete with the Intel 4004 or TTL, while the Z80 was an 8080 superset. So I think it's fair to say it was more primitive; however, in many ways the design makes for much faster code. The 6502 zero page addressing allows for 256 8-bit registers, or 128 16-bit registers. You could argue that they're not really registers, but the equivalent Z80 code is often slower. I've written an 8080 interpreter for the 6502 and it takes an average of 14 6502 clocks per 8080 clock. I have been looking at going the other way and it is nowhere near as efficient.
@Frisky0563 4 months ago
I would want a PCB. What a mess! I learned about the Z80 in school and switched to the MC6800. I enjoyed your video very much 🎈
@jeffburrell7648 4 months ago ⁺⁴
Annnnnd it just gets crazier. A standard OUT (xx), A or IN A, (xx) instruction puts the accumulator on the upper 8 bits of the address bus. I think all of this is the result of trying to minimize transistor count.
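Put differently, both forms of OUT drive all 16 address lines, and only the source of the upper byte differs. A small Python sketch of that idea follows; the function names are made up for illustration, and this models only the address that appears on the bus, not the bus timing.

```python
def port_for_out_n_a(n, a):
    """OUT (n),A: immediate n on A0-A7, accumulator A on A8-A15."""
    return ((a & 0xFF) << 8) | (n & 0xFF)


def port_for_out_c_r(b, c):
    """OUT (C),r: register C on A0-A7, register B on A8-A15."""
    return ((b & 0xFF) << 8) | (c & 0xFF)


# OUT (0xF0),A with A = 0x12 would address port 0x12F0 on a 16-bit decoder.
print(hex(port_for_out_n_a(0xF0, 0x12)))
# OUT (C),A with BC = 0x7F00 would address port 0x7F00.
print(hex(port_for_out_c_r(0x7F, 0x00)))
```

On hardware that decodes only A0-A7, the upper byte is simply ignored, which is why the same code behaves like plain 8-bit I/O there.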
@damouze 4 months ago ⁺¹
That would not surprise me at all. I wonder if there is something on it on Ken Shirriff's blog.
@PebblesChan 4 months ago ⁺¹
It’s not crazy, seems deliberate & is a very useful feature used often with the microbee computer. It’s a way to add additional peripherals by piggybacking on existing decoding logic. 😊
@kensmith5694 4 months ago
IIRC, the ZX80 ran a line to the expansion connector that allowed you to prevent the address decode on the unit from doing its thing. This cost them very little extra hardware but also allowed some very creative things to be done with a ZX80.
@r00tyschannel52 4 months ago
Having very recently written a Z80 emulator (not deliberately, but more as a proof of concept for a framework for designing CPU emulation), I found that documentation is an interesting thing. But the actual Zilog manual is verbatim the same as the one you showed, and this was clear to me: for OUT (C) you just slap BC onto the address lines, and for OUT (n) you put the A register onto the high-order bits.
But! Find some consistent documentation for how the Half Carry flag is handled. Especially when it comes to the ADC/SBC instructions! That was a rollercoaster ride. For anyone embarking down this route it's actually not that bad.
For 8-bit arithmetic operations, take the first operand, put it into 16-bit storage, and AND it with 0xF. Then add (or subtract) every other value, one at a time, each ANDed with 0xF, INCLUDING THE 1 for a set carry on the ADC/SBC instructions. So if you have 0x1F + 0x2B + 0x49 you would add 0xF + 0xB + 0x9 = 0x23. At the end, if the result is above 0xF, set half carry. For 16-bit operations you do pretty much the same, except you use & 0xFFF on each value, and at the end, if the value is > 0xFFF, set half carry.
Finding a single place with this actual information was a nightmare.
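The masking recipe above can be written down as a few Python helpers. This is a sketch of that recipe rather than code from any particular emulator: the function names are invented here, and treating subtraction as "the low-nibble difference goes negative" is my reading of the "(or subtract)" part.

```python
def half_carry_add8(a, b, carry_in=0):
    """H after 8-bit ADD/ADC: carry out of bit 3 of the low-nibble sum."""
    return ((a & 0xF) + (b & 0xF) + carry_in) > 0xF


def half_carry_sub8(a, b, carry_in=0):
    """H after 8-bit SUB/SBC: borrow into bit 3 (assumed reading of 'or subtract')."""
    return ((a & 0xF) - (b & 0xF) - carry_in) < 0


def half_carry_add16(a, b, carry_in=0):
    """H after 16-bit ADD/ADC: same idea, masked with 0xFFF (carry out of bit 11)."""
    return ((a & 0xFFF) + (b & 0xFFF) + carry_in) > 0xFFF


# 0x0E + 0x01 does not carry out of the low nibble, but with carry-in set it does.
print(half_carry_add8(0x0E, 0x01), half_carry_add8(0x0E, 0x01, carry_in=1))
```

Folding the carry-in into the masked sum, as the comment stresses, is the detail that is easiest to miss: masking the operands alone and ignoring the incoming carry gives the wrong flag in the second case printed above.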
Also, I want to say the ZX Spectrum does NOT only use the low-order values. While it's true the ULA is addressed by 0xFE, it does look at the high-order bits when reading the keyboard. The high bits select the 5 keys you want to scan right now.
@andyc8257 4 months ago
It's pretty clear from the design that Zilog originally intended only 8-bit IO addressing, probably assuming 256 output ports was plenty. B appearing on the bus was most likely a glitch due to the internal design, in the same way A appears on the bus when you do OUT (n), A.
Lots of people saw the advantage of a 16-bit IO space, since you could cheap out on hardware address decoding. The Spectrum, for example, also does the same so Amstrad were far from unique in this regard. I suspect that's also why Zilog eventually documented that behaviour.
@jesusarias4320 4 months ago
Nice video! The Z80 is probably the trickiest CPU ever made and I/O is only one of its many weirdnesses. BTW, the ZX Spectrum also uses an IN A,(C) instruction for keyboard scanning. I think that's the only case where the 8 high lines of the address bus are used on a ZX peripheral. I spent some time recreating the ZX and the Amstrad CPC in an FPGA (search for ZX Spectrum in a FPGA) and that also means I had to study these computers in depth, concluding that, in spite of our fond memories of them, if their designers were aeronautical engineers I would never travel by plane ;)
@markevans2294 4 months ago
That keyboard scanning method originated with the ZX80. The Spectrum BASIC also has the commands IN and OUT.