Wednesday, January 30, 2013

Phoenix: Graphics


The video functionality was the most interesting and most rewarding aspect of the Phoenix hardware design.

I decided to use composite video output ("analog TV") for Phoenix, as this is what most Z80-based computers tended to use, and it's fairly easy to generate using a microcontroller (at least for monochrome).

Some articles I used as inspiration:


A composite video signal contains horizontal sync, vertical sync, brightness and color information. If we just look at monochrome (no colors or even greyscales) the signal has 3 voltage levels: 0.3 volts for black, 1.0 volts for white, and 0.0 volts for synchronization pulses (both horizontal and vertical).  It's easy to generate these voltage levels using 2 digital output pins and a few diodes and resistors, as described in the articles above.

I decided to use an ATmega328p as the microcontroller for the video signal generation. It may seem like cheating to use a modern microcontroller in a retrocomputing project; on the other hand many Z80-based computers used custom ICs for video. Programming a microcontroller is just a different way of making a "custom IC".

I started with generating the sync pulses. This is basically just an exercise in careful timing. Rather than using the AVR timers, I wrote this in AVR assembler as a single loop with pin toggles and delay loops. Getting the timing right means counting the clock cycles of every instruction, which is relatively easy (but very boring). The result on the logic analyzer:


On the left are a few normal horizontal sync pulses (the last few scan lines of a frame), followed by the series of pulses making up a vertical sync. Each of these is half a scan line long: 6 half-lines of short pulses, followed by 6 half-lines of long pulses, and then 6 short ones again.

Now we need the actual image. Since I grew up with a ZX Spectrum, I decided on a resolution of 256x192 pixels. For 256x192 monochrome pixels a framebuffer of 256/8*192 = 6144 bytes (6 KB) is needed. Since Phoenix already has 32 KB of RAM, I can just set aside 6 KB for the framebuffer, although simultaneous access from the video circuitry and the CPU needs to be figured out (more on that below).

An ATmega328p is not fast enough to read from memory and produce the brightness signal. Instead, I used a slightly different approach: the ATmega will just put addresses on the bus, and since each byte in the frame buffer contains 8 pixels, a 74HC165 8-bit parallel-in-serial-out shift register is used to output the pixels one at a time:



Each group of 8 pixels is basically generated as follows (assuming the ATmega328p has full access to the memory):

  • place correct framebuffer address on the address bus
  • load byte from data bus into shift register (this outputs the first bit in the byte)
  • shift the bits in the shift register repeatedly to display the remaining 7 pixels

To get uniform pixels, the load and shift pulses must happen exactly evenly spaced at the pixel clock frequency.

To get roughly square pixels the pixel clock needs to be about 6 MHz. Toggling a pin on and off requires 2 clock cycles on the AVR. However, we also need to increment the address and put it on the bus every 8 pixels. To keep an even pixel size, we therefore need 3 clock cycles per pixel to do all these operations. At a 6 MHz pixel clock, this means the ATmega needs to run at 18 MHz, which is within its specifications.


The logic analyzer shows this pattern: a load pulse every eight pixels, and shift pulses for the 7 remaining pixels. The pixel data from the shift register is updated at a 6 MHz rate, and the address bus is updated every 8 pixels.

The only remaining issue is making sure the Z80 CPU and the ATmega play nice together for access to the RAM. The solution I chose is not very sophisticated: the Z80 has a "bus request" signal which allows an external device (such as the ATmega in this case) to request access to the address and data bus. After asserting this signal, it may take a few clock cycles for the Z80 to grant access, which it indicates using the "bus acknowledgement" signal.

Given the tight pixel timing requirements, we have to request the bus for the entire period of the 256 horizontal pixels. This means that while pixel data is being sent to the screen the Z80 is effectively halted. This works out to approximately 50% of the time, so even though the Z80 is clocked at 6 MHz, it performs like a 3 MHz one.

The final bit of the video design involves taking control of the bus as soon as the Z80 grants it. This is implemented using a 74HC541 buffer whose output is gated by the Z80 bus acknowledgement signal. As soon as the Z80 acknowledges the bus request, the 74HC541 puts the MREQ, IORQ, RD, WR, A15 and A14 signals in a defined state.

The AVR assembler code for this project can be found on GitHub.

And finally two pictures:
Early prototype of the video circuit: the ATmega328p and 74HC165 wired together. There's no memory, instead I'm using the address lines as the input to the shift register.
Final version of Phoenix showing a test pattern on an Adafruit 4" NTSC monitor.

Monday, January 21, 2013

Phoenix: Memory (part 1)

Z80-based computers had memory sizes varying from 1 KB (e.g. ZX81) to 128 KB (e.g. ZX Spectrum 128). Because the Z80 only has a 16-bit address bus, addressing more than 64 KB of memory requires bank switching.

Even for relatively small memory sizes, multiple memory chips were needed. For example, 16 KB of memory could consist of 8 separate 16 kilobit memory chips, each hooked up to a data line of the CPU. For Phoenix, the situation is easier: we can just use a single 32 KB SRAM chip.

This SRAM chip has 15 address lines, 8 data lines and 3 control lines:
  • CE (Chip Enable): if this is inactive (high) the other control signals (OE and WE) are ignored.
  • OE (Output Enable): setting this active (low) does a memory read and outputs the byte on the bus.
  • WE (Write Enable): setting this active (low) does a memory write.
These signals have the exact same semantics as the Z80 MREQ (memory request), RD (read) and WR (write) signals, so we could hook up the memory directly to the CPU without any "glue" logic.

A computer with just RAM isn't very useful, so we either need to add some (programmable) ROM or find a way to fill the RAM before the CPU starts up. Again, inspired by Veronica, I chose the latter way (for now at least: the final Phoenix design does have an EEPROM).

To pre-fill the memory, I used an Atmel ATmega324 microcontroller. If you're familiar with Arduino, the ATmega324 is very similar to the ATmega328 used in the Arduino, except it has more I/O pins. The ATmega324 has a total of 32 I/O pins, more than enough to drive the 15 address lines, 8 data lines and 3 control lines.

However, the ATmega and Z80 can't both drive the memory at the same time, so we do need some glue logic after all:

On the left are the signals from the ATmega (PB0 and PB1) and from the Z80 (MREQ and WR). On the right are the signals to the memory (CE and WE).

While pre-filling the memory, the ATmega holds pin PB0 low. This has two effects. It enables the SRAM chip through the AND gate (keep in mind that the signals are active-low). It also keeps the Z80 in the reset state, so the Z80 leaves the address and data buses in a high-impedance state and they can be controlled by the ATmega.

It then puts an address on PORTC and PORTD, data on PORTA, and briefly sets PB1 low to write this to the SRAM. After cycling through the entire memory contents this way, it then sets PORTA, PORTC and PORTD to high-impedance (so the Z80 can control them again), and sets PB0 and PB1 to high. Now the Z80 starts up and MREQ and WR are propagated through the AND gates to the SRAM.

We can reprogram the ATmega while it's connected to the memory and CPU: this uses pins PB5, PB6 and PB7, which are not connected to either the Z80 or the memory. During programming all I/O ports are in a high-impedance state; the pull-down resistor ensures the Z80 is held in reset during this time as well.

To verify that everything works, I uploaded a trivial Z80 program (only 4 bytes). It doesn't do anything useful: it just tries to write a dummy value to a nonexistent output port. At least we should be able to see the IORQ (I/O request) signal being asserted.

loop: OUT (0xFE), A
      JR loop

The full ATmega source code to pre-fill the memory:

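#define F_CPU 8000000UL

#include <avr/io.h>
#include <avr/pgmspace.h>
#include <avr/power.h>
#include <avr/sleep.h>
#include <util/delay.h>

#define CE PB0  // SRAM chip enable, also holds the Z80 in reset (active-low).
#define WE PB1  // SRAM write enable (active-low).

// The Z80 program to load. PROGMEM data must be const with current avr-gcc.
const uint8_t data[] PROGMEM = {
              // loop:
  0xd3, 0xfe, //   OUT (0xFE), A
  0x18, 0xfc, //   JR loop
};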

int main(void) {
  DDRB = _BV(CE) | _BV(WE);  // CE and WE are outputs.
  PORTB = _BV(WE);           // CE low (chip enabled), WE high (not writing).

  // CE is asserted, Z80 is in reset, we can take the bus.
  DDRA = 0xFF;
  DDRC = 0xFF;
  DDRD = 0xFF;

  _delay_ms(20);  // Wait for bus to stabilize.

  for (uint16_t address = 0; address < sizeof(data); ++address) {
    // Place address and data on the bus.
    PORTC = address >> 8;
    PORTD = address;
    PORTA = pgm_read_byte(data + address);
    // Toggle WE.
    PORTB &= ~_BV(WE);
    PORTB |= _BV(WE);
  }

  // All done - release the bus.
  PORTA = 0;
  PORTC = 0;
  PORTD = 0;
  DDRA = 0;
  DDRC = 0;
  DDRD = 0;

  // Take CE high, Z80 can take over.
  PORTB |= _BV(CE);

  // Go to sleep, our work here is done.
  power_all_disable();
  set_sleep_mode(SLEEP_MODE_PWR_DOWN);
  for (;;) {
    sleep_enable();
    sleep_bod_disable();
    sleep_cpu();
  }
}

And here is how this looks on the logic analyzer:


The first 3 signals are observed from the point of view of the SRAM. At the start the CE signal is kept low while the 4 write pulses load the program into the memory. Then execution starts as normal and we see the pattern where opcodes and operands are being fetched from memory. The OUT instruction is easily recognizable by IORQ.

And another breadboard picture:


In the front is the Z80, with the 4 MHz oscillator on its left and the 74HC08 AND chip on its right. In the second row, on the left is the big ATmega324, and on the right is the 32 KB SRAM chip. Note that at this point I had wired up only A0 and A1, which is enough for the trivial 4-byte program shown above, but anything more serious obviously requires more address lines.

Phoenix is now a working computer able to execute programs, but there's no way to interact with it yet (and hooking up logic analyzer probes, fun as it may be, doesn't really count).

Saturday, January 19, 2013

Phoenix: CPU


8-bit home computers in the early '80s were predominantly based around the 6502 and Z80 processors. Since I grew up with a ZX Spectrum, I decided to base Phoenix around the Z80 (unlike Veronica, which is 6502-based).

The Z80 is an 8-bit processor, with an 8-bit data bus and 16-bit address bus. The latter means it can address up to 64 KB of memory. CMOS Z80's can run at up to 10 MHz. As a comparison, the ZX Spectrum ran at 3.5 MHz.

Apparently, brand new Z80's are still being made even today, more than 30 years after the chip was first introduced. The one I bought from Digi-Key had a date code of 1229, which if I read it right means it was produced in July of 2012.

To get a Z80 up and running, you need a clock signal (not just a crystal). The final Phoenix design uses a 6 MHz oscillator chip, but for this post I used an oscillator made out of some components I had at hand: a 16 MHz ceramic resonator, a 74HC00 NAND IC, some resistors and a 74HC74 dual flip-flop. The resonator, resistors and 74HC00 together produce a 16 MHz clock signal, which is then halved twice by the flip-flops down to 4 MHz.

The Z80 needs a reset pulse to properly initialize itself. This can be generated easily using a capacitor to ground and a resistor to 5 V.


In addition to the clock and reset signals, all that's needed is to keep the control inputs (interrupt, bus request, etc.) in a defined state. Note that these are active-low, so they need to be connected to 5 V to keep them inactive.

Since the data bus is bidirectional, I hooked up the data pins to ground using 10 kΩ resistors. This way there won't be a short if the Z80 issues a write, and a read cycle will produce all zeroes. Conveniently the 0x00 opcode is NOP, so if all works well the Z80 would just execute a steady stream of NOPs, as if it were connected to 64 KB worth of memory filled with zeroes.

Using my Saleae Logic analyzer, we can observe this in practice:


This looks exactly like the timing diagram in the Z80 data sheet. The first line is the 4 MHz clock signal, and we can see that the other signals repeat every 4 clock cycles. Keep in mind that all these are active-low. The MREQ (memory request) is pulsed twice during each instruction, first together with RD and M1 to read an opcode from memory, and then together with RFSH as a DRAM refresh signal.

Let's try a more interesting instruction. Changing pin D4 from 0 to 1 puts 0x10 on the data bus, which is the DJNZ instruction (decrement and jump if not zero):


Here the pattern takes a whopping 13 clock cycles. It starts out like the last one: first an opcode read (MREQ, M1, RD), followed by a memory refresh (MREQ, RFSH). Then there is another memory read to get the jump displacement (MREQ and RD, but without M1 this time). And then another 5 cycles where nothing happens on the bus and the Z80 executes the jump internally.

As you can see, the Z80 is a relatively slow processor, requiring 4 clock cycles for even the simplest instructions, in contrast to the 6502, whose shortest instructions take only 2 cycles. For comparison, the ZX Spectrum ran its Z80 at 3.5 MHz, whereas the Commodore 64 had a 1 MHz 6510.

Here's a picture of the Z80 on my breadboard hooked up as described above. On the left are the two ICs to generate the 4 MHz clock signal and on the right is the big Z80 chip with the logic analyzer probes still attached to it.


Of course, executing the same instruction over and over is not very exciting, so in the next post we'll add some memory.

Friday, January 18, 2013

Phoenix

Inspired by Quinn Dunki's Veronica project, I've been working on and off for the past few months on my own design for a "retro" computer: a computer similar to the home computers of the early '80s.

Here is the result, code named Phoenix:


  • CPU: Zilog Z80 at 6 MHz
  • Memory: 32 KB RAM and 8 KB EEPROM
  • Input: PS/2 compatible keyboard
  • Output: composite video, 256 x 192 monochrome pixels
Over the next few days, I'll post a few articles describing the design and build.