
NET'97-C64 NETWORK SOFTWARE by Gordon Axmann

NET'97 C64 Userport Network (historical reconstruction notes)

This document summarises the reconstructed hardware and software behaviour of a C64 userport network with at most four nodes, as used by “Empire C64”. The original documentation is lost; these notes were reconstructed from memory and from reading the assembler code.

The software uses the serial shift register of the CIA 6526 in the Commodore 64 and achieves approx. 8 kB/s (Timer A set to $07). Significantly higher transfer rates were possible with only two C64s. (For comparison: the standard VC1541 transfers at 0.6 kB/s.)

Hardware overview

The network is a 4-wire shared bus:

  1. GND
  2. CNT2 (CIA2 serial clock) -> userport pin 6
  3. SP2 (CIA2 serial data) -> userport pin 7
  4. PB7 (CIA2 port B bit 7) -> userport pin L (BUSY line, active-low)

Node IDs are configured locally on each plug via two input bits:

  • PB0 (userport pin C) = ID bit 0
  • PB1 (userport pin D) = ID bit 1

Two bits allow 4 IDs (ID#1..ID#4). The plug labels match the software’s internal numbering.

Physical order on the cable (important: ID#2 must be at the end of the cable):

  • ID#1, ID#4, ID#3, ID#2

Cable length was approximately 3 metres. The cable was physically daisy-chained (two or four plugs on one cable) but electrically a single bus (each signal conductor continuous across all plugs). With only two plugs, cable lengths up to 20 m were tested without problems; with open, unused plugs in the middle of the cable, it is better to stay at about 3 m.

PB7 BUSY / arbitration behaviour

From the client and master binaries (“transm 2.75” and “sd2.75”), PB7 is used with this polarity:

  • PB7 HIGH = bus free / released
  • PB7 LOW = bus busy / claimed

Typical sequence observed in both programs:

  • Wait until PB7 reads HIGH
  • Switch PB6/PB7 direction to output (DDRB = $C0)
  • Force PB6/PB7 low (mask PRB with $3F and write back)
  • Perform serial transfer steps
  • Release bus again by switching DDRB back to input (DDRB = $00)

This is consistent with an open-bus style arbitration line: participants only drive PB7 low when claiming the bus.
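The claim/release sequence above can be sketched as a small model. This is only an illustration in Python, not the original 6502 code: the class `Node` and the function `bus_pb7_high` are hypothetical names, and the model assumes an idealised wire that reads HIGH (via the pull-up) unless some node drives it LOW.

```python
# Hypothetical model of the PB7 claim/release sequence (not the original code).
DDRB_CLAIM = 0xC0      # PB6/PB7 switched to output
DDRB_RELEASE = 0x00    # all port B pins back to input
PRB_CLAIM_MASK = 0x3F  # clears PB6 and PB7 in PRB


class Node:
    def __init__(self):
        self.ddrb = DDRB_RELEASE
        self.prb = 0xFF

    def claim(self):
        # Same order as in the notes: direction first, then force the bits low.
        self.ddrb = DDRB_CLAIM
        self.prb &= PRB_CLAIM_MASK

    def release(self):
        self.ddrb = DDRB_RELEASE

    def drives_pb7_low(self):
        # A pin only drives the wire while it is configured as an output.
        return bool(self.ddrb & 0x80) and not (self.prb & 0x80)


def bus_pb7_high(nodes):
    # Assumption: the pull-up keeps PB7 HIGH unless at least one node pulls it LOW.
    return not any(node.drives_pb7_low() for node in nodes)
```

In this model a node would check `bus_pb7_high(nodes)` before calling `claim()`, and every other node then sees the line as busy until `release()` is called.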

Bias / termination (as remembered)

  • PB7 had a pull-up to +5 V at ID#1 (master end), typically 4.7 kΩ.
  • ID#1 also had an activity LED that indicates bus usage (LED on when PB7 is pulled low).
  • SP2 and CNT2 were left without explicit external pull-ups (bus relied on device behaviour / internal characteristics).

Note: When reproducing, it might be prudent to provide optional pull-ups for SP2/CNT2 (e.g. 10 kΩ to +5 V at one end) as a stability option.

ID strapping (PB0/PB1): physical wiring vs software numbering

There are two separate concepts:

  1. Physical strap wiring (what you solder):
  • Strap PB0 and PB1 via resistors either to GND or to +5 V (local +5 V on that machine).
  • Recommended safe resistor value: 4.7 kΩ.
  2. Software numbering / plug labels:
  • The software reads PB0/PB1 and maps the 2-bit value to the internal node numbering. The plugs were labelled ID#1..ID#4 and this mapping matches the software.

Mapping to plug labels (ID#1..ID#4):

  • ID#1: PB1 -> GND, PB0 -> GND
  • ID#2: PB1 -> GND, PB0 -> +5 V
  • ID#3: PB1 -> +5 V, PB0 -> GND
  • ID#4: PB1 -> +5 V, PB0 -> +5 V
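The strap table above amounts to "2-bit value plus one". A minimal sketch, assuming a strap to +5 V reads as 1 and a strap to GND reads as 0; the function name `node_id_from_prb` is hypothetical, and the actual decode routine in the binaries is not reproduced here.

```python
def node_id_from_prb(prb):
    """Return the plug label (1..4) for a value read from PRB ($DD01).

    Only PB0 and PB1 are used; all other bits are ignored.
    """
    return (prb & 0x03) + 1
```

For example, a plug with PB1 at +5 V and PB0 at GND gives the 2-bit value 2 and therefore label ID#3.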

Software components

Bootstrap:

  • “transm 2.75” is the receiver/client program for nodes ID#2..ID#4.
  • “sd2.75” is the host/distributor program for node ID#1.

Both programs use CIA2 registers ($DDxx), including:

  • $DD01 (PRB) for PB7 busy and PB0/PB1 ID read
  • $DD03 (DDRB) to switch PB6/PB7 between input and output
  • $DD0C (SDR) for serial byte transfer
  • $DD04/$DD05 (Timer A) and $DD0E (CRA) to drive timing / serial mode
  • $DD0D (ICR) to poll/ack serial shift completion

Boot / distribution sequence (operational description)

  1. Clients (ID#2..ID#4) start in receive mode (software: “transm 2.75”), allowing the initial host (ID#1) to detect present nodes via a handshake/ACK scheme.
  2. ID#1 distributes the required software payload over the network (and can also send a command to start it):
    • character set
    • machine code routines
    • map data
    • BASIC program variant (human or computer player)
  3. After distribution, clients start the distributed BASIC program.
  4. During play, each node has all required local data; only deltas/updates are exchanged. Updates may be broadcast to all nodes or sent to a single affected node.

Open items / unknowns

  • Whether additional pull-ups on SP2/CNT2 were present in the original plugs (not required by the binaries alone).

Addendum (Jan 2026): arbitration timing + optional PB6 LED

This is based on disassembly of sd2.75 (host, load $9A00, 1391 bytes).

PB6 (optional “THIS C64 uses the bus” LED)

When a node claims the bus it drives both PB7 and PB6 LOW by setting DDRB=$C0 and writing PRB&=$3F. This makes PB6 suitable for an optional per-node activity LED inside each plug (local indicator: “this C64 is currently claiming/using the bus”). PB7 can still be used for a “bus busy at all” LED (typically on the master plug).

(If you implement the PB6 LED: use a low-current LED and a series resistor to +5 V (e.g. 1.5 kΩ), similar to the PB7 LED concept. PB6 is not part of the 4-wire bus; it is local to the plug.)

How simultaneous sends are avoided in practice (BUSY + ID delay + NMI receive)

Claim/release (CIA2):

If a C64 wants to send data over the bus:

  • Wait until PB7 reads HIGH (bus free), then claim: DDRB = $C0 (PB6/PB7 outputs) and PRB = PRB & $3F (drives PB6+PB7 LOW).

After claiming PB7 LOW, the code waits an ID-scaled delay before switching the serial hardware into transmit mode. This delay is implemented as a fixed loop repeated ID times and, on PAL, is approximately:

  • ID 1: 261 cycles ≈ 0.265 ms
  • ID 2: 517 cycles ≈ 0.525 ms
  • ID 3: 773 cycles ≈ 0.785 ms
  • ID 4: 1029 cycles ≈ 1.044 ms

(Assuming PAL φ2 ≈ 985,248 Hz.)
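The cycle counts above can be reproduced with standard 6502 instruction timings. This is a sketch under stated assumptions: the unpatched loop is taken to be LDX abs / BEQ / LDY #$32 / DEY / BNE / DEX / BNE (inferred from the original bytes F0 08 and 32 quoted in the patch section), branches do not cross a page, and no cycles are stolen by the VIC-II. The function names are hypothetical.

```python
PAL_PHI2_HZ = 985_248  # PAL system clock, as used in these notes


def original_delay_cycles(node_id, y_start=0x32):
    """Cycles spent in the (assumed) unpatched ID delay loop."""
    setup = 4 + 2                      # LDX $9FFF, BEQ not taken
    inner = y_start * (2 + 3) - 1      # DEY/BNE loop; the last BNE falls through
    per_pass = 2 + inner + 2 + 3       # LDY #imm, inner loop, DEX, BNE taken
    return setup + node_id * per_pass - 1  # the final BNE is not taken


def cycles_to_ms(cycles, phi2_hz=PAL_PHI2_HZ):
    return cycles / phi2_hz * 1000.0
```

With these assumptions each outer pass costs 256 cycles, so the delay grows by roughly 0.26 ms per ID step.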

Serial speed reference (Timer A = $0007, as used by NET’97):

  • ≈ 7,697 bytes/s on PAL → ~0.130 ms per byte (1 byte time)
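The byte rate above follows from the Timer A value. A minimal sketch, assuming the CIA shifts one bit every two Timer A underflows and the timer underflows every latch + 1 cycles (this reproduces the figure quoted above); `serial_bytes_per_second` is a hypothetical helper name.

```python
def serial_bytes_per_second(timer_latch, phi2_hz=985_248):
    """Approximate CIA serial output rate for a given Timer A latch value."""
    cycles_per_bit = 2 * (timer_latch + 1)  # assumed: two underflows per bit
    cycles_per_byte = 8 * cycles_per_bit
    return phi2_hz / cycles_per_byte


def byte_time_ms(timer_latch, phi2_hz=985_248):
    return 1000.0 / serial_bytes_per_second(timer_latch, phi2_hz)
```

For Timer A = $0007 this gives 128 cycles per byte, which is the basis for the 0.130 ms figure.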

Receive path and why it matters:

  • Incoming serial traffic triggers a CIA2 NMI (custom NMI handler installed at $0318/$0319).
  • On CIA2 NMI, the handler reads the first byte from SDR and then pulls the rest of the packet synchronously by polling the CIA2 “shift complete” flag and reading SDR.
  • After the NMI handler returns, execution resumes where it was interrupted (typically still in the post-claim ID delay loop).

If no other C64 with a lower ID number interrupts again to send, our C64 sends its data over the bus. Thereafter:

  • Release: DDRB = $00 (PB6/PB7 back to input / high-Z).

Practical arbitration effect:

  • Even if two machines were to pull PB7 LOW very close together, the lower ID reaches “start transmit” earlier because its post-claim delay is shorter.
  • While higher IDs are still in their delay window, the first transmitted byte arrives quickly (~0.130 ms/byte at TimerA=$07) and can trigger CIA2 NMI, forcing them to receive the packet before they ever switch into transmit.
  • PB7 remains LOW while they are interrupted/receiving (they already claimed it), so the bus stays in a consistent “busy” state during that period.

Net result: the combination of (1) claiming PB7, (2) an ID-based post-claim delay longer than a byte time, and (3) NMI-driven reception that pulls complete packets, strongly suppresses practical overlaps and gives lower IDs priority.
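The priority effect can be illustrated with a very simplified timing model. It assumes all contenders claim PB7 at the same instant, ignores NMI latency, and uses the cycle figures from these notes (5 + 256 × ID cycles of delay, 128 cycles per byte at Timer A = $07). The function names are hypothetical.

```python
BYTE_CYCLES = 128  # one byte at Timer A = $07 (16 cycles per bit assumed)


def delay_cycles(node_id):
    # Matches the listed values 261, 517, 773, 1029 for IDs 1..4.
    return 5 + 256 * node_id


def arbitrate(claiming_ids):
    """Return (winner, interrupted) for nodes that claim at the same time.

    'interrupted' lists the nodes that are still inside their own delay
    loop when the winner's first byte has been shifted out.
    """
    winner = min(claiming_ids)
    first_byte_done = delay_cycles(winner) + BYTE_CYCLES
    interrupted = [
        node_id
        for node_id in sorted(claiming_ids)
        if node_id != winner and delay_cycles(node_id) > first_byte_done
    ]
    return winner, interrupted
```

Because the delay step (256 cycles) is twice the byte time (128 cycles), every higher ID is still waiting when the first byte of the lowest ID arrives in this model.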


The network transfer can be sped up by changing the ID wait loop as follows (not yet tested):

In-memory code (shared core at $9A00):

  • 9BD4: LDX $9FFF ; load ID number (1..4)
  • 9BD7: BNE $9BDE ; HACK: makes it (ID-1) instead of ID
  • 9BD9: LDY #$25 ; delay counter ($25 instead of $32)
  • 9BDB: DEY
  • 9BDC: BNE $9BDB
  • 9BDE: DEX
  • 9BDF: BNE $9BD9
  • 9BE1: SEI ; continue...

This results in the following, new wait/delay times (PAL, φ2 ≈ 985248 Hz):

  • ID1: 11 cycles -> 0.011 ms
  • ID2: 202 cycles -> 0.205 ms
  • ID3: 393 cycles -> 0.399 ms
  • ID4: 584 cycles -> 0.593 ms
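The new delay values can be checked by stepping through the patched listing with standard 6502 cycle counts. This is a sketch, not an emulator: it assumes no page-crossing branch penalties and no VIC-II cycle stealing, and `patched_delay_cycles` is a hypothetical name.

```python
def patched_delay_cycles(node_id, y_start=0x25):
    """Count cycles from $9BD4 up to (not including) the SEI at $9BE1."""
    if not 1 <= node_id <= 4:
        raise ValueError("node_id must be 1..4")
    cycles = 4           # 9BD4 LDX $9FFF
    x = node_id
    cycles += 3          # 9BD7 BNE $9BDE (always taken, ID is never 0)
    while True:
        x -= 1           # 9BDE DEX
        cycles += 2
        if x == 0:
            cycles += 2  # 9BDF BNE not taken, fall through to SEI
            return cycles
        cycles += 3      # 9BDF BNE $9BD9 taken
        cycles += 2      # 9BD9 LDY #$25
        y = y_start
        while True:
            y -= 1       # 9BDB DEY
            cycles += 2
            if y == 0:
                cycles += 2  # 9BDC BNE not taken
                break
            cycles += 3      # 9BDC BNE $9BDB taken
```

The step between IDs is 191 cycles, which is still longer than one byte time (128 cycles at Timer A = $07).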

Binary patch (byte changes)

A) Patch after loading (monitor / in-memory), both sd2.75 and transm 2.75:

  • $9BD7: F0 08 -> D0 05
  • $9BDA: 32 -> 25

B) Patch inside the PRG files (file offsets, counting the 2-byte load address at the start of the file)

  • sd2.75 offset $01D9: F0 08 A0 32 -> D0 05 A0 25
  • transm 2.75 offset $020D: F0 08 A0 32 -> D0 05 A0 25

This changes the waiting sequence and shortens the per-step delay.
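Patching the PRG files could be scripted along these lines. This is a hedged sketch: `patch_prg` and `file_offset` are hypothetical helpers, and the patch is refused unless the expected original bytes are found at the given offset, so a different file version is not silently corrupted.

```python
def file_offset(address, load_address):
    """File offset of a memory address in a PRG (2-byte load address header)."""
    return address - load_address + 2


def patch_prg(data, offset, old, new):
    """Return a patched copy of data, verifying the original bytes first."""
    if len(old) != len(new):
        raise ValueError("old and new byte strings must have the same length")
    found = bytes(data[offset:offset + len(old)])
    if found != bytes(old):
        raise ValueError("unexpected bytes at offset, file not patched")
    patched = bytearray(data)
    patched[offset:offset + len(new)] = new
    return bytes(patched)
```

For sd2.75 the call would use offset `0x01D9`, old bytes `F0 08 A0 32` and new bytes `D0 05 A0 25`; `file_offset(0x9BD7, 0x9A00)` yields the same offset from the load address given for sd2.75.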


TRANSM 2.75 / SD2.75 packet format and $FFxx exec special-case (confirmed by disassembly)

Both transm 2.75 and sd2.75 contain the same core network code at $9A00 (TRANSM adds a small relocator from $0835 -> $9A00 and then jumps into the shared code). In particular, the transfer mechanism (NMI-driven receive, invoked from the BASIC interpreter as “←x,y[,z]”) is implemented inside the shared $9A00 block.

Packet layout (“← start, end [, exec]” in BASIC)

A “data/transfer” packet is recognised by the first byte having bit 7 set.

Byte 0: DEST (bit 7 set)

  • 0..127 after masking with $7F
  • 0 = broadcast (all nodes)
  • 1..4 = addressed to node ID 1..4
  • If DEST is not 0 and not my ID, the receiver still reads the full packet but does not store it.

Bytes 1..4: START_LO, START_HI, END_LO, END_HI

  • END is treated as an exclusive end address. (Example from the loader: 49152..53248 transfers exactly 4096 bytes.)

Bytes 5 .. (5 + (END-START) - 1): DATA

  • DATA length = END - START bytes, written sequentially starting at START.

Last 2 bytes after DATA (bytes N and N+1): EXEC_LO, EXEC_HI

  • EXEC = $0000: no immediate execution; receiver just signals “transfer complete”.
  • EXEC = $FFxx: BASIC special-case (see below).
  • otherwise: the code is executed via an indirect jump to EXEC (JMP ($00C3)). The executed subroutine returns with RTS.

(Implementation details: START is held in $F7/$F8, END in $AE/$AF, EXEC in $C3/$C4, and “packet complete” flag is set at $9FFA.)
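The packet layout above can be expressed as a small builder. This is an illustrative sketch only; `build_long_packet` is a hypothetical name, and the range checks are assumptions added for safety, not behaviour taken from the binaries.

```python
def build_long_packet(dest, start, data, exec_addr=0x0000):
    """Build DEST|$80, START(lo,hi), END(lo,hi), DATA..., EXEC(lo,hi)."""
    if not 0 <= dest <= 4:
        raise ValueError("dest must be 0 (broadcast) or 1..4")
    end = start + len(data)  # END is exclusive
    if not (0 <= start <= 0xFFFF and end <= 0xFFFF):
        raise ValueError("address range does not fit into 16 bits")
    if not 0 <= exec_addr <= 0xFFFF:
        raise ValueError("exec_addr must fit into 16 bits")
    header = bytes([dest | 0x80, start & 0xFF, start >> 8, end & 0xFF, end >> 8])
    trailer = bytes([exec_addr & 0xFF, exec_addr >> 8])
    return header + bytes(data) + trailer
```

A broadcast of the 4096 bytes at 49152..53248 without execution would be `build_long_packet(0, 0xC000, block)`, giving 5 + 4096 + 2 bytes on the wire.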

Meaning of EXEC high-byte = $FF (the “$FFxx special-case”)

If EXEC_HI == $FF, the code does not JMP to EXEC. Instead it performs a BASIC “takeover” / warm-start sequence so that a freshly transferred BASIC program can run cleanly:

  • It copies END (exclusive) into BASIC’s VARTAB pointer ($2D/$2E), so BASIC knows where program text ends / variables begin.
  • It sets BASIC memory-top pointers to $9900 (routine at $9AB8), reserving $9900+ for the network code/data.
  • It calls BASIC/KERNAL initialisation/warm-start routines and starts the BASIC program.

In short: EXEC=$FFxx means “a BASIC program was transferred; fix BASIC pointers and start the BASIC program”, not “execute at $FFxx”.
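The three EXEC cases can be summarised in a few lines. A minimal sketch with hypothetical names (`classify_exec`, `vartab_pokes`); it only models the decision and the VARTAB update described above, not the memory-top setup or the KERNAL/BASIC calls.

```python
def classify_exec(exec_addr):
    """Return which of the three EXEC cases applies on the receiver."""
    if exec_addr == 0x0000:
        return "none"         # only signal "transfer complete"
    if (exec_addr >> 8) == 0xFF:
        return "basic_start"  # fix BASIC pointers, then start the program
    return "jump"             # indirect jump to EXEC


def vartab_pokes(end):
    """Zero-page writes that copy END (exclusive) into VARTAB ($2D/$2E)."""
    return {0x2D: end & 0xFF, 0x2E: end >> 8}
```

Only the high byte is tested in the `basic_start` case, so any low byte gives the same result in this sketch.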


NET’97 core (sd2.75 / transm 2.75) — jump table and calling conventions

Jump table at $9A00 (after relocation). All entries are JMP absolute.

  • $9A00 -> JMP $9A4A (INIT)

    Purpose:
    • initialises CIA2 serial mode + Timer A (e.g. $0007), installs CIA2 NMI handler
    • reads node ID from PB0/PB1 and stores it to $9FFF
    • initialises the short-packet FIFO offset table at $9FB0[0..31] = 0,8,16,...,248
    • sets FIFO base high byte $9FFE = $99 (FIFO payload area = $9900..$99FF)

    Inputs:
    • PB0/PB1 straps must be present on the userport

    Outputs / state:
    • $9FFF = node ID (1..4)
    • $9FB0[0..31] = slot offsets (0..248 step 8)
    • $9FFE = $99
    • NMI vector ($0318/$0319) is set to the NET’97 CIA2 NMI handler
  • $9A03 -> JMP $9BC2 (SEND)

    Purpose:
    • claims the shared bus (PB7/PB6), applies ID delay, transmits via CIA2 SDR ($DD0C), then releases PB6/PB7 back to input.

    Inputs (mode is selected by bit 7 of $9FF9):

    A) Short packet mode (bit7 of $9FF9 = 0):
    • $9FF9 = DEST (0=broadcast, 1..4=single target)
    • $9FF8 = LEN (intended 1..8; code will send LEN bytes, but the buffer is only 8 bytes)
    • $9FF0..$9FF7 = payload bytes
    • On wire: DEST, LEN, PAYLOAD[0..LEN-1]

    B) Long / block transfer mode (bit7 of $9FF9 = 1):
    • $9FF9 = DEST|$80 (0=broadcast, 1..4=single target, with bit7 set)
    • ZP $B0/$B1 = START (source pointer)
    • ZP $AC/$AD = END (exclusive)
    • ZP $C1/$C2 = EXEC (see below)
    • Note: SEND temporarily changes CPU port $01 to read RAM under ROM ($30/$37 toggling).
    • On wire: DEST|$80, START(lo,hi), END(lo,hi), DATA..., EXEC(lo,hi)

     EXEC meaning on receiver:
       $0000  = no immediate execution
       $FFxx  = BASIC restart / warm-start with BASIC pointers adjusted (only the high byte $FF is checked, e.g. $FFFF)
       other  = executed via an indirect jump to EXEC

  • $9A06 -> JMP $9C6F (PRINT STRING)

    Purpose:
    • prints a 0-terminated string via KERNAL CHROUT ($FFD2)

    Inputs:
    • A = string pointer low byte
    • X = string pointer high byte
  • $9A09 -> JMP $9E00 (MAIN WAIT/DISPATCH LOOP)

    Purpose:
    • waits for “long packet received” flag and then dispatches EXEC

    Inputs/state it uses:
    • $9FFA is set to $80 by the NMI handler when a LONG packet is fully received
    • $C3/$C4 hold the received EXEC address for long packets
  • $9A0C -> JMP $9D87 (PROGRAM ENTRY)

    Purpose:
    • common program entry / UI start (TRANSM uses this as its post-relocator jump target)

    Note:
    • the “client wait for incoming program blocks” behaviour is implemented via $9A09 ($9E00), not via $9A0C.

Who writes the $9900 short-packet FIFO?

The $9900..$99FF “input buffer” is written by the NET’97 core (present in both sd2.75 and transm 2.75).

When a short packet is received, the CIA2 serial NMI handler (NET core) reads: DEST, LEN, then LEN payload bytes from CIA2 SDR ($DD0C), and enqueues the payload into a 32×8-byte FIFO: slot base = $9900 + 8*slot, with bookkeeping in:

  • $9FFD = write index
  • $9FD0[slot] = payload length (0=empty, 1..8=valid)
  • $9FFB = last received DEST
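The FIFO bookkeeping can be modelled as follows. This is a sketch under assumptions that the notes do not state explicitly: both indices wrap modulo 32, the consumer clears the slot length when it dequeues, and a full FIFO simply overwrites the oldest slot. The class name `ShortPacketFifo` is hypothetical; the read index corresponds to $9FFC.

```python
class ShortPacketFifo:
    SLOTS = 32
    SLOT_SIZE = 8

    def __init__(self):
        self.buffer = bytearray(self.SLOTS * self.SLOT_SIZE)            # $9900..$99FF
        self.offsets = [self.SLOT_SIZE * i for i in range(self.SLOTS)]  # $9FB0 table
        self.lengths = [0] * self.SLOTS                                 # $9FD0 table
        self.write_index = 0                                            # $9FFD
        self.read_index = 0                                             # $9FFC
        self.last_dest = 0                                              # $9FFB

    def enqueue(self, dest, payload):
        """What the NMI receive path does with a short packet (modelled)."""
        if not 1 <= len(payload) <= self.SLOT_SIZE:
            raise ValueError("payload must be 1..8 bytes")
        self.last_dest = dest
        slot = self.write_index
        base = self.offsets[slot]
        self.buffer[base:base + len(payload)] = payload
        self.lengths[slot] = len(payload)
        self.write_index = (slot + 1) % self.SLOTS  # assumed wrap-around

    def dequeue(self):
        """What a consumer might do: return the next payload or None if empty."""
        slot = self.read_index
        length = self.lengths[slot]
        if length == 0:
            return None
        base = self.offsets[slot]
        payload = bytes(self.buffer[base:base + length])
        self.lengths[slot] = 0                      # assumed: mark slot empty
        self.read_index = (slot + 1) % self.SLOTS
        return payload
```

A length of 0 doubles as the "slot empty" marker, which is why the consumer does not need a separate element counter in this model.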

NET’97 globals (relevant subset)

  • $9FF0..$9FF7: short SEND payload buffer (8 bytes max, caller-owned)
  • $9FF8: short SEND length (LEN); also used by the SEND loop as “how many bytes to transmit from $9FF0..”
  • $9FF9: DEST / mode selector for SEND (bit7=0 => short packet, bit7=1 => long packet)
  • $9FFA: long-receive flag; set to $80 by the NMI handler when a LONG packet is complete; cleared/used by higher-level code for synchronisation
  • $9FFB: “last received DEST byte” (set at start of every received packet)
  • $9FFC: short-packet FIFO read index (0..31) (consumer advances this)
  • $9FFD: short-packet FIFO write index (0..31) (NMI receive advances this)
  • $9FFE: FIFO payload base high byte ($99) => payload area $9900..$99FF
  • $9FFF: my node ID (1..4)
  • $9FB0[0..31]: FIFO slot offset table: 0,8,16,...,248
  • $9FD0[0..31]: FIFO slot length table (0 => slot empty, 1..8 => number of valid payload bytes in slot)
  • $9900..$99FF: 256-byte input buffer (32 × 8 bytes)

NET’97 short packets (LEN <= 8): payload types used by MB2.71

The machine-code program MB2.71 uses the IRQ to consume this NET'97 FIFO input buffer at frequent intervals (using $9FFC as read index) and interprets the payload bytes (type = payload[0]). So: the NET core writes $9900 (NMI receive), and MB2.71 reads/processes it later via IRQ (dispatcher). MB2.71 also uses the IRQ for raster-line interrupts that give individual text rows a different background colour (for example, to separate the map from the text area, or to create a large input cursor), and to play sound effects when requested by the BASIC program.

Short packet on the wire: DEST, LEN, PAYLOAD[0..LEN-1]

  • DEST comes from $9FF9 (0=broadcast, 1..4=single target). Bit7 must be 0.
  • PAYLOAD is queued by the NET core into a 32×8-byte FIFO at $9900..$99FF (slots).
  • MB2.71 consumes the FIFO later and interprets PAYLOAD[0] as a “type” byte. (Unknown type bytes would effectively stall the FIFO, so in practice only the types below are used.)

Type byte = PAYLOAD[0] (a value, not an address / not “#” immediate notation).
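The short-packet wire format can be sketched as a builder. `build_short_packet` is a hypothetical name and the range checks are assumptions; as noted above, the original SEND code itself does not limit LEN to 8.

```python
def build_short_packet(dest, payload):
    """Build DEST, LEN, PAYLOAD[0..LEN-1] for a short packet."""
    if not 0 <= dest <= 4:
        raise ValueError("dest must be 0 (broadcast) or 1..4; bit 7 must be 0")
    if not 1 <= len(payload) <= 8:
        raise ValueError("payload must be 1..8 bytes")
    return bytes([dest, len(payload)]) + bytes(payload)
```

For example, a client with ID 3 announcing itself to the host would send `build_short_packet(1, [0x01, 3])`.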

Type $01 — ANNOUNCE / LOGIN (client -> host)

Payload: [ $01, client_id ]

Meaning:

  • Used as “I am here / logged in”.
  • On host, sets a presence flag for that client ID (2..4).

Type $03 — POLL / REQUEST ANNOUNCE (host -> clients)

Payload: [ $03 ]

Meaning:

  • “Please announce yourself.”
  • Client replies with type $01 if ZP $00CC != 0: reply payload = [ $01, my_id ], DEST = 1 (host).

Type $02 — WAIT FOR NEXT LONG TRANSFER (barrier)

Payload: [ $02 ]

Meaning:

  • Dequeues this short packet, clears $9FFA, then blocks until $9FFA becomes non-zero again.
  • $9FFA is set by the NET NMI receiver when a LONG (block/file) transfer is fully received.

Use:

  • Synchronisation around long transfers (boot/distribution, loading larger blocks, etc.).

Type $10 — POKE TRIPLETS (fast map/screen updates)

Payload: [ $10, addr_hi, addr_lo, value, addr_hi, addr_lo, value, ... ]

Meaning:

  • Writes VALUE to the given address(es) (RAM under ROM handling is done in the routine).
  • Fits up to 2 triplets in one short packet (1 + 2*3 = 7 bytes).

Use:

  • Ideal for “unit moved A->B”: update two map positions with minimal overhead.
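A type $10 payload can be assembled like this. A minimal sketch; `build_poke_payload` is a hypothetical name. Note that the address is sent high byte first, unlike the little-endian START/END fields of long packets.

```python
def build_poke_payload(pokes):
    """Build [ $10, addr_hi, addr_lo, value, ... ] for one or two pokes.

    pokes is a sequence of (address, value) pairs.
    """
    if not 1 <= len(pokes) <= 2:
        raise ValueError("a short packet fits one or two triplets")
    payload = bytearray([0x10])
    for address, value in pokes:
        if not (0 <= address <= 0xFFFF and 0 <= value <= 0xFF):
            raise ValueError("address or value out of range")
        payload += bytes([address >> 8, address & 0xFF, value])
    return bytes(payload)
```

A unit moving from one map cell to the next could then be sent as a single 7-byte payload containing both the "clear old cell" and the "draw new cell" write.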

Type $13 — PAUSE / STOP-THE-WORLD FLAG

Payload: [ $13 ]

Meaning:

  • Sets $9005 = $80.
  • BASIC checks PEEK(36869) (=$9005) and branches into a “pause / do not continue” path.

Use:

  • Host pause (save in progress, tea break, etc.).

Type $14 — SYNC COUNTER (turn/step number)

Payload: [ $14 ]

Meaning:

  • INC $9006.
  • BASIC checks PEEK(36870) (=$9006) against a target value to synchronise steps/turns.

Use:

  • Turn/step synchronisation barrier (“wait until everyone reached turn N”).
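The effect of the two flag-style packets ($13 and $14) on memory can be modelled in a few lines. This is an illustrative sketch, not a transcription of MB2.71: `apply_flag_packet` is a hypothetical name, `memory` stands in for the 64 KB address space, and the 8-bit wrap of the counter follows from INC semantics.

```python
PAUSE_FLAG = 0x9005    # PEEK(36869) in BASIC
SYNC_COUNTER = 0x9006  # PEEK(36870) in BASIC


def apply_flag_packet(memory, payload):
    """Apply a type $13 or $14 short-packet payload to a memory image."""
    kind = payload[0]
    if kind == 0x13:
        memory[PAUSE_FLAG] = 0x80
    elif kind == 0x14:
        memory[SYNC_COUNTER] = (memory[SYNC_COUNTER] + 1) & 0xFF  # INC wraps at 256
    else:
        raise ValueError("only types $13 and $14 are modelled here")
```

A BASIC program waiting at a turn barrier would then poll location 36870 until it reaches the expected value.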