In resurrecting abandoned projects

Or, where the heck have you been?

Once I began analyzing some of the routines internal to PROMAL ages ago, I found myself fascinated and then obsessed with the simple elegance that the system is written with.  I’ve been working on commenting a full disassembly of the program but haven’t yet decided to officially relase it.

Therein comes a challenge for myself.  I’d like to document the steps I used to reverse the language’s concepts, internal variables, and machine code here.  However it will take some effort to refresh my memory as the initial disassembly was performed nearly two years ago, and I’m a bit stuck for time.

Let’s just provide a screenie of the title page for entertainment purposes, as well as a shout-out to the venerable COMPUTE! Books series that covered many CBM systems including drives and computers.


Hopefully life will be a bit more generous with spare time and inspiration so I can complete the work as well as get some serious documentation about the process posted here.

By the way, that font was HARD to find.  Still not an exact match for the one COMPUTE! used, but it’s close enough to pass.:)

[Edit: Found an attempted escape : q and removed it.  Meh.]

Automated parameter parsing in PROMAL

Or, decrypting the unencrypted cryptic code

I decided to spend some more time looking at how PROMAL works internally. The next routine I decided to examine is BLKMOV, as its function is similar to the MOVSTR function examined earlier.

Let’s have a look at the jump table to find BLKMOV:

0f30  4C A9 21    JMP $21A9

Easy enough! Let’s have a look to see how BLKMOV initializes itself, and to see if it accepts a 16-bit length.

21a9  20 15 18    JSR $1815
21ac  33 E0       RLA ($E0),Y
21ae  38          SEC
21af  34 2C       NOOP $2C,X
21b1  A5 34       LDA $34
21b3  38          SEC
21b4  E5 2C       SBC $2C
21b6  A8          TAY
21b7  A5 35       LDA $35
21b9  E5 2D       SBC $2D
21bb  AA          TAX
21bc  98          TYA
21bd  C5 38       CMP $38
21bf  8A          TXA
21c0  E5 39       SBC $39

Wait. WHAT? (Almost) no professional programmer would use illegal opcodes for a final product. The RLA and NOOP $2C,X are invalid opcodes. Also, I’ve tested PROMAL and found that it works well on the CMD SuperCPU accelerator, which will behave differently with illegal opcodes and cause PROMAL to crash.

Caller relative addressing
Let’s see what exactly is going on with 1815. I’ve studied the routine ahead of time and added some helpful comments and names to the disassembly. Because this routine is used by many library routines, I will refer to the caller as “the parent”. After analyzing the routine I’ll provide BLKMOVs results to help clarify the routine’s work.

                                        ; we are always a grandchild!
                                        ; parameter = value from p-code interp
                                        ; argument  = value from caller's table
1815: 68          getprm  PLA           ; get parents address
1816: 85 3A               STA Parent    ;
1818: 68                  PLA           ;
1819: 85 3B               STA Parent+1  ;   and save for our use
181B: 68                  PLA           ; get grandparent's address
181C: 85 6D               STA GParent   ;
181E: 68                  PLA           ;
181F: 85 6E               STA GParent+1 ;   and save for later restoration
1821: 84 6A               STY numparms  ; save number of parameters
1823: A0 01               LDY #$01      ; initialize index
1825: B1 3A               LDA (Parent),Y; get min/max parameters

I knew something was odd. This is an uncommon but handy trick: The data after the JSR in the parent is never executed. If you examine 1816, you see that it stores the parent’s address from the stack as MOVSTR did. It then stores the parent’s parent’s return address (‘GParent’) so that it can get to the parameters on the stack later. After that’s all set up, the number of parameters from PROMAL is saved to numparms and the Y index is initialized to 1.

Why all this work? The method is used when resources are tight: We have parameters on the stack that get processed by many routines in the library. It’s best for code space efficiency if a single routine handles these parameters. However, not all library routines use the same number or even type of parameters. That’s where this routine comes in. The arguments for the ‘getprm’ routine are stored after the JSR from the library routine calling it. This way each library routine will be able to specify what type of information it expects to find on the stack.

On arguments and parameters
In this post I need to distinguish between two things: The data used by the getprm routine, and the data the parent needs from the stack. In this case, ‘argument’ refers to data used by getprm, and ‘parameter’ refers to any data passed on the stack by PROMAL. This is done in consistency with the MOVSTR post.

Let’s have a good look at this routine to understand what it does.

Setting things up
We already have the calling routines’ addresses safely shuffled away, and we have our first argument retrieved from the parent.

1827: 29 0F               AND #$0F      ;  Mask max #parms off
1829: 85 69               STA maxparms  ;   and save
182B: C5 6A               CMP numparms  ;  compare with paremeter count
182D: 90 10               BCC getperr   ;   too many, runtime error

In the segment above, the argument loaded from the parent is masked off and saved. Studying the routine ahead of time helped me understand that the low half of the first argument is the ‘minimum’ number of arguments the parent requires. If the number of arguments provided by PROMAL (‘CMP numparms’) is larger than the maximum, the routine branches off to a fatal runtime error (‘BCC getperr’).

182F: B1 3A               LDA (Parent),Y; get min/max parameters
1831: F0 40               BEQ getpfin   ;  0/0 parms? exit.

The argument is reloaded since it was mangled when setting up maxparms. While it’s loaded and un-mangled, the routine checks to see if there are no parameters to be loaded. If this is the case, the routine exits. It would seem to make no sense to call this routine if you don’t want any parameters. I’d agree, but there must be a good reason to call it in this fashion as a few library routines do just that.

1833: 4A                  LSR           ;
1834: 4A                  LSR           ;
1835: 4A                  LSR           ;
1836: 4A                  LSR           ;
1837: 85 6B               STA minparms  ;  save min #parms
1839: C5 6A               CMP numparms  ;  compare with parameter count
183B: F0 05               BEQ getpok    ;  same? ok.
183D: 90 03               BCC getpok    ;  more than min parms? ok.
183F: 4C 80 10    getperr JMP syserr    ; fail out via system error

Now, the upper half of the first argument is shifted down and store in ‘minparms’. It’s again compared to the numparms value, this time to determine if there are at least the correct number of parameters (beq: bcc).  If not, the routine fails through to a jump to PROMAL’s fatal runtime error routine.

1842: C8          getpok  INY           ; Increment index
1843: B1 3A               LDA (Parent),Y; Get mask bits

There will be a lot of INY : LDA (Parent),y as the routine works its way through the argument table.

1845: 85 6C               STA getpmsk   ; Save in mask byte

The second argument is stored to getpmsk, short for ‘getprm mask.’ This byte is actually eight flags, each indicating the type of parameter to get from the stack. There are two types of data and one way to work with each. As a quick reminder, PROMAL always pushes parameters as words, even when they’re bytes.

Bit = 0    Parameter is a byte
           The next argument byte is a zero-page address and a default value.
           * Store this byte where specified at the address
           * Load and discard high byte from stack if applicable
           * Load and store low byte from stack at the address if applicable
Bit = 1    Parameter is a word
           This argument is one zero-page address.
           * Load and store the high byte from stack at the address+1
           * Load and store the low byte from the stack at the address

The routine appears to not have any facilities for handling a default 16-bit value.  It’ll be up to the parent to detect a missing 16-bit parameter and set up a default value in its place.

Processing arguments and setting up parameters
At this point, the routine is initialized and ready to load parameters as specified by the parent until it’s out of arguments.

1847: C6 69       getpl   DEC maxparms  ; Decrement parameter count
1849: 30 28               BMI getpfin   ;  Out of parms? exit.

Maxparms is now used as a count-down value to determine when the routine’s out of arguments. The name of the location is a bit of a misnomer, I apologize.

184B: C8                  INY           ;
184C: B1 3A               LDA (Parent),Y; get zp address
184E: AA                  TAX           ;  and save

The arguments now always start with a zero page address. This is read from the argument table and saved in the X register to be used as an index. This allows the code to run without modifying itself and is a good example of advanced indexing when used in this situation.

184F: A5 69               LDA maxparms  ; check max parms
1851: C5 6B               CMP minparms  ;  are we out of required parms?
1853: 90 0F               BCC gpfprm    ;  No, go pull it off the stack.

In this section of the loop, maxparms is compared with minparms to determine whether or not we’re out of required parameters.

1855: 24 6C               BIT getpmsk   ; is current parm a word?
1857: 30 05               BMI gpisw     ;  yep, skip

Remember the paramter type mask? This is one of the two checks against the flags in the loop. The BIT instruction does a handful of things, but of interest to the routine is the way it copies bit 7 of getpmsk to the negative flag without modifying any other registers. In this case, if a parameter is a word the negative flag gets set and the BMI (branch if minus) routes the cpu to gpisw (short for getparm is word), below.

1859: C8                  INY           ;
185A: B1 3A               LDA (Parent),Y; get default value or low byte
185C: 95 00               STA 0,X    ;  store at zp address
185E: A5 69       gpisw   LDA maxparms  ; is current argument
1860: C5 6A               CMP numparms  ; greater than parameter count?
1862: B0 0A               BCS gpdefl    ;  Yes, process default value

The next argument byte is loaded if it’s a ‘byte’ type. It’s stored at the zero page location pointed to by X, which was read in earlier. Then it follows through to ‘gpisw’, which checks the current argument against the number of parameters provided to the parent by PROMAL. If we’re out of parameters, we skip off to gpdefl, which is short for ‘getparm default’.

1864: 68          gpfprm  PLA           ; Get parameter from stack
1865: 24 6C               BIT getpmsk   ; current parm = word?
1867: 10 02               BPL gpis8     ;  no, skip high byte store
1869: 95 01               STA 1,X   ;  * store high byte if word
186B: 68          gpis8   PLA           ; get low byte or default value
186C: 95 00               STA 0,X    ; store where requested

In a moment whose reason eludes me, I called this branch point gpfprm. What this section does is first pull the high byte of the next parameter from PROMAL off the stack and then check the parameter type mask to see if it’s a byte type. If so (BPL, as bit 7 would be a zero its plus or positive), it skips to gpis8, discarding the byte. If it’s a word, it gets stored to X+1 by using a base of 1 instead of 0.
Gpis8 always pulls and stores the byte to the location indicated by X.

This method is clever: The routine first loads the default value from the argument block into memory, and then only loads a value if there’s one available on the stack. It’s a good way of ensuring a default is in place if it’s not specified by PROMAL.

186E: 06 6C       gpdefl  ASL getpmsk   ; shift parameter bit mask
1870: 4C 47 18            JMP getpl     ; loop

The argument mask is shifted one to the left to ensure it stays in sync with the argument index in Y. Then, the loop is restarted.

Cleaning up

1873: 98          getpfin TYA           ; transfer index to A
1874: 18                  CLC           ; pre for math
1875: 65 3A               ADC Parent    ; Add our parent's return address
1877: AA                  TAX           ;
1878: A5 3B               LDA Parent+1  ;
187A: 69 00               ADC #$00      ;
187C: 48                  PHA           ;  and put on stack
187D: 8A                  TXA           ;
187E: 48                  PHA           ;  for rts.

As we don’t want to return into a data block, we’ll add our current value for Y to the parent’s calling address and put it on the stack. This ensures we safely RTS into the byte following the argument table.

187F: A5 6D               LDA GParent   ; get grandparent's address
1881: 85 3A               STA Parent    ; place where p-code expects parent's
1883: A5 6E               LDA GParent+1 ;
1885: 85 3B               STA Parent+1  ;
1887: 60                  RTS           ; and return to arg table+1

And as a last bit of cleanup, the grandparent’s address is placed where our parent would expect it to be, leaving the runtime in a good state and keeping the stack clear.

How BLKMOV used this routine
BLKMOV used this routine to set up all of its zero page vectors. Once getprm is done, the routine looks largely like MOVSTR, so I’ll (probably) cover it later.

Here’s what getprm did for BLKMOV. I’ll include the first part of BLKMOV again, with a bit better formatting since we know what the data following the JSR is for.

21a9  20 15 18    JSR $1815     ; jsr to getprm
                  .byte $33     ; min/max number of parameters
                  .byte $e0     ; %1110 0000 - all three parameters are words
                  .byte $38     ; first word stores at $38 and $39
                  .byte $34     ; second word stores at $34 and $35
                  .byte $2c     ; third word stores at $2c and $2d

When getprm runs for BLKMOV, it performs these actions:
* Writes the last parameter (Count) to $38 and $39
* Writes the second parameter (From) to $34 and $35
* Writes the first parameter (To) to $2c and $2d
* Cleans house and returns to the byte following the argument table at 21b1

This might look a little familiar. MOVSTR uses the same vectors.

In summary
The routine is very handy in that you can easily specify what you need loaded, as well as quickly specifying how many parameters you require and how many you can take. The limit for the number of parameters is sensibly eight, given the mask argument is a byte providing 8 flags for parameter types.

The getprm routine is used by many routines in the PROMAL system, including (but not limited to) GETC, GETL, BLKMOV, OPEN, CLOSE, CHKSUM, and EDLINE.

Moving data in PROMAL

or, Losing your mind with PROMAL

Learning how things work

In a recent experiment in learning to work with PROMAL, I needed a method for moving pieces of data around in memory to split strings. The MOVSTR library procedure seemed ideal, but consistently missed the mark and corrupted memory.

As it turns out, I had an addressing issue as well as a misunderstanding of what the procedure will do for me.

For our reference, I’ll quote the MOVSTR proc’s documentation from the Library Manual.


USAGE: MOVSTR FromString, ToString [,Limit]

MOVSTR is a procedure which is used to copy strings, to concatenate strings, or extract substrings (i.e., replaces the LEFT$, MID$, and RIGHT$ functions found in BASIC).  FromString is the address of the string to copy.  ToString is the address of the destination.  Limit is an optional argument specifying the maximum number of characters to copy.

This brings up some useful syntax in PROMAL:  Specifying the address of a string.  In my project, I needed to extract the middle of a string and deposit into another variable.  My first attempt used this method:

movstr buf[3], name, 16

A person would think then, movstr would copy from buf[3] to buf[19] into name, but this was not the case.  After some deep debugging of the PROMAL library routine itself, I learned that I was in fact telling PROMAL to use the address at buf[3] and buf[4] as the source for the string to put into name.  This is an inconsistency in addressing that was learned:  When I specify ‘movstr buf,name,16’ it will use the location of buf[], but if I use ‘movstr buf[3],name,16’ it instead uses a vector placed at buf[3].  To fix this issue, use the # operator to specify ‘the address of…’:

movstr #buf[3], name, 16

The alternate format tells the compiler to use the address of buf[3] instead of a vector at the same location.

Learning inner workings through an assembly debugger

Disclaimer: Most 8-bit fans will balk at using an emulator to develop programs for their beloved 8-bit systems.  I do heavily prefer to develop on the machine itself, but there’s little that beats a debugging system that will stop the system cold: video refresh, hardware timers, everything gets paused.  As I was having trouble doing local development, I moved my data to the Vice emulator and got to work.

The MOVSTR function

In the library, the definition for MOVSTR is ‘EXT ASM PROC MOVSTR AT $F33’.  This tells the compiler that MOVSTR is a procedure that can be called directly in memory at location $0f33.  In my particular installation of Vice, Alt-H opens the debugger, pausing the emulation.  A quick look at f33 will show that it’s part of a jump table:

(C:$f33e) d f33
.C:0f33  4C 38 22    JMP $2238
.C:0f36  4C 18 22    JMP $2218
.C:0f39  4C DF 26    JMP $26DF

Of course, the only real interest is the first instruction: jmp $2238.  Let’s have a look there.

Exploratory surgery (or, finding out how PROMAL thinks)

At $2238 is a fairly straightforward routine.  For reference, the code below is being called by my test program after things are in working order as that’s the only debug log I saved.  There’s still a lot to learn!

Here’s the processor registers when the call is made: a=3 x=3 y=3 sp=f7

First, promal saves the calling address in a scratch space so it can return to the caller:

2238  68          PLA
2239  85 3A       STA $3A
223b  68          PLA
223c  85 3B       STA $3B

Now, we can examine the stack and prepare something ahead of time:

223e  A9 FF       LDA #$FF
2240  C0 03       CPY #$03
2242  D0 03       BNE $2247

At this point, I recognize the #03:  Movstr can have two or three parameters, so apparently the Y register holds the number of parameters for the function.  I specified three in my application, so this falls through to the next instruction:

2244  68          PLA
2245  68          PLA
2246  88          DEY
2247  85 38       STA $38

At first, this confused me greatly.  Why would you pull two bytes from the stack without saving the first?  As it turns out, the third parameter is only supposed to be a byte, rather than a word.  However, the compiler apparently always pushes words to the stack.  The first PLA simply pulls the unused high byte of the word and discards it.  DEY is a setup for the next compare below:

2249  C0 02       CPY #$02
224b  F0 03       BEQ $2250
224d  4C 80 10    JMP $1080

Here’s the second check.  Remember, movstr can have two or three parameters.  Here, Y is checked to see if it’s 2.  If it is, the jmp is skipped.  For reference, $1080 is a runtime error routine.  I checked by entering ‘go 1080’ in the promal executive.  PROMAL replied with this:

AT $C3F3

Continuing to $2250, the routine then gathers more information:

2250  68          PLA
2251  85 35       STA $35
2253  68          PLA
2254  85 34       STA $34
2256  68          PLA
2257  85 2D       STA $2D
2259  68          PLA
225a  85 2C       STA $2C

At this point, the MOVSTR routine has everything set up for the routine below.  The [limit] was processed early on, and now the [tostring] and [fromstring] parameters are stored in zero-page as well.  Tearing apart the actual copy routine is beyond the scope of this post, but I’ll include it for reference.

225c  A5 34       LDA $34
225e  38          SEC
225f  E5 2C       SBC $2C
2261  AA          TAX
2262  A5 35       LDA $35
2264  E5 2D       SBC $2D
2266  D0 1F       BNE $2287
2268  8A          TXA
2269  C5 38       CMP $38
226b  B0 1A       BCS $2287
226d  A0 00       LDY #$00
226f  B1 2C       LDA ($2C),Y
2271  F0 0A       BEQ $227D
2273  C8          INY
2274  C4 38       CPY $38
2276  90 F7       BCC $226F
2278  A9 00       LDA #$00
227a  F0 03       BEQ $227F
227c  88          DEY
227d  B1 2C       LDA ($2C),Y
227f  91 34       STA ($34),Y
2281  C0 00       CPY #$00
2283  D0 F7       BNE $227C
2285  F0 15       BEQ $229C
2287  A0 00       LDY #$00
2289  A5 38       LDA $38
228b  F0 0D       BEQ $229A
228d  B1 2C       LDA ($2C),Y
228f  91 34       STA ($34),Y
2291  F0 09       BEQ $229C
2293  C8          INY
2294  C4 38       CPY $38
2296  90 F5       BCC $228D
2298  A9 00       LDA #$00
229a  91 34       STA ($34),Y

Remember how we started?  We stored the return address at $3a so we could examine the parameters on the stack.  To return, an internal routine is then run, which does the work of putting the calling routine back on the stack and returning:

229c  4C 69 20    JMP $2069
[at $2069]
2069  A5 3B       LDA $3B
206b  48          PHA
206c  A5 3A       LDA $3A
206e  48          PHA
206f  60          RTS

In Summary…

Lesson learned?  The [limit] parameter for MOVSTR has a maximum value of 255, and one has to be very careful about how the parameters are specified.  We don’t have the luxury of a memory protection unit that modern systems have, so an incorrectly specified parameter can cause the whole environment to be overwritten at random.

Also, if you looked carefully at the entire routine, you’ll notice that the copy will stop on the first null ($00) byte it finds.  As it’s a ‘string’ move rather than a block move, it makes sense considering PROMAL uses ‘ascii-z’ strings.

I also got a good chance to see how exactly the PROMAL compiler passes its data to procedures via the stack.  What I learned is confirmed in the promal language indexes, specifically the section on calling external assembly functions and procedures.

Making things a bit faster

Or, reducing frustration

The project has seen a bit of silence recently.  I’d apologize but I’m not really all that sorry about it.  I’ll offer an explanation:

The process for programming the NVRAM takes 6 minutes for a full update, and the tools I made in BASIC are at best a kludge.  If I did anything imperfectly I’d most often have to reload the entire 8k since I don’t have a debugger yet.

Enter PROMAL discovered by an online colleague we lovingly refer to as ShadowM.  He recently acquired the long forgotten and abandoned software and graciously offers it to the Commodore community.

PROMAL is a high level compiled application language originally written for the C-64, Apple2, and IBM PC running DOS.  To date, Ive only managed to find the C-64 version ShadowM has on his site.

PROMAL is a native development environment including a functional editor and commandline environment that supports passing arguments to programs being called.  This is a great advantage, as a user can just specify an action as well as a target, rather than having the action app request what the target is intended to be.

As PROMAL is a compiled language, it does tend to run a bit more efficiently.  There are also other things that can be done in the language that enables one to increase efficiency.  For example, the most common data type is a 16bit word rather than a 40bit float.  All those loops in the BASIC program?  The address counters, the data byte… all of those are stored and processed as floats.  Not so in the programs below, as I only use ‘byte’ and ‘word’ variables, which are optimally sized as 8 bit and 16 bit unsigned numbers, respecitvely.

How much more efficient than BASIC is this implementation?  In this specific application we’re looking at a speed increase of 383%.  Nearly four times the operational speed.  There’s light at the end of the reprogramming tunnel the moment enter is pressed!

Another advantage of PROMAL is the ability to define where a variable rests in memory.  Take the nvdefs.s file below as an example:

ext byte via1[] at $de20

There are some key words here:  ‘ext’ refers to an external reference.  ‘byte’ defines the variable’s values to be stored as bytes.  The name ‘via1’ is assigned and the brackets indicate the variable is an array.  Finally, the specified address is $de20.  Readers might remember the address of the first VIA on the IO card I’m using for this project being at that address.

What does this do for me?  It’s simple.  I use the constant also defined in nvdefs.s called ‘porta’ to set up the output pins as desired, without a poke or a lot of math: via1[porta]=$ff.  This causes the value of porta to be offset into the via1 array, and a $ff is stored there, turning all bits on.  No run around looking up a variable, getting its value, then converting the float to an unsigned integer, setting up a vector, loading porta into an index, and hten finally storing.  It just skips the second through fifth steps.   You might also notice ‘addr’ in nvdefs, which is a WORD type set at DE20.  This means any 16bit address stored in the ‘addr’ variable is automatically stored to both port a and port b on the first via at that address.  No address splitting or additional processing.  This all adds to speed, which really matters when you only have approximately 300,000 operations/second available.

Below is the nvdefs.s file I created for the project.  PROMAL includes among other things the ability to include secondary files in the same way the modern C compiler has headers.  These files are considered part of the compiler’s input stream at the position they’re included, and can provide hardware and operating system abstraction.  If the apple2 version of PROMAL weren’t made of unobtanium and a “john bell” 32bit IO card were in use, nvdefs.h could be changed to reflect the IO port addresses for the apple2 card and the application below could be used without modifications after a simple recompile.

File: nvdefs.s

;DEFs for nvram read/write via Schnedler ultimate interface
;IF VIAS CHANGE also change named items below.

; assume VIA1 is at DE20, VIA2 at DE30
ext byte via1[] at $de20
ext byte via2[] at $de30
con ddra=3
con ddrb=2
con pa=1
con pb=0

; Named items for nvram config:
; Bits on control port: WE CS OE nc nc nc nc nc
con off=$e0 ; chip offline
con wr =$20 ; chip write
con rd =$80 ; chip read

; Named items for via control
con in=0   ; for DDR
con out=255; for DDR

; Named items for nvram address/data bus
ext word addr at $de20 ; access via1 ports a and b as 16bit unsigned!
ext byte dat  at $de31 ; via2 port a for data bus
ext byte dc   at $de33 ; via2 port a ddr for data in/out
ext byte ctl  at $de30 ; via2 port b for chip control

And now the tool written to reprogram the NVRAM, rewritten in PROMAL.  It accepts its parameters from the commandline, making it easy to specify what needs to be done.

File: nvwr.s

program nvwr

include library
include nvdefs

word a       ; address presented to nvram
word length  ; number of bytes to prog
word file    ; file handle
word total   ; total bytes written
byte d       ; data byte to program
byte t       ; scratch space

if ncarg < 1
  put "Usage:  nvprog <file> [addr] [len]",nl
  put "  All values are specified",nl
  put "  in hexadecimal.",nl,nl
if file=0
  put "Unable to open ",carg[1],nl
if ncarg>1
  t=strval(carg[2],#a,16,4) ; address
if ncarg>2
  t=strval(carg[3],#length,16,4) ; bytes to program

output "Loading $#4H bytes from ",length
put carg[1]," into NVRAM.",nl

while getcf(file,#d) and length>0
  ctl =off ; ensure chip is offline
  addr=a   ; set up address bus
  dat =d   ; set up data bus
  dc  =out ; drive data bus
  ctl =wr  ; write byte (cs,we)
  ctl =off ; back offline
  dc  =in  ; prep for verify
  ctl =rd  ; re-read byte
  if dat <> d ; live compare data bits on ram with d
    put "ERROR: failed to verify byte",nl
    output "#4H >#2H <#2H#C",a,d,dat
  length=length-1 ; count bytes left to program

output "Wrote #4H bytes.#C",total


Sadly, I don’t offer anything in the way of pictures in this post, but I hope the information was interesting regardless.  Real Soon Now(TM) I’ll be able to get a more featured toolset for getting my minimal debugger going on the IIEasy Print card.  Once the machine language monitor is fully functional, everything will suddenly become very simple.

To learn more about PROMAL, you can download the documentation easily from ShadowM’s webpage.


WTH is this thing doing?

Or, examining the reset cycle.

In traditional fashion for myself, I bit off a bit more than I can easily chew.  I added some nice routines to a vintage computer’s system debugger in an attempt to quickly port it to the board and hopefully just get rolling.  These software modifications will be covered later.  Today however we’ll be covering a second mental exercise.  Off to the oscilloscope.

For a bit of early reference, I’ll be doing this with a Tektronix 2246 scope, using four channels as described.  They’re listed in the order they appear in the ‘scope pictures, from top to bottom.

  • Channel 1: /RESET line on the cpu
  • Channel 3: Phase2 system clock, to which all CPU transactions are synchronized
  • Channel 2: Floating probe to examine data lines during the reset loop
  • Channel 4: Floating probe with hook to attach to a visual reference signal

The challenge in figuring out the reset sequence is that there’s no really good way to trigger it besides repeatedly resetting the CPU.  Even then, you’d have to trigger on the rising edge of the /RESET line and then change your trigger to the CPU’s PHASE2 output to get good sync with your data transfers or you’ll just see intermittent pulses of unaligned digital noise.


A challenge I faced as well is that I run everything on a two by four foot desk that has a PC keyboard as well as my GPIO enhanced Commodore.  The scope fortunately doesn’t mind running on the floor in a pinch, so it sits to my left.  This excludes any space for a signal generator to provide the repeating reset pulse.  What to do?

Finding a slow clock

As I needed to repeatedly reset the CPU at a low frequency and didn’t have a clock generator handy, I thought of options:

  • Program the 6522 VIA on my Commodore to generate a repeating pulse
  • Use a serial TTY to generate a repeating pulse by sending nulls out its port continuously

The second option was chosen since the serial port was already connected.  I modified the reset pin on the CPU so that it hangs out of its socket instead of being attached, and tied it to the RXD pin on the 6551 ACIA.  Note carefully that I didn’t connect it directly to the serial line, which would have permanently destroyed the CPU.  RS232 lines can run as much as +/-12v.

That fixed, we’re back to the trigger challenge.

Advanced triggering on the Tektronix 2246

The Tektronix 2246 oscilloscope has a pretty remarkable trigger section on it.  I had not considered the use of an A/B trigger setup since college, so it was a bit of self re-education.  The needs are simple: First, wait for /RESET to go high (trigger A, rising slope, channel 1).  Next, trigger on phase2 (Trigger B, rising slope, channel 3).

After a bit of trial and error as well as some remembering how to read the double ghosted image on the Tek’s display, I remembered what was I was seeing and settled on a configuration.  Here’s the steps I take.

  1. Attach lines in the configuration listed earlier in this article
  2. Set up A trigger (the most commonly used scope mode)
  3. Get my PC transmitting nulls:
    jbevren@epicfail:~/projects.local/iieasy$ sudo stty ispeed 9600 </dev/ttyS4
    jbevren@epicfail:~/projects.local/iieasy$ cat /dev/zero >/dev/ttyS4
  4. Set up a stable signal for channel 1:
  5. Trigger source: channel 1, DC coupling, auto level
  6. Place horizontal section into ‘alt’ mode
  7. Reduce the ‘A’ intensity and increase the ‘B’ intensity
  8. Set up a stable trigger for channel 3:
  9. Trigger source: channel 3, DC coupling, auto level

At this point, you should see something similar to the picture here.


The bright section at the left illustrates the section visible in trigger B.  It may help to understand the configuration if you monitor the width of the highlighted section while changing the horizontal sweep time while in the A/B alt setup:  The horizontal setting no longer affects the initial A trigger setup, allowing me to magnify as needed.

At this point, I don’t need to see the ‘A’ section, as I’ll only be checking in on the first few transactions to ensure the NVRAM’s getting read in correctly.  In this case, I set the horizontal section’s mode to ‘B’ only.  The ‘scope still processes the A trigger but no longer displays it.  This saves me some brain time as I won’t have to mentally separate the two overlaid images.

Now, I can see what’s going on through a slight bit of visual jitter on the B trigger.  I’ll attach channel 4 to the CPU’s SYNC output, which asserts at a logic high level each time a new instruction is fetched.


As you can see there’s a bit of ghosting going on the display.  I can see through it, so it’s not a huge issue for me.  However, to try and clean it up a bit, I’ll try triggering on channel 4 (now on SYNC) to see if it remains stable after the A trigger’s processed.


This setup is perfect.  The initial SYNC pulse caused by the reset isn’t visible any longer, and a single SYNC pulse is visible on the right third of the display’s bottom line.

Finally, examining the reset sequence

Our first target is the address on the first SYNC pulse after RESET completes.  The address will let us know if the reset vector gets read correctly from the firmware NVRAM and will also tell us where the CPU’s actually starting up.  The data’s not yet important, so we’ll check address bits one by one.


I chose to provide more entertaining drivel over image processing, so the montage above only contains a few of the address lines I checked, recklessly copy/pasted into a modified image.  They’re address bits 3-0 from top to bottom.  If you look at them aligned with the clock at its low state you’ll see the first fetch is %1100, or a $C- the last digit in $FF5C.

I prefer to work from the high address bits down, as it makes it easier to think in a numeric fashion:  We write our digits in such a way that the highest valued digit is entered first.  In “$1,000,002” the 1 certainly has a higher value than the 2.  Examining the bits in this order allows me to simply enter each binary digit as I walk my way down the chip’s address lines.

To help find the appropriate pins, I’ve marked my CPU out with a bit of pencil to separate the address bits into groups of four.  The lead shines in the overhead light enabling me to quickly see where the labels are.


For anyone following along on a datasheet, remember this isn’t a standard CMOS 65c02.  It’s a g65sc102 CPU, which is software compatible but has a slightly modified pin layout.

In the end, the 16 address bits at the SYNC mark show me this first opcode fetch at address %1111 1111 0101 1100.  Note that the SYNC pulse is two cycles long, and the second address is incremented by one.  This comes to $FF5C, which is correct.  For reasons I’ll share later, a few lightbulbs may be appearing in some readers’ heads.  Don’t feel alarmed if the address looks familiar.

As the reset vector’s getting read correctly but the serial chip’s not getting polled as I expected, I’ll also read the opcodes it should be reading.  In a working system I’d only have one SYNC cycle, as the initial opcode is a JSR to a subroutine.

Here’s what the oscilloscope told me:

  • [sync] 0011 0011 = $33, invalid opcode [nop]
  • [sync] 0010 0000 = $20, JSR
  • [norm] 0010 0111 = $27
  • [norm] 1111 1111 = $FF

If the scope tells the truth, we start with an invalid $33 opcode, followed by jsr $FF27.  This isn’t what should be happening.  Considering I’m prone to errors as a part of being human, it’s entirely possible that my NVRAM programmer wiring or code may be at fault, so it’s time to re-verify that part of the project.  Perhaps it’s even a good opportunity to wire up a board that would plug into the GPIO card rather than running a haystack of wires to a dip socket.😉


Today I learned a lot from things not working as expected.  A few people who do projects have told their audience that the best chance to learn is when there’s a failure.  I can agree wholeheartedly after today’s experience, as I re-learned a skill we got in our second year of college:  debugging a looping program using a simple 4-channel oscilloscope.  I also learned the new skill of advanced triggering on my own ‘scope, and will be able to use it in future projects without a doubt!

More will come later, as I take some time to re-verify the code for the programmer as well as the GPIO wiring to the socket used to set the NVRAM up with its code.

IIeasy Print recycling

Or, Reverse engineering a 6502 SBC

The victim

A few weeks ago I acquired a “IIeasy Print” automatic printer buffer/switch designed for Apple systems.  It was a curiosity because it had a few nice features and appeared to be a self-contained system.  The items below were discovered during research before the card arrived.

  • CPU (believed to be a g65sc102 based on usenet posts, a cmos 6502 variant)
  • 256k ram (two chips, 128kx8 PSRAM)
  • 6 serial ports (6551 acia)
  • system rom (unknown, believed to be 8k 2364 or 2764)
  • system ram (6116 2kx8 sram)

The type of cpu and rom were unknown as they had stickers covering their information in any photos I could locate.  Scouring the image of the card gave me the 6116 and 6551 chip ID’s, as well as a couple of others, such as 74ls138’s to assist address decoding and a 74ls374 near a 14 pin header.

Beyond this, the next step was to imagine possible memory maps and wait for the card to arrive.

Upon arrival, I eagerly opened the box and stripped labels off the unknown IC’s and identify chips further.  The CPU was confirmed to be a g65sc102 CPU, and the master crystal was found to be 7.32mhz.

Regarding system speed, most 6551’s are set up with a 1.84mhz clock to get perfect data rates.  The crystal is exactly four times the ideal clock.  The 6551’s are gs65sc51p-2, indicating a 2mhz ceiling on system speed.  As the CPU runs in sync with its IO chips, it’s also going to run at a max of 2mhz.

In effort to save costs, only one crystal is on card, and a system clock of less than 1mhz doesn’t sound reasonable for handling six serial ports, so a 1.8mhz system clock is assumed.

The next step is to verify the system’s memory map.  As there’s a PLA on card, it’s not going to be easy to work out the memory map based on the PCB traces and logic on the card.  Instead, I chose to brute-force the memory map via the CPU socket.

Before this could be configured I needed to decide on how exactly to scan the card for its memory map.  The most straightforward method seemed to be driving the address bus through the CPU socket and monitoring the chip select signals on the individual IC’s.  A relatively simple program could be built to do the work and create a log of what addresses involve each chip select line.

In the image above, you can see my trusty keyboard-enhanced Arduino work-alike.  Most people would call this a Commodore 64, but in this case I’m using it in the way most people these days would use an Arduino.

The vertical card is in a bus extender plugged directly into the Commodore’s memory bus.  It’s loaded with a pair of R6522P VIA’s, each of which provide sixteen GPIO lines.  The ribbon cables lead to an Atari VBXE adapter.  The adapter simply mirrors the 40 pins in the dip socket to a header on the side, making it easier to connect to the socket reliably.  Due to physical constraints, I had to insert the adapter upside down, but it can still serve its purpose in this configuration.

The upper VIA has all sixteen GPIO lines routed directly to the CPU socket’s address pins.  You can see this configuration with the two 8-line ‘rainbow’ cables leading from the vertical card to the IIeasy Print card under test.

The lower VIA serves two purposes:  First, port B’s higher bits control the CPU socket’s control bus, managing /reset, r/w, and phi2.  This is the smaller ribbon cable that has blue, white, grey, and purple connecting to the left edge of the header on the CPU socket.

The lower bits on port B as well as all bits on port A are configured as inputs, monitoring chip select signals on each IC I believed to be a memory bus target:  The 2k SRAM, each of the 128K sram’s, the system rom, the six 6551 ICs, and the 74ls374.  For socketed ICs, I simply lifted them from their sockets and attached a male pin lead to the /cs line on the target.  /OE was ignored as most systems based on the 6502 will ground /OE and use /CS and /WE to control the IC.  Chips without sockets had a test pin tack-soldered to their respective /CS (or CLK line in the 74ls374’s case) and a male-female pin lead was routed to the lower VIA’s port A lines.

The next step is to write a program to scan the card.  I wrote this in BASIC.  BASIC is slow, but it’s built into the system ROM on the Commodore and does its job well.

10 V0=56864
11 V1=56880
12 PB(0)=V0
13 PA(0)=V0+15
14 PB(1)=V1
15 PA(1)=V1+15
16 POKE V0+2,255:POKEV0+3,255:REM OUT
17 POKE V1+2,240:POKEV1+3,0
18 REM VIA 0A:     ADR LOW
21 REM VIA 1B7:    PHI2 OUT
22 REM VIA 1B6:    R/W OUT
60 POKE PB(0),0:POKE PA(0),0
70 POKE PB(1),0:FORT=1TO300:NEXT
80 POKE PB(1),32
100 FOR AH=0TO255
110 POKE PB(0),AH
120 FOR AL=0TO255
130 POKE PA(0),AL
140 PRINTAH*256+AL"{up}"
150 Y=PEEK(PA(1))
160 X=PEEK(PB(1))
210 PRINT ,"NEW: ";
220 B=(X AND 15)
230 GOSUB 300
240 B=Y
250 GOSUB 300
310 FOR T=7 TO 0 STEP -1
320 PRINTMID$(STR$(SGN(BAND(2^T))),2,1);
330 IF T=4 THEN PRINT" ";
340 NEXT
350 PRINT"  ";

Initial testing confirmed my bus scanner was working correctly by manually setting GPIO lines and checking the 2k SRAM’s /cs pin.  As expected, the 2k SRAM turned out to be at address 0, so it made a good test target.

From there, the program was run and left to scan the bus, while it displayed any changes in the chip select lines’ outputs.

Would you look at all that dust! :P

After the program was run, I evaluated the chip select log and came to this memory map as a conclusion:
Address Range   IC      Desc
0000-07ff       U503    2116 soldered-in ram
0800-1fff       U503    mirrors of U503
2000-9fff       u401/2  Buffer ram (256k is bankswitched here)
a000-bfff       empty
c000-c01f       U101    6551 acia
c020-c03f       U102    6551 acia
c040-c05f       U103    6551 acia
c060-c07f       U104    6551 acia
c080-c09f       U105    6551 acia
c0a0-c0bf       U106    6551 acia
c0c0-c0df       U701    74ls374
c0e0-c0ff       unknown
c100-c7ff       Mirrors
c800-cbff       U504    Read jumpers
cc00-cfff       Buffer RAM mapping
e000-ffff       U502    Firmware ROM (2764)

Of interest to me was that only 7 of 8 devices are present in the C000 range, and that only 32k of the 128k memory was mapped.  It’s time to read the ROM and get some more educated guesses about the card.

Since most of the hardware to fully control the card was already set up, I placed the system ROM back in its socket and changed the lower VIA’s port A to attach to the data lines.  I slowly but surely extracted the system code from the card and saved it to a file.

After loading the data into RAM on the Commodore and viewing it in a debugger, there appeared to be an initial store that wasn’t obvious:  The CMOS ‘STZ’ instruction was used to store a 0 into $CC00, and the debugger didn’t understand CMOS opcodes.  Further testing and experimentation discovered that the $CC00 register controls memory banking on the two 128k SRAMs as detailed below:

Bit     Effect
0       A15 on U401 and U402
1       0=select U401, 1=Select U402
2       A16 on U401 and U402

There are also many writes to the $C0E0 location in memory, but I wasn’t able to determine what IC is connected to it.  I might discover that later.
As I’m now aware of the system’s memory map, I could consider other uses for the card.  Currently, I”m working on a port of the original Apple2 System Monitor, as it’s almost as simple as a system firmware can get, requires only 2k of space, and is something I’m quite familiar with.  In ideal conditions the only real work for getting minimal functionality from the system monitor firmware would be patching code that writes to the screen and reads from the keyboard.  From there, additional changes could be made.  That will be covered in a later post.

Edited May 14 2015 to correct the ACIAs’ addresses.