This page (revision-9) was last changed on 05-Jan-2024 10:23 by bartgo 

This page was created on 05-Apr-2010 19:46 by Carsten Strotmann

Page revision history

|| Version || Date Modified || Size || Author ||
| 9 | 05-Jan-2024 10:23 | 13 KB | bartgo |
| 8 | 04-Jan-2024 22:34 | 13 KB | bartgo |
| 7 | 04-Jan-2024 22:23 | 13 KB | bartgo |
| 6 | 04-Jan-2024 22:18 | 40 KB | bartgo |
| 5 | 04-Jan-2024 22:09 | 15 KB | bartgo |
| 4 | 04-Jan-2024 22:06 | 15 KB | bartgo |
| 3 | 03-Feb-2023 15:21 | 14 KB | Carsten Strotmann |
| 2 | 05-Apr-2010 19:46 | 14 KB | Carsten Strotmann |
| 1 | 05-Apr-2010 19:46 | 14 KB | Carsten Strotmann |

Version management

Difference between version and

At line 1 removed one line
At line 11 changed one line
__''Note: the code below was published in ANTIC 3/84 with errors that prevented it from running. Errata were published in the following issue. The corrections, on the first screen (lines 10 and 13) and the last screen (first line), have been applied below. The full code (tested with APX Forth and adapted for pasting into Altirra and the Forth editor) is maintained here: [https://github.com/BartGo/forth-atari/blob/main/DISASM.4TH]''__
At line 13 changed one line
__''To run: ''__
At line 15 changed one line
__''{{{' WORD-TO-CHECK DIS}}}''__
At line 17 changed one line
__''It works in APX Forth but has trouble in Antic Forth (1.4S).''__
At line 18 added 10 lines
__''There may still be errors (the disassembly results don't look reliable); this needs further research.''__
 
This article describes a Forth program to disassemble 6502 machine code instructions. Using it, you can get a listing of assembler mnemonics to help you figure out how programs written by others are put together. This is one of the best ways of improving your programming skills.
The article is divided into two parts. The first part provides some background information on the 6502 instruction set, to help you understand how the disassembler works. It is not necessary to read this part to get the disassembler working, but it will help you to understand the output. The second part describes the program itself and gives a sample result from running it.
Machine instructions can contain up to three bytes, the first of which is the operation code (telling the machine what to do), and the remainder give the operand or its address. These "address" bytes can be interpreted in one of several different ways, depending on the "addressing mode".
At line 21 changed 2 lines
{{{
___________________
{{{___________________
At line 29 changed one line
In the 6502, absolute addresses require two bytes, and the most significant digits of the address are stored in the byte with the highest address. That is why the absolute addresses are shown as CDAB. The notation (X) indicates "the contents of the X register." In this notation, (00AB) + (X) indicates "the contents of the memory byte at address 00AB plus the contents of the X register." A comma is used to separate the high and low bytes of an address where clarity requires.
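The byte order described above can be illustrated with a few lines of Python. This is only a sketch and not part of the original Forth listing; the function name `read_address` is hypothetical.

```python
def read_address(memory, offset):
    """Return the 16-bit address stored low byte first at memory[offset]."""
    low = memory[offset]        # byte at the lower address: least significant
    high = memory[offset + 1]   # byte at the higher address: most significant
    return (high << 8) | low


# JMP $CDAB is encoded as opcode 4C followed by AB (low byte) and CD (high byte).
code = bytes([0x4C, 0xAB, 0xCD])
print(f"{read_address(code, 1):04X}")  # prints CDAB
```

A disassembler has to perform this swap whenever it prints a two-byte operand.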
At line 31 changed one line
All multiple address mode instructions in the 6502 instruction set can be used in the absolute address mode. The numerical "mode number" shown in the Table of Address Modes is the difference between an instruction's absolute address opcode and its "mode" opcode (plus hex 10 to avoid negative mode numbers).
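As a worked instance of this rule, the Python sketch below computes the mode number for each ADC opcode. The opcodes are the standard 6502 values; the mode names are the usual 6502 names rather than a copy of the article's Table of Address Modes, and the function name `mode_number` is hypothetical.

```python
def mode_number(absolute_opcode, opcode):
    """Mode number = (absolute-mode opcode + hex 10) - opcode."""
    return absolute_opcode + 0x10 - opcode


ADC_ABSOLUTE = 0x6D
ADC_OPCODES = {
    "Absolute,X": 0x7D,
    "Absolute,Y": 0x79,
    "Zero Page,X": 0x75,
    "(Indirect),Y": 0x71,
    "Absolute": 0x6D,
    "Immediate": 0x69,
    "Zero Page": 0x65,
    "(Indirect,X)": 0x61,
}

for name, opcode in ADC_OPCODES.items():
    print(f"{name:13} {opcode:02X} -> mode {mode_number(ADC_ABSOLUTE, opcode):02X}")
```

Every ADC mode number comes out as a multiple of four between 00 and 1C, which appears to be why the program can work with MODE/4 as a compact index.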
At line 33 changed one line
A table of the absolute address opcodes (+10) for the various multiple address instructions, called MULTIMODE, is included in screen 30. Given an arbitrary opcode (say 75) we can find the first entry in MULTIMODE which exceeds the opcode (7D in this case) and subtract to get the mode number (08, corresponding to Zero Page, X). The mnemonic can be read as ADC from the ninth entry in MULTINAME (7D is the tenth entry in MULTIMODE).
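The lookup can be sketched in Python as follows. The tables here are a hypothetical eight-entry subset built from the standard absolute-mode opcodes of ORA, AND, EOR, ADC, STA, LDA, CMP and SBC (each plus hex 10); they are not the 22-entry MULTIMODE and MULTINAME tables from screen 30. The search accepts an entry that equals the opcode as well as one that exceeds it, so that a mode number of 00 is found.

```python
import bisect

# Hypothetical subset: absolute-mode opcode + 0x10, in ascending order.
BASE_CODES = [0x1D, 0x3D, 0x5D, 0x7D, 0x9D, 0xBD, 0xDD, 0xFD]
NAMES = ["ORA", "AND", "EOR", "ADC", "STA", "LDA", "CMP", "SBC"]


def decode(opcode):
    """Return (mnemonic, mode number) for an opcode of one of the eight
    instructions in the subset above."""
    index = bisect.bisect_left(BASE_CODES, opcode)  # first entry >= opcode
    if index == len(BASE_CODES):
        raise ValueError(f"opcode {opcode:02X} is beyond the table")
    return NAMES[index], BASE_CODES[index] - opcode


name, mode = decode(0x75)
print(name, f"{mode:02X}")  # prints ADC 08
```

With the full table, neighbouring base codes such as 7D and 7E sit next to each other, so the first candidate is not always the right one; that is the case the CHKMODE word deals with.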
At line 35 changed one line
The 22 entries in MULTIMODE account for 117 of the 151 valid 6502 opcodes (out of a maximum of 256 possible). The remaining 34 opcodes each identify a single address mode instruction and are dealt with by looking up tables called ONEMODE and ONENAME. These tables also include ten renegade multiple-address opcodes that, for reasons best known to the 6502 designers, don't yield the correct mode number. The most irregular instruction is LDX, where only two of its five address modes fit the pattern. That is why LDX appears three times in ONENAME.
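The LDX irregularity can be checked with a short Python sketch. It assumes the mode numbers follow the pattern obtained from the standard ADC opcodes (00 for Absolute,X up to 1C for (Indirect,X)); the article's own Table of Address Modes may label the modes differently. The names `REGULAR`, `LDX` and `fits_pattern` are hypothetical.

```python
# Mode number -> addressing mode, as derived from the standard ADC opcodes.
REGULAR = {
    0x00: "Absolute,X", 0x04: "Absolute,Y", 0x08: "Zero Page,X",
    0x0C: "(Indirect),Y", 0x10: "Absolute", 0x14: "Immediate",
    0x18: "Zero Page", 0x1C: "(Indirect,X)",
}

# Standard 6502 opcodes for LDX.
LDX = {
    "Immediate": 0xA2, "Zero Page": 0xA6, "Zero Page,Y": 0xB6,
    "Absolute": 0xAE, "Absolute,Y": 0xBE,
}


def fits_pattern(actual_mode, opcode, absolute_opcode):
    """True if the computed mode number names the opcode's real mode."""
    return REGULAR.get(absolute_opcode + 0x10 - opcode) == actual_mode


regular = [m for m, op in LDX.items() if fits_pattern(m, op, LDX["Absolute"])]
renegade = [m for m in LDX if m not in regular]
print(regular)   # the modes that fit the pattern
print(renegade)  # the modes that need their own single-mode table entries
```

Under these assumptions only Zero Page and Absolute fit, leaving three LDX opcodes to be handled as single-mode entries.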
At line 41 changed one line
Now for the disassembler program itself. The listing appears in screens 30 to 35. As is usual in Forth listings, the interesting part of the program appears last. The first few screens contain the building blocks from which the main program, called "DISASSEMBLE," or "DIS" for short, is constructed.
At line 43 changed 10 lines
Before describing how DISASSEMBLE works, I shall define what each of the words used in the program does.
POINTER A variable containing the address of the current opcode.
ONEMODE A table containing the opcodes of those instructions which have only one addressing mode. Each entry consists of two bytes; the first byte gives the mode number and the second is the opcode.
STRING Compiles the following text stream into the dictionary.
ONENAME A table containing the mnemonics of those instructions which have only one addressing mode.
MULTIMODE A table of base codes for those instructions which have multiple address modes. The "base code" for an instruction is its absolute mode opcode plus hex 10.
MULTINAME A table containing the mnemonics of those instructions which have multiple address modes.
MODE A table containing mnemonics describing the various addressing modes.
LENGTH A table giving the number of bytes which follow the opcode for the various addressing modes.
SEARCH OP add len -- I f
At line 54 changed 7 lines
Searches a table of two-byte words of length "len" beginning at address "add" for a match to the single byte OP. The table must be arranged in ascending order; I is the index number of the first table entry which either equals OP (f = 1) or exceeds OP (f = 0).
PRINTNAME I add -- Prints three characters (i.e., the instruction mnemonic) beginning at address add + 3*I.
PRINTMODE MODE/4 -- Prints the two-character mnemonic corresponding to the addressing mode "MODE".
PRINTADD MODE/4 -- f Prints the one- or two-byte "address" (if one exists) following the opcode. Sets f to 1 (to terminate disassembly) if the opcode is one of five instructions which can cause a jump.
CHKMODE I X -- J MODE/4 Calculates the address mode of the current opcode against the Ith base code in the MULTIMODE table and checks whether it is a multiple of four (i.e., divisible by four with no remainder). If so, J = I; otherwise J = I + 1.
DISASSEMBLE add -- Disassembles the code beginning at address "add".
DIS A synonym for DISASSEMBLE.
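The behaviour described for SEARCH can be sketched in Python as below. This is an illustration, not a translation of the Forth word: the table is modelled as a plain list of key bytes rather than two-byte words in memory, and returning the table length with f = 0 when OP exceeds every entry is an assumption, since the description does not cover that case.

```python
def search(op, keys):
    """Return (index, found) for the first key that equals or exceeds op.

    keys must be in ascending order. found is True when the key equals op
    (f = 1 in the Forth description) and False when it exceeds op (f = 0).
    """
    for index, key in enumerate(keys):
        if key >= op:
            return index, key == op
    return len(keys), False  # assumption: op is larger than every entry


# Four single-mode opcodes in ascending order: BRK, PHP, ASL A, CLC.
keys = [0x00, 0x08, 0x0A, 0x18]
print(search(0x0A, keys))  # prints (2, True)
print(search(0x09, keys))  # prints (2, False)
```

The same routine serves both tables: for ONEMODE the flag decides whether the opcode is a single-mode instruction, while for MULTIMODE only the index is used.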
At line 62 changed one line
Armed with the definitions of the "building blocks," we can now analyze the "main program." I have found the coding form used in the box headed "Description of Disassemble" useful for both analyzing and writing Forth code. The first column is an instruction number (for reference); the second, the contents of the stack; and the third, the instruction (Forth word). The number against a stack entry identifies the instruction removing that entry from the stack.
At line 64 changed one line
You can use a form like the one I have just described to analyze the remaining code. Of course, you will need to know Forth or have the fig-Forth glossary in front of you. Finally, here is an example of using the disassembler to see how the Forth word C@ replaces an address on the stack with the contents of that address.
At line 66 changed one line
The process is started by entering ' C@ DIS ~[RETURN]. This sequence puts the parameter field address of C@ on the stack and starts disassembly. The result looks like:
At line 68 changed 2 lines
{{{
13FB LDA X) 0
{{{13FB LDA X) 0
At line 75 changed one line
Note that the address 0,X points to the byte on the bottom of the data stack (it grows down!) and 1,X is the next byte up. F47 is the address of the Forth procedure NEXT, which passes execution to the next Forth word.
At line 77 changed one line
The first instruction loads the accumulator with the byte which was at the 16-bit address on the "top" (physically at the bottom) of the stack. The second instruction at 13FF stores the contents of the Y register (which you can count on being zero) into the high order byte on "top" of the stack. Thus the address on the "top" of the stack is replaced by the byte which was (and still is) stored at that address.
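The effect of these instructions on the data stack can be mimicked in Python. The sketch assumes the sequence LDA (0,X), STA 0,X, STY 1,X, as in the standard fig-Forth 6502 definition of C@; the middle STA step is taken from that source rather than from the description above. Zero-page wrap-around is ignored, and the function name `c_fetch` and the sample addresses are hypothetical.

```python
def c_fetch(memory, x, y=0):
    """Replace the 16-bit address at memory[x], memory[x+1] (low byte first)
    with the byte stored at that address, as the Forth word C@ does."""
    address = memory[x] | (memory[x + 1] << 8)  # LDA (0,X): follow the pointer
    a = memory[address]
    memory[x] = a                               # STA 0,X: low byte := fetched byte
    memory[x + 1] = y                           # STY 1,X: high byte := 0


memory = bytearray(0x10000)
memory[0x2000] = 0x7F                  # the byte to be fetched
x = 0x80                               # data stack pointer held in X
memory[x], memory[x + 1] = 0x00, 0x20  # top of stack holds the address 2000
c_fetch(memory, x)
print(f"{memory[x + 1]:02X}{memory[x]:02X}")  # prints 007F
```

The byte at address 2000 is untouched, which matches the remark that it "was (and still is)" stored there.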
At line 79 changed one line
A word of warning: DISASSEMBLE will disassemble anything! It does not try to stop you from disassembling data, Forth code or even machine code starting at the wrong point. However, you can easily detect a listing of gibberish. The listing will tend to be long (over a screen), the addresses will be all over the place and rarely used instructions will pop up frequently.
At line 84 removed one line
{{{
At line 86 changed 2 lines
SCR #30
{{{SCR #30
At line 103 changed 3 lines
15 -->
SCR #31
15 -->SCR #31
At line 121 changed 3 lines
15 ENDIF ; -->
SCR #32
15 ENDIF ; -->SCR #32
At line 139 changed 5 lines
15 -->
SCR #33
15 -->SCR #33
At line 159 changed 5 lines
15 -->
SCR #34
15 -->SCR #34
At line 180 removed 2 lines
At line 198 changed one line
15 : DIS DISASSEMBLE ;
15 : DIS DISASSEMBLE ;}}}
At line 200 removed 2 lines
}}}
At line 204 changed 48 lines
 
At line 253 changed one line
 
At line 182 added one line
 
At line 184 added 54 lines
|| Step || Stack || Instruction || Comment ||
| | OPadd (2) | | Address of current opcode. |
| 1 | | POINTER ! | Store OPadd in POINTER. |
| 3 | | CR | Start a new line. |
| 4 | | BEGIN CR | Start a loop with a new line. |
| 6 | OPadd (11) | POINTER @ | Fetch the opcode address. |
| 7 | OPadd (9) | DUP | |
| 8 | 0 (9) | 0 | Print the address (double precision - |
| 9 | | D. | to avoid negative addresses!) |
| 10 | | 2 SPACES | Leave 2 spaces. |
| 11 | OP (14) | C@ | Fetch the opcode. |
| 12 | OMad (14) | ' ONEMODE | Calculate the start address for ONEMODE table. |
| 13 | 2D (14) | 2D | ONEMODE is 2D (45) entries long. |
| 14 | I (20)(27) | SEARCH | For OP in ONEMODE table. Leave - |
| | f (15) | | Index and flag. |
| 15 | | IF | Test f. False part starts at Step 26. |
| 16 | I (18) | DUP | TRUE part (f = 1), i.e., OP is in the ONEMODE table. |
| 17 | ONad (18) | ' ONENAME | Start address for ONENAME table. |
| 18 | | PRINTNAME | Type the mnemonic. |
| 20 | 2I + 1 (23) | 2 * 1+ | |
| 22 | OMad (23) | ' ONEMODE | Start address of ONEMODE table. |
| 23 | OMad+2I+1 (24) | + | Address of MODE for entry I. |
| 24 | MODE (25) | C@ | Fetch MODE. |
| 25 | MODE/4 (41) | 4 / | True part jumps to Step 38. |
| 26 | | ELSE | FALSE part: OP is not in ONEMODE. |
| 27 | | DROP | The index left by SEARCH (Step 14). |
| 28 | OP | POINTER @ C@ | Prepare to search MULTIMODE - |
| 29 | MMad | ' MULTIMODE | table for opcode. Table is |
| 30 | 16 | 16 | 16 (hex) entries long. |
| 31 | I (32) | SEARCH | For OP. Leave Index and flag. |
| | f (32) | | f is not used in this case. |
| 32 | J (33) | CHKMODE | Check whether MODE for entry - |
| | MODE/4 (33) | | I in MULTIMODE table is - |
| 33 | J (34) | CHKMODE | divisible by 4. If it is - |
| | MODE/4 (34) | | return J = I, otherwise J = I + 1. |
| 34 | J (35) | CHKMODE | It may be necessary to - |
| | MODE/4 (41) | | increment I twice (3 CHKMODEs). |
| 35 | J (37) | SWAP | |
| 36 | MNad (37) | ' MULTINAME | Start address for MULTINAME table. |
| 37 | | PRINTNAME | Type the mnemonic. |
| 38 | | ENDIF | Terminates the IF at Step 15. |
| 39 | MODE/4 (40) | DUP | |
| 40 | | PRINTMODE | Print address mode mnemonics. |
| 41 | f (43) | PRINTADD | Print the address part of the instruction and update the pointer. f = 1 indicates a jump instruction (finish). |
| 42 | f (43) | ?TERMINAL | f = 1 indicates a key is pressed (finish). |
| 43 | f (44) | OR | |
| 44 | | UNTIL | Jump to BEGIN (Step 4) if f = 0. |
| 45 | | ; | END |
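Steps 28 to 37 of the table, the MULTIMODE lookup followed by three CHKMODE calls, can be sketched in Python. The loop below stands in for the three chained CHKMODE words and is an interpretation of their description, not a translation of the Forth code. The five-entry tables are a hypothetical subset built from the standard absolute-mode opcodes of ORA, ASL, BIT, AND and ROL (each plus hex 10), not the tables from screen 30.

```python
import bisect

# Hypothetical subset of base codes (absolute opcode + 0x10), ascending.
BASE_CODES = [0x1D, 0x1E, 0x3C, 0x3D, 0x3E]
NAMES = ["ORA", "ASL", "BIT", "AND", "ROL"]


def resolve(opcode):
    """Return (mnemonic, mode // 4) for an opcode in the subset above."""
    index = bisect.bisect_left(BASE_CODES, opcode)  # SEARCH: first entry >= opcode
    for _ in range(3):                              # three CHKMODE attempts
        if index >= len(BASE_CODES):
            break
        mode = BASE_CODES[index] - opcode
        if mode % 4 == 0:                           # a valid mode number
            return NAMES[index], mode // 4
        index += 1                                  # try the next base code
    raise ValueError(f"opcode {opcode:02X} not resolved")


print(resolve(0x26))  # ROL zero page: needs two increments, prints ('ROL', 6)
print(resolve(0x25))  # AND zero page: needs one increment, prints ('AND', 6)
print(resolve(0x24))  # BIT zero page: no increment, prints ('BIT', 6)
```

The BIT, AND and ROL base codes (3C, 3D, 3E) are consecutive, which shows why the index may have to be incremented twice before the difference becomes divisible by four.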
'John Mattes, from Sydney, Australia, is an electrical engineer who has worked in telecommunications for 20 years. He says he is "absorbed" in using Forth with his Atari 800.'