A65 Assembler

A65 Assembler ; Copyright (C) Charles W. Marslett

A65 (A very simple 6502 assembler)

The A65 assembler is modeled after the Atari MACRO Assembler (AMAC) that was distributed by APX along with the MEDIT editor and the DDT debugger. Its most significant characteristic is that source files cannot contain line numbers (so editors that use line numbers must be able to write an unnumbered output file). A second characteristic is that it is a disk-to-disk assembler: the source must be in a disk (or cassette) file, and the object code cannot be inserted directly into memory.

The assembler is run from the DOS menu using the 'L' command, and prompts you for the source file name, the object file name and the listing file name: a RETURN in response to the object and listing prompts will cause the assembler to generate files on the same drive with extensions '.OBJ' and '.LST' using the same file name as the source file. If no '.' appears in the source file name, '.ASM' is its assumed extension. A listing can be printed directly by specifying 'P:' as the list file or dumped to the screen by specifying 'E:'. A '-' disables the listing if you do not want one.
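The default naming rules above can be sketched in Python. This is only an illustration of the rules as described, not A65's own code; the function name `default_names` and the handling of a device prefix such as `D1:` are assumptions made for the example.

```python
def default_names(source: str) -> dict:
    """Derive the default file names described above (illustrative sketch)."""
    # Split off a device prefix such as "D1:" if one is present (assumption).
    device, sep, name = source.rpartition(":")
    prefix = device + sep
    # A source name without a '.' is assumed to end in '.ASM'.
    if "." not in name:
        name += ".ASM"
    stem = name.rsplit(".", 1)[0]
    # Object and listing files share the drive and base name of the source.
    return {
        "source": prefix + name,
        "object": prefix + stem + ".OBJ",
        "listing": prefix + stem + ".LST",
    }


print(default_names("D1:GAME"))
```

Under these assumptions, a source name of `D1:GAME` yields `D1:GAME.ASM`, `D1:GAME.OBJ` and `D1:GAME.LST`.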

A start at implementing support for the 65C02 opcodes is in the code, but what is there has not been tested very thoroughly and most of the opcodes and addressing modes are not in place yet. Otherwise, the machine instructions are as any other standard 6502 assembler would expect. Expressions can include +, -, * and / operators as well as HIGH[] and LOW[] functions to extract the high and low bytes of 16-bit numbers. Brackets, [ and ], may be used to group terms in an expression if necessary. Symbols may have up to 11 characters, including upper and lower case letters, underscores and numbers. Symbols must start with an underscore or letter, and case is significant except for the predefined assembler opcodes and the register names A, X and Y.
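The symbol rules and the HIGH[] and LOW[] functions can be modeled with a short Python sketch. The names `is_symbol`, `high` and `low` are hypothetical, and rejecting names longer than 11 characters is an assumption: the text does not say whether A65 rejects or truncates them.

```python
import re

# A letter or underscore first, then letters, digits or underscores,
# for a total of at most 11 characters.
SYMBOL = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,10}")


def is_symbol(name: str) -> bool:
    # Assumption: over-long names are treated as invalid rather than truncated.
    return SYMBOL.fullmatch(name) is not None


def high(value: int) -> int:
    # Models HIGH[value]: the upper byte of a 16-bit number.
    return (value >> 8) & 0xFF


def low(value: int) -> int:
    # Models LOW[value]: the lower byte of a 16-bit number.
    return value & 0xFF
```

For the value $1234, `high` gives $12 and `low` gives $34.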

The machine opcodes recognized are:

ADC, AND, ASL, BCC, BCS, BEQ, BIT, BMI, BNE, BPL, BRA (65C02 only), BRK, BVC, BVS, CLC, CLD, CLI, CLV, CMP, CPX, CPY, DEC, DEX, DEY, EOR, INC, INX, INY, JMP, JSR, LDA, LDX, LDY, LSR, NOP, ORA, PHA, PHP, PLA, PLP, ROL, ROR, RTI, RTS, SBC, SEC, SED, SEI, STA, STX, STY, TAX, TAY, TSX, TXA, TXS, TYA

The assembler directives are:

DB defines a byte, in decimal (12), hex ($0C), octal (@14) or binary (%00001100) and may also be a character string (enclosed in quotes)

DC defines a byte, as above, but with the high bit set (if defining a string, only on the last byte of the string)

DW defines a word, low byte first, as if an indirect pointer

DS allocates a number of bytes without storing any data in them
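As a rough model of the data directives above, the following Python sketch shows the four number notations and the bytes that DC and DW would produce. The helper names are hypothetical, and encoding strings as plain ASCII is an assumption that only holds for ordinary printable characters.

```python
def parse_number(text: str) -> int:
    # '$' marks hex, '@' octal, '%' binary; anything else is decimal.
    radix = {"$": 16, "@": 8, "%": 2}
    if text[0] in radix:
        return int(text[1:], radix[text[0]])
    return int(text, 10)


def dc_string(s: str) -> bytes:
    # DC on a string: only the last byte has its high bit set.
    data = bytearray(s.encode("ascii"))
    if data:
        data[-1] |= 0x80
    return bytes(data)


def dw(value: int) -> bytes:
    # DW stores a 16-bit word low byte first.
    return bytes([value & 0xFF, (value >> 8) & 0xFF])
```

All four examples from the DB description (12, $0C, @14 and %00001100) parse to the same value, and `dw(0x1234)` produces the bytes $34, $12.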

ORG sets the location counter to a fixed address

EQU defines a symbol to have a specific value

= same as EQU

INCLUDE causes the file specified to be inserted at this point in the assembly (can be nested if the DOS supports sufficient numbers of files open at once -- the object, list and source files are kept open at all times, and each level of nested includes requires one more open file. MYDOS and Atari DOS 2 allow 3 normally, which does not allow a listing and include files. If set to 5, the assembler could generate a listing and handle 1 level of nested includes)

TITLE specifies the first title line

SUBTTL allows entering a second title line

PAGE causes the assembler to go to the top of the next listing page

END terminates the program and specifies INIT and RUN addresses

LIST a stub for future expansion

MACRO another stub (does nothing, not even generate an error)

MEND another stub

Octal numbers, EQU, *=, and INIT and RUN addresses may have bugs in them -- good luck. Most of the rest has been debugged reasonably well.

The END statement can have the following forms:

END            no RUN or INIT vectors generated at all
END ADDR       ADDR is the RUN entry point
END ADDR,      ADDR is the INIT entry point (no RUN vector)
END INIT,RUN   both vectors specified
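The four END forms can be summarized by a small Python sketch that maps the operand field to the vectors it requests. The function `end_vectors` is hypothetical and only mirrors the forms listed above; it says nothing about how A65 itself parses the statement.

```python
def end_vectors(operand: str) -> dict:
    """Map an END operand field to the vectors it requests (illustrative)."""
    operand = operand.strip()
    if not operand:
        # END with no operand: neither vector is generated.
        return {"INIT": None, "RUN": None}
    if "," not in operand:
        # END ADDR: a single address is the RUN entry point.
        return {"INIT": None, "RUN": operand}
    # END ADDR, or END INIT,RUN: INIT comes before the comma, RUN after it.
    init, run = (part.strip() for part in operand.split(",", 1))
    return {"INIT": init or None, "RUN": run or None}
```

For example, `end_vectors("SETUP,")` requests only an INIT vector, while `end_vectors("SETUP,START")` requests both.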

(This is what I call minimum documentation, I will add to it as questions are asked)

Charles Marslett 8/21/85

Release 1.1 Changes since the first release:

The assembler now behaves more reasonably when a forward reference is encountered (it used to just generate junque quietly!) -- forward references are forced to 16-bit values if possible, otherwise A65 assumes you know what you are doing, and assumes it will be an 8-bit value when it is defined.

The assembler allows upper and lower case symbols, and it is case sensitive. It still ignores case, however, when handling the directives and machine opcodes as well as the register names ("A", "X" and "Y"). Underscores are also allowed in symbols and are treated as letters (makes C people happy!). Symbols may also be up to 11 characters long (rather than 8 as before).

The initial load address is the HIMEM value from DOS (the lowest possible load address) rather than a random value left over in memory.

Errors are marked in the listing file (as before) but they are also echoed to the screen, and the total number of errors (in HEX) is reported at the end of the assembly. If any errors occur, the automatic return to the DOS prompt is delayed until the [RETURN] key is pressed to give you an opportunity to note any (unexpected) errors reported.

Finally, some errors that were previously accepted (and generated bad code) are now caught and flagged as errors. These included multiply defined symbols and some invalid addressing modes.

Charles Marslett 6/16/89

Release 1.2 Changes since the second release:

Source code lines can be delimited by an ATASCII EOL character ($9B) or a Unix style EOL (a single $0A byte) or a CP/M (or MSDOS) EOL (either $0D followed by $0A or $0A followed by $0D). ASM5.ASM was changed.
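One way to picture the line-ending rules is the Python sketch below, which splits a source buffer on any of the four delimiters. It is an illustration only, not the logic in ASM5.ASM; the function name `split_lines` and the choice to treat a lone $0D as an ordinary character are assumptions.

```python
def split_lines(data: bytes) -> list:
    """Split source text on ATASCII, Unix or CP/M-style line endings."""
    lines, current, i = [], bytearray(), 0
    while i < len(data):
        pair = data[i:i + 2]
        if pair in (b"\r\n", b"\n\r"):
            step = 2          # $0D $0A or $0A $0D counts as one EOL
        elif data[i] in (0x9B, 0x0A):
            step = 1          # ATASCII EOL or a single Unix $0A
        else:
            # Ordinary character (a lone $0D also lands here, by assumption).
            current.append(data[i])
            i += 1
            continue
        lines.append(bytes(current))
        current = bytearray()
        i += step
    if current:
        lines.append(bytes(current))
    return lines
```

With this sketch, a buffer that mixes all four delimiters still splits into one entry per source line.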

Charles Marslett 7/4/89