          A Proposed Assembly Language Syntax For 65c816 Assemblers

                              by Randall Hyde


     This is a proposed standard for 65c816 assembly language. The
proposed standard comes in three levels: subset, full, and extended. The
subset standard is intended for simple (or inexpensive) products,
particularly those aimed at beginning 65c816 assembly language programmers.
The full standard is the focus of this proposal. An assembler meeting the
full level adopts all of the requirements outlined in this paper. The
extended level is a mechanism whereby a vendor can claim full compliance
with the standard and point out that there are extensions as well. An
assembler cannot claim extended level compliance unless it also complies
with the full standard. An assembler, no matter how many extensions are
incorporated, will have to claim subset level unless the full standard is
supported. This ensures that programmers who do not use any assembler
extensions can assemble their programs on any assembler meeting the full or
extended compliance levels.

     In addition to the items required for compliance, this proposal
suggests several extensions in the interests of compatibility with existing
65c816 assemblers. These recommendations are not required for full
compliance with the standard; they're included in this proposal as
suggestions to help make conversion of existing programs easier. The
suggestions are presented in two levels: recommended and optional.
Recommended items should be present in any decent 65c816 package. Optional
items are discouraged (since there are other ways to accomplish the same
operation within the confines of the standard) but may be included in the
assembler at the vendor's discretion to help alleviate conversion problems.

                       65c816 Instruction Mnemonics
                       ----------------------------

     All of the following mnemonics are required at the subset, full, and
extended standard levels.

     The following mnemonics handle the basic 65c816 instruction set:

ADC - add with carry
AND - logical AND
BCC - branch if carry clear
BCS - branch if carry set
BEQ - branch if equal
BIT - bit test
BMI - branch if minus
BNE - branch if not equal
BPL - branch if plus
BRA - branch always
BRK - break point instruction
BVC - branch if overflow clear
BVS - branch if overflow set
CLC - clear the carry flag
CLD - clear the decimal flag
CLI - clear the interrupt flag
CLP - clear bits in P
CLR - store a zero into memory
CMP - compare accumulator
CPX - compare x register
CPY - compare y register
CSP - call system procedure
DEC - decrement acc or memory
DEX - decrement x register
DEY - decrement y register
EOR - exclusive-or accumulator
HLT - halt (stop) the clock
INC - increment acc or memory
INX - increment x register
INY - increment y register
JMP - jump to new location
JSR - jump to subroutine
LDA - load accumulator
LDX - load x register
LDY - load y register
MVN - block move (decrement)
MVP - block move (increment)
NOP - no operation
ORA - logical or accumulator
PHA - push accumulator
PHP - push p
PHX - push x register
PHY - push y register
PLA - pop accumulator
PLP - pop p
PLX - pop x register
PLY - pop y register
PSH - push operand
PUL - pop operand
RET - return from subroutine
ROL - rotate left acc/mem
ROR - rotate right acc/mem
RTI - return from interrupt
RTL - return from long subroutine
RTS - return from short subroutine
SBC - subtract with carry
SED - set decimal flag
SEI - set interrupt flag
SEP - set bits in P
SHL - shift left acc/mem
SHR - shift right acc/mem
STA - store accumulator
STX - store x register
STY - store y register
SWA - swap accumulator halves
TAD - transfer acc to D
TAS - transfer acc to S
TAX - transfer acc to x
TAY - transfer acc to y
TCB - test and clear bit
TDA - transfer D to acc
TSA - transfer S to acc
TSB - test and set bit
TSX - transfer S to X
TXA - transfer x to acc
TXS - transfer x to S
TXY - transfer x to y
TYA - transfer y to acc
TYX - transfer y to x
WAI - wait for interrupt
XCE - exchange carry with emulation bit

Comments:

     CLP replaces REP in the original 65c816 instruction set, since CLP
is a tad more consistent with the original 6502 instruction set. See
"recommended options" for the status of REP. CLR replaces the STZ
instruction. Since STA, STX, and STY are used to store 65c816 registers,
STZ seems to imply that there is a Z register. Using CLR (clear) eliminates
any confusion. CSP (call system procedure) replaces the COP mnemonic. COP
was little more than a software interrupt in both intent and implementation.
CSP helps make this usage a little clearer. HLT replaces the STP mnemonic.
STP, like the STZ mnemonic, implies that the P register is being stored
somewhere. HLT (for halt) is just as obvious as "stop the clock" yet it
doesn't have the same "look and feel" as a store instruction. JML and JSL
are not really required by the new standard; but see recommended options
concerning these two instructions. Most of the new 65c816 push and pull
instructions have been collapsed into two instructions: PSH and PUL.

          PEA label      becomes      PSH #label
          PEI (label)    becomes      PSH label
          PER label      becomes      PSH @label
          PHB            becomes      PSH DBR
          PHD            becomes      PSH D
          PHK            becomes      PSH PBR
          PLB            becomes      PUL DBR
          PLD            becomes      PUL D

These mnemonics are more in line with the original design of the 6502
instruction set whereby the mnemonic specifies the operation and the operand
specifies the addressing mode and address. The RET instruction gets
converted to RTS or RTL, depending on the type of subroutine being declared.
RTS and RTL still exist in order to force a short or long return. SHL and
SHR (shift left and shift right) are used instead of ASL and LSR. The 6500
family has NEVER supported an arithmetic shift left instruction. The
operation performed by the ASL mnemonic is really a logical shift left.
To simplify matters, SHL and SHR are used to specify shift left and shift
right. SWA (swap accumulator halves) is used instead of XBA. Since this is
the only instruction that references the "B" accumulator, there's no valid
reason for even treating the accumulator as two distinct entities (this is
just a carry-over from the 6800 MPU). Likewise, since the eight-bit
accumulator cannot be distinguished from the 16-bit accumulator on an
instruction by instruction basis (it depends on the setting of the M bit in
the P register), the accumulator should always be referred to as A,
regardless of whether the CPU is in the eight or sixteen bit mode.
Therefore, instructions like TCD, TCS, TDC, and TSC should be replaced by
TAD, TAS, TDA, and TSA. For more info on these new mnemonics, see the
section on "recommended options".


                              Built-in Macros
                              ---------------

     The following instructions actually generate one or more instructions.
They are not required at the subset level, but are required at the full and
extended levels.

ADD - emits CLC then ADC
BFL - emits BEQ (branch if false)
BGE - emits BCS
BLT - emits BCC
BTR - emits BNE (branch if true)
BSR - emits PER *+2 then BRA (short) or PER *+3 then BRL (long)
SUB - emits SEC then SBC


                            Recommended Options
                            -------------------

     The following mnemonics are aliases of existing instructions. The
(proposed) standard recommends that the assembler support these mnemonics,
mainly to provide compatibility with older source code, but does not
recommend their use in new programs. Some (or all) of these items may be
removed from the recommended list in future revisions of the standard. None
of these recommended items need be present at the subset level. If these
are the only extensions over and above the full syntax, the assembler
CANNOT claim to be an extended level assembler.

     ASL    BRL    COP    JML    JSL    LSR    PEA    PEI    PER
     PHB    PHD    PHK    PLB    PLD    REP    TCD    TCS    TDC
     TSC    TRB    WDM    XBA


                    Symbols, Constants, and Other Items
                    -----------------------------------

     Symbols may contain any reasonable number of characters at the full
level. At the subset compliance level, at least 16 characters should be
supported and 32 is recommended. A "reasonable" number of characters should
be at least 64 if the implementor needs a maximum value.

     Symbols must begin with an alphabetic character and may contain
(only) the following symbols: A-Z, a-z, 0-9, "_", "$", and "!". The
assembler must be capable of treating upper and lower case alphabetic
characters identically. Note that this does not disallow an assembler from
allowing the programmer to choose that upper and lower case be distinct; it
simply requires that in the default case, upper and lower case characters
are treated identically. Note that the standard does not require case
sensitivity in the assembler (and, in fact, recommends against it).
Therefore, anyone foolish enough (for many, many reasons) to create
variables that differ only in the case of the letters they contain is
risking portability problems (as well as maintenance, readability, and
other problems).

     The following symbols are reserved and may not be redefined within
the program:

          A, X, Y, S, DBR, PBR, D, M, P

Nor may these symbols appear as fields to a record or type definition (which
will be described later).

     Constants take six different forms: character constants, string
constants, binary constants, decimal constants, hexadecimal constants, and
set constants.

     Character constants are created by surrounding a single character by
a pair of apostrophes or quotation marks, e.g., "s", "a", '$', and 'p'. If
the character is surrounded by apostrophes, then the ASCII code for that
character WITH THE H.O. BIT CLEAR will be used. If the quotation marks are
used, then the ASCII code for the character WITH THE H.O. BIT SET will be
used.
If you need to represent the apostrophe with the H.O. bit clear or a
quotation mark with the H.O. bit set, simply double up the characters, e.g.,

          ''''   - emits a single apostrophe.
          """"   - emits a single quotation mark.

     String constants are generated by placing a sequence of two or more
characters within a pair of apostrophes or quotation marks. The choice of
apostrophe or quotation mark controls the H.O. bit, as for character
constants. Likewise, to place an apostrophe or quote within a string
delimited by the same character, just double up the apostrophe or quotation
mark:

          'This isn''t bad!'     - generates --This isn't bad!--
          "He said ""Hello"""    - generates --He said "Hello"--

     Binary integer constants consist of a sequence of 1 through 32 zeros
or ones preceded by a percent sign ("%"). Examples:

          %10110010
          %001011101
          %10
          %1100

     Decimal integer constants consist of strings of decimal digits without
any preceding characters. E.g., 25, 235, 8325, etc. Decimal constants
may be (optionally) preceded by a minus sign.

     Hexadecimal constants consist of a dollar sign ("$") followed by
a string of hexadecimal digits (0..9 and A..F). Values in the range $0
through $FFFFFFFF are allowed.

     Set constants are only required at the full and extended compliance
levels. A set constant consists of a list of items surrounded by braces,
e.g., {0,3,5}. For more information, see the .SET directive.


                            Address Expressions
                            -------------------

     Most instructions and many pseudo-opcode/assembler directives require
operands of some sort. Often these operands contain some sort of address
expression (that is, something that ultimately yields a numeric or string
value). This proposed standard defines the operands, precision, accuracy,
and available operations that constitute an address expression.

Precision: all integer expressions are computed using 32 bits. All string
expressions are computed with strings up to 255 characters in length.
All floating point operations are performed using IEEE 80-bit extended
floating point values (i.e., Apple SANE routines). All set operations are
performed using 32 bits of precision.

Accuracy: all integer operations (consisting of two 32-bit operands and an
operator on those operands) must produce the correct result if the actual
result can fit within 32 bits. If an overflow occurs, the value is truncated
and only the low order 32 bits are retained. If an underflow occurs, zero
is used as the result. If an overflow or underflow occurs, a special bit
will be set (until the next value is computed) that can be tested by the
".IFOVR" and ".IFUNDR" directives. Other than that, such errors are ignored.
All arithmetic is performed using unsigned arithmetic operations. All
floating point operations follow the IEEE (and Apple SANE) suggestions;
floating point errors are otherwise ignored by the assembler. Any string
operation producing a string longer than 255 characters produces an assembly
time error. All set operations must be exact.

Integer operations: The following integer operations must be provided at all
compliance levels:

+     (binary) adds the two operands.
-     (binary) subtracts the second operand from the first.
*     multiplies the two operands.
/     divides the first operand by the second.
\     divides the first operand by the second and returns the remainder.
&     logically ANDs the two operands.
|     logically ORs the two operands.
^     logically XORs the two operands.

=
<>    These operators compare the two operands (unsigned comparison) and
<     return 1 if the comparison is true, 0 otherwise.
>
<=
>=

-     (unary) negates (2's complement) the operand.
~     (unary) complements (inverts - 1's complement) the operand.

The following operators must be provided at the full and extended compliance
levels:

<-    shifts the first operand to the left the number of bits specified by
      the second operand.
->    shifts the first operand to the right the number of bits specified by
      the second operand.
@     (unary) subtracts the location counter at the beginning of the
      current statement from the following address expression.
%     (ternary, e.g.: X%Y:Z) This operator extracts bits Y through Z from X
      and returns that result right justified.

Floating point operations: floating point numbers and operations are required
only at the full and extended levels. The following operations must be
available as well:

+     adds the two operands.
-     subtracts the second operand from the first.
*     multiplies the two operands.
/     divides the first operand by the second.
-     (unary) negates the operand.

=
<>    These operators compare the two operands and
<     return 1 if the comparison is true, 0 otherwise.
>
<=
>=

String operations: strings and string operations are not required at the
subset level, but the standard recommends their presence. The following
string operations must be provided at the full and extended levels:

+     concatenates two strings.
%     (ternary, e.g., X%Y:Z) returns the substring composed of the
      characters in X starting at position Y of length Z. Generate an error
      if X doesn't contain sufficient characters.

=
<>    These operators compare the two operands and
<     return 1 if the comparison is true, 0 otherwise.
>
<=
>=

Set operations: sets and set operations are required only at the full and
extended levels. The following set operations must be provided:

+     union of two sets (logical OR of the bits).
*     intersection of two sets (logical AND of the bits).
-     set difference (set one ANDed with the NOT of the second set).

=     returns 1 if the two sets are equal, zero otherwise.
<>    returns 1 if the two sets are not equal, zero otherwise.
<     returns 1 if the first set is a proper subset of the second.
<=    returns 1 if the first set is a subset of the second.
>     returns 1 if the first set is a proper superset of the second.
>=    returns 1 if the first set is a superset of the second.

%     (ternary, e.g., X % Y:Z) extracts elements Y..Z from X and returns
      those items.
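
The integer accuracy rules and the set operators described above can be
sketched in Python as a rough model of what an assembler's expression
evaluator might do. This is an illustration only, not part of the proposal.
The function names are hypothetical, "underflow" is assumed to mean an
unsigned subtraction whose true result would be negative, and set element n
is assumed to be stored in bit n of the 32-bit value.

```python
MASK32 = 0xFFFFFFFF  # integer and set values are held in 32 bits


def add32(a, b):
    """Unsigned add; returns (result, overflow) with the result truncated to 32 bits."""
    total = a + b
    return total & MASK32, total > MASK32


def sub32(a, b):
    """Unsigned subtract; returns (result, underflow), using zero on underflow.

    Assumption: underflow means the true result would be below zero.
    """
    if b > a:
        return 0, True
    return a - b, False


def make_set(*elements):
    """Build a set value, assuming element n is stored in bit n (0..31)."""
    bits = 0
    for n in elements:
        if not 0 <= n <= 31:
            raise ValueError("set elements must be in 0..31")
        bits |= 1 << n
    return bits


def set_union(a, b):
    """The '+' operator: logical OR of the bits."""
    return a | b


def set_intersection(a, b):
    """The '*' operator: logical AND of the bits."""
    return a & b


def set_difference(a, b):
    """The '-' operator: set one ANDed with the NOT of the second set."""
    return a & ~b & MASK32


def is_subset(a, b):
    """The '<=' operator: 1 if every element of a is also in b, else 0."""
    return int(set_difference(a, b) == 0)


def is_proper_subset(a, b):
    """The '<' operator: 1 if a is a subset of b and the sets differ, else 0."""
    return int(is_subset(a, b) == 1 and a != b)
```

The overflow and underflow flags returned here stand in for the special bit
that the ".IFOVR" and ".IFUNDR" directives would test.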

     In addition to the above operators, several pre-defined functions are
also available. Note that these functions are not required at the subset
compliance level, only at the full and extended levels:

float(i)  - converts integer "i" to a floating point value.
trunc(r)  - converts real "r" to a 32-bit unsigned integer (or generates
            an error).
valid(r)  - returns "1" if r is a valid floating point value, 0 otherwise
            (for example, if r is NaN, infinity, etc.)
length(s) - returns the length of string s.
lookup(s) - returns "1" if s is a valid symbol in the symbol table.
value(s)  - returns value of symbol specified by string "s" in the symbol
            table.
type(s)   - returns type of symbol "s" in symbol table. Actual values
            returned are yet to be defined.
mode(a)   - returns the addressing mode of item "a". Used mainly in macros.
STR(s)    - returns string s with a prefixed length byte.
ZRO(s)    - returns string s with a suffixed zero byte.
DCI(s)    - returns string s with the H.O. bit of its last char inverted.
RVS(s)    - returns string s with its characters reversed.
FLP(s)    - returns string s with its H.O. bits inverted.
IN(v,s)   - returns one if value v is in set s, zero otherwise.

The following integer functions must be present at all compliance levels:

LB(i),LBYTE(i),BYTE(i)  - returns the L.O. byte of i.
HB(i),HBYTE(i)          - returns byte #1 (bits 8-15) of i.
BB(i),BBYTE(i)          - returns bank byte (bits 16-23) of i.
XB(i),XBYTE(i)          - returns H.O. byte of i.
LW(i),LWORD(i),WORD(i)  - returns L.O. word of i.
HW(i),HWORD(i)          - returns H.O. word of i.

Pack(i,j)     - returns a 16-bit value whose L.O. byte is the L.O. byte
                of i and whose H.O. byte is the L.O. byte of j.
Pack(i,j,k,l) - returns a 32-bit value consisting of (i,j,k,l) where i is
                the L.O. byte and l is the H.O. byte. Note: l is optional.
                If it isn't present, substitute zero for l.
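
The byte and word selector functions listed above amount to shifts and masks
on a 32-bit value. A minimal Python sketch follows, for illustration only.
It assumes the "H.O. byte" returned by XB is bits 24-31, and that the
four-argument Pack takes the L.O. byte of each argument, by analogy with the
two-argument form. Neither detail is spelled out in the proposal.

```python
def LB(i):
    """L.O. byte (bits 0-7)."""
    return i & 0xFF


def HB(i):
    """Byte #1 (bits 8-15)."""
    return (i >> 8) & 0xFF


def BB(i):
    """Bank byte (bits 16-23)."""
    return (i >> 16) & 0xFF


def XB(i):
    """H.O. byte, assumed here to be bits 24-31 of a 32-bit value."""
    return (i >> 24) & 0xFF


def LW(i):
    """L.O. word (bits 0-15)."""
    return i & 0xFFFF


def HW(i):
    """H.O. word (bits 16-31)."""
    return (i >> 16) & 0xFFFF


def pack(i, j, k=None, l=0):
    """Two arguments give a 16-bit value; three or four give a 32-bit value.

    Assumption: only the L.O. byte of each argument is used in both forms.
    """
    if k is None:
        return LB(i) | (LB(j) << 8)
    return LB(i) | (LB(j) << 8) | (LB(k) << 16) | (LB(l) << 24)
```

For example, pack(LB(x), HB(x)) rebuilds the same value as LW(x) for any
32-bit x.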

     The order of evaluation for an expression is strictly left to right
unless parentheses are used to modify the precedence of a sub-expression.
Since parentheses are used to specify certain indirect addressing modes, the
use of parentheses to override the strict left-to-right evaluation order
introduces some ambiguity. For example, should the following be treated
as jump indirect through location $1001 or jump directly to location $1001?

          JMP ($1000+1)

The ambiguity is resolved as follows: if the parenthesis is the first
character in the operand field, then the indirect addressing mode is assumed.
Otherwise, the parentheses are used to override the left-to-right precedence.
The example above would be treated as a jump indirect through location $1001.
If you wanted to jump directly to location $1001 in this fashion, the
statement could be modified to

          JMP 0+($1000+1)

so that the parenthesis is no longer the first character in the operand
field.

     The use of parentheses to override the left-to-right precedence is
only required at the full and extended compliance levels. It is not
required at the subset compliance level.


                             Expression Types
                             ----------------

     Expressions, in addition to having a value associated with
them, also have a specific type. The three basic types of expressions are
integer, floating point, and string expressions. Integer expressions can
be broken down into subtypes as well. A hierarchical diagram is the easiest
way to describe integer expressions:

integers ------ constants ------------ user defined (enumerated) types
   |                |
   |                +----- simple numeric constants
   |
   +-- addresses ------------ direct page addresses
           |
           +----- absolute addresses --- full 16-bit
           |                |
           |                +- relative 8-bit
           |
           +----- long addresses

     This diagram points out that there are two types of integer
expressions: constants and addresses. Further, there are two types of
constants and four types of addresses.
Before discussing operations on these different types of integer values,
their purpose should be presented.

     Until now, most 65xxx assemblers did little to differentiate between
the different types of integer values. In this proposed standard, however,
strong type checking is enforced. Whereas in previous assemblers you could
use the following code:

          label     equ  $1000
                    lda  #Label
                    sta  Label

such operations are illegal within the confines of the new standard. The
problem with this short code segment is that the symbol "label" is used as
both an integer constant (in the LDA instruction) and as an address
expression (in the STA instruction). To help prevent logical errors from
creeping into a program, the assembler doesn't allow the use of addresses
where constants are expected and vice versa. To that end, a new assembler
directive, CON, is used to declare constants while EQU is used to declare
an (absolute) address. Symbols declared by CON cannot be (directly) used
as an address. Likewise, symbols declared by EQU (and others) cannot be
used where a constant is expected (such as in an immediate operand).

     Although this type checking can be quite useful for locating bugs
within the source file, it can also be a source of major annoyance.
Sometimes (quite often, in fact) you may want to treat an address expression
as a constant or a constant expression as an address. Two functions are
used to coerce these expressions to their desired form: PTR and OFS.
PTR(expr) converts the supplied constant expression to an address expression.
OFS(expr) converts the supplied address expression to a constant expression.
The following is perfectly legal:

          Cons1     CON  $5A
          DataLoc   EQU  $1000

                    lda  #OFS(DataLoc)
                    sta  PTR(Cons1)

For more information, see the section on assembler directives. PTR and OFS
are required at all compliance levels of this proposed standard.
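
The constant/address distinction and the PTR and OFS coercions described
above can be modelled in a few lines of Python. This is a sketch for
illustration, not a description of any real assembler. The Symbol class and
the immediate() and memory() helpers are hypothetical names standing in for
the assembler's operand checks, and PTR and OFS are assumed to simply retag
a value without checking what it was before.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    kind: str    # "constant" or "address"
    value: int


def CON(value):
    """Declare a constant, as the CON directive does."""
    return Symbol("constant", value)


def EQU(value):
    """Declare an (absolute) address, as the EQU directive does."""
    return Symbol("address", value)


def PTR(sym):
    """Coerce an expression to an address expression."""
    return Symbol("address", sym.value)


def OFS(sym):
    """Coerce an expression to a constant expression."""
    return Symbol("constant", sym.value)


def immediate(sym):
    """Operand check for an immediate operand such as LDA #expr."""
    if sym.kind != "constant":
        raise TypeError("immediate operand requires a constant")
    return sym.value


def memory(sym):
    """Operand check for a memory operand such as STA expr."""
    if sym.kind != "address":
        raise TypeError("memory operand requires an address")
    return sym.value
```

With Cons1 = CON(0x5A) and DataLoc = EQU(0x1000), immediate(OFS(DataLoc))
and memory(PTR(Cons1)) are accepted, while immediate(DataLoc) raises a
TypeError, mirroring the legal and illegal examples in the text.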

     While any constant value may be used anywhere a constant is allowed,
the 65c816 microprocessor must often differentiate between the various types
of address expressions. This is particularly true when emitting code since
the length of an instruction depends on the particular address expression.
If an expression contains only constants, direct page values, absolute
values, or long values, there isn't much of a problem. The assembler uses
the specified type as the addressing mode. If the expression contains mixed
types, the resulting type is as follows:

     Expression contains:                  Result is:

     Constants only                        Constant
     Constants and Direct                  Direct
     Constants and Absolute                Absolute
     Constants, Absolute, and Long         Long

     Allowable forms:

          constant
          direct
          constant+direct
          absolute
          constant+absolute
          long
          constant+long
          absolute+long
          constant+absolute+long

     This says that if your expression contains only constants, then the
result is a constant. If it contains a mixture of constants and direct
page addresses, the result is a direct page address. Note that direct page
addresses cannot be mixed with other types of addresses. An error must be
reported in this situation (although you could get around it with an
expression of the form "abs+OFS(direct)"). Likewise, adding a constant to
an absolute address produces an absolute address. Adding an absolute and
a long address produces a long address, etc.

     Sometimes, you need to force an expression to be a certain type.
For example, the instruction "LDA $200" normally assembles to a load
absolute from location $200 in the current data bank. If you need to force
this to location $200 in bank zero, regardless of the content of the DBR,
the address expression must be coerced to a long address. Coercion of this
type is accomplished with the ":D", ":A", ":L", and ":S" expression suffixes.
To force "LDA $200" to be assembled using the long address mode, the
instruction is modified to be "LDA $200:L".
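
The mixed-type rule and the coercion suffixes described above can be
sketched in Python as follows. This is an illustrative model only. The names
result_type and coerce are hypothetical, the ":S" branch suffix is left out,
and the sketch assumes a suffix simply overrides whatever type the
expression would otherwise have.

```python
# Ordering used to pick the widest type present in an expression.
RANK = {"constant": 0, "direct": 1, "absolute": 2, "long": 3}

# Coercion suffixes ":D", ":A", and ":L" mapped to the type they force.
SUFFIX = {"D": "direct", "A": "absolute", "L": "long"}


def result_type(kinds):
    """Return the type of an expression built from operands of the given kinds."""
    kinds = set(kinds)
    if not kinds:
        raise ValueError("expression has no operands")
    unknown = [k for k in kinds if k not in RANK]
    if unknown:
        raise ValueError("unknown operand type: " + ", ".join(sorted(unknown)))
    # Direct page addresses cannot be mixed with other address types.
    if "direct" in kinds and kinds & {"absolute", "long"}:
        raise TypeError("direct page addresses cannot be mixed with "
                        "absolute or long addresses")
    return max(kinds, key=RANK.__getitem__)


def coerce(kind, suffix=None):
    """Apply an optional coercion suffix ("D", "A", or "L") to a type."""
    if suffix is None:
        return kind
    return SUFFIX[suffix]
```

For instance, coerce(result_type(["absolute"]), "L") yields "long", which
corresponds to writing "LDA $200:L" instead of "LDA $200".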

     The coercion suffix must always follow the full address expression.
The ":S" (for short branches) suffix is never required, since a short branch
(for BRA and BSR) is always assumed, but it is included for completeness.
For BRA and BSR, the ":L" suffix is used to imply a long branch (+/- 32K)
rather than the long addressing mode.

     Caveats: If ":D" or ":A" is used to coerce a large address expression
to direct or absolute, the high order byte(s) of the expression are truncated
and ignored. The assembler must assume that when a programmer uses these
constructs he knows exactly what he's doing. Therefore, "LDA $1001:D" will
happily assemble this instruction into a "LDA $01" instruction despite the
actual value of the address expression.


                       Addressing Mode Specification
                       -----------------------------

     65c816 addressing modes are specified by certain symbols in the
operand field. A quick rundown follows:

Addressing mode                        Format(s)      Example(s)
---------------                        ---------      ----------

Immediate                              #              LDA #0
                                       =              CMP =LastValue

Direct Page                                           LDA DPG
                                       :D             LDA ANY:D

Absolute                                              LDA ABS
                                       :A             LDA ANY:A

Long                                                  LDA LONG
                                       :L             LDA ANY:L

Accumulator                            {no operand}   ASL
                                                      INC

Implied                                {no operand}   CLC
                                                      SED

Direct, Indirect, Indexed by Y         (),Y           LDA (DPG),Y
                                       ().Y           LDA (ANY:D).Y

Direct, Indirect, Indexed by Y, Long   [],Y           LDA [DPG],Y
                                       [].Y           LDA [DPG].Y

Direct, Indexed by X, Indirect         (,X)           LDA (DPG,X)
                                       (.X)           LDA (ANY:D.X)

Direct, Indexed by X                   ,X             LDA DPG,X
                                       .X             LDA DPG.X

Direct, Indexed by Y                   ,Y             LDX DPG,Y
                                       .Y             LDX DPG.Y

Absolute, Indexed by X                 ,X             LDA ABS,X
                                       .X             LDA ANY:A.X

Long, Indexed by X                     ,X             LDA ANY:L,X
                                       .X             LDA LONG.X

Absolute, Indexed by Y                 ,Y             LDA ANY:A,Y
                                       .Y             LDA ABS.Y

Program Counter Relative (branches)                   BRA ABS
                                       @              BRA @ABS

PC Relative (PSH)                      @              PSH @ABS

Absolute, Indirect                     ()             JMP (ABS)

Absolute, Indexed, Indirect            (,X)           JMP (ABS,X)
                                       (.X)           JMP (ABS.X)

Direct, Indirect                       ()             LDA (DPG)
                                                      STA (ANY:D)

Stack Relative                         ,S             LDA 2,S
                                       .S             LDA 2.S

Stack Relative, Indirect, Indexed      (,S),Y         LDA (2,S),Y

Block Move                             ,              MVN LONG,LONG
                                                      MVP LONG,LONG

DPG   - Any direct page expression or symbol.
ABS   - Any absolute expression or symbol.
Long  - Any long expression or symbol.
expr8 - Any expression evaluating to a value less than 256.

Note: the only real difference between the existing standard and the
proposed standard is that the period (".") can be used to form an indexed
address expression. This is compatible (in practice, as well as philosophy)
with the record structure mechanism supported by this proposed standard.
This syntax for the various addressing modes is required at all compliance
levels.

Suggestion: ():L, ():L,Y, and (], [],Y, and [ .EQU <16-bit value>