CHAPTER 12   COMPATIBILITY WITH OTHER ASSEMBLERS


I gave heavy priority to compatibility when I designed A86; a
priority just a shade behind the higher priorities of
reliability, speed, convenience, and power.  For those of you who
feel that "close, but incompatible" is like saying "a little bit
pregnant", I'm sorry to report that A86 will not assemble all
Intel/IBM/MASM programs, unmodified.  But I do think that a
majority of programs can, with a little massaging, be made to
assemble under A86.  Furthermore, the massaging can be done in
such a way as to make the programs still acceptable to that old,
behemoth assembler.

I have been adding compatibility features with almost every new
version of A86.  Among the features added since A86 was first
released are: more general forward references, double quotes for
strings, "=" as a synonym for EQU, the RADIX directive, the
COMMENT directive, and the COMPAT.8 file containing macros for a
number of segmentation-model directives.  If you tried feeding an
old source file to a previous A86 and were dismayed by the number
of error messages you got, try again: things might be more
manageable now.


Conversion of MASM programs to A86

Following is a list of the things you should watch out for when
converting from MASM to A86:

1. You need to determine whether the program was coded as a COM
   program or as an EXE program.  All COM programs coded for MASM
   will contain an ORG 100H directive somewhere before the start
   of the code.  EXE programs will contain no such ORG, and will
   often contain statements that load named segments into
   registers.  If the program was coded as EXE, you must either
   assemble it (using the +O option) to an OBJ file to be fed to
   LINK, or you must eliminate the instructions that load segment
   registers-- in a COM program they often aren't necessary
   anyway, since COM programs are started with all segment
   registers already pointing to the same value.

   A good general rule is: when it doubt, try assembling to an
   OBJ file.

2. You need to determine whether the program is executing with
   all segment registers pointing to the same value.  Simple COM
   programs that fit into 64K will typically fall into this
   category.  Most EXE programs, programs that use huge amounts
   of memory, and programs (such as memory-resident programs)
   that take over interrupts typically have different values in
   segment registers.
                                                             12-2

   If there are different values in the segment registers, then
   there may be instructions in the program for which the old
   assembler generates segment-override prefixes "behind your
   back".  You will need to find such references, and to generate
   explicit overrides for them.  If there are data tables within
   the program itself, a CS-override is needed.    If there are
   data structures in the stack segment not accessed via a
   BP-index, an SS-override is needed. If ES points to its own
   segment, then an ES-override is needed for accesses (other
   than STOS and MOVS destinations) to that segment.  In the
   interrupt handlers to memory-resident programs, the "normal"
   handler is often invoked via an indirect CALL or JMP
   instruction that fetches the doubleword address of the normal
   handler from memory, where it was stored by the initialization
   code.  That CALL or JMP often requires a CS-override-- watch
   out!

   If you want to remain compatible with the old assembler, then
   code the overrides by placing the segment-register name, with
   a colon, before the memory-access operand in the instruction.
   If you do not need further compatibility, you can place the
   segment register name before the instruction mnemonic.  For
   example:

    MOV AL,CS:TABLE[SI]   ; if you want compatibility do this
    CS MOV AL,TABLE[SI]   ; if not you can do it this way

3. A86 is distributed with a file called COMPAT.8, containing a
   set of macros that implement some of the directives seen in
   recent versions of MASM.  If you would like A86 to recognize
   these directives, make sure copies of COMPAT.8 and A86.LIB are
   in either the current directory, or a directory named by the
   A86LIB environment variable.  When you do so, the file
   COMPAT.8 will automatically be included in your assembly
   whenever you use one of the MASM directives.  See Chapter 13
   for more details about A86LIB.

4. You should use a couple of A86's switches to maximize
   compatibility with MASM.  I've already mentioned the +O switch
   to produce .OBJ files.  You should also assemble with the +D
   switch, which disables A86's unique parsing of constants with
   leading zeroes as hexidecimal.  The RADIX command in your
   program will also do this. And you should use the +G15 switch,
   that disables a few other A86 features that might have reduced
   compatibility. See Chapter 3 for a detailed explanation of
   these switches.

5. A86 is a bit more restrictive with respect to forward
   references than MASM, but not as much as it used to be. You'll
   probably need to resolve just a few ambiguous references by
   appending " B" or " W" to the forward reference name.

6. A86's macro-definition and conditional-assembly language is
   different from MASM's.  Most macros can be translated by
   replacing the named parameters of the old macros with the
   dedicated names #n of the A86 macro language; and by replacing
   ENDM with #EM.  For example, the following MASM macro:
                                                             12-3

        MOVM MACRO DEST,SRC
        MOV AL,DEST
        MOV SRC,AL
        ENDM

   would be translated by eliminating the DEST,SRC declarations
   on the first line, replacing DEST with #1 and SRC with #2 in
   the body of the definiation, and replacing ENDM by #EM -- the
   result is the MOVM macro that I presented at the beginning of
   Chapter 11.

   Other constructs have straightforward translations, as
   illustrated by the following examples.  Note that examples
   involving macro parameters have double hash signs, since the
   condition will be tested when the macro is expanded, not when
   it is defined.

   MASM construct              Equivalent A86 construct

   IFE expr                     #IF ! expr
   IFB <PARM3>                  ##IF !#S3
   IFNB <PARM4>                 ##IF #S4
   IFIDN <PARM1>,<CX>           ##IF "#1" EQ "CX"
   IFDIF <PARM2>,<SI>           ##IF "#2" NE "SI"
   IFDEF symbol                 #IF DEF symbol
   IFNDEF symbol                #IF ! DEF symbol
   .ERR                         (any undefined symbol)
   .ERRcond                     TRUE EQU 0FFFF
                                TRUE EQU cond
   EXITM                        #EX
   IRP ... ENDM                 #RX1L ... #ER
   REPT 100 ...ENDM             #RX1(100) ... #ER
   IRPC ... ENDM                #CX ... #EC

   The last three constructs, IRP, REPT, and IRPC, usually occur
   within macros; but in MASM they don't have to.  The A86
   equivalents are valid only within macros-- if they occur in
   the MASM program outside of a macro, you duplicate them by
   defining an enclosing macro on the spot, and calling that
   macro once, right after it is defined.
                                                             12-4

7. Later versions of MASM have expanded the syntax of the PROC
   directive, to support interfacing to languages such as Pascal
   and C.  The USES clause lists registers to be pushed at the
   beginning of the procedure, and popped just before the RET
   instruction for that procedure.  In A86, you need to
   explicitly provide such pushes and pops.  MASM also allows the
   declaration of parameters as local variables within the
   procedure.  Such parameters are passed by the calling program
   on the stack, and are addressed using the BP register.  The
   precise offsets from the BP register depend on the particular
   language of the calling program, and on whether the calls are
   NEAR or FAR.  In A86, you can either define the parameter
   names as offsets from BP (the based STRUC feature of A86
   assists in this), or replace usages of the local names with
   explicit BP-index references.  If you define parameter names,
   keep in mind that A86 does not have local scoping of symbols:
   if names are duplicated in different PROCs, you will need to
   change the names to eliminate the duplications.

8. Later versions of MASM have block-structure mechanisms to
   simulate the look and feel of a high-level language.  A86 does
   not yet support these mechanisms: you need to replace them
   with the equivalent conditional-jump and LOOP instructions.

9. A86 supports the STRUC directive, with named structure
   elements, just like MASM, with one exception: A86 does not
   save initial values declared in the STRUC definition, and A86
   does not allow assembly of instances of structure elements.

   For example, the MASM construct

   PAYREC STRUC
      PNAME  DB 'no name given'
      PKEY   DW ?
   ENDS

   PAYREC 3 DUP (?)
   PAYREC <'Eric',1811>

   causes A86 to accept the STRUC definition, and to define the
   structure elements PNAME and PKEY correctly; but the PAYREC
   initializations need to be recoded.  If it isn't vital to
   initialize the memory with the specific definition values, you
   could recode the first PAYREC as:

   DB  ((TYPE PAYREC) * 3) DUP ?

   If you must initialize values, you do so line by line:

   DB 'Eric         '
   DW ?

   If there are many such initializations, you could define a
   macro INIT_PAYREC containing the DB and DW lines.
                                                             12-5

Compatibility symbols recognized by A86

A86 has been programmed to ignore a variety of lines that have
meaning to Intel/IBM/MASM assemblers; but which do nothing for
A86.  These include lines beginning with a percent sign, lines
beginning with ASSUME, and lines beginning with any unrecognized
symbol that begins with a period.  If you are porting your
program to A86, and you wish to retain the option of returning to
the other assembler, you may leave those lines in your program.
If you decide to stay with A86, you can remove those lines at
your leisure.

In addition, there is a class of symbols now recognized by A86 in
its .OBJ mode, but still ignored in .COM mode.  This includes
NAME, END, and PUBLIC.

Named SEGMENT and ENDS directives written for other assemblers
are, of course, recognized by A86's .OBJ mode.  In non-OBJ mode,
A86 treats these as CODE SEGMENT directives.  A special exception
to this is the directive

    segname SEGMENT AT atvalue

which is treated by A86 as if it were the following sequence:

    segname EQU atvalue
    STRUC

This will accomplish what is usually intended when SEGMENT AT is
used in a program intended to be a COM file.



Conversion of A86 Programs to Intel/IBM/MASM

I consider this section a bit of a blasphemy, since it's a little
silly to port programs from a superior assembler, to run on an
inferior one.  However, I myself have been motivated to do so
upon occasion, when programming for a client not familiar with
A86; or whose computer doesn't run A86, and who therefore wants
the final version to assemble on Intel's assembler.  Since my
assembler/debugger environment is so vastly superior to any other
environment, I develop the program using my assembler, and port
it to the client's environment at the end.

The main key to success in following the above scenarios is to
exercise supreme will power, and not use any of the wonderful
language features that exist on A86, but not on MASM. This is
often not easy; and I have devised some methods for porting my
features to other assemblers:

1. I hate giving long sequences of PUSHes and POPs on separate
   lines.  If the program is to be ported to a lesser assembler,
   then I put the following lines into a file that only A86 will
   see:
                                                             12-6

      PUSH2 EQU PUSH
      PUSH3 EQU PUSH
      POP2 EQU POP
      POP3 EQU POP

   I define macros PUSH2, PUSH3, POP2, POP3 for the lesser
   assembler, that PUSH or POP the appropriate number of
   operands.  Then, everywhere in the program where I would
   ordinarily use A86's multiple PUSH/POP feature, I use one or
   more of the PUSHn/POPn mnemonics instead.

2. I refrain from using the feature of A86 whereby constants with
   a leading zero are default-hexadecimal.  All my hex constants
   end with H.

3. I will usually go ahead and use my local labels L0 through L9;
   then at the last minute convert them to a long set of labels
   in sequence: Z100, Z101, Z102, etc.  I take care to remove all
   the ">" forward reference specifiers when I make the
   conversion.  The "Z" is used to isolate the local labels at
   the end of the lesser assembler's symbol table listing. This
   improves the quality of the final program so much that it is
   worth the extra effort needed to convert L0--L9's to Z100--
   Zxxx's.

