Assembly Language Style Guidelines - Statement Organization



5.0 Statement Organization

In an assembly language program, the author must work extra hard to make a program readable. By following a large number of rules, you can produce a program that is readable. However, by breaking a single rule, no matter how many other rules you've followed, you can render a program unreadable. Nowhere is this more true than in how you organize the statements within your program. Consider the following example taken from "The Art of Assembly Language Programming":



______________________________________________________
                mov     ax, 0
                mov     bx, ax
                add     ax, dx
                mov     cx, ax
 ______________________________________________________
mov             ax,                     0
          mov bx,                      ax
    add               ax, dx
                        mov             cx, ax
______________________________________________________

While this is an extreme example, do note that it only takes a few mistakes to have a large impact on the readability of a program. Consider (a short section from) an example presented earlier:



GetFileRecords:
    mov dx, OFFSET DTA          ;Set up DTA
    mov ah, 1Ah
    int 21h
    mov dx, FILESPEC            ;Get first file name
    mov cl, 37h
    mov ah, 4Eh
    int 21h
    jnc FileFound               ;No files.  Try a different filespec.
    mov si, OFFSET NoFilesMsg
    call Error
    jmp NewFilespec
FileFound:
    mov di, OFFSET fileRecords  ;DI -> storage for file names
    mov bx, OFFSET files        ;BX -> array of files
    sub bx, 2

Improved version:



GetFileRecords: mov     dx, offset DTA          ;Set up DTA
                DOS     SetDTA

                mov     dx, FileSpec
                mov     cl, 37h
                DOS     FindFirstFile
                jc      FileNotFound

                mov     di, offset fileRecords  ;DI -> storage for file names
                mov     bx, offset files        ;BX -> array of files
                sub     bx, 2                   ;Special case for 1st iteration

An assembly language statement consists of four possible fields: a label field, a mnemonic field, an operand field, and a comment field. The mnemonic and comment fields are always optional. The label field is generally optional, although certain instructions (mnemonics) do not allow labels while others require labels. The operand field's presence is tied to the mnemonic field. For most instructions the actual mnemonic determines whether an operand field must be present.
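The four fields can be made concrete with a short Python sketch that splits one source line into its label, mnemonic, operand, and comment parts. This is only an illustration under simplifying assumptions, not how MASM parses a line: the hypothetical helper split_statement treats a leading token as a label only if it ends in a colon, and it treats the first semicolon as the start of the comment, so a semicolon inside a string literal or a label without a colon (as on a data declaration) would be misread.

```python
def split_statement(line):
    """Split one source line into (label, mnemonic, operands, comment).

    Simplifying assumptions (not MASM's real grammar): a label ends
    with ':' and the first ';' on the line starts the comment.
    """
    code, semicolon, comment = line.partition(";")
    comment = semicolon + comment.rstrip() if semicolon else ""

    # First whitespace-separated token, then the rest of the code part.
    tokens = code.split(None, 1)

    label = ""
    if tokens and tokens[0].endswith(":"):
        label = tokens[0]
        tokens = tokens[1].split(None, 1) if len(tokens) > 1 else []

    mnemonic = tokens[0] if tokens else ""
    operands = tokens[1].strip() if len(tokens) > 1 else ""
    return label, mnemonic, operands, comment


if __name__ == "__main__":
    print(split_statement("GetFileRecords: mov     dx, offset DTA   ;Set up DTA"))
    print(split_statement("                int     21h"))
```

Every field can come back empty, which matches the text: a line may hold only a label, only a comment, or nothing at all.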

MASM is a free-form assembler insofar as it does not require these fields to appear in any particular column[12]. However, the freedom to arrange these fields in any manner is one of the primary contributors to hard-to-read assembly language programs. Although MASM lets you enter your programs in free-form, there is absolutely no reason you cannot adopt a fixed field format, always starting each field in the same column. Doing so generally makes an assembly language program much easier to read. Here are the rules you should use:

Rule:: If an identifier is present in the label field, always start that identifier in column one of the source line.
Rule:: All mnemonics should start in the same column. Generally, this should be column 17 (the second tab stop) or some other convenient position.
Rule:: All operands should start in the same column. Generally, this should be column 25 (the third tab stop) or some other convenient position.
Exception:: If a mnemonic (typically a macro) is longer than seven characters and requires an operand, you have no choice but to start the operand field beyond column 25 (this is an exception assuming you've chosen columns 17 and 25 for your mnemonic and operand fields, respectively).
Guideline:: Try to always start the comment fields on adjacent source lines in the same column (note that it is impractical to always start the comment field in the same column throughout a program).
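The column rules above amount to simple arithmetic, which the following Python sketch spells out. The helper names pad_to_column and format_line are hypothetical, and the comment column (49) is an assumption: it is simply what the "Improved version" listing earlier happens to use, not a value the rules prescribe. The sketch handles the long-mnemonic exception by falling back to a single space whenever a field has already run past the next column.

```python
MNEMONIC_COL = 17   # second tab stop, per the rule above
OPERAND_COL = 25    # third tab stop, per the rule above
COMMENT_COL = 49    # assumption: taken from the earlier example listing


def pad_to_column(text, column):
    """Pad text so that the next field starts at 1-based `column`.

    If text already reaches that column (for example, a mnemonic longer
    than seven characters), fall back to a single separating space.
    """
    width = column - 1
    if len(text) >= width:
        return text + " "
    return text.ljust(width)


def format_line(label="", mnemonic="", operands="", comment=""):
    """Lay out one statement in fixed columns; the label starts in column one."""
    line = label
    if mnemonic:
        line = pad_to_column(line, MNEMONIC_COL) + mnemonic
    if operands:
        line = pad_to_column(line, OPERAND_COL) + operands
    if comment:
        # A comment on a line by itself simply starts in column one.
        line = pad_to_column(line, COMMENT_COL) + comment if line else comment
    return line


if __name__ == "__main__":
    print(format_line("GetFileRecords:", "mov", "dx, offset DTA", ";Set up DTA"))
    print(format_line(mnemonic="jc", operands="FileNotFound"))
    print(format_line(mnemonic="PrintString", operands="msg"))
```

The last call uses a made-up eleven-character macro name to show the exception: its operand cannot start in column 25, so it starts one space after the mnemonic instead.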

Most people learn a high level language prior to learning assembly language. They have been firmly taught that readable (HLL) programs have their control structures properly indented to show the structure of the program. Indentation works great when you have a block structured language. Assembly language, however, is the original unstructured language and indentation rules for a structured programming language simply do not apply. While it is important to be able to denote that a certain sequence of instructions is special (e.g., belongs to the "then" portion of an if..then..else..endif statement), indentation is not the proper way to achieve this in an assembly language program.

If you need to set off a sequence of statements from surrounding code, the best thing you can do is use blank lines in your source code. For a small amount of detachment, to separate one computation from another for example, a single blank line is sufficient. To really show that one section of code is special, use two, three, or even four blank lines to separate one block of statements from the surrounding code. To separate two totally unrelated sections of code, you might use several blank lines and a row of dashes or asterisks to separate the statements. E.g.,



                mov     dx, FileSpec
                mov     cl, 37h
                DOS     FindFirstFile
                jc      FileNotFound

;     *********************************************

                mov     di, offset fileRecords  ;DI -> storage for file names
                mov     bx, offset files        ;BX -> array of files
                sub     bx, 2                   ;Special case for 1st iteration

Guideline:: Use blank lines to separate special blocks of code from the surrounding code. Use an aesthetic looking row of asterisks or dashes if you need a stronger separation between two blocks of code (do not overdo this, however).

If two sequences of assembly language statements correspond to roughly two HLL statements, it's generally a good idea to put a blank line between the two sequences. This helps clarify the two segments of code in the reader's mind. Of course, it is easy to get carried away and insert too much white space in a program, so use some common sense here.

Guideline:: If two sequences of code in assembly language correspond to two adjacent statements in an HLL, then use a blank line to separate those two assembly sequences (assuming the sequences are quite short).

A common problem in any language (not just assembly language) is a line containing a comment that is adjacent to one or two lines containing code. Such a program is very difficult to read because it is hard to determine where the code ends and the comment begins (or vice-versa). This is especially true when the comments contain sample code. It is often quite difficult to determine if what you're looking at is code or comments; hence the following enforced rule:

Enforced Rule:: Always put at least one blank line between code and comments (assuming, of course, the comment is sitting on a line by itself; that is, it is not an endline comment[13]).
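Because this rule is mechanical, it can be checked automatically. The Python sketch below is one possible way to do so, not an existing tool: the hypothetical function find_crowded_comments reports every line where a full-line comment and a line of code touch with no blank line between them. It assumes a comment line is any line whose first non-blank character is a semicolon, so endline comments count as code, as the rule intends.

```python
def is_comment(line):
    """True for a line holding only a comment (first non-blank char is ';')."""
    return line.strip().startswith(";")


def is_code(line):
    """True for a non-blank line that is not a full-line comment."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith(";")


def find_crowded_comments(lines):
    """Return 1-based numbers of lines that break the blank-line rule.

    A line is reported when it directly follows a line of the other
    kind: code right after a comment, or a comment right after code.
    """
    problems = []
    for index in range(1, len(lines)):
        previous, current = lines[index - 1], lines[index]
        if (is_comment(previous) and is_code(current)) or \
           (is_code(previous) and is_comment(current)):
            problems.append(index + 1)
    return problems


if __name__ == "__main__":
    source = [
        "; Set up the DTA",
        "                mov     dx, offset DTA",
        "",
        "; Find the first file",
        "",
        "                int     21h             ;endline comment",
    ]
    print(find_crowded_comments(source))
```

In the sample, only the second line is reported: it is code sitting directly under a comment. The later comment is set off by blank lines on both sides and passes.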
