An Assembly Language Tutorial for the Model 100
________________________________________________________________________________
By Mike Berro [75765,473]
(C) 1984 BCS Software
0. Preface.
I have an ulterior motive for writing this. I want to sharpen my technical
writing skills. Any comments or criticisms about contents or style would be
appreciated. Also, I will be happy to try to answer any questions you have
about assemblymlanguage.
________________________________________________________________________________
1. Switching Gears from Basic to Assembly Language.
After programming in Basic, using assembly language may at first seem like
switching from the Model 100 to the HP-41 calculator. Gone are string
variables and the print statements. You don't even have a multiply or a
divide command like the HP-41. You have to multiply or divide by using
successive additions or subtractions. However, the benefits over Basic
include high speed and small size.
My game Planet Protector (PLANET.SRC on CompuServe's Model 100 SIG) is only
2.5K long, and the spaceships move reasonably smoothly across the screen.
Compare tht to my earlier Basic game Starfighter (STARF.100), which is over
10K long and slower than anything. And doing what is essentially bit-mapped
graphics to produce your own shapes would slow Basic down by at least
another factor of ten.
Anyway, since you're still reading this you probably already know the
advantages of assembly language, so we'll begin.
2. How to Use This Tutorial.
It would be to your benefit to have at least a reference book on 8085
assembly language, if not a textbook. I am not trying to write a book here,
so there may be some information left out. This tutorial is designed as an
adjunct to a text book, and not a substitute. It will also be useful for
you to have an assembler. See HELP.ASM in XA4 for help in choosing one.
The sample programs I use will be formatted for Custom Software's assembler,
but only because that happens to be the one I have. You should have no
difficulty translationg to your own assembler.
Also it is very important to backup all the files on your M100 before you
assemble any program. Basic (almost) always makes sure that it doesn't
clobber other files (unless you open a file for OUTPUT instead of INPUT.)
If you assign a value to a variable, the value does get stored in memory,
but Basic will never store it in a protected area of memory.
Machine language programs have no compunction about storing values anywhere.
If you make a typo so that the program stores values where the file
directory should be you're in big trouble. And it happens to the best of
'em. Maybe I'm being paranoid, but I'll say it again:
------ BACKUP ALL YOUR FILES BEFORE YOU BEGIN ASSEMBLY ------
Our goal for this series is to be able to write a program that allows you
to design any shape, and then move it across the screen. The shape table
will be stored in a text file created by the user. The program will be
called SHAPE. It will take several installments of TUTOR before we are
finished.
3. What an Assembler Does.
An assembler converts your source file, which is a text file, into a machine
language file. You create the source file using the built-in text editor.
The source file contains the source code that the assembler converts into a
series of numbers that represent the instructions you want carried out
(machine language.)
The numbers are stored in memory, and then the memory locations used are
identified by assigning a filename to them, with the suffix ".CO". The .CO
file is called a machine language file. It is possible to program directly
in machine language by pokeing the numbers directly into memory, but that
gets tedious quickly. Machine language is all numbers, no english commands
at all. Assembly language consists of "memnonics" that represent the machine
language numbers. You only have to remember the memnonics, which by
definition are supposed to be easy to remember. Then the assembler converts
the memnonics into machine language for you.
The next section deals with binary and hex numbers, and with some of the
inner workings of the 80C85.
4. The 80C85 and Number Systems.
At this point you need to know the binary and hexadecimal number systems.
In this section I will try to explain why you need these other number
systems. If you are not familiar with them, most books about assembly
language start with them. If your book does not, it probably is not a
beginner's book.
The CPU in the Model 100 is an 80C85 integrated circuit. (The C stands for
CMOS, which is a battery conserving type of electrical device.)
As such it only responds to electrical signals, no matter how much you yell
at it. It is an 8-bit CPU, which means it can respond to eight signals
simultaneously. Each signal is a bit. Now the CPU is also a digital device,
so each signal can only be a one or a zero. Music is an analogue signal;
you can send Beethoven's fifth symphony through one wire (although it sounds
better through two.) A digital signal can only be on or off, one or zero.
Given an 8-bit digital CPU, there is a limit to the number of different
messages or commands that we can give it at any one time. In fact there are
only 256. 00000000 is one message, 00000001 is another, 00000010 is the
"next" one, up to 11111111. These are all binary numbers since each digit
(bit) can be one of two values. We normally use the decimal numbers from 0-9
(ten values). Assemblers very often use hexadecimal numbers, which use 16
values.
All three number systems are used in assembly language programming. In fact,
the only reason to use decimal numbers is because we are familiar with them.
The CPU only recognizes ones and zeroes, eight at a time. There are 256
combinations possible. Hexadecimal is used because of convenience.
Look at Table 1. Notice that the left hexadecimal digit precisely
Dec Binary Hex Oct
000 00000000 00 000
001 00000001 01 001
002 00000010 02 002
010 00001010 0A 012
011 00001011 0B 013
012 00001100 0C 014
013 00001101 0D 015
014 00001110 0E 016
015 00001111 0F 017
016 00010000 10 020
017 00010001 11 021
TABLE 1. Number Systems
corresponds the left four digits of the binary number, and the right
hexa-decimal digit corresponds to the right four digits of the binary number.
Binary numbers would be appropriate to use, but they take up more memory in
your source file (8 digits). Hexadecimal numbers are the next logical choice.
Octal numbers are sometimes recommended for the 8085. The first two digits
of the binary number correspond to the first octal digit, the next three
digits correspond to the second octal digit, and the last three binary
digits correspond to the last octal digit.
I do not recommend that you learn octal arithmetic, although it couldn't
hurt. Octal arithmetic is very useful for converting the source code into
the actual machine language, but since that is what our assembler is
supposed to be doing for us, there is no real need to master it.
5. The 80C85 Architecture.
This section describes what's available for the programmer to use in the
8085. It is written from the programmer's point of view, and not the
engineer's. A more complete analysis of the 8085 will be left for later.
One of the most important functions of the CPU is to manipulate data.
In Basic we store the data in variables, both numeric and string. We don't
know exactly where in memory the data is stored, but we know that Basic
knows, so we don't worry about it. In assembly language we have to worry
about it.
We can store data directly inside the CPU. The 8085 has seven 8-bit
registers. Each register can store a value from $00 to $FF (0 to 255: a
dollar sign prior to a number indicates a hexadecimal value.) The registers
are designated by single letters: A, B, C, D, E, H and L. They are not all
created equal, and are used for different purposes.
The A register is also called the accumulator. All of the 8-bit arithmetic
is done on the value in the accumulator. Addition, subtraction, ANDing,
EXCLUSIVE ORing, ORing, rotations and comparisons all involve the A register
(don't worry if you don't know what they all mean. You can worry if you
don't know what the first two mean though.) For example, the instruction
ADI $10 adds the value $10 to what is already in the A register. The
instruction ADD B adds the value of the B register to the value of the A
register, and then stores the value back into the A register. The B register
is unchanged. Only the A register is affected by any of these operations.
Now that we know how to do arithmetic on the A register, how do we get a
value into it? There are five ways to get a value into the A register. The
first is to load it with a preset value, called an immediate value, like
$07, or $AD. The second is to transfer it from a different register. The
third is to get the value from somewhere in memory. The fourth is to get it
from a peripheral device, and the final method is to get the value from the
serial data port.
The last two methods involve advanced topics. The serial data port is used
solely for the cassette input. A peripheral device is something that
communicates with the 8085, something that facilitates input and output
(I/O). The keyboard, LCD display, modem, speaker, light-pen and printer
(with their associated circuitry) are all I/O devices.
Luckily we don't have to learn the I/O instructions, because there are
subroutines already built-in that read data from these devices, or send
data to them. Some of these subroutines are listed starting on page 79
of the Model 100 Technical Manual. We will return to them in section 9.
6. The MoVe Immediate (MVI) Instructions.
The MVI instructions move an immediate value into any specified register.
The following examples should explain how it works better than mere words:
MVI A,$08 ;A will have the value 8 after this instruction
MVI C,$ED ;C will have the value $ED after this instruction
MVI H,$11 ;H will have the value $11 after this instruction
Technically speaking, MVI is an instruction by itself, part of the 8085
"instruction set". The grouping following it is called the operand. For any
move instruction, including the move immediate, the operand always consists
of a source and a destination. The destination always comes first, then a
comma, and then the source. The source for the MVI instruction is always an
8-bit number, a value between $00 and $FF.
The destination is always either a register or a memory location. One of the
nice features of the 8085 is the ability to treat a memory location just
like it was a register. You have to set up certain conditions before you
can do so, and that will be covered in section 8.
The MVI instruction takes up two bytes of memory. A byte is an 8-bit value,
so each memory location stores one byte. The first byte is the MVI r, part,
where r stands for any register. The second byte is the immediate value, the
source.
7. The MOVe (MOV) Instruction.
The MOV instruction moves data from one register to another. The instruction
format is MOV dr,sr, where dr is the destination register, and sr is the
source register. The content of the source register is the same after this
instruction is executed. The destination register will contain the same
value as the source register. The MOV instruction takes only one byte of
memory.
8. Moving Data to and from Memory.
So far we have moved immediate numbers into registers, and values from
register to register. Now we will see how to retrieve data from memory.
In this section we will also see what a source program looks like. To move
data into a memory location, we must first decide what address to use.
Each of the 65,536 memory locations has a unique address, a number from
$0000 to $FFFF. It takes a 16-bit number count that high, and it can be
done on the 8085. We have already seen that the 8085 has seven 8-bit
registers. Six of the registers can be "doubled-up" to give you three 16-bit
registers. The B and the C register give you the BC register, the D and E
registers givve you the DE register, and the H and L registers give you,
surprise!, the HL register.
The 16-bit registers are not separate from the 8-bit registers. If you store
an 8-bit number in L, and then a 16-bit number in HL, the 8-bit number is
lost. To add to the confusion, most assemblers allow you to use only the
first letter of a 16-bit register. Then the only way to tell the difference
between an 8 or 16-bit register is from context, i.e. from the instruction
that precedes it. Here we will use both letters for a 16-bit register.
The HL register is the only register that can be used to specify a memory
address. It is called a pointer register because it points to a memory
location. The H register contains the high byte of the address, and the L
register the low byte (that's why it's not called the FG register.)
High and low bytes may be unfamiliar to you. Consider the decimal number 47.
From convention we know that 4 is the number of "tens", and the 7 is the
number of "units", high and then low. For the hex number $47, 4 is the number
of "sixteens", and 7 is the number of "units". In the same way for the number
$4768, $47 is the number of "two-hundred-fifty-sixes", and $68 is the number
of "units". We split it this way because the first two digits can fit in the
high byte (register) and the second two can fit in the low register. However,
as we shall see later, sometimes the low byte comes first, with the high byte
following, but the HL register is always high-low.
The instruction that loads an address into the HL register is the LXI
instruction. The LXI instruction loads a 16-bit immediate value into the
register pair designated in the first half of the operand, the destination.
The second half of the operand is the 16-bit immediate value. The format
is LXI rp,$nnnn, where rp is any of the three register pairs, and $nnnn
is a 2 byte (up to four hex digit) number.
Once HL points to a memory address, we can treat that memory location just
like it was a register. We can use:
MVI M,$AB ;memory location pointed to by HL will equal $AB
MOV A,M ;A register will have value of memory location
The LXI instruction requires three bytes of memory. The first specifies the
instruction and the destination register (all in one byte), and the last two
contain the immediate 16-bit value. It is interesting (and confusing) to note
that addresses in memory are always stored low byte first.
The instruction LXI HL,$1234 assembles into the three consecutive bytes $21,
$34, $12. $21 is the LXI H, instruction. The $34 and $12 is the address with
the low byte first. You don't have to worry about the order when you are
programming because the assembler puts all the addresses in the proper order.
You do have to worry about it when debugging your program. Sometimes you have
to examine the machine language code to find the problem, and this high-low
business can get confusing. To repeat, addresses in the registers are always
stored high-low. Addresses in memory are stored low-high.
9. Subroutines and a Simple Program.
Let us now jump into some actual programming. Reading about instructions is
all very well, but the only way to learn is to use them. Let us suppose that
we want to print the seconds from the real time clock onto the LCD screen.
We will have our program print the word "Time = ", and then print the
seconds. We don't know how the screen works, but browsing through the
technical manual, we find a ROM subroutine that seems like it will do most
of the work for us. It is a subroutine that Radio Shack calls "LCD". The
manual describes it as "Displays a character on the LCD screen at current
cursor position."
Sounds good, but what does current cursor position mean? We don't know, so
we'll just use the subroutine and see where the numbers get displayed.
It is now time to back-up every file in memory.
Twice.
There is more information in the subroutine description. It says the entry
address is (hex) 4B44, entry conditions:
A=character to be displayed, exit conditions: None. The entry address is
simple the address of the subroutine. In Basic we gosub to a line number.
We dont have line numbers here, but we do have memory addresses.
The entry condition tells us that the A register must contain the character
we want printed when we call the subroutine. Of course registers don't
contain characters, they contain 8-bit numbers. We need to know what number
represents each character. The User's Manual tells us that starting on page
211. It gives us the numbers in decimal, hex and binary. We will use hex to
be consistent.
Look at program 1, and then we will discuss it.
________________________________________________________________________________
001 ;PRTIME
002 ;Oct 30, 1984
;
003 ORG $DAC0
004 ENT $DAC0
;
005 MVI A,$54 ;character code for "T"
006 CALL $4B44 ;print character in A onto screen
007 MVI A,$49 ;character code for "I"
008 CALL $4B44 ;print character
009 MVI A,$4D ;character code for "M"
010 CALL $4B44 ;print character
011 MVI A,$45 ;character code for "E"
012 CALL $4B44 ;print character
013 MVI A,$20 ;character code for space
014 CALL $4B44 ;print character
015 MVI A,$3D ;character code for "="
016 CALL $4B44 ;print character
017 MVI A,$20 ;character code for space
018 CALL $4B44 ;print character
;
019 LXI HL,$F934 ;load address of tens of seconds
020 MOV A,M ;move tens of seconds into register A
021 CALL $4B44 ;print character
022 LXI HL,$F933 ;load address of unit seconds
023 MOV A,M ;move unit seconds into register A
024 CALL $4B44 ;print character
;
025 CALL $12CB ;wait for keypress
026 JMP $5797 ;jump to menu
;
027 END
PROGRAM 1. Print seconds to screen.
________________________________________________________________________________
Please note that the line numbers are NOT part of the program. I added them
here for clarity, but the source code should not have them. Line numbers are
often printed out by the assembler itself, when you direct the assembler to
produce a listing.
The first thing you may notice are all the semicolons. They indicate that
what follows is a comment. Not all assemblers use the semicolon as the
comment indicator, so check your manual. There are no blank lines either.
Every line must have at least one character. However, it makes the program
easier to read if different portions of the program are separated by blank
lines, so the next best thing is to use a comment line without a comment,
just a semicolon. I've also put a comment after every instruction.
Comments are great if you have to come back to a program after not working
on it for a while, but they do take up space. I would recommend you use
comments liberally at first, and then delete some of them if you need more
memory.
The second thing you may notice is the ORG and ENT commands. These commands
are not commands for the 8085, but are commands for the assembler.
The ORG command is a standard (almost universal) command that tells the
assembler where the program is to reside in memory. It is the address of the
first instruction (or data) of your program (the ORiGin.) In this case the
program will start at $DAC0, which is 56000 in decimal.
The highest available memory address for your programs is 62959.
Memory above that is used for the file directory, and other RAM data the
M100 needs. Never let your program extend past there, and don't write to an
address above 62959 unless you know what you're doing.
The ENT instruction tells the assembler where the program should begin
execution. I have never found it neccessary to make the ENT address
different from ORG. Just make sure the first part of your program is an
instruction and not data.
The final instruction is END, and that is also an assembler directive. It
tells the assembler that that's all there is. You can put comments after the
END command without using a semicolon.
All the other lines contain 8085 assembly code.
Line five is the move immediate command, and moves the value $54 into
register A. $54 is the ASCII value for the letter "T". It is important to
note that even though all computers use ASCII code, the ASCII values may be
different. The ASCII value of "T" on the Apple is $D4.
Line 006 says it prints the character in A onto the screen. What it actually
does is CALL the subroutine at $4B44. The CALL instruction is just like the
GOSUB command in Basic. When a return instruction (and there are several
types) is encountered, program execution will resume at the instruction
immediately following the CALL instruction.
In lines 007 to 018, each letter of the word "IME = " is moved into the A
register, and then $4B44 is CALLed. The next section prints the seconds from
the real-time clock onto the screen.
To do this, we need to know where the seconds are located in memory.
Information like this can be found on a "memory map" of the Model 100. A
memory map tells what each address or range of addresses is used for. Most
commercial assemblers come with one. By looking in Custom Software's rather
extensive one, I discover that the tens of seconds are stored at address
$F934, and the unit seconds are at $F933.
So now we want to move a value from a memory location into the A register.
To do that we must first load the memory address into the HL register
(pointer register.) Line 019 does that with the 16-bit load immediate
instruction. Line 020 moves the value from memory pointed to by HL ($F934),
into the A register, and line 021 CALLs the subroutine at $4B44 print the
value in A to the screen.
Lines 022 through 024 do the same thing with the unit seconds at $F933. Once
the seconds have been displayed, we want the program to stop and wait until
we press any key before it returns to the menu. Line 024 does that for us.
Line 024 calls a subroutine at $12CB. $12CB is the entry address for the
subroutine that the M100 technical manual calls CHGET, which "waits and gets
character from keyboard. Since we don't care what key is pressed, that is all
we need to know. As it happens, the ASCII value of the key we press will be
in the A register when execution returns from that subroutine.
Line 025 will only be executed after a key has been pressed. At line 025
execution JuMPs to address $5797. The jump instruction is like the GOTO
instruction in Basic. The jump is executed no matter what. $5797 returns
execution to the main menu, it is like the MENU command in Basic.
Remember to leave out the line numbers when you type in this program.
Be careful to use either tabs or spaces as required by your assembler. They
are usually very finicky, and a space where a tab should be, or vice versa,
may give you a strange error message when you assemble it.
When you run the program from the menu, the screen clears, and the message
we programmed will be displayed in the upper-left corner. Evidently the
"current cursor position" described in the technical manual is line 1 and
row 1. What happens is that running any program from the menu automatically
clears the screen and sets the cursor to the upper-left corner of the screen.
If you run the program from Basic with a CALL 56000 command, the message will
appear wherever the cursor was.
10. Sneak Preview.
That concludes the first chapter of this tutorial. The next chapter will
cover conditional branching. The jump instruction used in the previous
section is an unconditional branch, it always occurs. Suppose you want to
jump back to the beginning of the program if the key pressed was a "B", and
jump to the menu if it was not. Two lines should be inserted between lines
024 and 025:
024a CPI $42 ;compare A with the immediate value $42 (= ASCII "B")
024b JZ $DAC0 ;if equals, then jump to start of this program
As you can see, Conditional branching makes things a lot more interesting.
11. Using Labels.
In section 9 a program was presented that displays "TIME = ", followed by
the seconds. In this section a more useful way of displaying messages is
discussed.
There are several ways to print characters to the LCD. The sample program in
section 9 actually uses the most inefficient method. However, the Radio
Shack Technical Manual doesn't tell you that there is a ROM subroutine that
prints a whole series of characters (a "string") for you. The subroutine to
do that is located at $5A58. I learned this from the instruction manual for
Custom Software's Model 100 Assembler. The assembler even has a nifty little
"macro" that does it for you. (A macro is a one word abbreviation for any
number of instructions. They can make the source code shorter.)
To use the subroutine, you need to have the string stored in memory. Each
assembler has it's own way of assigning data to memory. Here are examples of
how Custom Software does it:
________________________________________________________________________________
EL DB $4C ;stores ASCII "L" to memory location "EL"
JK DW $4A4B ;stores $4A to location "JK" & $4B to "JK"+1
MESSAGE DM TIME = ;stores characters starting at location "MESSAGE"
--------------------------------------------------------------------------------
DB means Define Byte. DW means Define Word (two bytes), and DM means Define
Message. DB and DW are pretty standard, but on some assemblers DM is STR,
for STRing. DB, DW and DM are all assembler directives. They tell the
assembler that there is data here, and not an instruction. The assembler
stores the appropriate value(s) in memory for you.
Notice the words "EL", "JK" and "MESSAGE" in the first column of each
assembler directive. The first column is called the label field, and those
words are labels. The use of labels is discussed below.
Now that we have the message stored in memory, we can use the ROM subroutine
at $5A58. According to Custom Software's manual, there are two requirements
that must be met (entry conditions). The data to be displayed must be
terminated with a zero. The subroutine will display character after
character until it reaches a zero, and then it will RETurn to your program.
Remember, it is looking for the value zero, not the ASCII value for zero,
which is $30. We must therefore add:
DB $00 ;store the value zero as a terminator
after the message line above.
The other entry condition is that HL should point to the characters to be
displayed. That means the HL register needs to contain the address of the
message. What is the address of the message? If it is the first line of the
program, it will be wherever the program is ORG'ed. If it isn't, we would
then have to count the number of bytes from the beginning of the program,
and add the ORG value to that. If you change the program, you would then
have to recalculate each address.
Luckily, there is a easier way, using labels. In the example above, MESSAGE
is a label. When the source code is assembled, the assembler keeps track of
all the labels. In this case, it would know the memory location of the label
MESSAGE. You can then use "MESSAGE" whenever you mean "the memory location
of the label MESSAGE".
Now we can load the HL register with the address of the message:
LXI HL,MESSAGE ;load HL with the address of the message
and then call the subroutine to display it:
CALL $5A58 ;display message
We don't even have to know what address MESSAGE stands for, the assembler
will take care of it for us. However, most assemblers will give you a list
of all labels and their values at the end of the source code listing.
Suppose now we wanted to call the subroutine at location $5A58 "DISPLAY".
If we could tell the assembler that the label DISPLAY means $5A58, then we
could use the label instead. We can do that using the EQUate directive.
The EQUate directive tells the assembler to assign a value to the label of
the EQUate directive. For example:
DISPLAY EQU $5A58 ;assign $5A58 to the label DISPLAY
at the beginning of your program will tell the assembler to substitute $5A58
wherever it sees DISPLAY.
Program 2 does the same thing as program 1, but a little more elegantly.
Here we use labels, and the ROM subroutine at
$5A58.
________________________________________________________________________________
;PRTIME2
;Nov 12, 1984
;
ORG $DAC0
ENT $DAC0
;
; These are all ROM subroutines
DISPLAY EQU $5A58 ;print message pointed to by HL
LCD EQU $4B44 ;print character in register A
CHGET EQU $12CB ;wait for keypress
MENU EQU $5797 ;main MODEL 100 menu
;
SECS EQU $F933 ;memory location for seconds
;
BEGIN LXI HL,MESSAGE ;set HL pointer to start of message data
CALL DISPLAY ;display message
;
LXI HL,SECS+1 ;SECS+1 = $F933+1 = $F934
MOV A,M
CALL LCD
LXI HL,SECS
MOV A,M
CALL LCD
;
CALL CHGET
JMP MENU
;
MESSAGE DM TIME = ;message data
DB $00 ;terminator
;
END
PROGRAM 2. Print seconds to screen (using labels).
________________________________________________________________________________
In program 2 you can see that you can perform arithmetic on labels. Some
assemblers can evaluate very complex expressions, while others can handle
only addition and subtraction. Check the documentation for your particular
assembler.
On page 80 of the Radio Shack Technical Manual is listed the addresses for
several useful display subroutines. Also listed are the cursor locations.
By moving different values into these two locations, you can start your
printing anywhere on the screen. Experimenting is fun, but don't forget to
be prepared for those cold starts!
(Thanks to Greg Susong of Custom Software for permission to use information
taken from the Custom Software Assembler Manual.)