Sk00l m3 ASM!!#@$!@#
by Ralph (fu@ckz.org)
-AWC (http://awc.rejects.net)
Version: 0.841
Date: 7/23/00
Fear of serious injury cannot alone justify suppression of free speech and assembly.
Louis Brandeis, 1927
TOC
1. Introduction
-What is it?
-Why learn it?
-What will this tutorial teach you?
2. Memory
-Number Systems
-Decimal
-Binary
-Hexadecimal
-Bits, Nibbles, Bytes, Words, Double Words
-The Stack
-Segment:Offset
-Registers
3. Getting started
-Getting an assembler
-Program layout
-.COM
-.EXE
4. Basic ASM
-Basic Register operations
-Stack operations
-Arithmetic operations
-Bitwise operations
-Interrupts
5. Tools
-Debug
-CodeView
6. More basics
-.COM file format
-Flow control operations
-Loops
-Variables
-Arrays
-String Operations
-Sub-Procedures
-User Input
7. Basics of Graphics
-Using interrupts
-Writing directly to the VRAM
-A line drawing program
8. Basics of File Operations
-File Handles
-Reading files
-Creating files
-Search operations
9. Basics of Win32
-Introduction
-Tools
-A Message Box
-A Window
Appendix A
-Resources
Appendix B
-Credits, Contact information, Other shit
1. Introduction
================
What is it?
-----------
Assembly language is a low-level programming language. The syntax is nothing like
C/C++, Pascal, Basic, or anything else you might be used to.
Why learn it?
-------------
If you ask someone these days what the advantage of assembly is, they will tell you
it's speed. That might have been true in the days of BASIC or Pascal, but today a
C/C++ program compiled with an optimizing compiler is as fast as, or even faster
than, the same algorithm in assembly. According to many people assembly is dead.
So why bother learning it?
1. Learning assembly will help you better understand just how a computer works.
2. If Windows crashes, it usually returns the location/action that caused the error.
   However, it doesn't return it in C/C++. Knowing assembly is the only way to track
   down bugs/exploits and fix them.
3. How often do you wish you could just get rid of that stupid nag screen in that
   shareware app you use? Knowing a high-level language won't get you very far when
   you open the shit up in your disassembler and see something like CMP EAX, 7C0A
4. Certain low-level and hardware situations still require assembly.
5. If you need precise control over what your program is doing, a high-level language
   is seldom powerful enough.
6. Any way you put it, even the most optimized high-level language compiler is still
   just a general compiler, thus the code it produces is also general code. If you
   have a specific task, it can run faster in hand-optimized assembly than in any
   other language.
7. "Professional Assembly Programmer" looks damn good on a resume.
My personal reason why I think assembly is the best language is the fact that you're
in control. Yes, all you C/C++/Pascal/Perl/etc coders out there, in all your fancy
high-level languages you're still the passenger. The compiler and the language itself
limit you. In assembly you're only limited by the hardware you own. You control the
CPU and memory, not the other way around.
What will this tutorial teach you?
----------------------------------
I tried to make this an introduction to assembly, so I'm starting from the beginning.
After you've read this you should know enough about assembly to develop graphics
routines, make something like a simple database application, accept user input,
make Win32 GUIs, use organized and reusable code, know about different data types
and how to use them, some basic I/O shit, etc.
2. Memory
==========
In this chapter I will ask you to take a whole new look at computers. To many they
are just boxes that allow you to get on the net, play games, etc. Forget all that
today and think of them as what they really are, Big Calculators. All a computer does
is Bit Manipulation. That is, it can turn certain bits on and off. A computer can't
even do all arithmetic operations. All it can do is add. Subtraction is achieved
by adding negative numbers, multiplication is repeated adding, and dividing is
repeated adding of negative numbers (well sort of, but CPU logic is out of the scope
of this file).
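
To make the "all it can do is add" idea a bit more concrete, here is a rough Python
sketch. It is not part of the original tutorial and it is not how a real CPU is wired.
It assumes 16-bit values and assumes negative numbers are stored in two's complement
form (flip every bit, add 1), which is what x86 CPUs use. The function names are made
up for illustration.

```python
MASK = 0xFFFF  # keep every result to 16 bits, like a 16-bit register


def negate(x):
    # Two's complement negative: flip every bit, then add 1
    return ((x ^ MASK) + 1) & MASK


def subtract(a, b):
    # a - b done as a + (-b)
    return (a + negate(b)) & MASK


def multiply(a, b):
    # a * b done as adding a to a running total b times
    total = 0
    for _ in range(b):
        total = (total + a) & MASK
    return total


def divide(a, b):
    # a / b done by adding -b over and over until less than b is left
    if b == 0:
        raise ZeroDivisionError("can't divide by 0")
    count = 0
    while a >= b:
        a = subtract(a, b)
        count += 1
    return count, a  # (answer, remainder)


print(subtract(125, 25))  # 125 + (-25)
print(multiply(12, 5))    # 12 + 12 + 12 + 12 + 12
print(divide(7, 2))       # (answer, remainder)
```

Every function above boils down to the + operator plus some bit flipping, which is
the point the paragraph is making.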
Number systems
--------------
All of you are familiar with at least one number system, Decimal. In this chapter I
will introduce you to 2 more, Binary and Hexadecimal.
Decimal
Before we get into the other 2 systems, let's review the decimal system. The decimal
system is a base 10 system, meaning that it consists of 10 digits that are used to
make up all other numbers. These 10 digits are 0-9. Let's use the number 125 as an
example:
          Hundreds   Tens     Units
Digit     1          2        5
Meaning   1x10^2     2x10^1   5x10^0
Value     100        20       5
NOTE: x^y means x to the power of y. ex. 13^3 means 13 to the power of 3 (2197)
Add the values up and you get 125.
Make sure you understand all this before going on to the binary system!
Binary
The binary system looks harder than decimal at first, but is in fact quite a bit
easier since it's only base 2 (0-1). Remember that in decimal you do
"value x 10^position" to get the real number, well in binary you do
"value x 2^position" to get the answer. Sounds more complicated than it is. To better
understand this, let's do some converting.
Take the binary number 10110:
1 x 2^4 = 16
0 x 2^3 = 0
1 x 2^2 = 4
1 x 2^1 = 2
0 x 2^0 = 0
Answer: 22
NOTE: for the next example I already converted the Ax2^B stuff to the real value:
2^0 = 1
2^1 = 2
2^2 = 4
2^3 = 8
2^4 = 16
2^5 = 32
etc....
Let's use 111101:
1 x 32 = 32
1 x 16 = 16
1 x 8 = 8
1 x 4 = 4
0 x 2 = 0
1 x 1 = 1
Answer: 61
After a while you will get the hang of converting binary to decimal. I can fluently
convert up to 8 binary digits to decimal, it's all just practice.
Make up some binary numbers and convert them to decimal to practise this. It is very
important that you completely understand this concept. If you don't, check Appendix A
for links and read up on this topic BEFORE going on!
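
If you want a way to check your practice answers, the "value x 2^position" method
can be sketched in a few lines of Python. This is not from the original tutorial, just
a minimal illustration, and the function name binary_to_decimal is made up.

```python
def binary_to_decimal(bits):
    total = 0
    # Walk the digits from right to left so the rightmost digit is position 0
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position  # value x 2^position
    return total


print(binary_to_decimal("10110"))   # the first example above
print(binary_to_decimal("111101"))  # the second example above
```

Python's built-in int("10110", 2) does the same conversion, so you can use it to
double check the function as well as your own answers.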
Now let's convert decimal to binary, take a look at the example below:
238 / 2 remainder: 0
119 / 2 remainder: 1
59 / 2 remainder: 1
29 / 2 remainder: 1
14 / 2 remainder: 0
7 / 2 remainder: 1
3 / 2 remainder: 1
1 / 2 remainder: 1
0 / 2 remainder: 0
Answer: 11101110
Let's go through this:
1. Divide the original number (238) by 2. If it divides evenly the remainder is 0.
2. Divide the answer from the previous calculation (119) by 2. If it won't
   divide evenly the remainder is 1.
3. Round the number from the previous calculation DOWN (59), and divide it by 2.
   Answer: 29, remainder: 1
4. Repeat until you get to 0....
5. Read the remainders from the bottom up to get the binary number.
The final answer should be 011101110, notice how the answer given is missing the
1st 0? That's because just like in decimal, leading zeros have no value and can be
omitted (023 = 23).
Practise this with some other decimal numbers, and check it by converting your answer
back to decimal. Again make sure you get this before going on!
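
The divide-by-2 steps can also be written as a small Python sketch. Again this is
only an illustration that is not part of the original text, and the name
decimal_to_binary is made up. It stops as soon as the number reaches 0, so it never
produces the extra leading 0 mentioned above.

```python
def decimal_to_binary(number):
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        remainders.append(str(number % 2))  # remainder of dividing by 2
        number //= 2                        # the answer, rounded DOWN
    # The last remainder found is the first binary digit, so read them backwards
    return "".join(reversed(remainders))


print(decimal_to_binary(238))
```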
A few additional things about binary:
* Usually 1 represents TRUE, and 0 FALSE
* When writing binary, keep the number of digits a multiple of 4
  ex. DON'T write 11001, change it to 00011001, remember that the 0s in front
  are not worth anything
* Usually you add a b after the number to signal the fact that it is a binary number
  ex. 00011001 = 00011001b
Hexadecimal
Some of you may have noticed some consistency in things like RAM sizes. They always
seem to be a power of 2. For example, it is common to have 128 megs of RAM, but
you won't find 127 anywhere. That's because computers like to use powers of 2:
2, 4, 8, 16, 32, 64 etc. That's where hexadecimal comes in. Since hexadecimal is
base 16 (2^4), each hexadecimal digit stands for exactly 4 binary digits, which makes
it perfect for computers. If you understood the binary section earlier, you should
have no problems with this one. Look at the table below, and try to memorize it. It's
not as hard as it looks.
Hexadecimal     Decimal     Binary
0h              0           0000b
1h              1           0001b
2h              2           0010b
3h              3           0011b
4h              4           0100b
5h              5           0101b
6h              6           0110b
7h              7           0111b
8h              8           1000b
9h              9           1001b
Ah              10          1010b
Bh              11          1011b
Ch              12          1100b
Dh              13          1101b
Eh              14          1110b
Fh              15          1111b
NOTE: the h after each hexadecimal number stands for <insert guess here>
Now let's do some converting:
Hexadecimal to Decimal
2A4F
F x 16^0 = 15 x 1 = 15
4 x 16^1 = 4 x 16 = 64
A x 16^2 = 10 x 256 = 2560
2 x 16^3 = 2 x 4096 = 8192
Answer: 10831
1. Write down the hexadecimal number starting from the last digit
2. Change each hexadecimal digit to decimal and multiply it by 16^position
3. Add all final numbers up
Confused? Let's do another example: DEAD
D x 1 = 13 x 1 = 13
A x 16 = 10 x 16 = 160
E x 256 = 14 x 256 = 3584
D x 4096 = 13 x 4096 = 53248
Answer: 57005
Practise this method until you get it, then move on.
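
The same three steps as a minimal Python sketch, in case you want to check your
practice numbers. It is not part of the original tutorial; HEX_DIGITS and
hex_to_decimal are names made up for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_decimal(text):
    total = 0
    # Step 1: start from the last digit, so it gets position 0
    for position, digit in enumerate(reversed(text.upper())):
        value = HEX_DIGITS.index(digit)   # step 2: A -> 10 ... F -> 15
        total += value * 16 ** position   # step 2: times 16^position
    return total                          # step 3: everything added up


print(hex_to_decimal("2A4F"))
print(hex_to_decimal("DEAD"))
```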
Decimal to Hexadecimal
Study the following example:
1324
1324 / 16 = 82.75
82 x 16 = 1312
1324 - 1312 = 12, converted to Hexadecimal: C
82 / 16 = 5.125
5 x 16 = 80
82 - 80 = 2, converted to Hexadecimal: 2
5 / 16 = 0.3125
0 x 16 = 0
5 - 0 = 5, converted to Hexadecimal: 5
Answer: 52C
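I'd do another example, but it's too much of a pain in the ass, maybe some other time.
Note that the digits come out last one first: the first remainder you find (C) is the
last digit of the answer.
Learn this section, you WILL need it!
This was already one of the hardest parts, the next sections should be a bit easier.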
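
Since doing more decimal to hexadecimal examples by hand is a pain, here is a small
Python sketch of the method from the 1324 example. It is an illustration only and not
from the original tutorial. The names HEX_DIGITS and decimal_to_hex are made up.

```python
HEX_DIGITS = "0123456789ABCDEF"


def decimal_to_hex(number):
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        whole = number // 16                 # 1324 / 16 = 82 (rounded down)
        leftover = number - whole * 16       # 1324 - 82 x 16 = 12
        digits.append(HEX_DIGITS[leftover])  # 12 converted to hexadecimal: C
        number = whole                       # carry on with 82
    # The digits were found last one first, so turn them around
    return "".join(reversed(digits))


print(decimal_to_hex(1324))
```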
Some additional things about hexadecimal
1. It's not uncommon to say "hex" instead of "hexadecimal" even though "hex" means 6,
   not 16.
2. Keep hexadecimal numbers in multiples of 4 digits, adding zeros in front as
   necessary
3. Most assemblers can't handle numbers that start with a "letter" because they don't
   know if you mean a label, instruction, etc. In that case there are a number of
   other ways you can express the number. The most common are:
   DEAD = 0DEADh (Usually used for DOS/Win)
   and
   DEAD = 0xDEAD (Usually used for *Nix based systems)
   Consult your assembler's manual to see what it uses.
By the way, does anyone think I should add Octal to this...?
Bits, Nibbles, Bytes, Words, Double Words
-----------------------------------------
Bits are the smallest unit of data on a computer. Each bit can only represent 2
numbers, 1 or 0. Bits are fairly useless because they're too damn small so we got the
nibble. A nibble is a collection of 4 bits. That might not seem very interesting, but
remember how all 16 hexadecimal digits can be represented with a set of 4 binary
digits? That's pretty much all a nibble is good for.
The most important data structure used by your computer is a Byte. A byte is the
smallest unit that can be addressed by your processor. It is made up of 8 bits, or
2 nibbles. Everything you store on your hard drive, send with your modem, etc is in
bytes. For example, let's say you store the number 170 on your hard drive, it would
look like this:
+---+---+---+---+---+---+---+---+
| 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
+---+---+---+---+---+---+---+---+
  7   6   5   4   3   2   1   0
   H.O Nibble   |   L.O Nibble
10101010 is 170 in binary. Since we can fit 2 nibbles in a byte, we can also refer
to bits 0-3 as the Low Order Nibble, and 4-7 as the High Order Nibble.
Next we got Words. A word is simply 2 bytes, or 16 bits. Say you store 43690, it would
look like this:
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
         High Order Byte        |        Low Order Byte
Again, we can refer to bits 0-7 as Low Order Byte, and 8-15 as High Order Byte.
Lastly we have a Double Word, which is exactly what it says, 2 words, 4 bytes,
8 nibbles or 32 bits.
NOTE: Originally a Word was the size of the BUS from the CPU to the RAM. Today most
      computers have at least a 32bit bus, but most people were used to
      1 word = 16 bits so they decided to keep it that way. As far as I know this
      is only true for Intel and compatible CPUs. A MIPS word for example is 32-bit.
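
To see how nibbles, bytes, words and double words fit inside each other, here is a
minimal Python sketch using bit masks and shifts. It is not part of the original
tutorial and all function names are made up for illustration.

```python
def low_nibble(byte):
    return byte & 0x0F          # bits 0-3


def high_nibble(byte):
    return (byte >> 4) & 0x0F   # bits 4-7


def low_byte(word):
    return word & 0xFF          # bits 0-7


def high_byte(word):
    return (word >> 8) & 0xFF   # bits 8-15


def make_dword(high_word, low_word):
    # Two 16-bit words glued together into one 32-bit double word
    return (high_word << 16) | low_word


print(format(170, "08b"))                  # the byte from the diagram, in binary
print(high_nibble(170), low_nibble(170))   # both nibbles of 170
print(high_byte(43690), low_byte(43690))   # both bytes of 43690
```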
The Stack
---------
You have probably heard about the stack very often. If you still don't know what it
means, read on. The stack is a very useful Data Structure (anything that holds data).
Think of it as a stack of books. You put one on top of it, and that one will be the
first one to come off next. Putting stuff on the stack is called Pushing, getting
stuff from the stack is called Popping. For example, say you have 5 books called A, B,
C, D, and E stacked on top of each other like this:
A
B
C
D
E
Now you add (push) book F to the stack:
F
A
B
C
D
E
If you pop the stack, you get book F back and the stack looks like this again:
A
B
C
D
E
This is called LIFO, Last In, First Out.
So what good is all this? The stack is extremely useful as a "scratchpad" to
temporarily hold data.
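
The book example can be played through with a plain Python list, where the end of the
list is the top of the stack. This is only a sketch of the LIFO idea and not part of
the original tutorial. The real CPU stack lives in memory and is covered later.

```python
# Bottom of the stack first, so book A (the last item) is on top
stack = ["E", "D", "C", "B", "A"]

stack.append("F")   # push book F on top
top = stack.pop()   # pop: the last book in is the first one out

print(top)          # the book that came off
print(stack)        # the stack looks like it did before the push
```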
Segment:Offset
--------------
Everything on your computer is connected through a series of wires called the BUS.
Back in the days when dinosaurs ruled the earth, the BUS to the RAM was 16 bits. So
when the processor needed to write to the RAM, it did so by sending the 16 bit
location through the bus. This meant that computers could only have 65536 bytes of
memory (the biggest 16 bit number is 1111111111111111 = 65535, so the locations run
from 0 to 65535). That was plenty back then, but today that's not quite enough. So
designers came up with a way to send 20 bits over the bus, thus allowing for a total
of 1 MB of memory. In this new design, memory is segmented into a collection of bytes
called Segments, and can be accessed by specifying the Offset number within those
segments. So if the processor wants to access data it first specifies the Segment
number, followed by the Offset number. For example, if the processor sends a request
of 1234:4321, the RAM would send back the byte at offset 4321 in segment number 1234.
This all might sound a bit complicated, but study it carefully and you should be able
to master segment:offset.
I find the best way to picture seg:off is with a 2 dimensional array. Remember that
X,Y shit you had to learn in grade 9 math?
Look at the diagram below, the * is located at 4:3. The Y-axis is equal to the
segment, and the X-axis is the offset.
            +--+--+--+--+--+
          5 |  |  |  |  |  |
            +--+--+--+--+--+
          4 |  |  |* |  |  |
  Y axis    +--+--+--+--+--+
          3 |  |  |  |  |  |
            +--+--+--+--+--+
          2 |  |  |  |  |  |
            +--+--+--+--+--+
          1 |  |  |  |  |  |
            +--+--+--+--+--+
             1  2  3  4  5
                 X axis
To get the physical address do this calculation:
Segment x 10h + Offset = physical address
For example, say you have 1000:1234, to get the physical address you do:
1000h X 10h = 10000h
  10000h
+  1234h
  ------
  11234h
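
Here is the same calculation as a minimal Python sketch, handy for checking your own
answers. It is not part of the original tutorial and the function name
physical_address is made up. Python writes hexadecimal numbers with a 0x in front
instead of an h at the end.

```python
def physical_address(segment, offset):
    # Segment x 10h + Offset = physical address
    return segment * 0x10 + offset


# 1000:1234 from the example above, printed in hexadecimal
print(format(physical_address(0x1000, 0x1234), "X") + "h")

# A different segment:offset pair can land on the very same byte
print(format(physical_address(0x1123, 0x0004), "X") + "h")
```

The second call shows something the 2D picture hides: segments start only 10h bytes
apart, so they overlap and one physical address can be written as many different
seg:off pairs.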
This method is fairly easy, but also fairly obsolete. Starting from the 286 you can
work in Protected Mode. In this mode the CPU uses a Look Up Table to compute the
seg:off location. That doesn't mean that you cannot use seg x 10h + off though, you
will only be limited to working in Real Mode and your programs can't access more than
1 MB. However by the time you know enough to write a program even close to this limit,
you already know how to use other methods (for those coming from a 50 gig hard drive
world, a program that's 1 MB is about 10x bigger than this text file is).
Registers
---------
A processor contains small areas that can store data. They are too small to store
files, instead they are used to store information while the program is running.
The most common ones are listed below:
General Purpose:
NOTE: All general purpose registers are 16 bit and can be broken up into two 8 bit
      registers. For example, AX can be broken up into AL and AH. L stands for Low
      and H for High. If you assign a value to AX, AH will contain the first part of
      that value, and AL the last. For example, if you assign the value DEAD to AX,
      AH will contain DE and AL contains AD. Likewise the other way around, if you
      assign DE to AH and AD to AL, AX will contain DEAD.
AX - Accumulator
     Made up of: AH, AL
     Common uses: Math operations, I/O operations
BX - Base
     Made up of: BH, BL
     Common uses: Base or Pointer
CX - Counter
     Made up of: CH, CL
     Common uses: Loops and Repeats
DX - Data
     Made up of: DH, DL
     Common uses: Various data, character output
When the 386 came out it added 4 new registers to that category: EAX, EBX, ECX, and
EDX. The E stands for Extended, and that's just what they are, 32bit extensions to the
originals. Take a look at this diagram to better understand how this works:
|        EAX        |
+----+----+----+----+
|    |    | AH | AL |
+----+----+----+----+
          |   AX    |
Each box represents 8 bits
NOTE: There is no EAH or EAL
Also note that in this textfile I will only concentrate on 16-bit programs, so I will
not talk about any 32-bit registers.
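
How AX, AH and AL share the same bits can be mimicked with a few shifts and masks.
This is just a Python sketch for illustration and not part of the original tutorial.
The function names are made up, and real registers are hardware, not variables.

```python
def split_ax(ax):
    ah = (ax >> 8) & 0xFF   # high 8 bits of AX
    al = ax & 0xFF          # low 8 bits of AX
    return ah, al


def join_ax(ah, al):
    return (ah << 8) | al   # AH and AL put back together


def ax_of(eax):
    return eax & 0xFFFF     # AX is the low 16 bits of EAX


ah, al = split_ax(0xDEAD)
print(format(ah, "X"), format(al, "X"))   # the two halves of DEAD
print(format(join_ax(0xDE, 0xAD), "X"))   # and back again
```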
Segment Registers:
NOTE: It is dangerous to play around with these!
CS - Code Segment. The memory block that stores code
DS - Data Segment. The memory block that stores data
ES - Extra Segment. Commonly used for video stuff
SS - Stack Segment. Segment the stack is in.
Index Registers:
SI - Source Index. Commonly used to specify the source of a string/array
DI - Destination Index. Used to specify the destination of a string/array
Stack Registers:
BP - Base Pointer. Used in conjunction with SP for stack operations
SP - Stack Pointer. Holds the offset of the top of the stack
Special Purpose Registers:
IP - Instruction Pointer. Holds the offset of the next instruction to be executed.
     You can not directly move data into it, you can however manipulate its contents
     with various branching instructions.
Flags - These are a bit different from all other registers. Each flag is only 1
bit in size. It's either 1 (true), or 0 (false). There are a number of flags,
including the Carry flag, Overflow flag, Parity flag, Direction flag, and more, all
packed together into one 16 bit Flags register. You don't usually assign numbers to
these manually, although it certainly is possible. The value is automatically set
depending on the previous instruction. One common use for them is for branching. For
example, say you compare the value in BX with the value in CX, if it's the same the
Zero flag would be set to 1 (true) and you could use that information to branch off
into another area of your program.
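
To give a feel for how a compare sets flags, here is a rough Python sketch. It is not
part of the original tutorial and it only models two flags. It assumes the x86
behaviour that a compare works like a subtraction whose result is thrown away, with
both values treated as unsigned 16-bit numbers. The function name compare is made up.

```python
def compare(bx, cx):
    result = (bx - cx) & 0xFFFF          # 16-bit result of bx - cx, only used for flags
    zero_flag = 1 if result == 0 else 0  # set when both values are the same
    carry_flag = 1 if bx < cx else 0     # set when the subtraction had to borrow
    return zero_flag, carry_flag


zero, carry = compare(0x1234, 0x1234)
if zero == 1:
    print("same value, branch off to another part of the program")
else:
    print("different values, carry on with the next instruction")
```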
There are a few more registers, but you will most likely never use them anyway.
Exercises:
1. Write down all general purpose registers and memorize them
2. Make up random numbers and manually convert them into Binary and hexadecimal
3. Make a 2D graph of the memory located at 0106:0100
4. Get the physical address of 107A:0100