Transcript Assembler
x86, Assembler
TASM, MASM, NASM
Available assemblers
MASM (Microsoft): Macro Assembler
TASM (Borland): Turbo Assembler
NASM (Library General Public License (LGPL) [Free]): Netwide Assembler
Others: Flat Assembler, SpAssembler
MASM: Microsoft Macro Assembler
MASM contains a macro language with
looping, arithmetic, text string processing,
and so on.
MASM supports the instruction sets of the
386, 486, and Pentium processors, giving
you greater direct control over the
hardware. You can also avoid extra time and
memory overhead when using MASM.
http://msdn.microsoft.com/library/en-us/vcmasm/html/vcoriMicrosoftAssemblerMacroLanguage.asp
TASM: Turbo Assembler
TASM, Inprise's Borland Turbo Assembler,
supports an alternative to MASM emulation.
This is known as Ideal mode and provides
several advantages over MASM.
The key (questionable) disadvantage, of
course, is that MASM style assemblers
cannot assemble Ideal mode programs.
NASM: Netwide Assembler
NASM is designed for portability and
modularity. It supports a range of object file
formats including Linux, Microsoft 16-bit OBJ
and Win32. Its syntax is designed to be
simple and easy to understand, similar to
Intel's but less complex.
It supports Pentium, P6, MMX, 3DNow! and
SSE opcodes, and has macro capability. It
includes a disassembler as well.
NASM is licensed under the Library General Public
License (LGPL) [Free]
http://nasm.sourceforge.net
FASM: Flat Assembler
Currently it supports all 8086-80486/Pentium
instructions with MMX, SSE, SSE2, SSE3 and
3DNow! extensions, and can produce output in
binary, MZ, PE, COFF or ELF format.
It includes powerful but easy-to-use
macroinstruction support and makes multiple
passes to optimize the instruction codes for size.
The flat assembler can assemble itself, and the full
source code is included.
http://flatassembler.net/
About developing assembly language
CPU’s language (instructions)
X86 instruction set
About the Compiler
Directives
MASM
TASM
NASM
TASM
Important files
Compiler
TASM
TASM32
Linker
TLINK
16 bits
32 bits
real mode
protected mode
Pseudo instructions
Segment, ends: define a segment.
Assume: specify which segment register
is used to address each segment defined
by "Segment, ends"
Data allocation
Segment Declaration
Usage
Segment_name segment
…
Segment_name ends
Ex.
Cseg segment
…
Cseg ends
Label declaration
Usage
Label name followed by a colon ":"
Ex.
Start: …
mov bx, offset start
…
jmp Start
Data allocation
Define value
DB  Define Byte
DW  Define Word
DD  Define Doubleword
DQ  Define Quadword
DT  Define Ten Bytes
Usage
Var_name Dx data
Ex. Data allocation
dseg segment
Msg  db "hello world$"
MulH dw 0, 1, 2, 3
MulF dd 1234h
dseg ends
Data duplication
Usage
type count dup (value)
Ex.
data1 db 10 dup (0)
data2 db 2 dup (3 dup (0))
data3 db 3 dup (1, 2, 3 dup (4))
data4 db 4 dup (?)
Structure
Struc PosType
  Row dw ?
  Col dw ?
Ends PosType
Union PosValType
  Pos PosType ?
  Val dd ?
Ends PosValType
Point PosValType ?
Structure
mov [Point.Pos.Row], bx ;
; OK: Move BX to Row component of Point
mov [Point.Pos.Row], bl ;
; Error: mismatched operands
Data reference
offset directive: retrieves the offset (address) of a data item
mov bx, offset msg1 ;bx = offset/addr of msg1
To retrieve / store data
mov dx, msg1   ;dx = [msg1]
mov [msg1], dx ;[msg1] = dx
mov [bx+2], dx ;[bx+2] = dx
Memory contents
ByteVal db ? ;"ByteVal" is name of byte variable
mov ax, bx
;OK: Move value of BX to AX
mov ax, [bx]
;OK: Move word at address BX to AX. Size of
;destination is used to generate proper object code
mov ax,[word bx]
;OK: Same as above with unnecessary size qualifier
mov ax,[word ptr bx]
;OK: Same as above with unnecessary size qualifier
;and redundant pointer prefix
mov al, [bx]
;OK: Move byte at address BX to AL. Size of
;destination is used to generate proper object code
mov [bx], al ; OK: Move AL to location BX
Memory contents
mov ByteVal, al
;Warning: "ByteVal" needs brackets
mov [ByteVal], al
;OK: Move AL to memory location named "ByteVal"
mov [ByteVal], ax
;Error: unmatched operands
mov al, [bx+2]
;OK: Move byte from memory location BX+2 to AL
mov al, bx[2]
; Error: indexes must occur with "+" as above
mov bx, Offset ByteVal
;OK: Offset statement does not use brackets
mov bx, Offset [ByteVal]
; Error: offset cannot be taken of the contents of memory
Memory contents
lea bx, [ByteVal]
;OK: Load effective address of "ByteVal"
lea bx, ByteVal
;Error: brackets required
mov ax, 01234h
;OK: Move constant word to AX
mov [bx], 012h
;Warning: size qualifier needed to determine
;whether to populate byte or word
mov [byte bx], 012h
;OK: constant 012h is moved to byte at address BX
mov [word bx], 012h
;OK: constant 012h is moved to word at address BX
Echo entered string
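cseg segment
assume cs:cseg, ds:cseg
org 100h
start: jmp load
Buf db 11, 12 dup (' ') ;max length, then room for count + text
_ent db 10,13,'$' ;lf,cr
load: mov ah,0ah ;DOS buffered keyboard input
mov dx,offset buf
int 21h
mov ah,09h ;print lf,cr
mov dx,offset _ent
int 21h
mov al,[buf+1] ;number of characters entered
mov ah,00h
mov bx,offset buf+2
add bx,ax ;bx = first byte after the entered text
mov byte ptr [bx],'$' ;terminate the string for function 09h
mov ah,09h ;print the entered string
mov dx,offset buf+2
int 21h
int 20h
cseg ends
end start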
Compiling a program
Syntax:
TASM [options] source [,object] [,listing] [,xref]
/z Display source line with error message
/zi,/zd,/zn Debug info: zi=full, zd=line numbers only, zn=none
Ex
TASM /zi hello.asm
Creating an executable file
TLINK objfiles, exefile, mapfile, libfiles, deffile, resfiles
/v Full symbolic debug information
/t Create COM file (same as /Tdc)
/Txx Specify output file type
  /Tdx DOS image (default); x can be e=EXE or c=COM
  /Twx Windows image; x can be e=EXE or d=DLL
Ex
Tlink /v /t hello;
NASM
NASM vs. MASM & TASM
NASM is case sensitive.
NASM requires square brackets for
memory references
No 'offset' needed: a bare name, whether an 'equ' or an
address, stands for its value
mov ax, data ; MASM/TASM: mov ax, offset data
Use square brackets to retrieve the content
mov ax, [data]
Every symbol is treated as a label, not as a var,
an equ or anything else
NASM vs. MASM & TASM
Does not support hybrid syntaxes, such as
mov ax, table[bx] -> mov ax, [table + bx]
Likewise
mov ax, es:[di] -> mov ax, [es:di]
NASM Doesn't Store Variable Types
NASM, by design, chooses not to remember
the types of variables you declare. Whereas
MASM will remember, on seeing 'var dw 0',
that you declared 'var' as a word-size
variable, and will then be able to resolve the
ambiguity in the size of the instruction
'mov var,2', NASM will deliberately remember
nothing about the symbol 'var' except where it
begins, and so you must explicitly code
'mov word [var],2'.
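The rule above can be sketched as a tiny DOS .COM program. This is only an illustration, assuming the `-f bin` output format; the name `flag` is made up for the example, while `var` follows the text.

```nasm
; Sketch: operand sizes must be spelled out when no register implies them.
        org 0x100

start:
        mov word [var], 2       ; store 2 as a 16-bit word; size must be stated
        mov byte [flag], 1      ; store 1 as a single byte
        mov ax, [var]           ; size is implied by the 16-bit register AX
        mov al, [flag]          ; size is implied by the 8-bit register AL
        int 20h                 ; terminate the .COM program

var:    dw 0                    ; NASM records only the address of 'var'
flag:   db 0                    ; hypothetical second variable
```

Writing `mov [var], 2` without `word` or `byte` is rejected by NASM because the operation size is not specified.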
NASM Doesn't Store Variable Types
For this reason, NASM doesn't support the
`LODS', `MOVS', `STOS', `SCAS', `CMPS',
`INS', or `OUTS' instructions, but only
supports the forms such as `LODSB',
`MOVSW', and `SCASD', which explicitly
specify the size of the components of the
strings being manipulated.
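A minimal sketch of the size-suffixed string instructions, assuming a DOS .COM program where DS and ES both point at the program's own segment. The labels `src`, `dst`, `words` and `srclen` are illustrative names, not part of the slides.

```nasm
; Sketch: string instructions carry their element size in the mnemonic.
        org 0x100

start:
        cld                     ; string operations move upwards in memory
        mov si, src             ; DS:SI = source
        mov di, dst             ; ES:DI = destination (ES = DS in a .COM file)
        mov cx, srclen          ; number of bytes to copy
        rep movsb               ; byte-sized copy: the B suffix states the size

        mov si, words
        lodsw                   ; load one word from DS:SI into AX, SI += 2
        int 20h                 ; terminate the .COM program

src:    db 'hello'
srclen  equ $-src               ; length of src in bytes
dst:    times srclen db 0       ; destination buffer of the same length
words:  dw 0x1234
```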
NASM Doesn't `ASSUME'
As part of NASM's drive for simplicity, it also
does not support the ‘ASSUME’ directive.
NASM will not keep track of what values you
choose to put in your segment registers, and
will never _automatically_ generate a
segment override prefix.
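As an illustration, a segment override in NASM is always written out inside the brackets. The sketch below assumes a DOS .COM program; the label `value` is a made-up name.

```nasm
; Sketch: NASM emits a segment override only where one is written.
        org 0x100

start:
        mov ax, cs
        mov es, ax              ; point ES at our own segment, explicitly
        mov di, value
        mov al, [es:di]         ; ES override requested explicitly
        mov ah, [di]            ; no override written: the default, DS, is used
        int 20h                 ; terminate the .COM program

value:  db 42
```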
NASM Doesn't Support Memory Models
NASM also does not have any directives to support
different 16-bit memory models. The programmer has
to keep track of which functions are supposed to be
called with a far call and which with a near call, and is
responsible for putting the correct form of ‘RET’
instruction (`RETN' or `RETF'; NASM accepts `RET'
itself as an alternate form for `RETN'); in addition, the
programmer is responsible for coding CALL FAR
instructions where necessary when calling _external_
functions, and must also keep track of which external
variable definitions are far and which are near.
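A small sketch of matching each call with the right return, assuming a single-segment DOS .COM program. Because a flat binary has no external functions, the far call is simulated by pushing CS by hand; `near_func` and `far_func` are illustrative names.

```nasm
; Sketch: the programmer pairs each call type with the right return.
        org 0x100

start:
        call near_func          ; near call: pushes IP only
        push cs                 ; simulate a far call within the same segment:
        call far_func           ; CS pushed by hand, then the call pushes IP
        int 20h                 ; terminate the .COM program

near_func:
        retn                    ; pops IP
far_func:
        retf                    ; pops IP, then CS
```

Using `retn` in `far_func` would leave the pushed CS on the stack, which is the kind of mismatch NASM will not catch for you.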
Layout of a NASM Source Line
Like most assemblers, each NASM source
line contains (unless it is a macro, a
preprocessor directive or an assembler
directive) some combination of the four fields
label: instruction operands ; comment
Declaring Initialized Data
DB, DW, DD, DQ and DT are used, much as
in MASM, to declare initialized data in the
output file. They can be invoked in a wide
range of ways:
db 0x55                ; just the byte 0x55
db 0x55,0x56,0x57      ; three bytes in succession
db 'a',0x55            ; character constants are OK
db 'hello',13,10,'$'   ; so are string constants
dw 0x1234              ; 0x34 0x12
dw 'a'                 ; 0x61 0x00 (it's just a number)
dw 'ab'                ; 0x61 0x62 (character constant)
dw 'abc'               ; 0x61 0x62 0x63 0x00 (string)
dd 0x12345678          ; 0x78 0x56 0x34 0x12
dd 1.234567e20         ; floating-point constant
dq 1.234567e20         ; double-precision float
dt 1.234567e20         ; extended-precision float
Declaring Uninitialized Data
RESB, RESW, RESD, RESQ and REST are
designed to be used in the BSS section of a
module: they declare uninitialized storage
space.
Each takes a single operand, which is the
number of bytes, words, doublewords or
whatever to reserve.
NASM does not support the MASM/TASM
syntax of reserving uninitialized space by
writing `DW ?' or similar things.
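A minimal sketch of reserving uninitialized storage, assuming a DOS .COM program built with `-f bin`. The names `buffer`, `wordvar` and `count` are illustrative.

```nasm
; Sketch: RESB/RESW reserve space in .bss without storing bytes in the file.
        org 0x100

section .text
start:
        mov byte [buffer], 'x'  ; the storage exists at run time
        mov word [count], 0     ; initialize it yourself before use
        int 20h                 ; terminate the .COM program

section .bss
buffer:  resb 64                ; reserve 64 bytes
wordvar: resw 1                 ; reserve 1 word
count:   resw 1                 ; reserve 1 more word
```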
Defining Constants
EQU defines a symbol to a given constant
value: when EQU is used, the source line
must contain a label. The action of EQU is to
define the given label name to the value of its
(only) operand.
This definition is absolute, and cannot change
later. So, for example,
message db 'hello, world'
msglen equ $-message
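One possible use of such a length constant, sketched as a DOS .COM program: the assembler-computed `msglen` is passed as the byte count to DOS function 40h (write to file handle). The choice of this DOS call is an assumption made for the example.

```nasm
; Sketch: msglen is computed at assembly time and used as a byte count.
        org 0x100

start:
        mov ah, 40h             ; DOS: write to file handle
        mov bx, 1               ; handle 1 = standard output
        mov cx, msglen          ; byte count, fixed when the file is assembled
        mov dx, message         ; DS:DX = address of the text
        int 21h
        int 20h                 ; terminate the .COM program

message db 'hello, world'
msglen  equ $-message           ; $ is the address just past the string
```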
Repeating Instructions or Data
The TIMES prefix causes the instruction to be
assembled multiple times. This is partly
present as NASM's equivalent of the DUP
syntax supported by MASM-compatible
assemblers, in that you can code
zerobuf: times 64 db 0
times 100 movsb ; trivial unrolled loops
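The TIMES count can also be an expression, which DUP-style syntax does not offer as directly. A small sketch, assuming `-f bin` output; `buffer` and `buflen` are illustrative names.

```nasm
; Sketch: TIMES with a computed count pads a buffer to a fixed size.
        org 0x100

start:
        int 20h                      ; terminate the .COM program

buffer: db 'hello, world'
        times 64-($-buffer) db ' '   ; pad with spaces up to 64 bytes in total
buflen  equ $-buffer                 ; evaluates to 64
```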
Effective Addresses
An effective address is any operand to an
instruction which references memory.
Effective addresses, in NASM, have a very
simple syntax: they consist of an expression
evaluating to the desired address, enclosed
in square brackets. For example:
wordvar dw 123
mov ax,[wordvar]
mov ax,[wordvar+1]
mov ax,[es:wordvar+bx]
Numeric Constants
A numeric constant is simply a number.
NASM allows you to specify numbers in a
variety of number bases, in a variety of ways:
you can suffix H, Q or O, and B for hex, octal
and binary, or prefix '0x' or '$' for hex, in the
style of C and Pascal respectively.
Note: a hex number prefixed with a '$' sign must
have a digit after the '$' rather than a letter.
Ex. Numeric Constants
mov ax,100        ; decimal
mov ax,0a2h       ; hex
mov ax,$0a2       ; hex again: the 0 is required
mov ax,0xa2       ; hex yet again
mov ax,777q       ; octal
mov ax,777o       ; octal again
mov ax,10010011b  ; binary
Echo entered string
org 0x100
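start: jmp load
buf: db 11          ;maximum input length
resb 12             ;reserve 12 bytes
_ent: db 10, 13, '$'
load:
mov ah,0ah          ;DOS buffered keyboard input
mov dx,buf
int 21h
mov ah,$09          ;print lf,cr
mov dx,_ent
int 21h
mov al,[buf+1]      ;number of characters entered
mov ah,0x00
mov bx,buf+2
add bx,ax           ;bx = first byte after the entered text
mov byte [bx],'$'   ;terminate the string for function 09h
mov ah,09h          ;print the entered string
mov dx,buf+2
int 21h
int 20h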
How to NASM…
nasm -f bin program.asm -o program.com
nasm -f bin driver.asm -odriver.sys
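For trying out the first command, a minimal `program.asm` could look like the sketch below. It assumes the resulting `program.com` is run under DOS or a DOS emulator; the label `msg` is an illustrative name.

```nasm
; Sketch: smallest useful .COM source for "nasm -f bin program.asm -o program.com"
        org 0x100               ; .COM files are loaded at offset 100h

start:
        mov ah, 09h             ; DOS: print '$'-terminated string
        mov dx, msg             ; DS:DX = address of the string
        int 21h
        int 20h                 ; terminate the .COM program

msg:    db 'hello world', 13, 10, '$'
```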
Q&A
That’s it for now.