Transcript: Assembler

x86, Assembler
TASM, MASM, NASM
Available assemblers
 MASM: Microsoft Macro Assembler
 TASM: Borland Turbo Assembler
 NASM: Netwide Assembler, free under the Library General Public License (LGPL)
 etc.: Flat Assembler, SpAssembler
MASM: Microsoft Macro Assembler
 MASM contains a macro language with
looping, arithmetic, text string processing,
and so on.
 MASM supports the instruction sets of the
386, 486, and Pentium processors, providing
you with greater direct control over the
hardware. You also can avoid extra time and
memory overhead when using MASM.
 http://msdn.microsoft.com/library/en-us/vcmasm/html/vcoriMicrosoftAssemblerMacroLanguage.asp
TASM: Turbo Assembler
 TASM, Inprise's Borland Turbo Assembler,
supports an alternative to MASM emulation.
This is known as Ideal mode and provides
several advantages over MASM.
 The key (questionable) disadvantage, of
course, is that MASM style assemblers
cannot assemble Ideal mode programs.
NASM: Netwide Assembler
 NASM is designed for portability and
modularity. It supports a range of object file
formats including Linux, Microsoft 16-bit OBJ
and Win32. Its syntax is designed to be
simple and easy to understand, similar to
Intel's but less complex.
 It supports Pentium, P6, MMX, 3DNow! and
SSE opcodes, and has macro capability. It
includes a disassembler as well.
 NASM is free software, released under the
Library General Public License (LGPL).
 http://nasm.sourceforge.net
FASM: Flat Assembler
 Currently it supports all 8086-80486/Pentium
instructions with MMX, SSE, SSE2, SSE3 and
3DNow! extensions, can produce output in
binary, MZ, PE, COFF or ELF format.
 It includes powerful but easy-to-use
macroinstruction support and makes multiple
passes to optimize the instruction codes for size.
 The flat assembler can assemble itself, and the
full source code is included.
 http://flatassembler.net/
About developing assembly language
 CPU’s language (instructions)
    X86 instruction set
 About Compiler
    Directives
    MASM, TASM, NASM
TASM
Important files
 Compiler
    TASM      16 bits, real mode
    TASM32    32 bits, protected mode
 Linker
    TLINK
Pseudo instructions
 Segment, ends: To define a segment.
 Assume: To specify which segment register
is used for each segment defined by
“Segment, ends”.
 Data allocation
Segment Declaration
 Usage
    Segment_name  segment
    …
    Segment_name  ends
 Ex.
    Cseg  segment
    …
    Cseg  ends
Label declaration
 Usage
    A label name is followed by a colon “:”
 Ex.
Start: …
mov bx, offset start
…
jmp Start
Data allocate
 Define value
    DB    Define Byte
    DW    Define Word
    DD    Define Doubleword
    DQ    Define Quadword
    DT    Define Ten Bytes
 Usage
    Var_name  Dx  data
Ex. Data allocation
dseg  segment
Msg   db "hello world$"
MulH  dw 0, 1, 2, 3
MulF  dd 1234h
dseg  ends
Data duplication
 Usage
    type count dup (value)
 Ex.
    data1  db  10 dup (0)
    data2  db  2 dup (3 dup (0))
    data3  db  3 dup (1, 2, 3 dup (4))
    data4  db  4 dup (?)
Structure
Struc PosType
    Row  dw ?
    Col  dw ?
Ends PosType
Union PosValType
    Pos  PosType ?
    Val  dd ?
Ends PosValType
Point  PosValType ?
Structure
mov [Point.Pos.Row], bx ; OK: Move BX to Row component of Point
mov [Point.Pos.Row], bl ; Error: mismatched operands
Data reference
 The offset directive retrieves the offset (address) of a data item
    mov bx, offset msg1    ;bx = offset/addr of msg1
 To retrieve / store a data item
    mov dx, msg1           ;dx = [msg1]
    mov [msg1], dx         ;[msg1] = dx
    mov [bx+2], dx         ;[bx+2] = dx
Memory contents
ByteVal db ? ;"ByteVal" is name of byte variable
mov ax, bx
;OK: Move value of BX to AX
mov ax, [bx]
;OK: Move word at address BX to AX. Size of
;destination is used to generate proper object code
mov ax,[word bx]
;OK: Same as above with unnecessary size qualifier
mov ax,[word ptr bx]
;OK: Same as above with unnecessary size qualifier
;and redundant pointer prefix
mov al, [bx]
;OK: Move byte at address BX to AL. Size of
;destination is used to generate proper object code
mov [bx], al ; OK: Move AL to location BX
Memory contents
mov ByteVal, al
;Warning: "ByteVal" needs brackets
mov [ByteVal], al
;OK: Move AL to memory location named "ByteVal"
mov [ByteVal], ax
;Error: unmatched operands
mov al, [bx+2]
;OK: Move byte from memory location BX+2 to AL
mov al, bx[2]
; Error: indexes must occur with "+" as above
mov bx, Offset ByteVal
;OK: Offset statement does not use brackets
mov bx, Offset [ByteVal]
; Error: offset cannot be taken of the contents of
;memory
Memory contents
lea bx, [ByteVal]
;OK: Load effective address of "ByteVal"
lea bx, ByteVal
;Error: brackets required
mov ax, 01234h
;OK: Move constant word to AX
mov [bx], 012h
;Warning: size qualifier needed to determine
;whether to populate byte or word
mov [byte bx], 012h
;OK: constant 012h is moved to byte at address BX
mov [word bx], 012h
;OK: constant 012h is moved to word at address BX
Echo entered string
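cseg    segment
        assume cs:cseg, ds:cseg
        org 100h
start:  jmp load
Buf     db 11, 12 dup (' ')     ;max length, then count byte and text
_ent    db 10,13,'$'            ;lf,cr
load:   mov ah,0ah              ;DOS buffered keyboard input
        mov dx,offset buf
        int 21h

        mov ah,09h              ;DOS print string: move to a new line
        mov dx,load             ;has no effect: dx is reloaded on the next line
        mov dx,offset _ent
        int 21h

        mov al,[buf+1]          ;al = number of characters typed
        mov ah,00h
        mov bx,offset buf+2
        add bx,ax               ;bx = address just past the typed text
        mov byte ptr [bx],'$'   ;terminate the text with '$'
        mov ah,09h              ;DOS print string: echo the text
        mov dx,offset buf+2
        int 21h
        int 20h                 ;terminate the program
cseg    ends
        end start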
Compiling a program
 Syntax:
    TASM [options] source [,object] [,listing] [,xref]
    /z             Display source line with error message
    /zi,/zd,/zn    Debug info: zi=full, zd=line numbers only, zn=none
 Ex
    TASM /zi hello.asm
Creating an executable file
 TLINK objfiles, exefile, mapfile, libfiles, deffile, resfiles
    /v      Full symbolic debug information
    /t      Create COM file (same as /Tdc)
    /Txx    Specify output file type
        /Tdx  DOS image (default); x can be e=EXE or c=COM
        /Twx  Windows image; x can be e=EXE or d=DLL
 Ex
 Tlink /v /t hello;
NASM
NASM vs. MASM & TASM
 NASM is case sensitive.
 NASM requires square brackets for memory references
    No ‘offset’ is needed: a bare name is just a value,
    whether it is an ‘equ’ or an address
        mov ax, data        ; same as: mov ax, offset data
    Use square brackets to retrieve the content
        mov ax, [data]      ; loads the word stored at data
    Everything is treated as a label, not as a var,
    an equ or anything else
NASM vs. MASM & TASM
 Does not support hybrid syntaxes, such as
    mov ax, table [bx]  ->  mov ax, [table + bx]
 Likewise
    mov ax, es:[di]     ->  mov ax, [es:di]
NASM Doesn't Store Variable Types
 NASM, by design, chooses not to remember
the types of variables you declare. Whereas
MASM will remember, on seeing `var dw 0',
that you declared `var' as a word-size
variable, and will then be able to fill in the
ambiguity in the size of the instruction
‘mov var,2’, NASM will deliberately remember
nothing about the symbol ‘var’ except where it
begins, and so you must explicitly code
‘mov word [var],2’.
NASM Doesn't Store Variable Types
 For this reason, NASM doesn't support the
`LODS', `MOVS', `STOS', `SCAS', `CMPS',
`INS', or `OUTS' instructions, but only
supports the forms such as `LODSB',
`MOVSW', and `SCASD', which explicitly
specify the size of the components of the
strings being manipulated.
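
As a minimal sketch of what this means in practice, the fragment below gives the operand size explicitly wherever NASM cannot infer it from a register. It assumes a DOS .COM program assembled with -f bin; the names var and text are made up for illustration.

```nasm
        org 0x100               ; DOS .COM program (assumption), assemble with -f bin

start:  mov word [var], 2       ; size must be stated: NASM does not remember 'dw'
        mov ax, [var]           ; size is implied by the 16-bit register AX
        inc byte [text]         ; explicit again: changes 'a' to 'b'

        mov si, text            ; DS:SI -> source string
        cld                     ; string operations move forward
        lodsb                   ; sized form of LODS: AL = [DS:SI], SI = SI + 1

        int 0x20                ; terminate the program

var:    dw 0                    ; hypothetical word variable
text:   db 'abc'                ; hypothetical byte string
```

Leaving out the word or byte keyword in the first and third instructions makes NASM report that the operation size is not specified.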
NASM Doesn't `ASSUME'
 As part of NASM's drive for simplicity, it also
does not support the ‘ASSUME’ directive.
 NASM will not keep track of what values you
choose to put in your segment registers, and
will never _automatically_ generate a
segment override prefix.
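
A small sketch of the consequence: when data lives in a segment other than the one DS points to, the override has to be written by hand. The example assumes a DOS .COM program running in colour text mode, where video memory is commonly found at segment B800h; that address is an assumption about the machine, not something NASM knows.

```nasm
        org 0x100               ; DOS .COM program (assumption)

start:  mov ax, 0xb800          ; assumed segment of colour text-mode video memory
        mov es, ax              ; NASM does not track this assignment
        mov di, 0               ; offset of the top-left character cell
        mov byte [es:di], 'A'   ; ES override written out explicitly
        mov al, [di]            ; no override given, so the CPU default (DS) is used
        int 0x20                ; terminate the program
```

The two memory operands use the same offset in DI but address different segments, and only the explicit es: prefix tells them apart.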
NASM Doesn't Support Memory Models
 NASM also does not have any directives to support
different 16-bit memory models. The programmer has
to keep track of which functions are supposed to be
called with a far call and which with a near call, and is
responsible for putting the correct form of ‘RET’
instruction (`RETN' or `RETF'; NASM accepts `RET'
itself as an alternate form for `RETN'); in addition, the
programmer is responsible for coding CALL FAR
instructions where necessary when calling _external_
functions, and must also keep track of which external
variable definitions are far and which are near.
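
The pairing of call form and return form can be sketched inside a single DOS .COM program (an assumption made to keep the example self-contained). Because everything here lives in one segment, the far call is imitated by pushing CS before a near call; the labels near_proc and far_proc are hypothetical.

```nasm
        org 0x100               ; DOS .COM program (assumption)

start:  call near_proc          ; near call: pushes IP only
        push cs                 ; imitate a far call within the same segment:
        call far_proc           ;   CS pushed by hand, then IP pushed by CALL
        int 0x20                ; terminate the program

near_proc:
        retn                    ; pops IP (plain RET assembles the same way)

far_proc:
        retf                    ; pops IP, then CS
```

If far_proc ended in retn instead, the pushed CS would be left on the stack, which is the kind of mismatch the programmer has to prevent without help from the assembler.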
Layout of a NASM Source Line
 Like most assemblers, each NASM source
line contains (unless it is a macro, a
preprocessor directive or an assembler
directive) some combination of the four fields:
    label:    instruction operands    ; comment
Declaring Initialized Data
 DB, DW, DD, DQ and DT are used, much as
in MASM, to declare initialized data in the
output file. They can be invoked in a wide
range of ways:
    db 0x55                 ; just the byte 0x55
    db 0x55,0x56,0x57       ; three bytes in succession
    db 'a',0x55             ; character constants are OK
    db 'hello',13,10,'$'    ; so are string constants
    dw 0x1234               ; 0x34 0x12
    dw 'a'                  ; 0x61 0x00 (it's just a number)
    dw 'ab'                 ; 0x61 0x62 (character constant)
    dw 'abc'                ; 0x61 0x62 0x63 0x00 (string)
    dd 0x12345678           ; 0x78 0x56 0x34 0x12
    dd 1.234567e20          ; floating-point constant
    dq 1.234567e20          ; double-precision float
    dt 1.234567e20          ; extended-precision float
Declaring Uninitialized Data
 RESB, RESW, RESD, RESQ and REST are
designed to be used in the BSS section of a
module: they declare uninitialized storage
space.
 Each takes a single operand, which is the
number of bytes, words, doublewords or
whatever to reserve.
 NASM does not support the MASM/TASM
syntax of reserving uninitialized space by
writing `DW ?' or similar things.
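
A short sketch of these directives in use, assuming a DOS .COM program built with -f bin; the names counter, table, buffer and realnum are invented for the example.

```nasm
        org 0x100               ; DOS .COM program (assumption)

section .text
start:  mov byte [counter], 0   ; reserved space has no defined contents, so set it
        mov word [table], 0x1234
        int 0x20                ; terminate the program

section .bss
counter: resb 1                 ; one byte
table:   resw 8                 ; eight words = 16 bytes
buffer:  resb 64                ; 64 bytes
realnum: resq 1                 ; one quadword, room for a double-precision float
```

With -f bin the .bss section only assigns addresses after the code and adds nothing to the output file, which is why the program stores initial values itself before using the space.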
Defining Constants
 EQU defines a symbol to a given constant
value: when EQU is used, the source line
must contain a label. The action of EQU is to
define the given label name to the value of its
(only) operand.
 This definition is absolute, and cannot change
later. So, for example,
    message  db   'hello, world'
    msglen   equ  $-message
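
One way such a length constant might be used is sketched below. It assumes a DOS .COM program and the DOS "write to file handle" service (INT 21h, AH=40h), neither of which is part of the slide; the point is only that msglen behaves as a plain number.

```nasm
        org 0x100               ; DOS .COM program (assumption)

start:  mov ah, 0x40            ; DOS function 40h: write to file handle
        mov bx, 1               ; handle 1 = standard output
        mov cx, msglen          ; byte count, fixed at assembly time
        mov dx, message         ; DS:DX -> bytes to write
        int 0x21
        int 0x20                ; terminate the program

message db 'hello, world'
msglen  equ $-message           ; 12: current position minus start of message
```

Because the length is computed from the string itself, editing the text changes msglen automatically and no '$' terminator is needed.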
Repeating Instructions or Data
 The TIMES prefix causes the instruction to be
assembled multiple times. This is partly
present as NASM's equivalent of the DUP
syntax supported by MASM-compatible
assemblers, in that you can code
    zerobuf:  times 64 db 0
              times 100 movsb    ; trivial unrolled loops
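
The count given to TIMES does not have to be a literal number; it can be an expression. The sketch below uses that to pad a string to a fixed size. The names buffer and buflen are illustrative only.

```nasm
buffer: db 'hello, world'            ; 12 bytes of text
        times 64-($-buffer) db ' '   ; pad with spaces up to exactly 64 bytes
buflen  equ $-buffer                 ; evaluates to 64
```

Assembled on its own with nasm -f bin, this produces a 64-byte file: the 12 characters of the string followed by 52 spaces.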
Effective Addresses
 An effective address is any operand to an
instruction which references memory.
Effective addresses, in NASM, have a very
simple syntax: they consist of an expression
evaluating to the desired address, enclosed
in square brackets. For example:
    wordvar  dw 123
             mov ax,[wordvar]
             mov ax,[wordvar+1]
             mov ax,[es:wordvar+bx]
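
The expression inside the brackets may also combine registers with a constant. In 16-bit code the registers usable there are BX, BP, SI and DI. The following sketch assumes a DOS .COM program and a made-up three-word table to show the same word being reached in different ways.

```nasm
        org 0x100               ; DOS .COM program (assumption)

start:  mov bx, table           ; base register = address of table
        mov si, 2               ; index register = byte offset of the second word
        mov ax, [bx+si]         ; AX = 0x2222
        mov ax, [table+4]       ; AX = 0x3333, address computed at assembly time
        mov ax, [bx+si+2]       ; AX = 0x3333 again: base + index + displacement
        int 0x20                ; terminate the program

table:  dw 0x1111, 0x2222, 0x3333
```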
Numeric Constants
 A numeric constant is simply a number.
NASM allows you to specify numbers in a
variety of number bases, in a variety of ways:
you can suffix H, Q or O, and B for hex, octal
and binary, or prefix ‘0x’ or ‘$’ for hex in the
style of C and Pascal.
 Note that a hex number prefixed with a ‘$’ sign
must have a digit after the ‘$’ rather than a letter.
Ex. Numeric Constants
    mov ax,100          ; decimal
    mov ax,0a2h         ; hex
    mov ax,$0a2         ; hex again: the 0 is required
    mov ax,0xa2         ; hex yet again
    mov ax,777q         ; octal
    mov ax,777o         ; octal again
    mov ax,10010011b    ; binary
Echo entered string
        org 0x100
start:  jmp load
buf:    db 11
        resb 12                 ;reserve 12 bytes
_ent:   db 10, 13, '$'
load:
        mov ah,0ah
        mov dx,buf
        int 21h

        mov ah,$09
        mov dx,_ent
        int 21h

        mov al,[buf+1]
        mov ah,0x00
        mov bx,buf+2
        add bx,ax
        mov byte [bx],'$'

        mov ah,09h
        mov dx,buf+2
        int 21h
        int 20h
How to NASM…
 nasm -f bin program.asm -o program.com
 nasm -f bin driver.asm -odriver.sys
Q&A
That’s it for now.