Appendix E
The Intel IA-32 Architecture
Appendix Outline
• Memory organization and register structure
• Addressing modes and types of instructions
• Input/output capability
• Scalar floating-point operations
• Multimedia operations
• Vector floating-point operations
Memory Organization
• Byte-addressable, 32-bit address space
• Little-endian addressing scheme
• Double-word (32-bit) and byte (8-bit) data
are most common types
• Also have quadword (64-bit) and word (16-bit)
• Multiple-byte operands may start at any
byte address location; alignment not required
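The little-endian and unaligned-access points above can be illustrated with a minimal C sketch. The helper names store_le32 and load_le32 are hypothetical and are not part of the appendix; they place the least significant byte at the lowest address, which is the byte order IA-32 uses for a doubleword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit doubleword at p in little-endian order:
   the least significant byte goes to the lowest address. */
void store_le32(uint8_t *p, uint32_t value)
{
    assert(p != NULL);
    p[0] = (uint8_t)(value & 0xFFu);
    p[1] = (uint8_t)((value >> 8) & 0xFFu);
    p[2] = (uint8_t)((value >> 16) & 0xFFu);
    p[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Load a doubleword from any byte address; no alignment is assumed. */
uint32_t load_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Storing 0x12345678 at an odd address such as mem + 1 puts 0x78 in mem[1] and 0x12 in mem[4], which mirrors the "may start at any byte address" bullet.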
Register Structure
• Eight 32-bit regs., some of which are special:
EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI
• EAX to EDX also have 8- and 16-bit subparts,
e.g., AX (lowest 16), AL (lowest 8), AH (next 8)
• EIP (instruction pointer), EFLAGS (status reg.)
• EFLAGS condition code flags: ZF, SF, CF, OF
• 6 segment registers for segmented memory;
set to 0 when using simple flat memory model
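As a rough illustration of how AX, AH, and AL overlay EAX, the following C sketch extracts the subparts with masks and shifts. The function names are hypothetical, and the value 0x12345678 is only an example.

```c
#include <assert.h>
#include <stdint.h>

/* Read the named subparts of a 32-bit register value. */
uint16_t reg_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }      /* bits 15:0 */
uint8_t  reg_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }  /* bits 15:8 */
uint8_t  reg_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }         /* bits 7:0  */

/* Writing AL replaces bits 7:0 only; the other 24 bits of EAX are kept. */
uint32_t write_al(uint32_t eax, uint8_t al)
{
    return (eax & 0xFFFFFF00u) | (uint32_t)al;
}

/* Small self-check with an arbitrary example value. */
void check_subparts(void)
{
    uint32_t eax = 0x12345678u;
    assert(reg_ax(eax) == 0x5678u);
    assert(reg_ah(eax) == 0x56u);
    assert(reg_al(eax) == 0x78u);
    assert(write_al(eax, 0xFFu) == 0x123456FFu);
}
```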
General Purpose Registers
Register Type  Name                 Register  Subparts
Data           Accumulator          EAX       AH (15:8), AL (7:0)
Data           Base                 EBX       BH (15:8), BL (7:0)
Data           Count                ECX       CH (15:8), CL (7:0)
Data           Data                 EDX       DH (15:8), DL (7:0)
Pointer        Stack Pointer        ESP       SP (15:0)
Pointer        Base Pointer         EBP       BP (15:0)
Index          Source Index         ESI       SI (15:0)
Index          Destination Index    EDI       DI (15:0)
Instruction    Instruction Pointer  EIP       IP (15:0)
EFLAGS         Status Register      EFLAGS    FLAGS (15:0)
Addressing Modes
• Immediate: signed 8/32-bit number
• Direct: 32-bit memory address
• Register: one of the 8 general-purpose regs.
• Register indirect: 32-bit address in a register
• Base with displacement: [reg] + 8/32-bit value
• Index with displacement:
32-bit value + (1, 2, 4, or 8) * [reg]
• Base with index: [reg1] + (1, 2, 4, or 8) * [reg2]
• Base with index and displacement:
8/32-bit value + [reg1] + (1, 2, 4, or 8) * [reg2]
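The memory addressing modes above are all special cases of one sum: displacement + base + scale * index. A minimal C sketch of that calculation follows; effective_address is a hypothetical helper, and a missing component is modeled by passing 0.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address = base + index * scale + displacement,
   computed modulo 2^32 as in a 32-bit address space. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    /* Only the scale factors listed in the addressing modes are allowed. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint32_t)disp;
}
```

For example, with base 0x1000, index 3, scale 4, and displacement 200, the result is 0x10D4, the address an operand of the form [reg1 + reg2 * 4 + 200] would name.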
Instructions
• Zero, one, or two operands
• Two-operand format: OP dst, src
• Two memory operands not permitted
• Move instructions to show addressing modes:
MOV EAX, 25
MOV EAX, DWORD PTR LOCATION
MOV EBX, OFFSET LOCATION
MOV AL, [EBP + 10]
MOV EAX, [EBP + ESI * 4 + 200]
Machine Instruction Format
• 1 to 12 bytes in length, with up to 4 fields
• OP-code field uses 1 or 2 bytes (most use 1)
• Addressing mode info in next 1 or 2 bytes
• Next 1 to 4 bytes for any displacement
• Final 1 to 4 bytes for any immediate value
• Simplest instructions need only one byte, e.g.,
INC EDI
• OP-code byte has 3 bits to name a register
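To make the one-byte case concrete, here is a small C sketch of how INC with a 32-bit register operand is encoded in 32-bit mode: the OP-code byte is 0x40 with the register number in its low 3 bits. The enum and function names are hypothetical; the register numbers follow the standard IA-32 encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Standard 3-bit register numbers used in IA-32 instruction encodings. */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3,
             ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

/* One-byte encoding of INC r32 (32-bit mode): 0x40 + register number. */
uint8_t encode_inc_r32(enum reg32 reg)
{
    assert((unsigned)reg <= 7u);   /* must fit in the 3-bit register field */
    return (uint8_t)(0x40u | (unsigned)reg);
}
```

Under these assumptions, INC EDI assembles to the single byte 0x47.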
Assembly-Language Notation
• Register operand usually indicates data size
• Keywords DWORD PTR or BYTE PTR necessary
(a) to indicate desired data size if no register
(b) to specify treatment of a label as a pointer
• Keyword OFFSET used to specify treatment of
a label as an immediate value
• For consistency with Intel documentation,
upper-case characters used for instructions
Move Instruction
• Transfers data: memory or I/O ↔ registers
• Condition code flags not affected by transfer
• Previous examples transferred mem. → reg.
• Can transfer reg. → mem. and reg. → reg.
MOV LOCATION, ECX
MOV EBP, EDI
• Can also transfer immediate value → mem.
MOV DWORD PTR [EAX + 16], 100
Arithmetic Instructions
• Operands: in memory, in register, immediate
• ADD, ADC (with carry), SUB, SBB (with borrow)
• CMP (subtract without modifying destination)
• Condition codes affected by result
• Single-operand operations: INC, DEC, NEG
• Carry flag not affected by INC and DEC
• Must specify data size if sole operand is a label
Arithmetic Instructions
• Signed 32-bit multiplication with IMUL
• In one version, EAX implicitly is multiplicand,
and source operand is multiplier:
IMUL src
• 64-bit product is placed in {EDX (hi), EAX (lo)}
• Another version puts 32-bit result in register:
IMUL REG, src
• Only carry/overflow flags affected by result
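The one-operand form can be modeled in C as a 64-bit signed multiply whose result is split into a high half (EDX) and a low half (EAX). This is only a sketch; the struct and function names are hypothetical. The carry_overflow field models the carry and overflow flags, which are set when the product does not fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

struct imul_result {
    uint32_t edx;        /* high 32 bits of the product */
    uint32_t eax;        /* low 32 bits of the product  */
    int carry_overflow;  /* 1 if the product needs more than 32 bits */
};

/* Model of IMUL src: EAX is the implicit multiplicand. */
struct imul_result imul_one_operand(int32_t eax, int32_t src)
{
    struct imul_result r;
    int64_t product = (int64_t)eax * (int64_t)src;
    r.eax = (uint32_t)((uint64_t)product & 0xFFFFFFFFu);
    r.edx = (uint32_t)((uint64_t)product >> 32);
    r.carry_overflow = (product < INT32_MIN || product > INT32_MAX);
    return r;
}

/* Small self-check with example operands. */
void check_imul(void)
{
    struct imul_result r = imul_one_operand(-2, 3);   /* product is -6 */
    assert(r.eax == 0xFFFFFFFAu && r.edx == 0xFFFFFFFFu);
    assert(r.carry_overflow == 0);
}
```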
Arithmetic Instructions
• Signed division with IDIV instruction:
IDIV src
• 64-bit dividend in {EDX,EAX} divided by source
• Quotient placed in EAX, remainder in EDX
• Condition code flags are undefined
• Use CDQ (convert doubleword to quadword)
to sign-extend 32-bit value in EAX to fill EDX
before executing IDIV
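A minimal C sketch of the CDQ plus IDIV sequence follows. It relies on C99 integer division, which truncates toward zero and gives the remainder the sign of the dividend, the same convention IDIV uses. The names are hypothetical, and the two cases that would raise a divide error on the processor are simply rejected here.

```c
#include <assert.h>
#include <stdint.h>

struct idiv_result {
    int32_t eax;   /* quotient  */
    int32_t edx;   /* remainder */
};

/* CDQ: fill EDX with copies of the sign bit of EAX. */
uint32_t cdq(int32_t eax)
{
    return eax < 0 ? 0xFFFFFFFFu : 0u;
}

/* Model of CDQ followed by IDIV src for a 32-bit dividend in EAX. */
struct idiv_result cdq_idiv(int32_t eax, int32_t src)
{
    struct idiv_result r;
    /* The processor raises a divide error in these cases; the sketch rejects them. */
    assert(src != 0);
    assert(!(eax == INT32_MIN && src == -1));
    r.eax = eax / src;   /* truncates toward zero */
    r.edx = eax % src;   /* takes the sign of the dividend */
    return r;
}
```

Dividing -7 by 2 this way gives quotient -3 and remainder -1.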
Jump and Loop Instructions
• All branches, whether conditional or not,
are called Jumps in IA-32 architecture
• There is also a special Loop instruction
• Conditional Jumps test condition code flags
with signed or unsigned interpretations
• Instructions for ‘greater-than’ condition:
signed Jump-greater (JG) tests ZF, SF, and OF
unsigned Jump-above (JA) tests CF and ZF only
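The difference between the signed and unsigned tests can be sketched in C by computing the flags that CMP would set and then applying the two conditions: JG is taken when ZF = 0 and SF = OF, while JA is taken when CF = 0 and ZF = 0. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct flags { int zf, sf, cf, of; };

/* Flags produced by CMP dst, src (computes dst - src, discards the result). */
struct flags cmp32(uint32_t dst, uint32_t src)
{
    struct flags f;
    uint32_t diff = dst - src;
    f.zf = (diff == 0);
    f.sf = (int)(diff >> 31);                           /* sign bit of result */
    f.cf = (dst < src);                                 /* unsigned borrow    */
    f.of = (int)(((dst ^ src) & (dst ^ diff)) >> 31);   /* signed overflow    */
    return f;
}

int jg_taken(struct flags f) { return f.zf == 0 && f.sf == f.of; }  /* signed >   */
int ja_taken(struct flags f) { return f.cf == 0 && f.zf == 0; }     /* unsigned > */

/* The bit pattern 0xFFFFFFFF is -1 when signed but the largest unsigned value. */
void check_jumps(void)
{
    struct flags f = cmp32(0xFFFFFFFFu, 1u);
    assert(jg_taken(f) == 0);
    assert(ja_taken(f) == 1);
}
```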
Jump and Loop Instructions
• Unconditional Jump (JMP); when used with
Index addressing, enables use of jump tables
• Loop instruction combines register decrement
with nonzero test to branch; uses ECX
• Example:
MOV ECX, NUM_PASSES
START:
...
LOOP START
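In C terms, the example behaves roughly like a do-while loop that decrements a counter and repeats while it is nonzero. The sketch below counts the passes; run_loop is a hypothetical name, and the body is reduced to a counter increment.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: MOV ECX, NUM_PASSES / START: ... / LOOP START */
uint32_t run_loop(uint32_t num_passes)
{
    uint32_t ecx = num_passes;   /* MOV ECX, NUM_PASSES */
    uint32_t passes = 0;
    /* Starting with ECX = 0 would wrap around and run 2^32 passes. */
    assert(num_passes != 0);
    do {                         /* START: */
        passes++;                /* ... loop body ... */
    } while (--ecx != 0);        /* LOOP START: decrement ECX, branch if nonzero */
    return passes;
}
```

Because the test comes after the body, the body always runs at least once.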
Logic, Shift, and Rotate Instructions
• 2-operand AND, OR, XOR; 1-operand NOT
• Amount for shift or rotate given in 8 bits
• Specified as immediate value or in reg. CL
• Logical shift: SHL, SHR
• Arithmetic shift: SAL, SAR (SAL same as SHL)
• Rotate without carry flag: ROL, ROR
• Rotate with carry flag: RCL, RCR
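The logical, arithmetic, and rotate variants differ only in what enters the vacated bit positions. A portable C sketch for 32-bit operands follows; the function names are hypothetical. It assumes the count is reduced to its low 5 bits, as the processor does for 32-bit operands, and it leaves out the carry-flag variants RCL and RCR.

```c
#include <assert.h>
#include <stdint.h>

/* SHR: zeros enter from the left. */
uint32_t shr32(uint32_t x, unsigned count) { return x >> (count & 31u); }

/* SHL / SAL: zeros enter from the right. */
uint32_t shl32(uint32_t x, unsigned count) { return x << (count & 31u); }

/* SAR: copies of the sign bit enter from the left. */
uint32_t sar32(uint32_t x, unsigned count)
{
    unsigned n = count & 31u;
    uint32_t shifted = x >> n;
    if ((x & 0x80000000u) != 0 && n != 0)
        shifted |= (uint32_t)~(0xFFFFFFFFu >> n);   /* fill with sign bits */
    return shifted;
}

/* ROL: bits shifted out on the left re-enter on the right. */
uint32_t rol32(uint32_t x, unsigned count)
{
    unsigned n = count & 31u;
    return n == 0 ? x : (x << n) | (x >> (32u - n));
}

/* Small self-check contrasting logical and arithmetic right shifts. */
void check_shifts(void)
{
    assert(shr32(0x80000000u, 4) == 0x08000000u);
    assert(sar32(0x80000000u, 4) == 0xF8000000u);
    assert(rol32(0x80000001u, 1) == 0x00000003u);
}
```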
Subroutine Linkage Instructions
• ESP is pointer to stack; growth is downward
• All entries on stack must be doublewords
• PUSH / POP instructions for single entries
• PUSHAD / POPAD save/restore all registers
(saved ESP value discarded when restoring)
• CALL instruction saves return address on stack
• RET instruction pops return address to EIP
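The stack discipline above can be sketched with a tiny C model: PUSH decrements ESP by 4 and then stores, POP loads and then increments ESP by 4, CALL pushes the return address, and RET pops it into EIP. The struct, the 64-byte stack size, and the function names are all hypothetical choices for illustration; PUSHAD and POPAD are not modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u   /* arbitrary size for the sketch */

struct machine {
    uint32_t stack[STACK_BYTES / 4u];
    uint32_t esp;   /* byte offset of the top entry; stack grows downward */
    uint32_t eip;
};

void machine_init(struct machine *m)
{
    memset(m->stack, 0, sizeof m->stack);
    m->esp = STACK_BYTES;   /* empty stack: ESP just past the highest entry */
    m->eip = 0;
}

void push(struct machine *m, uint32_t value)
{
    assert(m->esp >= 4u);               /* room for one more doubleword */
    m->esp -= 4u;
    m->stack[m->esp / 4u] = value;
}

uint32_t pop(struct machine *m)
{
    uint32_t value;
    assert(m->esp < STACK_BYTES);       /* stack must not be empty */
    value = m->stack[m->esp / 4u];
    m->esp += 4u;
    return value;
}

/* CALL: save the return address on the stack, then jump to the target. */
void call(struct machine *m, uint32_t target, uint32_t return_address)
{
    push(m, return_address);
    m->eip = target;
}

/* RET: pop the return address into EIP. */
void ret(struct machine *m)
{
    m->eip = pop(m);
}
```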
Assembler Directives
• Popular form of IA-32 assembly language uses
directives for Microsoft MASM assembler
• .CODE or .DATA define start of segment
• DD directive defines storage for doublewords
• DB directive defines storage for bytes
• EQU directive defines symbolic constant
Example Programs
• Vector dot product performs multiplication
and addition operations for array elements
• String search uses nested loops to
match given pattern in target string
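The dot product program itself is not reproduced on these slides. As a rough C sketch of the computation it performs, one multiply and one add per pair of elements, consider the following; the function name and the use of 32-bit integers are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of a[i] * b[i] over n elements. */
int32_t dot_product(const int32_t *a, const int32_t *b, size_t n)
{
    int32_t sum = 0;
    size_t i;
    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++)
        sum += a[i] * b[i];   /* multiply, then accumulate */
    return sum;
}
```

In an assembly-language version, the loop body would map naturally onto an IMUL followed by an ADD, with a counter register controlling the loop.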
String Search
#include <stddef.h>  /* for NULL */

char * my_strstr(char *p_pattern, int len_pat, char *p_target, int len_tar)
{
    int test_cnt = len_tar-len_pat;
    int match_cnt = len_pat;
    char *p_found=NULL;
    char *pp=p_pattern,*pt=p_target;
    if(test_cnt<0) return NULL;      /* pattern longer than target */
    test_cnt=len_tar;                /* examine every target character */
    do{
        if(*pp==*pt){
            pp++,pt++,--match_cnt;
            if(match_cnt == 0) {test_cnt =0,p_found=pt-len_pat;}  /* pt is one past the match */
        }else{
            pp=p_pattern,pt++,match_cnt=len_pat;  /* restart the pattern */
        }

        test_cnt--;
    }while(test_cnt >0);
    return p_found;
}
String Search
E:\00_ausr\work\cwork>cl my_strstr.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.30729.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
my_strstr.c
Microsoft (R) Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.
/out:my_strstr.exe
my_strstr.obj
E:\00_ausr\work\cwork>my_strstr.exe hot thelonghotday
pattern = hot, target = thelonghotday, pfound =hotday
String Search
PUBLIC _my_strstr
; Function compile flags: /Odtp
_TEXT SEGMENT
_match_cnt$ = -20 ; size = 4
_p_found$ = -16 ; size = 4
_pp$ = -12 ; size = 4
_pt$ = -8 ; size = 4
_test_cnt$ = -4 ; size = 4
_p_pattern$ = 8 ; size = 4
_len_pat$ = 12 ; size = 4
_p_target$ = 16 ; size = 4
_len_tar$ = 20 ; size = 4
_my_strstr PROC
; File e:\00_ausr\work\cwork\my_strstr.c
; Line 4
push ebp
mov ebp, esp
sub esp, 20 ; 00000014H
String Search
; Line 5
mov eax, DWORD PTR _len_tar$[ebp]
sub eax, DWORD PTR _len_pat$[ebp]
mov DWORD PTR _test_cnt$[ebp], eax
; Line 6
mov ecx, DWORD PTR _len_pat$[ebp]
mov DWORD PTR _match_cnt$[ebp], ecx
; Line 7
mov DWORD PTR _p_found$[ebp], 0
; Line 8
mov edx, DWORD PTR _p_pattern$[ebp]
mov DWORD PTR _pp$[ebp], edx
mov eax, DWORD PTR _p_target$[ebp]
mov DWORD PTR _pt$[ebp], eax
String Search
; Line 9
cmp DWORD PTR _test_cnt$[ebp], 0
jge SHORT $LN7@my_strstr
xor eax, eax
jmp SHORT $LN8@my_strstr
$LN7@my_strstr:
; Line 10
mov ecx, DWORD PTR _len_tar$[ebp]
mov DWORD PTR _test_cnt$[ebp], ecx
$LN6@my_strstr:
; Line 12
mov edx, DWORD PTR _pp$[ebp]
movsx eax, BYTE PTR [edx]
mov ecx, DWORD PTR _pt$[ebp]
movsx edx, BYTE PTR [ecx]
cmp eax, edx
jne SHORT $LN3@my_strstr
String Search
; Line 13
mov eax, DWORD PTR _pp$[ebp]
add eax, 1
mov DWORD PTR _pp$[ebp], eax
mov ecx, DWORD PTR _pt$[ebp]
add ecx, 1
mov DWORD PTR _pt$[ebp], ecx
mov edx, DWORD PTR _match_cnt$[ebp]
sub edx, 1
mov DWORD PTR _match_cnt$[ebp], edx
; Line 14
jne SHORT $LN2@my_strstr
mov DWORD PTR _test_cnt$[ebp], 0
mov eax, DWORD PTR _pt$[ebp]
sub eax, DWORD PTR _len_pat$[ebp]
mov DWORD PTR _p_found$[ebp], eax
$LN2@my_strstr:
String Search
; Line 15
jmp SHORT $LN1@my_strstr
$LN3@my_strstr:
; Line 16
mov ecx, DWORD PTR _p_pattern$[ebp]
mov DWORD PTR _pp$[ebp], ecx
mov edx, DWORD PTR _pt$[ebp]
add edx, 1
mov DWORD PTR _pt$[ebp], edx
mov eax, DWORD PTR _len_pat$[ebp]
mov DWORD PTR _match_cnt$[ebp], eax
$LN1@my_strstr:
; Line 19
mov ecx, DWORD PTR _test_cnt$[ebp]
sub ecx, 1
mov DWORD PTR _test_cnt$[ebp], ecx
String Search
; Line 20
cmp DWORD PTR _test_cnt$[ebp], 0
jg SHORT $LN6@my_strstr
; Line 21
mov eax, DWORD PTR _p_found$[ebp]
$LN8@my_strstr:
; Line 22
mov esp, ebp
pop ebp
ret 0
_my_strstr ENDP
_TEXT ENDS
Scalar Floating-Point Operations
• FPU has eight registers, each 80 bits in size
• Treated as a stack; top ST(0) to bottom ST(7)
• Data transfer and arithmetic operations using
FP registers also do push and pop operations
• Pointer to stack wraps around as needed
• FLD pushes memory FP operand to new ST(0)
• FST writes ST(0) to memory; FSTP also pops
• FILD/FIST/FISTP for integers with conversion
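The stack behavior of the FP registers can be sketched as a circular buffer of eight entries with a top-of-stack index that wraps around. In this C model, FLD decrements the index before writing and FSTP reads before incrementing it. The names are hypothetical, double stands in for the 80-bit register format, and the checks the hardware performs for stack overflow and underflow are not modeled.

```c
#include <assert.h>

struct fpu {
    double reg[8];   /* eight physical registers */
    unsigned top;    /* index of the register currently named ST(0) */
};

void fpu_init(struct fpu *f)
{
    unsigned i;
    for (i = 0; i < 8u; i++)
        f->reg[i] = 0.0;
    f->top = 0;
}

/* Pointer to ST(i), counted from the current top with wraparound. */
double *st(struct fpu *f, unsigned i)
{
    assert(i < 8u);
    return &f->reg[(f->top + i) & 7u];
}

/* FLD: push a value; it becomes the new ST(0). */
void fld(struct fpu *f, double value)
{
    f->top = (f->top + 7u) & 7u;   /* decrement modulo 8 */
    f->reg[f->top] = value;
}

/* FSTP: return ST(0) and pop it off the stack. */
double fstp(struct fpu *f)
{
    double value = f->reg[f->top];
    f->top = (f->top + 1u) & 7u;   /* increment modulo 8 */
    return value;
}
```

After pushing 1.0 and then 2.0, ST(0) holds 2.0 and ST(1) holds 1.0; a single FSTP returns 2.0 and makes 1.0 the new ST(0).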
Scalar Floating-Point Operations
• FP operands can be single- or double-precision
• Use DWORD PTR or QWORD PTR to specify
when performing data transfers to/from mem.
• FADD, FSUB, FMUL, FDIV have 1 or 2 operands
• Single source operand is in memory, and
destination is implicitly ST(0)
• Two operands must be registers,
with ST(0) as either source or destination