Chap. 3 ARM CPU Architecture

Outline
3.1 Registers
3.2 Memory
3.3 Exceptions
3.1 Registers
• Introduction
• ARM Processor Core
• Processor Modes
• Register Organization
• Accessing Registers
• The Program Status Registers (CPSR and SPSRs)
• Condition Flags
• Conditional Execution
Introduction
ARM has 37 registers in total, all of which are 32 bits long:
• 1 dedicated program counter
• 1 dedicated current program status register
• 5 dedicated saved program status registers
• 30 general-purpose registers
Introduction (cont’d)
However, these are arranged into several banks, with the accessible bank being governed by the processor mode. Each mode can access
• a particular set of r0-r12 registers
• a particular r13 (the stack pointer) and r14 (the link register)
• r15 (the program counter)
• cpsr (the current program status register)
and privileged modes (except System mode, which shares the User mode registers) can also access
• a particular spsr (saved program status register)
ARM Processor Core
• Architecture
  - Versions 1 and 2 – Acorn RISC, 26-bit address
  - Version 3 – 32-bit address, CPSR, and SPSR
  - Version 4 – half-word, Thumb
  - Version 5 – BLX, CLZ and BKPT (breakpoint) instructions
• Processor cores
  - ARM7TDMI (Thumb, debug, multiplier, ICE) – version 4T, low-end ARM core, 3-stage pipeline, 50-100 MHz
  - ARM9TDMI – 5-stage pipeline, 130 MHz or 200 MHz
  - ARM10TDMI – version 5, 300 MHz
• CPU Core: co-processor, MMU, AMBA
  - ARM 710, 720, 740
  - ARM 920, 940
Processor Modes
• The ARM has six operating modes:
  - User (unprivileged mode under which most tasks run)
  - FIQ (entered when a high-priority (fast) interrupt is raised)
  - IRQ (entered when a low-priority (normal) interrupt is raised)
  - Supervisor (entered on reset and when a Software Interrupt instruction is executed)
  - Abort (used to handle memory access violations)
  - Undef (used to handle undefined instructions)
• ARM Architecture Version 4 adds a seventh mode:
  - System (privileged mode using the same registers as User mode)
Register Organization
General registers and Program Counter

User32 / System   FIQ32      Supervisor32   Abort32    IRQ32      Undefined32
r0                r0         r0             r0         r0         r0
r1                r1         r1             r1         r1         r1
r2                r2         r2             r2         r2         r2
r3                r3         r3             r3         r3         r3
r4                r4         r4             r4         r4         r4
r5                r5         r5             r5         r5         r5
r6                r6         r6             r6         r6         r6
r7                r7         r7             r7         r7         r7
r8                r8_fiq     r8             r8         r8         r8
r9                r9_fiq     r9             r9         r9         r9
r10               r10_fiq    r10            r10        r10        r10
r11               r11_fiq    r11            r11        r11        r11
r12               r12_fiq    r12            r12        r12        r12
r13 (sp)          r13_fiq    r13_svc        r13_abt    r13_irq    r13_undef
r14 (lr)          r14_fiq    r14_svc        r14_abt    r14_irq    r14_undef
r15 (pc)          r15 (pc)   r15 (pc)       r15 (pc)   r15 (pc)   r15 (pc)

Program Status Registers

User32 / System   FIQ32      Supervisor32   Abort32    IRQ32      Undefined32
cpsr              cpsr       cpsr           cpsr       cpsr       cpsr
                  spsr_fiq   spsr_svc       spsr_abt   spsr_irq   spsr_undef

Ref. [8]
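The banking scheme in the table above can be sketched in a few lines of Python. This is only an illustrative model: the function names `physical_register` and `visible_registers` are made up for this sketch, and the mode suffixes follow the table (fiq, svc, abt, irq, undef). Counting the distinct physical names reproduces the total of 37 registers quoted earlier.

```python
# Registers that each exception mode replaces with its own banked copy.
BANKED = {
    "fiq": range(8, 15),      # r8_fiq .. r14_fiq
    "svc": (13, 14),
    "abt": (13, 14),
    "irq": (13, 14),
    "undef": (13, 14),
}
MODES = ["usr", "fiq", "svc", "abt", "irq", "undef"]


def physical_register(mode, n):
    """Name of the physical register that rN refers to in the given mode."""
    if mode == "sys":         # System mode shares the User mode registers
        mode = "usr"
    if n == 15:               # the program counter is never banked
        return "r15"
    if n in BANKED.get(mode, ()):
        return f"r{n}_{mode}"
    return f"r{n}"


def visible_registers(mode):
    """All registers a mode can access: r0-r15, cpsr, and its spsr if any."""
    regs = [physical_register(mode, n) for n in range(16)] + ["cpsr"]
    if mode not in ("usr", "sys"):
        regs.append(f"spsr_{mode}")
    return regs


all_registers = set()
for mode in MODES:
    all_registers.update(visible_registers(mode))
print(len(all_registers))
```

The set contains 17 names for User/System mode, 8 more for FIQ, and 3 more for each of the other four exception modes.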
Register Example: User to FIQ Mode
Figure: registers in use before and after an exception that switches from User mode to FIQ mode.
• User mode (before the exception): r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr
• FIQ mode (after the exception): r0-r7, r8_fiq-r12_fiq, r13_fiq, r14_fiq, r15 (pc), cpsr, spsr_fiq
• Return address calculated from User mode PC value and stored in FIQ mode LR
• User mode CPSR copied to FIQ mode SPSR
Ref. [8]
Accessing Registers
• The currently accessible registers are not subdivided by instruction type.
  - All instructions can access r0-r14 directly.
  - Most instructions also allow use of the PC.
• Specific instructions allow access to the CPSR and SPSR.
• Note: when in a privileged mode, it is also possible to load / store the (banked out) user mode registers to or from memory.
The Program Status Registers (CPSR and SPSRs)

  bits 31-28   bit 7   bit 6   bit 5   bits 4-0
  N Z C V      I       F       T       Mode

• Condition Code Flags: copies of the ALU status flags (latched if the instruction has the "S" bit set).
  - N = Negative result from ALU
  - Z = Zero result from ALU
  - C = ALU operation Carried out
  - V = ALU operation oVerflowed
• Interrupt Disable bits
  - I = 1 disables the IRQ.
  - F = 1 disables the FIQ.
• T bit: processor in ARM (0) or Thumb (1) state
• Mode bits: processor mode
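As a rough illustration of the status register fields, the following Python sketch unpacks a 32-bit CPSR value. The function name `decode_psr` is invented for this example, and the mode-bit encodings in `MODE_NAMES` are not given on the slide; they are taken from the ARM architecture definition.

```python
# Mode field encodings (bits 4-0); assumed from the ARM architecture, not the slide.
MODE_NAMES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_psr(value):
    """Split a 32-bit program status register value into its named fields."""
    return {
        "N": (value >> 31) & 1,   # negative
        "Z": (value >> 30) & 1,   # zero
        "C": (value >> 29) & 1,   # carry
        "V": (value >> 28) & 1,   # overflow
        "I": (value >> 7) & 1,    # 1 = IRQ disabled
        "F": (value >> 6) & 1,    # 1 = FIQ disabled
        "T": (value >> 5) & 1,    # 0 = ARM state, 1 = Thumb state
        "mode": MODE_NAMES.get(value & 0x1F, "unknown"),
    }


fields = decode_psr(0x600000D3)
print(fields)
```

Here 0x600000D3 decodes to Z and C set, both interrupts disabled, ARM state, Supervisor mode.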
Condition Flags
Flag             Logical Instruction                 Arithmetic Instruction
Negative (N=1)   No meaning                          Bit 31 of the result has been set; indicates a negative number in signed operations
Zero (Z=1)       Result is all zeroes                Result of operation was zero
Carry (C=1)      After shift operation, '1' was      Result was greater than 32 bits
                 left in carry flag
oVerflow (V=1)   No meaning                          Result was greater than 31 bits; indicates a possible corruption of the sign bit in signed numbers
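The arithmetic column of the flag table can be made concrete with a small Python model of a flag-setting 32-bit addition. This is a sketch under the assumption that operands are treated as unsigned 32-bit values; the helper name `adds` is chosen here only to echo the ADDS mnemonic.

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """Model a 32-bit add that updates N, Z, C and V; returns (result, flags)."""
    a &= MASK32
    b &= MASK32
    full = a + b                  # unbounded Python integer
    result = full & MASK32        # what fits in the 32-bit register
    flags = {
        "N": result >> 31,                        # bit 31 of the result
        "Z": int(result == 0),                    # result is all zeroes
        "C": int(full > MASK32),                  # result needed more than 32 bits
        # Signed overflow: both operands have the same sign and the
        # result's sign differs from it.
        "V": (((a ^ result) & (b ^ result)) >> 31) & 1,
    }
    return result, flags


print(adds(0x7FFFFFFF, 1))   # largest positive signed value plus one
print(adds(0xFFFFFFFF, 1))   # unsigned wrap-around to zero
```

The first call sets N and V (the sign bit was corrupted), while the second sets Z and C (the result needed 33 bits).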
The Condition Field of Instruction Set
The condition field (Cond) occupies bits 31-28 of every instruction.

Code   Suffix   Description               Flags
0000   EQ       Equal                     Z=1
0001   NE       Not equal                 Z=0
0010   CS/HS    Unsigned higher or same   C=1
0011   CC/LO    Unsigned lower            C=0
0100   MI       Minus                     N=1
0101   PL       Positive or Zero          N=0
0110   VS       Overflow                  V=1
0111   VC       No overflow               V=0
1000   HI       Unsigned higher           C=1 & Z=0
1001   LS       Unsigned lower or same    C=0 or Z=1
1010   GE       Greater or equal          N=V
1011   LT       Less than                 N!=V
1100   GT       Greater than              Z=0 & N=V
1101   LE       Less than or equal        Z=1 or N!=V
1110   AL       Always                    any
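The condition table translates directly into a lookup of predicates over the four flags. The sketch below is illustrative only; `CONDITIONS` and `condition_passed` are names invented for this example, and the flags are passed in as a plain dictionary.

```python
# One predicate per condition suffix, each taking a dict with keys N, Z, C, V.
CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "CS": lambda f: f["C"] == 1,
    "CC": lambda f: f["C"] == 0,
    "MI": lambda f: f["N"] == 1,
    "PL": lambda f: f["N"] == 0,
    "VS": lambda f: f["V"] == 1,
    "VC": lambda f: f["V"] == 0,
    "HI": lambda f: f["C"] == 1 and f["Z"] == 0,
    "LS": lambda f: f["C"] == 0 or f["Z"] == 1,
    "GE": lambda f: f["N"] == f["V"],
    "LT": lambda f: f["N"] != f["V"],
    "GT": lambda f: f["Z"] == 0 and f["N"] == f["V"],
    "LE": lambda f: f["Z"] == 1 or f["N"] != f["V"],
    "AL": lambda f: True,
}
CONDITIONS["HS"] = CONDITIONS["CS"]   # alternative suffix for the same code
CONDITIONS["LO"] = CONDITIONS["CC"]


def condition_passed(suffix, flags):
    """True if an instruction carrying this condition suffix would execute."""
    return CONDITIONS[suffix](flags)


# Flags left by comparing 3 with 5 (3 - 5 is negative and borrows, so C=0).
flags = {"N": 1, "Z": 0, "C": 0, "V": 0}
print(condition_passed("LT", flags), condition_passed("GE", flags))
```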
Conditional Execution
Most instruction sets only allow branches to be executed conditionally.
However, by reusing the condition evaluation hardware for every instruction, ARM effectively increases the number of instructions.
• All instructions contain a condition field which determines whether the CPU will execute them.
• Non-executed instructions still take 1 cycle.
  - The cycle has to be completed so that the following instructions can be fetched and decoded.
Conditional Execution (cont’d)
This removes the need for many branches, which stall the pipeline (3 cycles to refill).
• Allows very dense in-line code, without branches.
• The time penalty of not executing several conditional instructions is frequently less than the overhead of the branch or subroutine call that would otherwise be needed.
Using and updating the Condition Field
• To execute an instruction conditionally, simply postfix it with the appropriate condition:
  - For example, an add instruction takes the form:
      ADD r0,r1,r2        ; r0 = r1 + r2 (ADDAL)
  - To execute this only if the zero flag is set:
      ADDEQ r0,r1,r2      ; If zero flag set then...
                          ; ... r0 = r1 + r2
Using and updating the Condition Field (cont'd)
• By default, data processing operations do not affect the condition flags (apart from the comparisons, where this is the only effect).
• To cause the condition flags to be updated, the S bit of the instruction needs to be set by postfixing the instruction (and any condition code) with an "S".
  - For example, to add two numbers and set the condition flags:
      ADDS r0,r1,r2       ; r0 = r1 + r2
                          ; ... and set flags
Conditional Execution Example
• Greatest Common Divisor

Normal Assembler
gcd     cmp   r0, r1        ;reached the end?
        beq   stop
        blt   less          ;if r0 > r1
        sub   r0, r0, r1    ;subtract r1 from r0
        bal   gcd
less    sub   r1, r1, r0    ;subtract r0 from r1
        bal   gcd
stop

ARM Conditional Assembler
gcd     cmp   r0, r1        ;if r0 > r1
        subgt r0, r0, r1    ;subtract r1 from r0
        sublt r1, r1, r0    ;else subtract r0 from r1
        bne   gcd           ;reached the end?
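For readers who want to trace the loop by hand, here is a Python rendering of the conditional-execution version of the GCD routine. It is a sketch that assumes both inputs are positive integers; the function name `gcd_conditional` is chosen only for this example.

```python
def gcd_conditional(r0, r1):
    """Subtraction-based GCD mirroring the cmp / subgt / sublt / bne loop."""
    while r0 != r1:        # bne gcd: loop again while the compare found a difference
        if r0 > r1:        # subgt r0, r0, r1
            r0 -= r1
        else:              # sublt r1, r1, r0 (r0 < r1 here, since they differ)
            r1 -= r0
    return r0


print(gcd_conditional(48, 18))
```

Only one of the two subtractions takes effect per iteration, just as only one of `subgt` and `sublt` passes its condition after a single `cmp`.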
The Barrel Shifter
ARM has a barrel shifter which provides a mechanism to carry out shifts as part of other instructions.
Figure: Operand 1 feeds the ALU directly; Operand 2 passes through the barrel shifter before reaching the ALU, which produces the Result.
Operand 2 can be:
• Register, optionally with a shift operation applied. The shift value can be either:
  - a 5-bit unsigned integer, or
  - specified in the bottom byte of another register.
• Immediate value
  - 8-bit number
  - can be rotated right through an even number of positions
  - the assembler will calculate the rotate for you from the constant.
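The immediate-value rule (an 8-bit number rotated right through an even number of positions) can be explored with a short Python sketch of the search an assembler might perform. The names `ror32` and `encode_immediate` are hypothetical, and the sketch returns the first even rotation it finds rather than claiming to match any particular assembler's choice.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bit positions."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def encode_immediate(constant):
    """Find (imm8, rotate) with ror32(imm8, rotate) == constant, or None."""
    constant &= MASK32
    for rotate in range(0, 32, 2):             # only even rotations are allowed
        imm8 = ror32(constant, 32 - rotate)    # rotating left undoes the rotate right
        if imm8 <= 0xFF:
            return imm8, rotate
    return None


print(encode_immediate(0xFF000000))
print(encode_immediate(0x101))
```

A constant such as 0x101 has set bits spread over nine positions, so no rotation fits it into eight bits and the search returns None.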
Second Operand: Shifted Register
• Using a multiplication instruction to multiply by a constant means:
  - loading the constant into a register
  - waiting for a number of internal cycles for the multiplication to complete
• A better solution can often be found by using some combination of MOVs, ADDs, SUBs and RSBs with shifts.
  - Multiplications by a constant equal to ((power of 2) ± 1) can be done in one cycle.
• Example: r0 = r1 * 5
           r0 = r1 + (r1 * 4)
      ADD r0, r1, r1, LSL #2
• Example: r2 = r3 * 105
           r2 = r3 * 15 * 7
           r2 = r3 * (16 - 1) * (8 - 1)
      RSB r2, r3, r3, LSL #4    ; r2 = r3 * 15
      RSB r2, r2, r2, LSL #3    ; r2 = r2 * 7
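The two multiply-by-constant examples can be checked numerically. The Python sketch below models ADD and RSB with a left-shifted second operand on 32-bit values; the helper names `add_lsl` and `rsb_lsl` are invented for this illustration.

```python
MASK32 = 0xFFFFFFFF


def add_lsl(rn, rm, shift):
    """ADD rd, rn, rm, LSL #shift  ->  rn + (rm << shift), truncated to 32 bits."""
    return (rn + (rm << shift)) & MASK32


def rsb_lsl(rn, rm, shift):
    """RSB rd, rn, rm, LSL #shift  ->  (rm << shift) - rn, truncated to 32 bits."""
    return ((rm << shift) - rn) & MASK32


def times5(r1):
    return add_lsl(r1, r1, 2)        # r1 + r1*4


def times105(r3):
    r2 = rsb_lsl(r3, r3, 4)          # r3*16 - r3 = r3*15
    return rsb_lsl(r2, r2, 3)        # r2*8  - r2 = r2*7


print(times5(7), times105(3))
```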
3.2 Memory
Introduction
Memory Organization
Pipeline
Memory Access
ARM Memory Interface
Introduction
• Words are aligned to 4-byte boundaries and half-words to 2-byte boundaries (binary addresses ending in ...00 or ...0).
• ARM can be set up to access data in either little-endian or big-endian format, though it defaults to little-endian.
• The ARM uses a pipeline in order to increase the speed of the flow of instructions to the processor.
  - Allows several operations to be undertaken simultaneously, rather than serially.
  - Rather than pointing to the instruction being executed, the PC points to the instruction being fetched.
Memory Organization
Figure: byte addresses within 32-bit words (bit 31 on the left, bit 0 on the right), with example data items.

(a) Little-endian memory organization

  bit 31          bit 0
  23   22   21   20
  19   18   17   16     word16
  15   14   13   12     half-word14, half-word12
  11   10    9    8     word8
   7    6    5    4     byte6, half-word4
   3    2    1    0     byte3, byte2, byte1, byte0

(b) Big-endian memory organization

  bit 31          bit 0
  20   21   22   23
  16   17   18   19     word16
  12   13   14   15     half-word12, half-word14
   8    9   10   11     word8
   4    5    6    7     byte5, half-word6
   0    1    2    3     byte0, byte1, byte2, byte3
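The difference between the two byte orders is easy to see by serialising one word both ways. This Python sketch uses the standard `struct` module; the value 0x11223344 is an arbitrary example, not taken from the slides.

```python
import struct

word = 0x11223344

# "<I" packs an unsigned 32-bit integer little-endian, ">I" big-endian.
little = struct.pack("<I", word)
big = struct.pack(">I", word)

# The index into each bytes object plays the role of the byte address offset.
for offset in range(4):
    print(offset, hex(little[offset]), hex(big[offset]))
```

In the little-endian layout the least significant byte (0x44) sits at the lowest address, while in the big-endian layout the most significant byte (0x11) does.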
Pipeline
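• 3 stages (ARM7) and 5 stages (ARM9TDMI)

ARM7 (3-stage pipeline)
  Stage     Instruction address   Action
  fetch     PC                    Load an instruction from memory
  decode    PC-4
  execute   PC-8

ARM9TDMI (5-stage pipeline)
  Stage        Action
  fetch        Load an instruction from memory
  decode
  execute
  memory       Access memory if needed
  write-back   Write result to register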
Memory Access
• The ARM7 is a Von Neumann, load/store architecture (the ARM9TDMI keeps the load/store model but has separate instruction and data interfaces), i.e.,
  - A single 32-bit data bus carries both instructions and data.
  - Only the load/store instructions (and SWP) access memory.
• Memory is addressed as a 32-bit address space.
• Data types can be
  - 8-bit bytes,
  - 16-bit half-words,
  - or 32-bit words,
  - and memory may be seen as a byte line folded into 4-byte words.
• Words must be aligned to 4-byte boundaries, and half-words to 2-byte boundaries.
• Always ensure that the memory controller supports all three access sizes.
3.3 Exceptions
• Introduction
• Types of ARM exceptions
• Exception and Interrupt
• Entering an Exception
• Returning from an Exception
• Exception Entry/Exit
• Exceptions and the Vector Table Address
Introduction
• Exceptions and interrupts break the sequential flow of a program, jumping to architecturally defined memory locations.
• In ARM, Software Interrupt (SWI) is the "system call" exception.
Types of ARM exceptions
• reset
  - when the CPU reset pin is asserted
• undefined instruction
  - when the CPU tries to execute an undefined op-code
• software interrupt
  - when the CPU executes the SWI instruction
• prefetch abort
  - when the CPU tries to execute an instruction pre-fetched from an illegal address
• data abort
  - when a data transfer instruction tries to read or write at an illegal address
• IRQ
  - when the CPU's external interrupt request pin is asserted
• FIQ
  - when the CPU's external fast interrupt request pin is asserted
Exception and Interrupt
• The terms exception and interrupt are often confused.
• Exception usually refers to an internal CPU event such as
  - floating point overflow
  - MMU fault (e.g., page fault)
  - trap (SWI)
• Interrupt usually refers to an external I/O event such as
  - I/O device request
  - Timer interrupt
• In the ARM architecture manuals, the two terms are used interchangeably.
Entering an Exception
• When an exception is generated, the processor takes the following actions:
  - Copy the CPSR to the SPSR for the mode in which the exception is to be handled.
  - Change the appropriate CPSR bits in order to
      - change to the appropriate mode, and map in the appropriate banked registers for that mode
      - disable interrupts: IRQs are disabled when any exception occurs; FIQs are disabled when a FIQ occurs, and on reset
  - Set lr_mode to the return address.
  - Set PC to the vector address for the exception.
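The entry sequence can be sketched as a small Python model operating on a dictionary that stands in for the processor state. Everything here is a simplification for illustration: `enter_exception`, the `cpu` dictionary layout, and the `EXCEPTIONS` table are hypothetical, the vector addresses and modes are the standard ARM ones, and details the slide does not mention (such as the T bit) are left out.

```python
# exception kind -> (vector address, mode in which it is handled)
EXCEPTIONS = {
    "reset": (0x00, "svc"),
    "undef": (0x04, "undef"),
    "swi": (0x08, "svc"),
    "prefetch_abort": (0x0C, "abt"),
    "data_abort": (0x10, "abt"),
    "irq": (0x18, "irq"),
    "fiq": (0x1C, "fiq"),
}


def enter_exception(cpu, kind, return_address):
    """Apply the exception entry steps to a dict-based model of the CPU state."""
    vector, mode = EXCEPTIONS[kind]
    cpu["spsr_" + mode] = dict(cpu["cpsr"])   # copy CPSR to the new mode's SPSR
    cpu["cpsr"]["mode"] = mode                # switch mode (banked registers map in)
    cpu["cpsr"]["I"] = 1                      # IRQs are disabled on any exception
    if kind in ("fiq", "reset"):
        cpu["cpsr"]["F"] = 1                  # FIQs are disabled on FIQ and reset
    cpu["lr_" + mode] = return_address        # set lr_mode to the return address
    cpu["pc"] = vector                        # jump to the exception vector
    return cpu


cpu = {"cpsr": {"mode": "usr", "I": 0, "F": 0}, "pc": 0x8000}
enter_exception(cpu, "irq", 0x8004)
print(cpu)
```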
Returning from an Exception
• The actions taken by the processor
  - Restore the CPSR from SPSR_mode
  - Restore the PC using the return address stored in lr_mode
• The way to return depends on whether a stack is used on entry to the handler
  - Without a stack
      - Perform a data processing instruction with the S bit set and the PC as the destination register.
  - With a stack
      - Restore the saved registers by performing
          LDMFD sp!, {r0-r12, pc}^
        where ^ indicates that the CPSR is restored from the SPSR.
Returning from SWI and Undefined Instruction Handlers
• SWI and undefined instruction (UI) exceptions are generated by the instruction itself, so the PC is not updated when the exception is taken. Thus, storing (PC-4) in lr_mode makes lr_mode point to the next instruction to be executed.
      SWI xxx   PC-8    being executed
      INST-1    PC-4    being decoded
      INST-2    PC      being fetched
• Restoring the PC from lr
  - Without a stack
      MOVS  pc, lr
  - With a stack
      STMFD sp!, {reglist, lr}
      …
      LDMFD sp!, {reglist, pc}^
Returning from FIQ and IRQ
• IRQ and FIQ are checked at the end of executing each instruction.
      INST-1    PC-12   IRQ or FIQ checked
      INST-2    PC-8    being executed
      INST-3    PC-4    being decoded
      INST-4    PC      being fetched
• Restoring PC from lr
  - Without a stack
      SUBS  pc, lr, #4
  - With a stack
      SUB   lr, lr, #4
      STMFD sp!, {reglist, lr}
      …
      LDMFD sp!, {reglist, pc}^
Returning from Prefetch Abort
• A prefetch abort is generated when the aborted instruction reaches the execution stage.
      INST-1    PC-8    being executed (aborted)
      INST-2    PC-4    being decoded
      INST-3    PC      being fetched
• Restoring PC from lr
  - Without a stack
      SUBS  pc, lr, #4
  - With a stack
      SUB   lr, lr, #4
      STMFD sp!, {reglist, lr}
      …
      LDMFD sp!, {reglist, pc}^
Returning from Data Abort
• When a data abort occurs, the program counter has already been updated.
      INST-1    PC-12   aborted
      INST-2    PC-8    being executed
      INST-3    PC-4    being decoded
      INST-4    PC      being fetched
• Restoring PC from lr
  - Without a stack
      SUBS  pc, lr, #8
  - With a stack
      SUB   lr, lr, #8
      STMFD sp!, {reglist, lr}
      …
      LDMFD sp!, {reglist, pc}^
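The return sequences for the different handlers differ only in the amount subtracted from lr. The Python sketch below collects those offsets and checks them against the pipeline positions shown on the slides, assuming (as the SWI slide states and the other diagrams imply) that lr_mode receives PC-4 on entry. The names `RETURN_OFFSET`, `lr_on_entry` and `return_address` are made up for this example.

```python
# Amount subtracted from lr when returning from each kind of handler.
RETURN_OFFSET = {
    "swi": 0,              # MOVS pc, lr
    "undef": 0,            # MOVS pc, lr
    "irq": 4,              # SUBS pc, lr, #4
    "fiq": 4,              # SUBS pc, lr, #4
    "prefetch_abort": 4,   # SUBS pc, lr, #4
    "data_abort": 8,       # SUBS pc, lr, #8
}


def lr_on_entry(pc):
    """Value placed in lr_mode when the exception is taken (PC - 4)."""
    return pc - 4


def return_address(kind, lr):
    """Address loaded into the PC by the handler's return instruction."""
    return lr - RETURN_OFFSET[kind]


# Data abort: the aborted instruction sits at PC-12 when the exception is taken.
aborted = 0x8000
pc = aborted + 12
print(hex(return_address("data_abort", lr_on_entry(pc))))
```

With these offsets a data abort or prefetch abort handler returns to the aborted instruction so it can be retried, while SWI, IRQ and FIQ handlers resume at the instruction that had not yet executed.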
Exception Entry/Exit
Ref. [4]
Exceptions and the Vector Table Address

Exception Vectors

Address      Exception type          Exception mode   Priority (1=high, 6=low)
0x00000000   Reset                   Supervisor       1
0x00000004   Undefined instruction   Undefined        6
0x00000008   Software Interrupt      Supervisor       6
0x0000000C   Abort (prefetch)        Abort            5
0x00000010   Abort (data)            Abort            2
0x00000014   Reserved                Reserved         Not applicable
0x00000018   IRQ                     IRQ              4
0x0000001C   FIQ                     FIQ              3
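As a small worked example of how the priority column is used, the following Python sketch stores the vector table and picks which of several simultaneously pending exceptions would be serviced first. The names `VECTOR_TABLE`, `vector_address` and `highest_priority` are hypothetical; the data comes from the table above, with the reserved entry omitted.

```python
# exception type -> (vector address, exception mode, priority; 1 = highest)
VECTOR_TABLE = {
    "Reset": (0x00000000, "Supervisor", 1),
    "Undefined instruction": (0x00000004, "Undefined", 6),
    "Software Interrupt": (0x00000008, "Supervisor", 6),
    "Abort (prefetch)": (0x0000000C, "Abort", 5),
    "Abort (data)": (0x00000010, "Abort", 2),
    "IRQ": (0x00000018, "IRQ", 4),
    "FIQ": (0x0000001C, "FIQ", 3),
}


def vector_address(exception):
    """Address the PC is set to when this exception is taken."""
    return VECTOR_TABLE[exception][0]


def highest_priority(pending):
    """Pick the pending exception with the smallest priority number.

    Ties (only Undefined instruction vs. Software Interrupt, which cannot
    occur together) are broken by vector address so the result is deterministic.
    """
    return min(pending, key=lambda e: (VECTOR_TABLE[e][2], VECTOR_TABLE[e][0]))


first = highest_priority(["IRQ", "FIQ", "Abort (data)"])
print(first, hex(vector_address(first)))
```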
References
[1] Andrew Sloss, Dominic Symes and Chris Wright, "ARM System Developer's Guide", Morgan Kaufmann, 2004
[2] David Seal, "ARM Architecture Reference Manual", Addison-Wesley, 2000
[3] ARM DUI 0021A, "Programming Techniques", 1995
[4] http://www.samsung.com/Products/Semiconductor/SystemLSI/Networks/PersonalNTASSP/CommunicationProcessor/S3C4510B/um_s3c4510b_rev1.pdf
[5] www.arm.com
[6] http://www.arm.com/pdfs/DUI0056D_ADS1_2_Dev.pdf
[7] http://nthucad.cs.nthu.edu.tw/~wcyao/
[8] www-courses.cs.uiuc.edu/~cs433/Processors/ARM/ARMInstV1.0.ppt
Exercise
1. How many registers does ARM have? What are they used for?
2. Describe the processor modes of ARM in detail.
3. What is the meaning of the ARM condition flags in logical instructions and in arithmetic instructions?
4. What is the benefit of conditional execution?
5. What is the difference between little-endian and big-endian format?
6. Describe the ARM memory cycle types and how the type is decided.
7. List all types of ARM exceptions and describe their addresses and priorities.