ARM7TDMI Processor Introduction
ARM Processor Introduction
ARM Architecture
Key Features
Data Size and Instruction Sets
ARM is a 32-bit RISC architecture
(Reduced Instruction Set Computer)
ARM uses a 32-bit load/store architecture
When used in relation to the ARM:
Byte means 8-bit
Halfword means 16-bit (two bytes)
Word means 32-bit (four bytes)
Most ARM cores implement two instruction sets
32-bit ARM Instruction Set
16-bit Thumb Instruction Set
17/07/2015
3
ARM Operating States
The ARM processor has two operating states:
ARM state which executes 32-bit, word aligned ARM instructions
THUMB state which can execute 16-bit, halfword aligned THUMB
instructions
Switching state
Entering THUMB state
- BX instruction with the state bit (bit 0) set in the operand register.
- Automatically on return from an exception (IRQ, FIQ, ABORT, SWI,…), if the
exception was entered with the processor in THUMB state.
Entering ARM state
- BX instruction with the state bit clear in the operand register.
- Automatically on the processor taking an exception. In this case, the PC is
placed in the exception mode’s link register.
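The BX rule above can be sketched in a few lines of Python. This is only an illustrative model, not processor code: the function name `bx` and the returned tuple are assumptions made for this example.

```python
def bx(operand):
    """Model of BX Rm: bit 0 of the operand register selects the state.

    Returns (new_pc, state). Hypothetical helper for illustration only.
    """
    thumb = bool(operand & 1)            # state bit (bit 0) of the operand
    if thumb:
        new_pc = operand & 0xFFFFFFFE    # Thumb code is halfword aligned
        return new_pc, "THUMB"
    return operand, "ARM"                # operand assumed word aligned
```

For example, `bx(0x8001)` models a jump to Thumb code at 0x8000, while `bx(0x8000)` stays in (or returns to) ARM state.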
ARM Operating Modes
The ARM supports seven modes of operation:
User (usr): The normal ARM program execution state
FIQ (fiq): Designed to support a data transfer or channel process
IRQ (irq): Used for general-purpose interrupt handling
Supervisor (svc): Protected mode for the operating system
Abort mode (abt): Entered after a data or instruction prefetch abort
System (sys): A privileged user mode for the operating system
Undefined (und): Entered when an undefined instruction is executed
Mode changes may be made under software control, or may be
brought about by external interrupts or exception processing.
Most application programs will execute in User mode. The non-user
modes, known as privileged modes, are entered in order to service
interrupts or exceptions, or to access protected resources.
Registers
37 registers, each 32 bits long, shared across 7 processor modes
18 visible 32-bit registers in the exception modes (17 in User and System modes)
r0-r13 = general purpose registers (r13 = Stack Pointer (SP))
r14 = Link Register (LR)
r15 = Program Counter (PC)
CPSR = Current Program Status Register
SPSR = Saved Program Status Register (one per exception mode; not
available in User or System mode)
Availability of the register banks depends on the current
processor mode (user, supervisor and others)
Maximum of 13 general purpose registers in user mode (r0-r12)
In high priority interrupts (FIQ), a different bank with general purpose
registers becomes available
A separate SPSR, SP and LR are available for each exception mode
Register Organization Summary
Mode              User/System  FIQ        Supervisor  Abort      IRQ        Undefined
Low Registers
                  r0           r0         r0          r0         r0         r0
                  r1           r1         r1          r1         r1         r1
                  r2           r2         r2          r2         r2         r2
                  r3           r3         r3          r3         r3         r3
                  r4           r4         r4          r4         r4         r4
                  r5           r5         r5          r5         r5         r5
                  r6           r6         r6          r6         r6         r6
                  r7           r7         r7          r7         r7         r7
High Registers
                  r8           r8_fiq     r8          r8         r8         r8
                  r9           r9_fiq     r9          r9         r9         r9
                  r10          r10_fiq    r10         r10        r10        r10
                  r11          r11_fiq    r11         r11        r11        r11
                  r12          r12_fiq    r12         r12        r12        r12
                  r13 (sp)     r13_fiq    r13_svc     r13_abt    r13_irq    r13_undef
                  r14 (lr)     r14_fiq    r14_svc     r14_abt    r14_irq    r14_undef
                  r15 (pc)     r15 (pc)   r15 (pc)    r15 (pc)   r15 (pc)   r15 (pc)
Program Status Registers
                  cpsr         cpsr       cpsr        cpsr       cpsr       cpsr
                               spsr_fiq   spsr_svc    spsr_abt   spsr_irq   spsr_undef
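As a cross-check on the register organization, the banking scheme can be modelled as a small lookup. This is a sketch only: the mode keys and the helper names `physical_register` and `count_registers` are assumptions chosen for the example, with suffixes following the summary table.

```python
# Registers that have their own physical copy in each exception mode.
BANKED = {
    "fiq":   {8, 9, 10, 11, 12, 13, 14},
    "svc":   {13, 14},
    "abt":   {13, 14},
    "irq":   {13, 14},
    "undef": {13, 14},
}
MODES = ("usr", "sys", "fiq", "svc", "abt", "irq", "undef")

def physical_register(mode, n):
    """Name of the physical register seen as r<n> in the given mode."""
    if mode not in MODES or not 0 <= n <= 15:
        raise ValueError("unknown mode or register number")
    if n in BANKED.get(mode, set()):
        return f"r{n}_{mode}"
    return f"r{n}"          # shared with User/System mode

def count_registers():
    """Total physical registers: general registers + CPSR + five SPSRs."""
    general = {physical_register(m, n) for m in MODES for n in range(16)}
    return len(general) + 1 + len(BANKED)
```

Counting the distinct names gives 31 general registers, and adding the CPSR and the five SPSRs gives the 37 registers quoted on the Registers slide.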
THUMB State Registers Set
Relationship between ARM and THUMB state registers
The THUMB state registers relate to the ARM state
registers in the following way:
Program Status Registers (1/3)
The ARM contains a Current Program Status Register
(CPSR), plus five Saved Program Status Registers
(SPSRs) for use by exception handlers.
The functions of these registers are to:
Hold information about the most recently performed ALU
operation
Control the enabling and disabling of interrupts
Set the processor operating mode
Program Status Registers (2/3)
Condition Code Flags
The N, Z, C and V bits may be changed as a result of arithmetic and
logical operations, and may be tested to determine whether an instruction
should be executed
- In ARM state, all instructions may be executed conditionally.
- In THUMB state, only the Branch instruction is capable of conditional
execution.
Control Bits
The I, F, T and M[4:0] bits will be changed when an exception arises. If
the processor is operating in a privileged mode, they can also be
manipulated by software.
T bit:
- This reflects the operating state. When this bit is set, the processor is
executing in THUMB state, otherwise it is executing in ARM state. This is
reflected on the TBIT external signal.
- Note that software must never change the state of the T bit in the CPSR directly. If
this happens, the processor will enter an unpredictable state.
Program Status Registers (3/3)
Control Bits
Interrupt disable bits:
- The I and F bits are the interrupt disable bits. When set, these
disable the IRQ and FIQ interrupts respectively.
Mode bits:
- The M4, M3, M2, M1 and M0 bits (M[4:0]) are the mode bits. These
determine the processor's operating mode. Not all combinations of
the mode bits define a valid processor mode. Only those explicitly
described shall be used. The user should be aware that if any illegal
value is programmed into the mode bits, M[4:0], then the processor
will enter an unrecoverable state. If this occurs, reset should be
applied.
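The fields described on these three slides can be illustrated by decoding a CPSR value. The slides do not give bit positions or mode encodings; the ones used below (N, Z, C, V in bits 31 to 28, I, F, T in bits 7 to 5, and the M[4:0] values) are taken from the ARM architecture documentation, and the function name `decode_cpsr` is an assumption for this sketch.

```python
# M[4:0] encodings of the seven valid processor modes.
MODES = {
    0b10000: "usr", 0b10001: "fiq", 0b10010: "irq", 0b10011: "svc",
    0b10111: "abt", 0b11011: "und", 0b11111: "sys",
}

def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its flag, control and mode fields."""
    return {
        "N": bool((cpsr >> 31) & 1),   # negative
        "Z": bool((cpsr >> 30) & 1),   # zero
        "C": bool((cpsr >> 29) & 1),   # carry
        "V": bool((cpsr >> 28) & 1),   # overflow
        "I": bool((cpsr >> 7) & 1),    # IRQ disabled when set
        "F": bool((cpsr >> 6) & 1),    # FIQ disabled when set
        "T": bool((cpsr >> 5) & 1),    # THUMB state when set
        "mode": MODES.get(cpsr & 0x1F),  # None marks an illegal mode value
    }
```

For instance, `decode_cpsr(0x600000D3)` reports Z and C set, both interrupts disabled, ARM state, and Supervisor mode.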
Hardware Interrupts
ARM cores do not include an interrupt controller to
support and distinguish many interrupt sources
A memory-mapped interrupt controller is required for devices with
many peripherals / interrupt sources
There are two levels of hardware interrupt. Each level
has its own SP and LR registers and a saved copy of
the original Program Status Register
FIQ – High priority (fast) hardware interrupt
- Disables all other interrupts
- Provides a new bank with 5 general purpose registers
IRQ – Regular hardware interrupt
Exceptions (1/6)
Exceptions arise whenever the normal flow of a program
has to be halted temporarily
For example to service an interrupt from a peripheral.
ARM supports 7 types of exception and has a privileged
processor mode for each type of exception.
ARM Exception vectors
Address      Exception               Mode on Entry
0x00000000   Reset                   Supervisor
0x00000004   Undefined instruction   Undefined
0x00000008   Software Interrupt      Supervisor
0x0000000C   Abort (prefetch)        Abort
0x00000010   Abort (data)            Abort
0x00000014   Reserved                Reserved
0x00000018   IRQ                     IRQ
0x0000001C   FIQ                     FIQ
Exceptions (2/6)
When handling an exception, the ARM:
Preserves the address of the next instruction in the appropriate
Link Register
Copies the CPSR into the appropriate SPSR
Forces the CPSR mode bits to a value which depends on the
exception
Forces the PC to fetch the next instruction from the relevant
exception vector
It may also set the interrupt disable flags to prevent otherwise
unmanageable nestings of exceptions.
If the processor is in THUMB state when an exception occurs, it
will automatically switch into ARM state when the PC is loaded
with the exception vector address.
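The entry sequence above can be sketched as a small state update. This is an illustrative model rather than a description of the hardware: the names `VECTORS`, `MODE_BITS` and `take_exception` are assumptions, the mode-bit encodings come from the ARM architecture documentation rather than from these slides, and the caller is assumed to supply the return address already adjusted for the exception type.

```python
# exception -> (vector address, mode entered, whether F is also set)
VECTORS = {
    "reset":          (0x00, "svc", True),
    "undefined":      (0x04, "und", False),
    "swi":            (0x08, "svc", False),
    "prefetch_abort": (0x0C, "abt", False),
    "data_abort":     (0x10, "abt", False),
    "irq":            (0x18, "irq", False),
    "fiq":            (0x1C, "fiq", True),
}
MODE_BITS = {"fiq": 0b10001, "irq": 0b10010, "svc": 0b10011,
             "abt": 0b10111, "und": 0b11011}
I_BIT, F_BIT, T_BIT = 0x80, 0x40, 0x20

def take_exception(kind, cpsr, return_address):
    """Return the banked LR, SPSR, new CPSR and new PC for an exception."""
    vector, mode, sets_f = VECTORS[kind]
    new_cpsr = (cpsr & ~0x1F) | MODE_BITS[mode]   # force the mode bits
    new_cpsr |= I_BIT                             # disable IRQ
    if sets_f:
        new_cpsr |= F_BIT                         # reset and FIQ disable FIQ too
    new_cpsr &= ~T_BIT                            # handlers start in ARM state
    return {"lr": return_address, "spsr": cpsr,
            "cpsr": new_cpsr, "pc": vector, "mode": mode}
```

Calling `take_exception("irq", 0x60000030, 0x8004)` models an IRQ taken from User mode in THUMB state: the old CPSR is preserved in the SPSR, the T bit is cleared, and execution continues at 0x18.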
Exceptions (3/6)
On completion, the exception handler:
Moves the Link Register, minus an offset where appropriate, to
the PC. (The offset will vary depending on the type of exception.)
Copies the SPSR back to the CPSR
Clears the interrupt disable flags, if they were set on entry
Exceptions (4/6)
Reset
When the processor’s Reset input is asserted
- CPSR ← Supervisor + I + F
- PC ← 0x00000000
Undefined Instruction
If an attempt is made to execute an instruction that is undefined
- LR_undef ← Undefined Instruction Address + #4
- PC ← 0x00000004, CPSR ← Undefined + I
- Return with: MOVS pc, lr
Prefetch Abort
Instruction fetch memory abort, invalid fetched instruction
- LR_abt ← Aborted Instruction Address + #4, SPSR_abt ← CPSR
- PC ← 0x0000000C, CPSR ← Abort + I
- Return with: SUBS pc, lr, #4
Exceptions (5/6)
Data Abort
Data access memory abort, invalid data
- LR_abt ← Aborted Instruction + #8, SPSR_abt ← CPSR
- PC ← 0x00000010, CPSR ← Abort + I
- Return with: SUBS pc, lr, #4 or SUBS pc, lr, #8
Software Interrupt
Enters Supervisor mode
- LR_svc ← SWI Address + #4, SPSR_svc ← CPSR
- PC ← 0x00000008, CPSR ← Supervisor + I
- Return with: MOVS pc, lr
Exceptions (6/6)
Interrupt Request
Externally generated by asserting the processor’s IRQ input
- LR_irq ← PC - #4, SPSR_irq ← CPSR
- PC ← 0x00000018, CPSR ← Interrupt + I
- Return with: SUBS pc, lr, #4
Fast Interrupt Request
Externally generated by asserting the processor’s FIQ input
- LR_fiq ← PC - #4, SPSR_fiq ← CPSR
- PC ← 0x0000001C, CPSR ← Fast Interrupt + I + F
- Return with: SUBS pc, lr, #4
- The FIQ vector is the last entry in the table, so the handler can be placed directly at 0x1C without a branch, which speeds up the response time
Memory Organization
ARM derivatives up to ARM7 are based on the von
Neumann model
Shared, single memory space for code AND data
Linear 32-bit address space (4 GByte)
ARM derivatives of ARM9 and up support the Harvard
model
Separated memory ports for code AND data
Offers simultaneous access to code AND data
The 3-stage Instruction Pipeline
Up to the ARM7, ARM processors have a 3-stage
instruction pipeline
[Figure: ARM7TDMI pipeline. FETCH: instruction fetch. DECODE: Thumb-to-ARM decompress, ARM decode, register select. EXECUTE: register read, shift, ALU, register write.]
The three stages are:
1. Fetch
- Fetching an instruction from the memory containing the code
2. Decode
- Decoding the instruction and preparing the data path control
signals for the next cycle
3. Execute
- The instruction is executed on the specified data path and
the result is written back to the destination
Execution Of Single Cycle Instructions
It takes 3 cycles to completely process an instruction
However, once the pipeline is filled, one instruction
completes every cycle
Time (cycle)     1        2        3         4         5
Instruction 1    Fetch    Decode   Execute
Instruction 2             Fetch    Decode    Execute
Instruction 3                      Fetch     Decode    Execute
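The overlap between instructions can be reproduced with a short scheduling sketch. It assumes ideal single-cycle instructions with no stalls or branches; the names `schedule` and `total_cycles` are hypothetical.

```python
STAGES = ("Fetch", "Decode", "Execute")

def schedule(n_instructions):
    """Map each cycle to the instruction occupying each pipeline stage."""
    table = {}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            # Instruction i+1 enters stage s in cycle i+s+1.
            table.setdefault(i + s + 1, {})[stage] = i + 1
    return table

def total_cycles(n_instructions):
    """Cycles needed to finish n single-cycle instructions."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1
```

In cycle 3 of `schedule(3)` all three stages are busy: instruction 1 executes while instruction 2 is decoded and instruction 3 is fetched. The first instruction needs 3 cycles, and each further instruction adds only one.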
Execution Of Multi Cycle Instructions
The pipeline will be halted/delayed for one cycle if
multiple memory accesses have to be made in order to
execute the instruction
For example any instruction requiring access to an operand
stored in memory (and not a register)
NOTE: Branch instructions flush and refill the pipeline
Upon execution of a branch instruction, the current fetch and
decode actions of the pipeline are aborted and a new fetch from
the branch location gets started
The 5-stage Instruction Pipeline
Higher performance ARM derivatives use a 5-stage
pipeline to compensate for the memory access bottleneck
of the 3-stage pipeline
The five stages are:
1. Fetch: Fetch next instruction from memory
2. Decode: Decode instruction and read register operands
3. Execute: Execute instruction
4. Data: Access data memory, if required
5. Write-back: Write the result of the instruction back to the
destination register
Pipelining Comparison
[Figure: pipeline stages compared]
ARM7TDMI: FETCH (instruction fetch), DECODE (Thumb-to-ARM decompress, ARM decode, register select), EXECUTE (register read, shift, ALU, register write)
ARM9TDMI: FETCH (instruction fetch), DECODE (ARM or Thumb instruction decode, register decode, register read), EXECUTE (shift + ALU), MEMORY (memory access), WRITE (register write)
Reducing Code Size: Thumb
Compressed subset of the 32-bit ARM instruction set
Requires less bus bandwidth from narrow external memory
Improves already outstanding code density
A Thumb enabled ARM:
Executes both 32-bit ARM and 16-bit Thumb instructions
Allows runtime interworking between ARM and Thumb code
State change performed via branch with exchange (BX)
instruction
Thumb reduces 32-bit system to 16-bit cost
Consumes less power
Requires less external memory
ARM and Thumb Performance
Thumb programs typically are:
30% smaller than ARM programs
30% faster when accessing 16-bit memory
[Figure: performance chart with axes MIPS and MHz]
ARM-Based System
[Block diagram. Blocks shown: ARM core, 16-bit RAM, 32-bit RAM, 8-bit ROM, I/O peripherals, interrupt controller with nIRQ and nFIQ signals to the core]
Advanced Microcontroller Bus Architecture
AMBA was introduced in 1996 and is widely used as the
on-chip bus for ARM processors
AMBA is an open standard that describes the
interconnection and management of the functional blocks that
make up a System-on-Chip
Advanced System Bus (ASB)
First Generation of AMBA system bus
Implements features required for high performance
Multiple Bus Masters
Optimizes system performance by sharing resources between
different bus masters such as the CPU, DMA controller, etc.
Pipelined and burst transfers
Allows high speed memory and peripheral access without the
requirement for additional cycles on the bus
Advanced High-Performance Bus (AHB)
Multiple Bus Masters
Optimizes system performance by sharing resources
between different bus masters such as the CPU, DMA
controller, etc.
Pipelined and burst transfers
Allows high speed memory and peripheral access without
the requirement for additional cycles on the bus
Split transactions supported
Enables high latency slaves to release the system bus while
completing a transaction
Advanced Peripheral Bus (APB)
Ideal for general purpose peripherals such as timers,
UARTs, IOs, etc.
Simple bus
Non-pipelined architecture
Easy to implement with all peripherals acting as slaves
Simpler interface means low gate count
Low power
Isolating peripherals behind the bridge reduces load on the main
system bus
Atmel AT91 Architecture
AT91 Architecture
The Atmel AT91 Series of microcontrollers is based upon
the powerful ARM7TDMI or ARM920T processors
New products are based upon the powerful ARM926EJ-S
processor
Atmel has taken these cores, added a wide range of
peripherals and advanced power management systems, to
give the design engineer the best of both worlds – a high
performance peripheral set with very low power
consumption
It gives the buyer a 32-bit processor at 16-bit cost!
ARM Cores used by Atmel
ARM7TDMI
30 MIPS @ 33 MHz in 0.35 µm or 60 MIPS @ 66 MHz in 0.18 µm
-40 to 85 °C
Used in: AT91 series; SAM7S, 7A, 7X, 7SE, 7L; ARM7 ASSP & ASICs
ARM920T
200 MIPS @ 180 MHz in 0.18 µm
16 KB instruction and 16 KB data cache, MMU, ETM
-40 to 85 °C
ARM926EJ-S
200 MIPS @ 180 MHz in 0.13 µm
Extended DSP instructions, Java, MMU, 16 KB instruction and 16 KB data cache, TCM
-40 to 85 °C
ARM9-based parts: AT91RM9200, SAM9260, SAM9261, SAM9262, SAM9263, ARM9 ASSP & ASICs
ARM946E-S
130 MIPS @ 120 MHz in 0.18 µm
Extended DSP instructions
Configurable instruction and data cache
-40 to 85 °C
Used in: ASICs
ARM7
Established, high-volume 32-bit RISC Architecture
Small die size and very low power consumption
3-stage instruction pipeline
Von Neumann memory layout with linear 32-bit address
space (4GByte)
32-bit data bus
Supports little- and big-endian
Performance 0.9 MIPS/MHz
ARM7 Thumb Family
ARM7TDMI processor
The ARM7TDMI processor is a member of the Advanced
RISC Machines (ARM) family of general-purpose 32-bit
microprocessors
What does ARM7TDMI mean?
ARM7 - 32-bit Advanced RISC Machine
T - Thumb architecture extension
- Two separate instruction sets, 32-bit ARM instructions and 16-bit Thumb instructions
D - Debug extension
M - Enhanced multiplier
I - EmbeddedICE macrocell extension
ARM7TDMI Processor Features
32/16-bit RISC architecture version 4T (ARM v4T)
3-stage pipeline
Unified bus architecture
32-bit ARM Instruction Set + 16-bit Thumb extension
Forward compatible code
EmbeddedICE on-chip debug
Smallest Die Size: 0.53mm² on 0.18µm process
Industry leading 0.25mW/MHz
ARM7TDMI Block Diagram
Von Neumann Architecture
3-stage pipeline
fetch, decode, execute
32-bit Data Bus
32-bit Address Bus
37 32-bit registers
32-bit ARM instruction set
16-bit THUMB instruction set
32x8 Multiplier
Barrel Shifter
ARM9
32-bit RISC processor core with ARM and Thumb instruction
sets
5-stage instruction pipeline
Harvard memory layout with two 32-bit linear address spaces,
one for code and one for data
Integrated instruction and data caches
Double-bandwidth memory access
Reduced CPI (Clocks Per Instruction)
Performance 1.1 MIPS/MHz
ARM926EJ-S Processor Features
32/16-bit RISC architecture version 5TE
Harvard 5-stage pipeline
Separate instruction and data AHB buses
32-bit ARM Instruction Set plus 16-bit Thumb extension
DSP instruction extensions
ARM Jazelle technology Java bytecode acceleration
MMU to support Symbian OS, Windows CE, Linux and Palm OS
Selectable size instruction and data caches
Instruction and data TCM interfaces with wait state support
EmbeddedICE on-chip debug
ETM interface for Real-time trace capability with ETM9 macrocell
Power: 0.5mW/MHz
ARM926EJ-S Block Diagram
[Block diagram. Blocks shown: ARM9EJ-S core, ETM interface, instruction TCM interface, data TCM interface, instruction cache, data cache, instruction and data MMU, write buffer, control logic and bus interface unit, coprocessor interface, AMBA AHB interface (instruction and data)]
AT91 ARM7-Based Architecture
[Block diagram. Blocks shown: ARM7TDMI core, on-chip SRAM / ROM, ASB, 16-bit External Bus Interface (to external memories and external peripherals), Advanced Interrupt Controller, AMBA Bridge, APB, on-chip peripherals]
AT91R40008 (58A03) Block Diagram
AT91 ARM7-based MCP Architecture
[Block diagram. Blocks shown: ARM7TDMI core, on-chip SRAM / ROM, ASB, 16-bit External Bus Interface (to external memories and external peripherals), Advanced Interrupt Controller, on-chip 16-bit Flash (stacked dies), AMBA Bridge, APB, on-chip peripherals]
AT91FR40162 Block Diagram
AT91 ARM9-Based Architecture
[Block diagram. Blocks shown: ARM920T core with MMU, Advanced Interrupt Controller, memory controller, on-chip SRAM / ROM, ASB, 32-bit External Bus Interface (to external memories and external peripherals), AMBA Bridge, APB, on-chip peripherals]
AT91RM9200 (58A07) Block Diagram
AT91 SAM7S Architecture
[Block diagram. Blocks shown: ARM7TDMI core, on-chip SRAM, ASB, Advanced Interrupt Controller, on-chip 32-bit Flash, AMBA Bridge, APB, on-chip peripherals]
AT91SAM7S64 (58814) Block Diagram
AT91 SAM9 Architecture
[Block diagram. Blocks shown: ARM926EJ-S core with MMU, TCM and BIU (separate I and D ports), Advanced Interrupt Controller, on-chip SRAM / ROM, 5-layer AHB matrix, 32-bit External Bus Interface (to external memories and external peripherals), AMBA Bridge, APB, on-chip peripherals]
AT91SAM9261 (59002) Block Diagram
AT91 SAM7SE Architecture
[Block diagram. Blocks shown: ARM7TDMI core, on-chip SRAM, ASB, on-chip 32-bit Flash, Advanced Interrupt Controller, 32-bit External Bus Interface (to external memories and external peripherals), AMBA Bridge, APB, on-chip peripherals]
Appendix
ARM Instruction Set Summary
Condition Field (1/2)
All ARM instructions can be conditionally executed, which
means that their execution may or may not take place
depending on the values of the N, Z, C and V
flags in the CPSR
Every instruction contains a 4-bit condition code field in
bits 31 to 28
Condition Field (2/2)
There are fifteen different conditions, each represented by
a two-character suffix that can be appended to the
instruction's mnemonic.
A Branch (B in assembly) becomes BEQ for "Branch if Equal",
which means the Branch will only be taken if the Z flag is set.
Code   Suffix   Flags                          Meaning
0000   EQ       Z set                          Equal
0001   NE       Z clear                        Not equal
0010   CS       C set                          Unsigned higher or same
0011   CC       C clear                        Unsigned lower
0100   MI       N set                          Negative
0101   PL       N clear                        Positive or zero
0110   VS       V set                          Overflow
0111   VC       V clear                        No overflow
1000   HI       C set and Z clear              Unsigned higher
1001   LS       C clear or Z set               Unsigned lower or same
1010   GE       N equals V                     Greater or equal
1011   LT       N not equal to V               Less than
1100   GT       Z clear AND (N equals V)       Greater than
1101   LE       Z set OR (N not equal to V)    Less than or equal
1110   AL       (ignored)                      Always
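The condition table translates directly into a lookup of flag tests. The sketch below is illustrative only; `CONDITIONS` and `condition_passed` are names invented for this example, and code 1111 is deliberately left out because the table does not define it.

```python
# Index = 4-bit condition code from bits 31 to 28 of the instruction.
CONDITIONS = [
    ("EQ", lambda n, z, c, v: z),
    ("NE", lambda n, z, c, v: not z),
    ("CS", lambda n, z, c, v: c),
    ("CC", lambda n, z, c, v: not c),
    ("MI", lambda n, z, c, v: n),
    ("PL", lambda n, z, c, v: not n),
    ("VS", lambda n, z, c, v: v),
    ("VC", lambda n, z, c, v: not v),
    ("HI", lambda n, z, c, v: c and not z),
    ("LS", lambda n, z, c, v: (not c) or z),
    ("GE", lambda n, z, c, v: n == v),
    ("LT", lambda n, z, c, v: n != v),
    ("GT", lambda n, z, c, v: (not z) and n == v),
    ("LE", lambda n, z, c, v: z or n != v),
    ("AL", lambda n, z, c, v: True),
]

def condition_passed(code, n, z, c, v):
    """True if an instruction with this condition code would execute."""
    suffix, test = CONDITIONS[code]   # raises IndexError for code 0b1111
    return bool(test(n, z, c, v))
```

As a usage example, comparing 3 with 5 (computing 3 - 5) leaves N set and Z, C, V clear, so `condition_passed(0b1011, True, False, False, False)` (LT) is true while the GE and HI codes fail.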
Branch Instructions (1/2)
All ARM Processors support a branch instruction that allows a
conditional branch forwards or backwards up to 32Mbytes.
As the Program Counter (PC) is one of the general-purpose registers (register 15), a
branch or jump can also be generated by writing a value to register 15.
A subroutine call is a variant of the standard branch: the Branch with
Link instruction preserves the address of the instruction after the branch
(the return address) in register 14 (link register or LR).
A load instruction provides a way to branch anywhere in the 4Gbyte
address space. A 32-bit value is loaded directly from memory into the
PC, causing a branch.
ARM processors that support the Thumb instruction set also support
a branch instruction (BX) that jumps to a given address and optionally
switches to executing Thumb instructions.
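The 32 Mbyte limit follows from the way B and BL store their target: a signed 24-bit word offset relative to the PC. The sketch below encodes such a branch. The encoding details (bits 27 to 25 equal to 101, the link bit in bit 24, and the PC reading as the branch address plus 8) come from the ARM architecture documentation rather than from these slides, and `encode_branch` is a hypothetical helper name.

```python
def encode_branch(address, target, link=False, cond=0b1110):
    """Encode an ARM B or BL located at `address` that jumps to `target`."""
    offset = target - (address + 8)        # PC is two instructions ahead
    if offset % 4:
        raise ValueError("target must be word aligned")
    words = offset // 4                    # offset is stored in words
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target is outside the +/-32 Mbyte branch range")
    return ((cond << 28) | (0b101 << 25) | (int(link) << 24)
            | (words & 0xFFFFFF))
```

A branch to itself, `encode_branch(0x8000, 0x8000)`, yields the encoding 0xEAFFFFFE, and any target whose word offset does not fit in 24 signed bits is rejected.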
Branch Instructions (2/2)
List of branch instructions
B, BL    Branch, and branch with link
BX       Branch and exchange instruction set
Examples
        B    label        ; branch unconditionally to label
        BCC  label        ; branch to label if carry flag is clear
        BEQ  label        ; branch to label if zero flag is set
        MOV  PC, #0       ; R15 = 0, branch to location zero
        BL   func         ; subroutine call to function
func
        MOV  PC, LR       ; R15=R14, return to instruction after the BL
        MOV  LR, PC       ; store the address of the instruction after the next one into R14
        LDR  PC, =func    ; load a 32-bit value into the program counter
Data Processing (1/2)
ARM has 16 data processing instructions. Most data processing
instructions take two source operands (Move and Move Not
have only one operand) and store a result in a register (except
for the Compare and Test instructions which only update the
condition codes)
Of the two source operands, one is always a register, the other is called a
shifter operand, and is either an immediate value or a register. If the
second operand is a register value, it may have a shift applied to it before
it is used as the operand to the ALU
Data Processing (2/2)
List of data processing instructions
Assembler Mnemonic   OP Code   Action
AND                  0000      Operand1 AND operand2
EOR                  0001      Operand1 EOR operand2
SUB                  0010      Operand1 - operand2
RSB                  0011      Operand2 - operand1
ADD                  0100      Operand1 + operand2
ADC                  0101      Operand1 + operand2 + carry
SBC                  0110      Operand1 - operand2 + carry - 1
RSC                  0111      Operand2 - operand1 + carry - 1
TST                  1000      As AND, but result is not written
TEQ                  1001      As EOR, but result is not written
CMP                  1010      As SUB, but result is not written
CMN                  1011      As ADD, but result is not written
ORR                  1100      Operand1 OR operand2
MOV                  1101      Operand2 (operand1 is ignored)
BIC                  1110      Operand1 AND NOT operand2 (Bit clear)
MVN                  1111      NOT operand2 (operand1 is ignored)
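The Action column can be read as plain 32-bit arithmetic. The sketch below evaluates each operation on already-resolved operand values; it ignores flag updates and the shifter, and the names `OPCODES` and `data_processing` are assumptions for this example.

```python
MASK = 0xFFFFFFFF
# Index = 4-bit OP code from the table.
OPCODES = ["AND", "EOR", "SUB", "RSB", "ADD", "ADC", "SBC", "RSC",
           "TST", "TEQ", "CMP", "CMN", "ORR", "MOV", "BIC", "MVN"]

def data_processing(op, op1, op2, carry=0):
    """Return (32-bit result, True if the result is written to a register)."""
    results = {
        "AND": op1 & op2,          "EOR": op1 ^ op2,
        "SUB": op1 - op2,          "RSB": op2 - op1,
        "ADD": op1 + op2,          "ADC": op1 + op2 + carry,
        "SBC": op1 - op2 + carry - 1,
        "RSC": op2 - op1 + carry - 1,
        "TST": op1 & op2,          "TEQ": op1 ^ op2,
        "CMP": op1 - op2,          "CMN": op1 + op2,
        "ORR": op1 | op2,          "MOV": op2,
        "BIC": op1 & ~op2,         "MVN": ~op2,
    }
    written = op not in ("TST", "TEQ", "CMP", "CMN")
    return results[op] & MASK, written     # wrap to 32 bits
```

For example, `data_processing("RSB", 1, 10)` gives 9, and `data_processing("CMP", 5, 5)` computes 0 but reports that no register is written.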
Multiply Instructions (1/2)
ARM has two classes of multiply instruction
normal, 32-bit result
long, 64-bit result
All multiply instructions take two register operands as the input
to the multiplier
ARM does not directly support a multiply by constant instruction due to the
efficiency of shift and add, or shift and reverse subtract instructions
There are two multiply instructions that produce 32-bit results
MUL, multiplies the values of two registers together, truncates the result to
32 bits, and stores the result in a third register.
MLA, multiplies the values of two registers together, adds the value of a
third register, truncates the result to 32 bits, and stores the result into a
fourth register (multiply and accumulate)
MUL   R4, R2, R1        ; Set R4 to value of R2 multiplied by R1
MULS  R4, R2, R1        ; R4 = R2xR1, set N and Z flags
MLA   R7, R8, R9, R3    ; R7 = R8xR9 + R3
Multiply Instructions (2/2)
There are four multiply instructions that produce 64-bit
results (long multiply)
Two of the variants multiply the values of two registers together and
store the 64-bit result in a third and fourth register. There are
signed (SMULL) and unsigned (UMULL) variants.
The remaining two variants multiply the values of two registers
together, add the 64-bit value from a third and fourth register and
store the 64-bit result back into those registers (third and fourth).
There are also signed (SMLAL) and unsigned (UMLAL) variants.
These instructions perform a long multiply and accumulate
SMULL  R4, R8, R2, R3   ; R4 = bits 0 to 31 of R2xR3
                        ; R8 = bits 32 to 63 of R2 x R3
UMULL  R6, R8, R0, R1   ; R6, R8 = R0 x R1
UMLAL  R5, R8, R0, R1   ; R5, R8 = R0 x R1 + R5, R8
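The difference between the truncating and the long multiplies can be checked with ordinary integer arithmetic. This is a sketch of the arithmetic only, with hypothetical function names; each long multiply returns the pair (low word, high word), matching the order of the two destination registers in the examples above.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mul(rm, rs):
    """MUL: product truncated to 32 bits."""
    return (rm * rs) & MASK32

def umull(rm, rs):
    """UMULL: unsigned 64-bit product as (low word, high word)."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, product >> 32

def smull(rm, rs):
    """SMULL: signed 64-bit product as (low word, high word)."""
    product = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return product & MASK32, product >> 32

def umlal(lo, hi, rm, rs):
    """UMLAL: add the unsigned product to the 64-bit value hi:lo."""
    total = (((hi << 32) | lo) + (rm & MASK32) * (rs & MASK32)) & MASK64
    return total & MASK32, total >> 32
```

Multiplying 0xFFFFFFFF by 2 shows the signed and unsigned variants diverging in the high word: `umull` returns a high word of 1, while `smull` treats the first operand as -1 and returns 0xFFFFFFFF.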
Load and Store Instructions (1/2)
Load and store instructions come in three types:
load or store the value of a single register
load or store multiple register values
swap a register value with the value of a memory location
Load and store single register
Load register instructions can load a 32-bit word, a 16-bit
halfword or an 8-bit byte from memory into a register.
Store register instructions can store a 32-bit word, a 16-bit
halfword or an 8-bit byte from a register to memory.
List of load and store single register:
- LDR/STR, Load/Store word
- LDRB/STRB, Load/Store byte
- LDRH/STRH, Load/Store unsigned halfword
- LDRSB, Load signed byte
- LDRSH, Load signed halfword
Load and Store Instructions (2/2)
Load and Store multiple registers
Load and Store multiple instructions perform a block transfer of any
number of the general purpose registers to or from memory
Four addressing modes are provided:
- pre-increment
- post-increment
- pre-decrement
- post-decrement
List of load and store multiple instructions
- LDM, Load multiple
- STM, Store multiple
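The four addressing modes differ only in which memory words a block transfer touches relative to the base register. The sketch below lists those addresses, lowest first. The function name `block_addresses` is an assumption for this example, and the address formulas follow the ARM architecture documentation rather than these slides.

```python
def block_addresses(mode, base, count):
    """Word addresses used by an LDM/STM of `count` registers, lowest first."""
    size = 4 * count
    start = {
        "post-increment": base,             # use the address, then add 4
        "pre-increment":  base + 4,         # add 4, then use the address
        "post-decrement": base - size + 4,  # use the address, then subtract 4
        "pre-decrement":  base - size,      # subtract 4, then use the address
    }[mode]
    # The lowest-numbered register always goes to the lowest address.
    return [start + 4 * i for i in range(count)]
```

With a base of 0x1000 and three registers, post-increment touches 0x1000 to 0x1008, whereas pre-decrement touches 0xFF4 to 0xFFC, which is the pattern a full descending stack push uses.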
Swap a register value with the value of a memory location
Swap can load a value from a register-specified memory location, store
the contents of a register to the same memory location, then write the
loaded value to a register.
List of semaphore instructions
- SWP, Swap
- SWPB, Swap Byte
SWI : Software Interrupt
The Software Interrupt instruction enters supervisor mode
in a controlled manner:
The instruction causes the software interrupt trap to be taken,
which effects the mode change
If the SWI vector address is suitably protected (by external
memory management hardware) from modification by the user, a
fully protected operating system may be constructed.
The bottom 24 bits of the instruction are ignored by the
processor, and may be used to communicate information
to the supervisor code.
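A supervisor typically recovers that information by reading the SWI instruction back from memory and masking off the top byte. The sketch below shows the masking step only; the helper name `swi_comment_field` is hypothetical, and the check that bits 27 to 24 are all ones reflects the SWI encoding in the ARM architecture documentation.

```python
def swi_comment_field(instruction):
    """Return the bottom 24 bits of a 32-bit ARM SWI instruction."""
    if ((instruction >> 24) & 0xF) != 0xF:   # bits 27-24 identify SWI
        raise ValueError("not a SWI instruction")
    return instruction & 0x00FFFFFF          # ignored by the processor
```

For the encoding 0xEF000011 (an unconditional SWI), the function returns 0x11, which the supervisor code could use as a service number.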
THUMB Instruction Set Summary
How Does Thumb Work?
The Thumb instruction set is a subset of the ARM
instruction set, optimized for code density.
Almost every Thumb instruction has an equivalent ARM
instruction:
ADD Rd, #Offset8 <> ADDS Rd, Rd, #Offset8
Inline expansion of Thumb Instruction to ARM Instruction
Real time decompression
Thumb instructions are not executed directly by the core; they are first expanded into the equivalent ARM instructions
The core needs to know whether it is reading Thumb
instructions or ARM instructions.
Core has two execution states - ARM and Thumb
Core does not have a mixed 16 and 32 bit instruction set.
Thumb Instruction Set Decompression
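THUMB: ADD Rd, #Constant (16-bit encoding)
Bits 15-13   001        Major opcode
Bits 12-11   10         Minor opcode
Bits 10-8    Rd         Destination & source register
Bits 7-0     Constant
ARM: ADDS Rd, Rd, #Constant (32-bit encoding)
Bits 31-28   1110       Always condition
Bits 27-25   00 1       Bit 25 is I (immediate operand)
Bits 24-21   0100       op1+op2
Bit 20       1          S (set condition codes)
Bits 19-16   0 Rd       Source register
Bits 15-12   0 Rd       Destination register
Bits 11-8    0000
Bits 7-0     Constant   Zero extended constant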
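The field mapping for this one instruction can be expressed as a bit-manipulation sketch. It handles only the Thumb ADD Rd, #Constant form discussed here, and the function name `decompress_thumb_add_imm` is an assumption made for illustration; the real decompressor is hardware in the decode stage.

```python
def decompress_thumb_add_imm(thumb):
    """Expand Thumb 'ADD Rd, #imm8' into ARM 'ADDS Rd, Rd, #imm8'."""
    if (thumb >> 11) != 0b00110:       # major opcode 001, minor opcode 10
        raise ValueError("not a Thumb ADD Rd, #Constant instruction")
    rd = (thumb >> 8) & 0x7            # 3-bit register, zero extended to 4
    imm8 = thumb & 0xFF                # constant, zero extended
    return ((0b1110 << 28)             # always condition
            | (1 << 25)                # I: second operand is an immediate
            | (0b0100 << 21)           # ADD
            | (1 << 20)                # S: set the condition codes
            | (rd << 16)               # source register
            | (rd << 12)               # destination register
            | imm8)
```

As a usage example, the Thumb encoding 0x3105 (ADD r1, #5) expands to 0xE2911005 (ADDS r1, r1, #5).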
Branch Instructions
Thumb supports four types of branch instruction:
an unconditional branch that allows a forward or backward branch
of up to 2Kbytes
a conditional branch to allow forward and backward branches of up
to 256 bytes
a branch with link is supported with a pair of instructions that allow
forward and backwards branches of up to 4Mbytes
a branch and exchange instruction branches to an address in a
register and optionally switches to ARM code execution
List of branch instructions
B     conditional branch
B     unconditional branch
BL    Branch with link
BX    Branch and exchange instruction set
Data Processing Instructions
Thumb data-processing instructions are a subset of the
ARM data-processing instructions
All Thumb data-processing instructions set the condition codes
List of data-processing instructions
ADC, Add with Carry
ADD, Add
AND, Logical AND
ASR, Arithmetic shift right
BIC, Bit clear
CMN, Compare negative
CMP, Compare
EOR, Exclusive OR
LSL, Logical shift left
LSR, Logical shift right
MOV, Move
MUL, Multiply
MVN, Move NOT
NEG, Negate
ORR, Logical OR
ROR, Rotate Right
SBC, Subtract with Carry
SUB, Subtract
TST, Test
Load and Store Register Instructions
Thumb supports 8 types of load and store register
instructions
List of load and store register instructions
LDR      Load word
LDRB     Load unsigned byte
LDRH     Load unsigned halfword
LDRSB    Load signed byte
LDRSH    Load signed halfword
STR      Store word
STRB     Store byte
STRH     Store halfword
Load and Store Multiple Instructions
Thumb supports four types of load and store multiple
instructions
Two (a load and store) are designed to support block copy
The other two instructions (called PUSH and POP)
implement a full descending stack, and the stack pointer
is used as the base register
List of load and store multiple instructions
LDM     Load multiple
POP     Pop multiple
PUSH    Push multiple
STM     Store multiple
ARM Instruction Set Advantages
All instructions are 32 bits long.
Most instructions are executed in one single cycle.
Every instruction can be conditionally executed.
A load/store architecture
Data processing instructions act only on registers
- Three operand format
- Combined ALU and shifter for high speed bit manipulation
Specific memory access instructions with powerful auto-indexing
addressing modes
32-bit, 16-bit and 8-bit data types
Flexible multiple register load and store instructions
Thumb Instruction Set Advantages
All instructions are exactly 16 bits long to improve code
density over other 32-bit architectures
The Thumb architecture still uses a 32-bit core, with:
32-bit address space
32-bit registers
32-bit shifter and ALU
32-bit memory transfer
This gives:
Long branch range
Powerful arithmetic operations
Large address space