A “short list” of embedded systems


Basic Architecture
• Control unit and datapath
– Note similarity to single-purpose processor
• Key differences
– Datapath is general
– Control unit doesn’t store the algorithm; the algorithm is “programmed” into the memory
[Figure: processor block diagram. The control unit (controller, PC, IR) and the datapath (ALU, registers) exchange control/status signals; the processor connects to I/O and memory]
Embedded Systems Design: A Unified
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Datapath Operations
• Load
– Read memory location into register
• ALU operation
– Input certain registers through ALU, store back in register
• Store
– Write register to memory location
[Figure: processor block diagram. A register holding 10 is passed through the ALU (+1) to give 11; memory holds the values 10 and 11]
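As a rough illustration, the three datapath operations can be modeled in C. The `datapath_t` type, the `dp_*` function names, the four-register file and the 256-byte memory are all assumptions made for this sketch; they do not come from the slides.

```c
#include <stdint.h>

/* Hypothetical model: 4 registers and a 256-byte memory (sizes are assumptions). */
typedef struct {
    uint8_t reg[4];
    uint8_t mem[256];
} datapath_t;

/* Load: read a memory location into a register. */
void dp_load(datapath_t *dp, int r, uint8_t addr)
{
    dp->reg[r] = dp->mem[addr];
}

/* ALU operation: pass a register through the ALU (here, +1) and
 * store the result back in a register. */
void dp_inc(datapath_t *dp, int dst, int src)
{
    dp->reg[dst] = (uint8_t)(dp->reg[src] + 1);
}

/* Store: write a register to a memory location. */
void dp_store(datapath_t *dp, uint8_t addr, int r)
{
    dp->mem[addr] = dp->reg[r];
}
```

Calling `dp_load`, `dp_inc` and `dp_store` in that order moves a value from memory, through the ALU, and back to memory. The caller is responsible for passing register indices in the range 0 to 3.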
Control Unit
• Control unit: configures the datapath operations
– Sequence of desired operations (“instructions”) stored in memory: the “program”
• Instruction cycle: broken into several sub-operations, each one clock cycle, e.g.:
– Fetch: Get next instruction into IR
– Decode: Determine what the instruction means
– Fetch operands: Move data from memory to datapath register
– Execute: Move data through the ALU
– Store results: Write data from register to memory
[Figure: processor block diagram with registers R0 and R1. Program memory holds 100: load R0, M[500]; 101: inc R1, R0; 102: store M[501], R1. Data memory location 500 holds 10]
Control Unit Sub-Operations
• Fetch
– Get next instruction into IR
– PC: program counter, always points to next instruction
– IR: holds the fetched instruction
[Figure: PC = 100 and IR holds "load R0, M[500]". Program memory holds 100: load R0, M[500]; 101: inc R1, R0; 102: store M[501], R1. Data memory location 500 holds 10]
Control Unit Sub-Operations
• Decode
– Determine what the instruction means
[Figure: PC = 100 and IR holds "load R0, M[500]"; the controller interprets the instruction. Data memory location 500 holds 10]
Control Unit Sub-Operations
• Fetch operands
– Move data from memory to datapath register
[Figure: PC = 100 and IR holds "load R0, M[500]"; the value 10 from memory location 500 now appears in R0]
Control Unit Sub-Operations
• Execute
– Move data through the ALU
– This particular instruction does nothing during this sub-operation
[Figure: PC = 100 and IR holds "load R0, M[500]"; R0 holds 10 and memory location 500 holds 10]
Control Unit Sub-Operations
• Store results
– Write data from register to memory
– This particular instruction does nothing during this sub-operation
[Figure: PC = 100 and IR holds "load R0, M[500]"; R0 holds 10 and memory location 500 holds 10]
Instruction Cycles
[Figure: timing diagram. The instruction at PC=100 (load R0, M[500]) takes five clock cycles: fetch, decode, fetch operands, execute, store results. Afterwards R0 holds 10]
Instruction Cycles
[Figure: timing diagram. After the five clock cycles for PC=100, the instruction at PC=101 (inc R1, R0) takes another five: fetch, decode, fetch operands, execute, store results. The ALU adds 1 to R0 (10), so R1 holds 11]
Instruction Cycles
[Figure: timing diagram. After the cycles for PC=100 and PC=101, the instruction at PC=102 (store M[501], R1) takes another five clock cycles: fetch, decode, fetch operands, execute, store results. R0 holds 10, R1 holds 11, and memory now holds 10 at location 500 and 11 at location 501]
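The three instruction cycles can be traced with a small C model. This is only a sketch: the `instr_t` encoding, the `cpu_t` layout, the 1024-word data memory, and the choice to increment PC during the fetch step are assumptions for illustration, not details given on the slides.

```c
/* Hypothetical encoding of the three instructions used on the slides. */
typedef enum { OP_LOAD, OP_INC, OP_STORE } opcode_t;

typedef struct {
    opcode_t op;
    int rd;      /* destination register (LOAD, INC)   */
    int rs;      /* source register (INC, STORE)       */
    int addr;    /* data-memory address (LOAD, STORE)  */
} instr_t;

typedef struct {
    int pc;          /* program counter                              */
    instr_t ir;      /* instruction register: fetched instruction    */
    int reg[2];      /* R0 and R1                                    */
    int data[1024];  /* data memory (size is an assumption)          */
    long cycles;     /* clock cycles elapsed                         */
} cpu_t;

/* Run one instruction cycle. `program[0]` is assumed to live at address
 * `base`. Each of the five sub-operations is charged one clock cycle. */
void instruction_cycle(cpu_t *cpu, const instr_t *program, int base)
{
    /* Fetch: get the next instruction into IR, then advance PC. */
    cpu->ir = program[cpu->pc - base];
    cpu->pc += 1;
    cpu->cycles++;

    /* Decode: determine what the instruction means
     * (modeled by the opcode tests below). */
    cpu->cycles++;

    /* Fetch operands: move data from memory to a datapath register. */
    if (cpu->ir.op == OP_LOAD)
        cpu->reg[cpu->ir.rd] = cpu->data[cpu->ir.addr];
    cpu->cycles++;

    /* Execute: move data through the ALU. */
    if (cpu->ir.op == OP_INC)
        cpu->reg[cpu->ir.rd] = cpu->reg[cpu->ir.rs] + 1;
    cpu->cycles++;

    /* Store results: write data from a register to memory. */
    if (cpu->ir.op == OP_STORE)
        cpu->data[cpu->ir.addr] = cpu->reg[cpu->ir.rs];
    cpu->cycles++;
}
```

Starting with PC = 100 and the value 10 at data address 500, three calls execute load, inc and store in turn. As on the slides, the load instruction does nothing in the execute and store-results steps.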
Architectural Considerations
• N-bit processor
– N-bit ALU, registers, buses, memory data interface
– Embedded: 8-bit, 16-bit, 32-bit common
– Desktop/servers: 32-bit, even 64-bit
• PC size determines address space
Architectural Considerations
• Clock frequency
– Inverse of clock period
– Clock period must be longer than the longest register-to-register delay in the entire processor
– Memory access is often the longest
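The relationship between path delays and clock frequency can be sketched as a small calculation. The function name `max_clock_mhz` and the choice of nanoseconds and megahertz as units are assumptions for this example. Since the period must be longer than the longest delay, the returned value is an upper bound that a real design would stay below.

```c
/* Upper bound on the clock frequency in MHz, given the register-to-register
 * path delays in nanoseconds. Returns 0.0 if no positive delay is supplied. */
double max_clock_mhz(const double *delay_ns, int n)
{
    double longest = 0.0;

    /* The slowest path sets the minimum clock period. */
    for (int i = 0; i < n; i++) {
        if (delay_ns[i] > longest)
            longest = delay_ns[i];
    }
    if (longest <= 0.0)
        return 0.0;

    /* Frequency is the inverse of the period: f [MHz] = 1000 / T [ns]. */
    return 1000.0 / longest;
}
```

For example, with path delays of 2 ns, 5 ns and 10 ns (the last standing in for a memory access), the bound is 1000 / 10 = 100 MHz.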
Pipelining: Increasing Instruction Throughput
[Figure: non-pipelined vs. pipelined dish cleaning (stages: wash, dry; dishes 1 to 8 plotted against time), and pipelined instruction execution (stages: fetch instruction, decode, fetch operands, execute, store results; instructions 1 to 8 plotted against time)]
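The throughput gain from pipelining can be estimated with a short calculation. The sketch below assumes every stage takes exactly one time unit and that the pipeline never stalls; both function names are hypothetical.

```c
/* Time units to process n items through s stages without pipelining:
 * every stage of every item is done one after another. */
long nonpipelined_time(long n, long s)
{
    return n * s;
}

/* Time units with pipelining: the first item needs s units to pass through
 * all stages, and each later item finishes one unit after the previous one. */
long pipelined_time(long n, long s)
{
    return (n > 0) ? s + (n - 1) : 0;
}
```

Under these assumptions, 8 dishes through 2 stages (wash, dry) take 16 time units without pipelining and 9 with it. 8 instructions through the 5 instruction-cycle stages take 40 units without pipelining and 12 with it.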
Superscalar and VLIW Architectures
• Performance can be improved by:
– Faster clock (but there’s a limit)
– Pipelining: slice up instruction into stages, overlap stages
– Multiple ALUs to support more than one instruction stream
• Superscalar
– Scalar: non-vector operations
– Fetches instructions in batches, executes as many as possible
• May require extensive hardware to detect independent instructions
– VLIW: each word in memory has multiple independent instructions
• Relies on the compiler to detect and schedule instructions
• Currently growing in popularity
Two Memory Architectures
• Princeton
– Fewer memory wires
• Harvard
– Simultaneous program and data memory access
[Figure: Harvard: processor connected to separate program memory and data memory. Princeton: processor connected to a single memory (program and data)]
Cache Memory
• Memory access may be slow
• Cache is small but fast memory close to processor
– Holds copy of part of memory
– Hits and misses
[Figure: processor, cache, memory. Cache: fast/expensive technology, usually on the same chip. Memory: slower/cheaper technology, usually on a different chip]
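Hits and misses can be made concrete with a toy cache model. The slides do not say how the cache is organized; the sketch below assumes a direct-mapped cache with four one-byte lines, and the `cache_t` type and `cache_read` function are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINES 4   /* assumption: tiny direct-mapped cache, one byte per line */

typedef struct {
    bool     valid[CACHE_LINES];   /* does the line hold a copy of memory?   */
    uint32_t tag[CACHE_LINES];     /* which address the line was filled from */
    uint8_t  data[CACHE_LINES];    /* the cached copy                        */
    unsigned hits, misses;
} cache_t;

/* Read one byte through the cache. On a miss the line is filled from
 * the (slower) memory, replacing whatever was there before. */
uint8_t cache_read(cache_t *c, const uint8_t *memory, uint32_t addr)
{
    uint32_t index = addr % CACHE_LINES;   /* line this address maps to */
    uint32_t tag   = addr / CACHE_LINES;   /* identifies the address    */

    if (c->valid[index] && c->tag[index] == tag) {
        c->hits++;                          /* hit: copy already in cache */
    } else {
        c->misses++;                        /* miss: fetch from memory    */
        c->valid[index] = true;
        c->tag[index]   = tag;
        c->data[index]  = memory[addr];
    }
    return c->data[index];
}
```

Reading the same address twice in a row gives a miss followed by a hit. Reading a second address that maps to the same line (for example 5 and then 9) evicts the first, so returning to the first address misses again.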
Programmer’s View
• Programmer doesn’t need detailed understanding of architecture
– Instead, needs to know what instructions can be executed
• Two levels of instructions:
– Assembly level
– Structured languages (C, C++, Java, etc.)
• Most development today done using structured languages
– But, some assembly level programming may still be necessary
– Drivers: portion of program that communicates with and/or controls
(drives) another device
• Often have detailed timing considerations, extensive bit manipulation
• Assembly level may be best for these
Assembly-Level Instructions
Instruction 1: opcode | operand1 | operand2
Instruction 2: opcode | operand1 | operand2
Instruction 3: opcode | operand1 | operand2
Instruction 4: opcode | operand1 | operand2
...
• Instruction Set
– Defines the legal set of instructions for that processor
• Data transfer: memory/register, register/register, I/O, etc.
• Arithmetic/logical: move register through ALU and back
• Branches: determine next PC value when not just PC+1
A Simple (Trivial) Instruction Set
Assembly instruct. | First byte | Second byte | Operation
MOV Rn, direct | 0000 Rn | direct | Rn = M(direct)
MOV direct, Rn | 0001 Rn | direct | M(direct) = Rn
MOV @Rn, Rm | 0010 Rn | Rm | M(Rn) = Rm
MOV Rn, #immed. | 0011 Rn | immediate | Rn = immediate
ADD Rn, Rm | 0100 Rn | Rm | Rn = Rn + Rm
SUB Rn, Rm | 0101 Rn | Rm | Rn = Rn - Rm
JZ Rn, relative | 0110 Rn | relative | PC = PC + relative (only if Rn is 0)
The 4-bit field at the start of the first byte is the opcode; the remaining fields are operands.
Addressing Modes
Addressing mode | Operand field | Register-file contents | Memory contents
Immediate | Data | - | -
Register-direct | Register address | Data | -
Register indirect | Register address | Memory address | Data
Direct | Memory address | - | Data
Indirect | Memory address | - | Memory address, which in turn points to Data
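One way to read the table is as a lookup rule that turns an operand field into the data it refers to. The sketch below assumes 8-bit fields, a register file and a 256-byte memory; the `addr_mode_t` enum and `resolve_operand` function are hypothetical names.

```c
#include <stdint.h>

typedef enum {
    IMMEDIATE, REGISTER_DIRECT, REGISTER_INDIRECT, DIRECT, INDIRECT
} addr_mode_t;

/* Return the data that an operand field refers to under a given mode.
 * `reg` is the register file and `mem` is a 256-byte memory. */
uint8_t resolve_operand(addr_mode_t mode, uint8_t field,
                        const uint8_t *reg, const uint8_t *mem)
{
    switch (mode) {
    case IMMEDIATE:         return field;            /* field is the data            */
    case REGISTER_DIRECT:   return reg[field];       /* register holds the data      */
    case REGISTER_INDIRECT: return mem[reg[field]];  /* register holds an address    */
    case DIRECT:            return mem[field];       /* field is a memory address    */
    case INDIRECT:          return mem[mem[field]];  /* memory holds another address */
    }
    return 0;   /* not reached for valid modes */
}
```

For the two register modes the caller must make sure `field` is a valid register number. Note that the indirect modes cost an extra lookup compared with their direct counterparts.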
Sample Programs
C program

int total = 0;
for (int i=10; i!=0; i--)
    total += i;
// next instructions...

Equivalent assembly program

0        MOV R0, #0;    // total = 0
1        MOV R1, #10;   // i = 10
2        MOV R2, #1;    // constant 1
3        MOV R3, #0;    // constant 0
Loop: 4  JZ R1, Next;   // Done if i=0
5        ADD R0, R1;    // total += i
6        SUB R1, R2;    // i--
7        JZ R3, Loop;   // Jump always
Next:    // next instructions...
• Try some others
– Handshake: Wait until the value of M[254] is not 0, set M[255] to 1, wait
until M[254] is 0, set M[255] to 0 (assume those locations are ports).
– (Harder) Count the occurrences of zero in an array stored in memory
locations 100 through 199.
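To check that the assembly loop computes the same total as the C program, the eight instructions can be hand-assembled and interpreted. The sketch below uses the opcodes from the simple instruction set. It assumes that Rm sits in the upper nibble of the second byte and that a taken JZ adds its signed offset to the address of the following instruction. The names `sample` and `run_sample` are hypothetical.

```c
#include <stdint.h>

/* Hand-assembled sample program: two bytes per instruction. */
static const uint8_t sample[] = {
    0x30, 0x00,   /* 0: MOV R0, #0    total = 0             */
    0x31, 0x0A,   /* 1: MOV R1, #10   i = 10                */
    0x32, 0x01,   /* 2: MOV R2, #1    constant 1            */
    0x33, 0x00,   /* 3: MOV R3, #0    constant 0            */
    0x61, 0x03,   /* 4: JZ  R1, Next  target 4 + 1 + 3 = 8  */
    0x40, 0x10,   /* 5: ADD R0, R1    total += i            */
    0x51, 0x20,   /* 6: SUB R1, R2    i--                   */
    0x63, 0xFC    /* 7: JZ  R3, Loop  target 7 + 1 - 4 = 4  */
};

/* Interpret the register-only subset used above. Returns the final value
 * of R0, or -1 on an opcode this sketch does not handle. */
int run_sample(const uint8_t *code, int num_bytes)
{
    uint8_t reg[16] = {0};
    int pc = 0;

    while (pc < num_bytes / 2) {
        uint8_t fb = code[2 * pc];
        uint8_t sb = code[2 * pc + 1];
        int rn = fb & 0x0f;

        pc++;   /* PC now points to the following instruction */
        switch (fb >> 4) {
        case 3: reg[rn] = sb; break;                                 /* MOV Rn, #immed. */
        case 4: reg[rn] = (uint8_t)(reg[rn] + reg[sb >> 4]); break;  /* ADD Rn, Rm      */
        case 5: reg[rn] = (uint8_t)(reg[rn] - reg[sb >> 4]); break;  /* SUB Rn, Rm      */
        case 6: if (reg[rn] == 0) pc += (int8_t)sb; break;           /* JZ Rn, relative */
        default: return -1;
        }
    }
    return reg[0];
}
```

Calling `run_sample(sample, sizeof sample)` returns 55, the sum 10 + 9 + ... + 1 that the C loop leaves in `total`.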
Programmer Considerations
• Program and data memory space
– Embedded processors often very limited
• e.g., 64 Kbytes program, 256 bytes of RAM (expandable)
• Registers: How many are there?
– Only a direct concern for assembly-level programmers
• I/O
– How to communicate with external signals?
• Interrupts
Microprocessor Architecture Overview
• If you are using a particular microprocessor, now is a
good time to review its architecture
Example: parallel port driver
LPT Connection Pin | I/O Direction | Register Address
1 | Output | 0th bit of register #2
2-9 | Output | 0th through 7th bit of register #0
10,11,12,13,15 | Input | 6,7,5,4,3th bit of register #1
14,16,17 | Output | 1,2,3th bit of register #2
[Figure: a switch is connected to pin 13 and an LED to pin 2 of the PC's parallel port]
• Using assembly language programming we can configure a PC parallel port to perform digital I/O
– Write and read to three special registers to accomplish this; the table provides a list of parallel port connector pins and corresponding register locations
– Example: parallel port monitors the input switch and turns the LED on/off accordingly
Parallel Port Example
; This program consists of a sub-routine that reads
; the state of the input pin, determining the on/off state
; of our switch and asserts the output pin, turning the LED
; on/off accordingly
        .386

CheckPort   proc
        push    ax              ; save the content
        push    dx              ; save the content
        mov     dx, 3BCh + 1    ; base + 1 for register #1
        in      al, dx          ; read register #1
        and     al, 10h         ; mask out all but bit # 4
        cmp     al, 0           ; is it 0?
        jne     SwitchOn        ; if not, we need to turn the LED on

SwitchOff:
        mov     dx, 3BCh + 0    ; base + 0 for register #0
        in      al, dx          ; read the current state of the port
        and     al, 0FEh        ; clear first bit (masking)
        out     dx, al          ; write it out to the port
        jmp     Done            ; we are done

SwitchOn:
        mov     dx, 3BCh + 0    ; base + 0 for register #0
        in      al, dx          ; read the current state of the port
        or      al, 01h         ; set first bit (masking)
        out     dx, al          ; write it out to the port

Done:
        pop     dx              ; restore the content
        pop     ax              ; restore the content
CheckPort   endp

extern "C" void CheckPort(void);   // defined in assembly

int main(void) {
    while( 1 ) {
        CheckPort();
    }
}

[Figure: switch on pin 13 and LED on pin 2 of the PC's parallel port]
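The masking logic of the subroutine can also be expressed in C, leaving out the actual port I/O. This is a sketch only: `next_led_state` and the two mask names are hypothetical, and the register values are passed in as plain bytes rather than read from hardware.

```c
#include <stdint.h>

#define SWITCH_MASK 0x10u   /* bit 4 of register #1 (pin 13, the switch) */
#define LED_MASK    0x01u   /* bit 0 of register #0 (pin 2, the LED)     */

/* Given the value read from register #1 and the current value of
 * register #0, return the value to write back to register #0. */
uint8_t next_led_state(uint8_t status_reg, uint8_t data_reg)
{
    if (status_reg & SWITCH_MASK)
        return (uint8_t)(data_reg | LED_MASK);            /* switch on: set bit 0    */
    return (uint8_t)(data_reg & (uint8_t)~LED_MASK);      /* switch off: clear bit 0 */
}
```

Only bit 0 of the data register changes; the other seven output pins keep their current state, which is why the routine reads the port before writing it.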
Operating System
• Optional software layer
providing low-level services to
a program (application).
– File management, disk access
– Keyboard/display interfacing
– Scheduling multiple programs for
execution
• Or even just multiple threads from
one program
– Program makes system calls to
the OS
DB file_name "out.txt"    -- store file name

MOV R0, 1324              -- system call "open" id
MOV R1, file_name         -- address of file-name
INT 34                    -- cause a system call
JZ  R0, L1                -- if zero -> error

. . . read the file
JMP L2                    -- bypass error cond.
L1:
. . . handle the error
L2:
Development Environment
• Development processor
– The processor on which we write and debug our programs
• Usually a PC
• Target processor
– The processor that the program will run on in our embedded
system
• Often different from the development processor
[Figure: development processor and target processor]
Software Development Process
• Compilers
– Cross compiler
• Runs on one processor, but generates code for another
• Assemblers
• Linkers
• Debuggers
• Profilers
[Figure: implementation phase: C files pass through the compiler and assembly files through the assembler to produce binary files, which the linker combines with a library into an executable file. Verification phase: debugger and profiler]
Running a Program
• If development processor is different than target, how
can we run our compiled code? Two options:
– Download to target processor
– Simulate
• Simulation
– One method: Hardware description language
• But slow, not always available
– Another method: Instruction set simulator (ISS)
• Runs on development processor, but executes instructions of target
processor
Instruction Set Simulator For A Simple
Processor
#include <stdio.h>

typedef struct {
    unsigned char first_byte, second_byte;
} instruction;

instruction program[1024];   // instruction memory
unsigned char memory[256];   // data memory

// Execute the loaded program; returns 0 on success, -1 on an unknown opcode.
int run_program(int num_bytes) {
    int pc = -1;
    unsigned char reg[16] = {0}, fb, sb;

    while( ++pc < (num_bytes / 2) ) {
        fb = program[pc].first_byte;
        sb = program[pc].second_byte;
        switch( fb >> 4 ) {
            case 0: reg[fb & 0x0f] = memory[sb]; break;
            case 1: memory[sb] = reg[fb & 0x0f]; break;
            case 2: memory[reg[fb & 0x0f]] = reg[sb >> 4]; break;
            case 3: reg[fb & 0x0f] = sb; break;
            case 4: reg[fb & 0x0f] += reg[sb >> 4]; break;
            case 5: reg[fb & 0x0f] -= reg[sb >> 4]; break;
            // JZ: branch only if Rn is 0; the relative offset is signed
            case 6: if( reg[fb & 0x0f] == 0 ) pc += (signed char)sb; break;
            default: return -1;
        }
    }
    return 0;
}

// Assumed helper (not defined in the original listing): dump data memory,
// 16 bytes per row.
void print_memory_contents(void) {
    for( int i = 0; i < 256; i++ )
        printf("%02x%c", memory[i], (i % 16 == 15) ? '\n' : ' ');
}

int main(int argc, char *argv[]) {
    FILE* ifs;
    int num_bytes;

    if( argc != 2 || (ifs = fopen(argv[1], "rb")) == NULL ) {
        return -1;
    }
    num_bytes = (int)fread(program, 1, sizeof(program), ifs);
    fclose(ifs);
    if( run_program(num_bytes) == 0 ) {
        print_memory_contents();
        return 0;
    }
    return -1;
}
Testing and Debugging
• ISS
– Gives us control over time: set breakpoints, look at register values, set values, step-by-step execution, ...
– But, doesn’t interact with real environment
• Download to board
– Use device programmer
– Runs in real environment, but not controllable
• Compromise: emulator
– Runs in real environment, at speed or near
– Supports some controllability from the PC
[Figure: (a) implementation phase followed by verification phase, both on the development processor using a debugger/ISS; (b) implementation phase followed by verification phase using an emulator, a programmer, and external tools]