
Synchronous Reactive Communication:
Generalization, Implementation, and Optimization
Guoqiang Gerald Wang
PhD Candidate
Committee in charge:
Prof. Alberto Sangiovanni-Vincentelli, Chair
Prof. Robert K. Brayton
Prof. Zuojun Shen
December 9, 2008
Outline
• Background and previous work
• Methodology and implementation framework
• SR semantics preserving communication protocols
• Protocol implementation under OSEK OS standard
• Memory optimization under timing constraints
• Automatic code generation
• Conclusions
System-Level Design
• On the edge of a revolution in the way
electronic products are designed
– Increasing design complexity
– High product quality
– Stronger market pressure
• System-level design is the key to success
– Start with the highest possible level of
abstraction (e.g. control algorithm)
– Establish properties at the right level
– Use formal models
– Leverage multiple scientific disciplines
Background
Models of Computation
• Process Networks (PN)
– Undecidable: deadlocks and buffer boundedness
• Synchronous Reactive (SR)
– With respect to model time:
• Computation takes zero time
• All actors execute simultaneously and instantaneously
– Has strong formal properties: decidable termination & boundedness
– Good for specifying periodic real-time tasks
• Discrete Event (DE)
– Signals are time-stamped events
– Events are processed in chronological order
– Good for modeling and design of time-based systems
• Dataflow process networks
– Dataflow processes are Kahn processes composed of atomic firings
Implementation Tools
• Academia (both support multiple models of computation):
– MetroPolis (ASV, UCB)
• Meta model
• Communication refinement
– Ptolemy (E. Lee, UCB)
• Started with static dataflow for DSP
• Mixed model verification
• Industry:
– LabVIEW
• Statically schedulable dataflow (Synchronous dataflow)
– Simulink
• Synchronous reactive
Model-Based Design
• An instance of system-level design
• Very popular
– Tool support
• Design time simulation/verification
– Short development cycle
– Predictable performance
– Flexibility of new design evaluation
– Design reuse
• The rest of the presentation
– Synchronous reactive model-based design
SR Semantics
[Figure: writer Mickey and reader Donald on a simulation-time axis (t0, t1), with priority_m > priority_d and period_m < period_d; Mickey sends $5]
Mickey (writer): "Time for me to run." "Donald, I am sending $5 and an apple." "Now I am done."
Donald (reader): "My turn to run!" "Get $5 and an apple. Thanks, Mickey!" "I finished."
During simulation, writer and reader respond instantaneously
(computation takes zero time)
Implementation Options of Multi-Rate Systems
• Single-task option
– Runs at the base rate; marked unschedulable in the example
• Multiple-task option
[Figure: execution timelines of both options over 0 to 16 time units; labels: Block 1, Block 2, Task 1, Task 2]

Task      Period/Deadline (unit)   Exe time (unit)
Task 1    8                        2
Task 2    16                       6 -> 6.1

Option          Pros                 Cons
Single task     Easy to construct    Poor resource utilization
Multiple task   Better scheduling    Tricky due to preemption
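One reading of the example above can be checked numerically. The sketch below is an illustration under stated assumptions, not the analysis used in the talk: it assumes the single-task option must finish both blocks within the base period of 8 units, and it uses the classic response-time iteration for the multiple-task option with deadlines equal to periods. The function names are hypothetical, and times are scaled by 10 so that 6.1 becomes the integer 61.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-task option (assumed model): all blocks released at the same
 * base-rate tick must finish within the base period, so their execution
 * times simply add up. */
bool single_task_schedulable(long base_period, const long *exe, int n) {
    long total = 0;
    for (int i = 0; i < n; i++) total += exe[i];
    return total <= base_period;
}

/* Multiple-task option: response-time iteration for fixed-priority
 * preemptive scheduling. Tasks are listed by decreasing priority and
 * each deadline equals the period. */
bool multi_task_schedulable(const long *period, const long *exe, int n) {
    for (int i = 0; i < n; i++) {
        assert(period[i] > 0);
        long r = exe[i], prev = -1;
        while (r != prev && r <= period[i]) {
            prev = r;
            r = exe[i];
            /* add interference from every higher-priority task */
            for (int j = 0; j < i; j++)
                r += ((prev + period[j] - 1) / period[j]) * exe[j];
        }
        if (r > period[i]) return false;
    }
    return true;
}
```

With periods {80, 160} and execution times {20, 61}, the single-task check fails (81 > 80) while the multiple-task check passes, which matches the "6 -> 6.1" annotation under these assumptions.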
Model Implementation (Case I)
• Uni-processor and priority-based preemptive scheduling
• Communication is not atomic
[Figure: real-time axis t0 to t5, with priority_m > priority_d and period_m < period_d; Mickey sends $5, then $1]
Mickey (writer), first run: "Time for me to run." "Donald, I am sending $5 and an apple." "Now I am done."
Mickey (writer), second run: "Time for me to run again." "Donald, I send $1 and a banana this time." "Now I am done."
Donald (reader): "Now, I can run. But need to wait." "My turn to start." "I need to yield to Mickey." "I can resume." "Start receiving." "Get $1 and a banana." "I finished."
Model Implementation (Case II)
[Figure: real-time axis t0 to t5; Mickey sends $5, then $1]
Mickey (writer), first run: "Time for me to run." "Donald, I am sending $5 and an apple." "Now I am done."
Mickey (writer), second run: "Time for me to run again." "Donald, I send $1 and a banana this time." "Now I am done."
Donald (reader): "Now, I can run. But need to wait." "My turn to start." "Start receiving. Got $5! Not finished yet. Have to yield." "Resume receiving. A banana. Wow, $5 and a banana!!" "I finished."
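What Is the Difference
[Figure: data received by the reader under SR semantics (simulation time) and in Case I and Case II (real time)]
• SR semantics: the reader receives $5
• Case I: the reader receives $1 instead of $5 (data determinism problem)
• Case II: the reader receives part of the first message ($5) and part of the second ($1) (data integrity problem)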
Current Approach Limitations and Solutions
• Rate transition buffering scheme from The MathWorks
• Limitations
– One to one communication
• Double buffering scheme
• No memory optimization
– For periodic communicating tasks:
• Periods must be harmonic
• Tasks must be activated with the same phase
• Solutions
– One to many communication
• Dynamic buffering and temporal concurrency control protocols
– For periodic communicating tasks:
• Raise the implementation up to the kernel level
• Furthermore,
– Generalization with support of arbitrary link delay and multiple activations per task
– Memory optimization through automatic protocol selection
Methodology
• Platform-based design
– Automatic generation of
• application tasks
• communication protocol implementation
– Automatic configuration of RTOS
procedures and data structures
– Flexibility of choice of RTOS API standard
Methodology
Meet-in-the-Middle Approach
• Application domain
– Time-critical applications
– Modeled as SR tasks
• Platform
– Task/resource model platform
• Task’s characteristics and interaction
– RTOS platform
• Priority-based scheduling policies
• Inter-task comm. protocols: lock-based, lock-free, wait-free
• Middle meeting point
– RTOS API
• OSEK/VDX, POSIX, μITRON
• Execution architecture
– Uni-processor
– Priority-based preemptive scheduling
[Figure: the Application (Functional Model) is mapped down onto the Task Model and Communication Resource Model; the Execution Architecture (Scheduling Policy: FP, DP; Communication Resource Management Policy) is exported up through the RTOS API]
Design Flow
[Figure: design flow. Specification of SR models -> model-based design tool -> C code and application configuration files (OIL) -> System Generator (SG) -> files produced by SG (C code). The generated C code, the user's source code (C code), and the SG files go to the compiler; the linker combines the result with the object libraries (OSEK OS kernel, OSEK COM) into the executable file]
Task Model τi
[Figure: task $\tau_i$ with $NIP$ input ports and $NOP$ output ports; timeline of instance $j$ marking $a_i(j)$, $s_i(j)$, $f_i(j)$ and the intervals $C_i$, $R_i$, $d_i$]
• Parameter characterization
– Priority: $\pi_i$
– Period: $T_i$
– Activation time: $a_i(j)$
– Start time: $s_i(j)$
– Worst-case computation time: $C_i$
– Finish time: $f_i(j)$
– Worst-case response time: $R_i$
– Relative deadline: $d_i$
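SR Semantics with Link Delay
• Latest instance of task $i$ activated by time $t$: $\sigma_i(t) = \sup\{\, m \mid a_i(m) \le t \,\}$
• Reader instance $j$ of $r_i$ reads the output of writer instance $k$: $in_{r_i}(j) = out_w(k)$
– if $delay[i] = 0$: $k = \sigma_w(a_i(j))$
– if $delay[i] = 1$: $k = \max(0,\ \sigma_w(a_i(j)) - 1)$
– for any $delay[i]$: $k = \max(0,\ \sigma_w(a_i(j)) - delay[i])$
[Figure: timelines of writer $w$ (out) and reader $r_i$ (in), marking $a_i(j)$ and the writer instances $\sigma_w(a_i(j))$, $\sigma_w(a_i(j)) - 1$, ..., $\sigma_w(a_i(j)) - delay[i]$]
SR Protocols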
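The link-delay rule can be tried on concrete periods. The sketch below is a minimal illustration, assuming strictly periodic tasks with zero phase, so that instance m of the writer is activated at time m * Tw and instance j of the reader at j * Tr, with instances counted from 0. Under that assumption the latest writer instance activated at or before time t is simply t / Tw. The function names are hypothetical.

```c
#include <assert.h>

/* Index of the latest writer instance activated at or before time t,
 * assuming a_w(m) = m * Tw (zero phase, instances counted from 0). */
long latest_writer_instance(long t, long Tw) {
    assert(t >= 0 && Tw > 0);
    return t / Tw;
}

/* Writer instance k whose output reader instance j must see:
 * the latest writer instance at the reader's activation, moved back
 * by the link delay and clamped at 0. */
long writer_instance_for_reader(long j, long Tr, long Tw, long delay) {
    assert(j >= 0 && Tr > 0 && delay >= 0);
    long k = latest_writer_instance(j * Tr, Tw) - delay;
    return k > 0 ? k : 0;
}
```

For example, with Tw = 2 and Tr = 5, reader instance 1 is activated at time 5, when writer instance 2 is the latest one; with a unit delay it reads instance 1 instead.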
One to Many Communication:
Single-Writer Multiple-Reader
[Figure: a single writer $w$ with higher-priority readers (HPR) $M_1, M_2, \dots, M_q$ and lower-priority readers (LPR) $N_0, N_1, \dots, N_p$]
• Link delay: design parameter
• General case: d R   T and any 
• Special case: d  T
and   1
SR Semantics Preserving
Communication mechanisms
• Wait-free scheme
• Buffer sizing mechanisms
– Spatially-out-of-order writes
– Spatially-in-order writes
• Buffer indexing protocols
– Dynamic Buffering Protocol (DBP)
– Temporal Concurrency Control Protocol (TCCP)
Spatially-out-of-Order Writes
• Buffer sizing:
NB = NLPR + 1 + 1
[Figure: buffer array Buf[] and reader index array Read[], showing the writer's prev and cur slots and the slots held by LPR and HPR]
Spatially-in-Order Writes
• Offset
$o_{wi} = a_i(k) - a_w(j)$, where $j = \sup\{\, m \mid a_w(m) \le a_i(k) \,\}$
$O_{wi} \le T_w$
• Buffer sizing
lifetime $l_i = delay[i] \cdot T_w + O_{wi} + R_i$
$NB = \max_{1 \le i \le NR} \lceil l_i / T_w \rceil$
[Figure: timeline of writer instances $k$, $k+1$ and buffer indices $n-1$, $n$, $n+1$, marking the reader's activation and finish, the offset $O_{wi}$, the unit delay, $R_i$, $d_i$, and $T_w$]
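The sizing rule for spatially-in-order writes can be evaluated directly. The sketch below is a minimal illustration, assuming the lifetime of the data read by reader i is delay[i] * Tw + Owi + Ri and that the number of buffers is the largest lifetime divided by the writer period, rounded up. The function names and the sample numbers are hypothetical.

```c
#include <assert.h>

/* Lifetime of the data read by one reader: link delay (in writer
 * periods) plus worst-case offset plus worst-case response time. */
long lifetime(long delay, long Tw, long Owi, long Ri) {
    assert(delay >= 0 && Tw > 0 && Owi >= 0 && Ri >= 0);
    return delay * Tw + Owi + Ri;
}

/* NB = max over all readers of ceil(l_i / Tw). */
long buffers_in_order(long Tw, const long *delay, const long *Owi,
                      const long *Ri, int nr) {
    long nb = 0;
    for (int i = 0; i < nr; i++) {
        long l = lifetime(delay[i], Tw, Owi[i], Ri[i]);
        long need = (l + Tw - 1) / Tw;   /* integer ceiling */
        if (need > nb) nb = need;
    }
    return nb;
}
```

With Tw = 10, a reader with no delay, offset 3, and response time 4 needs one buffer, while a reader with unit delay, offset 0, and response time 15 needs three, so three buffers are allocated.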
How to Guarantee SR Semantics
• Handle communication at two levels
– Buffer indices defined at activation time
by kernel
– Data reading/writing at execution time by
application tasks
The Protocols

Temporal Concurrency Control Protocol (TCCP)

Reader i
/* activation time */
if (delay[i])
    Read[i] = prev;
else
    Read[i] = cur;
/* execution time */
···
··· = Buf[Read[i]];
···

Writer
/* activation time */
prev = cur;
cur = FindFreeT();
/* execution time */
···
Buf[cur] = ···
···

Def of FindFreeT()
char FindFreeT(void) {
    return (cur+1) % NB;
} /* O(1) */

Constant Time Dynamic Buffer Protocol (CTDBP)

[Figure: free-list head FreeHd and the array UseFreeL[6], indices 0 to 5]

Reader i
/* activation time */
if (delay[i])
    Read[i] = prev;
else
    Read[i] = cur;
if (isHPR[i] == 0)
    UseFreeL[Read[i]]++;
/* execution time */
···
··· = Buf[Read[i]];
···
/* termination time (CS) */
if (isHPR[i] == 0)
    UseDec(Read[i]);

Writer
/* activation time */
UseDec(prev);
prev = cur;
cur = FindFreeC();
UseFreeL[cur] = 1;
/* execution time */
···
Buf[cur] = ···
···

Def of FindFreeC()
char FindFreeC(void) {
    tmp = FreeHd;
    FreeHd = UseFreeL[tmp];
    return tmp;
} /* O(1) */

Def of UseDec()
void UseDec(int j) {
    UseFreeL[j]--;
    if (UseFreeL[j] == 0) {
        UseFreeL[j] = FreeHd;
        FreeHd = j;
    }
}
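The CTDBP bookkeeping can be exercised in isolation. The sketch below is a compilable illustration of the free-list idea, not the generated kernel code: UseFreeL[] stores a use count while a slot is in use and the next-free link while it is free. The initial state (slot 0 held twice, as both cur and prev), the wrapper names writer_activate, reader_activate, and reader_terminate, and the sizes NB = 3 and NR = 1 are assumptions made for the example.

```c
#include <assert.h>

#define NB 3   /* assumed: one lower-priority reader, so NLPR + 1 + 1 = 3 */
#define NR 1   /* assumed number of readers */

int UseFreeL[NB];     /* use count while in use, next-free link while free */
int FreeHd;           /* head of the free list, -1 when the list is empty */
int cur, prev;        /* slots holding the writer's current and previous output */
int Read[NR];         /* slot assigned to each reader at activation time */
int delay[NR] = {0};  /* link delay per reader (0 or 1) */
int isHPR[NR] = {0};  /* 1 if the reader has higher priority than the writer */

/* Hypothetical initial state: slot 0 holds the initial value and is held
 * twice (as cur and as prev); the other slots are chained on the free list. */
void ctdbp_init(void) {
    cur = prev = 0;
    UseFreeL[0] = 2;
    for (int j = 1; j < NB; j++)
        UseFreeL[j] = (j + 1 < NB) ? j + 1 : -1;
    FreeHd = (NB > 1) ? 1 : -1;
}

/* Pop the head of the free list in constant time. */
int FindFreeC(void) {
    assert(FreeHd != -1);
    int tmp = FreeHd;
    FreeHd = UseFreeL[tmp];
    return tmp;
}

/* Drop one reference; push the slot on the free list when unused. */
void UseDec(int j) {
    UseFreeL[j]--;
    if (UseFreeL[j] == 0) {
        UseFreeL[j] = FreeHd;
        FreeHd = j;
    }
}

/* Writer activation: release the old prev, shift cur to prev, take a free slot. */
void writer_activate(void) {
    UseDec(prev);
    prev = cur;
    cur = FindFreeC();
    UseFreeL[cur] = 1;
}

/* Reader activation: fix the slot to read; lower-priority readers pin it. */
void reader_activate(int i) {
    Read[i] = delay[i] ? prev : cur;
    if (!isHPR[i]) UseFreeL[Read[i]]++;
}

/* Reader termination: lower-priority readers unpin their slot. */
void reader_terminate(int i) {
    if (!isHPR[i]) UseDec(Read[i]);
}
```

In a trace where the reader is activated once and the writer twice before the reader terminates, the reader's slot is never reassigned to the writer, which is the property the reference counts are meant to provide.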
OSEK/VDX
• A series of standards particularly for automotive designs
• Basic and extended tasks
• Four Conformance Classes
– BCC1, BCC2, ECC1, ECC2
• Portability
– Minimum requirement of CC
• Kernel services:
– Task management, alarm, hook mechanism
• OIL
– Modular configuration for system generation

Minimum requirement               BCC1
Multiple active task instance     no
Tasks not in suspended state      8
> 1 task per priority             no
Event per task                    -
Alarm                             1
OSEK Implementation
An OSEK/VDX Implementation
• Portable implementation
– BCC1
– Minimum Requirement
• Only one alarm
• Task dispatcher
– GCD of rates
[Figure: dispatcher data structures tick, TickL[LCMR] (fields DispHd, size), and DispT[TSize]; timeline with cost labels C_{k,d,1}, C_{k,d,2}, C_{k,d,3}, C_{k,τ,1}, C_{k,τ,2}, C_{k,τ,3}, C_{k,w}, C_{k,r}]

TASK (dispatcher) {
    tick = (tick+1) % LCMR;
    if (TickL[tick].DispHd != -1) {
        /* first pass: writer code for every task released at this tick */
        for (k = 0; k < TickL[tick].size; k++) {
            idx = DispT[k+TickL[tick].DispHd];
            for (i = 0; i < TaskL[idx].NOP; i++) {
                idx2 = TaskL[idx].OPHd + i;
                …/* kernel level writer code */
            }
        }
        /* second pass: reader code, then activate the task */
        for (k = 0; k < TickL[tick].size; k++) {
            idx = DispT[k+TickL[tick].DispHd];
            for (i = 0; i < TaskL[idx].NIP; i++) {
                idx2 = TaskL[idx].IPHd + i;
                …/* kernel level reader code */
            }
            ActivateTask(idx);
        }
    }
    TerminateTask();
}
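The dispatcher runs at the GCD of the task rates and walks a tick table that wraps around. The sketch below shows one way those two constants could be derived; it assumes that LCMR is the number of dispatcher ticks in one hyperperiod, that is, the LCM of the periods divided by their GCD, which is an interpretation rather than something the slides state. The function names and the sample periods are hypothetical.

```c
#include <assert.h>

/* Euclid's algorithm. */
static long gcd(long a, long b) {
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Period of the dispatcher: GCD of all task periods. */
long dispatcher_period(const long *T, int n) {
    assert(n > 0);
    long g = T[0];
    for (int i = 1; i < n; i++) g = gcd(g, T[i]);
    return g;
}

/* Number of dispatcher ticks before the activation pattern repeats:
 * LCM of all periods divided by the dispatcher period (assumed meaning
 * of LCMR). */
long ticks_per_hyperperiod(const long *T, int n) {
    assert(n > 0);
    long l = T[0];
    for (int i = 1; i < n; i++) l = l / gcd(l, T[i]) * T[i];
    return l / dispatcher_period(T, n);
}
```

For periods 2, 3, and 5 the dispatcher period is 1 and the pattern repeats after 30 ticks; for periods 4 and 6 the dispatcher period is 2 and the table needs 6 entries.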
Implementation (cont)
• Application task

CTDBP
TASK (AppTask_i) {
    TaskL[i].done = false;
    ···
    /* each writer w */
    Buf[OPL[w].cur] = ···
    ···
    /* each reader r */
    ··· = Buf[Read[r]];
    ···
    TaskL[i].done = true;
    /* hook routine */
    TerminateTask();
}

void PostTaskHook(void) {
    TaskType id;
    int j, k, nip;
    GetTaskID(&id);   /* OSEK passes the task ID back through a reference */
    if (TaskL[id].done) {
        nip = TaskL[id].NIP;
        for (j=0; j<nip; j++) {
            k = j + TaskL[id].IPHd;
            /* atomic termination code */
        }
    }
}

TCCP
TASK (AppTask_i) {
    ···
    /* each writer w */
    Buf[OPL[w].cur] = ···
    ···
    /* each reader r */
    ··· = Buf[Read[r]];
    ···
    TerminateTask();
}
Comparison of CTDBP & TCCP

Protocol               CTDBP                           TCCP
# of buffers           SysNBD                          SysNBT
Sizing mechanism       Dedicated buffers for readers   How many times writer can write
Difficulty of sizing   Easy                            Difficult
Temporal property      Fast                            Faster
Data structures        Complex                         Less complex
Good for               Slow readers                    Fast readers
Mathematical Formulation
• Fixed given priority
– Parameters:
$p_{i,j} = 1$ if $\pi_i > \pi_j$, $0$ otherwise
$L_{i,j} = 1$ if $w_i \rightarrow r_j$, $0$ otherwise
– Variables:
$x_{i,j} = 1$ if $w_i \rightarrow r_j$ uses CTDBP, $0$ if $w_i \rightarrow r_j$ uses TCCP
$NRS_i = \sum_{j=1}^{NR} x_{i,j} L_{i,j}$
$NRF_i = NR_i - NRS_i$
$NLPRS_i = \sum_{j=1}^{NR} x_{i,j}\, p_{w(i),r(j)}\, L_{i,j}$
• Buffer sizing mechanisms:
– TCCP: $NBF_i = \max_{1 \le j \le NR} (1 - x_{i,j}) \lceil l_j / T_{w(i)} \rceil L_{i,j}$
– CTDBP: $NBS_i = NLPRS_i + 1 + \max_{1 \le j \le NR} x_{i,j}\, delay[j]\, L_{i,j}$ if $NRS_i > 0$, $0$ otherwise
Optimization
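The effect of the per-link protocol choice on the buffer count of one writer can be evaluated with a few lines of code. The sketch below is an illustration under a stated reading of the sizing rules: readers assigned to TCCP share max ceil(l_j / Tw) buffers, and readers assigned to CTDBP share (number of lower-priority CTDBP readers) + 1 + (largest delay among CTDBP readers) buffers, or none if no reader uses CTDBP. The struct, the function names, and the sample numbers are hypothetical.

```c
#include <assert.h>

typedef struct {
    int  linked;      /* L: 1 if this reader is connected to the writer */
    int  lower_prio;  /* p: 1 if the writer has higher priority than the reader */
    int  delay;       /* link delay of the reader */
    long lifetime;    /* l_j: lifetime of the data read by the reader */
} ReaderLink;

/* Buffers for the readers served by TCCP (x[j] == 0). */
long nbf(const ReaderLink *r, const int *x, int nr, long Tw) {
    assert(Tw > 0);
    long nb = 0;
    for (int j = 0; j < nr; j++) {
        if (!r[j].linked || x[j]) continue;
        long need = (r[j].lifetime + Tw - 1) / Tw;   /* integer ceiling */
        if (need > nb) nb = need;
    }
    return nb;
}

/* Buffers for the readers served by CTDBP (x[j] == 1). */
long nbs(const ReaderLink *r, const int *x, int nr) {
    long nrs = 0, nlprs = 0, max_delay = 0;
    for (int j = 0; j < nr; j++) {
        if (!r[j].linked || !x[j]) continue;
        nrs++;
        nlprs += r[j].lower_prio;
        if (r[j].delay > max_delay) max_delay = r[j].delay;
    }
    return nrs > 0 ? nlprs + 1 + max_delay : 0;
}
```

For a writer with period 10, a slow lower-priority reader (lifetime 45) and a fast higher-priority reader with unit delay (lifetime 8), assigning both to TCCP costs 5 buffers, while moving only the slow reader to CTDBP costs 1 + 2 = 3. An optimizer exploits this kind of difference when it selects protocols per link.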
Complete Formulation
minimize $NB = \sum_{i=1}^{NW} (NBF_i + NBS_i)$
s.t.
• Schedulability constraint
$\forall\, 1 \le i \le NT:\ R_i \le d_i$
• Buffer size
$\forall\, 1 \le i \le NW:\ NBF_i = \max_{1 \le j \le NR} (1 - x_{i,j}) \lceil l_j / T_{w(i)} \rceil L_{i,j}$
$\forall\, 1 \le i \le NW:\ NBS_i = \left( \sum_{j=1}^{NR} x_{i,j}\, p_{w(i),r(j)}\, L_{i,j} + 1 + \max_{1 \le j \le NR} delay[j]\, x_{i,j}\, L_{i,j} \right) \cdot \min\left(1, \sum_{j=1}^{NR} x_{i,j} L_{i,j}\right)$
• Lifetime
$\forall\, 1 \le i \le NR:\ l_i = delay[i] \cdot T_{w(S(i))} + O_{S(i),i} + R_{r(i)}$
• WCRT
$\forall\, 1 \le i \le NT:\ R_i = C_i + \sum_{j \in hp(i) \setminus k} \lceil R_i / T_j \rceil (C_j + CS_{i,j} + CS_{j,ter}) + \dots + \lceil R_i / T_d \rceil (C_{k,d} + CS_{clk,d} + CS_{d,ter}) + \lceil R_i / T_{clk} \rceil (CS_{i,clk} + CS_{clk,ter})$
The elided terms sum the kernel-level reader and writer code costs over the $NR$ readers and $NW$ writers, weighted by $x_{S(j),j}$ and $1 - x_{S(j),j}$.
• Protocol flag
– Per writer $i$ ($1 \le i \le NW$): $\min(1, \sum_{j=1}^{NR} x_{i,j} L_{i,j})$ and $\min(1, NR_i - \sum_{j=1}^{NR} x_{i,j} L_{i,j})$
– System-wide: $\min(1, \cdot)$ applied to the sum of each per-writer flag over $1 \le i \le NW$
• Partial cost of dispatcher
$C_{k,\cdot} = flag \cdot C_{k,D} + (1 - flag) \cdot C_{k,T}$
• Cost of context switches
$CS = flag \cdot CS_D + (1 - flag) \cdot CS_T$
• Kernel-level code cost

Protocol   C_{k,w}   C_{k,r}
TCCP       Δ         Φ
CTDBP      Ψ         Γ

• Reformulate: MILP
Experimental Setup
• Performance evaluation environment
– PIC18F452
• Performance up to 10 MIPS
– ePICos18
• Multi-task, preemptive, O(1) kernel scheduler
• OSEK compliant
• Task graphs generated by TGFF
– 809 systems (158 unschedulable)
• On average, 12 tasks/graph; execution time: $6 \cdot 10^4$ ICs; task period: $10^6$ ICs
• ≤ 4 writers and ≤ 8 readers per task; ≤ 2-unit delay
Experimental Results
[Figure: histogram titled "105 Systems: Smaller Buffer Size under CTDBP than TCCP"; y-axis: Percentage of Test Cases (0 to 0.3); x-axis: Relative Improvement, w.r.t. CTDBP and w.r.t. TCCP; annotated values: 4.8%, 14%, 24%]
Real-Time Workshop
• Simulink system functions (S-Functions)
– Extend capability of Simulink environment
– Coded in C or MATLAB
• Target Language Compiler (TLC)
– From graphical model to intermediate form
– Eventually into target specific code
• RTW Embedded Coder (E-Coder)
– Framework for development of embedded
software
Code Generation
SR Implementation Library
Example of DyB
Given:
sampling rates,
buffer initial value (8)
Period of task dispatcher?
GCD(2,3,5) = 1
How many buffer slots?
NLPR + 1 + 1 = 3
How many tasks?
5
Simulation/Emulation Results
[Figure: traces from RTW simulation and from emulation on MPLAB (PIC18F452) + ePICos18 for the tasks with Period = 2, Period = 3, and Period = 5]
Conclusions
• Generalized theory on SR semantics
preserving communication
• Implemented protocols with portability
consideration under the OSEK OS standard
• Optimized memory and supported a wider
range of applications under timing constraints
with automatic protocol selection
• Supported automatic code generation for SR
communication protocols
Thank you