AFMS'12_report

Virtual Machine Scheduling for Parallel Soft
Real-Time Applications
Like Zhou, Song Wu, Huahua Sun, Hai Jin, Xuanhua Shi
Services Computing Technology and System Lab
Cluster and Grid Computing Lab
School of Computer Science and Technology
Huazhong University of Science and Technology
Outline
• Introduction
• Motivation
• Design
• Implementation
• Evaluation
• Discussion and Future Work
• Conclusion
Introduction
• Many soft real-time applications use parallel
programming models to utilize hardware resources
better and possibly shorten response time
• More and more cloud services, including such parallel
soft real-time applications (PSRT applications), are
running in virtualized environments
– Examples: cloud-based live transcoding, computer vision,
distributed real-time stream computing
Introduction
• When running in virtualized environments, PSRT
applications do not behave well and obtain only
inadequate performance
– Soft real-time constraints and synchronization problems
lead to deadline misses and performance degradation
Motivation
[Figure: scheduling timelines on CPU0 to CPU3 over time, comparing soft real-time scheduling, co-scheduling, and parallel soft real-time scheduling]
• How can we design and implement a parallel soft
real-time scheduling algorithm that addresses
soft real-time constraints and synchronization
problems simultaneously?
Overall Design
• Parallel soft real-time scheduling
– address soft real-time constraints: real-time priority,
dynamic time slice
– solve synchronization problems: parallel scheduling
Address Soft Real-Time Constraints
• How to handle soft real-time constraints of event-driven
soft real-time applications?
[Figure: run queue of PCPU0 with priorities real-time, boost, under, over. When VCPU0 receives external events it appears as RT-VCPU0 at the head of the queue; after being descheduled it appears as VCPU0 again]
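The promotion of a VCPU on an external event can be sketched as a small run-queue model. This is only an illustration, assuming that a VCPU of an RT-VM is raised to a real-time priority above boost when an external event arrives and returns to its previous priority once it is descheduled. The names `Priority`, `VCPU`, `on_external_event`, `on_descheduled` and `pick_next` are hypothetical and are not Xen code.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Priority(IntEnum):
    # Lower value is scheduled earlier; REAL_TIME is assumed to sit above boost.
    REAL_TIME = 0
    BOOST = 1
    UNDER = 2
    OVER = 3


@dataclass
class VCPU:
    name: str
    priority: Priority
    in_rt_vm: bool = False
    saved_priority: Optional[Priority] = None


def on_external_event(vcpu):
    """Promote a VCPU of an RT-VM to real-time priority (assumed behaviour)."""
    if vcpu.in_rt_vm and vcpu.priority != Priority.REAL_TIME:
        vcpu.saved_priority = vcpu.priority
        vcpu.priority = Priority.REAL_TIME


def on_descheduled(vcpu):
    """Restore the priority the VCPU had before it was promoted."""
    if vcpu.saved_priority is not None:
        vcpu.priority = vcpu.saved_priority
        vcpu.saved_priority = None


def pick_next(run_queue):
    """Return the highest-priority VCPU; ties keep queue order (FCFS)."""
    return min(run_queue, key=lambda v: v.priority)


if __name__ == "__main__":
    rt = VCPU("VCPU0", Priority.UNDER, in_rt_vm=True)
    queue = [VCPU("VCPU1", Priority.UNDER), VCPU("VCPU2", Priority.UNDER), rt]
    on_external_event(rt)
    print(pick_next(queue).name)
```

The promotion is temporary by construction: once `on_descheduled` runs, the VCPU competes at its ordinary priority again.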
Address Soft Real-Time Constraints
• How to handle soft real-time constraints of time-driven
soft real-time applications?
[Figure: run-queue timelines for the cases "no RT-VMs" (VCPU0, VCPU1, VCPU2) and "has RT-VMs" (RT-VCPU0, VCPU0, VCPU1, VCPU2, repeating)]
Address Soft Real-Time Constraints
• How to calculate time slice?
– S: the length of the time slice used by the scheduler
– LTS: long time slice
– NR: the total number of RT-VCPUs in the system
– NV: the number of VCPUs per PCPU
– L: the expected latency of soft real-time applications
– WCSL: worst-case scheduling latency
Calculate WCSL:
L and WCSL must meet:
Calculate S:
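The formulas for WCSL and S are not shown here, so the following is only one plausible model and not necessarily the authors' formula. It assumes round-robin scheduling in which a VCPU waits at most for the other NV - 1 VCPUs on its PCPU, so WCSL = (NV - 1) * S; it requires WCSL <= L; and it falls back to LTS when the system has no RT-VCPUs (NR = 0). The function names and the example values are assumptions.

```python
def worst_case_latency(slice_ms, nv):
    """WCSL under the round-robin assumption: wait for the other NV - 1 VCPUs."""
    return (nv - 1) * slice_ms


def time_slice(lts_ms, nr, nv, latency_ms):
    """Return S: LTS if there are no RT-VCPUs, else the largest S with WCSL <= L."""
    if nr == 0 or nv <= 1:
        return lts_ms
    return min(lts_ms, latency_ms / (nv - 1))


if __name__ == "__main__":
    # Assumed example: LTS = 30ms, one RT-VCPU, 4 VCPUs per PCPU, L = 15ms.
    s = time_slice(lts_ms=30, nr=1, nv=4, latency_ms=15)
    print(s, worst_case_latency(s, nv=4))
```

Under these assumptions, L = 15ms with four VCPUs per PCPU yields a 5ms slice, while a system without RT-VCPUs keeps the long time slice.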
Address Soft Real-Time Constraints
• How to determine the approximate value of the
expected latency?
– We use the VoIP test of MyConnection Server (MCS) to
conduct an experiment
Time slice   Upstream jitter   Upstream packet loss   Packet discards   MOS
30ms         7.8ms             9.5%                   3.0%              1.0
15ms         6.7ms             7.2%                   0.9%              1.0
5ms          5.0ms             0.1%                   0%                4.0
3ms          4.8ms             0%                     0%                4.0
– A time slice of 5ms is good enough to guarantee the
quality of VoIP while minimizing the impact on other
applications
– The value of L is calculated as 15ms
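The choice of 5ms can be reproduced from the measurements with a simple selection rule. The sketch below assumes that "good enough" means a MOS of at least 4.0 and that the largest such slice is preferred because longer slices disturb other applications less; both the threshold and the function name are assumptions made for illustration.

```python
# (time slice ms, upstream jitter ms, upstream packet loss %, packet discards %, MOS)
RESULTS = [
    (30, 7.8, 9.5, 3.0, 1.0),
    (15, 6.7, 7.2, 0.9, 1.0),
    (5, 5.0, 0.1, 0.0, 4.0),
    (3, 4.8, 0.0, 0.0, 4.0),
]


def largest_acceptable_slice(results, min_mos):
    """Return the longest time slice whose MOS meets the threshold, or None."""
    acceptable = [row[0] for row in results if row[4] >= min_mos]
    return max(acceptable) if acceptable else None


if __name__ == "__main__":
    print(largest_acceptable_slice(RESULTS, min_mos=4.0))
```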
Solve Synchronization Problems
• How to handle synchronization problems?
[Figure: run queues of PCPU0 to PCPU3 with priorities real-time, boost, under, over; each queue holds one RT-VCPU of the RT-VM (RT-VCPU0 to RT-VCPU3) alongside ordinary VCPUs; label: soft interrupt]
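One way to picture the parallel scheduling step is a toy model in which every PCPU receives a soft interrupt, promotes its runnable VCPUs of the RT-VM to real-time priority, and reschedules, so that the VM's VCPUs run at the same time. This is a sketch under those assumptions, not the Poris implementation; the `PCPU` class, `soft_interrupt` and `co_schedule` are hypothetical names.

```python
REAL_TIME, BOOST, UNDER, OVER = range(4)  # lower value runs first


class PCPU:
    def __init__(self, cpu_id, run_queue):
        self.cpu_id = cpu_id
        # Each entry is a dict with keys "vm", "name" and "prio".
        self.run_queue = run_queue

    def pick(self):
        """Return the highest-priority VCPU; ties keep queue order."""
        return min(self.run_queue, key=lambda v: v["prio"])

    def soft_interrupt(self, vm):
        """Assumed handler: promote this PCPU's VCPUs of `vm`, then reschedule."""
        for vcpu in self.run_queue:
            if vcpu["vm"] == vm:
                vcpu["prio"] = REAL_TIME
        return self.pick()


def co_schedule(pcpus, rt_vm):
    """Notify every PCPU and return {cpu_id: name of the VCPU it now runs}."""
    return {p.cpu_id: p.soft_interrupt(rt_vm)["name"] for p in pcpus}


if __name__ == "__main__":
    pcpus = [
        PCPU(i, [
            {"vm": "other", "name": f"other.VCPU{i}", "prio": UNDER},
            {"vm": "rt", "name": f"rt.VCPU{i}", "prio": UNDER},
        ])
        for i in range(4)
    ]
    print(co_schedule(pcpus, "rt"))
```

A PCPU that holds no VCPU of the RT-VM simply keeps its normal choice, which keeps the disturbance to other VMs local to the PCPUs involved.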
Solve Synchronization Problems
• How to address the VCPU migration problem?
[Figure: run queues of PCPU0 and PCPU1 with priorities real-time, boost, under, over. The VCPU migration problem arises when a PCPU steals a VCPU of an RT-VM; it is addressed by an affinity exchange between the CPU affinities of RT-VCPU0 and RT-VCPU1]
Parallel Soft Real-Time Scheduling
• Calculate time slice
• Schedule all runnable VCPUs of an RT-VM
Implementation
• Poris: parallel soft real-time scheduler
– User interface:
• add a field named type to csched_dom
• add a field named latency to csched_dom
• add a new command xm sched-rt
– Modification to the Credit scheduler (sched_poris):
• add a new priority (CSCHED_PRI_TS_RT) as the real-time priority
• modify event processing
• modify the VCPU and PCPU operating functions
• manage CPU affinity and modify csched_schedule() to
co-schedule all runnable VCPUs of an RT-VM
Experiment Setup
• Hardware and VM configuration
– Machine I
• Hardware configuration: a dual-core 2.6GHz Intel CPU, 2GB memory,
500GB SATA disk and 100Mbps Ethernet card
• VM configuration: 2 VCPUs, 256MB memory and 10GB virtual disk
– Machine II
• Hardware configuration: two quad-core 2.4GHz Intel Xeon CPUs, 24GB
memory, 1TB SCSI disk and 1Gbps Ethernet card
• VM configuration: 8 VCPUs, 1GB memory and 10GB virtual disk
• Software
– Hypervisor: Xen-4.0.1
– OS: CentOS 5.5 distribution with the Linux-2.6.31.8 kernel
• Interfering configuration
– CPU-intensive interfering configuration: all interfering VMs run
CPU-intensive workloads
– Mixed interfering configuration: some interfering VMs run CPU-intensive
workloads, and some run I/O-intensive workloads
Experiments
• Does Poris guarantee the QoS of VoIP applications?
– Experiments with MyConnection Server
• Is Poris suitable for client-side virtualization?
– Experiments with Media Player
• Does Poris surpass other schedulers?
– Experiments with PARSEC Benchmark
• What is the impact of Poris on non-real-time
workloads?
– Experiments with Non-real-time Workloads (Kernel
compilation, Postmark, Stream benchmark)
MyConnection Server
• Upstream jitter
(a) CPU-intensive interfering configuration
(b) Mixed interfering configuration
MyConnection Server
• VoIP test results
Interfering configuration   Scheduler   Upstream jitter   Downstream jitter   Packet discards   MOS
CPU-intensive               Credit      11.9ms            9.6ms               4.4%              1.0
CPU-intensive               Poris       4.6ms             1.2ms               0.0%              4.0
Mixed                       Credit      10.4ms            6.5ms               2.0%              1.0
Mixed                       Poris       4.5ms             0.6ms               0.0%              4.0
– CPU-intensive: upstream jitter 61.34% ↓, downstream jitter 87.5% ↓
– Mixed: upstream jitter 56.73% ↓, downstream jitter 90.77% ↓
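The percentage annotations in the VoIP results follow directly from the jitter values, taking the Credit measurement as the baseline. The short check below recomputes them; the helper name `reduction_percent` is introduced here only for illustration.

```python
def reduction_percent(credit_ms, poris_ms):
    """Relative reduction of Poris versus Credit, in percent, to two decimals."""
    return round((credit_ms - poris_ms) / credit_ms * 100, 2)


if __name__ == "__main__":
    # (label, Credit jitter in ms, Poris jitter in ms), values from the VoIP test
    cases = [
        ("CPU-intensive upstream", 11.9, 4.6),
        ("CPU-intensive downstream", 9.6, 1.2),
        ("Mixed upstream", 10.4, 4.5),
        ("Mixed downstream", 6.5, 0.6),
    ]
    for label, credit, poris in cases:
        print(label, reduction_percent(credit, poris))
```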
Media Player
• Play low-resolution video
– (a) CPU-intensive interfering configuration: 71.19% ↑
– (b) Mixed interfering configuration: 40.68% ↑
Media Player
• Play high-resolution video
– (a) CPU-intensive interfering configuration: 135.94% ↑
– (b) Mixed interfering configuration: 95.31% ↑
PARSEC Benchmark
[Figure: normalized execution time (%) under Credit, RS, PS and Poris for blackscholes, bodytrack, canneal, dedup, facesim, ferret, fluidanimate, freqmine, raytrace, streamcluster, swaptions, vips and x264]
• The performance of Poris is up to 44.12% better than
Credit, 41.28% better than RS, and 28.02% better than PS
Non-real-time workloads
[Figures: Kernel compilation, Postmark, Stream benchmark]
• Because Poris promotes the priorities of RT-VCPUs only
temporarily and uses dynamic time slices, its interference
with non-real-time workloads is slight and acceptable
• Poris even increases the performance of some types of
non-real-time workloads, such as I/O-intensive workloads
Discussion and Future Work
• Determining VM type and expected latency
– provide APIs to programmers
– analyze runtime characteristics of applications
• Many applications running in a VM
– use previous techniques to identify real-time applications
• Supporting multiple VMs running the same PSRT
applications
– co-schedule multiple VMs by analyzing the communication
patterns of VMs running the same PSRT applications
Conclusion
• We identify the scheduling problems in virtualized
environments, and find that existing CPU scheduling mechanisms
do not fit PSRT applications
• We propose a novel parallel soft real-time scheduling
algorithm
• We implement a prototype in the Xen hypervisor based on
the algorithm, named Poris
• We verify the effectiveness of Poris through various
applications. The experimental results show that Poris can
improve the performance of PSRT applications significantly
Thank you!
System Virtualization
[Figure: two VMs, each running applications on a guest OS, on top of a hypervisor that runs on the hardware]
Credit Scheduler
• CPU resources (or credits) are distributed to VCPUs
of VMs in proportion to their weight
• three kinds of VCPU priorities: boost, under, and
over
• VCPUs with the same priority are scheduled in FCFS
manner
• supports SMP platforms well
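The two core rules of the Credit scheduler listed above, proportional credit distribution and FCFS order within a priority, can be illustrated with a simplified model. This sketch leaves out accounting periods, per-VCPU bookkeeping and SMP load balancing, and its function names are hypothetical rather than taken from the Xen source.

```python
from collections import deque

PRIORITY_ORDER = ("boost", "under", "over")  # highest priority first


def distribute_credits(weights, total_credits):
    """Split total_credits among entities in proportion to their weights."""
    total_weight = sum(weights.values())
    return {name: total_credits * w / total_weight for name, w in weights.items()}


def pick_next(queues):
    """Take the first VCPU of the highest non-empty priority queue (FCFS)."""
    for prio in PRIORITY_ORDER:
        if queues[prio]:
            return queues[prio].popleft()
    return None


if __name__ == "__main__":
    print(distribute_credits({"vm1": 256, "vm2": 512}, total_credits=300))
    queues = {"boost": deque(), "under": deque(["a", "b"]), "over": deque(["c"])}
    print(pick_next(queues), pick_next(queues), pick_next(queues))
```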
