vTurbo: Accelerating Virtual Machine I/O
Processing Using Designated Turbo-Sliced Core
Cong Xu, Sahan Gamage, Hui Lu, Ramana Kompella, Dongyan Xu
2013 USENIX Annual Technical Conference
Embedded Lab.
Kim Sewoog
Motivation
Pay-as-you-go: Server Consolidation
Saves cost in application hosting and operational expenditure
Multiple VMs sharing the same core
CPU access latency
< Figure: VM1–VM4 multiplexed on one core by the hypervisor (VMM); CPU access latency leads to low I/O throughput >
I/O Processing
Two basic stages
Device interrupts are processed synchronously in the kernel
The application asynchronously copies the data from the kernel buffer
< I/O Processing Workflow: device IRQ processing fills the kernel buffer; the application copies the data later >
< Effect of CPU Sharing on I/O Processing: an interrupt arriving while a VM (VM1–VM3) is off the CPU waits out an IRQ processing delay until that VM is scheduled again >
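The two-stage workflow above can be sketched as follows. This is a minimal assumed model, not the paper's code: stage 1 runs synchronously in kernel context on a device interrupt, stage 2 runs whenever the application gets the CPU.

```python
# Minimal sketch of the two-stage I/O receive path (assumed model).
from collections import deque

kernel_buffer = deque()

def irq_handler(packet):
    # Stage 1: device interrupt processed synchronously in the kernel,
    # moving the data into the kernel buffer.
    kernel_buffer.append(packet)

def app_read():
    # Stage 2: the application asynchronously copies data out of the
    # kernel buffer, but only when its VM is scheduled.
    return kernel_buffer.popleft() if kernel_buffer else None

irq_handler("pkt-1")
irq_handler("pkt-2")
print(app_read())  # pkt-1
```

Under CPU sharing, stage 1 itself is delayed: the IRQ handler cannot run until the VM's vCPU is back on a core, which is the delay the figure illustrates.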
Effect of CPU Sharing on TCP Receive
< Figure: DATA from the TCP client accumulates in the hypervisor's shared buffer while VM2/VM3 are scheduled; VM1's ACKs are generated only after the IRQ processing delay, once VM1 runs again >
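The effect on TCP can be estimated with back-of-the-envelope numbers (all values below are assumed for illustration): ACKs are generated only once the receiving VM is scheduled, so the average IRQ-processing delay inflates the RTT, and throughput is roughly capped at window / RTT.

```python
# Assumed-numbers sketch of CPU sharing inflating TCP RTT.
def tcp_throughput_mbps(window_bytes, base_rtt_ms, irq_delay_ms):
    # Window-limited TCP: throughput ~ window / RTT.
    rtt_s = (base_rtt_ms + irq_delay_ms) / 1000.0
    return window_bytes * 8 / rtt_s / 1e6

WINDOW = 64 * 1024   # 64 KB receive window (assumed)
BASE_RTT = 0.5       # ms, uncontended round trip (assumed)

# 3 VMs sharing one core with 30 ms slices: a packet arriving while the
# VM is descheduled waits, on average, half of the other VMs' slices.
irq_delay = (3 - 1) * 30 / 2   # 30 ms average wait

print(tcp_throughput_mbps(WINDOW, BASE_RTT, 0))          # dedicated core
print(tcp_throughput_mbps(WINDOW, BASE_RTT, irq_delay))  # shared core
```

A ~30 ms scheduling delay dominates a sub-millisecond network RTT, which is why the shared-core throughput collapses.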
Effect of CPU Sharing on UDP Receive
< Figure: while VM1 is descheduled, DATA from the UDP client fills the hypervisor's shared buffer; once the buffer is full, subsequent packets are dropped and never reach VM1's application buffer >
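The UDP loss mechanism above reduces to simple arithmetic (an assumed model with illustrative numbers): packets keep arriving while the VM is descheduled, the fixed-size shared buffer fills, and everything past capacity is dropped.

```python
# Assumed model of UDP drops under CPU sharing.
def udp_delivered(arrival_rate_pps, desched_ms, buffer_pkts):
    # Packets arriving while the VM is off the CPU.
    arrived = int(arrival_rate_pps * desched_ms / 1000)
    buffered = min(arrived, buffer_pkts)   # what the buffer can hold
    dropped = arrived - buffered           # overflow is lost
    return buffered, dropped

# 100k packets/s during a 60 ms descheduled period (two 30 ms slices),
# with room for only 1000 packets in the shared buffer.
print(udp_delivered(100_000, 60, 1000))
# With 0.1 ms micro-slicing, the descheduled window nearly vanishes.
print(udp_delivered(100_000, 0.1, 1000))
```

Unlike TCP, UDP has no retransmission, so these drops translate directly into lost throughput.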
Effect of CPU Sharing on Disk Write
< Figure: the application fills kernel memory with dirty data, but flushing to the disk drive stalls on the IRQ processing delay, since each write completion interrupt is handled only when the VM is scheduled again >
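The disk-write stall can be sketched the same way (assumed model and numbers): each in-flight write's completion interrupt is handled only when the VM runs, so the next write cannot be issued until then, serializing the device behind the scheduler.

```python
# Assumed model of buffered disk writes gated by completion IRQs.
def write_throughput_mbps(block_kb, device_ms, irq_delay_ms):
    # One block completes per (device service time + completion-IRQ delay).
    per_block_s = (device_ms + irq_delay_ms) / 1000.0
    return block_kb * 8 / 1024 / per_block_s

# 64 KB blocks, 1 ms device time per block (assumed):
print(write_throughput_mbps(64, 1.0, 30.0))  # shared core, ~30 ms IRQ delay
print(write_throughput_mbps(64, 1.0, 0.1))   # micro-sliced core
```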
Intuitive Solution
Reduce time-slice of each VM
Causes significant context switch overhead
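Why shrinking everyone's slice is a poor fix can be quantified with assumed numbers: IRQ delay falls with the slice length, but the fixed cost of each context switch eats a growing fraction of the CPU.

```python
# Assumed cost of one context switch, in microseconds.
SWITCH_COST_US = 30.0

def overhead_fraction(slice_ms):
    # Fraction of CPU time lost to switching at a given slice length.
    return SWITCH_COST_US / (slice_ms * 1000 + SWITCH_COST_US)

for slice_ms in (30, 1, 0.1):
    print(f"{slice_ms} ms slice -> "
          f"{overhead_fraction(slice_ms):.1%} CPU lost to switching")
```

Micro-slicing every core is therefore wasteful; vTurbo's insight is to pay the micro-slicing cost on one designated core only.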
Our Solution: vTurbo
IRQ processing offloaded to a dedicated turbo core
Turbo core: any physical core with micro-slicing (e.g., 0.1 ms)
Expose turbo core as a special vCPU to the VM
Turbo vCPU runs on a turbo core
Regular vCPUs run on regular cores
Pin IRQ context of guest OS to turbo vCPU
Benefits
Improved I/O throughput (TCP/UDP, Disk)
Self-adaptive system
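The latency benefit of the turbo core follows from the slice lengths alone (a simple round-robin model with assumed parameters): with n VMs sharing a core, an interrupt arriving just after a VM yields waits up to (n − 1) slices before that VM's vCPU runs again.

```python
# Worst-case wait before a VM's (turbo) vCPU handles a pending IRQ,
# under round-robin scheduling (assumed model).
def worst_case_irq_wait_ms(num_vms, slice_ms):
    return (num_vms - 1) * slice_ms

print(worst_case_irq_wait_ms(3, 30))   # regular core, 30 ms slices
print(worst_case_irq_wait_ms(3, 0.1))  # turbo core, 0.1 ms micro-slices
```

Pinning the guest's IRQ context to the turbo vCPU thus bounds IRQ delay at sub-millisecond scale, while regular vCPUs keep long slices and avoid the context-switch overhead.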
vTurbo Design
< Figure: regular cores run VM1–VM3's applications with normal time slices, while the turbo core micro-slices across the VMs' turbo vCPUs, so IRQ processing keeps filling each VM's kernel buffer with data continuously >
vTurbo’s Impact on Disk Write
< Figure: with the turbo core micro-slicing among VM1–VM3, write-completion IRQs are handled promptly, so DATA flows from kernel memory to the disk drive without waiting for the VM's turn on a regular core >
vTurbo’s Impact on UDP Receive
< Figure: the turbo core continuously drains the hypervisor's shared buffer into the VM's kernel buffer as VM1–VM3 micro-slice, so DATA from the UDP client reaches the application buffer without drops >
vTurbo’s Impact on TCP Receive
< Figure: the turbo core processes incoming DATA and generates ACKs promptly; while the receive queue is locked by the application, packets accumulate in the backlog queue, then move to the application buffer when VM1's regular vCPU runs >
VM Scheduling Policy for Fairness
Turbo cores are not free
Maintain CPU fair-share among VMs
Calculate the credits on both regular and turbo cores
Guarantee the CPU allocation on turbo cores
Deduct I/O-intensive VMs’ credits on regular cores
Allocate the deduction to non-I/O-intensive VMs
< Formulas (shown in the paper): total capacity across the regular and turbo cores; each VM’s turbo-core fair share; actual usage of the turbo core; each VM’s overall fair share of CPU >
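The accounting described above can be sketched as follows. The formulas are reconstructed from the slide's description, not taken from the paper's code: each VM's fair share spans regular plus turbo capacity, whatever a VM actually burns on the turbo core is deducted from its regular-core credits, and the total deduction is redistributed to VMs that did not use the turbo core.

```python
# Assumed reconstruction of the credit accounting for fairness.
def regular_credits(total_capacity, turbo_usage):
    """turbo_usage: per-VM CPU actually consumed on the turbo core."""
    n = len(turbo_usage)
    fair = total_capacity / n  # each VM's overall fair share of CPU
    # Deduct each VM's turbo-core consumption from its regular credits.
    credits = {vm: fair - used for vm, used in turbo_usage.items()}
    # Redistribute the deduction to VMs that used no turbo time.
    idle = [vm for vm, used in turbo_usage.items() if used == 0]
    bonus = sum(turbo_usage.values()) / len(idle) if idle else 0
    for vm in idle:
        credits[vm] += bonus
    return credits

# VM1 is I/O-intensive (40 units on the turbo core); VM2/VM3 are not.
print(regular_credits(300, {"VM1": 40, "VM2": 0, "VM3": 0}))
```

Total allocated credits stay equal to total capacity, so I/O-intensive VMs pay for their turbo-core time and the system remains work-conserving and fair.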
Evaluation
VM hosts
3.2 GHz Intel Xeon quad-core CPU, 16 GB RAM
Assign an independent core to the driver domain (dom0)
Xen 4.1.2
Linux 3.2
Choose 1 core as the turbo core
Gigabit Ethernet switch (10 Gbps for 2 experiments)
File Read/Write Throughput: Micro-Benchmark
regular core <-> turbo core
TCP/UDP Throughput: Micro-Benchmark
NFS/SCP Throughput: Application Benchmark
Apache Olio : Application Benchmark
3 components
a web server to process user requests
a MySQL database server to store user profiles and event information
an NFS server to store images and documents specific to events
Conclusions
Problem: CPU sharing affects I/O throughput
Solution: vTurbo
Offload IRQ processing to a turbo-sliced dedicated core
Results:
Improve UDP throughput by up to 4x
Improve TCP throughput by up to 3x
Improve disk write throughput by up to 2x
Improve NFS throughput by up to 3x
Improve Olio throughput by up to 38.7%
THANK YOU !