140128 KAIST Prof. Chan-Hyun Youn's Lab Presentation Slides

- Subproject 1 -
Research and Development of a Data Management
Broker for Heterogeneous Cloud Platforms
Network and Computing Lab.
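Research Goals
• Techniques for defining mobile cloud metadata
• Resource management and migration techniques based on mobile
cloud metadata
• Techniques for improving service performance and guaranteeing user
SLAs through resource and service profiling that accounts for the
performance of heterogeneous cloud infrastructure
• Data provisioning that caches the data needed for service execution
to improve service speed and performance, and real-time data supply
techniques for big data processing
• Data consolidation and provisioning techniques suited to data usage
characteristics
• Data virtualization techniques for the integrated management of
distributed and heterogeneous data in big data processing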
Resource Management and Migration Techniques
Based on Mobile Cloud Metadata
Service and Application Profiling
• Cloud metadata
– Service and application profile
– Resource profile
• Service and application profiling
– Basic approach
• Profiling the expected execution time per VM
type (from historical data)
• Profiling application performance against
resource usage
– Advanced approach
• Analyzing resource contention among
applications
Today’s topic
• Classification scheme [Zhuravlev et al., SIGARCH, 2010]
– A classification scheme identifies which
applications should and should not be scheduled
together.
– It enables the scheduler to predict the
performance effects of co-scheduling any group
of threads in a shared cache.
– A VM placement & allocation algorithm consists of
two components: a classification scheme and a
scheduling policy.
Classification Scheme 1)
Stack Distance Competition (SDC)
[Chandra et al., HPCA, 2005] (1)
• Assumption: the L2 cache uses LRU replacement
• Stack Distance Profile
– Captures the temporal
reuse behavior of an
application in a fully- or
set-associative cache
• Basic Prediction Approach
– For smaller cache
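A stack distance profile can be illustrated with a small sketch. The code below simulates a single LRU set, counts hits per LRU stack position, and then predicts the misses for a smaller cache by treating every hit beyond the smaller associativity as a miss. The function names and the single-set simplification are assumptions made for illustration; they do not come from the slides.

```python
def stack_distance_profile(accesses, ways):
    """Simulate one LRU set and return (hits_by_position, misses).

    hits_by_position[0] counts hits at the MRU position,
    hits_by_position[ways - 1] counts hits at the LRU position.
    """
    stack = []                  # cache lines, most recently used first
    hits = [0] * ways
    misses = 0
    for line in accesses:
        if line in stack:
            pos = stack.index(line)
            hits[pos] += 1
            stack.pop(pos)
        else:
            misses += 1
            if len(stack) == ways:
                stack.pop()     # evict the least recently used line
        stack.insert(0, line)   # the accessed line becomes the MRU
    return hits, misses


def misses_with_fewer_ways(hits, misses, smaller_ways):
    """Hits at positions beyond `smaller_ways` become misses in the smaller cache."""
    return misses + sum(hits[smaller_ways:])


trace = ["a", "b", "a", "c", "b", "a"]
hits, misses = stack_distance_profile(trace, ways=4)
print(hits, misses, misses_with_fewer_ways(hits, misses, 2))
```

Under LRU the content of a smaller cache is always a prefix of the larger cache's stack, which is why the prediction needs only the profile and no second simulation.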
Classification Scheme 1)
Stack Distance Competition (SDC) (2)
• Objective
– Model how two applications compete for the LRU
stack positions in the shared cache, and estimate the
extra misses each application incurs as a result
of this contention
• Main idea
– Constructing a new stack distance profile that
merges individual stack distance profiles of threads
that run together
Classification Scheme 1)
Stack Distance Competition (SDC) (3)
• SDC Algorithm
1) Each individual profile is assigned a current pointer
that is initialized to point to the first stack distance
position
2) The algorithm iterates A times (A = cache associativity),
determining at each step which of the co-runners will
be the “winner” for this stack-distance position
3) After the A-th iteration, the effective cache space for each
thread is computed proportionally to the number of its
stack distance counters that are included in the
merged profile
→ The cache miss rate with the new effective cache space
is estimated for each co-runner
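The three steps above can be sketched as follows. This is a minimal illustration, assuming that the "winner" at each step is the co-runner whose current counter is largest and that ties go to the first thread listed; the function names and input layout are hypothetical.

```python
def sdc_effective_ways(profiles, ways):
    """Merge per-thread stack distance profiles and return the ways won by each thread.

    profiles: dict mapping thread name -> list of hit counters, MRU first.
    """
    pointer = {t: 0 for t in profiles}   # step 1: every pointer starts at position 1
    won = {t: 0 for t in profiles}
    for _ in range(ways):                # step 2: one winner per merged position
        candidates = [t for t in profiles if pointer[t] < len(profiles[t])]
        if not candidates:
            break
        # Assumption: the largest current counter wins the position.
        winner = max(candidates, key=lambda t: profiles[t][pointer[t]])
        won[winner] += 1
        pointer[winner] += 1
    return won                           # step 3: effective cache space per thread


def estimated_misses(hits, solo_misses, effective_ways):
    """Hits beyond the effective cache space are counted as extra misses."""
    return solo_misses + sum(hits[effective_ways:])


profiles = {"x": [10, 8, 5, 0], "y": [9, 2, 1, 1]}
share = sdc_effective_ways(profiles, ways=4)
print(share)
print(estimated_misses(profiles["y"], solo_misses=3, effective_ways=share["y"]))
```

In this example thread y keeps only one way, so the hits it previously got at positions 2 to 4 are predicted to turn into misses.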
Classification Scheme 2)
Animal Classes
[Xie et al., CMP-MSI, 2008] (1)
• This classification scheme allows classifying
applications in terms of their influence on each
other when co-scheduled in the same cache
• Four application classes
– Turtle (low use of the shared cache)
– Sheep (low miss rate, insensitive to the number of
cache ways allocated to it)
– Rabbit (low miss rate, sensitive to the number of
allocated cache ways)
– Devil (high miss rate, accesses the L2 cache very
frequently)
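The four class descriptions can be turned into a simple threshold test. The sketch below is not the classification algorithm of Xie et al.; every threshold value, the choice of input metrics, and the order of the checks are hypothetical and serve only to make the class definitions concrete.

```python
def classify_animal(accesses_per_kilo_instr, miss_rate, miss_rate_half_ways,
                    low_access=1.0, high_miss=0.10, sensitive_delta=0.05):
    """Map cache statistics to an animal class.

    miss_rate_half_ways is the miss rate measured with half of the cache ways.
    All threshold defaults are hypothetical illustration values.
    """
    if accesses_per_kilo_instr < low_access:
        return "turtle"     # barely uses the shared cache
    if miss_rate >= high_miss:
        return "devil"      # frequent accesses that still miss often
    if miss_rate_half_ways - miss_rate >= sensitive_delta:
        return "rabbit"     # low miss rate, but hurt by losing ways
    return "sheep"          # low miss rate regardless of allocated ways


print(classify_animal(0.2, 0.01, 0.01))
print(classify_animal(30.0, 0.02, 0.09))
```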
Classification Scheme 2)
Animal Classes (2)
• Application Classification Algorithm
• Symbiosis table
– Approximates the relative performance degradation of applications that
fall within different animal classes
– Provides estimates of how well the various classes coexist
on the same shared cache
• This scheme uses stack distance profiles
Classification Scheme 3)
Miss rate
[Zhuravlev et al., SIGARCH, 2010]
[Knauerhase et al., IEEE Micro, 2008] (1)
• Identifying applications with high miss rates is very
beneficial for the scheduler because these applications
exacerbate the performance degradation due to
memory controller contention, memory bus contention,
and prefetching hardware contention
• To attempt an approximation of the best schedule
using the miss rate heuristic, the scheduler will identify
high miss rate applications and separate them into
different caches, such that no one cache will have a
much higher total miss rate than any other cache
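One way to approximate this balancing is a greedy pass: sort applications by miss rate and always place the next one in the cache with the lowest running total. This is a sketch of the idea, not the scheduler from the cited papers; the function name, the per-cache core limit, and the sample miss rates are assumptions.

```python
def spread_by_miss_rate(miss_rates, num_caches, cores_per_cache):
    """Assign applications to shared caches so that total miss rates stay balanced.

    miss_rates: dict mapping application name -> miss rate (any consistent unit).
    """
    if len(miss_rates) > num_caches * cores_per_cache:
        raise ValueError("more applications than cores")
    caches = [[] for _ in range(num_caches)]
    totals = [0.0] * num_caches
    # Place the highest miss rates first so they end up in different caches.
    for app, rate in sorted(miss_rates.items(), key=lambda kv: kv[1], reverse=True):
        open_caches = [c for c in range(num_caches) if len(caches[c]) < cores_per_cache]
        target = min(open_caches, key=lambda c: totals[c])
        caches[target].append(app)
        totals[target] += rate
    return caches, totals


caches, totals = spread_by_miss_rate(
    {"a": 40, "b": 30, "c": 5, "d": 2}, num_caches=2, cores_per_cache=2)
print(caches, totals)
```

With the sample values the two high miss rate applications a and b land in different caches, and each is paired with a low miss rate co-runner.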
Classification Scheme 4)
Pain [Zhuravlev et al., SIGARCH, 2010] (1)
• Cache sensitivity
– A measure of how much an application will suffer when
cache space is taken away from it due to contention
– This can be calculated by
• first, examining the number of cache hits that will most likely turn
into misses when the cache is shared
• second, assigning to positions in the stack distance profile loss
probabilities describing the likelihood that the hits will be lost from
each position
• The loss probability for position i is i / (n+1) in this paper
– Cache sensitivity formula
• $S = \frac{1}{n+1} \sum_{i=1}^{n} i \cdot h(i)$
• h(i) is the number of hits to the i-th position in the stack, where i=1
is the MRU and i=n is the LRU for an n-way set-associative cache
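Taking the loss probability i / (n+1) stated above, cache sensitivity can be read as the hit counts h(i) weighted by the probability of losing each position. A minimal sketch, with a hypothetical function name and made-up hit counts:

```python
def cache_sensitivity(hits):
    """Sum of h(i) * i / (n + 1), with hits[0] = h(1) at the MRU position."""
    n = len(hits)
    return sum(i * h for i, h in enumerate(hits, start=1)) / (n + 1)


# Hypothetical profile for a 4-way cache: most hits fall near the MRU position.
print(cache_sensitivity([100, 50, 20, 10]))
```

Hits near the LRU end carry the largest weight, so an application whose hits cluster at deep stack positions is the one that suffers most when it loses cache space.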
Classification Scheme 4)
Pain (2)
• Cache intensity
– A measure of how much an application will hurt
others by taking away their space in a shared cache
– Measured using the number of last-level cache
accesses per one million instructions
• The Pain metric
– The resulting pain is measured by combining cache
sensitivity and intensity
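The combination can be sketched as follows, assuming the product form used by Zhuravlev et al.: the pain A suffers from B is A's sensitivity times B's intensity, and the pain of co-scheduling the pair is the sum of both directions. Function names and input values are hypothetical.

```python
def cache_intensity(llc_accesses, instructions):
    """Last-level cache accesses per one million instructions."""
    return llc_accesses * 1_000_000 / instructions


def pain(sensitivity_a, intensity_a, sensitivity_b, intensity_b):
    """Pain of co-scheduling A and B: Pain(A|B) + Pain(B|A)."""
    pain_a_given_b = sensitivity_a * intensity_b   # how much A suffers from B
    pain_b_given_a = sensitivity_b * intensity_a   # how much B suffers from A
    return pain_a_given_b + pain_b_given_a


intensity_a = cache_intensity(llc_accesses=5000, instructions=2_000_000)
print(intensity_a)
print(pain(60.0, intensity_a, 10.0, 100.0))
```

A scheduler built on this metric would compare the pain of candidate pairings and prefer the grouping with the lowest total.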
Classification Schemes Evaluation
[Zhuravlev et al., SIGARCH, 2010]
Data Consolidation and Provisioning Techniques
Suited to Data Usage Characteristics
Today’s topic
• Data placement & VM placement in Big
data processing
– Importance of data placement
• Input data-intensive workloads such as Map
• Centralized file system vs Distributed file system
– Importance of VM placement
• Intermediate data-intensive workloads such as
Reduce
• Performance issues such as SLA violations and
resource contention
Data placement & VM placement: Purlieus
[Palanisamy et al., SC, 2011] (1)
• Job classification
– Map-input heavy jobs (Input data-intensive
workloads)
– Reduce-input heavy jobs (Intermediate data-intensive
workloads)
– Map-and-Reduce-input heavy jobs (Input
data-and-Intermediate data-intensive
workloads)
Data placement & VM placement: Purlieus (2)
• Map-input heavy jobs (Input data-intensive workloads)
– Input data placement
• Choosing physical machines only based on the storage utilization
and the expected load
– VM placement
• Data locality
• Choosing physical machines which have the corresponding data
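The two placement steps for map-input heavy jobs can be sketched as below. This is an illustration only, not the Purlieus implementation: ranking machines by expected load first and storage utilization second, the dictionary layout, and the function names are all assumptions.

```python
def choose_data_machines(machines, count):
    """Pick `count` machines for input data, preferring low load and low storage use.

    machines: dict name -> {"expected_load": float, "storage_util": float}
    """
    ranked = sorted(
        machines,
        key=lambda m: (machines[m]["expected_load"], machines[m]["storage_util"]))
    return ranked[:count]


def place_vms_near_data(block_locations, free_vm_slots):
    """Place one VM per data block on the machine that already stores the block."""
    slots = dict(free_vm_slots)          # copy so the caller's dict is not modified
    placement = {}
    for block, machine in block_locations.items():
        if slots.get(machine, 0) > 0:
            placement[block] = machine
            slots[machine] -= 1
        else:
            placement[block] = None      # no local slot; a remote machine is needed
    return placement


machines = {
    "pm1": {"expected_load": 0.7, "storage_util": 0.2},
    "pm2": {"expected_load": 0.1, "storage_util": 0.5},
    "pm3": {"expected_load": 0.1, "storage_util": 0.3},
}
print(choose_data_machines(machines, 2))
print(place_vms_near_data({"b0": "pm3", "b1": "pm2"}, {"pm3": 1, "pm2": 1}))
```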
Data placement & VM placement: Purlieus (3)
• Reduce-input heavy jobs (Intermediate data-intensive workloads)
– Input data placement
• Choosing physical machines with maximum free storage
– VM placement
• Choosing physical machines which are close to each other
• Map-and-Reduce-input heavy jobs (Input data-and-intermediate
data-intensive workloads)
– Considering both
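For reduce-input heavy jobs, "close to each other" can be approximated by keeping the VMs within as few racks as possible, so that intermediate data crosses fewer rack boundaries. The sketch below makes that assumption explicit; rack membership as the distance measure, the greedy rack order, and all names are hypothetical rather than taken from Purlieus.

```python
from collections import defaultdict


def place_vms_close_together(free_slots, rack_of, num_vms):
    """Fill the racks with the most free VM slots first.

    free_slots: dict machine -> number of free VM slots
    rack_of:    dict machine -> rack identifier
    Returns a list with one machine name per VM.
    """
    slots_by_rack = defaultdict(list)
    for machine, slots in free_slots.items():
        slots_by_rack[rack_of[machine]].extend([machine] * slots)

    placement = []
    racks = sorted(slots_by_rack, key=lambda r: len(slots_by_rack[r]), reverse=True)
    for rack in racks:
        for machine in slots_by_rack[rack]:
            if len(placement) == num_vms:
                return placement
            placement.append(machine)
    if len(placement) < num_vms:
        raise ValueError("not enough free VM slots")
    return placement


free_slots = {"pm1": 1, "pm2": 2, "pm3": 2}
rack_of = {"pm1": "r1", "pm2": "r2", "pm3": "r2"}
print(place_vms_close_together(free_slots, rack_of, num_vms=3))
```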
Delay scheduling
[Zaharia et al., Eurosys, 2010]
• If no node holding the data for the first job in the job queue
is available, the job is delayed for up to a certain period
while waiting for such a node.
• Data locality
• In streaming situation
[Figure: delay scheduling; labels: PM, job queue, delay, processing]
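The delay rule can be sketched as a function that is called whenever a node has a free slot. It is a simplified illustration rather than the scheduler of Zaharia et al.: the time-based delay limit, the job dictionary layout, and the function name are assumptions.

```python
def pick_job(job_queue, node, now, max_delay):
    """Return (job_name, "local" | "non-local") for a free node, or None.

    job_queue: list of dicts with keys "name", "data_nodes" (set of node names)
               and "waiting_since" (time of the first skip, or None).
    """
    for job in job_queue:
        if node in job["data_nodes"]:
            job["waiting_since"] = None
            return job["name"], "local"          # data locality achieved
        if job["waiting_since"] is None:
            job["waiting_since"] = now           # start the delay timer
        if now - job["waiting_since"] >= max_delay:
            job["waiting_since"] = None
            return job["name"], "non-local"      # waited long enough, give up locality
        # Otherwise the job is delayed and the next job in the queue is tried.
    return None


queue = [
    {"name": "j1", "data_nodes": {"pm2"}, "waiting_since": None},
    {"name": "j2", "data_nodes": {"pm1"}, "waiting_since": None},
]
print(pick_job(queue, "pm1", now=0.0, max_delay=5.0))      # j1 is delayed, j2 runs locally
print(pick_job(queue[:1], "pm1", now=6.0, max_delay=5.0))  # j1 has waited past the limit
```

The caller is expected to remove the returned job from the queue; the sketch leaves the queue itself unchanged.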