Load Balance in Linux sjchoi

Load Balance in Linux 2.6.32
Load balancing
Sung-joon Choi
Real-Time Operating Systems Lab.
Seoul National University
2011-09-15
Contents
 Load balancing
Purpose
Definition
General cases
• Active load balancing
• Passive load balancing
Special cases
• Execution of a new task
• CPU’s shut down or intentionally being IDLE
Limitation
Load Balancing
Purpose
• As long as the system has more tasks than cores, keep every core running so that none sits IDLE
Mechanism
• Keep the difference in workload between cores small
Definition
Load balancing
• In an SMP system, adjusting task placement so that every core carries an equal workload (load)
Load
• The sum of the weights of all tasks on a core’s run-queue
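The definition of load can be illustrated with a minimal Python sketch. This is a toy model, not kernel code: the `Task` and `RunQueue` classes and the weight values are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    weight: int  # scheduling weight; the values used below are illustrative


@dataclass
class RunQueue:
    cpu: int
    tasks: list = field(default_factory=list)

    def load(self) -> int:
        # Load of a run-queue = sum of the weights of all tasks on it.
        return sum(task.weight for task in self.tasks)


rq = RunQueue(cpu=0, tasks=[Task("task1", 1024), Task("task2", 1024), Task("task3", 512)])
print(rq.load())  # 1024 + 1024 + 512 = 2560
```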
Load Balancing
Definition (cont.)
Idlest run-queue
• The run-queue with the minimum load among the cores
Busiest run-queue
• The run-queue with the maximum value of the scale factor “load / (core’s power)”
• If all cores have the same power, this is the run-queue of the core with the maximum load
• In a system with heterogeneous processors, each core’s power may differ
• In general, power means capacity, that is, the ability to get work done
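The two definitions can be sketched as selection functions. This is a toy Python model under stated assumptions: the `Core` class and the load and power numbers are hypothetical and chosen so that the core with the largest raw load is not the busiest once power is taken into account.

```python
from dataclasses import dataclass


@dataclass
class Core:
    cpu: int
    load: int   # sum of task weights on this core's run-queue
    power: int  # relative capacity of the core (illustrative values)


def idlest(cores):
    # Idlest run-queue: minimum load.
    return min(cores, key=lambda c: c.load)


def busiest(cores):
    # Busiest run-queue: maximum "load / power".
    return max(cores, key=lambda c: c.load / c.power)


cores = [Core(0, 3072, 1024), Core(1, 4096, 2048), Core(2, 1024, 1024)]
print(busiest(cores).cpu)  # core 0: 3072/1024 = 3.0 beats core 1: 4096/2048 = 2.0
print(idlest(cores).cpu)   # core 2: smallest load
```

With equal power on every core the ratio ordering matches the raw load ordering, which is the homogeneous case described above.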
General Cases
Active Load Balancing
[Figure: Core 0 and Core 1 shown over time. Core 1’s current task is going to DEAD and its run-queue is empty, so Core 1 is about to go IDLE; a task is migrated from Core 0’s run-queue to Core 1. Task states shown: READY, RUNNING, Going to DEAD. Assumption: all tasks have the same weight.]
General Cases
Active Load Balancing
 Implementation
When a task finishes its execution:
→ do_exit()
• Sets the task’s state to “TASK_DEAD”
• → schedule()
– In the back-end procedure, if the core’s state is IDLE, it calls “idle_balance()”
– → idle_balance()
» Calls “load_balance()” to pull a task from the busiest core’s run-queue
» → load_balance()
» Performs the task migration
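The pull performed when a core runs out of work can be sketched as follows. This is a simplified Python model, not the kernel implementation: run-queues are plain lists of `(name, weight)` pairs holding only waiting tasks, and the function reuses the name `idle_balance` purely as a label for the step described above.

```python
def load(rq):
    # Load = sum of the weights of the tasks on the run-queue.
    return sum(weight for _name, weight in rq)


def idle_balance(this_cpu, runqueues):
    """If this core has nothing left to run, pull one task from the busiest run-queue."""
    if runqueues[this_cpu]:
        return None  # the core still has work, so it is not going IDLE
    busiest_cpu = max(runqueues, key=lambda cpu: load(runqueues[cpu]))
    if not runqueues[busiest_cpu]:
        return None  # every run-queue is empty, nothing to pull
    task = runqueues[busiest_cpu].pop()  # task migration
    runqueues[this_cpu].append(task)
    return task


# Core 1's last task has just exited, so its run-queue is empty.
runqueues = {0: [("task1", 1), ("task2", 1), ("task3", 1)], 1: []}
pulled = idle_balance(1, runqueues)
print(pulled, runqueues)
```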
General Cases
Active Load Balancing
 Drawback
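Active load balancing alone can often achieve load balance.
However, if the workload gap between cores is large but every task runs for a long time, so that no core becomes IDLE for a while, load balance cannot be reached in the short term.
Avoiding this situation requires periodic balancing.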
General Cases
Passive (Periodic) Load Balancing
[Figure: Core 0 and Core 1 shown over time. For a long time there is no IDLE core, so a large load gap between the cores persists, which is undesirable. A periodic check then identifies the busiest and the idlest run-queue and migrates a task from the former to the latter. Task states shown: READY, RUNNING, Going to DEAD. Assumption: all tasks have the same weight.]
General Cases
Passive Load Balancing
 Triggered by scheduler_tick()
The current tick value is compared with the parameter “next_balance”, which is the time at which load balancing is next due
• Each run-queue has its own “next_balance”
• After a core performs active load balancing, the parameter is set to 1 second later
• After a core performs passive load balancing, the parameter is set to 1 minute later
• The difference between 1 second and 1 minute exists because a core that balanced while IDLE is likely to go IDLE again, so it should be rebalanced soon
 Executed by bottom-half handler
A softirq named “SCHED_SOFTIRQ” is handled by
“run_rebalance_domains()”
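The tick-driven trigger can be sketched in Python. This is a toy model under stated assumptions: `HZ = 250` is an assumed tick rate, the `RunQueue` class and `note_active_balance` helper are hypothetical, and the reset of “next_balance” is folded into the tick function for brevity. The 1-second and 1-minute intervals are the ones stated above.

```python
HZ = 250                    # ticks per second; assumed value for this sketch
ACTIVE_INTERVAL = 1 * HZ    # 1 second after active (idle) balancing
PASSIVE_INTERVAL = 60 * HZ  # 1 minute after passive (periodic) balancing


class RunQueue:
    def __init__(self, next_balance=0):
        self.next_balance = next_balance  # tick at which balancing is next due


raised = []  # softirqs raised so far (stand-in for the kernel's pending set)


def scheduler_tick(rq, now):
    """Called on every timer tick; raises the softirq once balancing is due."""
    if now >= rq.next_balance:
        raised.append("SCHED_SOFTIRQ")
        rq.next_balance = now + PASSIVE_INTERVAL
        return True
    return False


def note_active_balance(rq, now):
    """Hypothetical helper: a core that just balanced while IDLE re-checks soon."""
    rq.next_balance = now + ACTIVE_INTERVAL


rq = RunQueue(next_balance=100)
early = scheduler_tick(rq, 99)   # not yet due
due = scheduler_tick(rq, 100)    # due: softirq raised, next check in 1 minute

idle_rq = RunQueue()
note_active_balance(idle_rq, 200)  # next check in 1 second
```

A plain `>=` comparison is enough for this sketch; a real tick counter eventually wraps around, so kernel code compares such values with wrap-safe helpers.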
General Cases
Passive Load Balancing
Implementation
[Figure sequence: Core 0 and Core 1, each with a run-queue, a current task, and its own “next_balance”; one core holds the busiest run-queue and the other the idlest. A softirq table contains the entry SCHED_SOFTIRQ, whose handler function (bottom-half handler) is run_rebalance_domains(). The kernel thread “ksoftirqd” is placed on the idlest run-queue, and a task then migrates from the busiest run-queue to the idlest one. Task states shown: READY, RUNNING. Assumption: all tasks have the same weight.]
Start: the timer interrupt invokes “scheduler_tick()”
If the tick value is equal to or greater than the parameter “next_balance”,
Step 1: raises the softirq “SCHED_SOFTIRQ” to the kernel
Step 2: finds the idlest run-queue and invokes the kernel thread “ksoftirqd” on it
Step 3: the thread executes the function “do_ksoftirqd()”, which picks a softirq and calls its handler function
Step 4: the handler function finds the busiest run-queue to pull a task from
Step 5: task migration
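The raise, dispatch, and pull sequence in the steps above can be sketched as a small Python model. It is a simplification under stated assumptions: the softirq table is a dictionary, the handler `rebalance` is a hypothetical stand-in for the real handler, all tasks have the same weight so load is just the task count, and the placement of the kernel thread on the idlest core (Step 2) is not modelled.

```python
softirq_table = {}  # softirq name -> handler function
pending = []        # softirqs that have been raised but not yet handled

runqueues = {0: ["task1", "task2", "task3", "task4"], 1: ["task5", "task6"]}


def rebalance():
    # Step 4: find the busiest and idlest run-queues (equal weights, so count tasks).
    busiest = max(runqueues, key=lambda cpu: len(runqueues[cpu]))
    idlest = min(runqueues, key=lambda cpu: len(runqueues[cpu]))
    # Step 5: migrate one task only if that actually narrows the gap.
    if len(runqueues[busiest]) - len(runqueues[idlest]) >= 2:
        runqueues[idlest].append(runqueues[busiest].pop())


def raise_softirq(name):
    # Step 1: mark the softirq as pending.
    if name not in pending:
        pending.append(name)


def run_pending_softirqs():
    # Step 3: pick each pending softirq and call its handler.
    handled = []
    while pending:
        name = pending.pop(0)
        softirq_table[name]()
        handled.append(name)
    return handled


softirq_table["SCHED_SOFTIRQ"] = rebalance
raise_softirq("SCHED_SOFTIRQ")
handled = run_pending_softirqs()
print(handled, runqueues)
```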
General Cases
Passive Load Balancing
 Drawback
This algorithm has a large overhead
• It must find the maximum and the minimum load across all cores
• And, if the current core is not the idlest one,
– The kernel thread “ksoftirqd” must be enqueued on the idlest run-queue of another core and woken up
– Also, the current task of the target core, which has the idlest run-queue, is preempted by “ksoftirqd”
Tradeoff: balancing time interval vs. throughput and latency
Special Cases
 Execution of a new task
When a new task is created on a core, the kernel checks whether that core’s load is low enough to take on the new task
• If the load is unacceptable, the core’s current task is migrated to the idlest core’s run-queue and rescheduled
• The new task is then executed on the original core (not the idlest core)
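The new-task case can be sketched as below. This is a toy Python model built on assumptions: the slide does not say how an "unacceptable" load is decided, so the `max_load` threshold is hypothetical, and the first entry of each run-queue list is taken to be that core's current task.

```python
def load(rq):
    return sum(weight for _name, weight in rq)


def place_new_task(cpu, new_task, runqueues, max_load):
    """Start new_task on `cpu`; if that overloads the core, move its current task away."""
    others = [c for c in runqueues if c != cpu]
    overloaded = load(runqueues[cpu]) + new_task[1] > max_load  # hypothetical test
    if others and runqueues[cpu] and overloaded:
        current = runqueues[cpu].pop(0)  # the core's current task
        target = min(others, key=lambda c: load(runqueues[c]))  # idlest other core
        runqueues[target].append(current)
    runqueues[cpu].insert(0, new_task)  # the new task runs on the creating core


runqueues = {0: [("A", 1024), ("B", 1024)], 1: [("C", 1024)]}
place_new_task(0, ("N", 1024), runqueues, max_load=2048)
print(runqueues)
```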
 CPU’s shut down or intentionally being IDLE
When a core must be shut down or intentionally kept IDLE, as with POWER_SAVING_LOAD_BALANCE, all tasks in its run-queue are migrated to other cores
In effect, this case is simply task migration
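Emptying a core's run-queue can be sketched like this. It is a toy Python model: the slide only says the tasks go to "other cores", so sending each task to the currently idlest remaining core is an assumption of this sketch, and `drain_cpu` is a hypothetical name.

```python
def load(rq):
    return sum(weight for _name, weight in rq)


def drain_cpu(cpu, runqueues):
    """Migrate every task off `cpu`, each one to the idlest remaining core."""
    others = [c for c in runqueues if c != cpu]
    if not others:
        raise ValueError("no other core to migrate tasks to")
    moved = []
    while runqueues[cpu]:
        task = runqueues[cpu].pop(0)
        target = min(others, key=lambda c: load(runqueues[c]))
        runqueues[target].append(task)
        moved.append((task[0], target))
    return moved


runqueues = {0: [("A", 1), ("B", 1), ("C", 1)], 1: [("D", 1)], 2: []}
moved = drain_cpu(0, runqueues)
print(moved, runqueues)
```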
Limitation
 Global Fairness
Global fairness is the degree to which every task on an SMP system with multiple CPUs is guaranteed run-time in proportion to its weight.
In an SMP environment each CPU has its own run-queue, and load balancing moves tasks by considering only each run-queue’s load (sum of weights), so a task may fail to receive time proportional to its weight.
• Example) On a dual-core CPU with three tasks task1, task2, and task3 of equal weight, CPU1’s run-queue holds task1 and CPU2’s run-queue holds task2 and task3. Load balancing rarely takes place in this situation, so the tasks are not guaranteed equal run-time even though they have equal weight.
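The example can be worked through numerically. The Python sketch below is an illustration under assumptions: each CPU is taken to divide its time among its own tasks in proportion to weight, the weight value 1024 is arbitrary, and the "ideal" share assumes no single task is entitled to more than one whole CPU.

```python
from fractions import Fraction


def actual_shares(runqueues):
    """Share of one CPU each task gets when every CPU splits time by weight locally."""
    shares = {}
    for tasks in runqueues.values():
        total = sum(tasks.values())
        for name, weight in tasks.items():
            shares[name] = Fraction(weight, total)
    return shares


def ideal_shares(runqueues):
    """Share each task would get if all CPU time were split by weight globally."""
    weights = {name: w for tasks in runqueues.values() for name, w in tasks.items()}
    total = sum(weights.values())
    n_cpus = len(runqueues)
    return {name: Fraction(w * n_cpus, total) for name, w in weights.items()}


runqueues = {"CPU1": {"task1": 1024}, "CPU2": {"task2": 1024, "task3": 1024}}
actual = actual_shares(runqueues)  # task1 gets a whole CPU, task2 and task3 half each
ideal = ideal_shares(runqueues)    # each task would get 2/3 of a CPU
print(actual, ideal)
```

Moving task2 or task3 to CPU1 would only mirror the imbalance rather than remove it, which is consistent with load balancing rarely acting in this configuration.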
End
Q & A?