Load Balance in Linux sjchoi
Load Balance in Linux 2.6.32
Load balancing
Sung-joon Choi
Real-Time Operating Systems Lab.
Seoul National University
2011-09-15
Contents
Load balancing
Purpose
Definition
General cases
• Active load balancing
• Passive load balancing
Special cases
• Execution of a new task
• CPU’s shut down or intentionally being IDLE
Limitation
Load Balancing
Purpose
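As long as the system has more tasks than cores, keep every
core running without falling into the IDLE state
Mechanism
Keep the difference in workload between cores small
Definition
Load balancing
• Adjusting the workload so that each core in an SMP system carries
an equal load
Load
• The sum of the weights of all tasks in a core’s run-queue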
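The definition of load above can be illustrated with a minimal sketch. The `struct task` and `struct run_queue` types and the `rq_load()` helper are hypothetical simplifications made up for this example; they are not the kernel's own data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task: only the scheduling weight matters here. */
struct task {
    unsigned long weight;
};

/* Hypothetical, simplified run-queue: a fixed array of tasks. */
struct run_queue {
    const struct task *tasks;
    size_t nr_running;
};

/* Load of a run-queue = sum of the weights of all tasks queued on it. */
unsigned long rq_load(const struct run_queue *rq)
{
    unsigned long load = 0;

    assert(rq != NULL);
    for (size_t i = 0; i < rq->nr_running; i++)
        load += rq->tasks[i].weight;
    return load;
}
```

An empty run-queue therefore has load 0, and two run-queues with different task counts can still have the same load if the weights differ.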
Load Balancing
Definition (cont.)
Idlest run-queue
• A run-queue that has the minimum load among the cores
Busiest run-queue
• A run-queue that has the maximum value of the scale factor
“load / (core’s power)”
• If every core has the same power, this is the run-queue of the core
with the maximum load
• In a system with heterogeneous processors, the power of each core
may differ
• In general, power means capacity, i.e. the ability to get work done
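The two definitions can be sketched as a pair of selection functions. `struct core_stat`, `find_idlest()` and `find_busiest()` are hypothetical names used only for this illustration; comparing the ratios by cross-multiplication is a choice made here to stay in integer arithmetic, not something the slides specify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-core summary; field names are illustrative. */
struct core_stat {
    unsigned long load;   /* sum of task weights on the core's run-queue */
    unsigned long power;  /* relative capacity of the core, must be > 0 */
};

/* Index of the idlest core: the one with the minimum load. */
size_t find_idlest(const struct core_stat *c, size_t n)
{
    size_t idlest = 0;

    assert(c != NULL && n > 0);
    for (size_t i = 1; i < n; i++)
        if (c[i].load < c[idlest].load)
            idlest = i;
    return idlest;
}

/* Index of the busiest core: the one with the maximum load / power.
 * a.load / a.power > b.load / b.power is tested as
 * a.load * b.power > b.load * a.power to avoid division. */
size_t find_busiest(const struct core_stat *c, size_t n)
{
    size_t busiest = 0;

    assert(c != NULL && n > 0);
    for (size_t i = 1; i < n; i++)
        if (c[i].load * c[busiest].power > c[busiest].load * c[i].power)
            busiest = i;
    return busiest;
}
```

With equal power on every core, `find_busiest()` reduces to picking the maximum load. With unequal power, a core with a smaller load can still be the busiest if its capacity is proportionally smaller.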
General Cases
Active Load Balancing
[Figure: Core 0 and Core 1 with their run-queues and current tasks over time. Core 1’s current task is going to DEAD and its run-queue is empty, so Core 1 is going to IDLE; a task is then migrated from Core 0’s run-queue to Core 1. Legend: READY, RUNNING, Going to DEAD. Assumption: all tasks have the same weight.]
General Cases
Active Load Balancing
Implementation
When a task finishes its execution
do_exit()
• Sets the task’s state to “TASK_DEAD”
• Calls schedule()
– Inside schedule(), if the core’s run-queue has become empty (the core is about to go IDLE), it calls
“idle_balance()”
– idle_balance()
» Calls “load_balance()” to pull a task from the busiest core’s
run-queue
» load_balance()
» Performs the task migration
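The call chain above can be modelled in a few lines. This is a sketch under the slide's assumption that all tasks have the same weight, so a run-queue is reduced to a task count that includes the current task. `task_exit()` and `pull_from_busiest()` are hypothetical stand-ins for the do_exit()/schedule()/idle_balance() path and for load_balance(); they do not reproduce the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for load_balance(): pull one task from the core
 * with the most tasks. Returns 1 if a task was migrated, 0 otherwise. */
int pull_from_busiest(unsigned int *nr_running, size_t n, size_t this_cpu)
{
    size_t busiest = this_cpu;

    for (size_t i = 0; i < n; i++)
        if (nr_running[i] > nr_running[busiest])
            busiest = i;

    /* Nothing to pull unless another core has a task waiting behind
     * the one it is currently running. */
    if (busiest == this_cpu || nr_running[busiest] < 2)
        return 0;
    nr_running[busiest]--;
    nr_running[this_cpu]++;
    return 1;
}

/* Hypothetical stand-in for do_exit() followed by schedule(): the current
 * task dies, and if that empties the run-queue the core tries to pull
 * work before going idle (the idle_balance() step). */
int task_exit(unsigned int *nr_running, size_t n, size_t cpu)
{
    assert(nr_running != NULL && cpu < n && nr_running[cpu] > 0);
    nr_running[cpu]--;              /* task goes to TASK_DEAD */
    if (nr_running[cpu] == 0)       /* core is about to go IDLE */
        return pull_from_busiest(nr_running, n, cpu);
    return 0;
}
```

Starting from task counts {3, 1}, a task exit on core 1 empties its run-queue and pulls one task from core 0, leaving {2, 1}.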
General Cases
Active Load Balancing
Drawback
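Active load balancing alone can achieve load balancing well enough
in many cases
However, if each task runs for a long time, no core may become
IDLE for a while even though the load gap between cores is large, and
load balancing cannot be achieved in the short term
To avoid this situation, periodic balancing is needed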
General Cases
Passive(Periodic) load balancing
[Figure: Core 0 and Core 1 with their run-queues and current tasks over time. For a long time there is no IDLE core, so a large load gap between the cores persists, which is undesirable. A periodic check identifies the busiest and the idlest run-queue, and a task is migrated from the former to the latter. Legend: READY, RUNNING, Going to DEAD. Assumption: all tasks have the same weight.]
General Cases
Passive Load Balancing
Triggered by scheduler_tick()
The tick value is compared with the parameter “next_balance”,
the time at which load balancing is next due
• Each run-queue has its own “next_balance”
• If a core performs active load balancing, the parameter is set to
1 second later
• If a core performs passive load balancing, the parameter is set
to 1 minute later
• The reason for the 1 second vs. 1 minute difference: a core that was balanced while going IDLE
is likely to become IDLE again, so it should be rebalanced soon
Executed by a bottom-half handler
The softirq “SCHED_SOFTIRQ” is handled by
“run_rebalance_domains()”
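The tick check and the two re-arm intervals described above can be sketched as follows. `struct rq_timer`, `tick_check()`, `rearm()` and the `HZ` value of 1000 are assumptions made for this example. `tick_after_eq()` is a local helper written in the spirit of the kernel's wraparound-safe time comparison macros, not the kernel macro itself.

```c
#include <assert.h>
#include <stddef.h>

#define HZ 1000UL  /* assumed tick rate, for illustration only */

/* Wraparound-safe "a >= b" for free-running tick counters. */
int tick_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* Hypothetical per-run-queue balancing state. */
struct rq_timer {
    unsigned long next_balance;  /* tick at which balancing is next due */
    int softirq_pending;         /* stand-in for a raised SCHED_SOFTIRQ */
};

/* Called on every tick: request a balance pass once the deadline is reached. */
void tick_check(struct rq_timer *rq, unsigned long now)
{
    assert(rq != NULL);
    if (tick_after_eq(now, rq->next_balance))
        rq->softirq_pending = 1;
}

/* Re-arm the deadline: 1 second after an idle-triggered (active) pass,
 * 1 minute after a periodic (passive) pass, as stated on the slide. */
void rearm(struct rq_timer *rq, unsigned long now, int was_idle_balance)
{
    assert(rq != NULL);
    rq->next_balance = now + (was_idle_balance ? HZ : 60UL * HZ);
    rq->softirq_pending = 0;
}
```

The signed-difference comparison keeps the check correct even when the tick counter wraps around, which a plain `now >= next_balance` would not.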
General Cases
Passive Load Balancing
Implementation – start load balance
[Figure: Core 0 and Core 1, each with a run-queue, a current task and a next_balance value. Tasks 1 to 7 are spread unevenly, so one run-queue is the busiest and the other the idlest. Legend: READY, RUNNING. Assumption: all tasks have the same weight.]
The timer interrupt invokes “scheduler_tick()”
If the tick value is equal to or greater than the parameter “next_balance”, load balancing starts
General Cases
Passive Load Balancing
Implementation – step1
[Figure: Core 0 and Core 1 with their run-queues, current tasks and next_balance values, plus the softirq table with a “SCHED_SOFTIRQ” entry. Legend: READY, RUNNING. Assumption: all tasks have the same weight.]
If the tick value is equal to or greater than the parameter “next_balance”,
Step1: the core raises the softirq “SCHED_SOFTIRQ” in the kernel
General Cases
Passive Load Balancing
Implementation – step2
[Figure: Core 0 and Core 1 with the softirq table; the kernel thread “ksoftirqd” has been placed on a run-queue. Legend: READY, RUNNING. Assumption: all tasks have the same weight.]
Step2: the kernel finds the idlest run-queue and wakes the kernel thread “ksoftirqd” on it
General Cases
Passive Load Balancing
Implementation – step3
[Figure: Core 0 and Core 1 with the softirq table; “ksoftirqd” is now a current task, and the “SCHED_SOFTIRQ” entry points to its handler function (bottom-half handler) “run_rebalance_domains()”. Legend: READY, RUNNING. Assumption: all tasks have the same weight.]
Step3: the thread calls “do_softirq()”, which
picks a pending softirq and calls its handler function
General Cases
Passive Load Balancing
Implementation – step4
[Figure: Core 0 and Core 1 with the softirq table; the handler function (bottom-half handler) “run_rebalance_domains()” of “SCHED_SOFTIRQ” is running in “ksoftirqd”. Legend: READY, RUNNING. Assumption: all tasks have the same weight.]
Step4: the handler function finds the busiest run-queue to pull a task from
General Cases
Passive Load Balancing
Implementation – step5
[Figure: Core 0 and Core 1 with the softirq table and the handler function “run_rebalance_domains()”; a task (Task 4) is shown moving from the busiest run-queue to the idlest run-queue. Legend: READY, RUNNING. Assumption: all tasks have the same weight.]
Step5: the task is migrated
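Steps 4 and 5 can be sketched as a single pull operation run on behalf of the idlest core. As in the slides, all tasks are assumed to have the same weight, so each run-queue is reduced to a task count. `periodic_balance()` is a hypothetical helper, and moving half of the gap is an assumption chosen so that the two cores end up at most one task apart; the slides do not state how many tasks are moved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the balancing handler running for `this_cpu`:
 * find the busiest core (Step4) and pull tasks from it (Step5).
 * Returns the number of tasks pulled onto this_cpu. */
unsigned int periodic_balance(unsigned int *nr_running, size_t n, size_t this_cpu)
{
    size_t busiest = this_cpu;
    unsigned int to_move;

    assert(nr_running != NULL && this_cpu < n);

    /* Step4: find the busiest run-queue. */
    for (size_t i = 0; i < n; i++)
        if (nr_running[i] > nr_running[busiest])
            busiest = i;

    /* Step5: migrate half of the gap, so both cores end up at most
     * one task apart. The gap is 0 if this core is already the busiest. */
    to_move = (nr_running[busiest] - nr_running[this_cpu]) / 2;
    nr_running[busiest] -= to_move;
    nr_running[this_cpu] += to_move;
    return to_move;
}
```

With five tasks on one core and two on the other, one task is pulled and the counts become four and three.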
General Cases
Passive Load Balancing
Implementation
[Figure: the run-queues, current tasks and next_balance values of Core 0 and Core 1, shown before and after the task migration.]
General Cases
Passive Load Balancing
Drawback
This algorithm has a large overhead
• The algorithm has to find the maximum and minimum load
among all cores
• And, if the current core is not the idlest one,
– The kernel thread “ksoftirqd” has to be enqueued on the idlest run-queue, which belongs to another core, and woken up
– Also, the current task of the target core, the one with the idlest run-queue,
is preempted by “ksoftirqd”
Tradeoff: balancing time interval vs. throughput and latency
Special Cases
Execution of a new task
When a new task is created on a core, the kernel checks whether
the core’s load is low enough to take on the new task
• If the load is too high, the core’s current task is migrated
to the idlest core’s run-queue and rescheduled there
• The new task then runs on the original core (not the idlest core)
CPU’s shut down or intentionally being IDLE
When a core has to be shut down or is intentionally made
IDLE, as with POWER_SAVING_LOAD_BALANCE,
all tasks in its run-queue are migrated to other cores
In effect, this case is just a task migration
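The shutdown case can be sketched as draining one run-queue into the others. The slide only says the tasks go to "other cores"; sending each task to the core with the fewest tasks at that moment is an assumption made for this example, and `evacuate_core()` is a hypothetical helper. Tasks are again assumed to have equal weight, so each run-queue is a task count.

```c
#include <assert.h>
#include <stddef.h>

/* Move every task off `dead_cpu`, one at a time, each to the remaining
 * core that currently has the fewest tasks (an assumed policy).
 * Returns the number of tasks migrated. */
unsigned int evacuate_core(unsigned int *nr_running, size_t n, size_t dead_cpu)
{
    unsigned int moved = 0;

    assert(nr_running != NULL && n >= 2 && dead_cpu < n);
    while (nr_running[dead_cpu] > 0) {
        /* Start from any core other than the one being emptied. */
        size_t target = (dead_cpu == 0) ? 1 : 0;

        for (size_t i = 0; i < n; i++)
            if (i != dead_cpu && nr_running[i] < nr_running[target])
                target = i;
        nr_running[dead_cpu]--;
        nr_running[target]++;
        moved++;
    }
    return moved;
}
```

Once the loop finishes, the emptied core has nothing left to run and can be shut down or left idle without stranding any task.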
Limitation
Global Fairness
Global fairness is the degree to which every task on an SMP system
with multiple CPUs is guaranteed run-time in proportion to its
weight.
In an SMP environment each CPU has its own run-queue, and load
balancing moves tasks based only on the load (sum of weights) of
each run-queue, so a task may not receive time in proportion to
its weight.
• Example) On a dual-core CPU with three tasks of equal weight, task1, task2 and
task3, suppose CPU1’s run-queue holds task1 and CPU2’s
run-queue holds task2 and task3. Load balancing rarely
happens in this case, so the tasks are not guaranteed equal
run-time even though they have equal weight.
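The example above can be worked through numerically. The sketch below expresses CPU time in per-mille of one CPU. `actual_share()` and `ideal_share()` are hypothetical helpers, and the weight 1024 in the usage note is only an illustrative value for "equal weights".

```c
#include <assert.h>
#include <stddef.h>

/* Share of one CPU (in per-mille) that a task actually receives when each
 * run-queue divides its CPU among its tasks in proportion to weight. */
unsigned long actual_share(unsigned long weight, unsigned long rq_load)
{
    assert(weight > 0 && rq_load >= weight);
    return 1000UL * weight / rq_load;
}

/* Share (in per-mille of one CPU) the task would receive under global
 * fairness: its fraction of the total weight times the number of CPUs,
 * capped at one full CPU because a task runs on one core at a time. */
unsigned long ideal_share(unsigned long weight, unsigned long total_weight,
                          unsigned long nr_cpus)
{
    unsigned long share;

    assert(weight > 0 && total_weight >= weight && nr_cpus > 0);
    share = 1000UL * nr_cpus * weight / total_weight;
    return share > 1000UL ? 1000UL : share;
}
```

With three tasks of weight 1024 on two CPUs, `actual_share(1024, 1024)` gives 1000 for task1, alone on CPU1, while `actual_share(1024, 2048)` gives 500 for each of task2 and task3 on CPU2. Under global fairness each task would get `ideal_share(1024, 3072, 2)`, which is 666 after integer division. Moving one task from CPU2 to CPU1 would only mirror the imbalance, which is one way to see why balancing on run-queue load alone does not fix this.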
End
Q & A?