Document 7515523

Transcript Document 7515523

LINUX SCHEDULING
Evolution in the 2.6 Kernel
Kevin Lambert
Maulik Mistry
Cesar Davila
Jeremy Taylor
Main Topics

General Process Scheduling

2.6 Kernel Processing (short-term)

I/O Scheduling (disk requests)
General Scheduler
Considerations
1 - Preemptive vs. Cooperative
2 - I/O-bound vs. CPU-bound
3 - Throughput vs. Latency
PRIORITY!
Text Editor vs. Video Encoder
Which one has more priority in Linux?

Which requires more processing?

Which one requires more I/O?

Which one has greater priority?
Preemption


Due to timeslice running out
Due to the running process having lower priority than a newly runnable process
Where Did We Come From?
Pre 2.6 Schedulers

Didn’t utilize SMP very well

Single runqueue lock meant idle processors awaiting lock release

Preemption not possible

Lower priority task can execute while high priority task waits

O(n) complexity

Scheduling time grows with the number of runnable tasks.
Where Are We Now?
The 2.6 Scheduler

Each CPU has a separate runqueue

140 FIFO priority lists


1-100 are for real-time tasks
101-140 are for user tasks

Active and Expired runqueues

O(1) complexity

Constant time thanks to runqueue swap
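
The active/expired arrangement above can be sketched roughly as follows. This is a minimal illustration, not kernel code: the class and method names are hypothetical, priorities are numbered 0-139 here rather than 1-140 as on the slide, and a Python integer stands in for the fixed-size bitmap the kernel scans with a single instruction.

```python
from collections import deque

NUM_PRIOS = 140  # number of priority lists, as on the slide


class PrioArray:
    """One set of 140 FIFO lists plus a bitmap of non-empty levels."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p is set while queues[p] is non-empty

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def dequeue_highest(self):
        if self.bitmap == 0:
            return None
        # Lowest set bit = numerically lowest = highest priority level.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.queues[prio].popleft()
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)
        return task, prio


class RunQueue:
    """Per-CPU runqueue holding an active and an expired array."""

    def __init__(self):
        self.active = PrioArray()
        self.expired = PrioArray()

    def pick_next(self):
        if self.active.bitmap == 0:
            # All tasks used their timeslice: swap the two arrays.
            self.active, self.expired = self.expired, self.active
        return self.active.dequeue_highest()

    def timeslice_expired(self, task, prio):
        self.expired.enqueue(task, prio)
```

Picking a task never walks the list of all tasks: it finds the first set bit, pops one FIFO entry, and at worst swaps two references, which is where the constant-time behavior comes from.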
Where Are We Now?
(cont)
The 2.6 Scheduler

Preemption

Dynamic task prioritization


Up to -5 niceness for I/O-bound

Up to +5 niceness for CPU-bound

Remember, less niceness is good… in this case.
SMP load balancing

Checks runqueues every 200 ms
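
The dynamic prioritization described above (a bonus or penalty of up to 5 levels) might look like this in outline. It is a sketch under assumptions: the `interactivity` score is a hypothetical stand-in for however the scheduler measures sleep versus run time, and priorities are numbered 0-139 with user tasks at 100-139.

```python
MAX_RT_PRIO = 100  # first user-task priority in 0-based numbering
MAX_PRIO = 140     # one past the last priority
MAX_BONUS = 5      # the slide's "up to 5" adjustment in either direction


def static_prio(nice):
    """Map nice values -20..19 onto user priorities 100..139."""
    return MAX_RT_PRIO + nice + 20


def dynamic_prio(nice, interactivity):
    """Return the effective priority for a user task.

    interactivity is a hypothetical score in [-1.0, 1.0]:
    +1.0 means mostly sleeping (I/O-bound), -1.0 means always
    running (CPU-bound).
    """
    bonus = round(interactivity * MAX_BONUS)
    prio = static_prio(nice) - bonus  # lower number = higher priority
    # Never let a user task cross into the real-time range or past the end.
    return max(MAX_RT_PRIO, min(MAX_PRIO - 1, prio))
```

With nice 0, a fully I/O-bound task lands at 115 and a fully CPU-bound one at 125, so the text editor from the earlier slide would be picked ahead of the video encoder.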
CFS – The Future is Now!


Completely Fair Scheduler

Merged into the 2.6.23 kernel

Runs task with the “gravest need”

Guarantees fairness (CPU usage)
No runqueues!

Uses a time-ordered red-black binary tree

Leftmost node is the next process to run
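
The "leftmost node runs next" idea can be sketched as below. Python has no built-in red-black tree, so this sketch swaps in a binary heap from `heapq`, which also hands back the smallest key in O(log n); the `FairQueue` class and its method names are hypothetical.

```python
import heapq
import itertools


class FairQueue:
    """Time-ordered queue: the task with the least CPU time runs next."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal run times

    def add(self, name, runtime_ns):
        heapq.heappush(self._heap, (runtime_ns, next(self._seq), name))

    def run_next(self, slice_ns):
        # The smallest key plays the role of the tree's leftmost node.
        runtime_ns, _, name = heapq.heappop(self._heap)
        # After running, the task is reinserted further "to the right".
        self.add(name, runtime_ns + slice_ns)
        return name
```

A task that has run little stays near the front, and each run pushes it back behind tasks that have had less CPU time.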
Red/Black Tree Rules
1) Every node has two children, each colored either red or black.
2) Every tree leaf node is colored black.
3) Every red node has both of its children colored black.
4) Every path from the root to a tree leaf contains the same number (the "black-height") of black nodes.
Cited from http://mathworld.wolfram.com/Red-BlackTree.html
Good animation at http://www.geocities.com/SiliconValley/Network/1854/Rbt.html
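
The four rules above can be checked mechanically. The following is a small validator sketch, not an insertion algorithm; the `Node` type is hypothetical, and `None` stands for the black leaf nodes of rule 2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    red: bool  # rule 1: every node is either red or black
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def black_height(node):
    """Return the subtree's black-height, or -1 if a rule is broken."""
    if node is None:
        return 1  # rule 2: leaves (None here) count as black
    if node.red:
        for child in (node.left, node.right):
            if child is not None and child.red:
                return -1  # rule 3: a red node needs black children
    left = black_height(node.left)
    right = black_height(node.right)
    if left == -1 or right == -1 or left != right:
        return -1  # rule 4: all paths need the same black count
    return left + (0 if node.red else 1)


def is_red_black(root):
    return black_height(root) != -1
```

Rule 4 is what keeps the tree balanced: no root-to-leaf path can be more than twice as long as another, so lookups and inserts stay logarithmic.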
CFS Features
(cont)
No timeslices!... sort of

Uses wait_runtime (individual) and fair_clock (queue-wide)

Processes build up CPU debt

Different priorities “spend” time differently

Half priority task sees time pass twice as fast
O(log n) complexity

Only marginally slower than O(1) at very large numbers of inputs
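
The statement that a half-priority task sees time pass twice as fast can be illustrated with a weighted clock. This is a sketch under assumptions: the weights, the `BASE_WEIGHT` constant, and the function names are chosen for illustration and are not taken from the slides.

```python
BASE_WEIGHT = 1024  # assumed weight of a normal-priority task


def virtual_delta(real_ns, weight):
    """Virtual time charged for real_ns of CPU at the given weight."""
    return real_ns * BASE_WEIGHT // weight


def simulate(weights, slices, slice_ns=1000):
    """Always run the task with the least virtual time; return CPU used."""
    vtime = {name: 0 for name in weights}
    cpu = {name: 0 for name in weights}
    for _ in range(slices):
        # Ties are broken by name so the run order is deterministic.
        name = min(vtime, key=lambda n: (vtime[n], n))
        cpu[name] += slice_ns
        vtime[name] += virtual_delta(slice_ns, weights[name])
    return cpu
```

For example, `simulate({"full": 1024, "half": 512}, 300)` gives the full-weight task twice the CPU time of the half-weight one, because the half-weight task's clock advances twice as fast for the same real run time.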
IO Scheduling
• Minimize latency on disk seeks
• Prioritize processes’ IO requests
• Efficiently share disk bandwidth between processes
• Guarantee that requests are issued before a deadline
• Avoid starvation
CFQ


What it's good for:

Default system for Red Hat Enterprise Linux 4

Distributes bandwidth equally among IO requests and is excellent for multi-user environments

Offers good performance across the widest range of applications and IO system designs, including those that require balancing

Considered anticipatory because a process's queue idles briefly at the end of synchronous IO, allowing further IO from that process to be handled.
How CFQ Works



Assigns requests to queues and priorities based on the process they are coming from

Current time recorded when task enters runqueue

Traffic divides into a fixed number of buckets (64 by default)

Hash code from networking atm

Round robin all non-empty buckets
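
The bucket scheme on this slide can be sketched as follows. It is an illustration only: the class name is hypothetical, and a plain modulo on the process ID stands in for the hash function, which the slide does not specify.

```python
from collections import deque

NUM_BUCKETS = 64  # default bucket count from the slide


class BucketScheduler:
    """Hash requests into buckets by process, then serve round robin."""

    def __init__(self, num_buckets=NUM_BUCKETS):
        self.buckets = [deque() for _ in range(num_buckets)]
        self.cursor = 0  # bucket to examine first on the next dispatch

    def submit(self, pid, request):
        # Assumed hash: process ID modulo the number of buckets.
        self.buckets[pid % len(self.buckets)].append(request)

    def dispatch(self):
        """Return one request from the next non-empty bucket, or None."""
        n = len(self.buckets)
        for step in range(n):
            idx = (self.cursor + step) % n
            if self.buckets[idx]:
                self.cursor = (idx + 1) % n
                return self.buckets[idx].popleft()
        return None
```

Because each dispatch moves on to the next non-empty bucket, a process that submits many requests cannot starve one that submits only a few.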
How CFQ Works



IO scheduler uses a per-queue function (not per-bucket)
Runnable tasks use a 'fair clock' that advances at 1/N of real time for N runnable tasks, to increase priority
Several innovations made for CFQ V2
http://www.redhat.com/magazine/008jun05/features/schedulers/
