Transcript Interrupts

Interrupts (Hardware)

Interrupt Descriptor Table

IDT specified as a segment using the IDTR register

Slide #2

Slide #3

Calling the IRQ handler

Interrupt Context

Exceptions

• First 32 IRQ vectors in IDT
  – Correspond to events generated by the CPU
  – Page fault, divide by zero, invalid instruction, etc
    • Full list in the CPU architecture manuals
  – Generally it's an "error" or "exception" encountered during CPU instruction execution
• IDT is referenced directly by the CPU
  – Completely internalized

External Interrupts

• Events triggered by devices connected to the system
  – Network packet arrivals, disk operation completion, timer updates, etc
  – Can be mapped to any IRQ vector above the exceptions
    • (IRQs 32-255)
• External because they happen outside the CPU
  – External logic signals CPU and notifies it which handler to execute
  – Managed by Interrupt Controller
    • Special device included in southbridge

Interrupt Controllers

• Translate device IRQ signals to CPU IRQ vectors
  – Each device has only a single pin
    • High = IRQ pending, Low = no IRQ
  – Interrupt controller maps devices to vectors
• Two x86 controller classes
  – Legacy: 8259 PIC
    • Connected to a set of default I/O ports on CPU
  – Modern: APIC + IOAPIC
    • Memory mapped into each CPU's physical memory
      – (How?)
    • Next generation APIC (x2APIC) accessed via MSRs
      – Model specific registers: control registers accessed via special instructions
        » WRMSR, RDMSR

8259 PIC

• Allows the mapping of 8 IRQ pins (from devices) to 8 separate vectors (to CPU)
• Assumes continuous numbering
• Assign the PIC a vector offset
• Each pin index is added to that offset to calculate the vector
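
A minimal user-space sketch of the offset rule above, assuming a single PIC. The helper name `pic_vector` and the constant `PIC_PINS` are made up for illustration and are not kernel or hardware names.

```c
#include <assert.h>
#include <stdint.h>

#define PIC_PINS 8 /* one 8259 has eight IRQ input pins (hypothetical constant name) */

/* Vector delivered to the CPU for an input pin: vector offset + pin index. */
uint8_t pic_vector(uint8_t vector_offset, unsigned pin)
{
    assert(pin < PIC_PINS); /* only pins 0..7 exist on one PIC */
    return (uint8_t)(vector_offset + pin);
}
```

With an offset of 32, pin 0 maps to vector 32 and pin 7 to vector 39, so the eight vectors are contiguous.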

8259 Cont’d

• 1 PIC only supports 8 device lines
  – Often more than 8 devices in a system
  – Solution: Add more PICs
    • But x86 CPUs only have 1 INTR input pin
• x86 Solution:
  – Chain the PICs together (master and slave)
    • Slave PIC is attached to IRQ pin 2 of the master
    • CPU interfaces directly with master

• PC architecture assigns common default devices to each PIC input pin
• Initially PIC vector offsets set to 32, just above exceptions

IRQ Example

IRQ   INT   Hardware Device
0     32    Timer
1     33    Keyboard
2     34    PIC Cascading
3     35    Second serial port
4     36    First serial port
6     38    Floppy Disk
8     40    System Clock
10    42    Network Interface
11    43    USB port, sound card
12    44    PS/2 Mouse
13    45    Math Coprocessor
14    46    EIDE first controller
15    47    EIDE second controller

Slide #12
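
The IRQ-to-INT column in the table follows from the cascade: IRQs 0-7 sit on the master, IRQs 8-15 on the slave. The sketch below reproduces that arithmetic, assuming the slave's vector offset directly follows the master's eight vectors. All names in it are hypothetical.

```c
#include <assert.h>

#define MASTER_OFFSET 32 /* first vector above the CPU exceptions */
#define PINS_PER_PIC  8

/* 1 if the IRQ line is wired to the slave PIC, 0 if it is on the master. */
int irq_on_slave(unsigned irq)
{
    assert(irq < 2 * PINS_PER_PIC);
    return irq >= PINS_PER_PIC;
}

/* Pin index on whichever PIC owns the line. */
unsigned irq_pin(unsigned irq)
{
    assert(irq < 2 * PINS_PER_PIC);
    return irq % PINS_PER_PIC;
}

/* Vector = owning PIC's offset + pin index (slave offset assumed to be 40). */
unsigned irq_to_vector(unsigned irq)
{
    unsigned offset = irq_on_slave(irq) ? MASTER_OFFSET + PINS_PER_PIC
                                        : MASTER_OFFSET;
    return offset + irq_pin(irq);
}
```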

APIC

• Problem: PICs don’t support multiple CPUs
  – Only one INT signal, so only one CPU can receive interrupts
• SMP required a new solution
  – APIC + IOAPIC
  – Idea: Separate the responsibility of the PIC into two components
    • APIC = Interfaces with CPU
    • IOAPIC = Interfaces with devices

APIC

• Each CPU has its own local APIC
  – In charge of keeping track of interrupts bound for its assigned CPU
  – Since Pentium Pro, the APIC has been implemented in the CPU die
• APIC interfaces with CPU's interrupt pins to invoke correct IDT vector
  – This is its primary responsibility
• But it does other things as well
  – Timer: APIC has its own timer device per CPU
    • Legacy PC had a separate timer device on the motherboard
    • Allows each CPU to have its own timer
  – Inter-Processor Interrupt: Allows cross CPU communication
    • 1 CPU can send an interrupt to another one
    • Why would you want to do this?

How does the APIC do this?

ICC bus

• APICs and IOAPICs share a common communication bus
  – ICC bus: Interrupt Controller Communication Bus
• Handles routing of interrupts to the correct APIC

IOAPIC

• Connects devices to ICC bus
  – Must still translate IRQ pins from devices to vectors
  – But now must also select destination APIC
• Typically has 24 I/O Redirection Table Registers
  – Specifies vector # to send to APIC
  – Specifies which APIC (or group of APICs) can accept the IRQ
    • Several methods of specifying APIC addresses
  – Allows masking of IRQ pins
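
A rough model of what one redirection table entry decides, assuming a simplified three-field entry. The struct layout here is illustrative only and does not match the real 64-bit register format; every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IOAPIC_PINS 24 /* typical number of redirection table registers */

/* Simplified entry: NOT the hardware bit layout, just the three decisions. */
struct redir_entry {
    uint8_t vector;    /* vector number to send to the APIC */
    uint8_t dest_apic; /* ID of the local APIC that should accept the IRQ */
    bool    masked;    /* true = this IRQ pin is ignored */
};

/* What would travel over the ICC bus for a raised pin. */
struct irq_message {
    bool    delivered;
    uint8_t vector;
    uint8_t dest_apic;
};

/* Look up a raised pin and build the message, unless the pin is masked. */
struct irq_message ioapic_route(const struct redir_entry table[IOAPIC_PINS],
                                unsigned pin)
{
    struct irq_message msg = { false, 0, 0 };

    assert(pin < IOAPIC_PINS);
    if (!table[pin].masked) {
        msg.delivered = true;
        msg.vector    = table[pin].vector;
        msg.dest_apic = table[pin].dest_apic;
    }
    return msg;
}
```

Because each entry carries its own destination, two pins can use the same vector number on different CPUs.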

IO-APIC configuration

• Usually initialized to mirror the PIC configuration
  – But as architectures diverge from legacy PC, this is becoming harder
  – Generally speaking resource discovery is UGLY
• OS then can map IO-APIC entries to whichever vector on whichever CPU it wants
  – This means that IRQ vectors can be reused between CPUs

Interrupt Vectors

Vector Range   Use
0-19           Nonmaskable interrupts and exceptions
20-31          Intel-reserved
32-127         External interrupts (IRQs)
128            System Call exception
129-238        External interrupts (IRQs)
239            Local APIC timer interrupt
240            Local APIC thermal interrupt
241-250        Reserved by Linux for future use
251-253        Interprocessor interrupts
254            Local APIC error interrupt
255            Local APIC spurious interrupt

Slide #18
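
The vector table can be read as a simple range lookup. The helper below is only a restatement of the table in code; `vector_use` is a hypothetical name and not a kernel function.

```c
#include <assert.h>
#include <string.h>

/* Map a vector number (0-255) to its use, following the table above. */
const char *vector_use(unsigned vector)
{
    assert(vector <= 255);
    if (vector <= 19)  return "Nonmaskable interrupts and exceptions";
    if (vector <= 31)  return "Intel-reserved";
    if (vector <= 127) return "External interrupts (IRQs)";
    if (vector == 128) return "System Call exception";
    if (vector <= 238) return "External interrupts (IRQs)";
    if (vector == 239) return "Local APIC timer interrupt";
    if (vector == 240) return "Local APIC thermal interrupt";
    if (vector <= 250) return "Reserved by Linux for future use";
    if (vector <= 253) return "Interprocessor interrupts";
    if (vector == 254) return "Local APIC error interrupt";
    return "Local APIC spurious interrupt";
}
```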

IRQ Handling

1. Monitor IRQ lines for raised signals.
   If multiple IRQs raised, select lowest # IRQ.
2. If raised signal detected
   1. Converts raised signal into vector (0-255).
   2. Stores vector in I/O port, allowing CPU to read.
   3. Sends raised signal to CPU INTR pin.
   4. Waits for CPU to acknowledge interrupt.
   5. Kernel runs do_IRQ().
   6. Clears INTR line.
3. Goto step 1.
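
The selection rule in step 1 and the conversion in step 2.1 can be sketched in user space as below. This assumes 16 IRQ lines held in a bitmask and a single vector offset; the function names are made up, and the CPU handshake (INTR, acknowledge) is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Lowest-numbered raised IRQ line in the pending mask, or -1 if none. */
int select_irq(uint16_t pending)
{
    for (int irq = 0; irq < 16; irq++)
        if (pending & (1u << irq))
            return irq;
    return -1;
}

/*
 * Service every pending line in priority order.
 * Writes the vector for each serviced IRQ to vectors_out and
 * returns how many were serviced.
 */
int service_pending(uint16_t pending, int vector_offset, int vectors_out[16])
{
    int count = 0;
    int irq;

    while ((irq = select_irq(pending)) >= 0) {
        vectors_out[count++] = vector_offset + irq; /* IRQ line -> vector */
        pending &= (uint16_t)~(1u << irq);          /* line handled, clear it */
    }
    return count;
}
```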

Slide #19

do_IRQ

1. Kernel jumps to entry point in entry.S.

2. Entry point saves registers, calls do_IRQ() .

3. Finds IRQ number in saved %EAX register.

4. Looks up IRQ descriptor using IRQ #.

5. Acknowledges receipt of interrupt.

6. Disables interrupt delivery on line.

7. Calls handle_IRQ_event() to run handlers.

8. Cleans up and returns.

9. Jumps to ret_from_intr().

Slide #20

handle_IRQ_event()

    fastcall int handle_IRQ_event(unsigned int irq, struct pt_regs *regs,
                                  struct irqaction *action)
    {
        int ret, retval = 0, status = 0;

        /* Re-enable interrupts unless the handler asked to run with them off. */
        if (!(action->flags & SA_INTERRUPT))
            local_irq_enable();

        /* Call every handler registered on this (possibly shared) line. */
        do {
            ret = action->handler(irq, action->dev_id, regs);
            if (ret == IRQ_HANDLED)
                status |= action->flags;
            retval |= ret;
            action = action->next;
        } while (action);

        if (status & SA_SAMPLE_RANDOM)
            add_interrupt_randomness(irq);
        local_irq_disable();

        return retval;
    }
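
To see how the handler chain behaves on a shared line, here is a user-space model of the loop above. It is a sketch, not kernel code: `IRQ_NONE`, `IRQ_HANDLED`, and `struct irqaction` are local stand-ins that borrow the kernel names, the `regs` argument and flag handling are left out, and `struct fake_dev` and `fake_handler` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { IRQ_NONE = 0, IRQ_HANDLED = 1 }; /* local stand-ins for the kernel values */

struct irqaction {
    int (*handler)(int irq, void *dev_id);
    void *dev_id;            /* tells handlers on a shared line apart */
    struct irqaction *next;  /* next handler registered on the same line */
};

/* Call every handler on the line and OR their return values together. */
int run_handlers(int irq, struct irqaction *action)
{
    int retval = 0;

    assert(action != NULL);
    do {
        retval |= action->handler(irq, action->dev_id);
        action = action->next;
    } while (action);
    return retval;
}

/* Hypothetical device that claims the interrupt only if it raised one. */
struct fake_dev {
    int pending;
    int serviced;
};

int fake_handler(int irq, void *dev_id)
{
    struct fake_dev *dev = dev_id;

    (void)irq;
    if (!dev->pending)
        return IRQ_NONE;     /* interrupt was not for this device */
    dev->pending = 0;
    dev->serviced++;
    return IRQ_HANDLED;
}
```

Each handler checks its own device through `dev_id`, so only the device that actually raised the line does any work.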

Slide #21

Interrupt Handlers

Function kernel runs in response to interrupt.

More than one handler can exist per IRQ.

Must run quickly.

Resume execution of interrupted code.

How to deal with high work interrupts?

Ex: network, hard disk

Slide #22

Top and Bottom Halves

Top Half
  The interrupt handler.
  Current interrupt disabled, possibly all disabled.
  Runs in interrupt context, not process context. Can't sleep.
  Acknowledges receipt of interrupt.
  Schedules bottom half to run later.

Bottom Half
  Runs later with interrupts enabled.
  Performs most work required. Can sleep only when run in process context (work queues).
  Ex: copies network data to memory buffers.

Slide #23

Interrupt Context

Not associated with a process.

Cannot sleep: no task to reschedule.

current macro points to interrupted process.

Shares kernel stack of interrupted process.

Be very frugal in stack usage.

Slide #24

Registering a Handler

request_irq() Register an interrupt handler on a given line.

free_irq() Unregister a given interrupt handler.

Disable interrupt line if all handlers unregistered.
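
The bookkeeping behind registering and unregistering can be modeled in user space as follows. This is a simplified sketch with invented names (`struct irq_line`, `line_request`, `line_free`); it assumes that a line can only be shared when every handler on it asked to share, and it stores only the `dev_id` of each handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLERS 4 /* arbitrary limit for the model */

struct irq_line {
    void *dev_ids[MAX_HANDLERS]; /* one slot per registered handler */
    int   count;
    bool  shared;                /* every handler on the line agreed to share */
    bool  enabled;
};

/* Register a handler identified by dev_id. Returns 0 on success, -1 if busy. */
int line_request(struct irq_line *line, void *dev_id, bool share)
{
    if (line->count > 0 && !(line->shared && share))
        return -1;               /* line in use and not shareable */
    if (line->count == MAX_HANDLERS)
        return -1;
    line->dev_ids[line->count++] = dev_id;
    line->shared  = share;
    line->enabled = true;
    return 0;
}

/* Unregister the handler for dev_id; disable the line when none remain. */
void line_free(struct irq_line *line, void *dev_id)
{
    for (int i = 0; i < line->count; i++) {
        if (line->dev_ids[i] == dev_id) {
            line->dev_ids[i] = line->dev_ids[--line->count];
            break;
        }
    }
    if (line->count == 0)
        line->enabled = false;
}
```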

Slide #25

Registering a Handler

    int request_irq(unsigned int irq,
                    irqreturn_t (*handler)(int, void *, struct pt_regs *),
                    unsigned long irqflags,
                    const char *devname,
                    void *dev_id)

irqflags = SA_INTERRUPT | SA_SAMPLE_RANDOM | SA_SHIRQ

Slide #26

Writing an Interrupt Handler

irqreturn_t ih(int irq,void *devid,struct pt_regs *r)

Differentiating between devices
  Pre-2.0: irq
  Current: dev_id

Registers
  Pointer to registers before interrupt occurred.

Return Values
  IRQ_NONE: Interrupt not for handler.
  IRQ_HANDLED: Interrupt handled.

Slide #27

RTC Handler

    irqreturn_t rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
    {
        spin_lock(&rtc_lock);
        rtc_irq_data += 0x100;
        rtc_irq_data &= ~0xff;
        if (rtc_status & RTC_TIMER_ON)
            mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
        spin_unlock(&rtc_lock);

        /* Now do the rest of the actions */
        spin_lock(&rtc_task_lock);
        if (rtc_callback)
            rtc_callback->func(rtc_callback->private_data);
        spin_unlock(&rtc_task_lock);

        wake_up_interruptible(&rtc_wait);
        kill_fasync(&rtc_async_queue, SIGIO, POLL_IN);

        return IRQ_HANDLED;
    }

Slide #28

Interrupt Control

Disable/Enable Local Interrupts

    local_irq_disable();
    /* interrupts are disabled */
    local_irq_enable();

Saving and Restoring IRQ state
Useful when you don't know the prior IRQ state.

    unsigned long flags;
    local_irq_save(flags);
    /* interrupts are disabled */
    local_irq_restore(flags);
    /* interrupts in original state */

Slide #29
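
Why saving the state matters can be shown with a small user-space model. A single boolean stands in for the CPU interrupt flag, and every `model_*` and `critical_*` name is hypothetical; this only mimics the semantics of the kernel macros, it does not touch real interrupts.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true; /* stand-in for the CPU interrupt flag */

void model_irq_disable(void) { irqs_enabled = false; }
void model_irq_enable(void)  { irqs_enabled = true; }
bool model_irqs_enabled(void) { return irqs_enabled; }

/* Remember the current state, then disable. */
unsigned long model_irq_save(void)
{
    unsigned long flags = irqs_enabled;

    irqs_enabled = false;
    return flags;
}

/* Put back whatever state was saved. */
void model_irq_restore(unsigned long flags) { irqs_enabled = (flags != 0); }

/* Plain disable/enable: always leaves interrupts ON afterwards. */
void critical_plain(void)
{
    model_irq_disable();
    /* ... critical section ... */
    model_irq_enable();
}

/* Save/restore: leaves interrupts in whatever state the caller had. */
void critical_saved(void)
{
    unsigned long flags = model_irq_save();
    /* ... critical section ... */
    model_irq_restore(flags);
}
```

If the caller already had interrupts disabled, `critical_plain()` turns them back on behind its back, while `critical_saved()` leaves them off.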

Interrupt Control

Disabling Specific Interrupts
For legacy hardware; avoid for shared IRQ lines.

    disable_irq(irq)
    enable_irq(irq)

What about other processors?
Disable local interrupts + spin lock.
We'll talk about spin locks next time…

Slide #30

Bottom Halves

Perform most work required by interrupt.

Run with interrupts enabled; only work queues run in process context.

Three forms of deferring work
  SoftIRQs
  Tasklets
  Work Queues

Slide #31

SoftIRQs

Statically allocated at compile time.

Only 32 softIRQs can exist (only 6 currently used).

    struct softirq_action {
        void (*action)(struct softirq_action *);
        void *data;
    };
    static struct softirq_action softirq_vec[32];

Tasklets built on SoftIRQs.
All tasklets share the HI and TASKLET SoftIRQs.
Dynamically allocated.

Slide #32

SoftIRQ Handlers

Prototype
    void softirq_handler(struct softirq_action *)

Calling
    my_softirq->action(my_softirq);

Pre-emption
  SoftIRQs don't pre-empt other softIRQs.
  Interrupt handlers can pre-empt softIRQs.
  Another softIRQ can run on other CPUs.

Slide #33

Executing SoftIRQs

Interrupt handler marks softIRQ.
  Called raising the softirq.

SoftIRQs checked for execution:
  In return from hardware interrupt code.
  In ksoftirqd kernel thread.
  In any code that explicitly checks for softIRQs.

do_softirq()
  Loops over all softIRQs.
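
Raising and running softIRQs can be modeled with a pending bitmask, one bit per softIRQ number. The sketch below is user-space code with invented `model_*` and `sample_*` names. It assumes handlers run in ascending number order and that anything raised while the loop runs stays pending for a later pass.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SOFTIRQS 32

typedef void (*softirq_fn)(void);

static softirq_fn softirq_table[NR_SOFTIRQS]; /* one slot per softIRQ number */
static uint32_t   softirq_pending;            /* bit n set = softIRQ n raised */

int run_log[NR_SOFTIRQS]; /* order in which the sample handlers ran */
int run_count;

/* Install the handler for softIRQ number nr. */
void model_open_softirq(unsigned nr, softirq_fn fn)
{
    assert(nr < NR_SOFTIRQS);
    softirq_table[nr] = fn;
}

/* Mark softIRQ nr as needing to run ("raising" it). */
void model_raise_softirq(unsigned nr)
{
    assert(nr < NR_SOFTIRQS);
    softirq_pending |= (uint32_t)1 << nr;
}

/* Run every raised softIRQ once, lowest number first. Returns how many ran. */
int model_do_softirq(void)
{
    uint32_t pending = softirq_pending; /* snapshot of what is raised now */
    int ran = 0;

    softirq_pending = 0; /* re-raises during the loop wait for the next pass */
    for (unsigned nr = 0; nr < NR_SOFTIRQS; nr++) {
        if ((pending & ((uint32_t)1 << nr)) && softirq_table[nr]) {
            softirq_table[nr]();
            ran++;
        }
    }
    return ran;
}

/* Sample handlers that just record their softIRQ number. */
void sample_timer_action(void)
{
    if (run_count < NR_SOFTIRQS)
        run_log[run_count++] = 1;
}

void sample_net_rx_action(void)
{
    if (run_count < NR_SOFTIRQS)
        run_log[run_count++] = 3;
}
```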

Slide #34

Current SoftIRQs

SoftIRQ    Priority   Description
HI         0          High priority tasklets.
TIMER      1          Timer bottom half.
NET_TX     2          Send network packets.
NET_RX     3          Receive network packets.
SCSI       4          SCSI bottom half.
TASKLET    5          Tasklets.

Slide #35

Tasklets

• Implemented as softIRQs.
  – Linked list of tasklet_struct objects.
• Two priorities of tasklets:
  – HI: tasklet_hi_schedule()
  – TASKLET: tasklet_schedule()
• Scheduled tasklets run via do_softirq()
  – HI action: tasklet_hi_action()
  – TASKLET action: tasklet_action()

Slide #36

ksoftirqd

SoftIRQs may occur at high frequencies.

SoftIRQs may re-raise themselves.

Kernel will not handle re-raised softIRQs immediately in do_softirq() .

Kernel thread ksoftirqd solves problem.

One thread per processor.

Runs at lowest priority (nice +19).

Slide #37

Work Queues

Defer work into a kernel thread.

Execute in process context.

One thread per processor: events/n .

Processes can create own threads if needed.

    struct workqueue_struct {
        struct cpu_workqueue_struct cpu_wq[NR_CPUS];
        const char *name;
        struct list_head list; /* Empty if single thread */
    };

Slide #38

Work Queue Data Structures

[Diagram: relationship between the work queue structures]
  worker thread
  cpu_workqueue_struct: 1 per CPU
  workqueue_struct: 1 per thread type
  work_struct: 1 per deferrable function

Slide #39

Worker Thread

Each thread runs worker_thread()

1. Marks self as sleeping.

2. Adds self to wait queue.

3. If linked list of work empty, schedule() .

4. Else, marks self as running, removes from queue.

5. Calls run_workqueue() to perform work.

Slide #40

run_workqueue()

1. Loops through list of work_structs

       struct work_struct {
           unsigned long pending;
           struct list_head entry;
           void (*func)(void *);
           void *data;
           void *wq_data;
           struct timer_list timer;
       };

2. Retrieves function, func, and arg, data
3. Removes entry from list, clears pending
4. Invokes function

Slide #41
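
The run_workqueue() steps can be traced with a user-space model. This sketch swaps the kernel's list_head for a plain singly linked list and leaves out the worker thread, locking, and timers. `struct work`, `struct work_list`, and the `model_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct work {
    int pending;            /* set while the item sits on the list */
    void (*func)(void *);   /* deferred function */
    void *data;             /* its argument */
    struct work *next;
};

struct work_list {
    struct work *head;
    struct work *tail;
};

/* Append an item; returns 0 (and does nothing) if it is already queued. */
int model_queue_work(struct work_list *q, struct work *w)
{
    if (w->pending)
        return 0;
    w->pending = 1;
    w->next = NULL;
    if (q->tail)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
    return 1;
}

/* Drain the list in FIFO order. Returns the number of functions invoked. */
int model_run_workqueue(struct work_list *q)
{
    int ran = 0;

    while (q->head) {
        struct work *w = q->head;
        void (*func)(void *) = w->func; /* step 2: retrieve func and data */
        void *data = w->data;

        q->head = w->next;              /* step 3: remove entry from list */
        if (!q->head)
            q->tail = NULL;
        w->pending = 0;                 /*         and clear pending */
        func(data);                     /* step 4: invoke the function */
        ran++;
    }
    return ran;
}

/* Sample deferred function: increments the int that data points to. */
void add_one(void *data) { ++*(int *)data; }
```

Clearing `pending` before the call means a function may queue its own work item again while it runs.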

Which Bottom Half to Use?

1. If needs to sleep, use work queue.
2. If doesn't need to sleep, use tasklet.
3. What about serialization needs?

Bottom Half    Context     Serialization
Softirq        Interrupt   None
Tasklet        Interrupt   Against same tasklet
Work queues    Process     None

Slide #42

Timer Interrupt

Executed HZ times a second.

    #define HZ 1000

Called the tick rate.

Time between two interrupts is a tick.

Driven by Programmable Interval Timer (PIT).

Interrupt handler responsibilities Updating uptime, system time, kernel stats.

Rescheduling if current has exhausted time slice.

Balancing scheduler runqueues.

Running dynamic timers.

Slide #43

Jiffies

Jiffies = number of ticks since boot.

    extern unsigned long volatile jiffies;

Incremented each timer interrupt.

Uptime = jiffies/HZ seconds.

Convert for user space: jiffies_to_clock_t()

Comparing jiffies, while avoiding overflow.

    time_after(a, b):     a > b
    time_before(a, b):    a < b
    time_after_eq(a, b):  a >= b
    time_before_eq(a, b): a <= b

Slide #44
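
A plain `a > b` gives the wrong answer once the jiffies counter wraps around. One common way to get a wrap-safe comparison is to look at the sign of the unsigned difference, sketched below as ordinary functions. The `model_` names are hypothetical, and the sketch assumes a two's complement `long` and that the two times are less than half the counter range apart.

```c
#include <assert.h>
#include <limits.h>

/* True if time a is after time b, even if the counter wrapped in between. */
int model_time_after(unsigned long a, unsigned long b)
{
    /* b - a wraps modulo 2^N; a "negative" difference means a is later. */
    return (long)(b - a) < 0;
}

int model_time_before(unsigned long a, unsigned long b)
{
    return model_time_after(b, a);
}

int model_time_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

int model_time_before_eq(unsigned long a, unsigned long b)
{
    return model_time_after_eq(b, a);
}
```

For example, with `b = ULONG_MAX - 5` and `a = b + 10` (which wraps to 4), `a > b` is false but `model_time_after(a, b)` is true.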

Timer Interrupt Handler

1. Increments jiffies .

2. Update resource usages (sys + user time).

3. Run dynamic timers.

4. Execute scheduler_tick() .

5. Update wall time.

6. Calculate load average.
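
A few of these steps can be put into a toy user-space model: incrementing jiffies, charging the tick to user or system time, and flagging a reschedule when the time slice runs out. Dynamic timers, wall time, and load average are left out. The struct, the function names, and `MODEL_HZ` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_HZ 1000 /* ticks per second, mirroring HZ above */

struct tick_state {
    unsigned long jiffies;    /* ticks since boot */
    unsigned long user_ticks; /* ticks charged to user mode */
    unsigned long sys_ticks;  /* ticks charged to kernel mode */
    int  timeslice;           /* ticks left for the current task */
    bool need_resched;        /* set when the time slice is used up */
};

/* One timer interrupt. */
void model_timer_tick(struct tick_state *s, bool in_user_mode)
{
    s->jiffies++;                                 /* step 1 */
    if (in_user_mode)                             /* step 2 */
        s->user_ticks++;
    else
        s->sys_ticks++;
    if (s->timeslice > 0 && --s->timeslice == 0)  /* scheduler bookkeeping */
        s->need_resched = true;
}

/* Uptime in whole seconds = jiffies / HZ. */
unsigned long model_uptime_seconds(const struct tick_state *s)
{
    return s->jiffies / MODEL_HZ;
}
```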

Slide #45