CPUs

CPU power consumption

- Most modern CPUs are designed with power consumption in mind to some degree
- Power vs. energy:
  - Power: the rate at which energy is consumed; $P = I_{avg} \times V_{cc}$ (watts), 1 W = 1 J/s
  - Energy: 1 joule = 1 W × 1 s
- Heat-limited application (HLA): heat depends on power consumption
  - 600 MHz Alpha: 109.0 W at $V_{dd}$ = 2.30 V
- Energy-limited application (ELA): battery life depends on energy consumption
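
A minimal sketch of the power/energy distinction above, applying P = I_avg × V_cc and E = P × t. The function names and the battery figures are illustrative assumptions; only the Alpha's 109.0 W and 2.30 V come from the slide.

```python
def power_watts(i_avg_amps, v_cc_volts):
    """Average power P = I_avg * V_cc, in watts (1 W = 1 J/s)."""
    return i_avg_amps * v_cc_volts


def energy_joules(power_w, seconds):
    """Energy consumed at constant power: 1 J = 1 W * 1 s."""
    return power_w * seconds


def average_current_amps(power_w, v_volts):
    """Invert P = I * V to get the average supply current."""
    return power_w / v_volts


def battery_life_hours(capacity_wh, power_w):
    """Hours a battery of the given watt-hour capacity lasts at constant power."""
    return capacity_wh / power_w


# Heat-limited view: the slide's 600 MHz Alpha (109.0 W at 2.30 V).
alpha_current = average_current_amps(109.0, 2.30)  # roughly 47 A

# Energy-limited view: hypothetical 10 Wh battery feeding a 0.5 W load.
hours = battery_life_hours(10.0, 0.5)
```

The heat-limited case cares about the instantaneous figure (watts to be dissipated); the energy-limited case cares about the product of power and time.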

Power reduction techniques

Figure: power breakdown in a high-performance CPU, by component: datapath, memory, control, I/O, clock

Power reduction techniques

- Circuit level
  - Voltage scaling
  - Clock gating
- System level
  - Power-saving modes
  - Cache organization
- Software based
  - Instruction-level power analysis

CMOS power consumption

- Voltage drops: power consumption proportional to $V^2$: $P_s = C_L V_{dd}^2 f_s$
- Toggling (switching): more activity means more power
- Leakage: basic circuit characteristics; can be eliminated by disconnecting power
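
To make the quadratic dependence on supply voltage concrete, here is a small sketch of the switching-power formula P_s = C_L × V_dd² × f_s. The capacitance, voltage, and frequency values are made-up illustrative numbers, not measurements from any CPU.

```python
def switching_power(c_load_farads, v_dd_volts, f_switch_hz):
    """Switching power P_s = C_L * V_dd^2 * f_s, in watts."""
    return c_load_farads * v_dd_volts ** 2 * f_switch_hz


# Hypothetical operating point: 1 nF switched capacitance, 3.3 V, 100 MHz.
baseline = switching_power(1e-9, 3.3, 100e6)

# Halving the supply voltage at the same frequency.
half_voltage = switching_power(1e-9, 1.65, 100e6)

# Halving the clock frequency at the same voltage.
half_frequency = switching_power(1e-9, 3.3, 50e6)

voltage_ratio = half_voltage / baseline      # about 0.25
frequency_ratio = half_frequency / baseline  # about 0.5
```

Under this model, halving the voltage cuts switching power to about a quarter, while halving the frequency only cuts it to about half.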

CPU power-saving strategies

- Reduce power supply voltage
- Run at lower clock frequency
- Disable function units with control signals when not in use
- Disconnect parts from power supply when not in use

Power management styles

- Static power management: does not depend on CPU activity
  - Example: user-activated power-down mode
- Dynamic power management: based on CPU activity
  - Example: turning off function units

Application: PowerPC 603 energy features

- Provides doze, nap, sleep modes
- Dynamic power management features:
  - Uses static logic
  - Can shut down unused execution units
  - Cache organized into subarrays to minimize amount of active circuitry

PowerPC 603 activity

Percentage of time units are idle for SPEC integer/floating-point:
unit              SPECint92   SPECfp92
D cache           29%         28%
I cache           29%         17%
load/store        35%         17%
fixed-point       38%         76%
floating-point    99%         30%
system register   89%         97%

Power-down costs

- Going into a power-down mode costs:
  - time
  - energy
- Must determine if going into mode is worthwhile
- Can model CPU power states with a power state machine
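
One common way to decide whether a power-down is worthwhile is a break-even time: the shortest idle period for which powering down uses no more energy than staying on. The sketch below uses a simplified model that is an assumption here, not something stated on the slide: the round trip into and out of the low-power mode costs a fixed energy and a fixed time, and the rest of the idle period is spent at the low-power level.

```python
def break_even_time(p_on, p_low, e_transition, t_transition):
    """Shortest idle period (s) for which powering down saves energy.

    p_on:         power if the CPU simply stays on while idle (W)
    p_low:        power in the power-down mode (W)
    e_transition: total energy to enter and leave the mode (J)
    t_transition: total time to enter and leave the mode (s)

    Staying on costs p_on * T. Powering down costs
    e_transition + p_low * (T - t_transition). Setting them equal gives T.
    """
    if p_on <= p_low:
        raise ValueError("power-down mode must draw less power than staying on")
    t_equal = (e_transition - p_low * t_transition) / (p_on - p_low)
    # The idle period can never be shorter than the transition itself.
    return max(t_equal, t_transition)


def worth_powering_down(idle_time, p_on, p_low, e_transition, t_transition):
    """True if the expected idle period exceeds the break-even time."""
    return idle_time > break_even_time(p_on, p_low, e_transition, t_transition)


# Hypothetical numbers: 1 W idle, 0 W powered down, 2 J and 0.5 s round trip.
t_be = break_even_time(1.0, 0.0, 2.0, 0.5)
```

With these made-up numbers the break-even time is 2 s, so only idle periods longer than that justify the transition.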

Power vs. time running a real application

 Pentium processor

Application: StrongARM SA-1100 power saving

- Processor takes two supplies:
  - VDD is main 3.3 V supply
  - VDDX is 1.5 V
- Three power modes:
  - Run: normal operation
  - Idle: stops CPU clock, with logic still powered
  - Sleep: shuts off most of chip activity; 3 steps, each about 30 µs; wakeup takes > 10 ms

SA-1110 Power and Clock supply sources

SA-1100 power state machine

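Figure: power state machine with three states
- run: $P_{run}$ = 400 mW
- idle: $P_{idle}$ = 50 mW
- sleep: $P_{sleep}$ = 0.16 mW
- Transition times labeled in the diagram: 10 µs, 10 µs, 90 µs, 90 µs, 160 ms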

Software Power Minimization

- SW constitutes a major component of systems
- Instruction-level power analysis
  - Instruction base costs
  - Effect of circuit state
  - Other inter-instruction effects
    - Pipeline stalls
    - Cache misses

- V. Tiwari et al., "Reducing power in high-performance microprocessors"
- V. Tiwari et al., "Instruction level power analysis and optimization of software"

Instruction base costs

- 486DX2: 300-500 mA
- SPARClite: 200-300 mA
- DSP: 20-60 mA

Effect of circuit state

- Case of the 486DX2: circuit-state overhead of 5-30 mA, while most instructions have base costs in the range of 300-420 mA
- Case of a smaller, more basic processor (DSP)

Other Inter-instruction effects

- Case of the 486DX2
  - Pipeline stall cycle: 250 mA
  - Cache miss cycle: 216 mA

Overall Instruction level power model

$$E_p = \sum_i (B_i \times N_i) + \sum_{i,j} (O_{i,j} \times N_{i,j}) + \sum_k E_k$$

- Impact of internal power management
- OR, SHIFT, ADD, or MULTIPLY do not show much cost variation
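
The model above can be evaluated directly from an instruction trace. In this sketch, B_i is read as the base cost of instruction i, N_i as how often it executes, O_{i,j} as the circuit-state overhead when i is followed by j, N_{i,j} as how often that pair occurs, and E_k as the energy of other effects such as stalls and cache misses. The function name, the use of ordered pairs, and all numeric costs are illustrative assumptions.

```python
from collections import Counter


def program_energy(trace, base_cost, pair_overhead, other_effects=()):
    """Evaluate E_p = sum(B_i*N_i) + sum(O_ij*N_ij) + sum(E_k).

    trace:         list of opcode names in execution order
    base_cost:     dict opcode -> base energy cost B_i
    pair_overhead: dict (opcode_i, opcode_j) -> overhead O_ij
                   (pairs missing from the dict are treated as zero)
    other_effects: iterable of extra energy terms E_k
    """
    n_i = Counter(trace)                    # executions per instruction
    n_ij = Counter(zip(trace, trace[1:]))   # occurrences of adjacent pairs
    base = sum(base_cost[op] * count for op, count in n_i.items())
    overhead = sum(pair_overhead.get(pair, 0.0) * count
                   for pair, count in n_ij.items())
    return base + overhead + sum(other_effects)


# Hypothetical costs in arbitrary energy units.
energy = program_energy(
    trace=["add", "mul", "add"],
    base_cost={"add": 1.0, "mul": 2.0},
    pair_overhead={("add", "mul"): 0.5, ("mul", "add"): 0.25},
    other_effects=[3.0],  # e.g. one cache miss
)
```

Here the base term contributes 4.0, the pair overheads 0.75, and the extra effect 3.0, for a total of 7.75 units.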

Guarded evaluation

- Dynamically turn off power to unused modules
- Examples: low-power Pentium, PowerPC 603, etc.

Software Energy Optimization Techniques

- Reducing memory accesses
  - 486DX:
    - Register operand: 300 mA
    - Memory read: 400 mA
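
To turn the current figures above into energy, one can multiply current by supply voltage and by execution time. The sketch below does that arithmetic. The slide gives only the currents; the 3.3 V supply, 40 MHz clock, and one-cycle duration are assumptions chosen for illustration.

```python
def instruction_energy_joules(current_amps, v_cc_volts, cycles, clock_hz):
    """Energy of one instruction: E = I * V_cc * (cycles / clock frequency)."""
    return current_amps * v_cc_volts * cycles / clock_hz


V_CC = 3.3        # ASSUMED supply voltage (V)
CLOCK_HZ = 40e6   # ASSUMED clock frequency (Hz)

# Currents from the slide; one cycle each is an assumption.
register_op = instruction_energy_joules(0.300, V_CC, 1, CLOCK_HZ)
memory_read = instruction_energy_joules(0.400, V_CC, 1, CLOCK_HZ)

ratio = memory_read / register_op  # about 1.33
```

If a memory read also takes more cycles than a register access, the energy gap grows beyond this per-cycle ratio, which is why keeping operands in registers is an energy optimization and not just a speed one.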

Software Energy Optimization Techniques, cont’d

- Energy-cost-driven code generation
  - Traditional cost criteria: either the size or the running time
- Instruction reordering for low power
  - Limited impact in the 486DX and the SPARClite
  - Beneficial for the DSP (Why?)
- Processor-specific optimizations
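
As a rough illustration of instruction reordering for low power, the sketch below searches the dependency-respecting orderings of a short instruction sequence for the one with the lowest total circuit-state overhead between adjacent instructions. The slides do not specify a reordering algorithm; brute-force enumeration is used here only because it is easy to verify, and all names and overhead values are hypothetical.

```python
from itertools import permutations


def order_overhead(order, opcode, overhead):
    """Sum the circuit-state overhead over adjacent instructions in `order`."""
    total = 0.0
    for a, b in zip(order, order[1:]):
        total += overhead.get((opcode[a], opcode[b]), 0.0)
    return total


def respects(order, deps):
    """True if every (before, after) dependency holds in `order`."""
    position = {name: k for k, name in enumerate(order)}
    return all(position[a] < position[b] for a, b in deps)


def lowest_overhead_order(names, opcode, deps, overhead):
    """Brute-force the legal ordering with the least adjacent-pair overhead.

    Only practical for very short sequences, since it tries every permutation.
    """
    legal = (p for p in permutations(names) if respects(p, deps))
    return min(legal, key=lambda p: order_overhead(p, opcode, overhead))


# Hypothetical three-instruction block: i2 depends on the result of i1.
names = ["i1", "i2", "i3"]
opcode = {"i1": "mul", "i2": "add", "i3": "mul"}
deps = {("i1", "i2")}
overhead = {("mul", "add"): 1.0, ("add", "mul"): 1.0}  # switching units costs 1

best = lowest_overhead_order(names, opcode, deps, overhead)
best_cost = order_overhead(best, opcode, overhead)
original_cost = order_overhead(names, opcode, overhead)
```

In this toy case, grouping the two multiplies together removes one unit switch, cutting the overhead from 2.0 to 1.0 while keeping i1 ahead of i2.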