CPU power consumption
KAIST Department of Computer Science
맹승렬 (Seungryoul Maeng)
[email protected]
CPU power consumption
 Most modern CPUs are designed with power consumption in mind to
some degree
 Power vs. energy:
• Power
– $P = I_{avg} \times V_{cc}$ (watts); 1 W = 1 J/s
– The rate at which energy is consumed
• Energy
– 1 J = 1 W × 1 s
 Heat-Limited application (HLA):
• cost of power supply
• cost of cooling
– heat depends on power consumption
– 600 MHz Alpha: 109.0 W @ 2.30V Vdd
 Energy-Limited application (ELA):
• battery life depends on energy consumption (mobile system)
• cooling
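As a rough illustration of the power/energy distinction, the sketch below applies P = I_avg × V to the Alpha figure quoted on this slide (109.0 W at 2.30 V). The function names and the 60-second run time are assumptions chosen only for the example.

```python
def average_current(power_w: float, supply_v: float) -> float:
    """Average supply current in amperes, from P = I_avg * V."""
    return power_w / supply_v


def energy_joules(power_w: float, seconds: float) -> float:
    """Energy in joules consumed at constant power over a time span."""
    return power_w * seconds


# 600 MHz Alpha figure from the slide: 109.0 W at 2.30 V.
alpha_current_a = average_current(109.0, 2.30)
# Hypothetical 60-second run at that power level.
alpha_energy_j = energy_joules(109.0, 60.0)

print(f"{alpha_current_a:.1f} A, {alpha_energy_j:.0f} J")  # 47.4 A, 6540 J
```

Power sets the cooling and supply requirements at any instant; energy is what drains a battery over the length of the run.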
2004 Junior College Faculty Training ([email protected])
Spring 2006, SEP561 Embedded Computing
Power Density
Notebook Power Breakdown
 IBM ThinkPad R40
• 1.3 GHz Pentium M
Power consumption for Intel CPUs
CPU Power Breakdown by unit
Breakdown of Power for Modern High
Performance Processor
Pentium Pro Breakdown of Power
Power reduction techniques
 Transistor Level
• static power consumption
• dynamic power consumption
• leakage current
 Circuit Level
• Voltage scaling
• Clock gating
 System Level
• Power saving modes
• Cache organization
 Software Based
• Instruction level power analysis
CMOS power consumption
 Voltage drops: power consumption proportional to
$V^2$:
• $P_s = C_L V_{dd}^2 f_s$
 Toggling (switching): more activity means more
power
 Leakage: basic circuit characteristics; can be
eliminated by disconnecting power (below 0.13
micron)
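A minimal sketch of the switching-power relation on this slide, P_s = C_L × V_dd² × f_s. The load capacitance, voltage, and frequency values are hypothetical; only the ratios matter here.

```python
def switching_power(c_load_f: float, vdd_v: float, freq_hz: float) -> float:
    """Dynamic (switching) power in watts: C_L * Vdd^2 * f_s."""
    return c_load_f * vdd_v ** 2 * freq_hz


# Hypothetical operating point: 1 nF switched capacitance, 3.3 V, 100 MHz.
baseline = switching_power(1e-9, 3.3, 100e6)
half_voltage = switching_power(1e-9, 1.65, 100e6)   # halve Vdd only
half_frequency = switching_power(1e-9, 3.3, 50e6)   # halve f_s only

voltage_ratio = half_voltage / baseline       # quadratic effect
frequency_ratio = half_frequency / baseline   # linear effect
print(f"half voltage: {voltage_ratio:.2f}x, half frequency: {frequency_ratio:.2f}x")
```

Halving the supply voltage cuts switching power to a quarter, while halving the clock only cuts it in half, which is why voltage scaling is listed first among the circuit-level techniques.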
CPU power-saving strategies
 Reduce power supply voltage
• noise margin
• leakage current
 Run at lower clock frequency
• Performance
 Disable function units with control signals when
not in use
 Disconnect parts from power supply when not in
use
Application: PowerPC 603 energy
features
 Provides doze, nap, sleep modes
 Dynamic power management features:
• Uses static logic
• Can shut down unused execution units
• Cache organized into subarrays to minimize amount of
active circuitry
PowerPC 603 activity
 Percentage of time units are idle for SPEC
integer/floating-point:
unit              Specint92   Specfp92
D cache           29%         28%
I cache           29%         17%
load/store        35%         17%
fixed-point       38%         76%
floating-point    99%         30%
system register   89%         97%
Power-down costs
 Going into a power-down mode costs:
• time
• energy
 Must determine if going into mode is worthwhile
 Can model CPU power states with power state
machine
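One way to decide whether a power-down is worthwhile is to compare the energy of staying awake through an idle period with the energy of sleeping through it, including the transition cost. The sketch below assumes a simple model: a single lumped transition energy `e_tr` and transition time `t_tr` covering both entering and leaving the mode. All numeric values are hypothetical.

```python
def energy_if_awake(p_on_w: float, idle_s: float) -> float:
    """Energy spent staying powered for the whole idle period."""
    return p_on_w * idle_s


def energy_if_sleeping(p_sleep_w: float, e_tr_j: float, t_tr_s: float,
                       idle_s: float) -> float:
    """Transition energy plus sleep power for the rest of the idle period."""
    return e_tr_j + p_sleep_w * max(idle_s - t_tr_s, 0.0)


def break_even_time(p_on_w: float, p_sleep_w: float,
                    e_tr_j: float, t_tr_s: float) -> float:
    """Shortest idle period for which sleeping uses no more energy."""
    t_be = (e_tr_j - p_sleep_w * t_tr_s) / (p_on_w - p_sleep_w)
    # The idle period must at least cover the transition itself.
    return max(t_be, t_tr_s)


# Hypothetical numbers: 50 mW when idle, 0.16 mW asleep,
# 64 mJ and 160 ms to go down and come back up.
P_ON, P_SLEEP, E_TR, T_TR = 0.050, 0.00016, 0.064, 0.160
t_be = break_even_time(P_ON, P_SLEEP, E_TR, T_TR)
print(f"break-even idle time: {t_be:.2f} s")
```

Idle periods shorter than the break-even time cost more energy asleep than awake, so the mode should only be entered when the expected idle time exceeds it.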
Power vs. time running a real
application
 Pentium processor
Application: StrongARM SA-1100 power
saving
 Processor takes two supplies:
• VDD is main 3.3V supply
• VDDX is 1.5V
 Three power modes:
• Run: normal operation
• Idle: stops CPU clock, with logic still powered
• Sleep: shuts off most of chip activity
SA-1110 Power and Clock supply
sources
SA-1100 power state machine
States and power:
• run: Prun = 400 mW
• idle: Pidle = 50 mW
• sleep: Psleep = 0.16 mW
Transition times labeled on the diagram: 10 ms, 160 ms, 90 ms, 10 ms, 90 ms
Intel PXA27x Power Management
 Power Modes
• Turbo mode
• Run mode – normal full-function mode
• Idle mode – stopping the CPU clock
• Deep-idle mode – back to 13-MHz core frequency
• Standby mode
• Sleep mode – keeps I/O powered
• Deep-sleep mode – I/O powered down
Summary of Module Power and Clocks
DFM (Dynamic Frequency Management)
and DVM (Dynamic Voltage Management)
DFM and DVM (2)
 DFM
• The core clock can be configured dynamically by SW
 DVM
• The voltage manager provides voltage management
through use of an I2C unit
 Coupling
• A frequency change can be coupled with a voltage change
• A voltage change can be coupled with a frequency change
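When frequency and voltage changes are coupled in software, the usual ordering rule is to raise the voltage before raising the frequency, and to lower the frequency before lowering the voltage, so the core never runs faster than its current voltage supports. The sketch below illustrates that rule only. The `Core` class, its methods, and the operating points are hypothetical and are not the PXA27x register interface; the slides do not say how much of this sequencing the hardware handles itself.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical core model that records the order of changes."""
    freq_mhz: int
    volt_mv: int
    log: list = field(default_factory=list)

    def set_voltage(self, volt_mv: int) -> None:
        self.volt_mv = volt_mv
        self.log.append(("voltage", volt_mv))

    def set_frequency(self, freq_mhz: int) -> None:
        self.freq_mhz = freq_mhz
        self.log.append(("frequency", freq_mhz))


def change_operating_point(core: Core, freq_mhz: int, volt_mv: int) -> None:
    """Apply a coupled frequency/voltage change in a safe order."""
    if freq_mhz > core.freq_mhz:
        # Speeding up: the higher voltage must be in place first.
        core.set_voltage(volt_mv)
        core.set_frequency(freq_mhz)
    else:
        # Slowing down: drop the clock first, then the voltage.
        core.set_frequency(freq_mhz)
        core.set_voltage(volt_mv)


core = Core(freq_mhz=104, volt_mv=900)      # hypothetical low point
change_operating_point(core, 416, 1350)     # hypothetical high point
change_operating_point(core, 104, 900)
print(core.log)
```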
DFM and DVM (3)
 Programmable Operating Frequencies
Workload Characterization for Intel
DFM and DVM
• CPU bound
• Memory bound
• I/O bound
• CPU and Memory bound
• CPU bound job
Workload Characterization for Intel
DFM and DVM
Memory bound job
CPU and Memory bound job
Windows Media Video
Power management styles
 Static power management: does not depend on
CPU activity
• Example: user-activated power-down mode
 Dynamic power management: based on CPU
activity
• Example: turning off unused function units
Power optimization
KAIST Department of Computer Science
맹승렬 (Seungryoul Maeng)
[email protected]
Power optimization
 Power management: determining how system
resources are scheduled/used to control power
consumption
• Static : does not depend on CPU activity
– Example: user-activated power-down mode
• Dynamic : based on CPU activity
– Example: turning off unused function units
 OS can manage for power just as it manages for
time.
 OS reduces power by shutting down units.
• May have partial shutdown modes.
Power management and performance
 Power management and performance are often at
odds.
 Entering power-down mode consumes
• energy
• time
 Leaving power-down mode consumes
• energy
• time
Simple power management policies
 Request-driven: power up once request is received.
Adds delay to response.
 Predictive shutdown: try to predict how long you
have before next request.
• May start up in advance of request in anticipation of a new
request.
• If you predict wrong, you will incur additional delay while
starting up.
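Predictive shutdown needs some estimate of the next idle period. The sketch below uses an exponential average of past idle periods, which is one common heuristic and not necessarily the predictor this lecture has in mind. The class name, the smoothing factor, and the 1.2-second break-even threshold are all assumptions.

```python
class IdlePredictor:
    """Predicts the next idle period as an exponential average of past ones."""

    def __init__(self, alpha: float = 0.5, initial_s: float = 0.0):
        self.alpha = alpha            # weight given to the newest observation
        self.predicted_s = initial_s  # current prediction in seconds

    def observe(self, idle_s: float) -> None:
        """Fold one measured idle period into the prediction."""
        self.predicted_s = (self.alpha * idle_s
                            + (1.0 - self.alpha) * self.predicted_s)

    def should_shut_down(self, break_even_s: float) -> bool:
        """Shut down only if the predicted idle time pays for the transition."""
        return self.predicted_s > break_even_s


predictor = IdlePredictor(alpha=0.5)
decisions = []
for measured_idle_s in (2.0, 2.0, 2.0):   # hypothetical idle periods
    predictor.observe(measured_idle_s)
    decisions.append(predictor.should_shut_down(break_even_s=1.2))
print(decisions)  # [False, True, True]
```

The first decision is wrong in hindsight: the predictor had no history, so it kept the system on through a long idle period. A misprediction in the other direction would add start-up delay to the next request.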
Probabilistic shutdown
 Assume service requests are probabilistic.
 Optimize expected values:
• power consumption
• response time
 Simple probabilistic: shut down after time Ton, turn
back on after waiting for Toff.
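The simple Ton/Toff policy can be explored with a small simulation. The sketch below reads the policy as: shut down after Ton idle ticks, then power back up Toff ticks later and serve whatever arrived in the meantime. Requests arrive independently with a fixed probability per tick. The tick granularity, arrival probability, power values, and the omission of transition costs are all assumptions made for illustration.

```python
import random


def simulate(t_on, t_off, p_request, ticks,
             p_on_mw=400.0, p_off_mw=0.16, seed=0):
    """Return (average power in mW, mean response delay in ticks)."""
    rng = random.Random(seed)
    powered = True
    timer = 0          # idle ticks while on, elapsed ticks while off
    pending = []       # arrival ticks of requests waiting for power-up
    delays = []
    energy = 0.0
    for now in range(ticks):
        request = rng.random() < p_request
        if powered:
            energy += p_on_mw
            if request:
                delays.append(0)     # served immediately
                timer = 0
            else:
                timer += 1
                if timer >= t_on:    # idle long enough: shut down
                    powered, timer = False, 0
        else:
            energy += p_off_mw
            if request:
                pending.append(now)
            timer += 1
            if timer >= t_off:       # waited long enough: power up
                delays.extend(now + 1 - arrival for arrival in pending)
                pending.clear()
                powered, timer = True, 0
    mean_delay = sum(delays) / len(delays) if delays else 0.0
    return energy / ticks, mean_delay


always_on = simulate(t_on=10**9, t_off=100, p_request=0.01, ticks=100_000)
with_shutdown = simulate(t_on=50, t_off=100, p_request=0.01, ticks=100_000)
print(always_on, with_shutdown)
```

Sweeping Ton and Toff over such a model gives the trade-off between expected power and expected response time that the policy is meant to optimize.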
Advanced Configuration and Power
Interface
 Conceived by Intel, Microsoft, and Toshiba (the
promoters)
 An “interface” specification
• ACPI/OSPM replaces APM, MPS, and PnP BIOS Spec
• APM : advanced power management (1992)
– to reduce power consumption below the 60 to 80 watts
requirement of DOS-based systems
– BIOS-driven power management
 Allow OS-directed Power Management (OSPM)
• power management specification
• system design specification
• application program power management specification
ACPI, cont’d
 Defines
• Hardware registers - implemented in chipset silicon
• BIOS interfaces
– Configuration tables
– Interpreted executable function interface (Control
Methods)
– Motherboard device enumeration and configuration
• System and device power states
• ACPI Thermal Model
Advanced Configuration and Power
Interface
 ACPI: open standard for power management
services.
[Figure: ACPI software stack, top to bottom: applications; OS kernel, device drivers, and power management; ACPI BIOS; hardware platform]
ACPI Global States and Transitions
[Figure: ACPI global state diagram. States: G3 (Mech Off), Legacy, G0 (S0) Working, G1 Sleeping (S1 to S4), G2 (S5) Soft Off. In the working state, devices (modem, HDD, CDROM) are shown in D0 to D3 and the CPU in C0 to C3. Transition labels: Power Failure; Legacy Boot (SCI_EN=0); ACPI Boot (SCI_EN=1); ACPI_ENABLE (SCI_EN=1); ACPI_DISABLE (SCI_EN=0); SLP_TYPx=(S1-S4) and SLP_EN; Wake Event; S4BIOS_REQ and S4BIOS_F (BIOS Routine); SLP_TYPx=S5 and SLP_EN, or PWRBTN_OR]
ACPI global power states
 G3: mechanical off
 G2(S5): soft off
 G1: sleeping state
• S1: low wake-up latency with no loss of context
• S2: low latency with loss of CPU/cache state
• S3: low latency with loss of all state except memory
• S4: lowest-power state with all devices off
 G0: working state
Global Power States
Device and Processor Power States
Software Power Minimization
 SW constitutes a major component of systems
 Instruction level power analysis
• Instruction base costs
• Effect of circuit state
• Other inter-instruction effects
– Pipeline stalls
– Cache misses
“Reducing power in high-performance microprocessors,” V. Tiwari et al.
“Instruction level power analysis and optimization of software,” V. Tiwari et al.
Instruction base costs
 486DX2: 300-500 mA
 SPARClite: 200-300 mA
 DSP : 20-60 mA
Effect of circuit state
 Case of the 486DX2
• Circuit-state overhead of 5-30 mA, while most instructions have base costs in the range of 300-420 mA
 Case of a smaller, more basic processor (DSP)
Other Inter-instruction effects
 Case of the 486DX2
• Pipeline stall cycle
– 250 mA
• Cache miss cycle
– 216 mA
Overall Instruction level power model
$E_p = \sum_i (B_i \times N_i) + \sum_{i,j} (O_{i,j} \times N_{i,j}) + \sum_k E_k$
 Impact of internal power management
• OR, SHIFT, ADD, or MULTIPLY do not show much of the
cost variation
• Guarded evaluation
– Turn off the power of the unused modules dynamically
– Low power Pentium, PowerPC 603, etc.
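The model sums three parts: the base cost B_i of each instruction times its execution count N_i, the circuit-state overhead O_{i,j} of each adjacent instruction pair times the pair count N_{i,j}, and the other inter-instruction effects E_k such as stalls and cache misses. A minimal sketch of evaluating it follows. The instruction names and all cost values are made up for illustration and are not measurements from the cited papers.

```python
def program_energy(base_cost, counts, pair_overhead, pair_counts, other_effects):
    """Evaluate E_p = sum(B_i*N_i) + sum(O_ij*N_ij) + sum(E_k)."""
    base = sum(base_cost[instr] * n for instr, n in counts.items())
    inter = sum(pair_overhead[pair] * n for pair, n in pair_counts.items())
    return base + inter + sum(other_effects)


# Hypothetical per-instruction base energy costs (arbitrary units).
base_cost = {"add": 1.0, "load": 1.5}
counts = {"add": 100, "load": 40}

# Hypothetical circuit-state overhead when an add is followed by a load.
pair_overhead = {("add", "load"): 0.25}
pair_counts = {("add", "load"): 30}

# Hypothetical totals for pipeline stalls and cache misses.
other_effects = [5.0, 2.0]

total = program_energy(base_cost, counts, pair_overhead, pair_counts, other_effects)
print(total)  # 174.5
```

With tables like these measured for a real processor, a compiler or profiler can compare two instruction sequences by energy instead of by size or running time.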
Software Energy Optimization
Techniques
 Reducing Memory
Accesses
• 486 DX:
– Register operand: 300 mA
– Memory read: 400 mA
Software Energy Optimization Techniques,
cont’d
 Energy cost driven code generation
• Traditional cost criteria
– Either the size or the running time
 Instruction Reordering for Low power
• Limited impact in 486DX and the SPARClite
• Beneficial for the DSP (Why?)
 Processor specific optimizations