Holistic, Energy Efficient Design @ Cardiff
Going Green Can Save Money
Dr Hugh Beedie
CTO
ARCCA & INSRV
Introduction
The Context
Drivers to Go Green
Where does all the power go?
Before the equipment
In the equipment
What should we do about it?
What Cardiff University is doing about it
The Context (1)
Cardiff University receives a £3M grant to purchase a new supercomputer
A new room is required to house it, with appropriate power, cooling, etc.
2 tenders:
Data Centre construction
Supercomputer
The Context (2)
INSRV Sustainability Mission: To minimise CU’s IT
environmental impact and to be a leader in delivering
sustainable information services.
Some current & recent initiatives:
University INSRV Windows XP image default
settings
Condor – saving energy, etc. compared to a
dedicated supercomputer
ARCCA & INSRV new Data Centre
PC Power saving project – standby 15 mins after
logout (being implemented this session)
Drivers – Why do Green IT?
Increasing demand for CPU & Storage
Lack of Space
Lack of Power
Increasing energy bills (oil prices doubled)
Enhancing the Reputation of Cardiff
University & attracting better students
Sustainable IT
Because we should (for the Planet)
Congress Report Aug 2007
US Data Centre electricity demand
doubled 2000-2006
Trends toward 20kW+ per rack
Large scope for efficiency improvement
Obvious – more efficiency at each stage
Holistic approach necessary – facility and
component improvements
Less obvious – virtualisation (up to 5X)
Where does all the power go? (1)
“Up to 50% is used before getting
to the Server”
Report to US Congress Aug 2007
Loss = £50,000 p.a. for every 100kW
supplied to the room
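The £50,000 p.a. figure can be reproduced with simple arithmetic; a minimal sketch, assuming a tariff of about 11.4p/kWh (the slides do not state the tariff used):

```python
# Annual cost of losing 50% of every 100 kW supplied to the room.
# The electricity tariff below is an assumed figure for illustration.
supplied_kw = 100.0
loss_fraction = 0.50           # "up to 50% is used before getting to the server"
tariff_gbp_per_kwh = 0.114     # assumed tariff (~11.4 p/kWh)

lost_kwh_per_year = supplied_kw * loss_fraction * 8760  # hours in a year
annual_cost_gbp = lost_kwh_per_year * tariff_gbp_per_kwh
print(f"Energy lost: {lost_kwh_per_year:,.0f} kWh/yr")
print(f"Annual cost: £{annual_cost_gbp:,.0f}")
```

At that assumed tariff the continuous 50 kW loss comes to roughly £50k a year, matching the slide.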
Where does all the power go? (2)
Where does all the power go? (3)
How?
Power conversion – before power even reaches your
room, some is lost in the HV→LV transformer
Specify 98% efficiency, not 95%
Return On Investment (ROI)?
New installation, ROI = 1 month
Replacement, ROI = 1 year
Lifetime of investment = 20+ yrs !!!!!
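The gain from the better transformer is easy to quantify; a sketch assuming a 100 kW IT load and the same illustrative tariff of £0.114/kWh (neither figure is stated on the slide):

```python
# Saving from a 98%-efficient HV->LV transformer versus a 95% one,
# for an assumed 100 kW delivered load and an assumed £0.114/kWh tariff.
load_kw = 100.0
tariff_gbp_per_kwh = 0.114

def input_power_kw(load_kw: float, efficiency: float) -> float:
    """Power drawn on the HV side to deliver `load_kw` downstream."""
    return load_kw / efficiency

saved_kw = input_power_kw(load_kw, 0.95) - input_power_kw(load_kw, 0.98)
annual_saving_gbp = saved_kw * 8760 * tariff_gbp_per_kwh
print(f"Continuous saving: {saved_kw:.2f} kW")
print(f"Annual saving: £{annual_saving_gbp:,.0f}")
```

A saving of roughly £3k per year against a modest price premium is what makes the quoted ROI so short relative to a 20+ year transformer lifetime.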
Where does all the power go? (4)
How?
Cooling infrastructure
Typical markup 75%
Lowest markup 25-30%?
Est. ROI 2-3 years (lifetime 8 years)
Where does all the power go? (5)
How?
Backup power (UPS) – efficiency varies with % load
Efficiency = 80-95%
Est. ROI for new installation <1 year
Replacement not so good – UPS life only 3-5 yrs?
Where does it go? – Bull View
[Chart: cumulative Data Centre power consumption – Loads, power delivery, and cooling, with axis ticks at 40%, 80%, and 100%.]
Where does it go? – Intel View
Component                          Power    Share
Load (CPU, memory, drives, I/O)   100W     36.4%
PSU                                50W     18.2%
Voltage regulators                 20W      7.3%
Server fans                        15W      5.5%
UPS + PDU                          20W      7.3%
Room cooling system                70W     25.5%
Total                             275W
Source: Intel Corp.
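The Intel breakdown is internally consistent and implies a striking overhead ratio; a quick check:

```python
# Sanity check of the Intel breakdown: component watts, their shares of
# the 275 W total, and the overhead per watt of useful IT load.
breakdown_w = {
    "Load (CPU, memory, drives, I/O)": 100,
    "PSU": 50,
    "Voltage regulators": 20,
    "Server fans": 15,
    "UPS + PDU": 20,
    "Room cooling system": 70,
}
total_w = sum(breakdown_w.values())
for name, watts in breakdown_w.items():
    print(f"{name:33s} {watts:4d} W  {100 * watts / total_w:5.1f}%")
overhead_ratio = total_w / breakdown_w["Load (CPU, memory, drives, I/O)"]
print(f"Total {total_w} W -> {overhead_ratio:.2f} W drawn per W of IT load")
```

In other words, 2.75 W must enter the building for every watt that reaches CPU, memory, drives, and I/O.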
Where does it go? – APC View
Server Power Consumption
Server component     Power consumption
PSU losses            38W
Fan                   10W
CPU                   80W
Memory                36W
Disks                 12W
Peripheral slots      50W
Motherboard           25W
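Summing the APC per-component figures (the total is not stated on the slide) gives the whole-server draw and shows how large a slice PSU losses alone take:

```python
# Total server draw from the APC per-component figures, and the share
# taken by PSU losses alone.
components_w = {
    "PSU losses": 38, "Fan": 10, "CPU": 80, "Memory": 36,
    "Disks": 12, "Peripheral slots": 50, "Motherboard": 25,
}
total_w = sum(components_w.values())
psu_share_pct = 100 * components_w["PSU losses"] / total_w
print(f"Total server draw: {total_w} W")
print(f"PSU losses: {psu_share_pct:.1f}% of the server's consumption")
```

Roughly one watt in seven is lost inside the power supply before any component does useful work, which is why PSU efficiency featured in the procurement.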
Options for Cardiff (1)
Carry on as before
Dual core HPC solutions
Wait for quad core
Better Flops/Watt
Saves on infrastructure (fewer network ports)
Saves on management (fewer nodes)
Saves on space
Saves on power
Options for Cardiff (2)
High density solution
Carry on as before (6-8kW per rack)
Needs specialist cooling over 8kW per rack
Probable higher TCO
Low density solution (typically)
BT – free air cooling
Allow wider operating temperature range – warranty
issues?
Not applicable here (no space)
What did Cardiff do? (1)
Ran 2 projects
TCO as key evaluation criterion
HPC equipment
Environment
Plus need to measure and report on usage
Problems
Finger pointing (needs strong project management)
Scheduling (keep everyone in the loop)
Timetable
Tender element    Date of issue   Date of order                  Reason for delay
Room Tender       April 2007      July 2007 (to Comtec)          No delay
HPC Tender        January 2007    December 2007                  Waiting for QuadCore
HV Transformer    August 2007     March 2008 (Cardiff Estates)   Long lead times on low loss transformers
What did Cardiff do? (2)
Bought Bull R422 servers for HPC
80W Quad core Harpertown
2 dual socket, quad core servers in 1U –
common PSU
Larger fans (not on CPU)
Other project in same room
IBM Bladecentres
Some pizza boxes
Back Up Power
APC 160kW Symmetra UPS
Full and half load efficiency 92%
Scalable & modular – could grow
as we grew
Strong environment management
options
Integrated with cooling units
SNMP
Bypass capability
Bought 2 (not fully populated)
1 for compute nodes
1 for management and
another project
Enhanced existing standby
generator
Cooling Inside the Room
APC Netshelter Airflow
APC InRow RC units
Provides residual cooling to the room
Resilient against loss of an RC unit
Cools hot air without mixing with
cold air
[Diagram: row layout viewed from the front – servers flanked by InRow cooling units]
Cooling – Outside the Room
3 Airedale 120kW
Chillers
Ultima Compact Free
Cool 120kW
Quiet model
Variable speed fans
N+1 arrangement
Free-cooling vs Mechanical cooling
System operated on 100% free-cooling – 12% of year
System operated on partial free-cooling – 50% of year
System operated on mechanical cooling only – 38% of year
[Chart: cooling load vs ambient temperature (axis from -7°C), with mode boundaries near 3.5°C and 12.5°C]
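Those annual fractions let us roughly estimate how much compressor energy free cooling avoids; a sketch assuming a constant cooling load year-round and that "partial free-cooling" offloads about half the compressor duty (both assumptions, not figures from the slides):

```python
# Rough estimate of compressor run-time avoided by free cooling.
# Assumes constant cooling load and that partial free-cooling leaves
# ~50% of compressor duty (an assumed figure, not from the slides).
modes = [
    # (fraction of year, fraction of compressor duty still required)
    (0.12, 0.0),   # 100% free-cooling
    (0.50, 0.5),   # partial free-cooling
    (0.38, 1.0),   # mechanical cooling only
]
compressor_duty = sum(share * duty for share, duty in modes)
print(f"Compressor duty vs no free cooling: {100 * compressor_duty:.0f}%")
print(f"Estimated compressor energy avoided: {100 * (1 - compressor_duty):.0f}%")
```

Even under these crude assumptions, free cooling in the Cardiff climate removes on the order of a third of the compressor energy.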
Cost Savings Summary
Low loss transformer
£10k p.a.
UPS
£20k p.a.
Cooling
£50k p.a. estimated
Servers
80W part - £20k p.a.
Quad core – same power but twice the ‘grunt’
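Adding up the per-item figures quoted above gives the headline annual saving:

```python
# Sum of the per-item savings quoted on the slide.
savings_gbp_pa = {
    "Low loss transformer": 10_000,
    "UPS": 20_000,
    "Cooling (estimated)": 50_000,
    "Servers (80W part)": 20_000,
}
total_gbp_pa = sum(savings_gbp_pa.values())
print(f"Estimated total saving: £{total_gbp_pa:,} p.a.")
```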
Lessons learned from SRIF3 HPC
Procurement - Summary
Strong project management essential
IT and Estates liaison essential but difficult
Good Supplier relationship essential
Major savings possible
Infrastructure (power & cooling)
Servers (density and efficiency)