Data Minimisation
Managing Data Growth While Containing Cost and Carbon Footprint
Ken Hall, Dimension Data
Friday, July 17, 2015
Agenda
Introductions
Today’s data management challenges
Energy efficiency in the data centre
What is Data Minimisation?
Online Active Archiving
Backup Data De-Duplication
Data Minimisation effects
Developing the business case
Questions & Answers
Dimension Data - ‘Data Centre & Storage Solutions’
Network Integration
Security
Microsoft Solutions Infrastructure
Managed Services
Microsoft Solutions Application Integration
Customer Interactive Solutions
Data Centre & Storage Solutions – Availability, Compliance & Optimisation
• Storage Solutions – SAN, NAS, CAS
• Virtualisation Solutions – DR, Server & Desktop Consolidation
• Backup, Recovery & Archiving Solutions
• Data Centre Environmentals – Power, Cooling & Rack Solutions
Key Technology Partners
• APC, Cisco, EMC, HDS, HP, IBM, Microsoft, NetApp, Quantum, Symantec, Sun
The Digital Universe is Rapidly Expanding
Amount of Digital Information Created and Replicated Each Year
[Chart: exabytes per year (axis 0 to 1,800), 2006 to 2011, rising from 173 exabytes to 1,773 exabytes]
Ten-fold growth in five years!
Source: IDC White Paper, "The Diverse and Exploding Digital Universe," March 2008
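As a rough check on the "ten-fold growth in five years" claim, the sketch below derives the constant yearly growth rate implied by the chart's two labelled values. The function name is hypothetical, and treating 173 and 1,773 exabytes as the start and end of a five-year span is an assumption based on the slide text.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns start into end."""
    return (end / start) ** (1 / years) - 1


# Chart labels: 173 exabytes at the start, 1,773 exabytes five years later.
rate = implied_annual_growth(173, 1773, 5)
growth_factor = 1773 / 173

print(f"Growth factor over five years: {growth_factor:.1f}x")
print(f"Implied compound growth: {rate:.1%} per year")
```

A growth factor just above ten over five years corresponds to a compound rate a little under 60% per year.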
Typical DD Customer – Exponential Data Growth
• Annual Compound Data Growth of 65%
• Daily Incremental and Weekly Full
• 2 Week Retention on Disk (3 Fulls, 10 Incrementals)
• 4 Week Retention on Tape
• 12 Monthlies on Tape kept indefinitely
• Having to squeeze more into Backup Window
• B2D Requirement Growing Rapidly
• Backup Media Server/s Under Pressure
• Network Bandwidth Constraints
• Tape Infrastructure & Handling Costs Increasing
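To make the 65% annual compound growth figure concrete, here is a minimal projection sketch. The 10 TB starting capacity and the function name are illustrative assumptions; only the 65% rate comes from the slide.

```python
def project_capacity(start_tb: float, annual_growth: float, years: int) -> list:
    """Capacity at year 0 through year `years` under compound growth."""
    return [start_tb * (1 + annual_growth) ** year for year in range(years + 1)]


# start_tb = 10 is an assumed starting point; 0.65 is the slide's growth rate.
projection = project_capacity(10.0, 0.65, 5)

for year, tb in enumerate(projection):
    print(f"Year {year}: {tb:.1f} TB")
```

At this rate the data set grows by a factor of roughly twelve in five years, which is why the backup window, disk retention and tape handling all come under pressure at once.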
Coping with Information Growth in Today’s Economy
In 2009, IT budgets are flat or declining*
• Escalating costs for primary storage
• Difficulty meeting backup and recovery windows
• Ensuring high availability of information
• Providing timely access to historical information
*“Global purchases of IT goods and services… will equal $1.66 trillion in 2009, declining by 3 percent after an 8 percent rise in 2008.”
Global IT Market Outlook: 2009, Forrester Research, January 12, 2009
Data Center Energy Use is Doubling
Comparison of Projected Electricity Use, 2007 to 2011
[Chart: annual electricity use (billion kWh/year, axis 0 to 140), 2000 to 2011; series: Historical energy use, State of the art scenario]
• IT energy use has doubled since 2000 and will likely double again by 2011
• Energy operating costs will soon exceed the cost of purchase for servers
• Existing conservation technologies can reduce consumption to 2002 levels
Source: EPA report to Congress, 2007
Available Capabilities for Energy Efficiency
Improve Efficiency – Reduce Energy Consumption
REDUCE CAPACITY
Snaps
Clones
Compression
De-duplication
Archiving
INCREASE UTILIZATION
Server virtualisation
Data migration
Storage consolidation
Virtual Provisioning
Flash drives
Optimisation algorithms
Automated discovery
Document management
Storage tiering
Virtual LUNS
File and e-mail tiering
Storage virtualisation
Large-capacity drives
Replication across storage tiers
How can we...
Manage exponential data growth, while...
• Improving access to organisational data
• Containing data management and infrastructure costs
• Reducing the data centre’s carbon footprint...
Implement a Data Minimisation Strategy
• Online archiving of e-mail and file systems
• Backup with data de-duplication
Data Minimisation Elements
New Technologies and Services are Enablers
Primary Storage
Archive
Identify candidates for archiving
Classify and move
Backup
Establish SLAs based on information class
• Retention and compliance
• Tier backup infrastructure
• Data reduction
• Optimise media: B2D, VTL, de-dupe and tape
• Universal access
• Simplify management
• Address security issues
• Simplify management
Data Minimisation – How it works
1. Archive the inactive data before you perform the backup process
• Identify inactive data based on policies
• Automate the movement of the data to a lower cost storage tier or dedicated archive platform, leaving stubs behind
• Items are retrieved from the online archive on user demand
• Back up the archive infrequently or never
2. Back up the remaining data using resource-efficient data de-duplication
• Rapid ‘Full Backups’ - only the ‘sub-file’ changes are sent and stored on disk
• Minimal Bandwidth – only a fraction of the typical 200% is sent over the wire
• Minimal Storage Consumption – only unique ‘sub-file’ blocks are stored
• Protect more, with less, for longer
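Step 1 above can be sketched as a simple age-based policy that splits a file inventory into data to keep backing up and data to move to the archive. This is a minimal illustration, not any product's implementation: the `FileRecord` type, the sample paths and the 180-day threshold are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FileRecord:
    path: str
    size_gb: float
    last_accessed: date


def split_by_policy(files, today, max_idle_days):
    """Partition files into (active, inactive) by days since last access."""
    cutoff = today - timedelta(days=max_idle_days)
    active = [f for f in files if f.last_accessed >= cutoff]
    inactive = [f for f in files if f.last_accessed < cutoff]
    return active, inactive


# Hypothetical inventory; paths, sizes and dates are made up for illustration.
today = date(2015, 7, 17)
files = [
    FileRecord("/projects/current/plan.doc", 2.0, date(2015, 7, 1)),
    FileRecord("/projects/2012/report.pdf", 5.0, date(2012, 11, 3)),
    FileRecord("/projects/2013/budget.xls", 1.0, date(2013, 2, 20)),
]

# Assumed policy: anything untouched for 180 days is an archive candidate.
active, inactive = split_by_policy(files, today, max_idle_days=180)

archive_gb = sum(f.size_gb for f in inactive)
backup_gb = sum(f.size_gb for f in active)
print(f"Archive {archive_gb} GB, back up {backup_gb} GB")
```

Only the active set then goes through the de-duplicating backup in step 2, so the policy threshold directly controls how much data the backup has to process.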
Today: Energy-Efficient Storage Design
1 TB Data on Different Capacity/Performance Drives
[Chart: annual energy use for 1 TB of data, by drive type: 73 GB flash drive, 15K 73 GB, 15K 146 GB, 10K 300 GB, 7.2K 500 GB, 7.2K 1 TB]
Values shown: 6,096 kWh/yr, 3,790 kWh/yr, 3,048 kWh/yr, 1,434 kWh/yr, 787 kWh/yr, 393 kWh/yr
Callouts: 38% less energy, 30x IOPS, 50%, 73%, 87%, 94%
CONSUME LESS ENERGY BY CAPACITY
File System Archiving
• Extract inactive, final-form data to an archive
• Enhance performance of production applications
• Reduce size of backup datasets
• Free up expensive Tier 1 disk
• Store archived data on high density, low cost, energy efficient storage
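[Diagram: before and after archiving]
Before: 10 TB of production data on primary storage; full backup of 10 TB.
After: 4 TB of active data remains on primary storage and 6 TB of storage is reclaimed; 6 TB of inactive data is extracted to an always-available active archive on secondary storage; backup covers 4 TB, active data only.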
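The before-and-after figures on this slide reduce to simple arithmetic, sketched below with the slide's 10 TB and 4 TB values. The variable names are illustrative.

```python
production_tb = 10  # total production data before archiving (from the slide)
active_tb = 4       # active data left on primary storage (from the slide)

# Whatever is not active is extracted to the archive and reclaimed on Tier 1.
inactive_tb = production_tb - active_tb

# Fraction by which the full backup shrinks once only active data is backed up.
backup_reduction = 1 - active_tb / production_tb

print(f"Archived / reclaimed: {inactive_tb} TB")
print(f"Full backup shrinks by {backup_reduction:.0%}")
```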
E-Mail Archiving
Mail archival automatically creates shortcuts to archived messages and attachments, and deletes the original attachments from the e-mail server
[Diagram: User’s Inbox on the Message Server alongside the E-mail Archive on the E-mail Archive Server]
Space saved on e-mail server is typically 60–80%
Messages shown in both the inbox and the archive:
Message 1, Jan. 1, 2008, To: Rick, Subject: Question, Attached: shortcut
Message 2, Jan. 1, 2008, To: Ron, Subject: Update, Attached: shortcut
Message 3, Feb. 1, 2008, To: Bill, Subject: Training
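The shortcut mechanism described on this slide can be sketched as follows: attachments are copied to an archive store, removed from the message, and replaced with a shortcut key. The `Message` type, the key format and the in-memory `archive` dictionary are hypothetical stand-ins, not the behaviour of any specific archiving product.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    subject: str
    attachments: dict = field(default_factory=dict)  # file name -> bytes
    shortcuts: list = field(default_factory=list)    # keys into the archive


def archive_attachments(message, archive):
    """Move attachments into the archive, leave shortcuts, return bytes freed."""
    freed = 0
    for name, payload in list(message.attachments.items()):
        key = f"{message.subject}/{name}"  # assumed key scheme
        archive[key] = payload
        message.shortcuts.append(key)
        freed += len(payload)
        del message.attachments[name]
    return freed


archive = {}
msg = Message("Question", {"slides.ppt": b"x" * 1000})
freed = archive_attachments(msg, archive)

print(f"Freed {freed} bytes on the mail server")
print(f"Shortcuts left in the message: {msg.shortcuts}")
```

The message stays in the user's inbox; only the bulky payload moves, which is where the space saving on the e-mail server comes from.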
Definition of De-duplication
“The process of detecting and identifying the unique data
segments within a given set of information, enabling the
elimination of redundancy when stored or moved.”
[Diagram: Data Set 1, Data Set 2 and Data Set 3 before and after de-duplication]
Before: total segments = 39
After: unique segments = 6
Data De-duplication: How it Works
• First Instance (May 2007): segments A, B, C, D. Only unique data segments are backed up: A, B, C, D.
• Duplicate Instance (May 2007): segments A, B, C, D. Data already backed up, so only a unique ID pointer is stored (20 bytes).
• Modified Instance (June 2008): segments E, B, C, D. New data segment identified and backed up: E.
Unique data stored on disk, available for immediate recovery: A, B, C, D, E
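The first, duplicate and modified instances above can be reproduced with a small content-addressed store. This is a sketch under assumptions: the slide only says the pointer is 20 bytes, and SHA-1 is used here because its digest happens to be 20 bytes; the `DedupStore` class and the one-byte segments are illustrative.

```python
import hashlib


class DedupStore:
    """Keeps each unique segment once, keyed by its 20-byte SHA-1 digest."""

    def __init__(self):
        self.segments = {}  # digest -> segment bytes

    def backup(self, segments):
        """Store unseen segments; return (list of pointers, count of new segments)."""
        recipe = []
        new_segments = 0
        for segment in segments:
            digest = hashlib.sha1(segment).digest()
            if digest not in self.segments:
                self.segments[digest] = segment
                new_segments += 1
            recipe.append(digest)
        return recipe, new_segments

    def restore(self, recipe):
        """Rebuild a backup from its list of pointers."""
        return [self.segments[digest] for digest in recipe]


store = DedupStore()
first, new_first = store.backup([b"A", b"B", b"C", b"D"])        # first instance
duplicate, new_dup = store.backup([b"A", b"B", b"C", b"D"])      # duplicate instance
modified, new_mod = store.backup([b"E", b"B", b"C", b"D"])       # modified instance

print(new_first, new_dup, new_mod, len(store.segments))  # 4 0 1 5
```

Each backup is kept as a list of pointers, so any of the three can be restored in full even though only five unique segments are held on disk.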
Key Point – Data Minimisation requires a platform that doesn’t need to be backed up!
Archiving Functionality
Customer Archival Requirements
WORM DISK
Active Archiving
WORM delivers unique features for online archives
• Location independence
• Self-healing and management
• Guaranteed authenticity
• Single-instancing
Online Archiving
Tier 3 Disk
Tier 3 Disk with SATA and NAS with ATA
Offline Archiving
Tape is best suited for offline archives
Tape
Management Efficiency
Data Minimisation Strategy - How it all fits together
[Diagram: automated movement of data between storage tiers relative to age]
Tier 1: Primary Storage
Tier 2: Secondary Storage
Tier 3: Archive, long-term retention on disk
Tier 4: Backup to disk (de-dupe), quick recovery
Tier 5: Legacy long-term retention on tape
Diagram labels: Static data growth; Tier 3 data growth; OH; No management required; 80% of data; De-duped data; Optional 20% data backup; Optional 20%
Quantified Results – Reduce Tier 1/2 with Archiving
Major reduction in expensive Tier 1/2 storage
Tier 3 Archive storage minimised due to single instancing & compression
73% reduction in power and cooling requirements for archived data
Quantified Results – The Data Minimisation Leverage
Good Tier 4 Savings with Archiving or De-Duplication
Excellent results by combining Archiving & Backup Data De-Duplication
6 x reduction in power and cooling requirements for B2D storage
Quantified Results – Less Tape Infrastructure
Associated reduction in Tape Library Slots, Drives, Management & Handling
Power of combining Archiving & De-Duplication – 560 fewer LTO4 tapes in Year 3
Tape could be removed altogether – Offsite Replication & Disk Spin-Down
Data management cost comparison – Data Minimisation
[Chart: New Data Management Annual Costs, axis $0 to $3,000,000, comparing Old Cost and New Cost for Present State and Years 1 to 5]
Significant Reduction of Backup Infrastructure and Tape Management
• Tape Drive, Tape Licences, Slots, Library, Backup Server, Tape Media, Offsite Storage & Recall Costs, Admin Costs
© Copyright Dimension Data 2000 - 2006
Data Minimisation Assessment – Business Case
• Current backup minimisation methods give you more efficient backups
• However, they don't fix the cause of the problem, which is data growth
• A combination of data archival, backup de-duplication and compression represents the most effective way to contain data within your environment
• Helps quantify the business case for archiving (or other appropriate solution)
• Workshop to identify costs/issues
© Copyright Dimension Data 2000 - 2008
Data Minimisation – Input Variables
Data Minimisation – Graphical View
Data Minimisation – Graphical View (Cont.)
Data minimisation strategy achieved by...
• Archiving over 70% of data to a protected environment, which removed the need for that data to be backed up
• Minimised the impact of data backup via de-duplication and compression (reduction in data volume and backup data by 80%)
• Minimised the impact of VMware on the environment through de-duplication
• Contained Tier 1 disk growth and spend
• Provided the most storage efficient backup method possible today
• Estimated savings to be over 5 Million dollars in 5 years.
[Chart: Footprint, 2006 to 2016; axes: Units / kW / Tons (0 to 4,500) and sq. ft. (0 to 25,000); series: Power (kW), Cooling (Tons), Footprint (sq. ft.), Equipment (Units)]
[Chart: Estimated Infrastructure Run Rate, "K$", axis $0 to $4,500]
                 Year 1   Year 2   Year 3   Total
Cost BAU         $708     $1,410   $2,107   $4,226
Cost Optimized   $278     $560     $840     $1,678
Savings          $430     $850     $1,267   $2,548
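The run-rate figures above can be cross-checked with a few lines of arithmetic. The sketch assumes the values are in thousands of dollars, as the chart's "K$" label suggests; the dictionary names are illustrative.

```python
# Yearly figures as printed on the slide (assumed to be in K$).
cost_bau = {"Year 1": 708, "Year 2": 1410, "Year 3": 2107}
cost_optimized = {"Year 1": 278, "Year 2": 560, "Year 3": 840}

# Savings per year is business-as-usual cost minus optimized cost.
savings = {year: cost_bau[year] - cost_optimized[year] for year in cost_bau}
total_savings = sum(savings.values())
saving_share = total_savings / sum(cost_bau.values())

for year, value in savings.items():
    print(f"{year}: {value}")
print(f"Three-year savings: {total_savings} ({saving_share:.0%} of BAU cost)")
```

The yearly savings match the slide exactly. The summed totals come out one unit below the slide's printed totals for Cost BAU and Savings, which is consistent with the slide rounding each column from unrounded figures.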
‘My initial sync took 12 hours; now I back up in 50 mins’ – Dimension Data Customer
Questions & Answers
Friday, July 17, 2015