Transcript Slide 1

Everything You Wanted to Know About Storage, but Were Afraid to Ask

•Do you have a cell phone, PDA, or smartphone?

•Do you have a digital camera?

•Do you have a PC?

•What do all of these devices have in common ?

• How do you protect your data?

Digital Footprint Calculator

http://www.emc.com/digital_universe/downloads/web/personal-ticker.htm

•Are you familiar with RAID ?

RAID 0

• Data is striped across the HDDs in a RAID set • The stripe size is specified at a host level for software RAID and is vendor specific for hardware RAID • When the number of drives in the array increases, performance improves because more data can be read or written simultaneously • Used in applications that need high I/O throughput • Does not provide data protection and availability in the event of drive failures
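The striping idea can be sketched in a few lines. This is a minimal illustration, assuming strips are handed out round-robin across the disks; the function name `raid0_locate` and the parameter `strip_blocks` are hypothetical, and real controllers use vendor-specific layouts.

```python
def raid0_locate(logical_block: int, num_disks: int, strip_blocks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    strip_index = logical_block // strip_blocks       # which strip, counted across the whole set
    offset_in_strip = logical_block % strip_blocks    # position inside that strip
    disk = strip_index % num_disks                    # strips rotate round-robin across disks
    stripe_row = strip_index // num_disks             # complete stripes that come before this one
    return disk, stripe_row * strip_blocks + offset_in_strip


if __name__ == "__main__":
    # Consecutive strips land on different disks, so they can be accessed in parallel.
    for block in range(8):
        print(block, raid0_locate(block, num_disks=4, strip_blocks=2))
```

Because neighbouring strips sit on different drives, adding drives lets more of a large request be serviced at the same time, which is the performance effect described above.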

RAID 1

• Mirroring is a technique whereby data is stored on two different HDDs, yielding two copies of data. • In addition to providing complete data redundancy, mirroring enables faster recovery from disk failure. • Mirroring involves duplication of data: the amount of storage capacity needed is twice the amount of data being stored. Therefore, mirroring is considered expensive • It is preferred for mission-critical applications that cannot afford data loss

Nested RAID

• Mirroring can be implemented with striped RAID by mirroring entire stripes of disks to stripes on other disks • RAID 0+1 and RAID 1+0 combine the performance benefits of RAID 0 with the redundancy benefits of RAID 1 • These types of RAID require an even number of disks, the minimum being four.

• RAID 0+1 is also called mirrored stripe. • This means that data is first striped across HDDs and then the entire stripe is mirrored.

Nested RAID

• RAID 1+0 is also called striped mirror • The basic element of RAID 1+0 is that data is first mirrored and then both copies of data are striped across multiple HDDs in a RAID set • Some applications that benefit from RAID 1+0 include the following: • High transaction rate Online Transaction Processing (OLTP) • Database applications that require high I/O rate, random access, and high availability
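A small sketch of the striped-mirror layout, under an assumed disk numbering in which mirrored pair p is made of disks 2p and 2p+1. The function names are hypothetical and the numbering is only for illustration.

```python
def raid10_locate(logical_strip: int, num_pairs: int) -> tuple[int, int]:
    """Return the two disks that hold a strip: strips rotate across mirrored pairs."""
    pair = logical_strip % num_pairs
    return 2 * pair, 2 * pair + 1        # both members of the pair hold the same strip


def raid10_survives(failed_disks: set[int], num_pairs: int) -> bool:
    """The set stays available as long as no mirrored pair has lost both members."""
    return all(not {2 * p, 2 * p + 1} <= failed_disks for p in range(num_pairs))


if __name__ == "__main__":
    print(raid10_locate(3, num_pairs=2))          # strip 3 lives on the second pair
    print(raid10_survives({0, 2}, num_pairs=2))   # one disk lost from each pair
    print(raid10_survives({0, 1}, num_pairs=2))   # a whole pair lost
```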

RAID 3

• RAID 3 stripes data for high performance and uses parity for improved fault tolerance. • Parity information is stored on a dedicated drive so that data can be reconstructed if a drive fails • RAID 3 is used in applications that involve large sequential data access, such as video streaming.
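How parity lets a lost strip be reconstructed can be shown with XOR, which is the usual way single parity is computed. The slides do not name the operation, so treat XOR and the helper names below as assumptions of this sketch.

```python
from functools import reduce


def xor_parity(strips: list[bytes]) -> bytes:
    """XOR equal-length strips byte by byte to produce the parity strip."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def rebuild(surviving_strips: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing strip by XOR-ing the survivors with the parity."""
    return xor_parity(surviving_strips + [parity])


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = xor_parity(data)
    # Pretend the drive holding data[1] failed and rebuild its strip.
    print(rebuild([data[0], data[2]], parity) == data[1])
```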

RAID 4

• Stripes data across all disks except the parity disk at the block level • Parity information is stored on a dedicated disk • Unlike RAID 3 , data disks can be accessed independently so that specific data elements can be read or written on a single disk without read or write of an entire stripe

RAID 5

• RAID 5 is a very versatile RAID implementation • The difference between RAID 4 and RAID 5 is the parity location. • In RAID 4, parity is written to a dedicated drive, while in RAID 5, parity is distributed across all disks • The distribution of parity in RAID 5 overcomes the write bottleneck of a dedicated parity drive. • RAID 5 is preferred for messaging, medium-performance media serving, and relational database management system (RDBMS) implementations in which database administrators (DBAs) optimize data access
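The difference in parity placement can be sketched as follows. The rotation used here (parity moves one disk to the left on each stripe) is just one possible layout chosen for illustration; the function names are hypothetical and actual arrays may rotate differently.

```python
def raid4_parity_disk(stripe: int, num_disks: int) -> int:
    """RAID 4: parity for every stripe lives on the same dedicated (last) disk."""
    return num_disks - 1


def raid5_parity_disk(stripe: int, num_disks: int) -> int:
    """RAID 5 (assumed rotation): parity shifts to a different disk on each stripe."""
    return (num_disks - 1 - stripe) % num_disks


if __name__ == "__main__":
    for stripe in range(5):
        print(stripe, raid4_parity_disk(stripe, 4), raid5_parity_disk(stripe, 4))
```

With the dedicated layout every write has to update the same drive, while the rotated layout spreads those parity updates over all drives.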

RAID 6

• RAID 6 works the same way as RAID 5 except that RAID 6 includes a second parity element • Maintaining two parities enables a RAID group to survive the failure of two disks
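The capacity cost of each RAID level described so far can be compared with a short calculation. This is a sketch assuming equal-sized drives; the minimum drive counts for the parity levels (three for RAID 3/4/5, four for RAID 6) are commonly quoted figures rather than something stated on the slides, and `usable_capacity` is a hypothetical helper.

```python
def usable_capacity(level: str, num_disks: int, disk_gb: float) -> float:
    """Usable capacity in GB for a RAID set built from equal-sized drives."""
    if level == "0":
        return num_disks * disk_gb                 # no redundancy, nothing is lost to protection
    if level == "1":
        if num_disks != 2:
            raise ValueError("this sketch models RAID 1 as a single mirrored pair")
        return disk_gb                             # half of the raw capacity
    if level in ("0+1", "1+0"):
        if num_disks < 4 or num_disks % 2:
            raise ValueError("nested RAID needs an even number of disks, minimum four")
        return num_disks // 2 * disk_gb            # every block is stored twice
    if level in ("3", "4", "5"):
        if num_disks < 3:
            raise ValueError("single-parity RAID needs at least three disks (assumed)")
        return (num_disks - 1) * disk_gb           # one drive's worth of parity
    if level == "6":
        if num_disks < 4:
            raise ValueError("RAID 6 needs at least four disks (assumed)")
        return (num_disks - 2) * disk_gb           # two drives' worth of parity
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    for level, disks in [("0", 4), ("1+0", 4), ("5", 5), ("6", 6)]:
        print(level, disks, usable_capacity(level, disks, 300))
```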

Hot Spare

• A hot spare refers to a spare HDD in a RAID array that temporarily replaces a failed HDD of a RAID set. • When the failed HDD is replaced with a new HDD, one of two things happens: either the hot spare replaces the new HDD permanently, and a new hot spare must be configured on the array; or data from the hot spare is copied to the new HDD, and the hot spare returns to its idle state, ready to replace the next failed drive.

• A hot spare should be large enough to accommodate data from a failed drive. • Some systems implement multiple hot spares to improve data availability.

• A hot spare can be configured as automatic or user initiated, which specifies how it will be used in the event of disk failure

What is an Intelligent Storage System

• Intelligent storage systems are RAID arrays that: – Are highly optimized for I/O processing – Have large amounts of cache for improving I/O performance – Have operating environments that provide intelligence for managing cache, array resource allocation, connectivity for heterogeneous hosts, and advanced array-based local and remote replication options

Components of an Intelligent Storage System

• An intelligent storage system consists of four key components: front end, cache, back end, and physical disks.

Components of an Intelligent Storage System

• The front end provides the interface between the storage system and the host. It consists of two components: front-end ports and front-end controllers. • The front-end ports enable hosts to connect to the intelligent storage system. Each front-end port has processing logic that executes the appropriate transport protocol, such as SCSI, Fibre Channel, or iSCSI, for storage connections. • Front-end controllers route data to and from cache via the internal data bus. When cache receives write data, the controller sends an acknowledgment message back to the host.

Components of an Intelligent Storage System

• Controllers optimize I/O processing by using command queuing algorithms • Command queuing is a technique implemented on front-end controllers • It determines the execution order of received commands and can reduce unnecessary drive head movements and improve disk performance
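The effect of reordering commands can be sketched with a simple elevator-style policy: service everything at or beyond the current head position in ascending order, then sweep back. This policy and the function names are assumptions for illustration; the slides do not say which queuing algorithm a given array uses.

```python
def reorder_by_position(pending_lbas: list[int], head: int) -> list[int]:
    """Serve requests ahead of the head in ascending order, then the rest on the way back."""
    ahead = sorted(lba for lba in pending_lbas if lba >= head)
    behind = sorted((lba for lba in pending_lbas if lba < head), reverse=True)
    return ahead + behind


def head_travel(order: list[int], head: int) -> int:
    """Total distance the head moves when servicing requests in the given order."""
    total = 0
    for lba in order:
        total += abs(lba - head)
        head = lba
    return total


if __name__ == "__main__":
    pending = [90, 10, 60, 20, 80]
    reordered = reorder_by_position(pending, head=50)
    print(pending, head_travel(pending, 50))       # arrival order
    print(reordered, head_travel(reordered, 50))   # after queuing
```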

Intelligent Storage System: Cache

• Cache is an important component that enhances the I/O performance in an intelligent storage system.

• Cache improves storage system performance by isolating hosts from the mechanical delays associated with physical disks, which are the slowest components of an intelligent storage system. Accessing data from a physical disk usually takes a few milliseconds • Accessing data from cache takes less than a millisecond. Write data is placed in cache and then written to disk
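A toy model of the behaviour described above: reads are served from cache when possible, and writes are placed in cache first and written to disk later. The least-recently-used eviction policy, the class name `WriteBackCache`, and the dictionary standing in for the physical disks are all assumptions of this sketch.

```python
from collections import OrderedDict


class WriteBackCache:
    def __init__(self, capacity: int, disk: dict):
        self.capacity = capacity
        self.disk = disk                    # dict standing in for the physical disks
        self.pages = OrderedDict()          # block -> (data, dirty flag), oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.pages:             # read hit: no disk access needed
            self.hits += 1
            self.pages.move_to_end(block)
            return self.pages[block][0]
        self.misses += 1                    # read miss: fetch from disk, keep a copy
        data = self.disk[block]
        self._insert(block, data, dirty=False)
        return data

    def write(self, block, data):
        self._insert(block, data, dirty=True)   # the write is held in cache first

    def _insert(self, block, data, dirty):
        self.pages[block] = (data, dirty)
        self.pages.move_to_end(block)
        while len(self.pages) > self.capacity:  # evict least recently used page
            old_block, (old_data, old_dirty) = self.pages.popitem(last=False)
            if old_dirty:
                self.disk[old_block] = old_data  # de-stage dirty data before dropping it

    def flush(self):
        for block, (data, dirty) in list(self.pages.items()):
            if dirty:
                self.disk[block] = data
                self.pages[block] = (data, False)


if __name__ == "__main__":
    disk = {1: b"a", 2: b"b"}
    cache = WriteBackCache(capacity=2, disk=disk)
    cache.read(1)
    cache.read(1)
    cache.write(3, b"c")
    print(cache.hits, cache.misses, 3 in disk)
    cache.flush()
    print(disk[3])
```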

Cache Data Protection

• Cache mirroring: Each write to cache is held in two different memory locations on two independent memory cards. • Cache vaulting: Cache is exposed to the risk of uncommitted data loss due to power failure. One option is to use battery power to write the cache content to disk; storage vendors use a set of physical disks to dump the contents of cache during power failure.

Intelligent Storage System: Back End

• The back end consists of two components: back-end ports and back-end controllers • Physical disks are connected to ports on the back end. • The back-end controller communicates with the disks when performing reads and writes and also provides additional, but limited, temporary data storage. • The algorithms implemented on back-end controllers provide error detection and correction, along with RAID functionality. • Multiple controllers also facilitate load balancing

Intelligent Storage System: Physical Disks

• Disks are connected to the back-end with either SCSI or a Fibre Channel interface

What are LUNs?

• Physical drives or groups of RAID-protected drives can be logically split into volumes known as logical volumes, commonly referred to as Logical Unit Numbers (LUNs)

High-end Storage Systems

• High-end storage systems, referred to as active-active arrays, are generally aimed at large enterprises for centralizing corporate data • These arrays are designed with a large number of controllers and cache memory • An active-active array implies that the host can perform I/Os to its LUNs across any of the available paths

Midrange Storage Systems

• Also referred to as active-passive arrays • Host can perform I/Os to LUNs only through active paths • Other paths remain passive until the active path fails • Midrange arrays have two controllers, each with cache, RAID controllers, and disk drive interfaces • Designed for small and medium enterprises • Less scalable compared to high-end arrays
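The difference between the two array types can be sketched as a path-selection rule. The `Path` class and `usable_paths` function are hypothetical names; real multipathing software is considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    active: bool          # True for a path through the controller that owns the LUN
    healthy: bool = True


def usable_paths(paths: list[Path], array_type: str) -> list[Path]:
    """Return the paths a host may use for I/O to a LUN."""
    healthy = [p for p in paths if p.healthy]
    if array_type == "active-active":
        return healthy                               # any available path can carry I/O
    if array_type == "active-passive":
        active = [p for p in healthy if p.active]
        # Passive paths are used only once every active path has failed.
        return active if active else healthy
    raise ValueError(f"unknown array type: {array_type}")


if __name__ == "__main__":
    paths = [Path("A", active=True), Path("B", active=False)]
    print([p.name for p in usable_paths(paths, "active-active")])
    print([p.name for p in usable_paths(paths, "active-passive")])
    paths[0].healthy = False                         # the active path fails
    print([p.name for p in usable_paths(paths, "active-passive")])
```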

CLARiiON Whiteboard Video

DAS

DAS

Direct-Attached Storage (DAS)

• storage connects directly to servers • applications access data from DAS using block-level access protocols • Examples: • internal HDD of a host, • tape libraries, and • directly connected external HDD

DAS

Direct-Attached Storage (DAS)

• DAS is classified as internal or external, based on the location of the storage device with respect to the host.

Internal DAS:

storage device internally connected to the host by a serial or parallel bus • distance limitations for high-speed connectivity • can support only a limited number of devices, and • occupy a large amount of space inside the host

DAS

Direct-Attached Storage (DAS)

External DAS:

server connects directly to the external storage device • usually communication via SCSI or FC protocol. • overcomes the distance and device count limitations of internal DAS, and • provides centralized management of storage devices.

DAS Benefits

• Ideal for local data provisioning • Quick deployment for small environments • Simple to deploy • Reliability • Low capital expense • Low complexity

DAS Connectivity Options

• Host and storage device communicate via protocols:

ATA/IDE and SATA

– Primarily for internal bus

SCSI

– Parallel (primarily for internal bus) – Serial (external bus)

FC

– High speed network technology

DAS Connectivity Options

• protocols are implemented on the HDD controller • a storage device is also known by the name of the protocol it supports

DAS Management

• LUN creation, filesystem layout, and data addressing

Internal

– Host (or third-party software) provides: • Disk partitioning (Volume management) • File system layout

DAS Management

External

– Array based management – Lower TCO for managing data and storage Infrastructure

DAS Challenges

• limited scalability • Number of connectivity ports to hosts • Number of addressable disks • Distance limitations •For internal DAS, maintenance requires downtime • Limited ability to share resources (unused resources cannot be easily re-allocated) – Array front-end port, storage space – Resulting in islands of over and under utilized storage pools

Introduction to SCSI

•SCSI–3 is the latest version of SCSI

SCSI Architecture

SCSI-3 command protocol: primary commands common to all devices

SCSI Architecture

Transport layer protocols: standard rules for device communication and information sharing

SCSI Architecture

Physical layer interconnects: interface details such as electrical signaling methods and data transfer modes

SCSI Device Model

• SCSI initiator device – Issues commands to SCSI target devices – Example: SCSI host adaptor

SCSI Device Model

• SCSI target device – Executes commands issued by initiators – Examples: SCSI peripheral devices

SCSI Device Model

• Device requests contain

Command Descriptor Block (CDB)

SCSI Device Model

• CDB structure – defines the command to be executed – starts with an 8-bit operation code, followed by command-specific parameters and a control parameter
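To make the operation code / parameters / control layout concrete, here is a sketch that packs a 10-byte READ(10) command descriptor block. READ(10) is not mentioned on the slides and is chosen only as a familiar example; the helper name `build_read10_cdb` is hypothetical.

```python
import struct

READ_10 = 0x28  # operation code of the SCSI READ(10) command


def build_read10_cdb(lba: int, num_blocks: int, control: int = 0) -> bytes:
    """Pack operation code, command-specific parameters, and control byte into a CDB."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("READ(10) carries a 32-bit logical block address")
    if not 0 <= num_blocks <= 0xFFFF:
        raise ValueError("READ(10) carries a 16-bit transfer length")
    # Layout: opcode, flags, LBA (4 bytes), group number, transfer length (2 bytes), control.
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, num_blocks, control)


if __name__ == "__main__":
    cdb = build_read10_cdb(lba=0x1234, num_blocks=8)
    print(len(cdb), cdb.hex())
```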

SCSI Addressing

Initiator ID: a number from 0 to 15 with the most common value being 7

SCSI Addressing

Target ID: a number from 0 to 15

SCSI Addressing

LUN: a number that specifies a device addressable through a target

SCSI Addressing Example

controller target device

Areas Where DAS Fails

• Just-in-time information to business users • Integration of information infrastructure with business processes • Flexible and resilient storage architecture

The Solution?

• Storage Networking • FC SAN • NAS • IP SAN

What is a SAN ?

• Dedicated high speed network of servers and shared storage devices • Provide block level data access

What is a SAN ?

• Resource Consolidation – Centralized storage and management • Scalability – Theoretical limit: approx. 15 million devices • Secure Access

Fibre Channel

• A high-speed network technology that runs on high-speed optical fiber cables (for front-end SAN connectivity) and serial copper cables (for back-end disk connectivity) • Latest FC implementations support 8 Gb/s

FC SAN Evolution

Components of SAN

• Three basic components: servers, network infrastructure, and storage • These can be further broken down into the following key elements: node ports, cabling, interconnecting devices (such as FC switches or hubs), storage arrays, and SAN management software

Components of SAN: Node ports

• Examples of nodes – Hosts, storage, and tape libraries • Ports are available on: – HBA in host – Front-end adapters in storage • Each port has a transmit (Tx) link and a receive (Rx) link • HBAs perform low-level interface functions automatically to minimize impact on host performance

Components of SAN: Cabling

• Copper cables for short distance • Optical fiber cables for long distance – Single-mode • Can carry a single beam of light • Distance up to 10 km – Multi-mode • Can carry multiple beams of light simultaneously • Distance up to 500 meters

Components of SAN: Cabling (connectors)

Node Connectors: • SC Duplex Connectors • LC Duplex Connectors Patch panel Connectors: • ST Simplex Connectors

Components of SAN: Interconnecting devices

– Hubs – Switches and – Directors

Components of SAN: Storage array

• storage consolidation and centralization • provides – High Availability/Redundancy – Performance – Business Continuity – Multiple host connect

Components of SAN: SAN management software

• A suite of tools used in a SAN to manage the interface between host and storage arrays • Provides integrated management of SAN environment • Web based GUI or CLI

SAN Interconnectivity Options: FC-AL

Fibre Channel Arbitrated Loop (FC-AL) – Devices must arbitrate to gain control – Devices are connected via hubs – Supports up to 127 devices

SAN Interconnectivity Options: FC-SW

Fabric connect (FC-SW) – Dedicated bandwidth between devices – Support up to 15 million devices – Higher availability than hubs
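The "15 million devices" figure can be reproduced from the fabric address format. This sketch assumes the standard 24-bit Fibre Channel address made of domain, area, and port bytes, with 239 domain IDs usable for switches; those details are not on the slides, and the function names are hypothetical.

```python
USABLE_DOMAIN_IDS = 239   # domain IDs 1 through 239 are available to switches (assumed)
AREAS_PER_DOMAIN = 256
PORTS_PER_AREA = 256


def max_fabric_addresses() -> int:
    """Theoretical number of node ports addressable in a switched fabric."""
    return USABLE_DOMAIN_IDS * AREAS_PER_DOMAIN * PORTS_PER_AREA


def fc_address(domain: int, area: int, port: int) -> int:
    """Combine domain, area, and port bytes into a 24-bit fabric address."""
    if not 1 <= domain <= USABLE_DOMAIN_IDS:
        raise ValueError("domain ID out of range")
    if not (0 <= area <= 0xFF and 0 <= port <= 0xFF):
        raise ValueError("area and port must each fit in one byte")
    return (domain << 16) | (area << 8) | port


if __name__ == "__main__":
    print(f"{max_fabric_addresses():,}")
    print(f"{fc_address(1, 2, 3):06x}")
```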

Network-Attached Storage

Think "File Sharing"

Sharing Files

What is NAS?

• IP-based file sharing device attached to LAN • Server consolidation • File-level data access and sharing

Why NAS?

dedicated to file-serving

Benefits of NAS

•Support comprehensive access to information •Improves efficiency and flexibility •Centralizes storage •Simplifies management •Scalability •High availability – through native clustering •Provides security integration to environment (user authentication and authorization)

NAS components: • file sharing protocols • IP network • NICs • CPU and memory • NAS OS • storage protocols (ATA, SCSI, or FC)

Benefits: •Increases performance throughput (service level) to end users •Minimizes investment in additional servers •Provides storage pooling •Provides heterogeneous file serving •Uses existing infrastructure, tools, and processes

Benefits: •Provides continuous availability to files •Heterogeneous file sharing •Reduces cost for additional OS dependent servers •Adds storage capacity non disruptively •Consolidates storage management •Lowers Total Cost of Ownership

IP SAN

Celerra Whiteboard Video

Driver for IP SAN

• In FC SAN transfer of block level data takes place over Fibre Channel • Emerging technologies provide for the transfer of block-level data over an existing IP network infrastructure

Why IP?

• Easier management • Existing network infrastructure can be leveraged • Reduced cost compared to new SAN hardware and software • Supports multi-vendor interoperability • Many long-distance disaster recovery solutions already leverage IP-based networks • Many robust and mature security options are available for IP networks

Block Storage over IP - iSCSI

• SCSI over IP • IP encapsulation • Ethernet NIC card • iSCSI HBA • Hardware-based gateway to Fibre Channel storage • Used to connect servers

Block Storage over IP - FCIP

• Fibre Channel-to IP bridge / tunnel (point to point) • Fibre Channel end points • Used in DR implementations

iSCSI ?

• IP-based protocol used to connect host and storage • Carries block-level data over an IP-based network • Encapsulates SCSI commands and transports them as TCP/IP packets
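The encapsulation step can be sketched by wrapping a SCSI CDB in an iSCSI header before it is handed to TCP. The 48-byte layout below follows my reading of the SCSI Command Basic Header Segment in RFC 3720 and should be checked against the RFC before being relied on; it is simplified (no additional header segments, no immediate data, single-level LUN below 256), and `build_scsi_command_pdu` is a hypothetical helper.

```python
import struct


def build_scsi_command_pdu(cdb: bytes, lun: int, task_tag: int, expected_len: int,
                           cmd_sn: int, exp_stat_sn: int, read: bool = True) -> bytes:
    """Wrap a SCSI CDB in a simplified iSCSI SCSI Command header (48 bytes)."""
    if len(cdb) > 16:
        raise ValueError("longer CDBs need an additional header segment (not modelled)")
    if not 0 <= lun < 256:
        raise ValueError("this sketch only encodes simple LUNs below 256")
    opcode = 0x01                                    # SCSI Command
    flags = 0x80 | (0x40 if read else 0x20)          # Final bit plus Read or Write bit
    lun_field = struct.pack(">BB6x", 0, lun)         # 8-byte LUN field
    header = struct.pack(
        ">BB2xB3s8sIIII",
        opcode, flags,
        0,                     # TotalAHSLength: no additional header segments
        b"\x00\x00\x00",       # DataSegmentLength: no immediate data in this sketch
        lun_field,
        task_tag,              # Initiator Task Tag
        expected_len,          # Expected Data Transfer Length in bytes
        cmd_sn, exp_stat_sn,
    )
    return header + cdb.ljust(16, b"\x00")           # CDB occupies the last 16 bytes


if __name__ == "__main__":
    read10 = bytes([0x28, 0, 0, 0, 0x12, 0x34, 0, 0, 8, 0])
    pdu = build_scsi_command_pdu(read10, lun=0, task_tag=1, expected_len=8 * 512,
                                 cmd_sn=1, exp_stat_sn=1)
    print(len(pdu), pdu.hex())   # this byte string would be sent over a TCP connection
```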

Components of iSCSI

• iSCSI host initiators – Host computer using a NIC or iSCSI HBA to connect to storage – iSCSI initiator software may need to be installed • iSCSI targets – Storage array with embedded iSCSI capable network port – FC-iSCSI bridge • LAN for IP storage network – Interconnected Ethernet switches and/or routers

• No FC components • Each iSCSI port on the array is configured with an IP address and port number – iSCSI Initiators Connect directly to the Array

• Bridge device translates iSCSI/IP to FCP – Standalone device – Integrated into FC switch (multi-protocol router) • iSCSI initiator/host configured with bridge as target • Bridge generates virtual FC initiator

• Array provides FC and iSCSI connectivity natively • No bridge devices needed

FCIP (Fibre Channel over IP)?

• FCIP is an IP-based storage networking technology • Combines advantages of Fibre Channel and IP • Creates virtual FC links that connect devices in a different fabric • FCIP is a distance extension solution – Used for data sharing over geographically dispersed SAN

FCoE Whiteboard Video

Question 1

What was EMC’s revenue in 2009?

A. 60 Billion B. 46.2 Billion C. 14 Billion D. 9 Billion

EMC Corporation

2009 At a Glance

• Revenues: $14 billion • Net income: $1.9 billion • Employees: ~41,500 • Countries where EMC does business: >80 • R&D investment: ~$1.5 billion • Operating cash flow: $3.3 billion • Free cash flow: $2.6 billion • Founded: 1979

IDC Digital Universe Study

IDC – May 2010

Question 2

How much digital information was created worldwide in 2009?

A. 846 Terabytes B. 686 Petabytes C. 0.8 Zettabytes D. 2502 Exabytes

The Digital Universe 2009-2020

• Growing by a factor of 44 • 2009: 0.8 ZB • 2020: 35.2 ZB

One Zettabyte (ZB) = 1 trillion gigabytes

Source: IDC Digital Universe Study, sponsored by EMC, May 2010

1.2 ZB in 2010 is Equal to . . .

75 Billion Fully Loaded 16GB iPads
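The figures on these slides can be checked with a few lines of arithmetic. This sketch assumes decimal units (1 GB = 10^9 bytes, 1 ZB = 10^21 bytes); the function names are hypothetical.

```python
from fractions import Fraction

GB = 10**9
ZB = 10**21


def growth_factor(start_zb: str, end_zb: str) -> Fraction:
    """Ratio between two digital-universe sizes given in zettabytes."""
    return Fraction(end_zb) / Fraction(start_zb)


def devices_needed(total_zb: str, device_gb: int) -> Fraction:
    """How many devices of a given capacity it takes to hold the total."""
    return Fraction(total_zb) * ZB / (device_gb * GB)


if __name__ == "__main__":
    print(ZB // GB)                      # gigabytes in one zettabyte
    print(growth_factor("0.8", "35.2"))  # 2009 to 2020
    print(devices_needed("1.2", 16))     # 16 GB devices for 1.2 ZB
```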

What is Driving the Digital Explosion?

• Web 2.0 applications • Ubiquitous content-generating devices (3G/4G) • Longer data retention periods • Regulation landscape: SEC 17a-4, Freedom of Information Act, HIPAA, Sarbanes-Oxley • Secure collaboration

• Data center: data, local copies, backup copy • Remote site: remote copies, copy for archiving

Question 3

What percentage of the 0.8 zettabytes of digital information is created by individuals?

A. 30% B. 50% C. 70% D. 90%

The Digital Information World

Individuals create data …companies manage it!

• Create: share of the digital universe that will be created by individuals (Ind.) as opposed to corporations (Corp.)

• Manage: share of the digital universe that will be the responsibility of companies (Corp.) to manage and secure

Question 4

How much storage capacity was available on the first Symmetrix 4200 that EMC shipped in 1990?

A. 24 Gigabytes B. 240 Gigabytes C. 24 Terabytes D. 2502 Exabytes

EMC’s Tiered Storage Platforms

Broadest Range of Function, Performance, and Connectivity

• Connectivity: Fibre Channel, iSCSI, IP, FICON • Architectures: SAN, NAS, CAS • Platforms shown: ADIC Scalar family; DL710, DL720, DL740, DL4200, DL4400; EMC Centera, EMC Centera 4-Node; CLARiiON CX3 UltraScale Series; AX150 (FC & iSCSI); NS500, NS700, NS704; NS500G, NS700G, NS704G; Rainfinity Global File Virtualization; Invista; Connectrix; DMX-3 950, DMX800, DMX1000 • Drive options: SATA 250 GB 7,200 rpm; SATA 500 GB 7,200 rpm; Fibre Channel 73 GB 10k/15k rpm; Fibre Channel 146 GB 10k/15k rpm; Fibre Channel 300 GB 10k rpm; low-cost Fibre Channel 500 GB 7,200 rpm

1990: Symmetrix 4200 Integrated Cached Disk Array introduced with a capacity of 24 gigabytes.

2009: Symmetrix V-Max Systems are available with up to 2 petabytes of usable storage in a single system.

Managing Information Storage Trends, Challenges and Options

EMC – 2010-2011

Question 6

What is the number 1 challenge identified by IT and storage managers?

A. Storage consolidation B. Designing & deploying multi-site environments C. Managing storage growth D. Making informed big-picture decisions

Digital Information Storage Challenges

Most important activities/constraints identified as challenges by IT/storage managers

1. Managing Storage Growth 2. Designing, deploying, and managing backup and recovery 3. Designing, deploying, and managing storage in a virtualized server environment 4. Designing, deploying, and managing disaster recovery solutions 5. Storage consolidation 6. Making informed strategic / big-picture decisions 7. Integrating storage in application environments (such as Oracle, Exchange, etc.) 8. Designing and deploying multi-site environments 9. Lack of skilled storage professionals

*Source: Input from over 1,450 storage professionals worldwide, http://education.EMC.com/ManagingStorage/

Managing Information Storage: Trends, Challenges and Options 2010-2011

Building an Effective Storage Mgmt Organization

Hire an additional 22%+ storage professionals . . .

Based on EMC study ‘ Managing Information Storage: Trends, Challenges & Options (2010-2011)’ www.emc.com/managingstorage

Where Managers Plan to Find Storage Expertise

Top IT Certifications by Salary

Source: Certification Magazine, December 2009

Storage Role Across IT Disciplines

Leverage the functionalities of storage technology products to:

• Systems Architects/Administrators – Maximize performance, increase availability, and avoid costly server upgrades. • Network Administrators – Maximize performance of your network and help you plan in advance. • Database Administrators – Maximize performance, increase availability, and realize faster recoverability of your database. • Application Architects – Increase the performance and availability of your application. • IT Project Managers – Plan and execute IT projects that involve or are impacted by storage technology components.

EMC Academic Alliance

Key Pillars of IT

Over the last 20 years, the business IT perspective on the data center has focused on 4 pillars of Information Technology: operating systems, databases, networking, and software application development

Based on today’s IT infrastructure, Information Storage is the 5th pillar of IT!

Question 7

What is the name of the EMC-authored book that was released in May 2009?

A. Storage Area Networks for Dummies B. Storage Networks Explained C. Administering Data Centers D. Information Management

Information Storage and Management (ISM)

Modules

• Section 1. Storage System • Section 2. Storage Networking Technologies & Virtualization • Section 3. Business Continuity • Section 4. Storage Security & Management

http://education.EMC.com/ismbook

Information Storage and Mgmt (ISM)

Section 1.

Storage System

• Structured and Unstructured Data • Block-Level and File-Level Access • Storage Technology Architectures • Core Elements of a Data Center • File System and Volume Manager • Storage Media and Devices • Disk Components • Information Management • Information Lifecycle Management • Zoned Bit Recording • Logical Block Addressing • Little’s Law and the Utilization Law • Striping, Mirroring, and Parity • RAID Write Penalty • Front-End Command Queuing • Cache Mirroring and Vaulting • Hot Spares • Logical Unit Number (LUN) • LUN Masking • High-end Storage System • Midrange Storage System

Information Storage and Mgmt (ISM)

Section 2.

Storage Networking Technologies and Virtualization

KEY CONCEPT COVERAGE

• Internal and External DAS • SCSI Architecture • SCSI Addressing • Storage Consolidation • Fibre Channel (FC) Architecture • Fibre Channel Protocol Stack • Fibre Channel Ports • Fibre Channel Addressing • World Wide Names (WWN) • Zoning • NAS Device • File Sharing • NAS Connectivity and Protocols • NAS Performance and Availability • MTU and Jumbo Frames • iSCSI Protocol • Native and Bridged iSCSI • FCIP Protocol • Fixed Content and Archives • Single-Instance Storage • Object Storage and Retrieval • Content Authenticity • Memory Virtualization • Network Virtualization • Server Virtualization • Storage Virtualization • In-Band and Out-of-Band Topologies • Block-Level and File-Level Virtualization

Information Storage and Mgmt (ISM)

Section 3.

Business Continuity

Business Information Availability Disaster

/ Business

Operational Backup Archival Bare-Metal Business Impact Analysis Backup Architecture Backup Topologies 5 Library Maximize Data Availability Data Consistency Host-Based Local Synchronous and

Remote Site

Replication LVM-Based Replication Replication Array-Based Local Replication Shipping Copy on First Access (CoFA) Replication Copy on First Write (CoFW) Replication Restore and Data Consistency Copy for archiving Minimize chances of data loss

Information Storage and Mgmt (ISM)

Section 4.

Storage Security and Management

• Storage Security Framework • The Risk Triad • Security Domain • Infrastructure Right Management • Access Control • Alerts • Management Platform Standards • Internal Chargeback

Consolidated, virtualized, and in the cloud: data storage security considerations

EMC Academic Alliance

Developing tomorrow’s Information Storage Professionals…today!

Partnering with leading Institutes of Higher Education worldwide to bridge the storage knowledge gap in Industry

Providing EMC, customers, and partners with a source for hiring storage-educated graduates

Hundreds of institutions globally, educating thousands of students

Offering unique ‘open’ course on Information Storage and Management

Focus on concepts and principles

Opportunity for EMC to give back as the industry leader

For the latest list of participating institutions and to introduce us to your Alma Mater, visit http://education.EMC.com/academicalliance

Becoming an Academic Partner

Required Steps . . .

1. Institution enrolls via the EAA online application: http://info.emc.com/mk/get/EAA_APPL_form?src=&HBX_Account_Number=emc-emccom

2. Institution identifies faculty to teach course and administer the program.

3. Institution identifies faculty to attend the 5-day ISM Faculty Readiness Seminar (FRS) and pass the ISM certification exam.

4. Institution accesses secure Faculty website to download teaching aids such as chapter PowerPoints, quizzes, simulators, etc.

5. Institution promotes ISM course to students.

6. Institution schedules and begins teaching the ISM course.

Summary

• Information storage is one of the fastest growing sectors within IT.

• Information growth and complexity create challenges and career opportunities • Business and industry are looking for IT professionals who know all 5 pillars.

• Those who obtain the skills through formal education and industry qualification have an advantage.