DCS834 Computer Networking - Seidenberg School of Computer


DSC861A Emerging Technology

Storage Virtualization Team 3

Jennifer Brola-Richards, Mohib Fanek, Kathy Larson, Donovan Miles, Vishu Reddy, Fran Trees

Presentation Outline

Storage Virtualization

What is storage virtualization and why storage virtualization?

Storage Evolution and Fundamental Concepts

What are innovations and fundamental concepts associated with storage?

Storage Virtualization Deep Dive

What, Where and How of Storage Virtualization?

Case Study

Research Topics in Storage Virtualization

What are potential topics of research and dissertation?

Summary and Verbal Quiz

What is storage virtualization?

Storage Virtualization is the next frontier in storage advances that aims to provide a layer of abstraction to reduce complexity.

The Storage Networking Industry Association (SNIA) defines Storage Virtualization as:

1. The act of abstracting, hiding, or isolating the internal functions of a storage (sub)system or service from applications, host computers, or general network resources, for the purpose of enabling application and network independent management of storage or data.

2. The application of virtualization to storage services or devices for the purpose of aggregating functions or devices, hiding complexity, or adding new capabilities to lower level storage resources.
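The "aggregating devices" and "hiding complexity" parts of the SNIA definition can be illustrated with a minimal sketch. It assumes a simple concatenation scheme; the class name VirtualVolume and the device names are hypothetical and do not come from SNIA or any product. The host sees one block range, and the layer resolves each virtual block to a device and a physical block.

```python
from bisect import bisect_right


class VirtualVolume:
    """Presents several physical devices as one contiguous block range."""

    def __init__(self, devices):
        # devices: list of (name, size_in_blocks); the names are hypothetical.
        self.devices = devices
        self.starts = []  # first virtual block number served by each device
        total = 0
        for _, size in devices:
            self.starts.append(total)
            total += size
        self.size = total

    def resolve(self, virtual_block):
        """Map a virtual block number to (device name, physical block)."""
        if not 0 <= virtual_block < self.size:
            raise IndexError("block outside the virtual volume")
        i = bisect_right(self.starts, virtual_block) - 1
        return self.devices[i][0], virtual_block - self.starts[i]


volume = VirtualVolume([("array_a", 100), ("array_b", 50), ("array_c", 200)])
first_on_b = volume.resolve(100)
```

The host only ever addresses blocks 0 to volume.size - 1; which array actually holds a block stays hidden behind resolve().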

Why storage virtualization?

Storage virtualization aims to provide a layer of abstraction to manage storage and reduce complexity:

• Provide continuous availability despite exponential growth (e.g. Facebook: over 55 billion page views a month, 41 million active users (1))

• Effectively group and manage heterogeneous storage devices and servers (e.g. estimated number of Google servers: 450,000 (2))

• Mergers and acquisitions (e.g. Microsoft & Yahoo!)

• Allocate and manage storage in accordance with the Quality of Service (QoS) associated with the data (e.g. Gartner estimates that the average data center doubles its storage every 18 to 24 months)

• Multiple storage software platforms (e.g. IBM, EMC, HP, ...)

(1) Lucas Nealan, php|works, Atlanta, September 13, 2007 (2) Wikipedia

What are the innovations and fundamentals associated with storage?

Client side storage innovations

A variety of storage device innovations that are smaller, higher capacity and cheaper have helped end users cope with increasing storage requirements.

What are the innovations and fundamentals associated with storage?

Server side storage innovations

A combination of storage device, storage interface and storage software innovations has helped enterprises cope with the exponential growth of data storage requirements.

Storage devices have evolved from tapes to hard drives to RAID hard drives, increasing capacity and resiliency.

What are the innovations and fundamentals associated with storage?

Storage interfaces have evolved from SCSI to iSCSI, Fiber Channel (FCP) and InfiniBand to interconnect devices and transport data faster.

[Figure: SCSI, iSCSI, FCP and InfiniBand interfaces]

What are the innovations and fundamentals associated with storage?

Storage Access

File level access takes center stage along with conventional block level access.

Block level access: Block addresses are used to read/write data [Read/Write, Block #] to the storage media.

[Figure: Sample conventional Block Allocation Map]

File level access: Files are accessed by "semantics" instructions [example: Open, Close]. Data inside files is accessed by byte ranges within the file (example: the first 10 bytes of a file). GFS (Google File System) is an example of a large-scale distributed file system.
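The two access styles can be contrasted in a short sketch. It assumes a 512-byte block size and uses a temporary file as a stand-in for a storage device; the helper names read_block and read_byte_range are hypothetical, chosen only for this illustration.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size for this sketch


def read_block(device_path, block_number):
    """Block level access: [Read, Block #] addresses the medium directly."""
    with open(device_path, "rb") as dev:
        dev.seek(block_number * BLOCK_SIZE)
        return dev.read(BLOCK_SIZE)


def read_byte_range(file_path, offset, length):
    """File level access: open the file, read a byte range, close it."""
    with open(file_path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# A temporary file stands in for both the "device" and the file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
    path = tmp.name

block_1 = read_block(path, 1)             # second block on the "device"
first_10 = read_byte_range(path, 0, 10)   # the first 10 bytes of the file
os.remove(path)
```

In the block case the caller must know the block number; in the file case the caller names a file and a byte range and leaves block placement to the file system.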

What are the innovations and fundamentals associated with storage?

Metadata is data about data. In the context of storage, metadata may describe an individual datum or content item, or a collection of data including multiple content items. Examples include: file size, who created the file, attributes such as read-only, free block bitmaps, and control data.
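Several of these examples can be read from an ordinary file system. The sketch below uses Python's standard os.stat call on a temporary file; the describe helper and the dictionary keys are hypothetical names chosen for this illustration.

```python
import os
import stat
import tempfile


def describe(path):
    """Collect a few metadata items for a file without reading its data."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,                      # file size
        "owner_uid": info.st_uid,                        # who owns the file
        "read_only": not (info.st_mode & stat.S_IWUSR),  # attribute
        "modified": info.st_mtime,                       # last modification
    }


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    path = tmp.name

before = describe(path)
os.chmod(path, stat.S_IRUSR)                 # mark the file read-only
after = describe(path)
os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # restore so it can be removed
os.remove(path)
```

Note that describe() never opens the file: everything it returns comes from metadata kept alongside the data.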

What are the innovations and fundamentals associated with storage?

Storage software has evolved from simple back-up and restore to advanced storage networks and storage management software functions.

[Figure: (A) Simple Direct Attached Storage (DAS), (B) Storage Area Network (SAN), (C) Network Attached Storage (NAS)]

What are the innovations and fundamentals associated with storage?

SAN and NAS: Key Differences

Attribute | NAS | SAN

Access Methods | File access | Disk block access

Access Medium | Ethernet | Fiber Channel

Architecture | Decentralized | Centralized

Transport Protocol | Layer over TCP/IP | SCSI/FC and SCSI/IP

Efficiency | Less | More

Sharing and Access Control | Good | Poor

Typical Applications | Web | Database

Typical Clients | Workstations | Database servers

What and Where can Storage be Virtualized?

[Figure: SNIA Storage Model, potential areas of virtualization: (1) Storage Level Virtualization, (2) Host* Level Virtualization, (3) File Level Virtualization, (4) Block Virtualization, (5) Device** Virtualization, (6) Network Virtualization]

* Host aka Server

** Device = aggregation of Host and Network (Meta Data)

Source: The Storage Networking Tutorials, SNIAVIRT, Page 20 http://www.snia.org/education/tutorials/

What and Where can Storage be Virtualized?

Storage Virtualization: Innovations and Trends

1. Storage Device Level Virtualization

2. Host Level Virtualization

3. File Level Virtualization

4. Block Virtualization

5. Device Virtualization Sub-Technique

6. Network Virtualization

Historical: Mainframe. Recent development example: VMware.

Historical: RAID Level, SCSI Interface. Recent development examples: Fiber Channel.

Historical: Mainframe. Recent development example: NAS Sub-Technique.

Major innovations continue to emerge even in historical areas of storage virtualization. Symmetrical (aka in-band) and asymmetrical (aka out-of-band) are emerging as key areas of abstraction and virtualization.

How is storage virtualized at the enterprise level?

Currently, networks are virtualized using Metadata or Storage Volume Controllers (SVC). There are two types of network virtualization:

In band: Metadata or Storage Volume Controllers are placed in the path of data flow.

Out of band: Metadata or Storage Volume Controllers are placed outside the path of data flow.

Source: IBM Redbook Page 8 http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf

How is storage virtualized at the enterprise level?

In-Band Virtualization

1. Metadata or Storage Volume Controllers (SVC) are placed in band, that is, in the path of data flow.

2. The SVC controls who can access the storage device, how storage can be accessed, how storage is allocated, etc.

3. SVCs are managed through Storage Management Software.

4. The key challenge is the potential for I/O bottlenecks.

Source: IBM Redbook Page 10 http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf

How is storage virtualized at the enterprise level?

Out-of-Band Network Virtualization

1. Metadata or Storage Volume Controllers (SVC) are placed out of band, that is, outside the path of data flow.

2. The host sends metadata to the SVC.

3. The storage pool sends metadata to the SVC.

4. The SVC controls who can access the storage device, how storage can be accessed, how storage is allocated, etc.

Source: IBM Redbook Page 12 http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf
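The difference between the two placements can be sketched in a few lines. This is a toy model under stated assumptions, not the IBM design: the classes StoragePool, InBandController and OutOfBandController and the dictionary-based mapping are hypothetical. The point it illustrates is which component the data bytes pass through.

```python
class StoragePool:
    """Toy physical storage: a dictionary of block number to bytes."""

    def __init__(self):
        self.blocks = {}
        self.reads_served = 0

    def write(self, physical_block, data):
        self.blocks[physical_block] = data

    def read(self, physical_block):
        self.reads_served += 1
        return self.blocks[physical_block]


class InBandController:
    """Sits in the data path: every I/O passes through the controller."""

    def __init__(self, pool, mapping):
        self.pool = pool
        self.mapping = mapping      # virtual block -> physical block
        self.bytes_forwarded = 0    # grows with every read: bottleneck risk

    def read(self, virtual_block):
        data = self.pool.read(self.mapping[virtual_block])
        self.bytes_forwarded += len(data)
        return data


class OutOfBandController:
    """Sits outside the data path: it only answers metadata lookups."""

    def __init__(self, mapping):
        self.mapping = mapping
        self.lookups = 0

    def locate(self, virtual_block):
        self.lookups += 1
        return self.mapping[virtual_block]


pool = StoragePool()
pool.write(7, b"payload")
mapping = {0: 7}

in_band = InBandController(pool, mapping)
data_in_band = in_band.read(0)            # data flows through the controller

out_of_band = OutOfBandController(mapping)
physical = out_of_band.locate(0)          # control path: metadata only
data_out_of_band = pool.read(physical)    # data path: host reads storage directly
```

In the in-band case bytes_forwarded grows with the workload, which is the I/O bottleneck concern noted on the in-band slide; in the out-of-band case the controller handles only lookups.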

How is storage virtualized at enterprise level?

Virtualization Implementation Example

[Figure: High-level diagram of typical primary/secondary site data replication with storage virtualization. Primary site (environments: PROD, DEV, QA, SIT; applications: App1, App2) and secondary site (environment: Prod; applications: App1, App2). Each site shows pSeries and xSeries servers, virtualization engines, SAN Fabric A and SAN Fabric B directors, Type 1 and Type 2 SAN storage, a tape library with LTO3 drives, and a VPN comm-link for remote support. The primary site also shows blade servers and two Cisco 6509 switches. The sites are connected over DWDM.]

Case Study

The Study

1. Shows that commingling data and metadata on a single logical device means that there is no way to achieve different service level objectives for data and metadata in the same file system without moving file-system-specific knowledge into the logical disk layers.

2. Shows that the standard assumptions underlying the organization of data and metadata in file systems are no longer valid in a virtualized storage environment, and hence file systems fail to realize the full benefits of storage virtualization.

Proposes a different file system organization of data and metadata designed to exploit the power of virtualized storage.

Case Study

Service level requirements within a single file system

• Organization A: needs no encryption

• Organization B: needs encryption

– Stores medical records

– Security requirements for file data are extremely high

– Performs nightly indexing operation on file systems

– All directory information and file access times must be read to determine the "changed" state of data

– Business requirement that all file data be encrypted at rest

– File metadata has no security requirement

In the Unix Fast File System (FFS), a logical disk is divided into collections of blocks called cylinder groups, each of which stores both file data blocks and file metadata.

Case Study

Results

• Clean logical separation between data and metadata

• Allows the file system to use virtualization features and achieve different SLOs

Redesign changes

– Code change

– Packing the relocated cylinder group headers in the first few metadata cylinder groups ensures each header is located at a fixed, predictable offset from the front of the block device

– User-configurable block address before which no data is stored and after which no metadata is stored
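The boundary idea in the last redesign item can be sketched as a simple allocator. This is an illustrative model only, not the code from the study: the class SplitAllocator, its method names and the bump-pointer allocation strategy are all assumptions made for this example.

```python
class SplitAllocator:
    """Keeps metadata below a block address boundary and data at or above it."""

    def __init__(self, total_blocks, boundary):
        if not 0 < boundary < total_blocks:
            raise ValueError("boundary must fall inside the device")
        self.total_blocks = total_blocks
        self.boundary = boundary   # user-configurable block address
        self.next_meta = 0         # metadata grows from the front of the device
        self.next_data = boundary  # data starts at the boundary

    def alloc_metadata(self):
        if self.next_meta >= self.boundary:
            raise RuntimeError("metadata region is full")
        block = self.next_meta
        self.next_meta += 1
        return block

    def alloc_data(self):
        if self.next_data >= self.total_blocks:
            raise RuntimeError("data region is full")
        block = self.next_data
        self.next_data += 1
        return block

    def needs_encryption(self, block):
        """A per-region policy the lower layer can apply by address alone."""
        return block >= self.boundary


allocator = SplitAllocator(total_blocks=100, boundary=20)
meta_block = allocator.alloc_metadata()
data_block = allocator.alloc_data()
```

Because the regions never overlap, a layer below the file system could apply a different service level to each region, such as data-only encryption, using nothing more than the block address.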

Case Study

Results

• 5-7% gains on the new file system layout

• 31-44% for the file lookup and file delete benchmarks, which result in little or no file data I/O; here the advantage of data-only encryption becomes obvious

Future Work

• Differing SLOs for granular metadata

• Completely separate fixed/dynamic metadata

• Separate file data from user-defined file attribute data

What are potential topics of research and dissertation?

Sample Research Topics in Storage Virtualization

• Bayesian analysis for resource management

• Bayesian analysis for diagnostics

• Trusted domains for security

• Storage virtualization and metadata standards

• Algorithm advances for block, device and other component virtualization techniques

Summary and Verbal Quiz

Storage Basics

1. What type of storage is found in your workstation?

2. What type of storage systems may be found in a large enterprise?

3. How is data accessed from storage?

4. Network Attached Storage (NAS) is well suited for what type of applications?

5. Storage Area Network (SAN) is well suited for what type of applications?

Storage Virtualization

1. What is Storage Virtualization?

2. Where and What can be virtualized in storage?

3. How is storage virtualized at a network level?

4. How is storage virtualization currently implemented?

5. What are the potential research topics in storage virtualization?


Annotated References

1. Faibish, S., Fridella, S., Bixby, P., and Gupta, U., "Storage Virtualization using a Block-device File System", ACM SIGOPS Operating Systems Review, Volume 42, Issue 1, January 2008. Publisher: ACM.

2. The Storage Networking Tutorials, SNIAVIRT. http://www.snia.org/education/tutorials/

3. http://en.wikipedia.org/wiki/Metadata

4. http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf

5. Nealan, L., php|works, Atlanta, September 13, 2007. http://sizzo.org/wp/wp-content/uploads/2007/09/facebook_performance_caching.pdf

6. http://en.wikipedia.org/wiki/Google_platform