Architecting FC HA Solutions for Filers


Architecting Fibre Channel HA Solutions

Rick Jooss

Agenda

CFModes

Single System Image

Multipathing

Host Clustering

Storage System Backend HA

Q&A

NetApp Confidential -- Do Not Distribute


CFMODE – Cluster Failover Mode

What is CFMODE?

– FCP setting
– Determines behavior of FC target ports, particularly during a cluster failover (CFO) event

Why is there more than one CFMODE?

– Original CFMODE (standby) did not work for all host types (HP-UX, AIX)
– Original CFMODE did not work with the 270C because it has only a single FC port

Available Paths - Standby Mode

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration. Controller 1 and Controller 2 each have target ports 0a, 0b, 0c, and 0d and their own LUNs. Solid blue lines are paths to the LUNs served by Controller 1; dashed purple lines are paths to the LUNs served by Controller 2.]

Path Access (Switch Failure) – Standby Mode

[Diagram: same topology. Switch/Fabric 1 experiences a failure.]

Path Access (CFO event) - Standby Mode

[Diagram: same topology. Controller 2 takes over all operations.]

Path Access (CFO event) - Standby Mode

[Diagram: same topology, with the eight target ports labeled WWN1 through WWN8. Filer head 2 takes over. The MP layer is not involved in the switchover.]

Available Paths - Partner Mode

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration. Controller 1 and Controller 2 each have target ports 0a, 0b, 0c, and 0d. Solid blue lines are paths to the LUNs served by Controller 1; dashed purple lines are paths to the LUNs served by Controller 2.]

Available Paths - Partner Mode – FAS3000 Default Configuration

[Diagram: same fabrics, with ports 0c and 0d on each controller. Same line legend.]

Available Paths - Dual Fabric

[Diagram: same fabrics, with ports 0c_0 and 0c_2 on each controller. Same line legend.]


What is the single system image cfmode?

Universal cfmode

– Works on all HA storage systems
– Works on all switches

Presents the HA configuration as a single target

All LUNs are visible on all controller ports

All hosts require multipathing software
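The path layout that "all LUNs are visible on all controller ports" implies can be sketched in Python. This is a minimal illustration, not NetApp code: `Path`, `enumerate_paths`, the controller names, and the port-to-fabric wiring are all hypothetical. The primary/proxy labels assume that a path is primary when it ends on the controller serving the LUN and proxy when it ends on the partner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    fabric: str      # switch/fabric the path runs through
    controller: str  # controller whose target port terminates the path
    port: str        # target port name on that controller
    kind: str        # "primary" or "proxy" (assumed labels)


def enumerate_paths(lun_owner, wiring):
    """List every path to a LUN when the LUN is visible on all target ports.

    wiring maps (controller, port) -> fabric.
    """
    paths = []
    for (controller, port), fabric in sorted(wiring.items()):
        kind = "primary" if controller == lun_owner else "proxy"
        paths.append(Path(fabric, controller, port, kind))
    return paths


# Hypothetical single-card layout: ports 0c and 0d on each controller,
# one port per fabric (the port-to-fabric assignment is an assumption).
wiring = {
    ("controller1", "0c"): "fabric1",
    ("controller1", "0d"): "fabric2",
    ("controller2", "0c"): "fabric1",
    ("controller2", "0d"): "fabric2",
}

for path in enumerate_paths("controller1", wiring):
    print(path)
```

Because every LUN shows up on every port, the host sees four paths to each LUN here, which is why multipathing software is required on all hosts.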

Available Paths - Single System Image – Single Card

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to ports 0c and 0d on each controller. Solid blue lines are paths to the LUNs served by Controller 1; dashed purple lines are paths to the LUNs served by Controller 2.]

Path Access (Switch Failure) - Single System Image – Single Card

[Diagram: same topology. The MP layer works around the failure.]

Path Access (CFO event) - Single System Image – Single Card

[Diagram: same topology. Controller 2 takes over. The MP layer works around the failure.]

Available Paths - Single System Image – Single Port

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to port 0d on each controller. Same line legend.]

Available Paths - Single System Image – Single Port

[Diagram: the host connects in loop mode to port 0d on each controller. Same line legend.]

Why SSI mode?

Works in all configurations

Makes us look more like other SAN vendors

Reduces port burn without using FC Loop

Fully redundant configuration requires only 1 “wire” per controller instead of 2

Simpler wiring: no a/b port distinctions and no requirement to run the same cables from each controller to the same switch

Management changes

Unified LUN mapping address space across the HA configuration, so a LUN ID mapped to an initiator on one controller conflicts with the same LUN ID mapped to that initiator on the partner

Each controller prevents these conflicts by checking with the partner controller

If the controller interconnect is down, some operations are disabled by default:

igroup add, lun map, lun online, igroup set ostype
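The partner check described above can be pictured with a small Python model. This is a sketch under stated assumptions, not Data ONTAP behavior in detail: the `Controller` class, its `lun_map` method, and the `InterconnectDown` exception are hypothetical names, and the model treats a conflict as the same (igroup, LUN ID) pair being used on either controller.

```python
class InterconnectDown(Exception):
    """Raised when the partner controller cannot be consulted."""


class Controller:
    def __init__(self, name):
        self.name = name
        self.partner = None
        self.interconnect_up = True
        self.maps = {}  # (igroup, lun_id) -> LUN path

    def lun_map(self, lun_path, igroup, lun_id):
        # With the interconnect down the partner's maps cannot be checked,
        # so the operation is refused rather than risking a conflict.
        if not self.interconnect_up:
            raise InterconnectDown(
                f"cannot verify LUN ID {lun_id} for {igroup} with partner"
            )
        key = (igroup, lun_id)
        for ctrl in (self, self.partner):
            if key in ctrl.maps:
                raise ValueError(
                    f"LUN ID {lun_id} already mapped to {igroup} on {ctrl.name}"
                )
        self.maps[key] = lun_path


def make_pair():
    """Build two controllers that know each other as partners."""
    a, b = Controller("controller1"), Controller("controller2")
    a.partner, b.partner = b, a
    return a, b
```

In this model, mapping LUN ID 0 to the same igroup on both controllers fails on the second attempt, while a different LUN ID succeeds.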

SSI Roadmap

Introduced in ONTAP 7.1

Refer to the FCP host compatibility matrix for specific host support: http://now.netapp.com/NOW/knowledge/docs/san/fcp_iscsi_config/index.shtml


Multipathing

Multipathing provides multiple paths from the host to the external storage device

Provides high availability

– Protects against path failures
– Ensures high availability of applications and data by eliminating single points of failure

Provides improved performance

– Increases potential performance by utilizing multiple paths

Multipathing

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to ports 0c and 0d on Controller 1 and Controller 2 of the HA configuration, each controller with its own LUNs.]

A/P (active passive) policy – Single LUN

[Diagram: hosts, Switch/Fabric 1 and Switch/Fabric 2, ports 0c and 0d on each controller.]

A/P (active passive) policy – No Round Robining

[Diagram: same topology. Controller 1 serves LUN1 and LUN2; Controller 2 serves LUN3 and LUN4.]

A/P (active passive) policy - Round Robining

[Diagram: same topology. Controller 1 serves LUN1 and LUN2; Controller 2 serves LUN3 and LUN4.]

A/P (active/passive)

Active/Passive Configuration

1 active path to a single LUN

Performance to a LUN is limited by that path's capability (HBA, switch, target port)

Possible to round robin multiple LUNs across multiple paths

– All other paths to the LUN are passive
– On failover
  • Primary paths are tried first
  • Secondary paths are used if no primary paths are available
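The active/passive behavior above can be sketched as two small Python functions. This is an illustration only: `assign_active_paths`, `failover`, and the path labels are hypothetical and do not correspond to any particular multipathing product.

```python
def assign_active_paths(luns, primary_paths):
    """Round robin LUNs across primary paths: exactly one active path per LUN."""
    return {
        lun: primary_paths[i % len(primary_paths)]
        for i, lun in enumerate(luns)
    }


def failover(failed, primary_paths, secondary_paths, healthy):
    """Pick the next active path after `failed` goes down.

    Primary paths are tried first; secondary paths are used only when
    no primary path is healthy. Returns None if nothing is left.
    """
    for candidate in list(primary_paths) + list(secondary_paths):
        if candidate != failed and candidate in healthy:
            return candidate
    return None


# Hypothetical labels: fabric-controller-port.
primary = ["f1-c1-0c", "f2-c1-0d"]
secondary = ["f1-c2-0c", "f2-c2-0d"]
active = assign_active_paths(["LUN1", "LUN2"], primary)
```

Each LUN still uses a single path at a time, so per-LUN throughput is bounded by that path; spreading LUNs across paths only balances the aggregate load.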

A/A (Active active) policy (cfmode = standby)

[Diagram: hosts connect through Switch/Fabric 1 and Switch/Fabric 2 to ports 0a, 0b, 0c, and 0d on Controller 1 and Controller 2, each controller with its own LUNs.]

A/A (active/active)

Host accesses data from a single LUN across multiple paths simultaneously

Typically used for load balancing

• Round robin
• Least queue depth
• Weighted

On failure, I/Os are sent down the remaining available paths
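The three load-balancing policies named above can be sketched as path choosers in Python. This is a generic illustration and not the algorithm of any specific multipathing driver: the function names, the `outstanding` and `weights` dictionaries, and the path labels are assumptions made for the example.

```python
import random
from itertools import count


def round_robin(paths):
    """Return a chooser that cycles through whichever paths are healthy."""
    counter = count()

    def choose(healthy):
        candidates = [p for p in paths if p in healthy]
        if not candidates:
            raise RuntimeError("no path available")
        return candidates[next(counter) % len(candidates)]

    return choose


def least_queue_depth(healthy, outstanding):
    """Pick the healthy path with the fewest outstanding I/Os."""
    return min(sorted(healthy), key=lambda p: outstanding.get(p, 0))


def weighted(healthy, weights, rng=random):
    """Pick a healthy path at random, in proportion to its weight."""
    candidates = sorted(healthy)
    return rng.choices(candidates, weights=[weights[p] for p in candidates], k=1)[0]
```

In all three, a failed path simply drops out of `healthy`, so subsequent I/Os go down the remaining paths without a separate failover step.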

A/A/A (asymmetric active active)

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to ports 0c and 0d on Controller 1 and Controller 2, each controller with its own LUNs.]

A/A/A (asymmetric active active)

Distinguishes between primary and secondary paths

Does active/active across primary paths only

Only uses secondary paths when no primary paths are available
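A minimal Python sketch of the asymmetric policy follows. It assumes round robin as the active/active method across primary paths; `make_asymmetric_chooser` and the path labels are hypothetical names chosen for the example.

```python
from itertools import count


def make_asymmetric_chooser(primary, secondary):
    """Spread I/O across healthy primary paths; fall back to secondary
    paths only when no primary path is healthy."""
    counter = count()

    def choose(healthy):
        live_primary = [p for p in primary if p in healthy]
        pool = live_primary or [p for p in secondary if p in healthy]
        if not pool:
            raise RuntimeError("no path available")
        return pool[next(counter) % len(pool)]

    return choose


choose_path = make_asymmetric_chooser(
    primary=["c1-0c", "c1-0d"],    # ports on the controller serving the LUN
    secondary=["c2-0c", "c2-0d"],  # ports on the partner controller
)
```

As long as one primary path survives, the secondary paths carry no I/O, which is what distinguishes this policy from plain active/active.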

NetApp’s Multipathing Strategy

Two-pronged strategy

Support for “native” solutions

• What most customers rightly feel best about

Support for a host- and storage-independent solution

• VERITAS
• Allows a common solution across various server as well as storage variants

Multipathing For Windows

Windows MPIO

– Uses the Microsoft standard infrastructure
– A/P policy
– Automatically chooses primary paths for failover before trying proxy ones

In standby cfmode the LUNs are automatically round robined across all paths

        Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
MPIO    A/P                  A/P              A/P

MultiPathing For Solaris

           Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
DMP 4.0    A/A/A                A/A              A/P
MPxIO      A/P                  N/A              A/P

MultiPathing For Solaris

VERITAS DMP 4.0

– NetApp ASL 4.0
– Supports A/P, A/A, & A/A/A (Active Passive Concurrent)

SUN Native MPxIO

– Not supported with standby cfmode
– Supports A/P
– Can be A/A but requires manual failback
– Manual configuration required
– Round robining of the LUNs possible
– Sometimes called
  • Traffic Manager
  • Leadville Stack

MultiPathing For Linux

QLogic

– A/P policy
– Manually configured
– Round robining of LUNs is possible

DCM

– Linux native solution

          Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
QLogic    A/P                  A/P              A/P
DM        A/A/A                A/A              A/P

MultiPathing For AIX

           Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
DMP 4.0    A/A/A                N/A              A/P
SANpath    A/A/A                N/A              A/P
MPIO       A/A/A                N/A              A/P

MultiPathing For AIX

SANpath

– A/A/A
– Automatically chooses primary paths for failover before trying proxy ones
– Special policy for SCSI-2 reservation
  • Required for host clustering with HACMP
  • Can only use A/P

VERITAS DMP 4.0

– Only supports A/A/A

IBM MPIO

– IBM native solution with NetApp PCM

Multipathing for HP-UX

           Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
PVLinks    A/P                  N/A              A/P
DMP 3.5    A/P                  N/A              A/P

Multipathing for HP-UX

PVlinks/LVM

A/P policy

– Single active path per LUN, user controlled
– Ordering for remaining paths for failover
– ntap_config_paths
  • NetApp script to define path ordering based on filer path types: primary, proxy
  • Automatically round robins primary paths among all LUNs
  • Supports both FCP and iSCSI paths

VERITAS DMP 3.5

– A/P policy

Multipathing for VMware

VMware

– A/P policy
– Manually configured
– Round robining of LUNs possible

          Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
VMware    A/P                  A/P              A/P

Multipathing for NetWare

Novell

– A/P policy
– Manually configured
– Round robining of LUNs possible

          Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
Novell    A/P                  A/P              A/P

Fibre Channel SAN Host Support

                                 Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
Windows “NTAP DSM”               A/P                  A/P              A/P
Linux: QLogic “Failover Mode”    A/P                  A/P              A/P
VMware Multipathing              A/P                  A/P              A/P
Solaris “DMP”                    A/A/A                A/A              A/P
Solaris “MPxIO”                  A/P                  N/A              A/P
AIX “SANpath”                    A/A/A                N/A              A/P
HP-UX “PVLinks”                  A/P                  N/A              A/P
Novell                           A/P                  A/P              A/P
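For scripted configuration checks, the support matrix above can be encoded as a lookup table. The Python sketch below copies the values from the slide; the dictionary keys, the cfmode labels (`partner_ssi`, `standby`, `dual_fabric`), and the `supported_policy` helper are hypothetical names chosen for the example.

```python
# Multipathing policy per host stack and cfmode, as listed on the slide.
# "N/A" means the combination is not available.
POLICY = {
    "Windows NTAP DSM":           {"partner_ssi": "A/P",   "standby": "A/P", "dual_fabric": "A/P"},
    "Linux QLogic Failover Mode": {"partner_ssi": "A/P",   "standby": "A/P", "dual_fabric": "A/P"},
    "VMware Multipathing":        {"partner_ssi": "A/P",   "standby": "A/P", "dual_fabric": "A/P"},
    "Solaris DMP":                {"partner_ssi": "A/A/A", "standby": "A/A", "dual_fabric": "A/P"},
    "Solaris MPxIO":              {"partner_ssi": "A/P",   "standby": "N/A", "dual_fabric": "A/P"},
    "AIX SANpath":                {"partner_ssi": "A/A/A", "standby": "N/A", "dual_fabric": "A/P"},
    "HP-UX PVLinks":              {"partner_ssi": "A/P",   "standby": "N/A", "dual_fabric": "A/P"},
    "Novell":                     {"partner_ssi": "A/P",   "standby": "A/P", "dual_fabric": "A/P"},
}


def supported_policy(host_stack, cfmode):
    """Return the policy for a host stack under a cfmode, or None if N/A."""
    policy = POLICY[host_stack][cfmode]
    return None if policy == "N/A" else policy


def stacks_without_support(cfmode):
    """List host stacks that cannot be used with the given cfmode."""
    return sorted(s for s in POLICY if supported_policy(s, cfmode) is None)
```

Calling `stacks_without_support("standby")` returns the three stacks marked N/A in the standby column.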


Host Clustering & Storage

LUNs need to be made visible to all clustered hosts simultaneously

Some host clustering solutions require SCSI reservations to avoid split brain

[Diagram: Host 1 and Host 2 connect through Switch/Fabric 1 and Switch/Fabric 2 to ports 0a, 0b, 0c, and 0d on Controller 1 and Controller 2, which attach to the active shelf(s).]
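How a reservation can act as an arbitrator against split brain is sketched below in Python. It is a simplified model, not a SCSI implementation: `LockLun`, `arbitrate`, and the host names are hypothetical, and the only behavior modeled is that a LUN reserved by one initiator rejects reservation attempts from any other.

```python
class ReservationConflict(Exception):
    """Stands in for the SCSI RESERVATION CONFLICT status."""


class LockLun:
    def __init__(self):
        self.holder = None  # initiator currently holding the reservation

    def reserve(self, initiator):
        if self.holder is not None and self.holder != initiator:
            raise ReservationConflict(f"held by {self.holder}")
        self.holder = initiator

    def release(self, initiator):
        # Only the current holder can release its own reservation.
        if self.holder == initiator:
            self.holder = None


def arbitrate(lock_lun, nodes):
    """Each node tries to reserve the lock LUN; only the winner keeps running."""
    survivors = []
    for node in nodes:
        try:
            lock_lun.reserve(node)
            survivors.append(node)
        except ReservationConflict:
            pass  # this node lost the race and must stop serving data
    return survivors
```

When the cluster nodes lose contact with each other, each one races for the reservation; because only one can hold it, at most one side continues to write to the shared LUNs.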

Host Clustering for Microsoft

Microsoft Cluster

– SnapDrive is integrated to help with configuration
– WIN2K3 allows a single HBA for both boot device & shared storage

Cannot grow a LUN online in a cluster

– SnapDrive's ability to grow a LUN very quickly minimizes the pain caused by this

Host Clustering for VERITAS

VCS

By default does not use I/O fencing to protect against split brain

– I/O fencing requires SCSI-3 reservations
– 7.0.3 will have SCSI-3 reservations that are compatible with VERITAS

Does not do failover on FC links

Host Clustering for HP-UX

ServiceGuard

1 to 3 node clusters using SCSI-2 locks as arbitrator to avoid split brain

Does not do failover on dead FC links

Host Clustering for AIX

HACMP

Uses SCSI-2 locks as arbitrator to avoid split brain

“setsp –b2” to enable locks with SANpath

SCSI-2 locks and active/active are mutually exclusive

Fibre Channel SAN Host Support

[Table: supported combinations per OS vendor, with columns OS Vendor, HBA, Multipath, Host Cluster, Volume Mgr, and File System. HBA: Native, Emulex, QLogic. Multipath: SANpath, MPIO, QLogic, Veritas DMP, HP PVLinks, VMware. Host Cluster: HACMP, MSCS, Oracle 9i and 10g RAC, Veritas VCS, MC ServiceGuard, VirtualCenter (VMotion), Novell Clusters. Volume Mgr: LVM, MMC, Veritas VxVM, VMware. File System: JFS/2, NTFS, ext3, ext2, Reiser, Veritas VxFS, JFS/HFS, VMFS 2.x, NSS, Raw.]

Shared Storage

Storage System Backend HA

Enables Dual Path HA

[Diagram: back-end Loop 1 through Loop 4, with failure points marked X and the callouts below.]

Protects against cable pulls or breaks

Protects against single HBA failure

Protects against storage controller (e.g. ESH2) hot swap

Key Benefits

Full storage hardware redundancy in HA systems

Prevents cluster failover events caused by many storage issues

Complements CFO for improved HA and resiliency

Switched Back-End

Dual Active Paths for HA Environments

– Reduces the number of HA failovers
– Improves overall HA performance
– Data ONTAP tries to balance load across paths

SyncMirror

– SyncMirror requires 100% disk overhead
– Proper configuration survives all single failures

Q&A