
Capabilities are subject to change. Packaging and licensing have not yet been determined. Any screen captures or concepts shown are pre-release and for illustration purposes only.

Disclaimer

This presentation contains preliminary information that may be changed substantially prior to final commercial release of the software described herein. The information contained in this presentation represents the current view of Microsoft Corporation on the issues discussed as of the date of the presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of the presentation. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESSED, IMPLIED, OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this presentation. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this information does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

• Database sizes must be manageable
• Reseeds must be fast and reliable
• Capacity is increasing, but IOPS aren’t
• Passive copy disk IOPS are inefficient
• Lagged copies have asymmetric storage design
• Limited agility from low disk space recovery

[Diagram: a DAG of four servers (Server1 to Server4) holding active and passive copies of DB1, DB2 and DB4; throughput labels read 20MB/Sec, 12MB/Sec and 12MB/Sec.]

• Single database copy/disk:
  Reseed 2TB Database = ~23 hrs
  Reseed 8TB Database = ~93 hrs
• 4 database copies/disk:
  Reseed 2TB Disk = ~9.7 hrs
  Reseed 8TB Disk = ~39 hrs
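The reseed figures above can be sanity-checked with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not Exchange code. It assumes binary units (1 TB = 1,048,576 MB) and a sustained transfer rate. The 25 MB/s and 60 MB/s rates are inferred from the quoted hours and are not stated on the slide, and reseed_hours is a hypothetical helper.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units


def reseed_hours(size_tb, mb_per_sec):
    """Hours needed to copy size_tb terabytes at a sustained mb_per_sec."""
    return size_tb * MB_PER_TB / mb_per_sec / 3600


# Inferred rates (assumptions, not taken from the slide).
SINGLE_SOURCE_RATE = 25  # MB/s when a single copy is seeded from one source
PARALLEL_RATE = 60       # MB/s aggregate when four copies seed in parallel

for size_tb in (2, 8):
    single = reseed_hours(size_tb, SINGLE_SOURCE_RATE)
    parallel = reseed_hours(size_tb, PARALLEL_RATE)
    print(f"{size_tb}TB: single ~{single:.1f} hrs, parallel ~{parallel:.1f} hrs")
```

At those assumed rates the results land within rounding of the slide's figures, which suggests the speed-up comes from a higher aggregate seeding rate when a disk holds copies of several databases.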

[Diagram: DAG with Server1 to Server4, each labeled with 40% and 50% IOPS utilization; copy layout labeled A/A/P/L.]

Exchange 2013 failover speed 2x better than Exchange 2010!

Decreased IO latency = better user experience

• Exchange Server 2010
  Passive database IOPS = active database IOPS
  Active = 100MB Checkpoint Depth
  Passives = 5MB Checkpoint Depth
• Exchange Server 2013
  Passive database IOPS = 50% of active database IOPS
  IOPS Savings + ESE Fast failover = 100MB Checkpoint Depth on passives with no failover perf penalty
• 25% increase in aggregate disk utilization, e.g., 4 database copies/disk
  Balanced = 1 Active, 2 Passives, 1 Lag
• Failure Mode: Actives/disk doubles*
  2 Active, 1 Passive, 1 Lag
  50% IOPS Utilization
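The balanced and failure-mode layouts can be compared with a small weighted sum. This is a rough sketch under stated assumptions, not a sizing tool. The 0.5 passive weight comes from the slide. The weight of a lagged copy is not given anywhere in the deck, so it is an explicit parameter here. disk_load is a hypothetical helper.

```python
PASSIVE_WEIGHT = 0.5  # from the slide: passive IOPS = 50% of active IOPS


def disk_load(actives, passives, lagged, lag_weight=0.0):
    """Relative IOPS load on one disk, in units of one active copy.

    lag_weight is an assumption: the slides do not say how much IO a
    lagged copy generates.
    """
    return actives * 1.0 + passives * PASSIVE_WEIGHT + lagged * lag_weight


balanced = disk_load(actives=1, passives=2, lagged=1)  # 1 Active, 2 Passives, 1 Lag
failure = disk_load(actives=2, passives=1, lagged=1)   # 2 Active, 1 Passive, 1 Lag
print(balanced, failure, failure / balanced)
```

With lag_weight left at 0, the failure-mode disk carries 1.25 times the balanced load, which is the same ratio as the 40% and 50% utilization labels in the diagram. A different lag_weight changes that ratio.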

• Passive database IO = active database IO
• Passive copy performs aggressive pre-reading
• Background database maintenance runs at 5 MB/sec/copy

• Single logical disk/partition per physical disk
• Database copies per volume = copies per database
• Same neighbors on all servers
• Balance activation preferences

• Disk failure on active copy = database failover
• Failed disk & database corruption need to be addressed quickly
• Fast recovery to restore redundancy is needed

Use spares to automatically restore database redundancy after a disk failure

Automatic Reseed

1. Periodically scan for failed and suspended copies
2. Check prerequisites: single copy, spare availability
3. Allocate and remap a spare
4. Start the seed
5. Verify that the copy is healthy
6. Release the original disk
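The workflow above can be read as a simple loop over database copies. The following is a toy model of that sequence, not Exchange code: the Copy class, the status strings, the single_copy flag and autoreseed_pass are all hypothetical names, and the seed and health verification are collapsed into a single status change.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    name: str
    status: str        # "Healthy", "Failed" or "Suspended" (labels assumed)
    single_copy: bool  # stands in for the "single copy" prerequisite


def autoreseed_pass(copies, spares):
    """Run one scan over the copies and return a log of the steps taken."""
    log = []
    for copy in copies:
        # Step 1: only failed and suspended copies are candidates.
        if copy.status not in ("Failed", "Suspended"):
            continue
        # Step 2: prerequisites, including spare availability.
        if not copy.single_copy or not spares:
            log.append(f"{copy.name}: prerequisites not met, skipped")
            continue
        # Step 3: allocate and remap a spare.
        spare = spares.pop(0)
        log.append(f"{copy.name}: remapped to {spare}")
        # Steps 4 and 5: seed, then verify (modeled as a status change).
        log.append(f"{copy.name}: seed started")
        copy.status = "Healthy"
        # Step 6: release the original disk.
        log.append(f"{copy.name}: verified healthy, original disk released")
    return log


copies = [Copy("DB1", "Failed", True), Copy("DB2", "Healthy", True)]
for line in autoreseed_pass(copies, spares=["Spare1"]):
    print(line)
```

The point of the model is the ordering: nothing is remapped until the prerequisites pass, and the original disk is released only after the new copy is verified.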

Configure storage subsystem with spare disks

AutoDagDatabasesRootFolderPath AutoDagVolumesRootFolderPath

• Create DAG, add servers with configured storage
• Create directory and mount points
• Configure DAG, including 3 new properties
• Create mailbox databases and database copies

AutoDagDatabaseCopiesPerVolume = 1

[Diagram: directory and mount point layout for MDB1 and MDB2, showing MDB1 DB and MDB1 logs folders.]

Name: System Bad State
Check: No threads, including non-managed threads, can be scheduled
Action: Hard restart (bugcheck)
Threshold: 302 seconds

Name: Long I/O Times
Check: I/O operation latency
Action: Hard restart (bugcheck)
Threshold: 41 seconds

Name: Replication service memory threshold (ok, not a storage failure)
Check: MSExchangeRepl.exe consumes excessive memory
Action:
1. Log event 4395 with termination message
2. Initiate termination of msexchangerepl.exe
3. If termination fails, hard restart (bugcheck)
Threshold: 4 GB

Managed Availability Database Failover Changes Best Copy Selection Changes Maintenance Mode DAG Network Auto-Config Cmdlet Enhancements Transport HA Enhancements

If a protocol goes down on a mailbox server, every active database loses access to that protocol
For most protocols, quick correction is provided through a restart action
If the restart fails, a failover is often triggered
• Protocols control recovery sequence
• Recovery sequence optimized through Office 365 experience; service experience accrues to enterprise!

[Diagram: CAS15-1 and CAS15-2 behind a Layer 4 LB, in front of a DAG of MBX15-1, MBX15-2 and MBX15-3 hosting OWA, DB1 and DB2, shown over time.]

Managed Availability = Monitoring + HA

“Stuff breaks, but the Experience does not”

Provides
• Reliable and scalable monitoring framework for Exchange components
• Broader perspective across groups of Exchange servers

Provides
• Sequencing mechanism to control when recovery actions are done vs. alert issued (human engaged)
• Common set of recovery actions

Provides
• Set of enhancements to the best copy selection (BCS) process
• Mechanism to control in and out of service for Mailbox and CAS (maintenance mode++)

Restart
• Service - kill and start a service; optional dump
• AppPool - restart an app pool; optional dump
• Server - bugcheck the machine

Failover, Offline, Online
• Database - failover a single active database
• System - failover all active databases
• Protocol off - set health state for protocol to offline
• Protocol on - calculate when a health set is green

Escalate
• Notify a human of an issue

• Checks for a server hosting a copy of the affected database that has all health sets in a healthy state
• Checks for a server hosting a copy of the affected database that has all health sets Medium and above in a healthy state
• Checks for a server hosting a copy of the affected database that has health sets in a state that is better than the current server hosting the affected copy
• Checks for a server hosting a copy of the affected database that has health sets in a state that is the same as the current server hosting the affected copy
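One way to picture the four checks is as an ordered preference list, tried from strictest to most lenient. The sketch below is illustrative only and is not the Exchange implementation. It assumes the checks are applied in the order listed, represents each server as a hypothetical dictionary of unhealthy health-set counts per priority, and uses the total unhealthy count as a stand-in for "better than" and "same as". The priority names other than Medium are assumed.

```python
def total_unhealthy(server):
    """Number of unhealthy health sets across all priorities."""
    return sum(server["unhealthy"].values())


def medium_and_above_healthy(server):
    """True when only Low-priority health sets (if any) are unhealthy."""
    return all(n == 0 for prio, n in server["unhealthy"].items() if prio != "Low")


# Ordered from strictest to most lenient, mirroring the four checks.
TIERS = [
    ("all health sets healthy",
     lambda s, cur: total_unhealthy(s) == 0),
    ("Medium and above healthy",
     lambda s, cur: medium_and_above_healthy(s)),
    ("better than current server",
     lambda s, cur: total_unhealthy(s) < total_unhealthy(cur)),
    ("same as current server",
     lambda s, cur: total_unhealthy(s) == total_unhealthy(cur)),
]


def pick_target(candidates, current):
    """Return (server name, tier label) for the first tier any candidate meets."""
    for label, accept in TIERS:
        for server in candidates:
            if accept(server, current):
                return server["name"], label
    return None, None


current = {"name": "MBX1", "unhealthy": {"High": 1, "Low": 1}}
candidates = [
    {"name": "MBX2", "unhealthy": {"Medium": 1}},
    {"name": "MBX3", "unhealthy": {"Low": 2}},
]
print(pick_target(candidates, current))
```

In this example MBX3 is chosen at the second tier even though MBX2 has fewer unhealthy health sets overall, because MBX3's problems are confined to Low priority.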

[Diagram: two datacenters. Redmond hosts cas1, cas2, mbx1 and mbx2; Portland hosts cas3, cas4, mbx3 and mbx4. dag1 spans mbx1 to mbx4.]

1. Mark the failed servers/site as down: Stop-DatabaseAvailabilityGroup DAG1 -ActiveDirectorySite:Redmond
2. Stop the Cluster service on the remaining DAG members: Stop-Service clussvc
3. Activate DAG members in the second datacenter: Restore-DatabaseAvailabilityGroup DAG1 -ActiveDirectorySite:Portland

• Namespace simplification
• Consolidation of server roles
• Separation of CAS array and DAG recovery
• De-coupling of CAS and Mailbox by AD site
• Load balancing changes

three locations

Assuming MBX3 and MBX4 are operating and one of them can lock the witness.log file, automatic failover should occur. If not, you can perform fast recovery using the previous steps.

[Diagram: dag1 spans mbx1 and mbx2 in Redmond and mbx3 and mbx4 in Portland, with the witness shown separately.]
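The failover condition can be read as a vote count. This is a minimal sketch, assuming a simple majority model in which each of the four DAG members and the witness holds one vote; the slide does not spell out the quorum arithmetic. has_quorum is a hypothetical helper, not an Exchange or cluster API.

```python
def has_quorum(votes_available, total_votes):
    """Majority rule: strictly more than half of all votes must be present."""
    return votes_available * 2 > total_votes


TOTAL_VOTES = 4 + 1  # mbx1..mbx4 plus the witness (assumed one vote each)

# Redmond (mbx1, mbx2) is lost; mbx3 and mbx4 remain in Portland.
with_witness = has_quorum(2 + 1, TOTAL_VOTES)  # a survivor locks witness.log
without_witness = has_quorum(2, TOTAL_VOTES)   # witness unreachable

print(with_witness, without_witness)
```

Under this model the two surviving members keep the DAG running only if they can also reach the witness, which is why the witness sits outside both datacenters in the diagram.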

• Download the preview version of Exchange Server 2013
• Try the new Exchange Online in the Office 365 Enterprise Preview
• Follow the Exchange Team Blog
• Product Documentation