BC and DR
Business Continuance
and Disaster Recovery
Maciej Bocian
mbocian@[Link]
Architecture Sales Manager
Data Center and Virtualization, Central Europe
CCIE#7785
Presentation_ID © 2009 Cisco Systems, Inc. All rights reserved. Cisco Confidential 1
Business Continuance Drivers
• Regulatory mandates (Homeland
Hurricanes
grids
Business Continuance Is More Critical than Ever
75% of IT decision-makers have altered Disaster
Recovery/Business Continuance programs as a
result of September 11
Agenda
The Evolution of
Data Centers
Data Center Evolution
[Figure: compute evolution (Mainframes, Client/Server, Internet Computing) and network evolution (Terminal, TCP/IP, Thin Client: HTTP, Content Networking) leading to the networked data center phase. Other labels: Data Center Networking, Data Center Network Consolidation, Distributed Data Center, Data Center Continuous Availability, Optimization, Business Agility]
Networked data center phase:
1. Consolidation
2. Integration
3. Distributed
4. High Availability
What Is a Distributed Data Center
[Figure: Primary Data Center and Secondary Data Center, each with FC storage, linked by Data Replication]
Why Distributed Data Centers
Front-end IP Access Layer
“Content Routing”: site selection
[Figure: Primary Data Center (APP A, APP B) and Secondary Data Center (APP A, APP C), each with FC storage]
Application and Database Layer
“Content Switching”: Load Balancing
“Server Clustering”: High Availability
[Figure: Primary Data Center (APP A, APP B) and Secondary Data Center (APP A, APP C), each with FC storage]
Backend SAN Extension
[Figure: Primary Data Center and Secondary Data Center, each with FC storage]
Data Center Disaster
Recovery
Agenda
Disaster Recovery
What It Means for Business
Business Resilience
Continued Operation of Business During a Failure
Business Continuance
Restoration of Business After a Failure
Disaster Recovery
Protecting Data Through Offsite Data Replication and Backup
Disaster Recovery Planning
• Risk Analysis
Identifies important functions and assets that are critical to the
company’s operations
Disaster Recovery Objectives
Recovery Point Objective (RPO)
The point in time (prior to the outage) to which systems and data
must be restored
Tolerable loss of data in the event of a disaster or failure
The impact of data loss and the cost associated with the loss
Recovery Time Objective (RTO)
The period of time after an outage in which the systems and data
must be restored to the predetermined RPO
The maximum tolerable outage time
Recovery Access Objective (RAO)
Time required to reconnect users to the recovered application,
regardless of where it is recovered
Recovery Point/Time vs. Cost
[Figure: timeline]
time t0: Critical data is recovered
time t1: Disaster strikes
time t2: Systems recovered and operational
Recovery point: t0 to t1
Recovery time: t1 to t2
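The t0/t1/t2 timeline can be turned into a small check against the stated objectives. The sketch below is illustrative only: the function names and the sample timestamps are assumptions, not part of the original slides. It treats t1 - t0 as the data loss window (compared with the RPO) and t2 - t1 as the outage duration (compared with the RTO).

```python
from datetime import datetime, timedelta

def recovery_metrics(t0, t1, t2):
    """Return (data_loss_window, downtime) for one outage.

    t0: point in time to which data can be restored
    t1: time the disaster strikes
    t2: time systems are recovered and operational
    """
    return t1 - t0, t2 - t1

def meets_objectives(t0, t1, t2, rpo, rto):
    """True when the outage stays within both the RPO and the RTO."""
    loss, downtime = recovery_metrics(t0, t1, t2)
    return loss <= rpo and downtime <= rto

# Hypothetical outage: last recoverable copy at 02:00, disaster at 02:45,
# service restored at 06:45.
t0 = datetime(2009, 1, 1, 2, 0)
t1 = datetime(2009, 1, 1, 2, 45)
t2 = datetime(2009, 1, 1, 6, 45)
loss, downtime = recovery_metrics(t0, t1, t2)
print("data loss window:", loss, "downtime:", downtime)
```

With an RPO of one hour and an RTO of four hours this outage is within target; tightening the RTO to two hours makes the same outage a miss.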
Failure Scenarios
Network Failures
[Figure: Internet access through Service Provider A and Service Provider B]
ISP failure
✓ Dual ISP connections
✓ Multiple ISP
Device Failures
[Figure: Internet access through Service Provider A and Service Provider B]
Hosts
✓ HA cluster
Storage Failures
[Figure: Internet access through Service Provider A and Service Provider B]
Disk arrays
✓ RAID
Disk Controllers
Site Failures
[Figure: Internet access through Service Provider A and Service Provider B]
Agenda
Cold Standby
Disaster Recovery – Active/Standby
[Figure: Primary Data Center and Secondary Data Center (Cold Standby), each with FC storage]
Warm Standby
Disaster Recovery – Active/Standby
[Figure: Primary Data Center and Secondary Data Center (Warm Standby), each with FC storage, linked by an IP/Optical Network]
Hot Standby
Disaster Recovery – Active/Standby
[Figure: Primary Data Center and Secondary Data Center, each with FC storage, linked by an IP/Optical Network]
Disaster Recovery – Active/Active
Multiple Tiers of Application
[Figure: Internet access through Service Provider A and Service Provider B]
Presentation Tier
Application Tier
Storage Tier
Active/Active Data Centers
[Figure: two data centers, each attached to the Internal Network and to the Internet through Service Provider A and Service Provider B]
Active/Active Web Hosting
Active/Active Application Processing
Active/Standby or Active/Active Database Processing
Disaster Recovery
Components
Agenda
Site Selection Mechanisms
Site selection mechanisms depend on the technology
or mix of technologies adopted for request routing:
1. HTTP Redirect
2. DNS Based
3. L3 Routing with Route Health Injection (RHI)
Health of servers and/or applications needs to be
taken into account
Optionally, other metrics (like load) can be measured
and utilized for a better selection
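Whatever the request-routing mechanism, the selection logic described here reduces to "discard unhealthy sites, then optionally rank the rest by another metric". The sketch below is a minimal illustration of that rule. The `Site` class, its fields, and the addresses (taken from the IP documentation ranges) are assumptions made for the example and do not represent any product's API.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Site:
    name: str
    address: str    # address handed back to the client (hypothetical)
    healthy: bool   # result of server/application health probes
    load: float     # 0.0 (idle) .. 1.0 (saturated), an optional metric

def select_site(sites: List[Site]) -> Optional[Site]:
    """Pick the least-loaded healthy site, or None if every site is down."""
    candidates = [s for s in sites if s.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.load)

sites = [
    Site("primary", "192.0.2.10", healthy=True, load=0.80),
    Site("secondary", "198.51.100.10", healthy=True, load=0.35),
]
chosen = select_site(sites)
print(chosen.name if chosen else "no site available")
```

The same function could sit behind an HTTP redirector, a DNS responder, or the health check that triggers a route injection; only the way the answer is delivered to the client differs.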
HTTP Redirection – The Idea
HTTP Redirection – Traffic Flow
[Link]
[Link]
[Link]
Advantages of the HTTP Redirection
Approach
Limitations of the HTTP Redirection
Approach
DNS-Based Site Selection – The Idea
DNS-Based Site Selection – Traffic Flow
[Figure: name resolution in 10 numbered steps among Client, DNS Proxy, Root Name Server for /, Authoritative Name Server for .com, and the authoritative name servers for [Link]; DNS queries use UDP:53, the client connection to the selected site uses TCP:80]
Advantages of the DNS-Based Approach
Protocol independent: works with any
application that uses name resolution
Minimal configuration changes in the current
IP and DNS infrastructure (DNS authoritative
server)
Implementation can be different for specific
host names
A-records can be changed on the fly
Can take load or data center size into
account
Can provide proximity
Limitations of the DNS-Based Approach
Route Health Injection – The Idea
Route Health Injection – Implementation
Router 10
Advantages of the RHI Approach
Limitations of the RHI Approach
Agenda
Cluster Overview
A cluster is two or more servers
configured to appear as one
Two types of clustering: Load
balancing (LB) and High
Availability (HA)
[Figure: Web Servers]
HA Cluster Connections
Public Network (typically
Ethernet) for client/Application
requests
Servers with same hardware,
OS, and application software
Private Network (typically
Ethernet) for interconnection
between nodes. Could be direct
connect, or optionally going
through the public network
Storage Disk (typically Fiber)
shared storage array, NAS or
SAN
Typical HA Cluster Components
Application software that is clustered to provide High
Availability. Examples: Microsoft Exchange, SQL, Oracle
database, File and Print Services
Operating System that runs on the server hardware.
Examples: Microsoft Windows 2000 or 2003, Linux (and the
other flavors of UNIX), IBM MVS or z/OS (for mainframe)
Cluster Software that provides the HA clustering service
for the application. Examples: Microsoft MSCS, EMC
AutoStart (Legato), Veritas Cluster Server, HP TruCluster
and OpenVMS
Optionally, Cluster Enabler: software that synchronizes
the cluster software with the storage disk array software
Basic HA Cluster Design
Active/Standby:
– Active node takes client
requests and writes to the data
– Standby takes over when
detecting failure on active
– Two-node or multi-node
Active/Active:
– Database requests load
balanced to both nodes
– Lock mechanism ensures
data integrity
– Most scalable design
[Figure: node1 and node2]
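The active/standby takeover rule ("standby takes over when detecting failure on active") is commonly implemented with heartbeats and a timeout. The following is a simplified sketch under that assumption; the class name, the timeout value, and the use of plain numbers for time are all hypothetical and chosen only to keep the example deterministic.

```python
class StandbyMonitor:
    """Standby node that promotes itself when heartbeats stop arriving."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_heartbeat = None   # time of the last heartbeat seen
        self.role = "standby"

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the active node."""
        self.last_heartbeat = now

    def tick(self, now: float) -> str:
        """Re-evaluate the role; promote if the active node went silent."""
        silent = (
            self.last_heartbeat is not None
            and now - self.last_heartbeat > self.timeout_s
        )
        if self.role == "standby" and silent:
            self.role = "active"   # take over client requests and data writes
        return self.role

monitor = StandbyMonitor(timeout_s=3.0)
monitor.heartbeat(now=0.0)
print(monitor.tick(now=2.0))   # still within the timeout
print(monitor.tick(now=3.5))   # heartbeat missing for longer than the timeout
```

A timeout alone cannot distinguish a dead peer from a broken interconnect, which is why a real cluster pairs this rule with a split-brain safeguard before mounting disks.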
File System Approaches for HA Clusters
Shared Everything
– Equal access to all storage
– Each node mounts all storage resources
– Provides a single layout reference system for all nodes
– Changes updated in the layout reference
Shared Nothing
– Traditional file system with peer-peer communication
– Each node mounts only its “semi-private” storage
– Data stored on the peer system’s storage is accessed via the peer-peer
communication
– Failed node’s storage needs to be mounted by the peer
Geo-clusters
Geo-cluster: a cluster that spans multiple data centers
[Figure: node1 in the Local Datacenter and node2 in the Remote Datacenter, connected over the WAN]
Disk Replication
Synchronous or Asynchronous
2 x RTT
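The "2 x RTT" note can be made concrete with a rough distance calculation. The sketch below assumes a one-way propagation delay of about 5 microseconds per kilometre of fibre, which is a common rule of thumb and not a figure from the slides, and it ignores equipment and queuing delay. The function names are illustrative.

```python
# Assumed one-way propagation delay in optical fibre (rule of thumb).
FIBER_DELAY_US_PER_KM = 5.0

def rtt_ms(distance_km: float) -> float:
    """Round-trip time in milliseconds from propagation delay alone."""
    return 2 * distance_km * FIBER_DELAY_US_PER_KM / 1000.0

def sync_write_penalty_ms(distance_km: float, round_trips: int = 2) -> float:
    """Latency added to each synchronously replicated write (2 x RTT by default)."""
    return round_trips * rtt_ms(distance_km)

for km in (10, 50, 400):
    print(f"{km} km: RTT {rtt_ms(km):.2f} ms, "
          f"added write latency {sync_write_penalty_ms(km):.2f} ms")
```

Under these assumptions the added latency grows linearly with distance, which is the practical reason synchronous disk replication is limited to shorter inter-site distances.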
Considerations for HA Clusters
Split Brain: Cluster partitioning when nodes cannot communicate with
each other but are equally capable of forming a cluster and mounting disks.
Extended L2 required in most implementations for:
– Public Network, since the client only knows about the Virtual IP address
– Private Network, used for heartbeats
Storage:
– Directly Attached Disk (DAS) cannot be used
– Shared Disk needs to be visible to both Nodes
– Needs to interface with cluster software for disk failover, zoning,
LUN masking when there is a node failure
Split-Brain
disk, thus corrupting data
Data Corruption
Resolution for Split Brain: Quorum
quorum
Application data
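One way to picture how a quorum resolves split brain: a partition may mount the application data only if it holds a strict majority of votes. The sketch below assumes a vote-counting scheme in which the quorum resource contributes one tie-breaking vote. The slide does not specify the mechanism, so the function names and the voting model are assumptions for illustration.

```python
def has_quorum(votes_held: int, total_votes: int) -> bool:
    """A partition has quorum only with a strict majority of all votes."""
    return votes_held > total_votes // 2

def may_mount_disks(partition, all_nodes, holds_quorum_resource=False) -> bool:
    """Decide whether a partition of the cluster may mount the shared data.

    partition: set of node names that can still reach each other
    all_nodes: set of every node configured in the cluster
    holds_quorum_resource: True if this partition owns the quorum resource,
        counted here as one extra tie-breaking vote (an assumption)
    """
    total_votes = len(all_nodes) + 1
    votes_held = len(partition) + (1 if holds_quorum_resource else 0)
    return has_quorum(votes_held, total_votes)

cluster = {"node1", "node2"}
# The heartbeat link fails: each node sees only itself.
print(may_mount_disks({"node1"}, cluster, holds_quorum_resource=True))
print(may_mount_disks({"node2"}, cluster, holds_quorum_resource=False))
```

Because only one partition can own the quorum resource at a time, at most one side ever reaches a majority, so the two nodes cannot both write to the same disk.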
Extended Layer 2 Network
Disk Replication:
Synchronous or
Asynchronous
Resolution: L3 Routed Solution
In certain cases an L3 routed solution
is possible
Microsoft MSCS
– Requires that 2 nodes be on the
same subnet
– The communication between the 2
nodes is UDP unicast
– Local Area Mobility (LAM) allows
the placement of the nodes on 2
different subnets
Veritas VCS
– Allows having nodes with IP
addresses in different subnets
– The Virtual Address needs to
change when moving from node1 to
node2
– DNS can be used to provide
name-to-multiple-IP mapping
[Figure: node1 and node2 on subnets 172.28.210.x and 11.20.5.x, joined by an Extended SAN; Disk Replication: Synchronous or Asynchronous]
Storage Disk Zoning
What storage disk array
should node 2 be zoned to
before and after a failure on
node 1?
To complete the failover you
need to change the zoning
configuration
Software needed to
synchronize the Cluster
Software with the Disk Array’s
software, i.e. Cluster Enabler
[Figure: node1 and node2 (active/standby) attached through an Extended SAN to disk arrays sym1320 and sym1291; access labels RW and RD]
Resolution: Cluster Enabler
RW WD
Agenda
Terminology
Storage subsystem
Just a bunch of disks (JBOD)
Redundant array of independent disks (RAID)
Storage protocols
SCSI
iSCSI
FC (FCIP)
Terminology (Cont’d)
Direct Attached Storage (DAS)
Storage is “local” behind the server
No storage sharing possible
Costly to scale; complex to manage
Network Attached Storage (NAS)
Storage is accessed at a file level over an IP network
Storage can be shared between servers
Storage Area Networks (SAN)
Storage is accessed at a block level
Separation of Storage from the Server
High performance interconnect providing high I/O throughput
Storage for Applications
Presentation Tier
Unrelated small data files commonly stored on internal disks
Manual distribution
Application Processing Tier
Transitional, unrelated data
Small files residing on file systems
May use RAID to spread data over multiple disks
Storage Tier
Large, permanent data files or raw data
Large batch updates, most likely Real time
Log and data on separate volumes
Backup and Replication
Offsite tape vaulting
Backup tapes stored at offsite location
Electronic vaulting
Transmission of backup data to offsite location
Remote disk replication
Continuous copying of data to offsite location
Transparent to host
Replication: Modes of Operation
Synchronous
All data written to cache of local and remote arrays before I/O is
complete and acknowledged to host
Asynchronous
Write acknowledged after write to local array cache; changes
(writes) are replicated to remote array asynchronously
Semi-synchronous
Write acknowledged with a single subsequent WRITE command
pending from remote array
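The difference between the synchronous and asynchronous modes comes down to when the host receives its acknowledgement relative to the remote copy. The toy model below illustrates this; it is not any array vendor's implementation, the class and method names are hypothetical, and the semi-synchronous mode is left out for brevity.

```python
from collections import deque

class ReplicatedVolume:
    """Toy model of a local array replicating writes to a remote array."""

    def __init__(self, mode: str):
        if mode not in ("sync", "async"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.local = []          # blocks in the local array cache
        self.remote = []         # blocks in the remote array cache
        self.pending = deque()   # acknowledged but not yet replicated

    def write(self, block) -> str:
        """Accept a host write and return once it is acknowledged."""
        self.local.append(block)
        if self.mode == "sync":
            self.remote.append(block)    # remote cache updated before the ack
        else:
            self.pending.append(block)   # replicated later, ack sent now
        return "ack"

    def drain(self, count=None) -> None:
        """Ship queued asynchronous writes to the remote array."""
        n = len(self.pending) if count is None else min(count, len(self.pending))
        for _ in range(n):
            self.remote.append(self.pending.popleft())

    def blocks_lost_if_primary_fails(self) -> int:
        """Acknowledged writes that exist only at the primary site."""
        return len(self.local) - len(self.remote)

sync_vol, async_vol = ReplicatedVolume("sync"), ReplicatedVolume("async")
for block in ("b1", "b2", "b3"):
    sync_vol.write(block)
    async_vol.write(block)
async_vol.drain(1)
print(sync_vol.blocks_lost_if_primary_fails(), async_vol.blocks_lost_if_primary_fails())
```

In the synchronous case nothing acknowledged to the host can be lost, at the price of waiting for the remote array on every write; in the asynchronous case the exposure equals the replication backlog at the moment of failure.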
Synchronous vs. Asynchronous Trade-Off
Synchronous | Asynchronous
Impact to Application Performance | No Application Performance Impact
Distance Limited (Are Both Sites within the Same Threat Radius) | Unlimited Distance (Second Site Outside Threat Radius)
No Data Loss | Exposure to Possible Data Loss
Data Replication with DB Example
Data Replication with DB Example (Cont’d)
Hot Backup of Datafiles and Control Files taken at time t0
Failure or disaster occurs at time t1
• Media Failure (e.g. disk)
• Human Error (datafile deletion)
• Database Corruption
Database restored to state at time of failure (time t1) by:
1. Restoring Control Files & Datafiles from last Hot
Backup (time t0)
2. Sequentially replaying changes from subsequent
Redo Logs (archived and online) – changes made
between time t0 and t1
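The two recovery steps (restore the t0 backup, then replay redo changes up to t1) can be sketched with plain Python data structures. This is a conceptual illustration only: a real database uses its own recovery tooling, and the dictionary "database", the tuple-based redo log, and the `restore` function are assumptions made for the example.

```python
def restore(backup: dict, redo_log: list, until: int) -> dict:
    """Rebuild database state as of time `until`.

    backup:   snapshot of the data taken at time t0
    redo_log: (timestamp, key, value) changes made after t0, in time order
    until:    time of failure t1; later changes are not applied
    """
    db = dict(backup)                 # step 1: restore from the hot backup
    for timestamp, key, value in redo_log:
        if timestamp > until:
            break                     # stop at the failure time
        db[key] = value               # step 2: replay changes sequentially
    return db

backup_t0 = {"balance": 100}
redo = [
    (1, "balance", 120),
    (2, "owner", "alice"),
    (3, "balance", 90),   # logged after the chosen recovery time
]
print(restore(backup_t0, redo, until=2))
```

If the redo logs themselves are lost with the primary site, recovery can only reach t0, which is why the redo logs are the data worth replicating with the tightest guarantees.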
Data Replication with DB Example (Cont’d)
[Figure: Primary Site and Secondary Site linked by SAN Extension Transport]
Redo Logs (Cyclic): Copy of Every Committed Transaction Synchronously Replicated for Zero Loss
Database: Point in Time Copy Taken When DB Quiescent; Database Copy at Time t0 Replicated/Copied
Earlier DB Backups
Data Center Interconnection Options
[Figure: two data centers, each with Internet access, Stateful Firewalls, Content Caching, Server Load Balancing, Intrusion Detection, High Density Multilayer LAN Switch, Back-End Application Servers, and High Density Multilayer SAN Director; interconnect labels: SONET/SDH and DWDM/CWDM]
Data Center Transport Options
Increasing Distance: Data Center, Campus, Metro, Regional, National
Data Center Replication with SAN Extension
Extend the normal reach of
a Fibre Channel fabric
Storage Replication
Remote host to target array
Shared data clusters
[Figure: two FC fabrics joined by a SAN Extension Network; labels: Replication, Shared Data Cluster or Remote Host Access to Storage]
SAN Design for Data Replication
Server
Site A Access
Data Center
Disaster Recovery
sample design
Disaster Impact Radius
Impact radius: Metro (< 50km), Regional (< 400km), Global
[Figure: Primary Data Center, Secondary Data Center, and DR Site placed relative to these radii]
Active/Standby Architecture - Today
[Figure: High Availability Site 1 (CA), High Availability Site 2 (CA), and Disaster Recovery Site (NC), each with MDS 9509’s; link labels: Synch Replication, CWDM, FCIP, Dual OC12, Asynchronous Replication; storage labels: Storage 1, Storage 2, Bunker, Storage 3]
Frame Based Replication
[Figure: Data Center 1 (Production Cluster) and Data Center 2 (D/R) on EMC/DMX arrays; PROD and D/R volumes with Redo and Arch logs; R1, R2, and BCV copies; Timefinder PiT copies; replication labels SRDF and SRDF/A; label: Triple Threat]
Active/Active Architecture - Tomorrow
[Figure: DC1; labels: Mirror, Presentation Layer, Asynchronous Replication]
Requests directed to primary application
SANTap and Continuous Data Protection
Primary Secondary
Fabric Based Replication with CDP
[Figure: Data Center 1 (Production Cluster) and Data Center 2 (D/R), each with an MDS switch and a Replication/CDP Appliance attached through SANTap, linked by DUAL OC12; PROD, D/R, and BCV volumes with Redo and Arch logs; APiT copies; replication label SRDF/A]
End-End Data Center Resilience
[Figure: Corp. DNS with GSS-1 and GSS-2 in front of DC-1, DC-2, and DC-3; Web/APP Server Farm and DB tiers; FC storage linked over an IP/Optical Network (CWDM/DWDM) between the Primary Location and the Secondary Location]
Summary - Design Details
Data centers 1 and 2 are in the primary location, close enough
to each other to provide DC HA for active/active
access
Data Center 3 (DR) is farther than the tolerable disaster radius
away from Primary DC 1 and 2
Web/App server farms are load balanced geographically
DB servers are within a geo-HA cluster and running in an
L3 design
Synchronous data replication is done between data centers within
the primary location
Asynchronous data replication is done between the
primary and secondary storage systems