BC and DR

The document discusses the importance of business continuity and disaster recovery in data centers, highlighting the critical need for organizations to recover operations quickly after disruptions. It outlines various disaster recovery strategies, including cold, warm, and hot standby options, and emphasizes the necessity of planning and risk analysis to mitigate impacts from disasters. Additionally, it presents the evolution of data centers and the components necessary for effective disaster recovery, such as data replication and server high availability.

Data Center

Business Continuance
and Disaster Recovery

Maciej Bocian
mbocian@[Link]
Architecture Sales Manager
Data Center and Virtualization, Central Europe
CCIE#7785

Presentation_ID © 2009 Cisco Systems, Inc. All rights reserved. Cisco Confidential 1
Business Continuance Drivers

• Cost of application downtime, lost data and productivity

• Regulatory mandates (Homeland Defense, Basel II, HIPAA, GLB, SEC)

Firms must recover business operations the same business day a disruption occurs
“Out-of-region” data center, 200+ km away
Mandates backup data centers on separate grids

[Images: Hurricanes; The Northeast Blackout; NYC Blizzard of 2003]

Business Continuance Is More Critical than Ever
75% of IT decision-makers have altered Disaster Recovery/Business Continuance programs as a result of September 11

Following a disaster 43% of directly affected businesses do not reopen and 29% fail within 24 months as a result

Only 15% of Global 2000 enterprises have a full-fledged business continuity plan.

Disasters: fire, storm, floods, earthquakes, chemical accidents, nuclear accidents, wars
Sources: Disaster Recovery Journal, Gartner Group

Agenda

Introduction to Data Center - The Evolution

Data Center Disaster Recovery
Objectives
Failure Scenarios
Design Options
Components of Disaster Recovery
Site Selection - Front End GSLB
Server High Availability - Clustering
Data Replication and Synchronization - SAN Extension
Sample Design

The Evolution of
Data Centers

Data Center Evolution

[Figure: timeline from 1960 through 1980 and 2000 to 2010. Compute evolution: mainframes, client/server, Internet computing. Network evolution: terminal, TCP/IP, thin client (HTTP), content networking. Data center stages: data center networking, Internet data center, network consolidation and optimization, distributed data center (business agility), data center continuous availability. Networked data center phase: 1. Consolidation, 2. Integration, 3. Distributed, 4. High Availability.]

What is involved in a Data Center

Network infrastructure solution: Cisco GSRs, Cisco Catalyst 6500, Cisco Catalyst Cat4000
Application solution: Linux/HP, Solaris/SunFire, WebLogic, J2EE custom app, etc.
Layer 4–7 services solution: CSM, SSLM, CSS, CE, GSS
Database solution: Linux/HP, Solaris/SunFire, Oracle 10G RAC, etc.
Network security solution: PIX®, FWSM, IDSM, VPNSM, CSA
Storage solution: MDS9000
Management and instrumentation solution: Terminal servers, NAM, CiscoWorks LMS/VMS, HSE

What is Distributed Data Center

[Figure: a primary data center (APP A, APP B) and a secondary data center (APP A, APP C), each with FC-attached storage, linked by data replication.]

Why Distributed Data Centers

Provide disaster recovery and business continuance

Avoid single, concentrated data depository
High availability of applications and data access
Load balancing together with performance scalability
Better response and optimal content routing: proximity to clients

Front-end IP Access Layer

[Figure: “Content Routing” site selection in front of the primary data center (APP A, APP B) and the secondary data center (APP A, APP C), each with FC-attached storage.]

Application and Database Layer

[Figure: “Content Switching” for load balancing and “Server Clustering” for high availability across the primary data center (APP A, APP B) and the secondary data center (APP A, APP C), each with FC-attached storage.]

Backend SAN Extension

“Storage” & “Optical”

[Figure: data mirroring and replication between the FC-attached storage of the primary data center (APP A, APP B) and the secondary data center (APP A, APP C).]

Data Center Disaster
Recovery

Disaster Recovery

Recovery of data and resumption of service - Ensuring business can recover and continue after failure or disaster

Ability of a business to adapt, change and continue when confronted with various outside impacts

Mitigating the impact of a disaster

What It means For Business

Business Resilience
Continued Operation of Business During a Failure

Business Continuance
Restoration of Business After a Failure

Disaster Recovery
Protecting Data Through Offsite Data Replication and Backup

Zero Down Time is the ultimate goal

Disaster Recovery Planning

• Business Impact Analysis (BIA)
Determines the impacts of various disasters to specific business functions and company assets

• Risk Analysis
Identifies important functions and assets that are critical to the company’s operations

• Disaster Recovery Plan (DRP)
Restores operability of the target systems, applications, or computing facility at the secondary Data Center after the disaster

Disaster Recovery Objectives
Recovery Point Objective (RPO)
The point in time (prior to the outage) to which systems and data must be restored
Tolerable loss of data in the event of a disaster or failure
The impact of data loss and the cost associated with the loss
Recovery Time Objective (RTO)
The period of time after an outage in which the systems and data must be restored to the predetermined RPO
The maximum tolerable outage time
Recovery Access Objective (RAO)
Time required to reconnect users to the recovered application, regardless of where it is recovered

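The RPO and RTO definitions can be checked against the timestamps of an actual or rehearsed outage. The sketch below is illustrative only: the `RecoveryObjectives` class, the `evaluate_recovery` function, and the sample times are assumptions made for this example and are not part of the original deck.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    rpo: timedelta  # maximum tolerable window of lost data
    rto: timedelta  # maximum tolerable outage time


def evaluate_recovery(objectives, last_recovery_point, disaster_time, service_restored):
    """Compare the achieved data loss and outage against the objectives."""
    data_loss = disaster_time - last_recovery_point   # work lost before the outage
    outage = service_restored - disaster_time         # time without service
    return {
        "data_loss": data_loss,
        "outage": outage,
        "rpo_met": data_loss <= objectives.rpo,
        "rto_met": outage <= objectives.rto,
    }


# Hypothetical DR exercise: the last replicated copy is 10 minutes old,
# and service is restored 3 hours after the disaster.
objectives = RecoveryObjectives(rpo=timedelta(minutes=15), rto=timedelta(hours=2))
result = evaluate_recovery(
    objectives,
    last_recovery_point=datetime(2009, 1, 1, 11, 50),
    disaster_time=datetime(2009, 1, 1, 12, 0),
    service_restored=datetime(2009, 1, 1, 15, 0),
)
print(result["rpo_met"], result["rto_met"])
```

In this hypothetical exercise the RPO is met (10 minutes of data lost against a 15-minute objective) while the RTO is missed (3 hours of outage against a 2-hour objective).
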
Recovery Point/Time vs. Cost
[Figure: timeline. Critical data is recovered up to the recovery point (time t0), disaster strikes (time t1), and systems are recovered and operational at the recovery time (time t2).]

Recovery point (days, hours, mins, secs before the disaster): Tape backup, Periodic Replication, Asynchronous Replication, Synchronous Replication
Recovery time (secs, mins, hours, days, weeks after the disaster): Extended Cluster, Manual Migration, Tape Restore

Cost increases as RPO and RTO shrink
Smaller RPO/RTO: Higher $$$, Replication, Hot standby
Larger RPO/RTO: Lower $$$, Tape backup/restore, Cold standby

Failure Scenarios

Disaster could mean many types of failure

Network Failure
Device Failure
Storage Failure
Site Failure

Network Failures
[Figure: data center with connections to Internet Service Provider A and Service Provider B]

ISP failure
✓ Dual ISP connections
✓ Multiple ISP

Connection failure within the network
✓ EtherChannel
✓ Multiple route paths

Device Failures
[Figure: data center with connections to Internet Service Provider A and Service Provider B]

Routers, Switches, FWs
✓ HSRP
✓ VRRP

Hosts
✓ HA cluster

Storage Failures
[Figure: data center with connections to Internet Service Provider A and Service Provider B]

Disk arrays
✓ RAID

Disk Controllers

Site Failures
[Figure: data center with connections to Internet Service Provider A and Service Provider B]

Partial Site Failure
✓ Application maintenance
✓ Application migration
✓ Application scheduled DR exercise

Complete Site Failure
✓ Disaster

Cold Standby

One or more data centers with appropriately configured space equipped with pre-qualified environmental, electrical, and communication conditioning
Hardware and software installation, network access, and data restoration all need manual intervention
Least expensive to implement and maintain
Substantial delay from standby to full operation

Disaster Recovery – Active/Standby

[Figure: primary data center (APP A, APP B) and secondary data center in cold standby (APP A, APP B), each with FC-attached storage.]

Warm Standby

A data center that is partially equipped with hardware and communications interfaces capable of providing backup operating support.
Latest backups from the production data center must be delivered
Network access needs to be activated
Provides better RTO and RPO than Cold Standby backup

Disaster Recovery – Active/Standby

[Figure: primary data center (APP A, APP B) and secondary data center in warm standby (APP A, APP B), each with FC-attached storage, connected by an IP/optical network.]

Hot Standby

A data center that is environmentally ready and has sufficient hardware and software to provide data processing service with little or no down time.
Hot Backup offers Disaster Recovery with little or no human intervention
Application data is replicated from the primary site
A hot backup site provides very good RTO and RPO

Disaster Recovery – Active/Standby

[Figure: primary data center (APP A, APP B) and secondary data center (APP A, APP C), each with FC-attached storage, connected by an IP/optical network.]

Disaster Recovery – Active/Active

What Does Active/Active Mean??

Multiple Tiers of Application

[Figure: data center with connections to Internet Service Provider A and Service Provider B, divided into three tiers]

Presentation Tier
Application Tier
Storage Tier

Active/Active Data Centers

[Figure: two data centers, each connected to the internal network and to Internet Service Providers A and B]

Active/Active Web Hosting
Active/Active Application Processing
Active/Standby or Active/Active Database Processing

Disaster Recovery
Components

Site Selection Mechanisms
Site selection mechanisms depend on the technology or mix of technologies adopted for request routing:
1. HTTP Redirect
2. DNS Based
3. L3 Routing with Route Health Injection (RHI)
Health of servers and/or applications needs to be taken into account
Optionally, other metrics (like load) can be measured and utilized for a better selection

HTTP Redirection – The Idea

Leveraging the HTTP redirect function:

HTTP return code 302
Proper site selection made after the initial DNS request
has been resolved, via redirection
Mainly as a method of providing site persistence while
providing local server farm failure recovery
Can be used with the “Location Cookie” feature of the
CSS to provide redirection after wrong site selection
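
The redirect idea can be sketched in a few lines: after DNS has already resolved the initial request, the site that receives it answers with HTTP 302 and a Location header pointing at a healthy site. The host names, the `health` dictionary, and both function names below are hypothetical placeholders for illustration; they do not represent the CSS configuration or its “Location Cookie” feature.

```python
def select_site(sites, health):
    """Return the first healthy site in preference order, or None."""
    for site in sites:
        if health.get(site, False):
            return site
    return None


def redirect_response(path, sites, health):
    """Build a (status, headers) pair that redirects the client to a healthy site."""
    site = select_site(sites, health)
    if site is None:
        # No server farm is available anywhere: nothing to redirect to.
        return 503, {}
    # HTTP 302 sends the client to a fully qualified, site-specific name.
    return 302, {"Location": f"http://{site}{path}"}


# Hypothetical site names; the local farm at dc1 has failed.
sites = ["dc1.example.com", "dc2.example.com"]
health = {"dc1.example.com": False, "dc2.example.com": True}
print(redirect_response("/app", sites, health))
```

Because the redirect target is a site-specific name, the client stays on that site for later requests, which is the persistence property the slide mentions.
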

HTTP Redirection – Traffic Flow

[Figure: traffic flow diagram; its content is not recoverable from the source]

Advantages of the HTTP Redirection Approach

Can be implemented without any other GSLB devices or mechanisms
Inherent persistence to the selected location
Can be used in conjunction with other methods to provide more sophisticated site selection

Limitations of the HTTP Redirection Approach

It is protocol specific – relies on HTTP
Requires redirection to fully qualified additional names – additional DNS records
Users may bookmark a specific location – losing automatic failover
HTTPS redirect requires full SSL handshake to be completed first

DNS-Based Site Selection – The Idea

The client D-proxy (local name server) performs iterative queries
The device which acts as “site selector” is the authoritative name server for the domain(s) distributed in multiple locations
The “site selector” sends keepalives to servers or server load balancers in the local and remote locations
The “site selector” selects a site for the name resolution, according to the pre-defined answers and site load balance method
The user traffic is sent to the selected location

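The site selector's behavior can be modeled as a small state machine: keepalive results mark each site alive or dead, and each name resolution returns one of the pre-defined answers for a live site. The sketch below assumes round-robin as the site load-balance method; the `SiteSelector` class and the documentation-range addresses are invented for illustration and do not represent the GSS.

```python
from itertools import count


class SiteSelector:
    """Toy authoritative name server that answers only with live sites."""

    def __init__(self, answers):
        self.answers = dict(answers)                  # site name -> VIP (pre-defined answers)
        self.alive = {site: True for site in self.answers}
        self._requests = count()                      # drives the round-robin choice

    def record_keepalive(self, site, ok):
        """Store the result of the latest keepalive probe for a site."""
        self.alive[site] = ok

    def resolve(self):
        """Return the VIP of a healthy site, or None if every site is down."""
        healthy = [vip for site, vip in self.answers.items() if self.alive[site]]
        if not healthy:
            return None
        return healthy[next(self._requests) % len(healthy)]


selector = SiteSelector({"dc1": "192.0.2.10", "dc2": "198.51.100.10"})
first = selector.resolve()       # both sites alive: answers alternate
second = selector.resolve()
selector.record_keepalive("dc1", False)
after_failure = selector.resolve()   # only dc2 can be returned now
print(first, second, after_failure)
```

A real deployment also has to account for answers cached by the D-proxy and by client applications, so a failed site can keep receiving traffic until the cached record expires.
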
DNS-Based Site Selection – Traffic Flow

[Figure: numbered steps 1–10. The client queries its DNS proxy, which iterates over UDP:53 through the root name server, the authoritative name server for .com, and the authoritative name servers for the domain; the client then connects over TCP:80 to the selected site, Data Center 1 or Data Center 2.]

Advantages of the DNS Approach

Protocol independent: works with any application that uses name resolution
Minimal configuration changes in the current IP and DNS infrastructure (DNS authoritative server)
Implementation can be different for specific host names
A-records can be changed on the fly
Can take load or data center size into account
Can provide proximity

Limitations of the DNS-Based Approach

Visibility limited to the D-proxy (not the client)
Cannot guarantee 100% session persistency
DNS caching in the D-proxy
DNS caching in the client application
Order of multiple A-record answers can be altered by D-proxies

Route Health Injection – The Idea

Server and application health monitoring provided by local Server Load Balancers
SLB can advertise or withdraw VIP address to upstream routing devices depending on the availability of the local server farm
Same VIP addresses can be advertised from multiple data centers – IP Anycast
Relying on L3 routing protocols for route propagation and content request routing
Disaster Recovery provided by network convergence

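A minimal model of Route Health Injection: each site's SLB advertises the VIP only while its local server farm is healthy, and the routed network sends traffic toward the lowest-cost advertisement. The site names, cost values, and function names are assumptions for illustration; a real network would rely on the routing protocol's own metric and convergence.

```python
def advertised_routes(sites):
    """Return {site: cost} for every site whose SLB currently advertises the VIP."""
    return {s["name"]: s["cost"] for s in sites if s["farm_healthy"]}


def best_path(routes):
    """Pick the site the routed network would prefer (lowest cost), or None."""
    if not routes:
        return None
    return min(routes, key=routes.get)


# Hypothetical costs: Location B is preferred, Location A is the backup.
sites = [
    {"name": "location-a", "cost": 1000, "farm_healthy": True},
    {"name": "location-b", "cost": 10, "farm_healthy": True},
]
normal = best_path(advertised_routes(sites))

# The server farm at Location B fails, so its SLB withdraws the VIP route.
sites[1]["farm_healthy"] = False
after_withdrawal = best_path(advertised_routes(sites))
print(normal, after_withdrawal)
```

The failover needs no client-side change: once the route is withdrawn and the network reconverges, the same VIP is reachable at the backup location.
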
Route Health Injection – Implementation

[Figure: Client A and Client B reach VIP x.y.w.z through Routers 10, 11, 12 and 13. Location B is the preferred location for the VIP (low cost); Location A is the backup location for the VIP (very high cost).]

Advantages of the RHI Approach

Supports legacy applications and does not rely on a DNS infrastructure
Very good re-convergence time, especially in Intranets where L3 protocols can be fine tuned appropriately
Protocol-independent: works with any application
Robust protocols and proven features

Limitations of the RHI Approach

Relies on host routes (32 bits), which cannot be propagated all over the internet (more on this later)
Requires tight integration between the application-aware devices and the L3 routers
Inability to intelligently load balance among the data centers

Cluster Overview
A cluster is two or more servers configured to appear as one
Two types of clustering: Load balancing (LB) and High Availability (HA)
Clustering provides benefits for availability, reliability, scalability, and manageability
LB clustering: multiple copies of the same application against the same data set, usually read only
HA clustering: multiple copies of a long-running application that requires access to a common data depository, usually read and write

[Figure: tiers of Web Servers, Application Servers and Database Servers]

HA Cluster Connections
Public Network (typically Ethernet) for client/application requests
Servers with same hardware, OS, and application software
Private Network (typically Ethernet) for interconnection between nodes. Could be direct connect, or optionally going through the public network
Storage Disk (typically Fibre Channel): shared storage array, NAS or SAN

Typical HA Cluster Components

Application software that is clustered to provide High Availability. Example: Microsoft Exchange, SQL, Oracle database, File and Print Services
Operating System that runs on the server hardware. Example: Microsoft Windows 2000 or 2003, Linux (and the other flavors of UNIX), IBM VMS or z/OS (for mainframe)
Cluster Software that provides the HA clustering service for the application. Example: Microsoft MSCS, EMC AutoStart (Legato), Veritas Cluster Server, HP TruCluster and OpenVMS
Optionally, Cluster Enabler, software that synchronizes the cluster software with the storage disk array software

Basic HA Cluster Design

Active/Standby:
– Active node takes client requests and writes to the data
– Standby takes over when detecting failure on active
– Two-node or multi-node
Active/Active:
– Database requests load balanced to both nodes
– Lock mechanism ensures data integrity
– Most scalable design

[Figure: node1 and node2]

File System Approaches for HA Clusters

Shared Everything
– Equal access to all storage
– Each node mounts all storage resources
– Provides a single layout reference system for all nodes
– Changes updated in the layout reference
Shared Nothing
– Traditional file system with peer-peer communication
– Each node mounts only its “semi-private” storage
– Data stored on the peer system’s storage is accessed via the peer-peer communication
– Failed node’s storage needs to be mounted by the peer

Geo-clusters
Geo-cluster: cluster that spans multiple data centers

[Figure: node1 in the local datacenter and node2 in the remote datacenter, connected over the WAN. Disk replication between the sites is synchronous or asynchronous and is labeled 2 x RTT.]

Considerations for HA Clusters

Split Brain: Cluster partitioning when nodes cannot communicate with each other but are equally capable of forming a cluster and mounting disks.
Extended L2 required in most implementations for:
– Public Network, since client only knows about the Virtual IP address
– Private Network, used for heartbeats
Storage:
– Directly Attached Disk (DAS) cannot be used
– Shared Disk needs to be visible to both nodes
– Needs to interface with cluster software for disk failover, zoning, LUN masking when there is a node failure

Split-Brain

Split-brain happens when all of the network communication links between two or more cluster nodes fail.
Both nodes could potentially go active, and concurrently access the disk, thus corrupting data

[Figure: node1 and node2 both writing to the shared disk, resulting in data corruption]

Resolution for Split Brain: Quorum

A quorum device serves as a tie breaker to arbitrate which system has access to resources.
The quorum ensures that even if there is no communication between the nodes, only one node can continue to access the disk.
Only the node that owns the quorum (or, majority quorum votes) can bring resources online.
Any resource can be used as the arbitrator to break the tie.

[Figure: node1 and node2 attached to a quorum disk and to the application data]

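The tie-breaking rule can be expressed as a majority vote. The sketch below assumes one vote per node plus one vote for the quorum device, which is a common convention but not something the slide specifies; the function names and node labels are hypothetical.

```python
def has_quorum(votes_held, total_votes):
    """A partition may run resources only with a strict majority of all votes."""
    return 2 * votes_held > total_votes


def surviving_partition(partitions, quorum_owner, total_votes):
    """Return the partition allowed to bring resources online, or None.

    partitions: list of sets of node names that can still talk to each other.
    quorum_owner: the node that currently owns the quorum device.
    """
    for partition in partitions:
        # Each node contributes one vote; the quorum device adds one more
        # to whichever partition contains its owner.
        votes = len(partition) + (1 if quorum_owner in partition else 0)
        if has_quorum(votes, total_votes):
            return partition
    return None


# Two nodes plus a quorum device: three votes in total.
# All heartbeat links fail, so each node sits in its own partition.
winner = surviving_partition([{"node1"}, {"node2"}], quorum_owner="node1", total_votes=3)
print(winner)
```

Because a strict majority is required, at most one partition can ever win, which is what prevents both nodes from writing to the disk at the same time.
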
Extended Layer 2 Network

In most implementations, a common L2 network is needed for the heartbeat between the nodes, as well as public client access
Extending VLANs on a geographical basis is not considered best practice because of the impact of broadcasts, multicast, flooding and Spanning-Tree integration issues

[Figure: node1 in the local datacenter and node2 in the remote datacenter, joined across the WAN by a public Layer 2 network and a private Layer 2 network. Disk replication: synchronous or asynchronous.]

Resolution: L3 Routed Solution
In certain cases a L3 routed solution is possible
Microsoft MSCS
– Requires that 2 nodes be on the same subnet
– The communication between the 2 nodes is UDP unicast
– Local Area Mobility (LAM) allows the placement of the nodes on 2 different subnets
Veritas VCS
– Allows having nodes with IP addresses in different subnets
– The Virtual Address needs to change when moving from node1 to node2
– DNS can be used to provide name-to-multiple-IP mapping

[Figure: node1 and node2 on different subnets (172.28.210.x and 11.20.5.x), connected by an extended SAN. Disk replication: synchronous or asynchronous.]

Storage Disk Zoning
What storage disk array should node 2 be zoned to before and after a failure on node 1?
To complete the failover you need to change the zoning configuration
Software is needed to synchronize the Cluster Software with the Disk Array’s software, i.e. Cluster Enabler

[Figure: node1 (active) and node2 (standby) attached through an extended SAN to disk arrays sym1320 (RW) and sym1291 (RD)]

Resolution: Cluster Enabler

The Cluster Enabler (CE) provides the interface between the Clustering Software and the Disk Array’s software
When the Clustering Software detects a failure and wants to fail the node, the Cluster Enabler instructs the Disk Array to perform a failover
Cluster Enabler also allows node1 to be zoned to sym1320 and node2 to be zoned to sym1291
The Cluster Enabler running on each node typically communicates with the Cluster Enabler software running on the remote node with local multicast messages

[Figure: node1 (active) and node2 (standby) attached through an extended SAN to disk arrays sym1320 (RW) and sym1291 (WD)]

Terminology

Storage I/O devices
Host Bus Adapter (HBA)
Small Computer System Interface (SCSI)

Storage protocols
SCSI
iSCSI
FC (FCIP)

Terminology (Cont’d)
Direct Attached Storage (DAS)
Storage is “local” behind the server
No storage sharing possible
Costly to scale; complex to manage
Network Attached Storage (NAS)
Storage is accessed at a file level over an IP network
Storage can be shared between servers
Storage Area Networks (SAN)
Storage is accessed at a block level
Separation of Storage from the Server
High performance interconnect providing high I/O throughput

Storage for Applications
Presentation Tier
Unrelated small data files commonly stored on internal disks
Manual distribution
Application Processing Tier
Transitional, unrelated data
Small files residing on file systems
May use RAID to spread data over multiple disks
Storage Tier
Large, permanent data files or raw data
Large batch updates, most likely real time
Log and data on separate volumes

Backup and Replication
Offsite tape vaulting
Backup tapes stored at offsite location
Electronic vaulting
Transmission of backup data to offsite location
Remote disk replication
Continuous copying of data to offsite location
Transparent to host

Other methods of replication
Host-based mirroring
Network-based replication

Synchronous
All data written to cache of local and remote arrays before I/O is complete and acknowledged to host

Asynchronous
Write acknowledged after write to local array cache; changes (writes) are replicated to remote array asynchronously

Semi-synchronous
Write acknowledged with a single subsequent WRITE command pending from remote array

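The difference between the synchronous and asynchronous modes comes down to when the host receives its acknowledgment. The toy model below is a sketch under simplifying assumptions: arrays are plain Python lists, the link never fails, and the `ReplicatedVolume` class is invented for illustration rather than taken from any array vendor's software.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of a local array replicating writes to a remote array."""

    def __init__(self, mode):
        if mode not in ("sync", "async"):
            raise ValueError("mode must be 'sync' or 'async'")
        self.mode = mode
        self.local = []          # blocks held by the local array
        self.remote = []         # blocks held by the remote array
        self.pending = deque()   # acknowledged writes not yet at the remote site

    def write(self, block):
        """Accept a host write and return once it would be acknowledged."""
        self.local.append(block)
        if self.mode == "sync":
            # Synchronous: the remote array holds the block before the ack.
            self.remote.append(block)
        else:
            # Asynchronous: ack immediately, ship the block to the remote later.
            self.pending.append(block)
        return "ack"

    def drain(self):
        """Replicate every pending write to the remote array, in order."""
        while self.pending:
            self.remote.append(self.pending.popleft())


sync_vol = ReplicatedVolume("sync")
async_vol = ReplicatedVolume("async")
for block in ("w1", "w2", "w3"):
    sync_vol.write(block)
    async_vol.write(block)

# If the primary site were lost at this moment, pending writes would be lost.
exposed = len(async_vol.pending)
print(sync_vol.remote, async_vol.remote, exposed)
```

The writes still sitting in `pending` at the moment of a disaster are the data loss that the asynchronous mode accepts in exchange for not waiting on the remote site.
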
Synchronous Vs. Asynchronous Trade-Off

                          Synchronous                          Asynchronous
Application performance   Impact to application performance    No application performance impact
Distance                  Limited (are both sites within       Unlimited (second site outside
                          the same threat radius?)             threat radius)
Exposure to data loss     No data loss                         Possible data loss

Enterprises Must Evaluate the Trade-Offs

Maximum tolerable distance ascertained by assessing each application
Cost of data loss
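
A rough way to assess the maximum tolerable distance for synchronous replication is to estimate the latency each write gains from the inter-site link. The sketch below assumes about 5 microseconds of one-way propagation delay per kilometer of fiber and two round trips per write (the “2 x RTT” noted on the geo-cluster slide); it ignores equipment and queuing delay, and the function names are illustrative.

```python
FIBER_DELAY_US_PER_KM = 5.0  # assumed one-way propagation delay in optical fiber


def sync_write_penalty_ms(distance_km, round_trips=2):
    """Extra latency per synchronous write caused by distance alone."""
    rtt_us = 2 * distance_km * FIBER_DELAY_US_PER_KM
    return round_trips * rtt_us / 1000.0


def max_sync_distance_km(latency_budget_ms, round_trips=2):
    """Largest distance whose added write latency stays within the budget."""
    per_km_us = round_trips * 2 * FIBER_DELAY_US_PER_KM
    return latency_budget_ms * 1000.0 / per_km_us


# Hypothetical application that tolerates 2 ms of added latency per write.
print(sync_write_penalty_ms(100))   # added latency at 100 km, in milliseconds
print(max_sync_distance_km(2.0))    # distance that exhausts a 2 ms budget, in km
```

Under these assumptions every 100 km adds about 2 ms to each write, so an application's latency tolerance translates directly into a distance limit between the synchronously replicated sites.
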

Data Replication with DB Example

Control Files
• DB name
• creation date
• backup performed
• redo log time period
• datafile state
Control Files identify other files making up the database and record the content and state of the db.

Datafiles
• Tablespaces
• Indexes
• Data Dictionary
Datafile is only updated periodically

Redo Log Files
• Database changes
Redo logs record db changes resulting from transactions
Used to play back changes that may not have been written to datafile when failure occurred
Typically archived as they fill to local and DR site destinations

[Figure: Control Files identify the Datafiles and Redo Log Files; Redo Log Files record changes to the Datafiles.]

Data Replication with DB Example (Cont’d)

Failure or disaster occurs at time t1
• Media Failure (e.g. disk)
• Human Error (datafile deletion)
• Database Corruption

[Figure: timeline from t0 to t1. The hot backup of datafiles and control files is taken at time t0, followed by the archived redo logs and the online redo logs up to t1.]

Database restored to state at time of failure (time t1) by:
1. Restoring Control Files & Datafiles from last Hot Backup (time t0)
2. Sequentially replaying changes from subsequent Redo Logs (archived and online) – changes made between time t0 and t1

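The two recovery steps can be sketched with a dictionary standing in for the datafiles and a list of change records standing in for the redo logs. This is a simplified illustration: the record layout, the integer timestamps, and the `recover` function are assumptions, and a real database applies redo at the block level with its own tooling.

```python
def recover(backup, redo_log, failure_time):
    """Rebuild database state at failure_time from a backup plus redo records.

    backup: {"time": t0, "data": {...}} taken by the last hot backup.
    redo_log: list of {"time": t, "key": k, "value": v} change records.
    """
    # Step 1: restore the datafiles from the last hot backup (time t0).
    state = dict(backup["data"])
    # Step 2: replay, in order, every change made after t0 and up to t1.
    for entry in sorted(redo_log, key=lambda e: e["time"]):
        if backup["time"] < entry["time"] <= failure_time:
            state[entry["key"]] = entry["value"]
    return state


# Hypothetical timeline: backup at t0 = 100, failure at t1 = 200.
backup = {"time": 100, "data": {"balance": 50, "owner": "alice"}}
redo_log = [
    {"time": 90, "key": "balance", "value": 40},    # already in the backup
    {"time": 150, "key": "balance", "value": 75},   # archived redo log
    {"time": 190, "key": "owner", "value": "bob"},  # online redo log
    {"time": 210, "key": "balance", "value": 0},    # after the failure, ignored
]
restored = recover(backup, redo_log, failure_time=200)
print(restored)
```

Recovery is only as complete as the redo stream that survives the failure, which is why the redo logs are the files most often replicated synchronously to the remote site.
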
Data Replication with DB Example (Cont’d)

[Figure: primary site and secondary site connected by SAN extension transport.
Redo Logs (cyclic): copy of every committed transaction, synchronously replicated for zero loss.
Database: point-in-time copy taken at time t0 when the DB is quiescent, replicated/copied to the secondary site; earlier DB backups are also kept.
Archive Logs: replicated/copied to the secondary site.]

Mixture of sync and async replication technologies commonly used
Usually only redo logs sync replicated to remote site
Archive logs created from redo log and copied when redo log switches
Point in time (PiT) copies of datafiles and control files copied periodically (e.g. nightly)

Data Center Interconnection Options
[Figure: two data centers, each with Internet access, stateful firewalls, content caching, server load balancing, intrusion detection, a high-density multilayer LAN switch, front-end and back-end application servers, a high-density multilayer SAN director, and enterprise-class storage arrays. The sites are interconnected over SONET/SDH, DWDM/CWDM, and IP/Metro Ethernet.]

Increasing Distance: Data Center, Campus, Metro, Regional, National

Optical
Dark Fiber: Sync, limited by optics (power budget)
CWDM: Sync (2 Gbps), limited by optics (power budget)
DWDM: Sync (2 Gbps lambda), limited by BB_Credits
SONET/SDH: Sync (1 Gbps+ subrate), Async

IP
MDS9000 FCIP: Sync (Metro Eth), Async (1 Gbps+)
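
The BB_Credits limit can be estimated from the link length: to keep a Fibre Channel link full, the sender needs enough buffer-to-buffer credits to cover all frames in flight during one round trip. The sketch below is an approximation under stated assumptions: full-size frames of 2148 bytes, 8b/10b encoding at a 2.125 Gbaud line rate for a 2 Gbps link, and about 5 microseconds of propagation delay per kilometer. The function names are illustrative, and a vendor's sizing guidance takes precedence over this estimate.

```python
import math

FIBER_DELAY_US_PER_KM = 5.0  # assumed one-way propagation delay in fiber


def frame_time_us(line_rate_gbaud=2.125, frame_bytes=2148):
    """Time to serialize one frame, with 10 line bits per byte (8b/10b)."""
    line_bits = frame_bytes * 10
    return line_bits / (line_rate_gbaud * 1000.0)  # Gbaud -> bits per microsecond


def bb_credits_needed(distance_km, line_rate_gbaud=2.125, frame_bytes=2148):
    """Credits required to keep the link busy for one full round trip."""
    round_trip_us = 2 * distance_km * FIBER_DELAY_US_PER_KM
    return math.ceil(round_trip_us / frame_time_us(line_rate_gbaud, frame_bytes))


for km in (10, 50, 100):
    print(km, "km ->", bb_credits_needed(km), "credits")
```

With these assumptions the result is roughly one credit per kilometer at 2 Gbps; smaller frames or higher line rates increase the number of credits needed for the same distance.
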

Extend the normal reach of a Fibre Channel fabric
Remote host to target array
Shared data clusters

[Figure: a SAN extension network joining two FC fabrics, used for a shared data cluster or remote host access to storage, and for replication.]

Servers with two Fibre Channel connections to storage arrays for high availability
Use of multipath software is required in dual fabric host design
SAN extension fabrics typically separate from host access fabrics
Replication fabric requirements generally specified by array vendor

[Figure: FC fabrics and FC replication fabrics linked to Site B over the DC interconnect network.]

Disasters are characterized by their impact
Local, metro, regional, global
Fire, flood, earthquake, attack

Is the backup site within the threat radius?

[Figure: threat radii drawn around the data centers. Local: 1–2 km. Metro: < 50 km. Regional: < 400 km. Labels: Primary Data Center, Data Center, Secondary DR Site.]

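Whether a backup site falls inside a given threat radius can be checked from the coordinates of the two sites. The sketch below uses the haversine great-circle formula with a mean Earth radius of 6371 km; the coordinates are arbitrary test points, not real data center locations, and the function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius used by the haversine formula


def distance_km(site_a, site_b):
    """Great-circle distance between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, site_a)
    lat2, lon2 = map(math.radians, site_b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def within_threat_radius(primary, backup, radius_km):
    """True if a disaster of this radius centered on the primary also reaches the backup."""
    return distance_km(primary, backup) <= radius_km


# Arbitrary test points one degree of longitude apart on the equator.
primary, backup = (0.0, 0.0), (0.0, 1.0)
print(round(distance_km(primary, backup), 1))
print(within_threat_radius(primary, backup, 50))    # metro-scale disaster
print(within_threat_radius(primary, backup, 400))   # regional-scale disaster
```

A site that is safe from a metro-scale event can still sit inside a regional threat radius, which is why the deck pairs nearby high-availability sites with a more distant disaster recovery site.
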
Active/Standby Architecture - Today
[Figure: three sites. High Availability Site 1 (CA) and High Availability Site 2 (CA) host HA cluster(s) on Hosts 1 and Hosts 2, with synchronous replication over CWDM between MDS 9509s. The Disaster Recovery Site (NC) with Hosts 3 is reached over dual OC12 through MDS 9509 gateways, with synchronous FCIP replication, asynchronous FCIP replication, and electronic journaling. Storage 1, Storage 2 (bunker) and Storage 3 are EMC/DMX arrays holding PROD, D/R, redo and archive volumes, using SRDF, SRDF/A, TimeFinder PiT copies, and R1, R2 and BCV devices. Label: Triple Threat.]


Service Locator Group Data Centers
• GSS performs site (DC) selection according to pre-configured conditions, using ACE probes to track application health
• Requests are directed to the primary application or to the backup application
[Diagram: user FQDN request; ACE decrypts request, ACNS content engine caches pages, ACE routes request. DC1 clustered backend: Data X active, Data Y standby. DC2 clustered backend: Data X standby, Data Y active. Mirrored presentation layer and asynchronous replication between DC1 and DC2]
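
The site-selection behavior described on this slide can be sketched as a simple decision over probe results. This is only a conceptual illustration, not GSS or ACE configuration: the Site class, the healthy flag standing in for a probe result, and select_site are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    name: str
    healthy: bool  # stands in for the latest application health probe result


def select_site(primary: Site, backup: Site) -> Optional[str]:
    """Prefer the primary application; fall back to backup if its probe fails."""
    if primary.healthy:
        return primary.name
    if backup.healthy:
        return backup.name
    return None  # no site can serve the request


if __name__ == "__main__":
    dc1 = Site("DC1", healthy=True)
    dc2 = Site("DC2", healthy=True)
    print(select_site(dc1, dc2))  # DC1 while its probe succeeds
    dc1.healthy = False
    print(select_site(dc1, dc2))  # DC2 after the primary probe fails
```

In the diagram each data set has its own active and standby site, so a real deployment would make this decision per application rather than once per data center.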

• SANTap
  • Appliance based storage replication
  • Reliable copy of WRITE operations
  • SCSI-FCIP communication

• Continuous Data Protection
  • Automatic and continuous backups
  • Time Addressable Storage (TAS)
  • Any point-in-time recovery
  • Application based or network based

[Diagram: production servers, CDP appliance, SANTap, MDS SAN]
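
Any point-in-time recovery rests on keeping a time-ordered copy of every write. The sketch below shows the idea with an in-memory journal; the WriteJournal class, its method names, and the integer timestamps are assumptions for illustration and do not describe how SANTap or any CDP appliance is actually implemented.

```python
class WriteJournal:
    """Toy continuous-data-protection journal (hypothetical)."""

    def __init__(self):
        # Entries are appended in arrival order, assumed non-decreasing in time.
        self._entries = []

    def record(self, timestamp, block, data):
        """Store a copy of a write operation as it is observed."""
        self._entries.append((timestamp, block, data))

    def image_at(self, timestamp):
        """Rebuild the volume contents as they were at the given time."""
        image = {}
        for ts, block, data in self._entries:
            if ts > timestamp:
                break
            image[block] = data  # later writes to the same block overwrite earlier ones
        return image


if __name__ == "__main__":
    journal = WriteJournal()
    journal.record(1, "block0", "v1")
    journal.record(2, "block1", "x")
    journal.record(3, "block0", "corrupted")
    print(journal.image_at(2))  # state just before the bad write at t=3
```

Because every write is retained, recovery can target the moment just before a corruption, rather than the most recent scheduled snapshot.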

[Diagram: primary and secondary sites, each with a replication/CDP appliance attached through SANTap on an MDS switch, linked over dual OC12. Labels: EMC DMX arrays with PROD, BCV, D/R, Redo, and Arch volumes; SRDF/A; TAS/SATA storage; APiT copies]

[Diagram: GSS-1 and GSS-2; ACE-1, ACE-2, and ACE-3; Web/APP server farms in DC-1, DC-2, and DC-3; IP/optical network (CWDM/DWDM); FC; primary location and secondary location]
Summary - Design Details
• Data centers 1 and 2 are in the primary location, close enough to each other to provide DC HA for active/active access
• Data center 3 (DR) is farther than the tolerable disaster radius from primary DCs 1 and 2
• Web/App server farms are load balanced geographically
• DB servers are within a geo-HA cluster and running in an L3 design
• Synchronous data replication is done between data centers within the primary location
• Asynchronous data replication is done between the primary and secondary storage systems
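
The difference between the two replication modes in this summary comes down to when a write is acknowledged. The following sketch models that difference in memory; the ReplicatedVolume class and its methods are hypothetical, and real arrays add ordering, consistency groups, and link handling that are not shown here.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of sync versus async storage replication (hypothetical)."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.local = {}
        self.remote = {}
        self.pending = deque()  # writes acknowledged locally but not yet remote

    def write(self, block, data):
        self.local[block] = data
        if self.synchronous:
            # Acknowledge only after the remote copy is updated.
            self.remote[block] = data
        else:
            # Acknowledge immediately; ship the write later.
            self.pending.append((block, data))

    def drain(self, max_writes=None):
        """Send queued writes to the remote site; returns how many were sent."""
        sent = 0
        while self.pending and (max_writes is None or sent < max_writes):
            block, data = self.pending.popleft()
            self.remote[block] = data
            sent += 1
        return sent

    def writes_at_risk(self):
        """Writes that would be lost if the local site failed right now."""
        return len(self.pending)


if __name__ == "__main__":
    sync_vol = ReplicatedVolume(synchronous=True)
    async_vol = ReplicatedVolume(synchronous=False)
    for i in range(3):
        sync_vol.write(i, "data")
        async_vol.write(i, "data")
    print(sync_vol.writes_at_risk(), async_vol.writes_at_risk())
```

Synchronous mode never leaves acknowledged writes unprotected but makes every write wait for the remote round trip, which is why the design keeps it within the primary location and uses asynchronous replication to the distant DR site.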

