
Scalability, Availability & Stability Patterns

twitter: @jboner email: [email protected]

Jonas Bonér

Jayway Stockholm

Tuesday, May 11, 2010

Outline

Introduction

Scalability Patterns

Managing Overload

Scale up vs Scale out?

General recommendations

Immutability as the default
Referential Transparency (FP)
Laziness
Think about your data: different data need different guarantees

Scalability Trade-offs

Trade-offs

Latency vs Throughput
Availability vs Consistency
Performance vs Scalability

Performance vs Scalability

How do I know if I have a performance problem?

If your system is slow for a single user

How do I know if I have a scalability problem?

If your system is fast for a single user but slow under heavy load

Latency vs Throughput

You should strive for maximal throughput with acceptable latency

Availability vs Consistency

Brewer's CAP theorem

You can only pick two of:
Consistency
Availability
Partition tolerance
at a given point in time

Centralized system

In a centralized system (RDBMS etc.) we don't have network partitions, i.e. no P in CAP. So you get both:
Availability
Consistency

ACID
Atomic
Consistent
Isolated
Durable

Distributed system

In a distributed system we (will) have network partitions, i.e. the P in CAP. So you get to pick only one of:
Availability
Consistency

CAP in practice:

...there are only two types of systems:
1. CA == CP (they are equivalent)
2. AP

...there is only one choice to make: in case of a network partition, what do you sacrifice?
1. C: Consistency
2. A: Availability

BASE
Basically Available
Soft state
Eventually consistent

Eventual Consistency
...is an interesting trade-off

But let's get back to that later

Availability Patterns

Availability Patterns

Fail-over
Replication
  Master-Slave
  Tree replication
  Master-Master
  Buddy Replication

What do we mean by Availability?

Fail-over
(diagrams copyright Michael Nygard)

But fail-over is not always this simple

Fail-back

Network fail-over

Replication

Active replication - Push
Passive replication - Pull
  Data not available: read from peer, then store it locally
  Works well with timeout-based caches

Replication schemes:
Master-Slave replication
Tree Replication
Master-Master replication
Buddy replication

Master-Slave Replication

Tree Replication

Master-Master Replication

Buddy Replication

Scalability Patterns: State

Partitioning
HTTP Caching
RDBMS Sharding
NOSQL
Distributed Caching
Data Grids
Concurrency

Partitioning

HTTP Caching
Reverse Proxy:

Varnish
Squid
rack-cache
Pound
Nginx
Apache mod_proxy

HTTP Caching
CDN, Akamai

Generate Static Content
Precompute content:

Homegrown + cron or Quartz
Spring Batch
Gearman
Hadoop
Google Data Protocol
Amazon Elastic MapReduce

HTTP Caching
First request
(diagram copyright Ryan Tomayko)

HTTP Caching
Subsequent request
(diagram copyright Ryan Tomayko)

Service of Record (SoR)

Relational Databases (RDBMS)
NOSQL Databases

How to scale out RDBMS?

Sharding
Partitioning
Replication

Sharding: Partitioning

Sharding: Replication

ORM + rich domain model anti-pattern

Attempt: read an object from DB
Result: you sit with your whole database in your lap

Think about your data
Think again

When do you need ACID?
When is Eventually Consistent a better fit?
Different kinds of data have different needs

When is a RDBMS good enough?

Scaling reads to a RDBMS is not hard

Scaling writes to a RDBMS is impossible

Do we really need a RDBMS?

Do we really need a RDBMS?
Sometimes...

But many times we don't

NOSQL (Not Only SQL)

NOSQL
Key-Value databases
Column databases
Document databases
Graph databases
Datastructure databases

Who's ACID?

Relational DBs (MySQL, Oracle, Postgres)
Object DBs (Gemstone, db4o)
Clustering products (Coherence, Terracotta)
Most caching products (ehcache)

Who's BASE?
Distributed databases:
Cassandra
Riak
Voldemort
Dynomite
SimpleDB
etc.

NOSQL in the wild

Google: Bigtable
Amazon: Dynamo
Amazon: SimpleDB
Yahoo: HBase
Microsoft: Dynomite
Facebook: Cassandra
LinkedIn: Voldemort

But first some background...

Chord & Pastry

Distributed Hash Tables (DHT)
  Scalable
  Partitioned
  Fault-tolerant
  Decentralized
  Peer to peer
Popularized:
  Node ring
  Consistent Hashing

Node ring with Consistent Hashing
Find data in log(N) jumps
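A minimal sketch of the consistent-hashing idea behind the node ring: each node gets a position on a ring, and a key is stored on the first node found going clockwise from the key's hash. This toy version keeps the whole ring in one local TreeMap (the class name HashRing and the use of String hashCode are illustrative choices, not from the deck; Chord-style systems additionally route between peers in log(N) hops).

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hash ring: a key maps to the first node clockwise.
class HashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(String node) {
        // mask the sign bit so positions are non-negative
        ring.put(node.hashCode() & 0x7fffffff, node);
    }

    String nodeFor(String key) {
        int h = key.hashCode() & 0x7fffffff;
        // tailMap finds the next node clockwise; wrap around if none
        SortedMap<Integer, String> tail = ring.tailMap(h);
        return tail.isEmpty() ? ring.firstEntry().getValue()
                              : tail.get(tail.firstKey());
    }
}
```

The payoff is that adding or removing one node only remaps the keys in its arc of the ring, instead of rehashing everything as a naive `hash mod N` scheme would.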



Bigtable
How can we build a DB on top of Google File System?
Paper: "Bigtable: A distributed storage system for structured data", 2006
Rich data-model, structured storage
Clones: HBase, Hypertable, Neptune

Dynamo
How can we build a distributed hash table for the data center?
Paper: "Dynamo: Amazon's highly available key-value store", 2007
Focus: partitioning, replication and availability
Eventually Consistent
Clones: Voldemort, Dynomite

Types of NOSQL stores

Key-Value databases (Voldemort, Dynomite)
Column databases (Cassandra, Vertica)
Document databases (MongoDB, CouchDB)
Graph databases (Neo4J, AllegroGraph)
Datastructure databases (Redis, Hazelcast)

Distributed Caching

Write-through
Write-behind
Eviction Policies
Replication
Peer-To-Peer (P2P)

Write-through

Write-behind

Eviction policies

TTL (time to live)
Bounded FIFO (first in first out)
Bounded LIFO (last in first out)
Explicit cache invalidation
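The bounded FIFO policy above can be sketched in a few lines on top of the JDK: LinkedHashMap in insertion order plus an overridden removeEldestEntry gives a cache that evicts the oldest entry once capacity is reached (the class name FifoCache is a hypothetical example, not a product from the deck).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Bounded FIFO cache: evicts the oldest entry once capacity is exceeded.
class FifoCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    FifoCache(int capacity) {
        super(16, 0.75f, false); // accessOrder=false -> insertion order = FIFO
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // returning true tells LinkedHashMap to drop the eldest entry
        return size() > capacity;
    }
}
```

Passing accessOrder=true to the same constructor would instead give an LRU policy, which is how many in-process caches are built.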
Peer-To-Peer

Decentralized
No special or blessed nodes
Nodes can join and leave as they please

Distributed Caching
Products:
EHCache
JBoss Cache
OSCache
memcached

memcached

Very fast
Simple
Key-Value (string -> binary)
Clients for most languages
Distributed
Not replicated - so 1/N chance of local access in cluster

Data Grids / Clustering

Data Grids/Clustering
Parallel data storage:
Data replication
Data partitioning
Continuous availability
Data invalidation
Fail-over
C + A in CAP

Data Grids/Clustering
Products:
Coherence
Terracotta
GigaSpaces
GemStone
Hazelcast
Infinispan

Concurrency

Shared-State Concurrency
Message-Passing Concurrency
Dataflow Concurrency
Software Transactional Memory

Shared-State Concurrency

Everyone can access anything anytime
Totally indeterministic
Introduce determinism at well-defined places...
...using locks

Problems with locks:
Locks do not compose
Taking too few locks
Taking too many locks
Taking the wrong locks
Taking locks in the wrong order
Error recovery is hard

Shared-State Concurrency
Please use java.util.concurrent.*:
ConcurrentHashMap
BlockingQueue
ConcurrentQueue
ExecutorService
ReentrantReadWriteLock
CountDownLatch
ParallelArray
and much much more...
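As a small illustration of leaning on java.util.concurrent instead of hand-rolled locks: several worker threads update one ConcurrentHashMap while a CountDownLatch signals completion. This is a sketch, not from the deck; it also uses ConcurrentHashMap.merge, which arrived later (Java 8) than this 2010 talk.

```java
import java.util.concurrent.*;

// Count words from several threads with no explicit locks:
// merge() is an atomic per-key read-modify-write,
// and the CountDownLatch lets the caller await completion.
class ConcurrentCount {
    static ConcurrentHashMap<String, Integer> count(String[] words, int threads)
            throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(words.length);
        for (String w : words) {
            pool.execute(() -> {
                counts.merge(w, 1, Integer::sum); // atomic increment per key
                done.countDown();
            });
        }
        done.await();   // block until every task has run
        pool.shutdown();
        return counts;
    }
}
```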

Message-Passing Concurrency

Actors
Originates in a 1973 paper by Carl Hewitt
Implemented in Erlang, Occam, Oz
Encapsulates state and behavior
Closer to the definition of OO than classes

Actors
Share NOTHING
Isolated lightweight processes
Communicates through messages
Asynchronous and non-blocking
No shared state - hence, nothing to synchronize
Each actor has a mailbox (message queue)
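The mailbox idea above can be sketched as a toy actor: one thread drains a queue and applies the behavior to one message at a time, so the actor's state is never touched concurrently. This hypothetical TinyActor is only an illustration of the model; real actor libraries (Akka etc.) multiplex many actors over few threads instead of one thread each.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// A toy actor: a mailbox (queue) plus a single thread that processes
// messages one at a time -- so the behavior needs no locks.
class TinyActor<M> {
    private final BlockingQueue<M> mailbox = new LinkedBlockingQueue<>();

    TinyActor(Consumer<M> behavior) {
        Thread loop = new Thread(() -> {
            try {
                while (true) behavior.accept(mailbox.take());
            } catch (InterruptedException e) {
                // actor stopped
            }
        });
        loop.setDaemon(true);
        loop.start();
    }

    void send(M msg) { mailbox.offer(msg); } // asynchronous, non-blocking
}
```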
Actors
Easier to reason about
Raised abstraction level
Easier to avoid:
  Race conditions
  Deadlocks
  Starvation
  Live locks

Actor libs for the JVM

Akka (Java/Scala)
scalaz actors (Scala)
Lift Actors (Scala)
Scala Actors (Scala)
Kilim (Java)
Jetlang (Java)
Actor's Guild (Java)
Actorom (Java)
FunctionalJava (Java)
GPars (Groovy)

Dataflow Concurrency

Declarative
No observable non-determinism
Data-driven - threads block until data is available
On-demand, lazy
No difference between concurrent & sequential code
Limitations: can't have side-effects

Software Transactional Memory (STM)

STM: overview
See the memory (heap and stack) as a transactional dataset
Similar to a database:
  begin
  commit
  abort/rollback
Transactions are retried automatically upon collision
Rolls back the memory on abort

STM: overview
Transactions can nest
Transactions compose (yippee!!)

atomic {
  ...
  atomic {
    ...
  }
}

STM: restrictions
All operations in scope of a transaction need to be idempotent

STM libs for the JVM

Akka (Java/Scala)
Multiverse (Java)
Clojure STM (Clojure)
CCSTM (Scala)
Deuce STM (Java)

Scalability Patterns: Behavior

Event-Driven Architecture
Compute Grids
Load-balancing
Parallel Computing

Event-Driven Architecture
"Four years from now, mere mortals will begin to adopt an event-driven architecture (EDA) for the sort of complex event processing that has been attempted only by software gurus [until now]"
-- Roy Schulte (Gartner), 2003

Event-Driven Architecture

Domain Events
Event Sourcing
Command and Query Responsibility Segregation (CQRS) pattern
Event Stream Processing
Messaging
Enterprise Service Bus
Actors
Enterprise Integration Architecture (EIA)

Domain Events
"It's really become clear to me in the last couple of years that we need a new building block and that is the Domain Events."
-- Eric Evans, 2009

Domain Events
"Domain Events represent the state of entities at a given time when an important event occurred and decouple subsystems with event streams. Domain Events give us clearer, more expressive models in those cases."
-- Eric Evans, 2009

Domain Events
"State transitions are an important part of our problem space and should be modeled within our domain."
-- Greg Young, 2008

Event Sourcing

Every state change is materialized in an Event
All Events are sent to an EventProcessor
EventProcessor stores all events in an Event Log
System can be reset and Event Log replayed
No need for ORM, just persist the Events
Many different EventListeners can be added to the EventProcessor (or listen directly on the Event log)
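The append-then-replay cycle above can be sketched in a few lines: state changes are stored as events, and current state is rebuilt at any time by replaying the log into a listener (the EventLog class here is a minimal illustration, not an API from the deck).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Event sourcing sketch: every state change is an event appended to a log;
// current state can be rebuilt at any time by replaying the log in order.
class EventLog<E> {
    private final List<E> events = new ArrayList<>();

    void append(E event) { events.add(event); }

    // Reset and replay: feed every stored event to a listener, oldest first.
    void replay(Consumer<E> listener) {
        for (E e : events) listener.accept(e);
    }
}
```

Because the log is the source of truth, any number of listeners (reporting views, audit trails, integrations) can be derived from it, which is exactly what makes the pattern pair well with CQRS below.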

Event Sourcing

Command and Query Responsibility Segregation (CQRS) pattern
"A single model cannot be appropriate for reporting, searching and transactional behavior."
-- Greg Young, 2008

Bidirectional

Unidirectional

CQRS in a nutshell

All state changes are represented by Domain Events
Aggregate roots receive Commands and publish Events
Reporting (query database) is updated as a result of the published Events
All Queries from Presentation go directly to Reporting, and the Domain is not involved

CQRS
(diagram copyright the Axon Framework)

CQRS: Benefits
Fully encapsulated domain that only exposes behavior
Queries do not use the domain model
No object-relational impedance mismatch
Bullet-proof auditing and historical tracing
Easy integration with external systems
Performance and scalability

Event Stream Processing

select * from Withdrawal(amount >= 200).win:length(5)

Products:
Esper (Open Source)
StreamBase
RuleCast

Messaging

Publish-Subscribe
Point-to-Point
Store-forward
Request-Reply

Publish-Subscribe

Point-to-Point

Store-Forward
Durability, event log, auditing etc.

Request-Reply
E.g. AMQP's replyTo header

Messaging

Standards:
AMQP
JMS

Products:
RabbitMQ (AMQP)
ActiveMQ (JMS)
Tibco
MQSeries
etc.

ESB

ESB products
ServiceMix (Open Source)
Mule (Open Source)
Open ESB (Open Source)
Sonic ESB
WebSphere ESB
Oracle ESB
Tibco
BizTalk Server

Actors

Fire-forget: async send
Fire-And-Receive-Eventually: async send + wait on Future for reply

Enterprise Integration Patterns

Enterprise Integration Patterns

Apache Camel
More than 80 endpoints
XML (Spring) DSL
Scala DSL

Compute Grids

Parallel execution
Divide and conquer:
1. Split up job in independent tasks
2. Execute tasks in parallel
3. Aggregate and return result
MapReduce - Master/Worker

Compute Grids
Parallel execution:
Automatic provisioning
Load balancing
Fail-over
Topology resolution

Compute Grids
Products:
Platform
DataSynapse
Google MapReduce
Hadoop
GigaSpaces
GridGain

Load balancing

Random allocation
Round robin allocation
Weighted allocation
Dynamic load balancing:
  Least connections
  Least server CPU
  etc.
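Round robin allocation, the simplest policy above, can be sketched with an atomic counter so concurrent callers still rotate fairly (the RoundRobin class is a hypothetical illustration, not one of the products listed later).

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Round-robin allocation: hand out servers in rotation, thread-safely.
class RoundRobin {
    private final List<String> servers;
    private final AtomicLong next = new AtomicLong();

    RoundRobin(List<String> servers) { this.servers = servers; }

    String pick() {
        // getAndIncrement is atomic, so two threads never skip or repeat a slot
        return servers.get((int) (next.getAndIncrement() % servers.size()));
    }
}
```

Weighted allocation is the same idea with each server repeated in the list proportionally to its weight; dynamic policies instead consult live metrics such as open connections.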

Load balancing

DNS Round Robin (simplest): ask DNS for IP for host, get a new IP every time
Reverse Proxy (better)
Hardware Load Balancing

Load balancing products

Reverse Proxies:
Apache mod_proxy (OSS)
HAProxy (OSS)
Squid (OSS)
Nginx (OSS)

Hardware Load Balancers:
BIG-IP
Cisco

Parallel Computing

SPMD Pattern
Master/Worker Pattern
Loop Parallelism Pattern
Fork/Join Pattern
MapReduce Pattern

UE: Unit of Execution
  Process
  Thread
  Coroutine
  Actor

SPMD Pattern
Single Program Multiple Data
Very generic pattern, used in many other patterns
Use a single program for all the UEs
Use the UE's ID to select different pathways through the program, e.g.:
  Branching on ID
  Use ID in loop index to split loops
Keep interactions between UEs explicit

Master/Worker

Good scalability
Automatic load-balancing
How to detect termination?
  Bag of tasks is empty
  Poison pill
If we bottleneck on a single queue?
  Use multiple work queues
  Work stealing
What about fault tolerance?
  Use in-progress queue
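The poison-pill termination scheme above can be sketched with a shared BlockingQueue: workers pull tasks until they see a sentinel object, and the master puts one pill per worker on the queue (the MasterWorker class is an illustrative sketch, assuming tasks are plain Runnables).

```java
import java.util.concurrent.BlockingQueue;

// Master/Worker with a shared task queue; workers stop on a "poison pill".
class MasterWorker {
    static final Runnable POISON = () -> {}; // sentinel, compared by identity

    static void run(BlockingQueue<Runnable> tasks, int workers)
            throws InterruptedException {
        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    while (true) {
                        Runnable t = tasks.take();
                        if (t == POISON) return; // termination signal
                        t.run();
                    }
                } catch (InterruptedException e) {
                    // worker interrupted
                }
            });
            pool[i].start();
        }
        // one pill per worker, queued after the real tasks
        for (int i = 0; i < workers; i++) tasks.put(POISON);
        for (Thread t : pool) t.join();
    }
}
```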
Loop Parallelism
Workflow:
1. Find the loops that are bottlenecks
2. Eliminate coupling between loop iterations
3. Parallelize the loop

If too few iterations to pull its weight:
  Merge loops
  Coalesce nested loops

OpenMP: omp parallel for

What if task creation can't be handled by:
  parallelizing loops (Loop Parallelism)
  putting them on work queues (Master/Worker)

Enter Fork/Join

Fork/Join

Use when relationship between tasks is simple
Good for recursive data processing
Can use work-stealing
1. Fork: tasks are dynamically created
2. Join: tasks are later terminated and data aggregated

Fork/Join
Direct task/UE mapping:
  1-1 mapping between Task/UE
  Problem: dynamic UE creation is expensive
Indirect task/UE mapping:
  Pool the UEs
  Control (constrain) the resource allocation
  Automatic load balancing

Fork/Join
Java 7 ParallelArray (Fork/Join DSL)

ParallelArray students = new ParallelArray(fjPool, data);
double bestGpa = students.withFilter(isSenior)
                         .withMapping(selectGpa)
                         .max();
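For reference, ParallelArray itself was dropped from Java 7, but the underlying fork/join framework did ship. The recursive split-fork-join structure the slide describes looks like this with the standard ForkJoinPool and RecursiveTask (the SumTask example and its 1000-element threshold are illustrative choices, not from the deck):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Fork/Join: recursively split the work, fork one half, join the results.
class SumTask extends RecursiveTask<Long> {
    private final long[] data;
    private final int lo, hi;

    SumTask(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= 1000) {          // small enough: compute sequentially
            long s = 0;
            for (int i = lo; i < hi; i++) s += data[i];
            return s;
        }
        int mid = (lo + hi) / 2;
        SumTask left = new SumTask(data, lo, mid);
        left.fork();                    // run left half asynchronously
        long right = new SumTask(data, mid, hi).compute();
        return left.join() + right;     // join: aggregate the results
    }
}
```

The pool's work-stealing scheduler is what makes the "indirect task/UE mapping" on the previous slide cheap: many small tasks, few threads.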


MapReduce
Origin: Google paper, 2004
Used internally @ Google
Variation of Fork/Join
Work divided upfront, not dynamically
Usually distributed
Normally used for massive data crunching
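MapReduce in miniature, as a single-process sketch (real systems like Hadoop distribute both phases across machines): map each word to a (word, 1) pair, then reduce by key. The WordCount class is a hypothetical illustration using Java streams.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// MapReduce in miniature: map words to (word, 1), then reduce by key.
class WordCount {
    static Map<String, Long> count(List<String> lines) {
        return lines.stream()
                // map phase: emit every word
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                // shuffle + reduce phase: group by word and count
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }
}
```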

MapReduce
Products:
Hadoop (OSS), used @ Yahoo
Amazon Elastic MapReduce
Many NOSQL DBs utilize it for searching/querying

Parallel Computing products

MPI
OpenMP
JSR166 Fork/Join
java.util.concurrent
  ExecutorService, BlockingQueue etc.
ProActive Parallel Suite
CommonJ WorkManager (JEE)

Stability Patterns

Stability Patterns

Timeouts
Circuit Breaker
Let-it-crash
Fail fast
Bulkheads
Steady State
Throttling

Timeouts
Always use timeouts (if possible):

Thread.wait(timeout)
reentrantLock.tryLock(timeout, timeUnit)
blockingQueue.poll(timeout, timeUnit) / offer(..)
futureTask.get(timeout, timeUnit)
socket.setSoTimeout(timeout)
etc.
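The Future.get(timeout, unit) variant from the list above in context: submit a possibly-slow task, wait a bounded time, and fall back (and cancel the task) on timeout. The helper name callWithTimeout and the "fallback" value are illustrative choices for this sketch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Guard a slow call with a timeout via Future.get(timeout, unit).
class TimeoutDemo {
    static String callWithTimeout(Callable<String> task, long millis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> f = pool.submit(task);
        try {
            return f.get(millis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            f.cancel(true);        // interrupt the hung task
            return "fallback";     // degrade instead of blocking forever
        } finally {
            pool.shutdownNow();
        }
    }
}
```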

Circuit Breaker
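A minimal sketch of the circuit breaker state machine: count consecutive failures, open (reject calls) once a threshold is hit, close again on a successful probe. Real breakers (as described in Release It!) add a HALF-OPEN state with a timeout before probing; this illustrative CircuitBreaker class omits that for brevity.

```java
// Minimal circuit breaker: opens after a failure threshold,
// rejects calls while open, closes again on a recorded success.
class CircuitBreaker {
    enum State { CLOSED, OPEN }

    private State state = State.CLOSED;
    private int failures = 0;
    private final int threshold;

    CircuitBreaker(int threshold) { this.threshold = threshold; }

    boolean allowRequest() { return state == State.CLOSED; }

    void recordSuccess() {
        failures = 0;
        state = State.CLOSED;
    }

    void recordFailure() {
        if (++failures >= threshold) state = State.OPEN; // trip the breaker
    }
}
```

The point is to fail fast against a known-bad dependency instead of letting every caller block on it and cascade the outage.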

Let it crash
Embrace failure as a natural state in the life-cycle of the application
Instead of trying to prevent it; manage it
Process supervision
Supervisor hierarchies (from Erlang)

Restart Strategy
OneForOne

Restart Strategy
AllForOne

Supervisor Hierarchies

Fail fast
Avoid slow responses
Separate:
  SystemError - resources not available
  ApplicationError - bad user input etc.
Verify resource availability before starting an expensive task
Input validation immediately

Bulkheads

Partition and tolerate failure in one part
Redundancy
Applies to threads as well:
  One pool for admin tasks, to be able to perform tasks even though all other threads are blocked

Steady State
Clean up after yourself
Logging:
  RollingFileAppender (log4j)
  logrotate (Unix)
  Scribe - server for aggregating streaming log data
Always put logs on a separate disk

Throttling
Maintain a steady pace
Count requests
If limit reached, back off (drop, raise error)
Queue requests
Used in for example Staged Event-Driven Architecture (SEDA)
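The count-requests variant of throttling can be sketched as a fixed-window counter: allow up to N requests per window, reject (so the caller backs off) beyond that. The Throttle class here is an illustrative sketch; production limiters usually prefer token buckets or sliding windows to avoid bursts at window boundaries.

```java
// Throttling by counting requests in a fixed time window;
// beyond the limit the caller must back off (drop or raise an error).
class Throttle {
    private final int limit;
    private final long windowMillis;
    private long windowStart;
    private int count;

    Throttle(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.windowStart = System.currentTimeMillis();
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) { // new window: reset counter
            windowStart = now;
            count = 0;
        }
        return ++count <= limit;
    }
}
```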

Upcoming seminars

June 7 - Akka
? - Event-Driven Architecture

?

Thanks for listening

Extra material

Client-side consistency
Strong consistency
Weak consistency
Eventually consistent
Never consistent

Client-side Eventual Consistency levels
Causal consistency
Read-your-writes consistency (important)
Session consistency
Monotonic read consistency (important)
Monotonic write consistency

Server-side consistency
N = the number of nodes that store replicas of the data
W = the number of replicas that need to acknowledge the receipt of the update before the update completes
R = the number of replicas that are contacted when a data object is accessed through a read operation

Server-side consistency
W + R > N: strong consistency
W + R <= N: eventual consistency
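The quorum rule above as executable arithmetic: with N replicas, W write acks, and R read contacts, W + R > N forces every read set to overlap the latest write set, so a read sees the newest value (the Quorum class is a trivial illustration of the inequality, not a product API).

```java
// Quorum arithmetic for replicated stores: W + R > N guarantees that
// a read quorum overlaps the most recent write quorum.
class Quorum {
    static boolean isStronglyConsistent(int n, int w, int r) {
        return w + r > n;
    }
}
```

For example, N=3, W=2, R=2 (a common Dynamo-style configuration) is strongly consistent, while N=3, W=1, R=1 is only eventually consistent.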

