Scalability, Availability & Stability Patterns
Jonas Bonér
Jayway Stockholm
Outline
Introduction
Scalability Patterns
Managing Overload
General recommendations
Immutability as the default Referential Transparency (FP) Laziness Think about your data:
Scalability Trade-offs
Trade-offs
Performance vs Scalability
Latency vs Throughput
maximal throughput with acceptable latency
Tuesday, May 11, 2010
Availability vs Consistency
Brewer's CAP theorem
Centralized system
In a centralized system (RDBMS etc.) we don't have network partitions, i.e. the P in CAP
So you get both:
Availability
Consistency
Distributed system
In a distributed system we (will) have network partitions, i.e. the P in CAP
Availability
Consistency
CAP in practice:
Eventual Consistency
...is an interesting trade-off
Availability Patterns
Fail-over Replication
Master-Slave Tree replication Master-Master Buddy Replication
Fail-over
Fail-back
Network fail-over
Replication
Master-Slave Replication
Tree Replication
Master-Master Replication
Buddy Replication
Partitioning
HTTP Caching
Reverse Proxy
HTTP Caching
CDN, Akamai
Homegrown + cron or Quartz Spring Batch Gearman Hadoop Google Data Protocol Amazon Elastic MapReduce
HTTP Caching
First request
Subsequent request
HTTP Caching
Service of Record
Sharding
Replication
Partitioning
Sharding: Partitioning
Sharding: Replication
anti-pattern
Attempt: Read an object from DB
Result: You sit with your whole database in your lap
When do you need ACID?
When is Eventually Consistent a better fit?
Different kinds of data have different needs
not
Scaling reads to a RDBMS is hard
Scaling writes to a RDBMS is impossible
NOSQL
Key-Value databases Column databases Document databases Graph databases Datastructure databases
Who's ACID?
Relational DBs (MySQL, Oracle, Postgres)
Object DBs (Gemstone, db4o)
Clustering products (Coherence, Terracotta)
Who's BASE?
Distributed databases
Bigtable
How can we build a DB on top of Google File System?
Paper: Bigtable: A distributed storage system for structured data, 2006
Rich data-model, structured storage
Clones: HBase, Hypertable, Neptune
Dynamo
How can we build a distributed hash table for the data center?
Paper: Dynamo: Amazon's highly available key-value store, 2007
Focus: partitioning, replication and availability
Eventually Consistent
Clones: Voldemort, Dynomite
Key-Value databases (Voldemort, Dynomite)
Column databases (Cassandra, Vertica)
Document databases (MongoDB, CouchDB)
Graph databases (Neo4J, AllegroGraph)
Datastructure databases (Redis, Hazelcast)
Distributed Caching
Write-through Write-behind Eviction Policies Replication Peer-To-Peer (P2P)
Write-through
Write-behind
Eviction policies
TTL (time to live)
Bounded FIFO (first in first out)
Bounded LIFO (last in first out)
Explicit cache invalidation
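As an illustration of the bounded FIFO policy, here is a minimal sketch built on `java.util.LinkedHashMap`. The class name `BoundedFifoCache` and the `maxEntries` parameter are hypothetical, and the sketch is not thread-safe; a real distributed cache would add TTL handling, invalidation and synchronization.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded FIFO cache: once maxEntries is exceeded,
// the entry that was inserted first is evicted.
class BoundedFifoCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedFifoCache(int maxEntries) {
        // accessOrder = false keeps insertion order, which gives FIFO eviction
        super(16, 0.75f, false);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after every insertion
        return size() > maxEntries;
    }
}
```

Passing `true` for `accessOrder` instead would turn the same class into an LRU cache, which is a different policy from the ones listed above.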
Peer-To-Peer
Decentralized No special or blessed nodes Nodes can join and leave as they please
Distributed Caching
Products
EHCache JBoss Cache OSCache memcached
memcached
Very fast
Simple
Key-Value (string -> binary)
Clients for most languages
Distributed
Not replicated - so 1/N chance for local access in cluster
Data Grids/Clustering
Parallel data storage
Data replication Data partitioning Continuous availability Data invalidation Fail-over C + A in CAP
Data Grids/Clustering
Products
Concurrency
Shared-State Concurrency
Message-Passing Concurrency
Dataflow Concurrency
Software Transactional Memory
Shared-State Concurrency
Everyone can access anything anytime
Totally indeterministic
Introduce determinism at well-defined places...
...using locks
Shared-State Concurrency
Problems with locks:
Locks do not compose Taking too few locks Taking too many locks Taking the wrong locks Taking locks in the wrong order Error recovery is hard
Shared-State Concurrency
Please use java.util.concurrent.*
ConcurrentHashMap
BlockingQueue
ConcurrentLinkedQueue
ExecutorService
ReentrantReadWriteLock
CountDownLatch
ParallelArray
and much much more..
Message-Passing Concurrency
Actors
Originates in a 1973 paper by Carl Hewitt
Implemented in Erlang, Occam, Oz
Encapsulates state and behavior
Closer to the definition of OO than classes
Actors
Share NOTHING
Isolated lightweight processes
Communicates through messages
Asynchronous and non-blocking
No shared state, hence nothing to synchronize
Each actor has a mailbox (message queue)
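To make the mailbox idea concrete, here is a minimal sketch of an actor in plain Java. It is not how Erlang or any actor library implements actors; `CounterActor`, its `STOP` marker and its methods are hypothetical names, and one thread per actor is used only for brevity (real actors are far lighter than threads).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical minimal actor: private state, a mailbox, one thread draining it.
class CounterActor {
    private static final Object STOP = new Object();
    private final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();
    private final CompletableFuture<Integer> result = new CompletableFuture<>();
    private int count = 0; // touched only by the actor's own thread, so no locks

    static CounterActor start() {
        CounterActor actor = new CounterActor();
        Thread thread = new Thread(actor::processMailbox, "counter-actor");
        thread.setDaemon(true);
        thread.start();
        return actor;
    }

    // Asynchronous, non-blocking send: enqueue the message and return at once.
    void send(Object message) {
        mailbox.add(message);
    }

    // Ask the actor to stop; the future completes with the final count.
    CompletableFuture<Integer> stop() {
        mailbox.add(STOP);
        return result;
    }

    private void processMailbox() {
        try {
            while (true) {
                Object message = mailbox.take(); // one message at a time, in order
                if (message == STOP) {
                    result.complete(count);
                    return;
                }
                if (message instanceof Integer) {
                    count += (Integer) message;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            result.completeExceptionally(e);
        }
    }
}
```

Callers only ever touch the mailbox, for example `actor.send(1)` followed by `actor.stop().join()`, so the counter itself needs no synchronization.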
Actors
Easier to reason about Raised abstraction level Easier to avoid Race conditions Deadlocks Starvation Live locks
Dataflow Concurrency
Declarative
No observable non-determinism
Data-driven: threads block until data is available
On-demand, lazy
No difference between: Concurrent & Sequential code
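A rough way to get the feel of dataflow variables on the JVM is `java.util.concurrent.CompletableFuture`, used here as a single-assignment variable. This is only an approximation of the model described above (Oz-style dataflow is richer), and `DataflowSketch` is a hypothetical name.

```java
import java.util.concurrent.CompletableFuture;

class DataflowSketch {
    static int compute() {
        // Unbound single-assignment variables
        CompletableFuture<Integer> x = new CompletableFuture<>();
        CompletableFuture<Integer> y = new CompletableFuture<>();

        // z is declared in terms of x and y before either has a value
        CompletableFuture<Integer> z = x.thenCombine(y, Integer::sum);

        // Bind the inputs from other threads, in arbitrary order
        new Thread(() -> y.complete(2)).start();
        new Thread(() -> x.complete(40)).start();

        // Blocks until both inputs are bound; the result does not depend
        // on which thread ran first
        return z.join();
    }
}
```

The result is the same however the two binding threads are scheduled, which is the "no observable non-determinism" property in miniature.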
STM: overview
See the memory (heap and stack) as a transactional dataset
Similar to a database:
begin
commit
abort/rollback
STM: overview
Transactions can nest
Transactions compose (yipee!!)
atomic { ... atomic { ... } }
STM: restrictions
All operations in scope of a transaction:
Need to be idempotent
(Java/Scala) Multiverse (Java) Clojure STM (Clojure) CCSTM (Scala) Deuce STM (Java)
Event-Driven Architecture
Four years from now, mere mortals will begin to adopt an event-driven architecture (EDA) for the sort of complex event processing that has been attempted only by software gurus [until now] --Roy Schulte (Gartner), 2003
Event-Driven Architecture
Domain Events Event Sourcing Command and Query Responsibility Segregation (CQRS) pattern Event Stream Processing Messaging Enterprise Service Bus Actors Enterprise Integration Architecture (EIA)
Domain Events
It's really become clear to me in the last couple of years that we need a new building block and that is the Domain Events -- Eric Evans, 2009
Domain Events
Domain Events represent the state of entities at a given time when an important event occurred and decouple subsystems with event streams. Domain Events give us clearer, more expressive models in those cases. -- Eric Evans, 2009
Domain Events
State transitions are an important part of our problem space and should be modeled within our domain. -- Greg Young, 2008
Event Sourcing
Every state change is materialized in an Event All Events are sent to an EventProcessor EventProcessor stores all events in an Event Log System can be reset and Event Log replayed No need for ORM, just persist the Events Many different EventListeners can be added to
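The replay idea from the list above can be sketched in a few lines. All names here (`BalanceChanged`, `Account`, `EventLog`) are hypothetical, and the log is kept in memory; a real event log would be durable and append-only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain event: an amount deposited (positive) or withdrawn (negative).
final class BalanceChanged {
    final long delta;

    BalanceChanged(long delta) {
        this.delta = delta;
    }
}

// State is never stored directly; it is derived by applying events.
class Account {
    private long balance = 0;

    void apply(BalanceChanged event) {
        balance += event.delta;
    }

    long balance() {
        return balance;
    }
}

// In-memory stand-in for the Event Log.
class EventLog {
    private final List<BalanceChanged> events = new ArrayList<>();

    void append(BalanceChanged event) {
        events.add(event);
    }

    // Reset the system and replay the log to rebuild current state.
    Account replay() {
        Account account = new Account();
        for (BalanceChanged event : events) {
            account.apply(event);
        }
        return account;
    }
}
```

Because only events are persisted, no object-relational mapping of `Account` is needed; the account is rebuilt from the log whenever required.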
Event Sourcing
Bidirectional
Unidirectional
CQRS in a nutshell
All state changes are represented by Domain Events Aggregate roots receive Commands and publish Events
CQRS
CQRS: Benefits
Fully encapsulated domain that only exposes behavior
Queries do not use the domain model
No object-relational impedance mismatch
Bullet-proof auditing and historical tracing
Easy integration with external systems
Performance and scalability
select * from Withdrawal(amount>=200).win:length(5)
Messaging
Publish-Subscribe
Point-to-Point
Store-Forward
Durability, event log, auditing etc.
Request-Reply
E.g. AMQP's replyTo header
Messaging
Standards: AMQP, JMS
Products: RabbitMQ (AMQP), ActiveMQ (JMS), Tibco, MQSeries etc.
ESB
ESB products
ServiceMix (Open Source) Mule (Open Source) Open ESB (Open Source) Sonic ESB WebSphere ESB Oracle ESB Tibco BizTalk Server
Actors
Fire-forget
Fire-And-Receive-Eventually
Async send
Compute Grids
Parallel execution
MapReduce - Master/Worker
Compute Grids
Products
Load balancing
Random allocation Round robin allocation Weighted allocation Dynamic load balancing
Least connections Least server CPU etc.
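Two of the allocation strategies above are small enough to sketch directly. The class names and the plain `String` and `int[]` representations of servers and connection counts are assumptions made for illustration; a real balancer would also track health and weights.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Round robin allocation: hand out servers in a fixed rotation.
class RoundRobin {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobin(List<String> servers) {
        this.servers = servers;
    }

    String pick() {
        // floorMod keeps the index valid even after the counter overflows
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}

// Dynamic load balancing: pick the server with the fewest open connections.
class LeastConnections {
    // connections[i] is the current number of open connections on server i
    static int pick(int[] connections) {
        int best = 0;
        for (int i = 1; i < connections.length; i++) {
            if (connections[i] < connections[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

Round robin needs no feedback from the servers, while least connections depends on up-to-date load information, which is what makes it "dynamic".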
Load balancing
DNS Round Robin (simplest) Reverse Proxy (better) Hardware Load Balancing
Parallel Computing
SPMD Pattern Master/Worker Pattern Loop Parallelism Pattern Fork/Join Pattern MapReduce Pattern
UE: Unit of Execution Process Thread Coroutine Actor
SPMD Pattern
Single Program Multiple Data
Very generic pattern, used in many other patterns
Use a single program for all the UEs
Use the UE's ID to select different pathways through the program, e.g.:
Branching on ID
Use ID in loop index to split loops
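A small sketch of the "use ID in loop index to split loops" variant, with threads as the UEs. `SpmdSum` and its method names are hypothetical; every UE runs the same `work` method and only its `id` differs.

```java
import java.util.concurrent.atomic.AtomicLongArray;

class SpmdSum {
    // The single program: every UE runs this, selecting its share of the loop by id.
    static void work(int id, int numUes, int[] data, AtomicLongArray partial) {
        long sum = 0;
        for (int i = id; i < data.length; i += numUes) { // id used in the loop index
            sum += data[i];
        }
        partial.set(id, sum);
    }

    static long sum(int[] data, int numUes) {
        AtomicLongArray partial = new AtomicLongArray(numUes);
        Thread[] ues = new Thread[numUes];
        for (int id = 0; id < numUes; id++) {
            final int ueId = id;
            ues[id] = new Thread(() -> work(ueId, numUes, data, partial));
            ues[id].start();
        }
        try {
            for (Thread ue : ues) {
                ue.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long total = 0;
        for (int id = 0; id < numUes; id++) {
            total += partial.get(id);
        }
        return total;
    }
}
```

Each UE writes only to its own slot of `partial`, so the UEs never contend with each other.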
Master/Worker
Good scalability Automatic load-balancing How to detect termination?
Bag of tasks is empty Poison pill
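The poison pill approach to termination can be sketched with a shared `BlockingQueue` as the bag of tasks. `MasterWorker`, `sumOfSquares` and the choice of `-1` as the pill are assumptions for this example; it only works because real tasks here are non-negative.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class MasterWorker {
    private static final int POISON_PILL = -1; // assumes real tasks are non-negative

    static long sumOfSquares(int[] tasks, int workers) {
        BlockingQueue<Integer> bag = new LinkedBlockingQueue<>();
        AtomicLong result = new AtomicLong();
        Thread[] pool = new Thread[workers];

        for (int w = 0; w < workers; w++) {
            pool[w] = new Thread(() -> {
                try {
                    while (true) {
                        int task = bag.take(); // idle workers pull the next task
                        if (task == POISON_PILL) {
                            return; // termination signal from the master
                        }
                        result.addAndGet((long) task * task);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[w].start();
        }

        // Master: fill the bag, then add one pill per worker
        for (int task : tasks) {
            bag.add(task);
        }
        for (int w = 0; w < workers; w++) {
            bag.add(POISON_PILL);
        }

        try {
            for (Thread worker : pool) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result.get();
    }
}
```

Load balancing is automatic because faster workers simply take more tasks from the bag. Since the queue is FIFO and the pills are added last, every task is taken before any worker stops.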
Loop Parallelism
Workflow
1. Find the loops that are bottlenecks
2. Eliminate coupling between loop iterations
3. Parallelize the loop
OpenMP
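OpenMP targets C, C++ and Fortran. On the JVM, a comparable sketch of the same workflow can be written with Java 8 parallel streams, which are swapped in here purely for illustration. `LoopParallelism` is a hypothetical name.

```java
import java.util.stream.IntStream;

class LoopParallelism {
    // Step 1: the bottleneck loop. The shared accumulator couples the iterations.
    static long sumSquaresSequential(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += (long) data[i] * data[i];
        }
        return sum;
    }

    // Steps 2 and 3: each iteration computes an independent value, and the
    // coupling is replaced by an associative reduction that can run in parallel.
    static long sumSquaresParallel(int[] data) {
        return IntStream.range(0, data.length)
                .parallel()
                .mapToLong(i -> (long) data[i] * data[i])
                .sum();
    }
}
```

The parallel version is only safe because the loop body no longer writes to shared state; the reduction is done by the stream framework.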
Enter Fork/Join
Fork/Join
Use when relationship between tasks is simple
Good for recursive data processing
Can use work-stealing
1. Fork: Tasks are dynamically created
2. Join: Tasks are later terminated and data aggregated
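The fork and join steps map directly onto `RecursiveTask` from `java.util.concurrent`. The sketch below sums an array recursively; `SumTask` and the `THRESHOLD` value are assumptions chosen for the example, not tuned numbers.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // below this, just loop
    private final int[] data;
    private final int from;
    private final int to;

    SumTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                        // 1. Fork: create a task dynamically
        long rightSum = right.compute();    // work on the other half ourselves
        return left.join() + rightSum;      // 2. Join: wait and aggregate
    }

    static long sum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }
}
```

The pool's worker threads steal forked tasks from each other's queues, which is the work-stealing mentioned above.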
Fork/Join
Direct task/UE mapping
1-1 mapping between Task/UE Problem: Dynamic UE creation is expensive Pool the UE Control (constrain) the resource allocation Automatic load balancing
Fork/Join
Java 7 ParallelArray (Fork/Join DSL)
ParallelArray students = new ParallelArray(fjPool, data);
double bestGpa = students.withFilter(isSenior)
                         .withMapping(selectGpa)
                         .max();
MapReduce
Origin from Google paper 2004 Used internally @ Google Variation of Fork/Join Work divided upfront not dynamically Usually distributed Normally used for massive data crunching
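A single-process word count gives the shape of the pattern: the work is divided up front into splits, each split is mapped independently, and the partial results are reduced. This is only a sketch of the idea, not of Hadoop or Google's implementation; `WordCount` and its methods are hypothetical, and a parallel stream stands in for the distributed workers.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class WordCount {
    // Map: one input split -> partial word counts for that split.
    static Map<String, Integer> map(String split) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Reduce: merge the partial counts into a single result.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }

    // Splits are fixed up front; the map phase may run on them in parallel.
    static Map<String, Integer> run(List<String> splits) {
        List<Map<String, Integer>> partials = splits.parallelStream()
                .map(WordCount::map)
                .collect(Collectors.toList());
        return reduce(partials);
    }
}
```

In contrast to Fork/Join, no task creates further tasks at runtime; the set of splits is known before the map phase starts.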
MapReduce
Products
Hadoop (OSS), used @ Yahoo
Amazon Elastic MapReduce
Many NOSQL DBs utilize it for searching/querying
MapReduce
Parallel Computing
products
MPI OpenMP JSR166 Fork/Join java.util.concurrent
ExecutorService, BlockingQueue etc.
Stability Patterns
Timeouts Circuit Breaker Let-it-crash Fail fast Bulkheads Steady State Throttling
Timeouts
Always use timeouts (if possible):
futureTask.get(timeout, timeUnit)
socket.setSoTimeout(timeout)
etc.
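A sketch of the `Future.get(timeout, unit)` variant, wrapped in a hypothetical helper `callWithTimeout`. The string return values on failure are an assumption for the example; real code would more likely rethrow or return a typed result.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutSketch {
    static String callWithTimeout(Callable<String> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(task);
        try {
            // Never wait forever: bound the wait explicitly
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the task so it does not linger
            return "timed out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (ExecutionException e) {
            return "failed: " + e.getCause();
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Creating an executor per call keeps the sketch self-contained; in a real service the executor would be shared and long-lived.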
Circuit Breaker
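The pattern is commonly described as a three-state machine: closed (calls pass through), open (calls fail fast) and half-open (one trial call is allowed). The sketch below follows that common description rather than any particular library; the class, its parameters and the injected `clock` are all hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.function.LongSupplier;

class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures before opening
    private final long openMillis;      // how long to stay open before a trial call
    private final LongSupplier clock;   // injected so that time can be controlled
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    CircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN; // allow a trial call
        }
        return state;
    }

    <T> T call(Callable<T> action) throws Exception {
        if (state() == State.OPEN) {
            throw new IllegalStateException("circuit open: failing fast");
        }
        try {
            T result = action.call();
            onSuccess();
            return result;
        } catch (Exception e) {
            onFailure();
            throw e;
        }
    }

    private synchronized void onSuccess() {
        failures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        failures++;
        if (state == State.HALF_OPEN || failures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

In production the clock would typically be `System::currentTimeMillis`. This sketch does not limit the half-open state to a single concurrent trial call, which a fuller implementation would.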
Let it crash
Embrace failure as a natural state in the life-cycle of the application
Instead of trying to prevent it; manage it
Process supervision
Supervisor hierarchies (from Erlang)
Restart Strategy
OneForOne
Restart Strategy
AllForOne
Supervisor Hierarchies
Fail fast
Avoid slow responses
Separate:
Verify resource availability before starting expensive task
Bulkheads
Partition and tolerate failure in one part
One pool for admin tasks to be able to perform tasks even though all threads are blocked
Steady State
Clean up after yourself
Logging:
RollingFileAppender (log4j)
logrotate (Unix)
Scribe - server for aggregating streaming log data
Always put logs on separate disk
Throttling
Maintain a steady pace Count requests Queue requests
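One simple way to "count requests" is a fixed-window counter: allow at most a given number of requests per time window and reject (or queue) the rest. The sketch below shows only the counting part; `FixedWindowThrottle`, its parameters and the injected `clock` are hypothetical.

```java
import java.util.function.LongSupplier;

class FixedWindowThrottle {
    private final int maxPerWindow;
    private final long windowMillis;
    private final LongSupplier clock; // injected so that time can be controlled
    private long windowStart;
    private int count = 0;

    FixedWindowThrottle(int maxPerWindow, long windowMillis, LongSupplier clock) {
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    // Returns true if the request may proceed, false if it should be
    // rejected or placed on a queue by the caller.
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // start a new window and reset the count
            count = 0;
        }
        if (count < maxPerWindow) {
            count++;
            return true;
        }
        return false;
    }
}
```

A caller that wants the "queue requests" behavior would put rejected requests on a bounded queue and retry them in the next window.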
Upcoming seminars
?
thanks
for listening
Extra material
Client-side consistency
Strong consistency Weak consistency Eventually consistent Never consistent
Server-side consistency
N = the number of nodes that store replicas of the data
Server-side consistency
W + R > N: strong consistency
W + R <= N: eventual consistency
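The rule can be checked mechanically. The definitions of W and R do not survive in the text above, so this sketch assumes the usual quorum meaning: W is the number of replicas that must acknowledge a write and R is the number of replicas contacted for a read. `QuorumConfig` is a hypothetical name.

```java
class QuorumConfig {
    // n = replicas per item, w = write quorum (assumed), r = read quorum (assumed)
    static boolean isStronglyConsistent(int n, int w, int r) {
        if (n < 1 || w < 1 || r < 1 || w > n || r > n) {
            throw new IllegalArgumentException("require 1 <= w, r <= n");
        }
        // If w + r > n, every read set overlaps every write set in at least
        // one replica, so a read always sees the latest acknowledged write.
        return w + r > n;
    }
}
```

For example, with N = 3, choosing W = 2 and R = 2 gives overlapping quorums, while W = 1 and R = 1 trades that guarantee for lower latency.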