System Design Playbook for Beginners

The 'System Design Playbook' is a beginner's guide authored by Suresh Gandhi, Rohit Jain, and Shubham Chandak, aimed at simplifying system design concepts through clear diagrams and explanations. It covers essential topics such as scalability, availability, and various architectural patterns like microservices and load balancing, making it suitable for learners and professionals alike. The book is designed to fill a gap in accessible resources for understanding system design, particularly for those preparing for technical interviews or seeking to enhance their technical knowledge.

Uploaded by

Abhishek Mukherjee

A creative book by Sweet Codey

System Design Playbook
Shortcut to Interview Success

A Beginner’s Guide

Authors:
Suresh Gandhi
Rohit Jain
Shubham Chandak

Copyright
© 2024 Suresh Gandhi, Rohit Jain, Shubham Chandak (Authors)

All rights reserved. No part of this publication may be
reproduced, distributed, or transmitted in any form or by any
means, including photocopying, recording, or other electronic or
mechanical methods, without the prior written permission of
the publisher, except in the case of brief quotations embodied
in reviews and certain other noncommercial uses permitted by
copyright law.

For more information, contact hello@[Link]

The authors

Names:
Suresh Gandhi, Software Engineer at Microsoft
Rohit Jain, Software Engineer at Amazon
Shubham Chandak, Software Engineer at Bloomberg

About the authors:

Three friends from IIT Kharagpur—Suresh, Rohit, and
Shubham—reunited in the USA after forging a strong bond over
their shared passion for technology during their undergraduate
years. Each has since excelled at major tech companies: Suresh at
Microsoft, Rohit at Amazon, and Shubham at Bloomberg.

Together, they bring a wealth of experience in building scalable
systems that serve millions. Suresh dives deep to simplify
complex topics, Rohit brings ideas to life with his practical
approach, and Shubham specializes in optimizing user experience
and ensuring top-tier quality. Their combined expertise and unique
perspectives make their teachings particularly insightful and
accessible.

Preface

We've noticed a significant gap in resources that explain
system design simply and visually. This book aims to fill that
gap with clear diagrams and easy-to-understand explanations.

Whether you're just starting to learn about system design,
preparing for interviews, a product manager wanting to grasp
technical concepts, or simply curious about the topic, this book
is perfect for you. We've designed it to be visually engaging
and concise, making complex ideas accessible to everyone.

Each chapter is divided into subsections that explain relevant
concepts and buzzwords. By the end of the book, you'll feel
comfortable with system design basics and ready to dive into
more advanced topics.

We hope this book serves as a valuable resource and inspires
your curiosity and passion for technology.

Suresh, Rohit, and Shubham

Contents

Chapter 1: The Ultimate System Design Template
Chapter 2: Design Goals
Chapter 3: Buzzwords - Database
Chapter 4: Buzzwords - Networking
Chapter 5: Buzzwords - Communication
Chapter 6: Buzzwords - Extras

Chapter 1
The Ultimate System Design Template

Step 1: Client & Server
Step 2: Database
Step 3: Vertical & Horizontal Scaling
Step 4: Load Balancer
Step 5: Database Sharding & Replication
Step 6: Cache
Step 7: Content Delivery Network
Step 8: Monolith & Microservice
Step 9: Message Queue
Step 10: API Gateway

Step 01

Client & Server

● You open the browser. You type the website address. The website
loads.

● There are two main elements involved in the whole process - client and
server.
● Your mobile/computer is the Client, as it requests to view the webpage.
● The computer where the webpage is stored is the Server. It takes the
client’s request and returns the webpage.

Step 02

Database

● Let’s break down how the server returns the webpage.

● The server first works out what the client is requesting.
● Based on the request, it fetches the data from the Data Store. In this case,
the webpage data is fetched from the Data Store.
● Note: Think of the Data Store as the data layer that handles everything related to
storing data.
● The Data Store includes a Database (where dynamic data is stored) and
Object Storage (where static data like HTML, files, and images is stored).

Step 03

Vertical & Horizontal Scaling

● More and more people are visiting [Link].

● Because of this, both the server and database are overwhelmed and are
struggling to keep up.
● Let’s see how we can solve the ‘server getting overwhelmed’ problem first.
● To solve this problem, we increase the server’s power (CPU, RAM). This
increase in the server’s power is called Vertical Scaling.

● Boosting the server's power helped initially. But now, even more
people are visiting [Link] 😊
● Our powerful server has reached its full capacity and cannot handle
any more traffic.
● One server is not enough. We need more of these powerful servers to handle the load.
● We therefore add more such servers.
● This increase in the number of servers is called Horizontal
Scaling.

Step 04

Load Balancer
● Looks like Server 1 is over capacity and Server 2 is underutilized.

● Let’s fix that using a Load Balancer.

● A Load Balancer distributes incoming requests evenly across the
different servers.
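
One simple way to spread requests evenly is to hand them to the servers in turn (often called round robin). The sketch below assumes that strategy; the book does not say which one is used, and the class and server names are made up for illustration.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def route(self, request):
        # The request content is ignored in this sketch; only the rotation decides.
        return next(self._rotation)


# Hypothetical server names, used only for this example.
balancer = RoundRobinBalancer(["server-1", "server-2"])
assignments = Counter(balancer.route(f"request-{i}") for i in range(10))
print(assignments)
```

With ten requests and two servers, each server ends up with the same share, which is the "equal distribution" described above.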

Step 05

Database Sharding & Replication

● We have solved the ‘server getting overwhelmed’ problem. Now, let’s see
how we can solve the ‘database getting overwhelmed’ problem.
● The problem is that we have just a 'single' database handling all the user
operations. With more and more users visiting [Link], this single
database is getting overburdened.
● We can solve this problem by splitting our single database into several
smaller databases.
● Each one holds a different part of the user data, called a shard. This is
called Database Sharding.
● Now we don’t have the overburdened ‘single’ database problem. Also, if one
shard has issues, the others keep working. This prevents the entire
database from going down at once.

● But what if one of our database shards crashes? We would lose all the data
in that shard. How do we deal with it?
● We simply replicate our database shards. This is known as Database
Replication.
● Now when a database shard crashes, we can replace it with its replica.
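
The two ideas above can be sketched together. This is a minimal illustration, assuming users are assigned to shards by hashing their ID (the book does not say how the data is split) and that each shard keeps exactly one replica. All names and the shard count are hypothetical.

```python
import hashlib

NUM_SHARDS = 3  # hypothetical shard count


def shard_for(user_id: str) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


class ReplicatedShard:
    """One shard with a primary copy and a single replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.primary_alive = True

    def write(self, key, value):
        self.primary[key] = value
        self.replica[key] = value  # every write is copied to the replica

    def read(self, key):
        # If the primary has crashed, serve the data from the replica.
        source = self.primary if self.primary_alive else self.replica
        return source[key]

    def crash_primary(self):
        self.primary_alive = False
        self.primary = {}


shards = [ReplicatedShard() for _ in range(NUM_SHARDS)]


def save_user(user_id, profile):
    shards[shard_for(user_id)].write(user_id, profile)


def load_user(user_id):
    return shards[shard_for(user_id)].read(user_id)


save_user("user-1", {"name": "Asha"})
shards[shard_for("user-1")].crash_primary()
print(load_user("user-1"))  # still readable, served by the replica
```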

Step 06

Cache
● Now User1 really starts liking [Link]. He visits the website 100
times in a day.
● This means every time User1 visits the website, User1’s data needs to be
fetched from the database. We are asking the database for the
same data over and over again. Feels repetitive?
● To solve this, we use a Cache. The cache is like a ‘quick-access’ memory
that stores information that people ask for a lot.
● Fetching from a cache is much faster than fetching from a database. One analogy to
understand this: picking a book from your bedside table (fetching from the
cache) vs going to the library and borrowing it from there (fetching from the
database).

The flow becomes as follows:

1. First we look for the data in the cache.
2. If it's not in the cache, we retrieve it
from the database.
3. We then save this data in the
cache, so next time it's needed, it
can be accessed much faster.
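
The three steps above can be sketched in a few lines. Plain dictionaries stand in for the cache and the database here, which is an assumption made purely for illustration.

```python
database = {"user-1": {"name": "User1"}}  # stand-in for the (slow) database
cache = {}                                # stand-in for the (fast) cache
database_reads = 0


def fetch_user(user_id):
    global database_reads
    # 1. First look for the data in the cache.
    if user_id in cache:
        return cache[user_id]
    # 2. Not in the cache: retrieve it from the database.
    database_reads += 1
    value = database[user_id]
    # 3. Save it in the cache so the next lookup is fast.
    cache[user_id] = value
    return value


# User1 visits 100 times in a day.
for _ in range(100):
    fetch_user("user-1")

print(database_reads)
```

Only the first visit reaches the database; the other 99 are answered from the cache.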

[Figure: the overall request flow with the cache in place]

Step 07

Content Delivery Network (CDN)

● Now, let’s say Sweet Codey has all its servers in the USA. A user from India
tries to open [Link].
● The website assets (Images, Videos, etc.) are bulky content. This bulky
content will have to travel a long distance. This will increase latency a lot.
● A CDN (Content Delivery Network) comes in handy in this case.
● It stores copies of your website’s static content (static content = the data
that doesn’t change too often) at various locations around the world.
● Now, the user can quickly access static content (images, videos, etc.)
directly from a CDN server closer to them.

Step 08

Monolith & Microservices

● Before we proceed further, let’s first try to understand what a ‘service’ is.
● A service is a set of servers that specializes in handling a specific task.
Example: a set of servers handling user payments.

● Now, let’s say you try to buy a book on [Link]. There are 3
separate tasks that need to be completed by the servers:
  ● Take your order
  ● Process your payment
  ● Send a confirmation notification
● We could have only one service do all these tasks. This is a ‘Monolithic
Service’.
● Another approach is to have three different services, each dedicated to
one of these tasks: an Order Processing Service, a Payment
Processing Service, and a Notification Service. We call this system
‘Microservices’. Each service has its own load balancer to evenly
distribute the load across its servers.

Step 09

Message Queue
● Now, we run into a problem here - there are a lot of users placing book
orders.
● The Order Processing Service is receiving many orders. Processing
payment for one order takes time.
● If the Order Processing Service waits for Payment Processing Service to
complete payment for that request, it cannot move to the next order.
● This causes delays and spoils the user experience.
● A Message Queue solves this problem.
● The Order Processing Service pushes the order into the message queue and
forgets about it.
● The Payment Processing Service takes messages from the queue and
processes payments for the requests one by one.
● Now, the Order Processing Service doesn’t need to wait for payment
processing before handling new order requests.
● This ‘decouples’ (makes independent) the two services, making the system
more efficient.
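
Here is a minimal sketch of that hand-off, using Python's in-process `queue.Queue` and a worker thread as stand-ins for a real message queue and a separate Payment Processing Service. The order names and the `None` stop signal are assumptions for this example.

```python
import queue
import threading

order_queue = queue.Queue()
processed_payments = []


def payment_worker():
    """Payment Processing Service: takes orders off the queue one by one."""
    while True:
        order = order_queue.get()
        if order is None:  # hypothetical stop signal: no more orders
            order_queue.task_done()
            break
        processed_payments.append(order)  # "process" the payment
        order_queue.task_done()


worker = threading.Thread(target=payment_worker)
worker.start()

# Order Processing Service: push each order and move on immediately,
# without waiting for the payment to complete.
for order_id in range(1, 6):
    order_queue.put(f"order-{order_id}")

order_queue.put(None)
order_queue.join()  # wait here only so the script can show the final result
worker.join()
print(processed_payments)
```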

Step 10

API Gateway

● Now, we run into another problem. Users are making different types of
requests.
● Some users are placing book purchase requests, while others are
requesting web pages.
● Without a proper system, managing these different types of requests can
become chaotic.
● We can use an API Gateway (APIG) to handle this problem.
● The API Gateway acts as a single entry point for all user requests.
● All requests go through the API Gateway first.
● The API Gateway then routes the purchase requests to the Order
Processing Service and webpage requests to the Webpage Service.
● This helps manage and distribute different types of requests efficiently.
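
The routing idea can be sketched as a lookup table from request type to service. The request shape (a dictionary with a `type` field) and the function names are assumptions for illustration; a real gateway would typically route on URL paths and handle authentication as well.

```python
def order_processing_service(request):
    return f"order placed for {request['item']}"


def webpage_service(request):
    return f"rendered page {request['page']}"


# The gateway's routing table: request type -> service that handles it.
ROUTES = {
    "purchase": order_processing_service,
    "webpage": webpage_service,
}


def api_gateway(request):
    """Single entry point: every request comes here first, then gets routed."""
    handler = ROUTES.get(request["type"])
    if handler is None:
        raise ValueError(f"unknown request type: {request['type']}")
    return handler(request)


print(api_gateway({"type": "purchase", "item": "book"}))
print(api_gateway({"type": "webpage", "page": "home"}))
```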

Chapter 2
Design Goals

01 Scalability
02 Availability
03 Consistency (Strong & Eventual)
04 Fault Tolerance & Single Point of Failure

Step 01

Scalability

● Imagine a local bakery that initially handles its customers with just one
cashier.
● Now the bakery becomes more popular.
● Because of that, the line of customers grows longer, and waiting times
increase.

● To serve more customers, the bakery opens additional checkout counters,
helping it handle the growing crowd efficiently.
● Basically, our bakery has ‘scaled up’ to meet increased demand. Just like
the bakery, our technical systems also need to ‘scale up’ as more users join.
● A well-designed system can scale up more easily.
● Scalability is the system's ability to handle more work smoothly as demand
grows.

Step 02

Availability
● Availability is the proportion of time a system is up and operational.
● For example, an online banking website that is available 24/7 ensures
that users can access their accounts and perform transactions at any
time.
● A system that is available 99.999% of the time, also known as "five
nines," is allowed only about 5 minutes (0.001%) of downtime per
year.
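
The "five nines" figure is simple arithmetic, shown here as a small sketch. It assumes a 365-day year; the function name is made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent):
    """Minutes per year a system may be down at the given availability."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


for availability in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(availability)
    print(f"{availability}% available -> {minutes:.2f} minutes of downtime per year")
```

At 99.999%, the allowance works out to roughly 5.26 minutes per year, which is where the "about 5 minutes" above comes from.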

Step 03

Consistency (Strong / Eventual)

● Consistency means that everyone sees the same data.

● For example, if you update your profile picture, every user sees the updated
picture, nobody sees the old one.
● Ensuring consistency is important for applications where we need
up-to-date information. One good example is financial transactions.
● When you withdraw money from the bank, it’s essential that the updated
balance is immediately reflected, so the same money isn't withdrawn
multiple times.

Step 04

Fault Tolerance & Single Point of Failure
● Let’s assume a very simple system where a user is trying to open
[Link].
● There is a server which handles the client’s request and a data store which
keeps the site data.
● We can clearly see that if this server goes down, the website will be
inaccessible. Here, the server is a SPOF.
● A SPOF (Single Point of Failure) is a component in a system that, if it fails,
stops the entire system from working.

● Fault Tolerance is a system's ability to continue operating properly
even if some of its parts or components fail.
● For example: if one server in the data center fails, another server takes over. If
one data center fails, another data center takes over.

Chapter 3
Buzzwords - Database

Step 1: SQL Database
Step 2: NoSQL Database
Step 3: SQL vs NoSQL
Step 4: Object Storage
Step 5: Database Sharding & Replication
Step 6: Cache
Step 7: CDN

Step 01

Relational Database / SQL Database

● Stores data in tables, which are like spreadsheets with rows and
columns.
● Ideal for data that has a well-structured format, like user data.
● User data is structured because it is organized into predefined
fields like name, email, phone number, and address.
● Examples of famous relational / SQL Databases: MySQL,
PostgreSQL.

Step 02

Non-Relational Database / NoSQL

● Imagine saving social media posts in a table with columns for text, images,
and videos.
● If a post has only text, the image and video columns remain empty.
● Similarly, a post with only a video leaves the text and image columns empty.
● This leads to many empty cells in the table, which is inefficient and
wastes resources.

● This is where we use a NoSQL Database. It is ideal for this type of data, which
doesn’t have a fixed structure.
● Examples of famous NoSQL Databases: MongoDB, Cassandra, DynamoDB.
● NoSQL databases come in various types, each suited to different needs:
● Key-Value Stores
● Document Databases
● Graph Databases
● Wide-Column Databases
● Time-Series Databases
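
To make the social media example concrete, the sketch below stores three posts as document-style records, where each post carries only the fields it has, and then forces them into the fixed text/image/video table layout to count the empty cells. Plain Python dictionaries stand in for a document database here; the post contents are invented.

```python
COLUMNS = ("text", "image", "video")

# Document-style storage: every post keeps only the fields it actually uses.
posts = [
    {"id": 1, "text": "Hello world"},
    {"id": 2, "video": "intro.mp4"},
    {"id": 3, "text": "Holiday", "image": "beach.jpg"},
]


def as_table_row(post):
    """Force a post into the fixed three-column layout; gaps become None."""
    return tuple(post.get(column) for column in COLUMNS)


rows = [as_table_row(post) for post in posts]
empty_cells = sum(cell is None for row in rows for cell in row)

print(rows)
print(empty_cells, "empty cells in the table version")
```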

Step 03

SQL vs NoSQL
● The natural question that arises here is how to choose between a SQL and a
NoSQL database.
● Here are some general guidelines that you can follow, but DO REMEMBER
it’s not always black and white. A lot depends on the project's
needs.
a. When you need fast data access, NoSQL is generally preferred over
SQL.
b. When the scale is too large, NoSQL databases tend to perform better
than SQL databases.
c. When the data fits into a fixed structure, SQL is more suited. When the
data doesn’t fit into a fixed structure, NoSQL should be the choice.
d. If you have complex queries to execute on your data, SQL should be
the choice. If you have simpler queries you can use NoSQL.
e. If your data changes frequently or will evolve over time go for NoSQL
database as it supports flexible structure.

Step 04

Object Storage

● In Object Storage we store objects.

● Each object can be a photo, video, audio clip, or any other file. Effectively, objects are simply
units of data composed of bits/bytes.
● This type of storage is perfect for keeping large amounts of data that don't
follow a regular structure, like pictures, videos, music, documents, and
backups.
● Examples of object storage services include Amazon S3, Google Cloud
Storage, and Microsoft Azure Blob Storage.

Step 05

Database Sharding & Replication

● Database sharding splits a large database into smaller sections called
shards.
● Each shard stores a part of the data. This speeds up searches and reduces
stress on any single server.
● If one shard has a problem and stops working, the other shards keep
functioning. This means the whole system doesn’t go down, which makes
your database more reliable.

● Database replication is simply making copies of your database so that if
one fails, others can take over.

Step 06

Cache

● Accessing data from a database takes a long time. When we want to
access it faster, we use a cache.
● Accessing data from a cache is roughly 50 to 100 times faster than accessing
it from the DB.
● A cache is a type of memory that is super fast but has limited
capacity (far less than a database).
● That is why we use a cache to store frequently accessed data.
● It is like keeping snacks close to you at your desk (cache) while you
study. Instead of walking to the kitchen (database) each time you're
hungry, you simply grab a snack from your desk.

When the data is found in the cache, it is called a ‘Cache Hit’.

When the data is not found in the cache, it is called a ‘Cache Miss’.
Examples:
● User1's data is found in the cache, so it is quickly fetched from the
cache without the need for accessing the database.
● User4's data isn't in the cache initially. It's fetched from the database (slow)
and the cache is updated.
● The next request for User4 is quickly served from the cache because User4's
data is now in the cache.

Step 07

Content Delivery Network (CDN)

● Let’s say Sweet Codey has all its servers in the US. A user from India tries to
open [Link].
● The website assets (Images, Videos, etc.) are bulky content. This bulky
content will have to travel a long distance. This will increase latency a lot.
● A CDN (Content Delivery Network) comes in handy in this case.
● It stores copies of your website’s static content (static content = the data
that doesn’t change too often) at various locations around the world.
● Now, the user can quickly access static content (images, videos, etc.)
directly from a CDN server closer to them.

Chapter 4
Buzzwords - Networking

01 IP Address
02 DNS (Domain Name System)
03 Client & Server
04 Protocols (TCP, UDP, HTTP, Websockets)
05 Forward & Reverse Proxy

Step 01

IP Address

● Just as every person in this world is known by their name, each
computer on the internet is known by its IP address.
● An IP address looks something like this - [Link] OR
[Link]. Well, that’s a horrible name!

Step 02

DNS (Domain Name System)

● When you type [Link] in your browser, your browser
requests sweetcodey’s home page from its servers.
● But how does your browser know where sweetcodey’s servers are?
● That’s the purpose of DNS.
● DNS is a service that takes the website name (like [Link])
and provides the IP address of the server.
● Now the browser knows sweetcodey’s IP address, so it gets the
homepage from that server.

Step 03

Client & Server

● Client: A computer that requests information.

● Server: A computer that serves the requested information to the
client.

Examples:
● Your Smart TV requests movies (aka streaming) from Netﬂix - Your
Smart TV is the client requesting information from Netﬂix Server.
● Your phone gets directions from Google Maps - Your phone is the
client requesting information from the Google Maps Server.

Step 04

Protocols - TCP, UDP, HTTP, Websocket
● Just as people use grammatical rules to communicate, computers
also follow certain rules while communicating.
● The rules that computers follow are Protocols.
● Based on what the task is, we use different rules / protocols for it.
● Example - If the task is to do some common web interactions like
sending and receiving web pages, updating content, etc., we use the
HTTP protocol. Similarly, if the task is to transfer files, we use the FTP
protocol.

TCP (Transmission Control Protocol)

● Let's say you are streaming a movie and after the first scene, you see the
climax directly. That's confusing, right?
● Well, TCP prevents this from happening.
● It is a protocol which ensures your data packets are delivered in the correct
sequence, so you watch the movie in the proper order.
● So, whenever proper ordering is necessary, like in email, or streaming
video, TCP is used.

UDP (User Datagram Protocol)

● Imagine you're watching a live football game. You want it to feel live with
barely any delay.
● UDP helps with this!
● It sends video fast, though sometimes a few pieces might get lost.
● UDP protocol is very fast, but it doesn't guarantee the delivery of all data
packets. In summary, UDP is perfect for tasks where speed is more important
than reliability.
● Unlike UDP, which prioritizes speed and might skip some data, TCP focuses
on reliability. It may be slower, but it ensures everything is complete and in
order—making sure you don’t miss any part of your movie.

HTTP (Hypertext Transfer Protocol)

● HTTP is the most commonly used protocol on the web.
● It operates on a simple principle: you request something from the server,
and the server responds. For instance, you request a webpage, and the server
sends it back to you.
● Example: Consider shopping online. Each time you click on a product, your
browser (client) sends a request to the store's server for product details.
The server then fetches this information from its database and sends it
back to your browser in the form of a webpage that you view.

Websockets

● In standard protocols like HTTP, the interaction is one-directional, i.e. the
client sends a request and then the server responds to it. Without a
request from the client, the server cannot initiate sending data to the
client.
● But with Websockets, both the client and the server can send data to
each other at any time. This makes the communication bi-directional.
● Example: Consider a live chat application. Your device could keep
sending requests to the server every second asking “Do you have a
new message?”, but that would be very inefficient.
● With Websockets, whenever a message arrives for a client, the server sends
it to that client. This makes it very efficient.

Step 05

Forward Proxy & Reverse Proxy

● Forward Proxy is like a personal assistant for your outdoor requests.

● Whenever you need something from outside, you just tell your assistant
what you need, and they go get it for you.
● Similarly, a forward proxy sits between you (the client) and the internet.
● You send your requests to the proxy, and it fetches the information from
the internet on your behalf.

● Reverse Proxy is like a personal assistant for your family.
● Now, instead of people contacting your family directly, they go through
your assistant. The assistant filters the messages/calls and forwards
only the important ones to your family members.
● Similarly, a reverse proxy sits between the internet and your services
(collection of servers specializing in a task). It receives requests from
clients and forwards those requests to the appropriate service.
● For example, if a user sends a login request, the reverse proxy routes it to
the Authentication Service. If a user requests content, the reverse proxy
routes it to the Content Service.

Chapter 5
Buzzwords - Communication

01 API
02 REST API
03 GraphQL
04 gRPC
05 Message Queue

Step 01

API
● Just like people are social, computers are social too. They talk to
each other through APIs.
● Based on ‘how’ computers are talking to each other, we can
classify the APIs. Here are 3 common types that we can discuss
and learn more about.

Step 02

REST API
● You go to a restaurant → look at the menu → order a couple of items (e.g.
burger and fries) from the waiter.
● The waiter acknowledges them → then goes to the kitchen and informs the chef →
finally comes back with the food.
● This is very similar to how a REST API operates.
● Just as there are ‘standard’ ways for you to place an order, i.e. only from
menu items, computers use a REST API to talk to each other only in certain
standard ways to request and receive data. Here are 4 common
standard ways:

● HTTP GET: The client computer gets data from the server computer.
● HTTP POST: The client computer creates data on the server computer.
● HTTP PATCH: The client computer updates data on the server computer.
● HTTP DELETE: The client computer deletes data on the server computer.
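
The four methods can be sketched against an in-memory "server". No real HTTP is involved: the `handle` function, the `books` store, and the status codes returned are assumptions chosen to mirror common REST conventions.

```python
books = {}    # the server's data, keyed by book ID
next_id = 1   # ID to give the next created book


def handle(method, book_id=None, body=None):
    """Tiny stand-in for a REST server: returns (status_code, data)."""
    global next_id
    if method == "POST":                      # create data
        book_id = next_id
        next_id += 1
        books[book_id] = dict(body)
        return 201, {"id": book_id, **books[book_id]}
    if book_id not in books:
        return 404, None                      # nothing stored under this ID
    if method == "GET":                       # get data
        return 200, books[book_id]
    if method == "PATCH":                     # update data
        books[book_id].update(body)
        return 200, books[book_id]
    if method == "DELETE":                    # delete data
        del books[book_id]
        return 204, None
    return 405, None                          # method not supported


status, created = handle("POST", body={"title": "The Alchemist"})
print(status, created)
print(handle("GET", created["id"]))
print(handle("PATCH", created["id"], {"price": 10}))
print(handle("DELETE", created["id"]))
print(handle("GET", created["id"]))
```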

Step 03

GraphQL
● Imagine at a restaurant, instead of picking directly from the menu, you
customize your order. Say a burger with extra pickles, extra cheese, and
a gluten-free bun.
● The waiter notes your specifics, tells the chef, who then makes your
meal just as you asked. The waiter brings your custom meal exactly to
your liking. This is how GraphQL works.
● Unlike REST, where you get the standard menu items, GraphQL lets you
request customized menu items.
● If this order were placed using REST, you’d need to order each item
separately—burger, extra pickles, extra cheese, gluten-free bun. Once
all items are delivered, you would then assemble them into the burger
you actually wanted.
● With GraphQL, you describe your complete custom burger in one order.
The waiter (akin to the server) understands the detailed request
(GraphQL API Request) and brings you your fully customized burger in
one go.
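
The core idea, that the client names exactly the fields it wants and gets them back in one response, can be imitated in plain Python. This is not real GraphQL (which has its own query language and schema); the `resolve` function and the burger record are invented for illustration.

```python
# Everything the server knows about the burger (hypothetical data).
BURGER = {
    "name": "burger",
    "bun": "gluten-free",
    "extras": ["pickles", "cheese"],
    "calories": 650,
    "price": 9.5,
}


def resolve(record, requested_fields):
    """Return only the fields the client asked for, in a single response."""
    return {field: record[field] for field in requested_fields if field in record}


# The client describes the whole custom order in one request.
print(resolve(BURGER, ["name", "bun", "extras"]))
```

The response contains the three requested fields and nothing else, so the client does not have to assemble the result from several separate calls.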

Step 04

gRPC
● Imagine you’re at a restaurant. You look at the menu and decide to order a
burger and fries.
● You tell the waiter your order.
● The waiter quickly goes to the kitchen and says “B+F for table 1” i.e. Burger
and Fries for table 1. This special language helps them communicate super
fast and efficiently.
● After the kitchen prepares your order using this quick communication, the
waiter brings your meal to the table without any delay.
● This quick internal communication at the restaurant is a lot like gRPC in the
tech world.
● Just as the kitchen staff use a shorthand to communicate efficiently, a gRPC
API allows different internal parts of a system (like microservices) to
communicate efficiently.
● gRPC uses less data to send messages, which makes it fast.

Step 05

Message Queue

● Imagine a homeowner who has a long list of tasks (prepare food, take out the trash,
clean the home, clean the utensils, etc.).
● These tasks have to be done in the morning before the homeowner leaves
for work.
● He has a maid who is going to help him with these tasks.

Scenario 1:
● He starts giving tasks to his maid one by one.
● He waits for each to be completed before assigning the next.
● This is inefficient.
● If he waits for each task to be finished, it will waste his time.
● Also, he has to leave for work, and this approach delays his departure.

Scenario 2:
● He writes down all the tasks on a checklist and leaves for work.
● The maid picks up tasks from the checklist one by one and completes them
independently.
● The homeowner can go to work on time, while the maid completes the
tasks at her own pace.
● This is much more efficient.

● A Message Queue works just like this task checklist scenario.

● The homeowner is like the producer who adds tasks (messages) to
the queue (checklist), and the maid is like the consumer who
processes the tasks one by one.

Benefits:
● It is more efficient because the homeowner saves time and can work
on other tasks, without waiting for each one to be processed.
● The task checklist ensures that no tasks are forgotten or lost.
● If the maid needs a break or has to step out, the tasks will remain on
the checklist.
● Whenever she returns, she can pick up right where she left off.
● This ensures all the tasks are taken care of and will be completed by
the end.

Drawbacks:
● For tasks like turning on the light switch or adding sugar to coffee,
writing them in a checklist overcomplicates things. It's faster to handle
these tasks directly.
● For urgent tasks like turning off a burning stove, writing them in a
checklist causes unnecessary delays. These tasks need immediate
attention, and a message queue would be too slow.

Chapter 6
Buzzwords - Extras

01 Cloud Computing
02 Logging & Monitoring
03 Caching Strategies
04 Hashing & Consistent Hashing
05 CAP Theorem

Step 01

Cloud Computing

● Myth buster: There is no literal cloud. The servers are always
kept in a data center.
● If you want to host your website, you will need servers for that.
Instead of buying and maintaining physical servers yourself,
why not rent them?
● This is where Cloud computing comes in. With cloud computing,
you can rent computing resources from providers like AWS, Google
Cloud, or Microsoft Azure, allowing you to focus on your website
while they handle the infrastructure.

Step 02

Logging and Monitoring

Logging

● Logging is like a computer writing in a diary.

● It records everything important like errors and key activities.
● These records are called logs.
● These logs can be used to fix problems (troubleshooting) and also for
data analysis.
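
As a small illustration, Python's standard `logging` module can record both key activities and errors. The logger name, the `process_payment` function, and the messages are hypothetical.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("bookstore")  # hypothetical service name


def process_payment(order_id, amount):
    logger.info("processing payment for order %s", order_id)  # key activity
    if amount <= 0:
        logger.error("invalid amount %s for order %s", amount, order_id)  # error
        return False
    logger.info("payment succeeded for order %s", order_id)
    return True


process_payment("order-1", 25)
process_payment("order-2", 0)
```

Each call leaves timestamped lines behind, which is the "diary" that can later be read for troubleshooting.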

Monitoring

● Monitoring is like keeping a close watch on a computer.

● When we monitor we constantly check that everything is working
properly.
● If there are any issues, we catch them and fix them.
● It's like having a security camera on a computer.

Step 03

Caching Strategies

Here are two common methods to read data from a cache:

● Cache Aside Strategy:

When data is requested, we (the application) first look in the cache. If it's not
there, we get it from the database and save it in the cache for
next time.

● Read Through Strategy:

When data is requested, we first look in the cache. If it’s
not there, the cache itself gets the data from the database
and saves it for next time.

This is called ‘Read Through’ because we are reading the
data ‘through the cache’.
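
The difference between the two strategies is who talks to the database on a miss. The sketch below assumes a dictionary as the database and invents the `cache_aside_get` function and `ReadThroughCache` class to show the contrast.

```python
DATABASE = {"user-1": "Asha"}  # stand-in for the real database


def cache_aside_get(cache, key):
    """Cache Aside: the application talks to both the cache and the database."""
    if key in cache:
        return cache[key]
    value = DATABASE[key]  # the application reads the database itself
    cache[key] = value     # and the application fills the cache
    return value


class ReadThroughCache:
    """Read Through: the application only ever talks to the cache."""

    def __init__(self, loader):
        self._loader = loader  # how the cache reaches the database
        self._store = {}

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._loader(key)  # the cache loads on a miss
        return self._store[key]


plain_cache = {}
print(cache_aside_get(plain_cache, "user-1"))

read_through = ReadThroughCache(loader=DATABASE.__getitem__)
print(read_through.get("user-1"))
```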

Step 04

Hashing & Consistent Hashing
● Think of a library where each new book gets a numeric code. The
librarian uses this code to quickly put the book on the right shelf.
● To find a book, the system uses this code, which is much faster
than searching by title.
● Hashing is very similar. It converts data into a short, fixed-size
code that looks random but is always the same for the same data. This code
helps efficiently place and locate data in a system.
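
A minimal sketch of the library analogy, assuming SHA-256 as the hash function and four shelves (both choices are arbitrary, and the function names are made up):

```python
import hashlib

NUM_SHELVES = 4  # hypothetical number of shelves


def book_code(title):
    """Turn a title into a short code; the same title always gives the same code."""
    return hashlib.sha256(title.encode("utf-8")).hexdigest()[:8]


def shelf_for(title):
    """Use the code to pick a shelf directly, without searching by title."""
    return int(book_code(title), 16) % NUM_SHELVES


print(book_code("The Alchemist"), "-> shelf", shelf_for("The Alchemist"))
```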

[Link]
● Let’s say we have a need to reorganize our book shelves. However,
if we do so, we would need to change the codes of lot of books.
● Example:
● If we remove shelf number 2, shelf 3 becomes the new shelf 2, and
shelf 4 becomes the new shelf 3.
● Suppose we had a book called "The Alchemist" on shelf 3, coded
as 3B. Now, since our old shelf 3 is the new shelf 2, "The
Alchemist" should have a new code of 2B.

[Link]
● This means we would need to update our system with the new codes
for all these books. Why is it a problem if we don't?
● Imagine someone wants to borrow "The Alchemist" now. If our system
isn't updated, it will still show the book at code 3B.
● The librarian goes to location 3B but cannot find "The Alchemist" there.
Instead, she finds a different book.
● Therefore, we would have to update our system with new codes for all
these books, which is a big hassle.
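
The same hassle shows up in software when data is placed with a simple "hash, then take the remainder" rule. That rule is an assumption for this sketch, as are the made-up book titles. The code counts how many books land on a different shelf when the library goes from 4 shelves to 3.

```python
import hashlib


def shelf_for(title, shelf_count):
    """Place a title using hash-then-remainder (simple, but fragile)."""
    digest = hashlib.sha256(title.encode("utf-8")).hexdigest()
    return int(digest, 16) % shelf_count


titles = [f"book-{i}" for i in range(1000)]  # hypothetical catalogue

before = {title: shelf_for(title, 4) for title in titles}  # 4 shelves
after = {title: shelf_for(title, 3) for title in titles}   # one shelf removed

moved = sum(1 for title in titles if before[title] != after[title])
print(f"{moved} of {len(titles)} books changed shelf")
```

With this rule, a large majority of the books typically change shelf, even though only one shelf was removed.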

● Consistent Hashing, a special type of hashing, is a smart algorithm
that minimizes these reorganizations.
● Even if shelves change, most books will still keep their original codes.
Only a few need to be changed, making it much easier to manage.
● We won’t go deep into how it works, as that would be out of the scope
of this course. Just imagine it as a magic algorithm that will help us
minimize all these re-organizational hassles.
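
For curious readers, one common way to build consistent hashing is a "hash ring". The sketch below is only an illustration under assumptions, not a description of any particular system: shelves and books are both hashed onto the same circle of numbers, and each book belongs to the first shelf point found going clockwise. The class name `HashRing`, the shelf names, and the choice of 100 points per shelf are all made up for this example.

```python
import bisect
import hashlib


def ring_position(text):
    """Hash any text to a position on the ring (a large integer)."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return int(digest, 16)


class HashRing:
    def __init__(self, shelves, points_per_shelf=100):
        # Each shelf gets many points on the ring so books spread out evenly.
        self._points = sorted(
            (ring_position(f"{shelf}#{i}"), shelf)
            for shelf in shelves
            for i in range(points_per_shelf)
        )
        self._positions = [position for position, _ in self._points]

    def shelf_for(self, title):
        # Find the first shelf point at or after the title's position,
        # wrapping around to the start of the ring if we run off the end.
        index = bisect.bisect_left(self._positions, ring_position(title))
        return self._points[index % len(self._points)][1]


titles = [f"book-{i}" for i in range(1000)]  # hypothetical catalogue
ring_before = HashRing(["shelf-1", "shelf-2", "shelf-3", "shelf-4"])
ring_after = HashRing(["shelf-1", "shelf-3", "shelf-4"])  # shelf-2 removed

moved = [
    title for title in titles
    if ring_before.shelf_for(title) != ring_after.shelf_for(title)
]
print(f"{len(moved)} of {len(titles)} books changed shelf")
```

Only the books that used to sit on the removed shelf get a new home; every other book stays exactly where it was.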

Step 05

CAP Theorem
● Ideally, you would want your system to be both consistent and
available.
● Let's say an accident happens that creates a partition in our system,
i.e. some servers can no longer talk to each other.
● How will you tolerate the partition?
● The CAP theorem says that while the partition lasts, you can either keep
your system available or keep it consistent. You can't have both at the
same time.
Example:
● Consider a social media company, 'SweetBook'. They have
servers all around the world.
● At 3pm, an accident happens and the connection between their New
York and San Francisco servers is lost. They have a partition now.
● At 3:30pm, your friend in New York posts something on social media.

● Now there could be two scenarios:
● If SweetBook prioritizes availability, the site remains accessible, but
you won't see the new post from your friend in New York—only
posts from local San Francisco users.
● If SweetBook prioritizes consistency, you might see a message like
"Website not available" when you try to access it from San
Francisco.
● You cannot have both at the same time.
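
The two SweetBook scenarios can be sketched as a single function that decides what a San Francisco user sees. This is a toy model under assumptions: the function name `read_feed`, its parameters, and the sample posts are invented for illustration, and real systems handle partitions in far more detail.

```python
def read_feed(local_posts, remote_posts, link_up, prioritize):
    """Return the feed a user sees on the local server."""
    if link_up:
        # No partition: the servers can sync, so the feed is complete.
        return sorted(set(local_posts) | set(remote_posts))
    if prioritize == "availability":
        # Partition: answer anyway, using only the posts this server has.
        return sorted(local_posts)
    # Partition with consistency prioritized: refuse rather than show
    # a feed that might be missing posts.
    raise RuntimeError("Website not available")


san_francisco = ["SF: lunch photo"]        # posts stored in San Francisco
new_york = ["NY: friend's 3:30pm post"]    # posts stored in New York

available_feed = read_feed(
    san_francisco, new_york, link_up=False, prioritize="availability"
)
print(available_feed)

try:
    read_feed(san_francisco, new_york, link_up=False, prioritize="consistency")
except RuntimeError as error:
    consistent_result = str(error)
    print(consistent_result)
```

With availability prioritized, the feed loads but the New York post is missing. With consistency prioritized, the user gets the "Website not available" message instead.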

That’s it
folks!
The Learning
Continues…

Did you know this?

Free Resources

Discover free advanced System Design resources at [Link]. Dive deep
and master the art! 🚀

Your Contribution

If you love this book, please give us a rating at: [Link]/sdplay

Join Our Community

Credits
Anmol Gupta (Graphic Designer)
Icons made by Smashicons from www.fl[Link]
Icons made by Vectors Market from www.fl[Link]
Icons made by Pixel perfect from www.fl[Link]
Icons made by Maxim Basinski Premium from www.fl[Link]
Icons made by Freepik from www.fl[Link]
Icons made by Eucalyp from www.fl[Link]
Icons made by juicy_fish from www.fl[Link]
Icons made by mikan933 from www.fl[Link]
Icons made by Md Tanvirul Haque from www.fl[Link]
Icons made by Frey Wazza from www.fl[Link]
Icons made by smashingstocks from www.fl[Link]
Icons made by Witdhawaty from www.fl[Link]
Icons made by flatart_icons from www.fl[Link]
Icons made by Dreamcreateicons from www.fl[Link]
Icons made by kerismaker from www.fl[Link]
Icons made by Parzival’ 1997 from www.fl[Link]
Icons made by logisstudio from www.fl[Link]
Icons made by orvipixel from www.fl[Link]
Icons made by Karyative from www.fl[Link]
Icons made by HAJICON from www.fl[Link]
Icons made by Kalashnyk from www.fl[Link]
Icons made by bsd from www.fl[Link]
Icons made by Indygo from www.fl[Link]
Icons made by Uniconlabs from www.fl[Link]
Icons made by Iconjam from www.fl[Link]
Icons from [Link]
[Link]/icons/2593/cache

CREDITS: This presentation template was created by Slidesgo, including icons by
Flaticon, and infographics & images by Freepik

Thanks!
Do you have any questions or suggestions?
hello@[Link]
[Link]

