Business-logic Layer Design

Architecting with Google Cloud Platform:

Design and Process

Last modified 2018-08-08

© 2017 Google Inc. All rights reserved. Google
and the Google logo are trademarks of Google Inc.
All other company and product names may be
trademarks of the respective companies with
which they are associated.
“In computer software, business logic
or domain logic is the part of the
program that encodes the real-world
business rules that determine how data
can be created, stored, and changed.”
Wikipedia

https://en.wikipedia.org/wiki/Business_logic
Agenda

Microservices architecture

GCP 12-factor support

Mapping compute needs to platform products

Compute system provisioning

The photo service is slow

Design challenge #1: Log aggregation

GCP lab Deployment Manager: Package and deploy

© 2018 Google Inc. All rights reserved. Google and the Google logo
are trademarks of Google Inc. All other company and product names may
be trademarks of the respective companies with which they are associated.
Microservices architecture

A Microservice Architecture is a method of developing software applications as a
suite of independently deployable, small, modular services.

Each service runs a unique process and communicates through a well-defined,
lightweight mechanism.

Each service contributes to a business goal.

http://microservices.io/patterns/microservices.html
https://cloud.google.com/appengine/docs/standard/python/microservices-on-app-engine
Microservices architecture is considered a specific type of service-oriented
architecture (SOA).
Benefits of microservices design
Benefits of small separate services:

● Atomic, single-purpose code is easier to develop and maintain

● Does one thing and does it very well
● Supports A/B testing

Independently developed services aid in:

● Fault isolation
● Debugging
● Redundancy and resiliency

BUT it’s harder to understand how the microservices interoperate.

● Unit testing is easier, integration testing is harder.

Allow for independent deployment cycles, including rollback. Facilitate concurrent,
A/B release testing on subsystems. Minimize test automation and quality-assurance
overhead. Improve clarity of logging and monitoring. Provide fine-grained cost
accounting. Increase overall application scalability and reliability.

Small is good
● Easier to understand, faster to develop, more productive, A/B testing
● Faster startup (parallelism in system startup/boot)
● Granular cost
Independently developed and deployed versions, modular/replaceable parts
● Each microservice can be developed and deployed independently
● Easier to deploy new versions
● Modular/replaceable parts reduce "lock in" to a single solution or technology
Improved fault isolation
● Limits system impact due to failure (for example, a memory leak)
● Easier debugging
● Better reliability/redundancy
Distributed design
● Difficult to implement/Difficult to manage the business logic
● Distributed transactions
● IDEs not geared for it
● Interservice communications
● Testing complexity
Deployment complications
● Resource overhead
● Isolation comes at a cost; for example, multiple VMs instead of one VM means
multiple VM overhead

Post about the drawbacks of Microservices:

http://www.ptone.com/dablog/2015/07/microservices-may-be-the-new-premature-optimization/
How microservices complicate business logic

[Diagram: a Unified Banking Service compared with separate Deposit, Withdrawal, and
Accounting Microservices linked by cross-services communications]

Person "A" wants to deposit money in the bank in one location so that person "B" can
withdraw the money in another location.

In the unified service implementation, the deposit results in an immediate change to
the account. The deposit transaction is atomic. When it is completed, the withdrawal
transaction is permitted. It is also atomic, so the account is reduced at the same
time the cash is physically released from the ATM.

In a microservices design, the Deposit Microservice is separate from the Withdrawal
Microservice, and both are separate from the Accounting Microservice. Performing a
deposit or a withdrawal now requires cross-services communications. Several messages
have to pass between the services to implement one transaction. Allowances have to be
made in the business logic in case the cross-services communication drops.

Example 1: When is the cash accepted at the ATM by the Deposit Microservice, and
when is the account value increased by the Accounting Microservice? If the cash is
accepted before the account is updated, and communication drops, then the deposit
might not register. If the account is increased before the cash is accepted by the
ATM, then there is an incentive to disrupt the ATM after the transaction has started
and before the cash is ingested by the ATM.

Example 2: When the withdrawal is being made, a similar complication arises in
cross-service communications. If the cash is released from the machine before the
account is updated, the money could be lost. If the account is reduced first, and the
communication drops before the cash can be released from the machine, then the
account will show the money was taken out, but the user will not have received it.

To make both of these examples work requires a multi-trip negotiation between the
services to "open" a transaction, hold state on both sides of the transaction, and only
"close" the transaction when it is verified that all of the constituent actions have been
completed. Only by holding state can the microservices guard against loss of
communications.
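The "open", hold state, and "close" negotiation described above can be sketched in a few lines. This is a minimal single-process illustration under stated assumptions, not a real banking protocol: the Account class, the withdraw function, and the dispense_cash callback are hypothetical stand-ins for the Accounting Microservice, the Withdrawal Microservice, and the ATM hardware.

```python
class Account:
    """Hypothetical stand-in for the Accounting Microservice."""

    def __init__(self, balance):
        self.balance = balance
        self.holds = {}  # transaction id -> amount reserved but not yet debited

    def open_hold(self, txn_id, amount):
        """'Open' a transaction: reserve funds without removing them."""
        reserved = sum(self.holds.values())
        if amount > self.balance - reserved:
            return False
        self.holds[txn_id] = amount
        return True

    def close_hold(self, txn_id):
        """'Close' the transaction: the cash left the ATM, so debit the account."""
        self.balance -= self.holds.pop(txn_id)

    def cancel_hold(self, txn_id):
        """Abandon the transaction: release the reservation, balance unchanged."""
        self.holds.pop(txn_id, None)


def withdraw(account, txn_id, amount, dispense_cash):
    """Hypothetical Withdrawal Microservice logic.

    dispense_cash is a callable that returns True only when the ATM confirms
    that the cash was released. It may raise ConnectionError.
    """
    if not account.open_hold(txn_id, amount):
        return "declined"
    try:
        dispensed = dispense_cash(amount)
    except ConnectionError:
        # Outcome unknown: keep the hold so that a later reconciliation step
        # can close or cancel it once the ATM reports what happened.
        return "pending"
    if dispensed:
        account.close_hold(txn_id)
        return "completed"
    account.cancel_hold(txn_id)
    return "cancelled"
```

The held amount is the state that both sides keep while the outcome is uncertain: a dropped connection leaves the funds reserved, so they can be neither lost nor spent twice.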

Use microservices where they
make sense in the design

Microservices make sense when there are many consumers of an atomic unit
of functionality.

When there is one consumer of tightly-coupled functionality, microservices add
overhead without much benefit.

Advice
Smaller and independent isn't always better: sometimes centralized control is better.
Do you have the staff/processes to deal with thousands of tiny microservices?
Can you debug thousands of loosely connected microservices?
How can you track the business logic? What if it changes? Do you need to modify
thousands of applications to implement the change?
Consider the processes, not just the technical design.
This is not just a "set it and forget it" strategy: you need to plan and implement
processes to monitor and decide when to expand the microservices.

Cloud Functions is useful in a microservices design

Google Cloud Functions is a lightweight compute solution for developers to create
single-purpose, stand-alone functions that respond to cloud events without the
need to manage a server or runtime environment. Cloud Functions runs JavaScript
in Node.js, supporting both frontend HTTP and background functions.

However, there are some limitations:

● Cloud Functions is not a low latency service.

● Because it is serverless, there are few resources that can be adjusted for
price/performance tradeoffs.

Cloud Functions is the name of the Google service. A single entity of this service is a
cloud function. A cloud function is Google's implementation of what is commonly
known in computer science as an anonymous function, a lambda function, or a
function literal. The self-contained function is registered with Cloud Functions,
triggered by an event, and executes without the overhead of a full application
environment.
https://en.wikipedia.org/wiki/Anonymous_function
Microservices example using Cloud Functions

https://cloud.google.com/functions/docs/tutorials/ocr
[1] An image is uploaded to storage, triggering a Cloud Function.
The Cloud Function uses the Vision API to pull out the text from the image.
The Cloud Function:
● [2] Queues the string of text to the Translate API.
● [3] Uses Cloud Pub/Sub to pass state and trigger a new Cloud
Function to translate the queued text to the desired language.
● [4] Uses Cloud Pub/Sub to pass state and trigger a new Cloud Function
to post the translation to storage.

1. An image is uploaded to storage, triggering a cloud function.

2. The cloud function uses Vision API to pull out the text from the image.
3. The cloud function:
a. Queues the string of text to the Translate API.
b. Uses Cloud Pub/Sub to pass state and trigger a new cloud function to
translate the queued text to the desired language.
c. Uses Cloud Pub/Sub to pass state and trigger a new cloud function to
post the translation to storage.
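The flow of the steps above can be imitated locally to see how state is handed from one function to the next. This is only a sketch: it uses an in-memory dictionary in place of Cloud Pub/Sub and Cloud Storage, and extract_text and translate_text are hypothetical placeholders, not the Vision API or Translate API. None of the names below come from the tutorial.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # topic name -> list of handler functions
storage = {}                     # stand-in for a storage bucket


def publish(topic, message):
    """Deliver a message to every handler subscribed to the topic."""
    for handler in subscribers[topic]:
        handler(message)


def extract_text(image_bytes):
    """Placeholder for an OCR call; here the 'image' is just encoded text."""
    return image_bytes.decode("utf-8")


def translate_text(text, target_language):
    """Placeholder for a translation call; it only tags the text."""
    return f"[{target_language}] {text}"


def on_image_uploaded(name, image_bytes, target_language):
    """Steps 1 to 3a: triggered by an upload, extract the text and queue it."""
    text = extract_text(image_bytes)
    publish("translate", {"name": name, "text": text, "lang": target_language})


def on_translate(message):
    """Step 3b: translate the queued text and pass the state along."""
    translated = translate_text(message["text"], message["lang"])
    publish("save", {"name": message["name"], "lang": message["lang"],
                     "text": translated})


def on_save(message):
    """Step 3c: write the translation to storage."""
    storage[f"{message['name']}_{message['lang']}.txt"] = message["text"]


subscribers["translate"].append(on_translate)
subscribers["save"].append(on_save)
```

Each handler receives everything it needs in the message, so no function depends on memory shared with another; that is the property that lets each step scale and fail independently.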

Microservices design on App Engine

Microservices can be implemented as App Engine services:

● Full code isolation
● Can be written in different languages
● Code executed through HTTP invocation/RESTful API

However, there are some limitations:

● There are shared services that must be isolated in the application design
● One master app per project
● Multiple apps incur additional overhead

[Diagram: PROJECT-1 contains SERVICE-1 and SERVICE-2, each with Version-1 and
Version-2. Shared services: Cloud Datastore, Memcache, Task Queues]

https://cloud.google.com/appengine/docs/standard/python/microservices-on-app-engine
GCP 12-factor support

12-factor design is a popular methodology for developers to follow when
building modern web-based and cloud-based applications. Details: 12factor.net

GCP technologies support 12-factor design.

● Cloud Shell for building and deploying
● Cloud Source Repositories / GitHub support

Strictly separate build and run stages

● Build App Engine app in Cloud Shell, upload to App Engine

Keep development, staging, and production as similar as possible

● Deployment Manager Templates

Codebase: Put all your code into a source control system. Begin with all the code in a
single repository. As the code grows in complexity, move the code for specific parts of
the application into separate repositories. In a distributed application, code that
communicates with other code is a candidate for a separate repository.
Cloud Source Repositories:
https://cloud.google.com/source-repositories/

Build, Deploy, Run: The idea is that after an outage, the application should come
back without human intervention.
● Build: The process that wraps the code into a package of scripts, binaries,
and assets.
● Deploy (Release): Sends the package to the servers/services along with
separate configuration for the environment.
● Run: Runs the code. Should be simple and reliable.
App Engine Tutorials:
https://cloud.google.com/appengine/docs/standard/python/tutorials

Dev / Prod Parity: It is common to have rapid development and deployment cycles,
making changes to your application and deploying them within hours. Keep the
development and production environments as similar as possible to reduce the area
of vulnerability where issues could arise. Use the same backing services, same
configuration techniques, and same versions. Deployment Manager Templates:
https://cloud.google.com/deployment-manager/docs/step-by-step-guide/create-a-template

12-factor software design infrastructure services in GCP

Explicitly declare and isolate dependencies

● Custom images

Store config in the environment

● Metadata server, GCS

Maximize robustness with fast startup and graceful shutdown

● Instance templates, managed instance groups, and autoscaling

Dependencies: All systems have dependencies in every environment where they

run. Never let your application simply assume that these are present.

● Option 1: Ensure that they are present by "baking" them into the application.
● Option 2: At startup, list expected versions and download and update libraries
to the correct version so you know that the dependent resources are in place.
Eliminate guessing by making this an explicit and dynamic process.
Custom images:
https://cloud.google.com/compute/docs/images/create-delete-deprecate-private-images
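The first half of Option 2, listing expected versions at startup and checking them explicitly, might look like the sketch below. It only detects mismatches; downloading and updating is left out. The function name find_dependency_problems and the injectable get_version parameter are assumptions made for illustration, and the package names in the usage note are hypothetical.

```python
from importlib import metadata


def find_dependency_problems(expected, get_version=metadata.version):
    """Return a list of readable problems; an empty list means all is well.

    expected maps a distribution name to the exact version required.
    get_version is injectable so the check can be exercised without installs.
    """
    problems = []
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (need {wanted})")
            continue
        if found != wanted:
            problems.append(f"{name}: found {found}, need {wanted}")
    return problems
```

At startup, an application could call find_dependency_problems({"some-library": "1.2.3"}) and refuse to serve traffic if the returned list is not empty, rather than assuming the library is present.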

Config: Configuration data should be stored separately from the code and read at run
time. Anything that might vary between environments should be in the environment:
location of a resource, logging or debugging settings, and usernames and passwords.
Metadata Server: https://cloud.google.com/compute/docs/storing-retrieving-metadata
Another place to store configuration data is in GCS, which provides multiple region
access, reliability, and features like customer-supplied encryption.
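As a small illustration of reading configuration at run time instead of hard-coding it, the sketch below builds a settings object from environment variables. The variable names (PHOTO_DB_URL, PHOTO_LOG_LEVEL, PHOTO_DEBUG) and the Settings class are hypothetical; the same idea applies when the values come from the metadata server or from GCS.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    log_level: str
    debug: bool


def load_settings(env=os.environ):
    """Build Settings from environment variables, failing fast if one is missing."""
    try:
        database_url = env["PHOTO_DB_URL"]
    except KeyError:
        raise RuntimeError("PHOTO_DB_URL must be set in the environment") from None
    return Settings(
        database_url=database_url,
        log_level=env.get("PHOTO_LOG_LEVEL", "INFO"),
        debug=env.get("PHOTO_DEBUG", "false").lower() == "true",
    )
```

The same code then runs unchanged in development, staging, and production; only the environment differs.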

Disposability: The idea is that when part of your application starts, it should quickly
be able to start serving: so design it to avoid doing a lot of setup work at startup,
which can cause complexity and delays in scaling. Store state in high speed
databases or cache for fast recovery. Avoid mandatory cleanup processes that could
harm the application if they don't complete during a crash scenario.
Instance Templates: https://cloud.google.com/compute/docs/instance-templates
Instance Groups: https://cloud.google.com/compute/docs/instance-groups/
Autoscaling: https://cloud.google.com/compute/docs/autoscaler/

12-factor "store state in the environment" has tradeoffs

Operation                                     Time in ns    Time in ms
                                                            (1 ms = 1,000,000 ns)
L1 cache reference                                     1
Branch misprediction                                   3
L2 cache reference                                     4
Mutex lock/unlock                                     17
Main memory reference                                100
Compress 1 kB with Zippy                           2,000    0.002
Send 2 kB over 1-Gbps network                      2,000    0.002
Read 1 MB sequentially from memory                 7,000    0.007
SSD random read                                   16,000    0.016
Read 1 MB sequentially from SSD                  123,000    0.123
Round trip within same data center               500,000    0.5
Read 1 MB sequentially from disk              10,000,000    10
Read 1 MB sequentially from 1-Gbps network    10,000,000    10
Disk seek                                     10,000,000    10
Send packet CA->Netherlands->CA              150,000,000    150

10 / 0.123 = 81 times slower!

Look at the difference, for example, between reading 1 MB sequentially from a local
SSD and reading it over a 1-Gbps network: 123,000 nanoseconds versus 10,000,000
nanoseconds. The 12-factor advice is to store configuration information and state
information separately from the processing, that is, separately from the VM. Moving
that data from the local SSD to networked storage makes reads roughly 81 times
slower. This is an example of the real cost of choosing reliability over speed.
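The "81 times slower" figure can be reproduced from the table. The sketch below uses only the numbers shown on the slide; the dictionary keys and function names are made up for this example.

```python
# Latencies in nanoseconds, copied from the slide.
LATENCY_NS = {
    "read_1mb_ssd": 123_000,
    "round_trip_same_datacenter": 500_000,
    "read_1mb_network_1gbps": 10_000_000,
}


def ns_to_ms(nanoseconds):
    """Convert nanoseconds to milliseconds (1 ms = 1,000,000 ns)."""
    return nanoseconds / 1_000_000


def slowdown(slow, fast):
    """How many times longer the slow operation takes than the fast one."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


ratio = slowdown("read_1mb_network_1gbps", "read_1mb_ssd")
print(f"Network read is about {ratio:.0f} times slower than local SSD")
```

The ratio is 10,000,000 / 123,000, which rounds to 81, matching the 10 / 0.123 calculation on the slide.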
Mapping Compute Needs to Platform Products

Business logic (the application) uses CPU

Where does it get compute resources?

If you can use a native GCP service, you may not need to design anything fancier.

App Engine is code-first so it is easy to use to create new applications. It
autoscales. It is highly available and reliable. App Engine is often a
sufficient solution on its own.

Start with App Engine. If you identify exceptions that can't be handled by App Engine
standard environment, look at containers in both App Engine flexible environment and
Kubernetes Engine. If those don't handle the exceptions, look to Compute Engine.

App Engine

Code first

Focus on programming, minimize IT work

Minimize operations overhead

Have scale and reliability handled by the platform

Containers can be run on App Engine flexible environment

https://cloud.google.com/docs/choosing-a-compute-option
App Engine use cases: Web sites, mobile app and gaming backends, RESTful APIs,
Internal Line of Business (LOB) apps, Internet of things (IoT) apps.
Kubernetes Engine

Platform independence

Separate application from OS

No OS dependencies

Already using Kubernetes and need to scale

Application can be containerized

The choice for containers is between App Engine flexible environment (which can run
containers) and Kubernetes Engine. App Engine flexible environment is code-first, the
platform is proprietary, and some elements are not exposed or controllable by the
application. Kubernetes Engine is container-first, the platform is open, and some
elements of the platform are not handled automatically for you, so you should plan on
doing some IT work.
Kubernetes Engine use cases: Containerized workloads, cloud-native distributed
systems, hybrid applications.
Compute Engine

Migrating an application from a Data Center without rewriting it

Dependencies on a specific OS

Required to use an existing VM image

Direct hardware access (GPUs, SSDs)

Driver-level access

Hardware performance is critical

In general, if you already know which is more important in your design, (1)
infrastructure control, (2) code development, or (3) a balance of both, then you should
consider (1) Compute Engine, (2) App Engine, or (3) Kubernetes Engine,
respectively.

Compute Engine use cases: Any workload requiring a specific OS or OS configuration;
currently deployed, on-premises software that you want to run in the cloud.
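The "start with App Engine, then look at containers, then Compute Engine" guidance from this section can be written down as a simple rule of thumb. The requirement labels below paraphrase the slide bullets and are assumptions of this sketch; a real decision would weigh many more factors.

```python
# Requirement labels paraphrased from the Compute Engine and Kubernetes Engine
# slides. They are illustrative, not an official taxonomy.
COMPUTE_ENGINE_REASONS = {
    "specific_os",
    "existing_vm_image",
    "direct_hardware_access",
    "driver_level_access",
}
CONTAINER_REASONS = {
    "containerized",
    "already_using_kubernetes",
    "platform_independence",
}


def suggest_platform(requirements):
    """Return a starting point for discussion, not a final answer."""
    requirements = set(requirements)
    if requirements & COMPUTE_ENGINE_REASONS:
        return "Compute Engine"
    if requirements & CONTAINER_REASONS:
        return "App Engine flexible environment or Kubernetes Engine"
    return "App Engine standard environment"
```

With no special requirements the function falls through to App Engine standard environment, which mirrors the advice to start there and move on only when an exception is identified.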
Compute System Provisioning

Deciding how the system will acquire new compute resources and adapt to changing
requirements.

App Engine is an autoscaling platform.

Kubernetes Engine autoscales based on containers and pods within the cluster.

So this section applies to Compute Engine and VMs.

Vertical scaling Horizontal scaling

Do you have a funeral when the server dies?

In the vertical scaling model, the answer is "yes". Your users and your service are
attached to and dependent on one piece of hardware.
In the horizontal scaling model, the answer is "no". One server loss is only part of the
service, and other parts can pick up the load and keep running.

Small servers (horizontal scaling)

● Easy to schedule (e.g., binpack)
● Decrease per-instance failure cost (capacity, recovery, etc.)
● Incremental scaling
● Cheaper at large scale
Large servers (vertical scaling)
● Lowers overhead (unlikely to matter)
● Cheaper at small scale, maybe
Horizontal scaling issues and answers

More server lifecycles to manage (deployment complexity)

● Automation makes this easy

End-to-end latency increases slightly

● Requirements will indicate whether the latency matters

More overhead, but unlikely to matter

● Outweighed by the benefits of decoupling scaling, failures, upgrades,
configuration, and so forth

A hard disk might have a 4-million-hour average time to failure. At cloud scale this
means hard disks are constantly failing.
That means there are more server lifecycles to manage. But virtualization and
automation effectively solve this problem.

To manage traffic to multiple servers, there must be a decision point, such as a load
balancer. That slightly increases latency to the server versus a vertically scaled,
single-server approach.

There is more overhead in operating and maintaining a lot of servers. But this is
unlikely to matter compared to the benefits of being able to add capacity and to
handle individual server failures without impacting the overall availability of the service
to the users.
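To make the disk-failure point in these notes concrete, here is a back-of-the-envelope calculation. It assumes failures are independent and occur at a constant rate, and the fleet size of one million disks in the usage note is a hypothetical number chosen for illustration, not a figure from the course.

```python
MTTF_HOURS = 4_000_000  # average time to failure of one disk, from the notes


def expected_failures_per_day(disk_count, mttf_hours=MTTF_HOURS):
    """Expected number of disk failures per day across the whole fleet."""
    return disk_count * 24 / mttf_hours


def hours_between_failures(disk_count, mttf_hours=MTTF_HOURS):
    """Average time between failures somewhere in the fleet."""
    return mttf_hours / disk_count
```

For one disk, a failure is expected about once every 4 million hours. For a hypothetical fleet of 1,000,000 disks, expected_failures_per_day returns 6.0, or one failure somewhere in the fleet every 4 hours, which is why replacement has to be automated.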
Horizontal scaling design

Keep servers simple: Do one thing well

● Minimize complexity
● Construct simple and concise APIs
● Identify where tasks are separable
● Split into separate servers

Prefer small, stateless servers

● Easy to scale; no state to shard/rebalance
● Failure is cheap; no state to migrate/recover
● Easy to load balance; no hot-spotting

[Diagram: Cloud Load Balancing receives N qps (queries per second) and distributes
N/3 qps to each server. Label: ranges of keys mapping to server]

Be very careful of caching in horizontal scaling designs. Stateless servers can serve
stale copies of data stored at stateful servers if the cache is not properly invalidated.
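The caching warning can be demonstrated with a toy example. The DataStore and FrontEnd classes are hypothetical; the version check shown is one simple way to validate a cache entry, used here only to contrast with a cache that is never invalidated.

```python
class DataStore:
    """Hypothetical stateful server that owns the data."""

    def __init__(self):
        self._data = {}
        self._versions = {}

    def write(self, key, value):
        self._data[key] = value
        self._versions[key] = self._versions.get(key, 0) + 1

    def read(self, key):
        return self._data[key], self._versions[key]

    def version(self, key):
        return self._versions[key]


class FrontEnd:
    """Hypothetical stateless server that keeps a local cache."""

    def __init__(self, store, validate):
        self.store = store
        self.validate = validate  # if False, cached entries are never re-checked
        self.cache = {}           # key -> (value, version)

    def get(self, key):
        if key in self.cache:
            value, version = self.cache[key]
            if not self.validate or version == self.store.version(key):
                return value
        # Cache miss or outdated entry: fetch from the store and remember it.
        value, version = self.store.read(key)
        self.cache[key] = (value, version)
        return value
```

After the store is updated, a FrontEnd created with validate=False keeps returning the old value, while one created with validate=True notices the version change and refetches. Validation costs an extra round trip, which is the latency tradeoff discussed on the next slide.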
Tradeoffs: Balance latency, capacity, scalability, and cost

Small stateless servers increase        Large stateful servers reduce
reliability and scalability             complexity and latency

Divide into parts                       Unify
Duplicate and coordinate                Simplify and consolidate
Separate and isolate                    Coalesce and colocate

Methods of achieving balance in your design

● What are your SLOs? What do your users value?
● What is the optimal size and number of parts?
● Sometimes central control is necessary/optimal
● Plan on adjusting and build adjustment processes

The previous slides make a strong case for small stateless servers. However, as
mentioned in "Defining the service", you have to consider the design in context. And
no design solution fits every case. Determine whether that design makes sense for
specific components of your service.

The history of computing shows many periods when design skewed towards
centralization and unified control, followed by periods when design skewed towards
decentralization and distributed systems. Choosing the degree of centralization or
decentralization is not a philosophical debate, but a practical necessity. You need
to strike a balance between these two by examining each part of your application and
balancing the tradeoffs in terms of latency, capacity, scalability, and cost. It is possible
to divide an application into parts so small that it becomes unmanageable and
unmaintainable. Strike the right balance using measurement (SLOs and SLIs) to
determine what solution is optimal for your users.
This also means that your initial design may need to be adjusted as you get actual
measurable feedback. That's okay.
Build that feedback and adjustment process into your design of both the technical and
behavioral (human, operations, process) elements.
Design first, dimension later

Trying to dimension the solution before the design is completed and before it is iterated
and evolved can lead to confusion. The same is true of cost optimization.

How many machines of what capacity?

● Network: queries per second or bandwidth
● Memory: data stored in memory for speed; MB or GB
● Storage: data stored on local disk (PD or SSD); GB or TB

How can the cost be minimized?

Great questions. They just come later in the design process.

You arrive at the design by thinking about the qualities that you want in your
system: which parts you want to scale, and how you get reliability into the system.
Then you handle dimensioning separately.
View current documentation for current offerings of size and price of GCP resources.
The photo service is slow

This section continues developing the design of the thumbnail photo service.

What is the cause of the slow service?

What can be done about it?

What lessons does this offer for design?

Business problem: Users complain that the service is slow

Business Logic

So the problem is that this process is slow, and this is the users’ experience of the
service. Let’s think about what you can do.
PROCESS
Systematic logical troubleshooting

1. Segment and reduce the problem space.
2. Step through the system manually in your mind.
3. Add more monitoring or logging.

Intermittent failures can be caused by multiple simultaneous factors.

It’s necessary to have a systematic and logical process for troubleshooting. Start by
segmenting and reducing the problem space. Where could slow-downs affect the user
experience? Maybe every component contributes, maybe none does.
Step through the system manually in your mind and get an idea of what the business
logic looks like.
Add monitoring or logging if you’re not getting that information. Once your experience
no longer points you to the cause, you need to rely on actual data to troubleshoot the
problem, and without monitoring there is none. So this is where you ask yourself:
can I add more instrumentation now so that I can troubleshoot this in the future?


Reference: SRE Book: Chapter 12 - Effective Troubleshooting

Reference: SRE Book: Chapter 16 - Tracking Outages

Each step should reduce the set of possible causes

● Don't check hypotheses at random; partition the problem space
Leverage system knowledge to step through the system operation mentally
● Keep diagrams and docs updated
● Sanity check expected component behavior
If you can't identify the root cause, add more monitoring or logging
Intermittent failures and performance degradation are the hardest to troubleshoot
● Can be multiple simultaneous factors, appearing random
● Historical sub-operation performance data helps diagnose performance issues
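One lightweight way to collect sub-operation performance data is to time each stage of a request. The sketch below uses only the Python standard library; the stage names in the usage note and the timings dictionary are illustrative assumptions, not part of the photo service design.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)  # stage name -> list of durations in seconds


@contextmanager
def timed(stage):
    """Record how long the enclosed block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage].append(time.perf_counter() - start)


def slowest_stage():
    """Return the stage with the highest average recorded duration."""
    return max(timings, key=lambda stage: sum(timings[stage]) / len(timings[stage]))
```

Wrapping each step, for example `with timed("ingest"): ...` and `with timed("convert"): ...`, partitions the problem space by stage, so a slow request can be attributed to one sub-operation instead of being guessed at.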
PROCESS
Collaboration and communication

● Five Whys
● Being a hero can lead to longer
downtime
● Closed-group conversations can
cause confusion instead of
coordination
● People are never root cause.
Thinking they are...
○ stops analysis early
○ leads to fixing the wrong things

Collaboration and communication are important because troubleshooting often involves
more than one person. Use the Five Whys iterative interrogation technique to explore
possible cause-and-effect relationships. Unfortunately, being a hero can lead to longer
downtime, and closed-group conversations can cause confusion rather than
coordination. How are others going to benefit from the experience of this
troubleshooting if there’s no collaboration? History shows that people are never the
root cause of a problem. Thinking they are will often lead to the analysis process
ending prematurely and the wrong things being fixed. In addition, focusing on the
individual is counter-productive when trying to encourage collaboration.

Don't try to "fix it yourself" or "be a hero": it can lead to longer downtime
● 15-20 minutes to find a quick fix, then declare an incident
Keep communications regular and broad
● Closed-group conversations can cause confusion and a lack of coordination
among incident responders
Don't consider people as the root cause. Ask the right questions.
● Stops analysis before finding the real root cause
● Leads to fixing the wrong things

Useful (blameless) questions to ask about people and processes:

How can you make the systems, tools, and processes more immune to human
fallibility?
How can you give people better information to make decisions?
Was the information they had flawed or misleading?
Is there an automated way to fix the information?

The system should not have been able to fail this way.
Could software have prevented or mitigated this error?
Can this activity be automated so it doesn't require human intervention?

How likely is it that the next person could cause the same problem?
Could a new hire have made this mistake?
What is going on inside the photo thumbnail sample application?
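[Diagram: User Experience -> Ingest -> Image Storage -> Thumbnail Conversion
(Processing) -> Thumbnail Storage -> Serving Thumbnails -> User Experience.
The stages from Ingest through Serving Thumbnails make up the Business Logic.]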

User Experience

So let’s break down the business logic behind your photo service: you have the user
experience; you ingest data; you store it; you do some kind of processing; you store
it again; and then you serve that back up to the user. So if the user experience is
slow, there's something going on internally with one of these systems. The first thing
that comes to mind is usually based on your experience. But do you understand the
different attributes of each of these services? Well, let's start by identifying those.

What are the characteristics of each of the functions?
Single App Server Solution (Business Logic)

User Experience
● HTTP Server
● Dynamic / Static Content
● Session Handling

Ingest
● HTTP
● High Throughput
● Low disk I/O

Thumbnail Conversion
● High CPU
● Memory
● Low disk I/O

Image Storage
● Large files
● Small files
● Many files
● High r/w

Serving Thumbnails
● HTTP
● High read I/O
● Dynamic / Static content
● Session Handling

So looking at the user experience, you know there's a web server. Web servers are
pretty fast, but there’s dynamic and static content. So you might want to determine
things like: Is the web server responding? Is it DNS servers? Is there some kind of
rogue JavaScript, for example? What about session handling? Well, you’re on a
single box so you don't have to worry about that now, but that could come into play
later on. What if retries have to occur because you lost session info?
Now you get into ingest. So, ingesting is also going to be done through an HTTP
protocol. You know that you're going to do high throughput writes to disk. All you're
doing is writing long streams of large images, so the disk I/O - the actual transactions
- shouldn't be a problem.
Thumbnail conversions are going to consume a lot of the machine's resources. It's
going to be very CPU intensive and will consume a lot of memory. Disk I/O is probably
low, though, because all it has to do is read the image into memory, process
it, then generate an output.
What about the image storage? Well, this could be large and small files, so you might
keep those copies around. There are going to be a lot of files, so you might have to
worry about file systems. How do you keep track of millions or hundreds of millions of
files? It's going to be very disk I/O intensive, and that could be a potential issue
too.
And then you have considerations in serving that thumbnail back.
Now this might be quick to troubleshoot, and you might draw initial conclusions, but
what you initially identify might not be the problem.

Segregate services for better performance and scalability.

Business Issue: User experience is being impacted by the thumbnail conversion
processing.

● Offloaded the high-CPU process to another service.
● Reduced memory allocation on the web servers.
● Reduced some disk I/O.
● Moved image storage to the thumbnail processor.

[Diagram: the Web Server handles User Experience, Ingest, and Serving Thumbnails;
the App Server (Thumbnail Processing) handles Image Storage, Thumbnail Conversion
(Processing), and Thumbnail Storage.]

The business issue is that the user experience is being impacted by the thumbnail
conversion process. So what do you do? Well, why don't you offload that high CPU
process to another service? What is the benefit, though? How does it change our
attributes?

Well, you’ve reduced the memory allocations on the web servers, so now you can
have more sessions on the web servers. You've reduced a small amount of disk I/O,
because you won’t contribute to that any longer, and the web servers hardly use any
disk I/O, so that's okay. You've also moved the image storage to the thumbnail
processor. So now the web server itself doesn’t need to hold as much local storage,
and you're going to combine it all into the app server thumbnail processing machine.

The web server now handles the user experience, both ingesting images and serving
thumbnails. The image storage, the processing, and the output thumbnail storage are
going to be handled by a separate device. This makes logical sense; it's simple.
You're not over-complicating it. This is a natural, logical design. In fact, you can
do this on paper before you do this in production. It doesn't mean you have to keep
deploying in this way, but you're using these processes.

Objectives and Indicators

Objectives                                              Indicators
Availability, 23/24 hours/day = 95.83% availability     Aggregated server up/down time
99% of user operations completed in < 1 minute          End to end latency

What about our service level objectives and indicators? Even though we added more
servers, our availability SLI remained the same. We’re looking at the aggregated
server uptime instead of a specific server, but in reality the SLI is still just the measure
of the service uptime from the user’s perspective.

The real change is that we’re going to add a new performance-based SLI to the
service. We determined that the users require our service to respond in under one
minute. Therefore, our new SLO will be: “Complete the user’s operation in less than
one minute”, with a target of 99% of user operations. We will use the end-to-end
latency as our SLI, which represents the overall latency that the user experiences
while using the service. In this scenario, we could measure it by evaluating the
latency of the HTTP requests.
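To make the two indicators concrete, here is a minimal sketch of how they might be computed. The function names and the sample inputs are hypothetical; only the 23/24 hours figure, the one-minute threshold, and the 99% target come from the slide.

```python
def availability_percent(up_hours: float, total_hours: float) -> float:
    """Availability SLI: share of time the service was up, as a percentage."""
    return 100.0 * up_hours / total_hours


def fraction_within(latencies_s, threshold_s: float = 60.0) -> float:
    """Latency SLI: fraction of operations that finished under the threshold."""
    if not latencies_s:
        raise ValueError("no latency samples")
    fast = sum(1 for latency in latencies_s if latency < threshold_s)
    return fast / len(latencies_s)


def meets_latency_slo(latencies_s, threshold_s: float = 60.0,
                      target: float = 0.99) -> bool:
    """True when at least `target` of operations beat the threshold."""
    return fraction_within(latencies_s, threshold_s) >= target
```

For example, `availability_percent(23, 24)` rounds to 95.83, matching the availability objective on the slide.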
YOUR TURN

Design challenge #1
Log aggregation

https://pixabay.com/en/the-strategy-win-champion-1080527/
Introducing log files for the photo thumbnail service

Single App Server Solution

Log files

Field    ID     Session ID (12345)    Timestamp    Payload    Total
Size     8 B    16 B                  8 B          256 B      288 B

When the thumbnail service ran on a single server, we stored its log files on the
local machine. What do the log entries look like? Each one has an ID for the log
entry; a session ID to help us map the session to a user; a timestamp; and a
payload, which consists of application-specific information.
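The field sizes on the slide add up to a fixed 288-byte record. As a rough sketch, such a record could be packed and unpacked as below. The slide gives only the sizes, so the integer and byte-string encodings, the little-endian byte order, and the function names are all assumptions.

```python
import struct

# Assumed encoding: 8-byte unsigned ID, 16-byte session ID,
# 8-byte unsigned timestamp, 256-byte payload. "<" disables alignment padding.
LOG_ENTRY = struct.Struct("<Q16sQ256s")


def pack_entry(record_id: int, session_id: bytes,
               timestamp: int, payload: bytes) -> bytes:
    """Serialize one log entry into a fixed-size 288-byte record."""
    if len(session_id) > 16 or len(payload) > 256:
        raise ValueError("field exceeds its fixed size")
    # Short byte strings are padded with null bytes by the "s" format.
    return LOG_ENTRY.pack(record_id, session_id, timestamp, payload)


def unpack_entry(record: bytes):
    """Parse a 288-byte record back into its four fields."""
    record_id, session_id, timestamp, payload = LOG_ENTRY.unpack(record)
    return record_id, session_id.rstrip(b"\0"), timestamp, payload.rstrip(b"\0")
```

`LOG_ENTRY.size` evaluates to 288, which matches the total shown on the slide.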
Log data is now segregated

Business Issue: For proper troubleshooting the segregated log files must be joined.

Design challenge:
Combine log files from separate locations and join them together into a single log.

[Diagram: the Web Server writes web server log files and the App Server (Thumbnail
Processing) writes thumbnail server log files; both feed the joined log files.]

Entry    ID     12345 (common field)    Timestamp    Payload    Total
Web      8 B    16 B                    8 B          256 B      288 B
App      8 B    16 B                    8 B          256 B      288 B

Recall that each log entry contains an Entry ID, a timestamp, and a payload. The
objective of the challenge is to design a system that appends Web and App log entries
to each other based on a shared Entry ID.
Developing one solution together

Remember, there are multiple valid solutions to this challenge.

The class is going to walk through one of these solutions together.
In later exercises you will be challenged to develop or add to the design and compare
your solution to the elements and reasoning in the sample solution.
Logs on two servers, aggregate to a single log

[Diagram: Web Logs and App Logs are copied from the Web and App servers to a Logging
Server, which runs a daily cron batch job: Ingest → Append → Transform → Output.]

When the design migrated to two servers, the log files ended up on two different
servers. To do troubleshooting, the log files should be re-integrated into a single
joined log file.

Ingest
Append
Transform

The slide shows the business logic defined for the new Logging Server component.
Business logic
Log entries: the Entry ID field is shared.

Fields: ID | Entry ID | Timestamp | Payload

Webserver Logs:        ID A | 12345 | Timestamp A | Payload A
App Logs:              ID B | 12345 | Timestamp B | Payload B
                       (matching Entry ID)

APPEND

Webserver + App logs:  ID A | 12345 | Timestamp A | Payload A | Timestamp B | Payload B

There are two types of log entries: Webserver Logs (identified as type 'A') and
Application Logs (App Logs, identified as type 'B').

The business logic is "Append", which joins parts of the App Log to the matching
Webserver Log.
Each log entry has an Entry ID, a timestamp, and some entry-specific payload.

The output is a series of appended logs.
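The "Append" step can be sketched as a join on the shared Entry ID. This is one possible reading of the slide, not the course's reference solution: the type and function names are hypothetical, and dropping entries that have no match on the other side is an assumption.

```python
from typing import Dict, Iterable, List, NamedTuple


class LogEntry(NamedTuple):
    record_id: int   # "ID A" or "ID B" on the slide
    entry_id: int    # shared field, e.g. 12345
    timestamp: int
    payload: bytes


class JoinedEntry(NamedTuple):
    record_id: int   # ID A is kept; ID B is dropped, as on the slide
    entry_id: int
    timestamp_a: int
    payload_a: bytes
    timestamp_b: int
    payload_b: bytes


def append_logs(web_logs: Iterable[LogEntry],
                app_logs: Iterable[LogEntry]) -> List[JoinedEntry]:
    """Append each App Log's timestamp and payload to the matching Webserver Log."""
    # Index the App Logs by Entry ID so each lookup is a dictionary access.
    app_by_entry: Dict[int, List[LogEntry]] = {}
    for b in app_logs:
        app_by_entry.setdefault(b.entry_id, []).append(b)

    joined: List[JoinedEntry] = []
    for a in web_logs:
        # Assumption: Webserver Logs without a matching App Log are skipped.
        for b in app_by_entry.get(a.entry_id, []):
            joined.append(JoinedEntry(a.record_id, a.entry_id,
                                      a.timestamp, a.payload,
                                      b.timestamp, b.payload))
    return joined
```

Building the index first keeps the batch job roughly linear in the number of entries, which matters more as the daily log volume grows.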

Output log

● App Log = 25 thousand records / day

● The Payload for A and B records has a maximum size of 256 B

Field    ID A    12345    Timestamp A    Payload A    Timestamp B    Payload B    Total
Size     8 B     16 B     8 B            256 B        8 B            256 B        552 B
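The figures on this slide allow a quick sizing estimate for the output log. The sketch below assumes that every App Log record produces exactly one joined record and that every payload is at its 256 B maximum; the variable names are illustrative.

```python
# Field sizes in bytes for one joined record, taken from the slide.
FIELD_SIZES_B = {
    "id_a": 8,
    "entry_id": 16,
    "timestamp_a": 8,
    "payload_a": 256,
    "timestamp_b": 8,
    "payload_b": 256,
}
RECORDS_PER_DAY = 25_000  # App Log volume from the slide

record_size_b = sum(FIELD_SIZES_B.values())    # 552 bytes per joined record
daily_bytes = record_size_b * RECORDS_PER_DAY  # 13,800,000 bytes per day
daily_megabytes = daily_bytes / 1_000_000      # 13.8 MB per day (decimal MB)
```

Under these assumptions the output log grows by at most about 13.8 MB per day.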

Deployment Manager: Package and deploy

Lab 2: How to customize an instance, install software and run an application at boot from
Deployment Manager. (Echo application).

In the lab, you will be working with the simplest service possible: an echo application.

In this lab you will learn to take the pre-written Python echo application, which
uses application framework libraries, and package it for deployment on the cloud
using the Python package manager.
In the previous lab you just started an instance. In this lab you will bring up an
instance and perform the customization necessary to update and install software, to
host the echo application, and to handle configuration of other elements in the
environment, such as networking.

You will build on this Deployment Manager experience in subsequent labs.
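The lab's echo application itself is not reproduced in these notes. As a stand-in, here is a minimal sketch of what an HTTP echo service could look like, using only the Python standard library rather than an application framework. The class name, function name, and default port are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Replies to every POST with the same body it received."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type",
                         self.headers.get("Content-Type", "text/plain"))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def make_server(host: str = "", port: int = 8080) -> HTTPServer:
    """Create the echo server without starting it."""
    return HTTPServer((host, port), EchoHandler)
```

Calling `make_server().serve_forever()` would start the service; on an instance, a command like that is the kind of thing a boot-time script would run.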
