Kafka SlidesShare

Uploaded by

Mohinuddin Mustaq

Mirror Maker

ZooKeeper Basics
● Open Source Apache Project
● Distributed Key Value Store
● Maintains configuration information
● Stores ACLs and Secrets
● Enables highly reliable distributed coordination
● Provides distributed synchronization
● Three or five servers form an ensemble
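The preference for three or five servers follows from majority arithmetic. The sketch below assumes that the ensemble stays available only while a strict majority of its servers is reachable; the function names are illustrative and not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5, 6):
    print(f"ensemble={n} quorum={quorum_size(n)} tolerated_failures={tolerated_failures(n)}")
```

Under this assumption an ensemble of four tolerates no more failures than one of three, which is why an extra even-numbered server adds cost without adding fault tolerance.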
Brokers Manage Partitions
● Messages of Topic spread across Partitions
● Partitions spread across Brokers
● Each Broker handles many Partitions
● Each Partition stored on Broker’s disk
● Partition: 1..n log files
● Each message in Log identified by Offset
● Configurable Retention Policy
Questions:
● Why do we need an odd number of ZooKeeper nodes?
● How many Kafka brokers can a cluster maximally have?
● How many Kafka brokers do you minimally need for high
availability?
● What are the criteria for two or more consumers to form a
consumer group?
Zookeeper
◦ ZooKeeper manages the brokers and keeps a list of them
◦ ZooKeeper helps perform the leader election for partitions
◦ ZooKeeper notifies Kafka in case of:
    ◦ New topic
    ◦ Deletion of a topic
    ◦ New broker
    ◦ Broker dies

Demo using command line

Assuming ZooKeeper and Kafka are installed, and Kafka is running on two brokers (ports 9093 and 9094):

• Create a topic with 2 partitions and a replication factor of 1

• sh [Link] --zookeeper [Link]:2181 --create --topic first_topic --partitions 2 --replication-factor 1

• To list the topics: “sh [Link] --zookeeper [Link]:2181 --list”

• Produce messages to the topic created

• sh [Link] --broker-list [Link]:9093,[Link]:9094 --topic first_topic

• Consume the messages from the topic created

• sh [Link] --bootstrap-server [Link]:9093,[Link]:9094 --topic first_topic --from-beginning

• Enter a couple of messages on the producer console and see the corresponding messages displayed on the consumer console.
• Consumer group – To test the consumer group concept, execute the command below in two different terminals. You will notice that messages are consumed according to partitions.
• sh [Link] --bootstrap-server [Link]:9093,[Link]:9094 --topic first_topic --consumer-property [Link]=mygroup1 --from-beginning
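The behaviour seen in the consumer group demo can be pictured with a small sketch. It assumes a simple round-robin spread of partitions over the members of one group; this is an illustration only, not Kafka's actual assignor, and the consumer names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Spread the partitions of a topic over the consumers of one group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Two terminals in the same group, topic with 2 partitions (as in the demo).
print(assign_partitions([0, 1], ["terminal-1", "terminal-2"]))

# A third consumer in the same group has no partition left to read from.
print(assign_partitions([0, 1], ["terminal-1", "terminal-2", "terminal-3"]))
```

Each partition has exactly one owner inside the group, so with two partitions a third consumer would sit idle.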
Producer Role
The primary role of a Kafka producer is to take producer properties and a record as inputs and write the record to the appropriate Kafka broker. Producers serialize, partition, compress, and load-balance data across brokers based on partitions.

Properties

Some of the producer properties are bootstrap servers, acks, [Link], [Link] [Link],
[Link] and many more. We will discuss some of these properties later in this article.
Producer record

A message that should be written to Kafka is referred to as a producer record. A producer record must have the name of the topic it should be written to and the value of the record. Other fields like partition, timestamp and key are optional.

Broker and metadata discovery

Bootstrap server

Any broker in a Kafka cluster can act as a bootstrap server. Generally, a list of bootstrap servers is passed instead of just one server. At least 2 bootstrap servers are recommended.
In order to send a producer record to the appropriate broker, the producer first establishes a connection to one of the bootstrap servers. The bootstrap server returns the list of all the brokers available in the cluster and all the metadata details like topics, partitions, replication factor and so on. Based on the list of brokers and the metadata details, the producer identifies the broker that hosts the leader partition for the producer record and writes to that broker.

Workflow
The diagram below shows the workflow of a producer.
The workflow of a producer involves five important steps:

1. Serialize
2. Partition
3. Compress
4. Accumulate records
5. Group by broker and send

Serialize

In this step, the producer record gets serialized based on the serializers passed to the
producer. Both key and value are serialized based on the serializer passed. Some of the
serializers include string serializer, byteArray serializer and ByteBuffer serializers.

Partition

In this step, the producer decides which partition of the topic the record should be written to. By default, the murmur2 algorithm is used for partitioning. Murmur2 generates a hash code from the key passed, and the partition is derived from that hash, so equal keys always map to the same partition (different keys may still share a partition). If no key is passed, the partitions are chosen in a round-robin fashion.

It’s important to understand that by passing the same key to a set of records, Kafka will ensure that the messages are written to the same partition in the order received, as long as the number of partitions does not change. If you want to retain the order of messages received, it’s important to use an appropriate key for the messages. A custom partitioner can also be passed to the producer to control which partition a message should be written to.
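The key-to-partition mapping described above can be sketched as follows. The hash is modelled on the murmur2 variant used by Kafka's Java client (seed 0x9747b28c, result forced positive before the modulo); treat it as an illustration that has not been verified against a live cluster, and note that `partition_for` is a hypothetical helper name.

```python
MASK32 = 0xFFFFFFFF


def murmur2(data: bytes) -> int:
    """32-bit murmur2 hash, returned as an unsigned integer."""
    seed, m, r = 0x9747B28C, 0x5BD1E995, 24
    length = len(data)
    h = (seed ^ length) & MASK32
    # Mix the input four bytes at a time (little-endian).
    for i in range(length // 4):
        k = int.from_bytes(data[i * 4:i * 4 + 4], "little")
        k = (k * m) & MASK32
        k ^= k >> r
        k = (k * m) & MASK32
        h = (h * m) & MASK32
        h ^= k
    # Fold in the remaining one to three bytes, if any.
    tail = data[length & ~3:]
    if len(tail) == 3:
        h ^= tail[2] << 16
    if len(tail) >= 2:
        h ^= tail[1] << 8
    if len(tail) >= 1:
        h ^= tail[0]
        h = (h * m) & MASK32
    h ^= h >> 13
    h = (h * m) & MASK32
    h ^= h >> 15
    return h


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a serialized key to a partition index."""
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions


for key in (b"order-1", b"order-2", b"order-1"):
    print(key, "->", partition_for(key, 2))
```

The same key always yields the same partition for a fixed partition count; changing `num_partitions` can move a key, which is why ordering per key only holds while the partition count stays the same.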

Compression

In this step, the producer record is compressed before it’s written to the record accumulator. By default, compression is not enabled in the Kafka producer. Below are supported compression types:

Compression enables faster transfer not only from producer to broker but also during replication. Compression can improve throughput, lower latency, and improve disk utilization. Refer to
[Link] for benchmark
details.
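A rough feel for the savings can be had with the standard library. The sketch below uses `zlib` as a stand-in for a producer compression codec and invented JSON metrics as sample records; it assumes the codec is applied to a group of similar records at once, and it is not a benchmark.

```python
import json
import zlib

# Hypothetical sample records: small, repetitive JSON messages.
messages = [
    json.dumps({"host": f"web-{i % 4}", "metric": "cpu", "value": i % 100}).encode()
    for i in range(500)
]

raw_bytes = sum(len(m) for m in messages)
# Compressing each record on its own leaves little redundancy to exploit.
one_by_one_bytes = sum(len(zlib.compress(m)) for m in messages)
# Compressing the records together lets the codec reuse repeated patterns.
batch_bytes = len(zlib.compress(b"\n".join(messages)))

print("raw:", raw_bytes)
print("compressed one by one:", one_by_one_bytes)
print("compressed as one batch:", batch_bytes)
```

The gap between the last two numbers is the reason compression and batching are usually tuned together.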

Record accumulator

In this step, the records are accumulated in a buffer per partition of a topic. Records are grouped
into batches based on producer batch size property. Each partition in a topic gets a separate
accumulator/buffer.
Sender thread

In this step, the batches of each partition in the record accumulator are grouped by the broker to which they are to be sent. The records in a batch are sent to a broker based on the [Link] and [Link] properties. The producer sends a batch when either of two conditions is met: the defined batch size is reached, or the defined linger time has elapsed.
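The two send conditions can be sketched with a toy accumulator. This is a simplified model under stated assumptions: the class and parameter names (`RecordAccumulator`, `batch_size_bytes`, `linger_seconds`) are hypothetical, time is passed in explicitly instead of being read from a clock, and the grouping of ready batches by broker is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    created_at: float
    records: list = field(default_factory=list)
    size_bytes: int = 0


class RecordAccumulator:
    """Keeps one open batch per (topic, partition)."""

    def __init__(self, batch_size_bytes: int, linger_seconds: float):
        self.batch_size_bytes = batch_size_bytes
        self.linger_seconds = linger_seconds
        self.batches = {}

    def append(self, topic: str, partition: int, payload: bytes, now: float) -> None:
        key = (topic, partition)
        batch = self.batches.get(key)
        if batch is None:
            batch = Batch(created_at=now)
            self.batches[key] = batch
        batch.records.append(payload)
        batch.size_bytes += len(payload)

    def drain_ready(self, now: float) -> dict:
        """Remove and return batches that are full or have lingered long enough."""
        ready = {}
        for key in list(self.batches):
            batch = self.batches[key]
            full = batch.size_bytes >= self.batch_size_bytes
            lingered = now - batch.created_at >= self.linger_seconds
            if full or lingered:
                ready[key] = self.batches.pop(key).records
        return ready


acc = RecordAccumulator(batch_size_bytes=10, linger_seconds=0.5)
acc.append("first_topic", 0, b"abc", now=0.0)
print(acc.drain_ready(now=0.1))  # too small and too young: nothing to send
print(acc.drain_ready(now=0.6))  # linger time elapsed: batch is sent
```

A larger batch size or linger time trades a little latency for fewer, fuller requests.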

Duplicate message detection

Producers may send a duplicate message when a message was committed by Kafka but the acknowledgment was never received by the producer because of a network failure or other issues. From Kafka 0.11 onward, to avoid duplicates in this scenario, Kafka tracks each message by producer ID and sequence number. When a message arrives with the same producer ID and sequence number as an already committed message, Kafka treats it as a duplicate and does not commit it again, but still sends the acknowledgment back to the producer so that the producer can treat the message as sent.
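The duplicate check can be sketched as follows. This is a deliberately simplified model, not broker code: it keeps only the last sequence number per producer ID for a single partition, and the class name `PartitionLog` and the return strings are hypothetical.

```python
class PartitionLog:
    """Toy log for one partition that drops retried duplicates."""

    def __init__(self):
        self.records = []
        self.last_sequence = {}  # producer_id -> highest sequence committed

    def append(self, producer_id: int, sequence: int, value: bytes) -> str:
        last = self.last_sequence.get(producer_id, -1)
        if sequence <= last:
            # Already committed: acknowledge again, but do not write twice.
            return "ack (duplicate ignored)"
        self.records.append(value)
        self.last_sequence[producer_id] = sequence
        return "ack (appended)"


log = PartitionLog()
print(log.append(producer_id=7, sequence=0, value=b"hello"))
# The acknowledgment was lost, so the producer retries the same message.
print(log.append(producer_id=7, sequence=0, value=b"hello"))
print(len(log.records))
```

In both calls the producer receives an acknowledgment, while the log holds the message only once.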

Few other producer properties

• [Link] – Manages the buffer memory allocated to the producer.
• Retries – Number of times to retry sending a message. Default is 0. Retries may cause out-of-order messages.
• [Link] – The number of messages that can be sent without any acknowledgment. Default is 5. Set this to 1 to avoid out-of-order messages due to retries.
• [Link] – Maximum size of a message. Default 1 MB.

Summary
Based on the producer workflow and producer properties, tune the configuration to achieve the desired results. In particular, focus on the properties below.

• [Link] – batch size (messages) per request
• [Link] – time to wait before sending the current batch
• [Link] – compress messages

In part 3 of the series let’s understand Kafka producer delivery semantics and how to tune some
of the producer properties to achieve desired results.
Kafka producer delivery semantics
Published on May 10, 2019

Sylvester Daniel

Program Architect at Mindtree - Big Data, AWS, Azure, Machine Learning & Deep Learning

This article is a continuation of part 1 Kafka technical overview and part 2 Kafka producer
overview articles. Let's look into different delivery semantics and how to achieve those using
producer and broker properties.

Delivery semantics
Based on broker & producer configuration all three delivery semantics “at most once”, “at least
once” and “exactly once” are supported.

At most once

In at-most-once delivery semantics, a message is delivered no more than once. Losing a message is acceptable; delivering it twice is not. Typical use cases for at-most-once delivery include metrics collection, log collection and so on. Applications adopting at-most-once semantics can easily achieve higher throughput and lower latency.

At least once

In at-least-once delivery semantics, it is acceptable to deliver a message more than once, but no message should be lost. The producer ensures that all messages are delivered, even though this may result in message duplication. This is the most commonly preferred of the three semantics. Applications adopting at-least-once semantics may have moderate throughput and moderate latency.

Exactly once

In exactly-once delivery semantics, a message must be delivered only once and no message should be lost. This is the most difficult delivery semantic of all. Applications adopting exactly-once semantics may have lower throughput and higher latency compared to the other two semantics.
Delivery Semantics summary

The table below summarizes the behavior of all delivery semantics.

Producer delivery semantics

Different delivery semantics can be achieved in Kafka using Acks property of producer and
[Link] property of the broker (considered only when acks = all).

Acks = 0

When the acks property is set to zero, you get at-most-once delivery semantics. The Kafka producer sends the record to the broker and doesn't wait for any response. Messages once sent will not be retried in this setting. The producer uses a “send and forget” approach with acks = 0.

Data loss

In this mode, the chance of data loss is high, as the producer does not confirm that the message was received by the broker. The message may never have reached the broker, or a broker failure soon after message delivery can result in data loss.

Acks = 1
When this property is set to 1, you can achieve at-least-once delivery semantics. The Kafka producer sends the record to the broker and waits for a response from the broker. If no acknowledgment is received for the message sent, the producer retries sending the message based on the retry configuration. The retries property defaults to 0, so make sure it is set to the desired number or to
[Link].

Data loss

In this mode, the chance of data loss is moderate, as the producer confirms that the message was received by the broker (leader partition). Because replication to the follower partitions happens after the acknowledgment, data can still be lost. For example, if the broker goes down after sending the acknowledgment but before replication, the data is lost because the producer will not resend the message.
Acks = All

When the acks property is set to all, you get the strongest durability guarantee, which is the basis for exactly-once delivery semantics. The Kafka producer sends the record to the broker and waits for a response from the broker. If no acknowledgment is received for the message sent, the producer retries sending the message up to n times, based on the retry configuration. The broker sends the acknowledgment only after replication, based on the [Link] property.

For example, a topic may have a replication factor of 3 and [Link] of 2. In this case, the acknowledgment is sent only once at least two in-sync replicas, the leader included, have the record. In order to achieve exactly-once delivery semantics, the producer also has to be idempotent. Acks = all should be used in conjunction with [Link].

Data loss

In this mode, the chance of data loss is low, as the producer receives confirmation that the message was received by the broker (leader and follower partitions) only after replication. Because replication to the follower partitions happens before the acknowledgment, the chance of data loss is minimal. For example, if the broker goes down before replication and before sending the acknowledgment, the producer will not receive the acknowledgment and will send the message again to the newly elected leader partition.

Exception

When there are not enough in-sync replicas to satisfy the [Link] property, the broker returns an exception instead of an acknowledgment.
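The three acks settings and the exception case can be put side by side in a small decision sketch. It is a simplified model of the behaviour described in this section, not broker code; the function name, the returned strings, and the `NotEnoughReplicasError` class are hypothetical stand-ins.

```python
class NotEnoughReplicasError(Exception):
    """Stand-in for the exception returned instead of an acknowledgment."""


def acknowledge(acks, in_sync_replicas: int, min_insync_replicas: int) -> str:
    """Describe when, if ever, the producer gets an acknowledgment."""
    if acks == 0:
        return "no acknowledgment: the producer does not wait"
    if acks == 1:
        return "acknowledged once the leader has written the record"
    if acks == "all":
        if in_sync_replicas < min_insync_replicas:
            raise NotEnoughReplicasError(
                f"{in_sync_replicas} in-sync replica(s), {min_insync_replicas} required"
            )
        return "acknowledged once the in-sync replicas have the record"
    raise ValueError(f"unsupported acks value: {acks!r}")


# Replication factor 3 with a minimum of 2 in-sync replicas, as in the example.
print(acknowledge("all", in_sync_replicas=3, min_insync_replicas=2))
try:
    acknowledge("all", in_sync_replicas=1, min_insync_replicas=2)
except NotEnoughReplicasError as error:
    print("rejected:", error)
```

With acks = 0 or 1 the replica count plays no part in the decision, which is why the minimum in-sync replica setting only matters for acks = all.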
Safe producer

In order to create a safe producer that ensures minimal data loss, use the producer properties below.

Producer properties

• Acks = all (default 1) – Ensures replication before acknowledgement
• Retries = MAX_INT (default 0) – Retry in case of exceptions
• [Link] = 5 (default) – Parallel connections to broker

Broker properties

• [Link] = 2 (at least 2) – Ensures a minimum number of in-sync replicas (ISR).

Acks impact

The table below summarizes the impact of acks property on latency, throughput, and durability.
Summary

Configure the Kafka producer and broker to achieve the desired delivery semantics based on the following properties.

• Acks
• Retries
• [Link]
• [Link]

In part 4 of the series, let’s understand Kafka consumer, consumer group and how to achieve
different Kafka consumer delivery semantics.
Kafka Monitoring
Apache Kafka deals with transferring large amounts of real-time data (we can call it data in motion).

Monitoring provides end-to-end visibility into the stream and helps confirm that every message is delivered from producer to consumer.

Measuring how long messages take to be delivered also helps determine the source of issues in your cluster. We can monitor Kafka with the help of metrics.

While monitoring Kafka, it’s important to also monitor ZooKeeper, as Kafka depends on it.

Why to Monitor Kafka

Kafka monitoring is important to ensure timely data delivery and overall application performance, to know when to scale up, to detect connectivity issues, and to ensure that data is not lost, since we deal with streaming data.

The volume of data is large and a Kafka cluster involves several components: producers, consumers, and brokers.

Monitoring ensures that every component is working properly.

Network Request Rate

Monitor and compare the network throughput per server, if possible by tracking the
number of network requests per second.
[Link]: type=RequestMetrics, name=RequestsPerSec.

Network error Rate

Error conditions include dropped network packets, error rates in responses per request type, and the types of errors occurring.

Comparing network throughput with the related network error rates can help diagnose the reasons for latency.

Under-Replicated Partitions
To ensure data durability and that brokers are always available to deliver data, monitor the number of under-replicated partitions:
[Link]: type=ReplicaManager, name=UnderReplicatedPartitions

Total Broker Partitions

Simply knowing how many partitions a broker is managing can help you avoid errors and know when it’s time to scale out. The goal should be to keep the count balanced across brokers.
[Link]: type=ReplicaManager, name=PartitionCount – Number of partitions on the broker.

Log Flush Latency

Kafka stores data by appending to existing log files. Cache-based writes are flushed to physical storage.
Your monitoring strategy should include a combination of data replication and latency in the asynchronous disk log flush time.
[Link]: type=LogFlushStats, name=LogFlushRateAndTimeMs

Consumer Message Rate

Set baselines for expected consumer message throughput and measure fluctuations in the rate to detect latency and the need to scale the number of consumers up or down accordingly.
[Link]: type=ConsumerTopicMetrics, name=MessagePerSec, clientId=([-.\w]+) – Messages consumed per second.

Consumer Max Lag

Even with consumers fetching messages at a high rate, producers can still outpace them. This metric works at the level of consumer and partition, meaning each partition in each topic has its own lag for a given consumer.
[Link]: type=ConsumerFetcherManager, name=MaxLag, clientId=([-.\w]+) – Number of messages by which the consumer lags behind the producer.
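Per-partition lag and its maximum can be computed as in the sketch below. It assumes lag is the difference between the latest offset written to a partition and the offset the consumer has reached; the offsets and function names are invented for illustration and are not read from any Kafka metric.

```python
def partition_lags(log_end_offsets: dict, consumer_offsets: dict) -> dict:
    """Lag per (topic, partition): messages written but not yet consumed."""
    return {
        tp: end - consumer_offsets.get(tp, 0)
        for tp, end in log_end_offsets.items()
    }


def max_lag(lags: dict) -> int:
    """Worst lag across all partitions of the consumer."""
    return max(lags.values(), default=0)


# Hypothetical offsets for a topic with two partitions.
log_end_offsets = {("first_topic", 0): 120, ("first_topic", 1): 95}
consumer_offsets = {("first_topic", 0): 100, ("first_topic", 1): 95}

lags = partition_lags(log_end_offsets, consumer_offsets)
print(lags)
print("max lag:", max_lag(lags))
```

A maximum that keeps growing over time is the signal that producers are outpacing the consumer on at least one partition.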

Fetcher Lag
This metric indicates the lag, in number of messages, per follower replica; a growing lag suggests that replication has potentially stopped or has been interrupted. By monitoring the [Link] configuration parameter you can measure the time for which the replica has not attempted to fetch new data from the leader.
[Link]: type=FetcherLagMetrics, name=ConsumerLag, clientId=([-.\w]+), partition=([0-9]+)

Offline Partition Count

Offline partitions represent data stores unavailable to your application due to a server failure or restart. In a Kafka cluster, one of the brokers acts as the controller, managing the states of partitions and replicas and reassigning partitions when needed.
[Link]: type=KafkaController, name=OfflinePartitionCount – Number of partitions without an active leader.

Free Memory and Swap space Usage

Kafka performs best when swapping is kept to a minimum. To achieve this, set the JVM max heap size large enough to avoid frequent garbage collection activity, but small enough to leave space for filesystem caching. Additionally, if you have swap enabled, watch swap usage for increases in server swapping activity, as this can lead to Kafka operations timing out.
In many cases it’s best to turn off swap entirely, in which case the monitoring has to be adjusted accordingly.

Kafka Topic

We can say that a Kafka topic is conceptually similar to a table in a database.

But it’s definitely not a table, and Kafka isn’t a database.

A topic is where data (messages) gets published by producers and pulled by consumers.

Kafka Topic Configuration

2) To change the replication factor of a topic, add a JSON script with the content provided below:
Assume the script name is [Link].
{"version":1,
"partitions":[
{"topic":"sendInvitation","partition":0,"replicas":[0,1,2]},
{"topic":"sendInvitation","partition":1,"replicas":[0,1,2]},
{"topic":"sendInvitation","partition":2,"replicas":[0,1,2]},
{"topic":"xyz","partition":0,"replicas":[0,1,2]},
{"topic":"xyz","partition":1,"replicas":[0,1,2]}
]}
Then execute the following command to run and apply this script:
./kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file [Link] --execute
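Writing such a file by hand is error-prone (a stray trailing comma is enough to make the JSON invalid), so it can help to generate it. The sketch below builds the same structure with the standard `json` module; the helper name `build_reassignment` is hypothetical, and the topic names, partition counts and broker IDs are taken from the example above.

```python
import json


def build_reassignment(partition_counts: dict, replicas: list) -> dict:
    """Build a reassignment plan placing every partition on the given brokers."""
    partitions = [
        {"topic": topic, "partition": partition, "replicas": list(replicas)}
        for topic, count in partition_counts.items()
        for partition in range(count)
    ]
    return {"version": 1, "partitions": partitions}


plan = build_reassignment({"sendInvitation": 3, "xyz": 2}, replicas=[0, 1, 2])
# json.dumps always emits valid JSON; save the output to the script file.
print(json.dumps(plan, indent=1))
```

Listing three broker IDs per partition is what raises the replication factor to 3 when the plan is executed.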

Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka

SignalFx:
● Real-Time Cloud Monitoring Platform for Infrastructure, Micro-services and Applications
● 20 Kafka clusters in production
● 400 billion messages per day on the largest cluster
Kafka Message Compression

• Kafka supports end-to-end compression.

• Data is compressed by the Kafka producer client.
• Data is written in compressed format on Kafka
brokers, leading to savings on disk usage.
• Data is decompressed by Kafka consumer client.

• Enabling compression is as simple as setting the config [Link] on the Kafka producer client.

• Compression uses extra CPU and memory on producer/consumer.

• Snappy compression type worked best for us.

Racks changes based on broker placement across the racks
