Kubernetes Roadmap

Learn Prerequisites: Basic Linux Commands, Containerization Concepts, Networking Fundamentals, YAML Syntax, Cloud Fundamentals
Kubernetes Cluster Set-up: Self-Hosted Solutions, Turnkey Solutions, Managed Kubernetes Services, Kubernetes Distributions
Set-up & Learn Kubernetes CLI: Install kubectl, k9s, kubectx/kubens
Kubernetes Cluster
  Master Node / Control Plane: API Server, Scheduler, Kube Controller Manager, Cloud Controller Manager, etcd
  Worker Nodes / Data Plane: Kubelet, Kube-proxy, Container runtime
Kubernetes Concepts
  Basics: Namespace, Pods, ReplicaSet, Deployment, DaemonSet, StatefulSet, Jobs and CronJobs
Scheduling: Affinity and Anti-Affinity, Taints and Tolerations, Annotations
Services and Networking: Pod Networking, Services, Ingress, KVN
Monitoring & Logging: Kubernetes Dashboard, Prometheus, Grafana, ELK stack, etc.
Using Kubernetes, we can automate many of the manual processes involved in deploying,
managing, and scaling containerized applications.
Kubernetes aims to hide the complexity of managing containers through several key
capabilities, such as REST APIs and declarative templates that manage the entire
lifecycle.
Prerequisites?
Basic Linux Commands: Familiarize yourself with Linux command-line operations.
Containerization Concepts: Understand Docker and how containers work.
Networking Fundamentals: Grasp the basics of networking, including DNS, load
balancing, and port mapping.
YAML Syntax: Learn how to write and understand YAML files.
Cloud Fundamentals: Get a basic understanding of cloud services and providers like
AWS, Azure, or GCP.
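As a quick taste of the YAML syntax you will use throughout Kubernetes, here is a minimal pod manifest; the names and image tag are illustrative:

```yaml
# YAML basics in one manifest: key-value pairs, nesting by indentation,
# lists introduced with "-", and "#" comments.
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod          # illustrative name
  labels:
    app: hello
spec:
  containers:
    - name: hello
      image: nginx:1.25    # illustrative image tag
      ports:
        - containerPort: 80
```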
History of Kubernetes
2003-2004: Birth of the Borg system
2013: From Borg to Omega
2014: Google introduces Kubernetes
2015: Kube v1.0 & CNCF
2016: Kubernetes goes mainstream! Enterprise adoption & support
2017: More cloud adoptions - GKE, EKS, and AKS become generally available
Kubernetes Cluster
What is a Kubernetes Cluster?
A Kubernetes cluster is a collection of nodes on which workloads can run. These
nodes can be physical (bare metal) machines, virtual machines (VMs), or serverless
compute systems like Amazon Fargate.
Tips
Worker nodes run the workloads assigned by the master node.
There must be a minimum of one master node and one worker node for a Kubernetes cluster to be operational.
Kube Controller Manager: It continuously compares the current state with the desired
state and watches for configuration changes; when a change occurs, it detects it and
works to bring the cluster back to the desired state.
Cloud Controller Manager: This is responsible for managing controller processes that
depend on the underlying cloud provider.
etcd: A simple, distributed key-value store used to hold the Kubernetes cluster data
(such as the number of pods, their state, namespaces, etc.).
Worker Node(s)
What are Worker Nodes?
These are the machines/nodes that host the data plane, where containers (workloads) are deployed.
Every node in the cluster must run a container runtime such as Docker, as well as the below-
mentioned components.
Components
Kubelet: This is responsible for the running state of each node,
regularly taking in new or modified pod specifications (primarily
through the kube-apiserver), and ensuring that pods and their
containers are healthy and running in the desired state. This
component also reports to the master on the health of the host where
it is running.
Container: This resides inside a pod. The container is the lowest level
of a micro-service, which holds the running application, libraries, and
their dependencies. Containers can be exposed to the world through
an external IP address. Kubernetes has supported Docker containers
since its first version. In July 2016, the rkt container engine was added.
Kubernetes Cluster Set-up
Different ways to set up a Kubernetes cluster:
Self-Hosted Solutions
Turnkey Solutions
Managed Kubernetes Services
Kubernetes Distributions
1. Self-Hosted Solutions
Kubeadm: A tool provided by the Kubernetes project for easily setting up a Kubernetes cluster. You manage both the control
plane and worker nodes. Follow this article for steps.
Kubespray: A community project that uses Ansible playbooks to deploy and manage Kubernetes clusters. Follow this article for
steps.
k0s: A lightweight, single-binary Kubernetes distribution that can be used for creating Kubernetes clusters with minimal
overhead. Follow this article for steps.
MicroK8s: A lightweight, single-node Kubernetes solution for local development by Canonical (Ubuntu). Follow this article for
steps.
k3s: A lightweight Kubernetes distribution optimized for edge and IoT devices by Rancher Labs. Follow this article for steps.
2. Turnkey/Minimum Effort Solutions
Minikube: A local Kubernetes cluster for development purposes, which runs on a single node. Follow this article to get started.
Kind (Kubernetes IN Docker): A tool for running local Kubernetes clusters using Docker container nodes. Ideal for testing
Kubernetes clusters and CI/CD workflows. Follow this article to get started
Docker Desktop Kubernetes: Kubernetes can be enabled within Docker Desktop, providing a local, single-node cluster for
development purposes. Follow this article to get started.
Vagrant + kubeadm: Use Vagrant to provision VMs and kubeadm to set up a Kubernetes cluster for development or testing.
Follow this article to get started.
3. Managed Kubernetes Services
AWS Elastic Kubernetes Service (EKS): Fully managed service by AWS that handles most of the control plane operations.
Google Kubernetes Engine (GKE): Managed Kubernetes service by Google Cloud, offering features like auto-upgrades and
node pools.
Azure Kubernetes Service (AKS): Managed Kubernetes service by Azure, providing integrated CI/CD with Azure DevOps.
IBM Cloud Kubernetes Service: Managed Kubernetes service by IBM Cloud.
Oracle Kubernetes Engine (OKE): Managed Kubernetes service by Oracle Cloud Infrastructure.
4. Kubernetes Distributions
OpenShift: An enterprise Kubernetes platform by Red Hat, with additional tools and services for enterprise use.
Tanzu Kubernetes Grid: VMware's enterprise-grade Kubernetes distribution, part of their Tanzu suite.
Charmed Kubernetes: Canonical's Kubernetes distribution provides a complete, production-grade solution on Ubuntu.
Kubernetes Concepts
CLI
kubectl
kubectl is a command-line tool that interacts with the kube-apiserver
and sends commands to the master node. Each command is converted
into an API call.
Syntax
kubectl [command] [TYPE] [NAME] [flags]
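A few common invocations of this syntax; the resource and object names are illustrative, and the commands assume a running cluster:

```shell
kubectl get pods                            # [command]=get, [TYPE]=pods
kubectl get pod my-pod -o yaml              # [NAME]=my-pod, [flags]=-o yaml
kubectl describe deployment my-deployment   # detailed state and events
kubectl delete pod my-pod                   # remove an object
```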
kubectx/kubens: Tools for switching between Kubernetes clusters (kubectx) and namespaces (kubens).
Useful Commands:
kubectx: List all available contexts.
kubectx <context_name>: Switch to a specific context.
kubens <namespace_name>: Switch to a specific namespace.
One common use case for ReplicaSets is to maintain a stable set of replica
Pods running at any given time. This is particularly useful for applications
that require a certain level of availability, such as web servers or
databases. By defining a ReplicaSet with a desired number of replicas,
you can ensure that the necessary resources are always available to
handle user requests.
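A minimal sketch of such a ReplicaSet; the names and image are illustrative:

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-rs
spec:
  replicas: 3                # desired number of pods kept running at all times
  selector:
    matchLabels:
      app: web
  template:                  # pod template used to create the replicas
    metadata:
      labels:
        app: web             # must match the selector above
    spec:
      containers:
        - name: web
          image: nginx:1.25
```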
A typical container resources section might specify, for example:
The CPU request set at 250 millicpu (or 0.25 of a CPU), with the limit set at 500 millicpu (or 0.5 of a CPU).
The memory request set at 64 MiB, with a limit of 128 MiB.
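A sketch of a container spec with those requests and limits; the pod and container names are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: "250m"      # 0.25 CPU guaranteed to the container
          memory: "64Mi"
        limits:
          cpu: "500m"      # hard ceiling of 0.5 CPU
          memory: "128Mi"  # exceeding this gets the container OOM-killed
```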
Assigning Quotas to Namespaces
Resource quotas are a way to limit the total amount of memory and CPU resources
that can be used by all pods in a namespace. This is useful for multi-tenant clusters
where resource utilization needs to be controlled across different teams or projects.
Here's an example of how to create a resource quota.

request_limit_namespace_example.yaml:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: example-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "1"      # Total CPU requests across all pods
    requests.memory: 1Gi   # Total memory requests across all pods
    limits.cpu: "2"        # Total CPU limits across all pods
    limits.memory: 2Gi     # Total memory limits across all pods
    pods: "10"             # Total number of pods that can be created
```
In this example, the resource quota:
Limits the total CPU request across all pods in the dev
namespace to 1 CPU, with a total limit of 2 CPUs.
Sets the total memory request limit to 1 GiB and the total
memory limit to 2 GiB.
Restricts the total number of pods to 10 in the dev namespace.
Auto Scaling
Horizontal Pod Autoscaler (HPA)
What is HPA?
HPA automatically adjusts the number of pod replicas in a deployment, replicaset,
statefulset, or any other pod controller, based on observed CPU utilization (or, with
custom metrics support, other application-provided metrics).
How it works?
Metrics Monitoring: HPA collects metrics from either the Kubernetes metrics
server (for CPU/memory usage) or custom metrics APIs (for other metrics).
Decision Making: It compares the current metric value with the target value you set.
Scaling Actions: If the current value exceeds the target, HPA increases the pod
replicas. If it's below the target, it reduces the replicas.

hpa.yaml:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
```
Vertical Pod Autoscaler (VPA)
What is VPA?
VPA allocates the optimal amount of CPU and memory to pods, adjusting their
resource requests on the fly. It is useful when you are unsure about the resource
needs of your application.
How it works?
Monitoring: VPA observes the historical and current resource usage of pods.
Recommendation: Based on this, it calculates and provides recommended CPU and
memory settings.
Application: VPA can automatically update the resource requests of running pods,
or only apply these recommendations when pods are restarted, depending on the
update policy.

vpa.yaml:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-deployment
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: "100m"
          memory: "100Mi"
        maxAllowed:
          cpu: "1"
          memory: "500Mi"
```
Cluster Autoscaler
Cluster Autoscaler adjusts the number of nodes in the cluster, adding nodes when pods
fail to schedule for lack of resources and removing underutilized nodes.
Key Components
Cloud Provider Integration: Cluster Autoscaler must integrate with your cloud provider to manage
the lifecycle of nodes. It supports most major cloud providers, such as AWS, Azure, GCP, and
others.
Resource Monitoring: It relies on metrics like CPU and memory usage provided by services like the
Kubernetes Metrics Server to make scaling decisions.
Pod Disruption Budgets (PDBs): It respects PDBs to ensure that the autoscaling process doesn't
violate the application's high availability requirements.

ca.yaml (excerpt):

```yaml
apiVersion: apps/v1
kind: Deployment
# ...
        - --nodes=1:10:my-node-group
        env:
        - name: AWS_REGION
          value: "us-west-2"
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: aws-secret
              key: access-key
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: aws-secret
              key: secret-key
        volumeMounts:
        - name: ssl-certs
          mountPath: /etc/ssl/certs/ca-certificates.crt
          readOnly: true
      volumes:
      - name: ssl-certs
        hostPath:
          path: /etc/ssl/certs/ca-certificates.crt
```
Storage
Volume
What are Volumes? (example: simple-volume.yaml)
A volume, in a simple sense, is a directory containing data, similar to a container
volume in Docker; but a Kubernetes volume applies to a whole pod and is mounted
on all containers in the pod. Using volumes, Kubernetes guarantees that data is
preserved across container restarts.
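A minimal sketch of a simple volume, assuming an emptyDir volume mounted into a pod's container; the names are illustrative:

```yaml
# simple-volume.yaml (illustrative)
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      volumeMounts:
        - name: cache          # must match a volume name declared below
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir: {}             # lives as long as the pod; survives container restarts
```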
Reclaim Policies
The reclaim policy of a PV dictates what happens to the PV after the PVC bound to it is deleted:
Retain: The PV retains its data after the PVC is deleted. It needs to be manually cleaned up or reused.
Delete: The PV and its associated storage are automatically deleted when the PVC is deleted.
Recycle: The PV is scrubbed (data wiped) and made available for a new claim. This policy is deprecated and not commonly used in modern
Kubernetes setups.
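The reclaim policy is set on the PersistentVolume itself; a sketch, assuming a hostPath-backed PV (the name, capacity, and path are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # keep data after the bound PVC is deleted
  hostPath:
    path: /data/pv-demo                   # illustrative backing directory
```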
CSI (Container Storage Interface)
CSI is a standard for exposing block and file storage systems to containerized
workloads, letting storage vendors write a plugin (a CSI driver) once and use it
across orchestrators.
CSI Drivers
Example manifest (excerpt):

```yaml
apiVersion: v1
kind: List
# ...
    allowVolumeExpansion: true
    volumeBindingMode: WaitForFirstConsumer
```
RBAC (Role-Based Access Control)
Features:
Roles: Defines permissions on resources (like Pods, Services).
Can be namespaced or cluster-wide (ClusterRoles).
RoleBindings: Associates roles with users or groups. Can be
namespaced or cluster-wide (ClusterRoleBindings).
Allows fine-grained access control to Kubernetes API
resources.
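A sketch of a namespaced Role and its RoleBinding; the names, namespace, and subject are illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
  - apiGroups: [""]              # "" = the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
  - kind: User
    name: jane                   # illustrative user
    apiGroup: rbac.authorization.k8s.io
roleRef:                         # grants pod-reader to that user in dev
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```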
Kubernetes Network Security
Kubernetes network security involves protecting the communication channels and data within a Kubernetes cluster from unauthorized access and threats.
It ensures that only legitimate traffic can flow between containers, services, and the outside world.
Here are some useful cloud-native tools for Kubernetes network security:
1. Cilium: Leverages eBPF technology to provide highly scalable network security policies, API-aware security, and visibility.
2. Tigera Secure: Extends Calico with enterprise-grade security features for compliance and threat detection.
3. Linkerd: Another service mesh that provides automatic mTLS (mutual TLS) to secure traffic between services.
4. Kube-router: Provides network routing, firewalling, and load balancing capabilities using standard Linux networking tools.
5. Aqua Security: Focuses on security throughout the application lifecycle, from development to production, specifically tailored for containerized
environments.
6. Falco: A cloud-native runtime security project to detect and alert on anomalous activities in your applications and containers.
7. OPA (Open Policy Agent): Provides a high-level declarative language to specify security policies and enforces those policies across the Kubernetes
stack.
Scheduling
Affinity and Anti-Affinity
Affinity:
Allows you to specify rules about how pods should be scheduled relative to other pods.
Features: Can be used to place pods on the same node (or spread them out) based on
labels and conditions.
Anti-Affinity:
Ensures that two pods don't end up on the same node.
Features: Useful for ensuring high availability by spreading replicas of a service
across nodes or racks.
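A sketch of pod anti-affinity that spreads replicas of a Deployment across nodes; the names and image are illustrative, while the topology key is the standard node-hostname label:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: web                        # avoid nodes already running this app
              topologyKey: kubernetes.io/hostname # i.e. at most one replica per node
      containers:
        - name: web
          image: nginx:1.25
```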
Taints and Tolerations
Taints:
Taints allow a node to "repel" a set of pods based on key-value pairs.
Useful for designating nodes for specific purposes, like dedicated nodes or GPU nodes.
Tolerations:
Tolerations are applied to pods and allow (but do not require) the pods to schedule onto nodes with
matching taints.
Works in conjunction with taints to ensure pods are scheduled appropriately. For example, only pods
with a specific toleration can be scheduled on a node with a specific taint.
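A sketch of the pairing: first taint a node, then give a pod a matching toleration; the node name, key, and value are illustrative:

```yaml
# First, taint the node (illustrative node name and key):
#   kubectl taint nodes node1 dedicated=gpu:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"   # matches the taint, so this pod may land on node1
  containers:
    - name: job
      image: nginx:1.25
```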
Annotations
In Kubernetes, annotations are a mechanism to attach arbitrary non-identifying metadata
to objects. Unlike labels, which are used to identify and select objects, annotations are not
used to select and find objects. Annotations can be used to store and retrieve additional
information about Kubernetes objects, which can be beneficial for various purposes.
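A sketch of annotations on an object; the annotation keys and values are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: annotated-pod
  annotations:
    team.example.com/owner: "platform-team"   # illustrative: who owns this workload
    example.com/build-commit: "abc1234"       # illustrative: free-form release info
  labels:
    app: demo          # labels identify and select; annotations only describe
spec:
  containers:
    - name: app
      image: nginx:1.25
```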