Cloud Native Basics
Cloud native is a mindset and approach for building systems designed for the cloud (public, private, or hybrid). The Cloud Native Computing Foundation (CNCF) was established in 2015 as part of the Linux Foundation (est. 2000). Kubernetes was the first donated project.
Cloud Native Governance
- TOC (Technical Oversight Committee): oversees project lifecycle (sandbox → incubating → graduated)
- TAG (Technical Advisory Group)
- SIG (Special Interest Group)
Cloud Native Personas
- DevOps: bridges development and operations with automation
- SRE: uptime, reliability, error budgets, scalability
- DevSecOps: security integrated throughout the lifecycle
- CloudOps: operates public/private/hybrid cloud infrastructure
- Cloud Architect: designs apps/infra for performance, scalability, reliability, and cost
- FinOps: cost management, optimization, budgeting and forecasting
- ML Engineer / Data Scientist / Data Engineer
Autoscaling
- HPA (Horizontal): scales replicas by metrics (CPU, custom)
- VPA (Vertical): adjusts requests/limits (bounded by node)
- Cluster Autoscaler: adds nodes when pods can’t be scheduled
- KEDA: event-driven autoscaling (supports scale to zero)
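A minimal HPA sketch (assumes metrics-server is installed; the target Deployment name "web" is illustrative):
# HPA sketch: scale the "web" Deployment between 2 and 10 replicas on average CPU utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
Equivalent imperative form: kubectl autoscale deploy/web --min=2 --max=10 --cpu-percent=70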
Cloud Native Architecture
Characteristics: resiliency, agility, operability, observability; high automation; self‑healing; scalable; cost‑efficient; maintainable; secure.
Cloud native environments follow a zero-trust model, i.e. every component's identity and integrity must be verified.
Least privilege: grant the minimum set of permissions to limit the damage in case of a compromise or security breach.
Well‑Architected pillars: operational excellence; security/privacy/compliance; reliability; cost optimization; performance efficiency.
Serverless Architecture
- CloudEvents spec for interoperable event payloads
- Scale to zero; provisioned concurrency; cold-start considerations
- Beware vendor lock-in; prefer open standards and declarative config
- Event sources → Broker/Trigger → Functions/Services (Knative Serving/Eventing)
- Ingress, autoscaling, and revisioning managed by the platform
- Metrics and traces via OpenTelemetry; use CloudEvents for portability
- Knative: a framework that enables serverless functionality on top of Kubernetes; provides a platform for building and managing serverless apps (see the sketch below)
- OpenFaaS: allows functions to be deployed as services
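A minimal Knative Service sketch (image and TARGET value are illustrative); Knative manages revisions, routing, autoscaling, and scale to zero for it:
# Knative Serving Service sketch
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"   # allow scale to zero
    spec:
      containers:
      - image: gcr.io/knative-samples/helloworld-go   # illustrative sample image
        env:
        - name: TARGET
          value: "World"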
CNCF Projects
- etcd (distributed key-value store)
- CoreDNS (cluster DNS server)
- CRI-O (lightweight OCI-compliant container runtime implementing the Kubernetes CRI)
- Prometheus (Monitoring)
- Thanos - Prometheus at scale
- Helm (Managing/installing k8s applications)
- Istio (feature-rich but complex service mesh)
- Linkerd (simple & powerful tool to add security/observability/reliability to k8s)
- Falco (runtime security detection in Kubernetes clusters)
- Kubescape (misconfiguration & vulnerability scanning)
- Argo CD (GitOps for App delivery)
- OpenTelemetry / OpenTracing
- CloudEvents
- Rook (cloud-native storage orchestrator for k8s, providing the platform, framework & support for a diverse set of storage solutions)
- OpenID Connect (OIDC) (large-scale k8s user authentication)
- Open Policy Agent (OPA) (admission controller / policy engine)
- Kubecost (managing k8s cost)
- Cilium (eBPF-based networking & security for k8s)
Docker History & Predecessors
- IBM mainframes (CP/CMS), chroot, FreeBSD jails, Solaris Zones, HP‑UX vPars
- Linux namespaces (user, net, pid, mount, uts, ipc, time) and cgroups
- Docker: started as dotCloud (2010), renamed in 2013; containers share the host kernel
- runC was donated by Docker and is part of the Open Container Initiative (OCI)
- Union File System (AUFS): combines multiple read-only layers into a Docker image, with a thin writable layer on top
Docker Commands
docker version
docker run nginx
docker run -it --rm -P alpine:3.20 sh # -P publishes all exposed ports to random host ports
docker inspect {image_name}
docker manifest inspect {image:tag}
docker save {image} -o /tmp/image.tar
docker buildx create --name buildx-multi-arch
docker buildx use buildx-multi-arch
docker buildx build --no-cache --platform linux/amd64,linux/arm64 -t user/app:tag . # add --push to publish the multi-arch image
docker info | grep -i "Storage Driver"
docker system prune #Remove all unused containers, networks, images
Kubernetes: Architecture

Control Plane
- kube-apiserver (front door; often behind a load balancer)
- kube-scheduler
- kube-controller-manager (controller loops)
- etcd (strongly consistent distributed key-value store; Raft consensus)
- CoreDNS
- A production cluster can range from a handful of nodes to several thousand (upstream Kubernetes supports up to 5,000 nodes per cluster; some managed offerings support more)
- Cluster nodes are often placed in an auto-scaling group so they can be scaled up/down based on resource utilization
- Control plane nodes must be Linux
Worker Node
- kubelet: manages the Pod/container lifecycle (runs on worker and control plane nodes)
- Container runtime: containerd or CRI-O (high-level), backed by runc/crun or sandboxed runtimes like Kata/gVisor (low-level)
- kube-proxy runs as a DaemonSet, one Pod per node
- Worker nodes can be Linux or Windows
- CNI (Container Network Interface): provides the Pod (overlay) network; the CNI plugin is responsible for enforcing NetworkPolicy
Cluster Setup
- kubeadm: production‑grade bootstrap
- kinD: multi‑node clusters in Docker (local/dev)
- minikube / MicroK8s: local development
Kubernetes Objects
- Deployment → ReplicaSet → Pods for stateless apps (namespace-scoped)
- Pod phases: Pending → Running → Succeeded/Failed (Unknown when the Pod state cannot be determined)
- Init containers for ordered startup
- Container spec: name, image, command, ports, resources, liveness/readiness probes, securityContext (see the Deployment sketch after this list)
- Namespaces
- ResourceQuotas, LimitRanges; access control via RBAC
- Default: default, kube-public, kube-system, kube-node-lease
- StatefulSet: for stateful apps that need persistent storage (e.g., databases, Kafka); namespace-scoped
- Stable network identity: ordered Pod names (e.g., web-0, web-1).
- Stable persistent storage: per‑Pod PVCs via volumeClaimTemplates.
- Ordered, graceful deployment/scaling: create/terminate Pods sequentially.
- Ordered rolling updates: supports partitioned rolling updates for controlled upgrades.
- DaemonSet: runs one Pod per node (a namespaced resource that spans all nodes in the cluster)
- CronJob/Job: Pods terminate after the job finishes (can be scheduled; control run timeouts, backoff limits, etc.)
- Manage and track tasks with retries on failure
- Use CronJobs for backups and scheduled maintenance
- ConfigMap: read-only key/value settings for containers running in a Pod (namespace-scoped)
- Secret: like a ConfigMap but for sensitive data (base64-encoded; restrict with RBAC and enable encryption at rest)
- Security:
- Harden base images; secure CI/CD; AppArmor/SELinux
- Admission control (PSP removed in 1.25 → use Kyverno or OPA Gatekeeper)
- Falco (runtime detection); OIDC; Kubescape (misconfiguration & vulnerability scanning)
- Scheduling:
- nodeSelector matches node labels
- Affinity/anti‑affinity (preferred or required)
- Topology spread constraints distribute Pods across zones/regions
- Taints & Tolerations (NoSchedule, PreferNoSchedule, NoExecute)
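The Deployment sketch referenced above (names and image are illustrative), combining replicas, probes, and resource requests/limits:
# Deployment sketch: stateless app with labels, probes, and resources
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27          # illustrative image
        ports:
        - containerPort: 80
        resources:
          requests: {cpu: 100m, memory: 128Mi}
          limits: {cpu: 500m, memory: 256Mi}
        readinessProbe:
          httpGet: {path: /, port: 80}
        livenessProbe:
          httpGet: {path: /, port: 80}
          initialDelaySeconds: 10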
Networking paths
- Container ↔ Container (in a Pod) via localhost
- Pod ↔ Pod: direct via the cluster (CNI) network; every Pod gets its own IP, no NAT
- Pod ↔ Service: kube-proxy and node packet filtering
- External ↔ Service: kube-proxy and node packet filtering
kubectl Basic Commands Cheat Sheet
Contexts & Cluster Info
kubectl version --short
kubectl api-resources | more
kubectl get all -A
kubectl cluster-info
Config detail
kubectl config view --minify
kubectl config get-contexts
kubectl config use-context {context}
kubectl config set-context --current --namespace={ns}
Deployments and Rollout
kubectl top {pods|nodes}
kubectl run --image ubuntu ubuntu -- sleep infinity
kubectl set image deploy/{name} {container}={image:tag}
kubectl get deploy,rs,po -o wide --selector=role=app
kubectl get pods --show-labels
kubectl get ds --all-namespaces #Daemonsets
kubectl describe pod {pod} -n {ns}
kubectl exec -it {pod} -c {container} -- {command}
kubectl delete pod/nginx --now
kubectl explain pod.spec.restartPolicy
kubectl apply -f {file.yaml|dir|url}
kubectl delete -f {file.yaml|dir|url}
kubectl replace -f {file.yaml|dir|url}
kubectl diff -f {file.yaml|dir|url} # diff between the live objects and the ones to be applied
kubectl annotate deployment/nginx kubernetes.io/change-cause="initial version"
kubectl expose deploy/{name} --type=NodePort
kubectl expose deploy/{name} --type=LoadBalancer --port 80 --target-port 8080
kubectl get svc,endpoints -o wide
kubectl rollout {status|history|undo} deploy/{name}
kubectl rollout undo deploy/{name} --to-revision=2
kubectl events --for {pod|rs|deploy|svc}/{name} --types=Warning --watch # sorted by time by default
kubectl port-forward svc/{service_name} 8080:80
kubectl port-forward pod/nginx 8080:80
kubectl cp {ns}/{pod}:/path/in/pod ./local/dir
kubectl scale deploy/{name} --replicas=3
kubectl get role,rolebinding -n demo
CronJobs & Jobs
kubectl create job pi --image=perl -- perl -Mbignum=bpi -wle 'print bpi(1000)'
kubectl create cronjob backup --image=alpine --schedule="0 2 * * *" -- sh -c 'echo backup'
kubectl create cronjob my-cron-job --image={image} --schedule="*/5 * * * *" -- {cmd}
kubectl get jobs,cronjobs -A
kubectl delete job/{name} cronjob/{name}
Scheduling & Nodes
kubectl cordon {node}
kubectl drain {node} --ignore-daemonsets --delete-emptydir-data
kubectl uncordon {node}
kubectl taint nodes {node} key=value:{NoSchedule|NoExecute|PreferNoSchedule}
kubectl label node {node} pool=blue --overwrite
kubectl top pods -A
ConfigMaps & Secrets
kubectl create configmap my-config --from-literal=ENV=prod --from-literal=key2=value2
kubectl create configmap my-config --from-env-file=.env
kubectl create secret generic db --from-literal=USER=admin --from-literal=PASS=secret
kubectl get cm,secret
kubectl describe secret db
Debug & Troubleshoot
kubectl describe pod {pod} -n {ns}
kubectl get events -A --sort-by=.lastTimestamp | tail
Ephemeral containers (if available)
kubectl debug -it pod/{pod} --image=busybox --target={container}
kubectl debug node/{node} -it --image=busybox
kubectl logs nginx
kubectl logs -p -c ruby web-1
kubectl logs -f -c ruby web-1
kubectl logs --tail=20 nginx
kubectl logs --since=1h nginx
~/.kube/config # default kubeconfig location
KUBECONFIG={file} kubectl config view --raw
kubectl get clusterrolebindings
kubectl describe clusterrole/cluster-admin
kubectl create role red-pods --verb=get,list,watch --resource=pods -n demo
kubectl create clusterrolebinding cluster-superhero --clusterrole=cluster-superhero --group=cluster-superheroes
kubectl auth can-i '*' '*' --as-group=cluster-superheroes --as="superman"
Labels & Selectors
- Metadata for resources; used by Services, NetworkPolicies, rollouts, monitoring, and cost tools
kubectl get all --selector=<label>=<value>
Kubernetes Networking
Service: a logical set of Pods with a policy to access them.
- ClusterIP: internal stable IP (default)
- NodePort: exposes on each node’s port (typically 30000–32767)
- LoadBalancer: provisions an external LB (cloud provider)
- Headless: no cluster IP; direct endpoints via DNS
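A minimal Service sketch (labels and ports are illustrative); change spec.type to NodePort or LoadBalancer to expose it outside the cluster:
# ClusterIP Service selecting Pods labeled app=web
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
  - port: 80           # Service port
    targetPort: 8080   # container port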
API Server
Flow: Authentication → Authorization → Admission Control → Validation → Request handling → Response
- Sources: internal components, kubectl, third‑party services (default port 6443)
- Authentication: tokens, client certificates, OIDC
- Authorization modes: Node, ABAC, RBAC, Webhook
- Admission controllers: mutate/validate requests; enforce policies and quotas; namespace management; CRDs extend the API
kubectl get nodes -v=6
kubectl proxy & # local proxy for unauthenticated curl
curl http://127.0.0.1:<port>/openapi/v3
RBAC & Access Control
- Users/groups managed externally; ServiceAccounts are namespaced
- Kubeconfig stores cluster/server and client credentials
- ClusterRole/ClusterRoleBinding (cluster‑wide); Role/RoleBinding (namespaced)
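A namespaced Role/RoleBinding sketch (names and the user are illustrative) granting read-only access to Pods in the demo namespace:
# Role: read-only Pod access in "demo"
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: demo
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# RoleBinding: bind the Role to a (hypothetical) external user
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: demo
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io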
Scheduler & Placement
- Phases: filter nodes → score nodes → bind Pod
- schedulerName for custom schedulers
- nodeName bypasses the scheduler entirely (the kubelet still rejects the Pod if node resources don't fit)
- nodeSelector, affinity/anti‑affinity
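A placement sketch inside a Pod spec (label keys/values are illustrative), combining nodeSelector with a preferred node affinity:
# Pod spec fragment: nodeSelector + preferred nodeAffinity
spec:
  nodeSelector:
    disktype: ssd                          # hard requirement via node label
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values: ["us-east-1a"]         # soft preference for this zone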
Kubernetes Storage
Ephemeral Storage
- emptyDir: a per‑Pod scratch space created when a Pod starts; deleted when the Pod is removed. Stored on the node’s filesystem by default.
- Memory‑backed emptyDir: set emptyDir.medium: Memory to use tmpfs (RAM); useful for fast scratch or to avoid disk I/O.
- Container restarts: emptyDir content persists across container restarts within the same Pod.
- Ephemeral storage limits: control with resource ephemeral-storage requests/limits.
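A memory-backed emptyDir sketch (names are illustrative); the tmpfs usage counts against the container's memory limit:
# Pod with a tmpfs scratch volume
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
  - name: app
    image: alpine:3.20
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    emptyDir:
      medium: Memory     # tmpfs (RAM-backed)
      sizeLimit: 64Mi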
Persistent Storage (PV, PVC, StorageClass)
- Flow: Pod → references a PVC → PVC matches/binds a PV (statically created) or triggers dynamic provisioning via a StorageClass.
- AccessModes: ReadWriteOnce (or ReadWriteOncePod), ReadOnlyMany, ReadWriteMany (availability depends on the driver).
- Reclaim policy: set on the PV (and copied from the StorageClass during dynamic provisioning). Common values:
- Delete: deletes the underlying storage when the PVC is deleted (typical default for dynamically provisioned volumes).
- Retain: retains the storage resource for manual recovery or reuse (common for statically created PVs).
- VolumeBindingMode (StorageClass): Immediate or WaitForFirstConsumer (delays provisioning until the Pod is scheduled to choose topology‑correct storage).
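A dynamic-provisioning sketch (the StorageClass name and CSI provisioner are illustrative and environment-specific):
# StorageClass with delayed binding
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: ebs.csi.aws.com         # example CSI driver; depends on your platform
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
# PVC that triggers provisioning once a Pod using it is scheduled
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast
  resources:
    requests:
      storage: 10Gi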
Provisioners & Solutions
- CSI drivers: cloud and on‑prem (e.g., AWS EBS, GCE PD, Azure Disk, Ceph RBD/CephFS, NFS, etc.).
- Rook: CNCF graduated operator to deploy and manage storage backends (notably Ceph) on Kubernetes.
- Ceph: unified object (RGW/S3), block (RBD), and file (CephFS) storage; widely used with Rook. Red Hat OpenShift’s data platform (OpenShift Data Foundation, formerly OCS) is based on Ceph.
- Static vs dynamic: you can create PVs manually (static) or let the StorageClass dynamically provision PVs when a PVC is created.
Network Policies
- Isolate Pods by IP/namespace selectors; ingress and egress; policies are cumulative
- Requires a CNI plugin that enforces policies (e.g., Calico, Cilium, Weave)
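A NetworkPolicy sketch (labels and port are illustrative) allowing ingress to db Pods only from app Pods:
# Allow TCP 5432 into role=db Pods from role=app Pods; all other ingress is denied
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-app
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes: ["Ingress"]
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: app
    ports:
    - protocol: TCP
      port: 5432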
Security
# SecurityContext example
spec:
  securityContext:              # Pod-level: applies to all containers
    runAsUser: 1000
    runAsGroup: 1000
  containers:
  - name: app
    securityContext:            # container-level field
      allowPrivilegeEscalation: false
Pod Disruption Budgets (PDB) & Maintenance
- Cordon node → drain → workloads move → uncordon
kubectl cordon {node}
kubectl drain {node} --ignore-daemonsets --delete-emptydir-data
kubectl uncordon {node}
kubectl create pdb nginx --selector=app=nginx --min-available=2
Helm Charts
- Package manager for Kubernetes apps
- Chart.yaml (name/type/version/description), values.yaml (image, registry, configuration)
helm create myapp
helm install myapp .
helm package .
helm install myapp ./myapp-0.1.0.tgz
helm uninstall myapp
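A minimal sketch of the two chart files (field values are illustrative; values.yaml keys are chart-specific):
# Chart.yaml
apiVersion: v2
name: myapp
description: Example application chart
type: application
version: 0.1.0        # chart version
appVersion: "1.0.0"   # application version
# values.yaml
image:
  repository: nginx
  tag: "1.27"
replicaCount: 2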
Service Mesh
- Enhances security, observability, and reliability; data plane + control plane
- Sidecar pattern; mTLS; access control; circuit breaking; monitoring
- Istio, Linkerd; SMI (Service Mesh Interface)
Observability: Metrics, Logs, Traces
- Metrics: kube-state-metrics; node-exporter; Prometheus Adapter
- Logging: node-level collection; sidecars; application-level; tools: fluentd, Filebeat; backends: OpenSearch, Grafana Loki
- Visualization/alerts: Grafana; Alertmanager
- node-exporter: exposes the underlying node's hardware and OS metrics to Prometheus
- Prometheus Adapter: bridges Prometheus metrics into the Kubernetes custom/external metrics APIs (e.g., for HPA), helping monitor the cluster effectively
- kube-state-metrics: collects the state of cluster objects by listening to the kube-apiserver
- Kubecost: helps monitor and manage Kubernetes cost
Application Delivery
- CI/CD concepts: build pipelines, automated tests, artifact repositories, progressive delivery (blue/green, canary)
- GitOps principles: declarative manifests; desired state in Git; pull‑based reconciler (e.g., Argo CD)
- Best practices: immutable images; image signing; policy checks (OPA/Kyverno); environment overlays (Kustomize/Helm)
- Ansible: a toolset for container and application lifecycle management, infrastructure deployment, and configuration automation
- Flux: a GitOps toolkit that reconciles application state with the desired state defined in a Git repository
# Example: render the chart and validate manifests in a CI step
helm template myapp ./chart | kubeval
# Example: Argo CD app
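A minimal Argo CD Application sketch (repo URL, path, and namespaces are illustrative):
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/myapp-config.git   # hypothetical config repo
    targetRevision: main
    path: overlays/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  syncPolicy:
    automated:
      prune: true
      selfHeal: true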