David's Guide to Kubernetes
Master Kubernetes from the ground up with comprehensive explanations, architecture diagrams, and practical examples for both administrators and developers.
Prerequisites
Basic understanding of containers and Docker, familiarity with command-line interfaces, basic networking concepts, understanding of YAML syntax
Welcome to the most comprehensive guide to Kubernetes you'll find! This guide will take you from complete beginner to confident Kubernetes user, whether you're a developer looking to deploy applications or an administrator managing clusters.
What You'll Learn
By the end of this guide, you'll understand:
- Core Concepts: Pods, Services, Deployments, and the Kubernetes API
- Architecture: How Kubernetes components work together
- Administration: Cluster management, security, and monitoring
- Development: Application deployment, debugging, and best practices
- Resource Types: All the different Kubernetes objects and when to use them
- Real Examples: Practical YAML configurations and kubectl commands
Prerequisites
Before diving in, you should have:
- Basic understanding of containers and Docker
- Familiarity with command-line interfaces
- Basic networking concepts
- Understanding of YAML syntax
Let's get started on your Kubernetes journey!
What is Kubernetes?
Kubernetes (often shortened to "K8s") is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. Originally developed by Google and now maintained by the Cloud Native Computing Foundation (CNCF), Kubernetes has become the de facto standard for container orchestration.
Why Kubernetes?
Problems Kubernetes Solves:
- Manual Deployment: No more SSH-ing into servers to deploy applications
- Scaling Challenges: Automatic scaling based on demand
- Service Discovery: Containers can find and communicate with each other easily
- Load Distribution: Built-in load balancing across application instances
- Health Management: Automatic restart of failed containers
- Resource Management: Efficient use of cluster resources
Key Benefits:
- 🚀 Speed: Deploy and update applications rapidly
- 🔄 Reliability: Self-healing systems that recover from failures
- 📈 Scalability: Scale applications up or down automatically
- 🌐 Portability: Run anywhere, whether cloud, on-premises, or hybrid
- 🔧 Flexibility: Extensible platform with rich ecosystem
The Container Orchestration Challenge
Imagine you have a microservices application with 20 different services, each running in containers. Without orchestration, you'd need to:
- Manually start containers on different servers
- Keep track of which containers are running where
- Restart failed containers manually
- Load balance traffic between container instances
- Handle networking between containers
- Manage storage and configuration
Kubernetes automates all of this and much more!
Kubernetes Architecture Deep Dive
Understanding Kubernetes architecture is crucial for both administrators and developers. Let's break down how everything fits together.
The Big Picture
A Kubernetes cluster consists of two main types of nodes:
- Control Plane Nodes (formerly called Master Nodes)
- Worker Nodes
The control plane makes global decisions about the cluster, while worker nodes run your actual application containers.
Control Plane Components
The control plane is the brain of your Kubernetes cluster. It consists of several key components:
API Server (kube-apiserver)
- What it does: The frontend for the Kubernetes control plane
- Why it's important: All communication goes through the API server
- For admins: This is what kubectl talks to
- For developers: Your deployment manifests are processed here
etcd
- What it does: Distributed key-value store that holds all cluster data
- Why it's important: This is Kubernetes' "memory" - everything is stored here
- For admins: Critical to backup and secure
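- For developers: The objects you apply (desired state) and their status are stored here; your application's own data lives in volumes or external databases, not in etcd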
Scheduler (kube-scheduler)
- What it does: Decides which worker node should run each Pod
- Why it's important: Optimizes resource usage and respects constraints
- For admins: Can be tuned for different scheduling policies
- For developers: Considers your resource requests and node selectors
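To make the scheduler's job concrete, here is a minimal Python sketch of the filter-then-score idea: drop nodes that cannot fit the Pod's requests or lack its node selector labels, then pick one of the remaining nodes. The `Node` and `PodSpec` classes and the "most free CPU" scoring rule are illustrative assumptions, not the real kube-scheduler plugins.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu_m: int   # unallocated CPU in millicores
    free_mem_mi: int  # unallocated memory in MiB
    labels: dict = field(default_factory=dict)


@dataclass
class PodSpec:
    cpu_request_m: int
    mem_request_mi: int
    node_selector: dict = field(default_factory=dict)


def feasible(node: Node, pod: PodSpec) -> bool:
    """Filter step: the node must fit the requests and carry every selector label."""
    fits = (node.free_cpu_m >= pod.cpu_request_m
            and node.free_mem_mi >= pod.mem_request_mi)
    selected = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    return fits and selected


def pick_node(nodes: list, pod: PodSpec) -> Optional[str]:
    """Score step (hypothetical rule): prefer the node with the most free CPU."""
    candidates = [n for n in nodes if feasible(n, pod)]
    if not candidates:
        return None  # in a real cluster the Pod would stay Pending
    return max(candidates, key=lambda n: n.free_cpu_m).name


nodes = [
    Node("node-a", free_cpu_m=500, free_mem_mi=1024, labels={"disk": "ssd"}),
    Node("node-b", free_cpu_m=2000, free_mem_mi=4096, labels={"disk": "hdd"}),
    Node("node-c", free_cpu_m=1500, free_mem_mi=2048, labels={"disk": "ssd"}),
]
pod = PodSpec(cpu_request_m=250, mem_request_mi=512, node_selector={"disk": "ssd"})
print(pick_node(nodes, pod))
```

The `None` result corresponds to what you see as a Pod stuck in Pending: no node passed the filter step.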
Controller Manager (kube-controller-manager)
- What it does: Runs various controllers that handle routine tasks
- Controllers include: Node controller, Deployment controller, ReplicaSet controller, Job controller
- Why it's important: Ensures actual state converges toward desired state
- Example: If a Pod managed by a Deployment dies, the ReplicaSet controller (driven by the Deployment) creates a replacement
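The "desired state versus actual state" idea can be sketched as a tiny reconcile function in Python. This is a simplified illustration under stated assumptions: the `reconcile` name and the action tuples are hypothetical, and real controllers watch the API server and handle many more cases.

```python
def reconcile(desired_replicas: int, running_pods: list) -> list:
    """Return the actions needed to move actual state toward desired state."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few Pods: ask for the missing ones to be created.
        return [("create", None)] * diff
    if diff < 0:
        # Too many Pods: remove the surplus (which ones is a simplification here).
        return [("delete", name) for name in running_pods[:-diff]]
    return []  # actual state already matches desired state


print(reconcile(3, ["web-1", "web-2"]))
print(reconcile(3, ["web-1", "web-2", "web-3", "web-4"]))
print(reconcile(3, ["web-1", "web-2", "web-3"]))
```

A controller runs this kind of comparison repeatedly, which is why a deleted Pod reappears without anyone re-running a deploy command.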
Cloud Controller Manager (cloud-controller-manager)
- What it does: Integrates with cloud provider APIs
- Why it's important: Enables cloud-specific features
- Examples: Creating load balancers, managing storage volumes
Worker Node Components
Worker nodes run your application containers and provide the Kubernetes runtime environment.
kubelet
- What it does: The primary node agent that communicates with the control plane
- Responsibilities:
  - Ensures containers are running in Pods
  - Reports node and Pod status
  - Executes health checks
- For admins: Monitor kubelet logs for node issues
- For developers: This is what actually starts your containers
kube-proxy
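- What it does: Network proxy that implements part of the Kubernetes Service concept on each node
- How it works: Maintains network rules (for example iptables or IPVS) and forwards Service traffic to the backing Pods
- For admins: Troubleshoot Service connectivity here; NetworkPolicy enforcement is handled by the CNI plugin, not kube-proxy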
- For developers: Enables service discovery and load balancing
Container Runtime
- What it does: Software responsible for running containers
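- Common runtimes: containerd, CRI-O, and Docker Engine (through the cri-dockerd adapter, since Kubernetes 1.24 removed the built-in dockershim)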
- Interface: Uses Container Runtime Interface (CRI)
- For everyone: This is what actually runs your container images
Networking in Kubernetes
Kubernetes networking can seem complex, but understanding the basics is essential for both admins and developers.
Networking Fundamentals
Every Pod gets its own IP address
- Pods can communicate directly with each other
- No NAT (Network Address Translation) needed
- IP addresses are ephemeral: a Pod that is deleted and recreated gets a new IP
Services provide stable endpoints
- Services have stable IP addresses and DNS names
- They route traffic to healthy Pod replicas
- Different service types for different use cases
Network Model Requirements
Kubernetes imposes the following fundamental requirements on networking:
- Pod-to-Pod Communication: All Pods can communicate with all other Pods without NAT
- Node-to-Pod Communication: Agents on nodes can communicate with all Pods on that node
- Cross-Node Communication: Pods on different nodes can communicate without NAT
Common Networking Scenarios
Internal Service Communication
# Frontend Pod communicating with Backend Service
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: frontend
    image: frontend:latest
    env:
    - name: BACKEND_URL
      value: "http://backend-service:8080"
External Traffic Ingress
# LoadBalancer Service for external access
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
Storage in Kubernetes
Kubernetes provides several abstractions for managing storage:
Volume Types
- EmptyDir: Temporary storage that exists for the life of the Pod
- HostPath: Mounts a file or directory from the host node (use carefully!)
- PersistentVolume: Durable storage that persists beyond Pod lifecycle
- ConfigMap/Secret: For configuration data and sensitive information
Persistent Volumes (PV) and Claims (PVC)
- PersistentVolume: A cluster-level resource representing available storage
- PersistentVolumeClaim: A request for storage by a Pod
This separation allows administrators to provision storage while developers claim what they need.
# PersistentVolumeClaim example
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: web-storage
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd
Security Best Practices
Security should be built into your Kubernetes setup from day one. Here are the essential practices:
Authentication and Authorization
Authentication (Who are you?)
- X.509 certificates
- Bearer tokens
- Authentication webhooks
- OpenID Connect (OIDC)
Authorization (What can you do?)
- Role-Based Access Control (RBAC)
- Attribute-Based Access Control (ABAC)
- Node authorization
- Webhook authorization
RBAC Example
# Create a role that can read pods
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
# Bind the role to a user
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
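To see how a Role's rules translate into an allow/deny decision, here is a small Python sketch of rule matching. It is an illustration under stated assumptions: the function names are hypothetical, and it ignores `resourceNames`, subresources, and non-resource URLs that the real authorizer also considers. RBAC rules are purely additive, so a request is allowed if any rule matches and denied otherwise.

```python
def rule_allows(rule: dict, api_group: str, resource: str, verb: str) -> bool:
    """A rule matches when group, resource, and verb are all listed (or wildcarded)."""
    def matches(allowed: list, value: str) -> bool:
        return "*" in allowed or value in allowed

    return (matches(rule.get("apiGroups", []), api_group)
            and matches(rule.get("resources", []), resource)
            and matches(rule.get("verbs", []), verb))


def role_allows(rules: list, api_group: str, resource: str, verb: str) -> bool:
    """RBAC has no deny rules: any single matching rule grants access."""
    return any(rule_allows(r, api_group, resource, verb) for r in rules)


# Same rules as the pod-reader Role above ("" is the core API group).
pod_reader = [{"apiGroups": [""], "resources": ["pods"],
               "verbs": ["get", "watch", "list"]}]

print(role_allows(pod_reader, "", "pods", "list"))
print(role_allows(pod_reader, "", "pods", "delete"))
print(role_allows(pod_reader, "", "secrets", "get"))
```

On a real cluster, `kubectl auth can-i list pods --as jane` asks the API server the same question.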
Pod Security
- Security Context: Configure security settings for Pods and containers
- Pod Security Admission: Built-in admission controller that enforces the Pod Security Standards per namespace (Pod Security Policies were removed in Kubernetes 1.25)
- Network Policies: Control communication between Pods
# Security Context example
apiVersion: v1
kind: Pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
  containers:
  - name: secure-container
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: true
Secrets Management
Never hardcode sensitive data! Use Kubernetes Secrets:
# Create a secret
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  username: YWRtaW4=  # base64 encoded
  password: MWYyZDFlMmU2N2Rm  # base64 encoded
---
# Use the secret in a Pod
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: app
    image: myapp:latest
    env:
    - name: DB_USERNAME
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: username
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: password
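The values under `data:` in a Secret are base64-encoded, which is an encoding and not encryption. The short Python sketch below shows how the two values in the manifest above round-trip; the helper names are hypothetical.

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Encode a string the way the Secret's data field expects it."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Anyone who can read the Secret can reverse the encoding just as easily."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_secret_value("admin"))             # YWRtaW4=
print(decode_secret_value("MWYyZDFlMmU2N2Rm"))  # 1f2d1e2e67df
```

Because decoding is trivial, access to Secrets should be restricted with RBAC, and the manifests themselves should not be committed to version control in plain form.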
Troubleshooting Guide
When things go wrong in Kubernetes, systematic troubleshooting is key.
Essential Debugging Commands
Check cluster health
kubectl get nodes
kubectl get --raw='/readyz?verbose'  # replaces 'kubectl get componentstatuses', deprecated since v1.19
kubectl cluster-info
Investigate Pod issues
kubectl get pods -o wide
kubectl describe pod <pod-name>
kubectl logs <pod-name> --previous
kubectl exec -it <pod-name> -- /bin/bash
Check events
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl get events --field-selector involvedObject.name=<resource-name>
Network debugging
kubectl get services
kubectl get endpoints
kubectl describe service <service-name>
Common Issues and Solutions
Pod Stuck in Pending State
- Check resource requests vs. available resources
- Verify node selector constraints
- Look for taints/tolerations issues
- Check PVC binding status
Pod Crash/Restart Loop
- Examine application logs: kubectl logs <pod> --previous
- Check resource limits
- Verify health check configurations
- Look at security contexts
Service Not Accessible
- Verify Service selector matches Pod labels
- Check endpoints: kubectl get endpoints <service-name>
- Test DNS resolution from within cluster
- Examine network policies
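The first check in that list, whether the Service selector matches the Pod labels, follows a simple rule: every key/value pair in the selector must appear in the Pod's labels. Here is a minimal Python sketch of that rule; the function names and sample Pods are hypothetical, and it ignores readiness, which also affects which Pods receive traffic.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """Every selector pair must be present in the Pod's labels; extra labels are fine."""
    return all(pod_labels.get(k) == v for k, v in selector.items())


def endpoints_for(selector: dict, pods: dict) -> list:
    """Names of Pods a Service with this selector would route to."""
    if not selector:
        return []  # a Service without a selector gets no endpoints automatically
    return [name for name, labels in pods.items() if selector_matches(selector, labels)]


pods = {
    "web-1": {"app": "web", "tier": "frontend"},
    "web-2": {"app": "web", "tier": "frontend"},
    "api-1": {"app": "api"},
}
print(endpoints_for({"app": "web"}, pods))
print(endpoints_for({"app": "Web"}, pods))  # labels are case-sensitive
```

An empty result is the offline equivalent of `kubectl get endpoints <service-name>` showing no addresses.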
ImagePullBackOff Error
- Check image name and tag
- Verify image registry access
- Look at image pull secrets
- Check node's ability to pull images
Debugging Workflow
- Identify the scope: Is it cluster-wide, node-specific, or application-specific?
- Check the logs: Start with kubectl describe and kubectl logs
- Verify configuration: Ensure YAML manifests are correct
- Test connectivity: Use kubectl exec to test from inside the cluster
- Check resource usage: Look at CPU, memory, and storage
- Review recent changes: What was deployed or changed recently?
This is just the beginning of our comprehensive Kubernetes journey. The guide continues with detailed sections for administrators and developers, complete resource type references, and real-world examples you can use in your own projects.
Ready to dive deeper? Let's explore the sections designed specifically for your role!
Sections
Kubernetes for Administrators
As a Kubernetes administrator, you're responsible for the health, security, and performance of the entire cluster. This section covers the essential skills and knowledge you need.
Cluster Installation and Setup
Installation Methods
Managed Services (Recommended for most)
- Google Kubernetes Engine (GKE)
- Amazon Elastic Kubernetes Service (EKS)
- Azure Kubernetes Service (AKS)
- DigitalOcean Kubernetes (DOKS)
Self-Managed Options
- kubeadm (official tool)
- kops (AWS focused)
- Kubespray (Ansible-based)
- Rancher
Cluster Sizing and Planning
Control Plane Sizing
- Small clusters (1-100 nodes): 1-3 control plane nodes
- Medium clusters (100-1000 nodes): 3-5 control plane nodes
- Large clusters (1000+ nodes): 5+ control plane nodes
Worker Node Planning
- Consider CPU, memory, and storage requirements
- Plan for node failures and maintenance
- Leave 10-20% capacity buffer for scaling
Day-to-Day Operations
Essential kubectl Commands for Admins
# Cluster health
kubectl get nodes -o wide
kubectl top nodes
kubectl describe node <node-name>
# Resource usage
kubectl top pods --all-namespaces
kubectl describe limits --all-namespaces
# Cluster info
kubectl cluster-info
kubectl get --raw='/readyz?verbose'  # replaces 'kubectl get componentstatuses', deprecated since v1.19
kubectl version
# Events and debugging
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl logs -n kube-system <pod-name>
Node Management
Adding Nodes
# On control plane, get join command
kubeadm token create --print-join-command
# On new node, run the join command
sudo kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash <hash>
Node Maintenance
# Mark node as unschedulable (kubectl drain also does this automatically)
kubectl cordon <node-name>
# Evict Pods from the node before maintenance
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# After maintenance, make schedulable again
kubectl uncordon <node-name>
Backup and Recovery
etcd Backup
# Create snapshot
ETCDCTL_API=3 etcdctl snapshot save backup.db --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key
# Restore from snapshot (on all etcd nodes)
ETCDCTL_API=3 etcdctl snapshot restore backup.db --name=<node-name> --initial-cluster=<cluster-definition> --initial-advertise-peer-urls=<peer-url>
Application Data Backup
- Use Velero for cluster-level backups
- Backup persistent volumes separately
- Test restore procedures regularly
Security Administration
RBAC Configuration
# Namespace admin role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: namespace-admin
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: production-admins
  namespace: production
subjects:
- kind: User
  name: alice
- kind: Group
  name: production-team
roleRef:
  kind: Role
  name: namespace-admin
  apiGroup: rbac.authorization.k8s.io
Network Security
Network Policies
# Deny all ingress traffic by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}
  policyTypes:
  - Ingress
---
# Allow specific communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
Pod Security
Pod Security Standards (replaces Pod Security Policies)
# Enforce restricted security standard
apiVersion: v1
kind: Namespace
metadata:
  name: secure-namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
Monitoring and Observability
Cluster Monitoring Stack
Prometheus + Grafana Setup
# Prometheus configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
    scrape_configs:
    - job_name: 'kubernetes-nodes'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        target_label: __address__
        replacement: '${1}:9100'
Essential Metrics to Monitor
- Node resource utilization (CPU, memory, disk)
- Pod resource usage and limits
- API server response times and error rates
- etcd performance metrics
- Network throughput and latency
- Storage I/O and capacity
Log Aggregation
EFK Stack (Elasticsearch, Fluentd, Kibana)
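# Fluentd DaemonSet for log collection
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd  # must match spec.selector or the DaemonSet is rejected
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch.logging.svc.cluster.local"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      # Each volumeMount needs a matching volume; these expose the node's log directories
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers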
Alerting
Critical Alerts for Admins
- Node down or not ready
- High resource utilization (>85%)
- API server unavailable
- etcd cluster health issues
- Certificate expiration warnings
- Persistent volume issues
Performance Optimization
Resource Management
Resource Quotas
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: development
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    persistentvolumeclaims: "10"
Limit Ranges
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - default:
      cpu: "200m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    type: Container
Horizontal Pod Autoscaler (HPA)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
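The HPA picks a replica count from the ratio of observed to target utilization: desired = ceil(current replicas × current / target), clamped to the min/max bounds, and with several metrics the largest result wins. The Python sketch below illustrates that arithmetic; the function name and sample utilization figures are assumptions for illustration, the 10% tolerance reflects the documented default, and stabilization windows and scaling policies are ignored.

```python
import math


def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float, min_replicas: int,
                     max_replicas: int, tolerance: float = 0.1) -> int:
    """Replica count suggested by one metric, following the documented HPA formula."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical observation: 3 replicas at 90% CPU (target 70) and 60% memory (target 80).
by_cpu = desired_replicas(3, 90, 70, min_replicas=3, max_replicas=50)
by_memory = desired_replicas(3, 60, 80, min_replicas=3, max_replicas=50)
print(max(by_cpu, by_memory))  # the HPA uses the largest recommendation
```

Utilization is measured against each container's resource requests, so the HPA above only works if the target Pods declare CPU and memory requests.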
Cluster Autoscaling
Configure cluster autoscaler to automatically add/remove nodes based on demand:
# Cluster Autoscaler deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  # A Deployment requires a selector that matches the template labels
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
      # registry.k8s.io replaces the frozen k8s.gcr.io registry
      - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.21.0
        name: cluster-autoscaler
        command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/kubernetes-cluster-name
Disaster Recovery
Backup Strategy
What to Backup
- etcd snapshots (cluster state)
- Application data (persistent volumes)
- Configuration files and secrets
- Custom resource definitions
- RBAC configurations
Backup Schedule
- etcd: Every 6 hours, retain for 30 days
- Application data: Daily, retain based on RTO/RPO requirements
- Configuration: After any changes
Recovery Procedures
etcd Recovery
1. Stop all API servers
2. Restore etcd from snapshot on all control plane nodes
3. Start etcd cluster
4. Start API servers
5. Validate cluster state
Application Recovery
1. Restore persistent volume data
2. Apply application manifests
3. Verify application functionality
4. Update DNS/load balancer records if needed
High Availability Setup
Control Plane HA
- Use odd number of control plane nodes (3 or 5)
- Separate etcd cluster or stacked etcd
- Load balancer for API server access
- Separate zones/regions for nodes
Data Redundancy
- Multiple replicas for stateful applications
- Cross-zone persistent volume replication
- Regular backup testing and validation
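The advice to use an odd number of control plane nodes comes from etcd's majority quorum: a cluster of n members needs floor(n/2) + 1 of them healthy to accept writes. The short Python sketch below works through the arithmetic (the function names are illustrative).

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with this many members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


for n in (1, 2, 3, 4, 5):
    print(f"{n} members: quorum {quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

Going from 3 to 4 members raises the quorum without tolerating any additional failure, which is why even-sized clusters add cost but no resilience.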
This administration guide provides the foundation for managing Kubernetes clusters effectively. Remember that Kubernetes is a complex system, and expertise comes with practice and experience managing real workloads.
Kubernetes for Developers
As a developer working with Kubernetes, you need to understand how to package, deploy, and debug your applications effectively. This section focuses on practical development workflows and best practices.
Application Development Workflow
The Container-First Mindset
- Before Kubernetes: Develop locally, deploy to servers
- With Kubernetes: Develop in containers, deploy to clusters
Development Workflow
1. Develop Application
   - Write code with containerization in mind
   - Create Dockerfile
   - Consider 12-factor app principles
2. Build Container Image
   - Build and tag images properly
   - Use multi-stage builds for efficiency
   - Implement health checks
3. Test Locally
   - Use Docker Compose or local Kubernetes
   - Test container behavior
   - Validate resource requirements
4. Deploy to Kubernetes
   - Write Kubernetes manifests
   - Apply to cluster
   - Monitor and debug
5. Iterate and Improve
   - Monitor application performance
   - Update and redeploy
   - Scale based on demand
Writing Kubernetes-Ready Applications
12-Factor App Principles
1. Codebase: One codebase tracked in revision control
2. Dependencies: Explicitly declare and isolate dependencies
3. Config: Store config in the environment
4. Backing Services: Treat backing services as attached resources
5. Build, Release, Run: Strictly separate build and run stages
6. Processes: Execute the app as one or more stateless processes
7. Port Binding: Export services via port binding
8. Concurrency: Scale out via the process model
9. Disposability: Maximize robustness with fast startup and graceful shutdown
10. Dev/Prod Parity: Keep development, staging, and production as similar as possible
11. Logs: Treat logs as event streams
12. Admin Processes: Run admin/management tasks as one-off processes
Configuration Management
Use ConfigMaps for Non-Sensitive Data
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  database_host: "postgres.database.svc.cluster.local"
  database_port: "5432"
  log_level: "info"
  feature_flags: |
    new_ui=true
    beta_features=false
---
# Use in Pod
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: app
    envFrom:
    - configMapRef:
        name: app-config
Use Secrets for Sensitive Data
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:  # Use stringData to avoid base64 encoding
  database_password: "super-secret-password"
  api_key: "abcd1234-5678-90ef"
---
# Use in Pod
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: app
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: app-secrets
          key: database_password
Health Checks
Liveness Probe: Is the application running?
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: app
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
Readiness Probe: Is the application ready to receive traffic?
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: app
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 3
      failureThreshold: 2
Startup Probe: For slow-starting applications
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: app
    startupProbe:
      httpGet:
        path: /startup
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 30  # Allow up to 5 minutes for startup
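The "5 minutes" in the startup probe comes from multiplying `failureThreshold` by `periodSeconds`. The Python sketch below works through that arithmetic for the probes shown above. It is a rough estimate under stated assumptions: the function names are hypothetical, and probe timeouts and scheduling jitter are ignored.

```python
def startup_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Approximate time a container gets to start before the kubelet restarts it."""
    return initial_delay + period * failure_threshold


def failure_detection_seconds(period: int, failure_threshold: int) -> int:
    """Approximate time for a running probe to declare failure after the app breaks."""
    return period * failure_threshold


# Startup probe above: 10s delay, then 30 attempts spaced 10s apart.
print(startup_budget_seconds(initial_delay=10, period=10, failure_threshold=30))
# Liveness probe above: 3 consecutive failures, 10s apart.
print(failure_detection_seconds(period=10, failure_threshold=3))
```

If your application sometimes needs longer than the startup budget, raise `failureThreshold` on the startup probe instead of inflating `initialDelaySeconds` on the liveness probe.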
Application Deployment Patterns
Rolling Updates (Default)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # Only 1 pod down at a time
      maxSurge: 1  # Only 1 extra pod during update
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: myregistry/web-app:v2.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
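`maxUnavailable` and `maxSurge` bound how many Pods exist during a rollout. The Python sketch below computes those bounds for absolute values and for percentages, assuming the documented rounding (percentages round down for `maxUnavailable` and up for `maxSurge`). The function names are illustrative.

```python
def resolve(value, replicas: int, round_up: bool) -> int:
    """Turn an absolute number or a percentage string into a Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas
        # Integer arithmetic avoids floating-point surprises when rounding.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas: int, max_unavailable, max_surge) -> tuple:
    """Return (minimum available Pods, maximum total Pods) during the update."""
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    surge = resolve(max_surge, replicas, round_up=True)
    return replicas - unavailable, replicas + surge


print(rollout_bounds(5, 1, 1))            # the Deployment above
print(rollout_bounds(10, "25%", "25%"))   # the default percentages
```

For the Deployment above this means at least 4 Pods keep serving while at most 6 exist at once, so the cluster needs headroom for one extra Pod's requests.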
Blue-Green Deployment
# Blue deployment (current)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-blue
  labels:
    version: blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
      version: blue
  template:
    metadata:
      labels:
        app: web-app
        version: blue
    spec:
      containers:
      - name: web
        image: myregistry/web-app:v1.0.0
---
# Green deployment (new)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-green
  labels:
    version: green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
      version: green
  template:
    metadata:
      labels:
        app: web-app
        version: green
    spec:
      containers:
      - name: web
        image: myregistry/web-app:v2.0.0
---
# Service (switch by changing selector)
apiVersion: v1
kind: Service
metadata:
  name: web-app-service
spec:
  selector:
    app: web-app
    version: blue  # Change to 'green' to switch traffic
  ports:
  - port: 80
    targetPort: 8080
Canary Deployment
# Main deployment (about 90% of traffic, by replica count)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-main
spec:
  replicas: 9
  selector:
    matchLabels:
      app: web-app
      track: stable
  # A Deployment needs a Pod template whose labels match the selector
  template:
    metadata:
      labels:
        app: web-app
        track: stable
    spec:
      containers:
      - name: web
        image: myregistry/web-app:v1.0.0  # assumed current stable tag
---
# Canary deployment (about 10% of traffic, by replica count)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-app
      track: canary
  template:
    metadata:
      labels:
        app: web-app
        track: canary
    spec:
      containers:
      - name: web
        image: myregistry/web-app:v2.0.0-beta
---
# Service targets both deployments
apiVersion: v1
kind: Service
metadata:
  name: web-app-service
spec:
  selector:
    app: web-app  # Selects both stable and canary
  ports:
  - port: 80
    targetPort: 8080
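With this replica-based canary pattern, the traffic split is not configured anywhere; it falls out of how many ready Pods each track contributes to the Service. The Python sketch below shows the expected shares, assuming the Service spreads connections evenly across Pods. The function name is illustrative, and real traffic only approximates these numbers.

```python
def expected_share(replicas_by_track: dict) -> dict:
    """Fraction of Service traffic each track receives if load is spread evenly per Pod."""
    total = sum(replicas_by_track.values())
    if total == 0:
        return {track: 0.0 for track in replicas_by_track}
    return {track: count / total for track, count in replicas_by_track.items()}


print(expected_share({"stable": 9, "canary": 1}))
print(expected_share({"stable": 9, "canary": 3}))
```

Because the split is tied to replica counts, a 1% canary would need 99 stable Pods; finer control usually requires an ingress controller or service mesh with weighted routing.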
Debugging Applications
Essential Debugging Commands
Pod Inspection
# Get pod status and info
kubectl get pods -o wide
kubectl describe pod <pod-name>
# Check pod logs
kubectl logs <pod-name>
kubectl logs <pod-name> --previous # Previous container instance
kubectl logs <pod-name> -c <container-name> # Multi-container pod
kubectl logs -f <pod-name> # Follow logs
# Execute commands in pod
kubectl exec -it <pod-name> -- /bin/bash
kubectl exec -it <pod-name> -c <container-name> -- sh
Service Debugging
# Check service configuration
kubectl get services
kubectl describe service <service-name>
kubectl get endpoints <service-name>
# Test service from another pod
kubectl run debug-pod --image=busybox --rm -it -- sh
# Inside the debug pod:
nslookup <service-name>
wget -qO- http://<service-name>:<port>/health
Resource Usage
# Check resource usage
kubectl top pods
kubectl top nodes
kubectl describe node <node-name>
# Check resource quotas and limits
kubectl describe resourcequota
kubectl describe limitrange
Common Issues and Solutions
Image Pull Errors
# Check image name and tag
kubectl describe pod <pod-name>
# Common causes:
# - Typo in image name or tag
# - Private registry without proper secrets
# - Image doesn't exist
# - Network issues
# Solution: Verify image exists and create image pull secret if needed
kubectl create secret docker-registry regcred --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-password>
Pod Stuck in Pending
# Check scheduling issues
kubectl describe pod <pod-name>
# Common causes:
# - Insufficient resources
# - Node selector constraints
# - Taints/tolerations
# - Pod/node affinity rules
# Check node resources
kubectl top nodes
kubectl describe nodes
Application Not Responding
# Check if pod is ready
kubectl get pods
kubectl describe pod <pod-name>
# Test connectivity
kubectl port-forward <pod-name> 8080:8080
# In another terminal:
curl localhost:8080/health
# Check service endpoints
kubectl get endpoints <service-name>
Debugging Tools
Debug Utilities Pod
apiVersion: v1
kind: Pod
metadata:
  name: debug-utils
spec:
  containers:
  - name: debug
    image: nicolaka/netshoot  # Contains many network debugging tools
    command: ["/bin/bash"]
    args: ["-c", "while true; do sleep 30; done;"]
    # Or use:
    # image: busybox  # Minimal debugging
    # image: ubuntu  # Full Linux distribution
Using kubectl debug (introduced as kubectl alpha debug in Kubernetes 1.18; ephemeral containers are stable since 1.25)
# Create debugging container in existing pod
kubectl debug <pod-name> -it --image=busybox
# Create debugging container in new pod with same settings
kubectl debug <pod-name> -it --image=busybox --copy-to=<new-pod-name>
# Debug node by creating pod on specific node
kubectl debug node/<node-name> -it --image=busybox
Development Best Practices
Resource Requests and Limits
Always Set Resource Requests
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: 100m  # 0.1 CPU cores
        memory: 128Mi  # 128 mebibytes (MiB)
      limits:
        cpu: 500m  # 0.5 CPU cores
        memory: 512Mi  # 512 mebibytes (MiB)
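Resource quantities use compact suffixes: `m` means thousandths of a CPU core, `Mi`/`Gi` are powers of 1024, and `M`/`G` are powers of 1000. The Python sketch below converts the common forms to plain numbers; it is a simplified parser under stated assumptions (hypothetical function names, only the suffixes listed, no exponent notation).

```python
# Binary suffixes are checked first so "Mi" is not mistaken for a decimal suffix.
SUFFIXES = [
    ("Ki", 1024), ("Mi", 1024**2), ("Gi", 1024**3), ("Ti", 1024**4),
    ("k", 1000), ("M", 1000**2), ("G", 1000**3), ("T", 1000**4),
]


def cpu_to_millicores(quantity: str) -> int:
    """'100m' -> 100, '0.5' -> 500, '2' -> 2000."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """'128Mi' -> 134217728, '128M' -> 128000000, '1024' -> 1024."""
    for suffix, factor in SUFFIXES:
        if quantity.endswith(suffix):
            return round(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


print(cpu_to_millicores("100m"), cpu_to_millicores("0.5"))
print(memory_to_bytes("128Mi"), memory_to_bytes("128M"))
```

The gap between `128Mi` and `128M` is about 6 MB, which is small here but worth knowing when a quota or limit looks slightly off.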
Sizing Guidelines
- Start conservative with requests
- Monitor actual usage in production
- Set limits based on maximum expected usage
- Leave some headroom for traffic spikes
Labels and Annotations
Effective Labeling Strategy
metadata:
  labels:
    app: web-server  # Application name
    component: frontend  # Component type
    version: "1.2.3"  # Application version
    tier: web  # Application tier
    environment: production  # Environment
    team: platform  # Owning team
  annotations:
    deployment.kubernetes.io/revision: "3"  # Managed by the Deployment controller; do not set by hand
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
Multi-Environment Configuration
Using Kustomize
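# base/kustomization.yaml (needed so overlays can reference ../../base)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
# base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: myregistry/web-app:latest
        env:
        - name: ENVIRONMENT
          value: base
# overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base
# 'patches' replaces 'patchesStrategicMerge', which is deprecated in Kustomize v5
patches:
- path: deployment.yaml
# overlays/production/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 10
  template:
    spec:
      containers:
      - name: web
        env:
        - name: ENVIRONMENT
          value: production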
Deploy with Kustomize
# Deploy to production
kubectl apply -k overlays/production/
# Deploy to staging
kubectl apply -k overlays/staging/
Secrets Management
External Secrets Operator
# Use external secrets from cloud providers
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
name: vault-backend
spec:
provider:
vault:
server: "https://vault.example.com"
path: "secret"
version: "v2"
auth:
kubernetes:
mountPath: "kubernetes"
role: "example-role"
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: app-secret
spec:
secretStoreRef:
name: vault-backend
kind: SecretStore
target:
name: app-secret
creationPolicy: Owner
data:
- secretKey: database_password
remoteRef:
key: app/database
property: password
Logging Best Practices
Application Logging
# Python example
import json
import logging
import sys
from datetime import datetime, timezone

# Structure logs as JSON for better parsing
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)  # without this, INFO messages are dropped by default
handler = logging.StreamHandler(sys.stdout)  # log to stdout so Kubernetes can collect it
formatter = logging.Formatter('%(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)

def log_structured(level, message, **kwargs):
    log_entry = {
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'level': level,
        'message': message,
        'service': 'web-app',
        'version': '1.2.3',
        **kwargs
    }
    logger.log(getattr(logging, level.upper()), json.dumps(log_entry))

# Usage
log_structured('info', 'User login successful', user_id=123, ip_address='192.168.1.1')
Log to stdout/stderr
- Don't write to log files inside containers
- Use stdout for application logs
- Use stderr for error logs
- Let Kubernetes handle log collection
Testing in Kubernetes
Unit Testing
- Test application logic independently
- Mock external dependencies
- Use dependency injection for testability
Integration Testing
# Test the application against a real database
apiVersion: v1
kind: Pod
metadata:
  name: integration-test
spec:
  restartPolicy: Never
  containers:
  - name: test
    image: myregistry/app-test:latest
    env:
    - name: DATABASE_URL
      # Containers in one Pod share a network namespace, so the database is reachable on localhost
      value: "postgres://postgres:testpass@localhost:5432/testdb"
    command: ["npm", "run", "test:integration"]
  - name: postgres
    image: postgres:13
    env:
    - name: POSTGRES_DB
      value: testdb
    - name: POSTGRES_PASSWORD
      value: testpass
End-to-End Testing
# Deploy test environment
kubectl apply -f test-environment/
# Run E2E tests
kubectl run e2e-test --image=myregistry/e2e-tests:latest --rm -it --restart=Never -- npm run test:e2e
# Cleanup
kubectl delete -f test-environment/
This development guide provides the foundation for building and deploying applications successfully on Kubernetes. Remember to start simple and gradually adopt more advanced patterns as your team's Kubernetes expertise grows.
Resource Types Reference
Kubernetes Resource Types Reference
This comprehensive reference covers all major Kubernetes resource types, when to use them, and practical examples for each.
Workload Resources
Pod
The smallest deployable unit in Kubernetes.
When to Use:
- Testing and development
- One-off tasks
- Debugging
Best Practices:
- Rarely create Pods directly in production
- Use higher-level controllers (Deployments, Jobs, etc.)
- Always set resource requests and limits
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: web-server
spec:
  containers:
  - name: web-container
    image: nginx:1.21
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 200m
        memory: 256Mi
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
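Requests and limits also determine a Pod's quality-of-service class, which influences eviction order under node pressure. The sketch below approximates that classification for CPU and memory only. The `qos_class` function is a hypothetical helper, and it compares quantities as plain strings, so `"0.1"` and `"100m"` would be treated as different here even though Kubernetes considers them equal.

```python
def qos_class(containers):
    """Classify a Pod from its containers' `resources` dicts.

    Each item looks like {"requests": {...}, "limits": {...}};
    either key may be absent.
    """
    any_set = False
    guaranteed = True
    for res in containers:
        requests = res.get("requests", {})
        limits = res.get("limits", {})
        for name in ("cpu", "memory"):
            if name in requests or name in limits:
                any_set = True
            limit = limits.get(name)
            # An omitted request defaults to the limit when a limit is set.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# The example Pod above requests less than its limits.
example = [{
    "requests": {"cpu": "100m", "memory": "128Mi"},
    "limits": {"cpu": "200m", "memory": "256Mi"},
}]
print(qos_class(example))
```

Setting requests equal to limits for every container is what moves a Pod into the Guaranteed class.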
ReplicaSet
Maintains a stable set of replica Pods running at any given time.
When to Use:
- Rarely used directly
- Managed automatically by Deployments
- Only use directly if you need custom update orchestration
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: web-replicaset
spec:
replicas: 3
selector:
matchLabels:
app: web
template:
metadata:
labels:
app: web
spec:
containers:
- name: web
image: nginx:1.21
ports:
- containerPort: 80
Deployment
Provides declarative updates for Pods and ReplicaSets.
When to Use:
- Stateless applications
- Web servers, APIs, microservices
- Most common workload resource
Key Features:
- Rolling updates
- Rollback capabilities
- Scaling
- Update strategies
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-deployment
labels:
app: web
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
selector:
matchLabels:
app: web
template:
metadata:
labels:
app: web
spec:
containers:
- name: web
image: nginx:1.21
ports:
- containerPort: 80
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
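To see what `maxUnavailable` and `maxSurge` mean for the manifest above, the following sketch computes the Pod-count bounds during a rolling update. The helper names are hypothetical. Percentages are resolved the way the Deployment documentation describes (`maxUnavailable` rounds down, `maxSurge` rounds up); the special case where both resolve to zero is not handled here.

```python
import math


def resolve(value, replicas, round_up):
    """Turn an absolute count or a percentage string into a Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        exact = replicas * int(value[:-1]) / 100
        return math.ceil(exact) if round_up else math.floor(exact)
    return int(value)


def rollout_bounds(replicas, max_unavailable, max_surge):
    """Return (minimum available Pods, maximum total Pods) during a rollout."""
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    surge = resolve(max_surge, replicas, round_up=True)
    return replicas - unavailable, replicas + surge


# Values from the Deployment above: 3 replicas, maxUnavailable 1, maxSurge 1.
print(rollout_bounds(3, 1, 1))
```

With these settings the rollout keeps at least 2 Pods available and never runs more than 4 at once.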
StatefulSet
Manages stateful applications with stable network identities and persistent storage.
When to Use:
- Databases (PostgreSQL, MySQL, MongoDB)
- Message queues (RabbitMQ, Kafka)
- Applications requiring stable network identity
- Applications requiring persistent storage
Key Features:
- Stable Pod naming (pod-0, pod-1, etc.)
- Ordered deployment and scaling
- Stable storage with PVC templates
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres-statefulset
spec:
serviceName: postgres-service
replicas: 3
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:13
env:
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secret
key: password
ports:
- containerPort: 5432
volumeMounts:
- name: postgres-storage
mountPath: /var/lib/postgresql/data
volumeClaimTemplates:
- metadata:
name: postgres-storage
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 10Gi
storageClassName: fast-ssd
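The stable identities mentioned above follow predictable naming patterns, which the sketch below reproduces for the manifest's values. The function name is hypothetical, the namespace `default` and cluster domain `cluster.local` are assumptions, and the per-Pod DNS names only resolve when `serviceName` refers to a headless Service.

```python
def statefulset_identities(name, service_name, replicas, claim_templates,
                           namespace="default", cluster_domain="cluster.local"):
    """List the Pod name, DNS name and PVC names for each ordinal."""
    identities = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        identities.append({
            "pod": pod,
            "dns": f"{pod}.{service_name}.{namespace}.svc.{cluster_domain}",
            # PVCs are named <claim template name>-<pod name>
            "pvcs": [f"{claim}-{pod}" for claim in claim_templates],
        })
    return identities


for identity in statefulset_identities(
        "postgres-statefulset", "postgres-service", 3, ["postgres-storage"]):
    print(identity["pod"], identity["dns"], identity["pvcs"])
```

Because the PVC name is derived from the Pod name, a rescheduled `postgres-statefulset-1` reattaches to the same claim rather than receiving a fresh volume.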
DaemonSet
Ensures that all (or some) nodes run a copy of a Pod.
When to Use:
- Node monitoring (Prometheus node exporter)
- Log collection (Fluentd, Filebeat)
- Storage daemons (Ceph, GlusterFS)
- Network components (CNI plugins)
Key Features:
- Runs on every node by default
- Automatically scales with cluster
- Can be restricted to specific nodes
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd-daemonset
namespace: kube-system
spec:
selector:
matchLabels:
app: fluentd
template:
metadata:
labels:
app: fluentd
spec:
serviceAccountName: fluentd # serviceAccount is a deprecated alias for this field
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "elasticsearch.logging.svc.cluster.local"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
resources:
limits:
memory: 200Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
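tolerations:
# Newer clusters taint control-plane nodes with this key; older ones used node-role.kubernetes.io/master
- key: node-role.kubernetes.io/control-plane
operator: Exists
effect: NoSchedule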
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
Job
Runs pods to completion for batch workloads.
When to Use:
- Data processing tasks
- Database migrations
- Backup operations
- One-time computations
Key Features:
- Runs to completion
- Can run in parallel
- Automatic cleanup options
apiVersion: batch/v1
kind: Job
metadata:
name: database-migration
spec:
completions: 1
parallelism: 1
backoffLimit: 3
template:
metadata:
labels:
app: db-migration
spec:
restartPolicy: Never
containers:
- name: migration
image: migrate/migrate:latest
command:
- migrate
- -path
- /migrations
- -database
- $(DATABASE_URL) # Expanded by Kubernetes from the env var defined below, keeping credentials out of the manifest
- up
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-secret
key: url
volumeMounts:
- name: migrations
mountPath: /migrations
volumes:
- name: migrations
configMap:
name: migration-scripts
CronJob
Manages time-based Jobs on a repeating schedule.
When to Use:
- Scheduled backups
- Report generation
- Cleanup tasks
- Periodic data processing
Key Features:
- Cron-like scheduling
- Job history limits
- Concurrency policies
apiVersion: batch/v1
kind: CronJob
metadata:
name: backup-cronjob
spec:
schedule: "0 2 * * *" # Every day at 2 AM
concurrencyPolicy: Forbid
successfulJobsHistoryLimit: 3
failedJobsHistoryLimit: 1
jobTemplate:
spec:
template:
spec:
restartPolicy: OnFailure
containers:
- name: backup
image: postgres:13
command:
- sh
- -c
- |
pg_dump "$DATABASE_URL" | gzip > /backup/backup-$(date +%Y%m%d-%H%M%S).sql.gz
# Upload to S3 or other storage (the image must also provide the aws CLI; the stock postgres image does not)
aws s3 cp /backup/backup-*.sql.gz s3://my-backups/
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: postgres-secret
key: url
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: aws-secret
key: access-key
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: aws-secret
key: secret-key
volumeMounts:
- name: backup-storage
mountPath: /backup
volumes:
- name: backup-storage
emptyDir: {}
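To make the five-field `schedule` syntax concrete, here is a simplified matcher that checks whether a given moment satisfies an expression such as `0 2 * * *`. It is a sketch under stated assumptions, not the scheduler Kubernetes uses: the function names are hypothetical, and only `*`, numbers, ranges, comma lists and `/n` steps are handled (no month or weekday names, no `@daily` macros, no `7` for Sunday). Time zone handling is left to the caller.

```python
from datetime import datetime


def field_matches(field, value, minimum):
    """Return True if `value` satisfies one cron field."""
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            low, high = minimum, value
        elif "-" in base:
            low, high = (int(x) for x in base.split("-"))
        else:
            low = high = int(base)
        if low <= value <= high and (value - low) % step == 0:
            return True
    return False


def cron_matches(schedule, moment):
    """Check a 'minute hour day-of-month month day-of-week' expression."""
    minute, hour, dom, month, dow = schedule.split()
    time_ok = (field_matches(minute, moment.minute, 0)
               and field_matches(hour, moment.hour, 0)
               and field_matches(month, moment.month, 1))
    dom_ok = field_matches(dom, moment.day, 1)
    dow_ok = field_matches(dow, moment.isoweekday() % 7, 0)  # cron: 0 = Sunday
    if dom.startswith("*") or dow.startswith("*"):
        day_ok = dom_ok and dow_ok
    else:
        day_ok = dom_ok or dow_ok  # both restricted: either one may match
    return time_ok and day_ok


print(cron_matches("0 2 * * *", datetime(2024, 1, 15, 2, 0)))
```

The backup schedule above therefore fires once per day, at minute 0 of hour 2.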
Service Resources
Service
An abstract way to expose an application running on a set of Pods as a network service.
Service Types:
ClusterIP (Default)
- Internal cluster access only
- Stable internal IP address
apiVersion: v1
kind: Service
metadata:
name: web-service-internal
spec:
type: ClusterIP
selector:
app: web
ports:
- port: 80
targetPort: 8080
protocol: TCP
NodePort
- Exposes the Service on each node's IP at a static port
- Accessible from outside the cluster
apiVersion: v1
kind: Service
metadata:
name: web-service-nodeport
spec:
type: NodePort
selector:
app: web
ports:
- port: 80
targetPort: 8080
nodePort: 30080 # Optional, will be auto-assigned if not specified
LoadBalancer
- Creates an external load balancer (cloud provider specific)
- Assigns an external IP address
apiVersion: v1
kind: Service
metadata:
name: web-service-lb
spec:
type: LoadBalancer
selector:
app: web
ports:
- port: 80
targetPort: 8080
loadBalancerSourceRanges:
- 10.0.0.0/8
- 192.168.0.0/16
ExternalName
- Maps the Service to an external DNS name
- Returns a CNAME record
apiVersion: v1
kind: Service
metadata:
name: external-database
spec:
type: ExternalName
externalName: my-database.example.com
ports:
- port: 5432
Ingress
API object that manages external access to services, typically HTTP.
When to Use:
- HTTP/HTTPS traffic routing
- SSL termination
- Name-based virtual hosting
- Load balancing
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: web-ingress
annotations:
nginx.ingress.kubernetes.io/rewrite-target: / # Caution: rewrites every matched path to "/", so /api/users reaches api-service as /
nginx.ingress.kubernetes.io/ssl-redirect: "true"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
tls:
- hosts:
- myapp.example.com
secretName: myapp-tls
rules:
- host: myapp.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: web-service
port:
number: 80
- path: /api
pathType: Prefix
backend:
service:
name: api-service
port:
number: 8080
Config and Storage Resources
ConfigMap
Stores non-confidential configuration data as key-value pairs.
When to Use:
- Application configuration
- Environment-specific settings
- Configuration files
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
data:
# Key-value pairs
database_host: "postgres.example.com"
database_port: "5432"
log_level: "info"
# File content
nginx.conf: |
server {
listen 80;
server_name localhost;
location / {
proxy_pass http://backend:8080;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
# JSON configuration
app-settings.json: |
{
"features": {
"newUI": true,
"betaFeatures": false
},
"limits": {
"maxUsers": 1000,
"maxRequests": 10000
}
}
---
# Usage in Pod
apiVersion: v1
kind: Pod
spec:
containers:
- name: app
# Use as environment variables
envFrom:
- configMapRef:
name: app-config
# Use specific keys as env vars
env:
- name: LOG_LEVEL
valueFrom:
configMapKeyRef:
name: app-config
key: log_level
# Mount as files
volumeMounts:
- name: config-volume
mountPath: /etc/nginx/nginx.conf
subPath: nginx.conf
- name: app-settings
mountPath: /app/config
volumes:
- name: config-volume
configMap:
name: app-config
- name: app-settings
configMap:
name: app-config
items:
- key: app-settings.json
path: settings.json
Secret
Stores and manages sensitive information.
Secret Types:
- Opaque: arbitrary user-defined data (default)
- kubernetes.io/dockerconfigjson: Docker registry credentials
- kubernetes.io/tls: TLS certificate data
- kubernetes.io/service-account-token: Service account token
# Opaque secret
apiVersion: v1
kind: Secret
metadata:
name: app-secret
type: Opaque
stringData: # Use stringData to avoid base64 encoding
database_password: "super-secret-password"
api_key: "abcd1234-5678-90ef"
data: # Use data for base64 encoded values
username: YWRtaW4= # "admin" base64 encoded
---
# Docker registry secret
apiVersion: v1
kind: Secret
metadata:
name: regcred
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: eyJhdXRocyI6eyJteS1yZWdpc3RyeS5jb20iOnsidXNlcm5hbWUiOiJteXVzZXIiLCJwYXNzd29yZCI6Im15cGFzcyIsImF1dGgiOiJiWGwxYzJWeU9tMTVjR0Z6Y3c9PSJ9fX0=
---
# TLS secret
apiVersion: v1
kind: Secret
metadata:
name: tls-secret
type: kubernetes.io/tls
data:
tls.crt: LS0tLS1CRUdJTi... # Base64 encoded certificate
tls.key: LS0tLS1CRUdJTi... # Base64 encoded private key
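Values under `data` are base64 encoded, which is an encoding and not encryption: anyone who can read the Secret can decode it. The sketch below shows how such values could be produced. The helper names are hypothetical, and the registry name and credentials are placeholders.

```python
import base64
import json


def encode_value(text):
    """Base64-encode a string the way Secret `data` fields expect."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def docker_config_json(registry, username, password):
    """Build the value for the `.dockerconfigjson` key of a registry Secret."""
    auth = encode_value(f"{username}:{password}")
    config = {"auths": {registry: {
        "username": username,
        "password": password,
        "auth": auth,
    }}}
    return encode_value(json.dumps(config, separators=(",", ":")))


print(encode_value("admin"))  # matches the `username` value shown above
encoded = docker_config_json("my-registry.com", "myuser", "mypass")
print(json.loads(base64.b64decode(encoded)))
```

Decoding works the same way in reverse, which is why access to Secrets should be restricted with RBAC rather than relying on the encoding.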
PersistentVolume (PV)
Represents a piece of storage in the cluster that has been provisioned by an administrator.
apiVersion: v1
kind: PersistentVolume
metadata:
name: postgres-pv
labels:
type: local
spec:
capacity:
storage: 20Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: local-storage
local:
path: /data/postgres
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker-node-1
PersistentVolumeClaim (PVC)
A request for storage by a user.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgres-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
storageClassName: fast-ssd
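# A non-empty selector limits binding to pre-provisioned PVs carrying these labels;
# such a claim is never dynamically provisioned, so omit the selector to use the fast-ssd provisioner
selector:
matchLabels:
environment: production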
StorageClass
Describes the "classes" of storage offered by administrators.
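apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
# gp3 with custom IOPS and throughput needs the AWS EBS CSI driver;
# the in-tree kubernetes.io/aws-ebs provisioner is deprecated
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  csi.storage.k8s.io/fstype: ext4
  encrypted: "true"
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer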
Networking Resources
NetworkPolicy
Controls traffic flow between Pods and network endpoints.
Default Deny All Ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: production
spec:
podSelector: {}
policyTypes:
- Ingress
Allow Specific Traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-frontend-to-backend
namespace: production
spec:
podSelector:
matchLabels:
app: backend
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector:
matchLabels:
app: frontend
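- namespaceSelector: # Separate list item, so it is ORed with the podSelector above
matchLabels:
kubernetes.io/metadata.name: staging # Label set automatically on every namespace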
ports:
- protocol: TCP
port: 8080
egress:
- to:
- podSelector:
matchLabels:
app: database
ports:
- protocol: TCP
port: 5432
- to: [] # Empty list: any destination, limited to the DNS ports below
ports:
- protocol: UDP
port: 53
- protocol: TCP
port: 53
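How the entries of an ingress `from` list combine is a common source of confusion, so here is a small model of the matching logic. It is an illustrative sketch with hypothetical function names, it only understands `matchLabels`, and it ignores ports. The namespace label `kubernetes.io/metadata.name` is assumed to be present on each namespace.

```python
def labels_match(match_labels, labels):
    """An empty selector matches everything; otherwise every pair must be present."""
    return all(labels.get(k) == v for k, v in match_labels.items())


def peer_allows(peer, policy_namespace, src_namespace, src_ns_labels, src_pod_labels):
    """Evaluate one entry of an ingress `from` list against a source Pod."""
    pod_sel = peer.get("podSelector")
    ns_sel = peer.get("namespaceSelector")
    if ns_sel is None:
        # A podSelector on its own only selects Pods in the policy's namespace.
        ns_ok = src_namespace == policy_namespace
    else:
        ns_ok = labels_match(ns_sel.get("matchLabels", {}), src_ns_labels)
    pod_ok = pod_sel is None or labels_match(pod_sel.get("matchLabels", {}), src_pod_labels)
    # Selectors inside one entry are ANDed.
    return ns_ok and pod_ok


def ingress_allowed(from_peers, **source):
    """Entries in `from` are ORed: one matching peer is enough."""
    return any(peer_allows(peer, **source) for peer in from_peers)


peers = [
    {"podSelector": {"matchLabels": {"app": "frontend"}}},
    {"namespaceSelector": {"matchLabels": {"kubernetes.io/metadata.name": "staging"}}},
]
print(ingress_allowed(
    peers,
    policy_namespace="production",
    src_namespace="staging",
    src_ns_labels={"kubernetes.io/metadata.name": "staging"},
    src_pod_labels={"app": "batch"},
))
```

Written as two list items, the peers admit frontend Pods in `production` or any Pod in `staging`. Putting both selectors in a single item would instead require both conditions at once.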
Security Resources
ServiceAccount
Provides an identity for processes that run in a Pod.
apiVersion: v1
kind: ServiceAccount
metadata:
name: my-service-account
namespace: production
automountServiceAccountToken: true
imagePullSecrets:
- name: regcred
secrets:
- name: my-secret
---
# Use in Pod
apiVersion: v1
kind: Pod
spec:
serviceAccountName: my-service-account
containers:
- name: app
image: myapp:latest
Role
Defines permissions within a namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: production
name: pod-manager
rules:
- apiGroups: [""]
resources: ["pods", "pods/log"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apps"]
resources: ["deployments"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: [""]
resources: ["services"]
verbs: ["get", "list", "watch"]
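RBAC rules are purely additive: a request is allowed if any rule covers its API group, resource and verb. The sketch below models that check for the `pod-manager` Role. The function names are hypothetical, and resource names and non-resource URLs are left out for brevity.

```python
def rule_allows(rule, api_group, resource, verb):
    """Check one rule; "*" in a list acts as a wildcard."""
    def covers(allowed, wanted):
        return "*" in allowed or wanted in allowed
    return (covers(rule["apiGroups"], api_group)
            and covers(rule["resources"], resource)
            and covers(rule["verbs"], verb))


def role_allows(rules, api_group, resource, verb):
    """Access is granted if any rule matches; there are no deny rules."""
    return any(rule_allows(rule, api_group, resource, verb) for rule in rules)


# Rules copied from the pod-manager Role above.
POD_MANAGER_RULES = [
    {"apiGroups": [""], "resources": ["pods", "pods/log"],
     "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"]},
    {"apiGroups": ["apps"], "resources": ["deployments"],
     "verbs": ["get", "list", "watch", "create", "update", "patch"]},
    {"apiGroups": [""], "resources": ["services"],
     "verbs": ["get", "list", "watch"]},
]

print(role_allows(POD_MANAGER_RULES, "", "pods", "delete"))
print(role_allows(POD_MANAGER_RULES, "apps", "deployments", "delete"))
```

On a live cluster, `kubectl auth can-i delete deployments --as alice -n production` answers the same question against the real authorizer.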
ClusterRole
Defines cluster-wide permissions.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cluster-admin-reader
rules:
- apiGroups: [""]
resources: ["*"]
verbs: ["get", "list", "watch"]
- apiGroups: ["apps", "extensions"]
resources: ["*"]
verbs: ["get", "list", "watch"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources: ["*"]
verbs: ["get", "list", "watch"]
RoleBinding
Grants permissions defined in a Role to users, groups, or ServiceAccounts.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: pod-manager-binding
namespace: production
subjects:
- kind: User
name: alice
apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
name: my-service-account
namespace: production
- kind: Group
name: production-team
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: Role
name: pod-manager
apiGroup: rbac.authorization.k8s.io
ClusterRoleBinding
Grants permissions defined in a ClusterRole across the entire cluster.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: cluster-admin-binding
subjects:
- kind: User
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
name: admin-service-account
namespace: kube-system
roleRef:
kind: ClusterRole
name: cluster-admin # Built-in superuser ClusterRole; grants full control over the cluster
apiGroup: rbac.authorization.k8s.io
PodSecurityPolicy (Deprecated in 1.21, removed in 1.25)
Use Pod Security Standards, enforced by the built-in Pod Security Admission controller, instead.
Pod Security Standards
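Pod Security Standards define three policy levels (privileged, baseline, and restricted). Pod Security Admission applies them per namespace through labels, in enforce, audit, or warn mode.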
# Enforce restricted security standard
apiVersion: v1
kind: Namespace
metadata:
name: secure-namespace
labels:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted
Auto-scaling Resources
HorizontalPodAutoscaler (HPA)
Automatically scales the number of Pods based on observed metrics.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: web-app-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
minReplicas: 3
maxReplicas: 100
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
- type: Pods
pods:
metric:
name: requests_per_second
target:
type: AverageValue
averageValue: "100"
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 100
periodSeconds: 60
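The core of the autoscaler's decision is a ratio between the observed and the target metric value. The sketch below applies the documented formula, desiredReplicas = ceil(currentReplicas * currentValue / targetValue), for a single metric. The function name is hypothetical, the 10% tolerance is the documented default, and the `behavior` policies and stabilization windows from the manifest are not modelled.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas, tolerance=0.1):
    """Compute the replica count the HPA would aim for with one metric."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 3 replicas averaging 90% CPU against the 70% target above.
print(desired_replicas(3, 90, 70, min_replicas=3, max_replicas=100))
```

When several metrics are configured, as in the manifest above, the autoscaler evaluates each one and uses the largest resulting replica count.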
VerticalPodAutoscaler (VPA)
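Automatically adjusts Pod resource requests (and, proportionally, limits) based on observed usage. VPA is not part of core Kubernetes: its custom resource definitions and controllers are installed separately from the Kubernetes autoscaler project. Avoid combining it with an HPA that scales on CPU or memory for the same workload, because both controllers would react to the same signal.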
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: web-app-vpa
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
updatePolicy:
updateMode: "Auto" # Auto, Initial, Off
resourcePolicy:
containerPolicies:
- containerName: web-container
maxAllowed:
cpu: 2
memory: 2Gi
minAllowed:
cpu: 100m
memory: 128Mi
controlledResources: ["cpu", "memory"]
Resource Management
ResourceQuota
Constrains resource consumption in a namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
name: compute-quota
namespace: production
spec:
hard:
# Compute resources
requests.cpu: "20"
requests.memory: 40Gi
limits.cpu: "40"
limits.memory: 80Gi
# Object counts
pods: "50"
services: "10"
secrets: "20"
configmaps: "20"
persistentvolumeclaims: "10"
# Storage
requests.storage: "100Gi"
# Extended resources
requests.nvidia.com/gpu: "4"
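Quota values use Kubernetes quantity notation (`100m`, `40Gi`), so checking whether a new workload still fits means converting them to plain numbers first. The parser below is a minimal sketch with hypothetical function names. It covers the common CPU and memory suffixes only and is not a full implementation of the quantity grammar.

```python
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def cpu_millicores(quantity):
    """Convert a CPU quantity such as "100m", "0.5" or "20" to millicores."""
    text = str(quantity)
    if text.endswith("m"):
        return int(text[:-1])
    return round(float(text) * 1000)


def memory_bytes(quantity):
    """Convert a memory quantity such as "128Mi" or "1G" to bytes."""
    text = str(quantity)
    # Binary suffixes are checked first so "Mi" is not mistaken for "M".
    for suffixes in (_BINARY, _DECIMAL):
        for suffix, factor in suffixes.items():
            if text.endswith(suffix):
                return int(float(text[:-len(suffix)]) * factor)
    return int(text)


def fits_quota(used, requested, hard, parse):
    """True if `used + requested` stays within the `hard` limit."""
    return parse(used) + parse(requested) <= parse(hard)


# Against the compute-quota above: 19.5 CPUs already requested, 500m more wanted.
print(fits_quota("19500m", "500m", "20", cpu_millicores))
print(fits_quota("39Gi", "2Gi", "40Gi", memory_bytes))
```

A Pod that would push the namespace over any `hard` value is rejected at creation time, which is why workloads in a quota-constrained namespace need explicit requests and limits.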
LimitRange
Constrains resource allocation per object in a namespace.
apiVersion: v1
kind: LimitRange
metadata:
name: default-limits
namespace: production
spec:
limits:
# Default limits for containers
- default:
cpu: "200m"
memory: "256Mi"
defaultRequest:
cpu: "100m"
memory: "128Mi"
type: Container
# Limits for Pods
- max:
cpu: "4"
memory: "8Gi"
min:
cpu: "50m"
memory: "64Mi"
type: Pod
# Limits for PVCs
- max:
storage: "50Gi"
min:
storage: "1Gi"
type: PersistentVolumeClaim
This comprehensive resource reference provides the foundation for understanding and using all major Kubernetes resources effectively. Each resource type serves specific use cases, and understanding when and how to use them is key to successful Kubernetes operations.
API Schemas and Examples
Kubernetes API Schemas and Examples
This section provides detailed API schemas and practical examples for all major Kubernetes resources. Use this as a reference when writing your own YAML manifests.
Understanding Kubernetes APIs
API Versioning
Kubernetes uses API versions to indicate the stability and compatibility of resources:
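- alpha (for example, v1alpha1): Early development; may change or be removed without notice
- beta (for example, v1beta1): Well-tested, but details may still change
- stable (for example, v1): Production-ready, stable interface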
API Groups
Resources are organized into API groups:
- Core/Legacy ("" or v1): Pods, Services, ConfigMaps
- apps: Deployments, ReplicaSets, StatefulSets, DaemonSets
- batch: Jobs, CronJobs
- networking.k8s.io: NetworkPolicies, Ingress
- rbac.authorization.k8s.io: Roles, RoleBindings
- autoscaling: HorizontalPodAutoscaler
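The group and version in a manifest's `apiVersion` decide which REST path the API server exposes for the resource. The sketch below derives the collection path from them. The function names are hypothetical, and the plural resource name has to be supplied by the caller because it cannot be inferred reliably from `kind`.

```python
def split_api_version(api_version):
    """Split "apps/v1" into ("apps", "v1"); the core group yields ("", "v1")."""
    group, _, version = api_version.rpartition("/")
    return group, version


def collection_path(api_version, plural, namespace=None):
    """Build the REST path for listing a resource type."""
    group, version = split_api_version(api_version)
    # The core group lives under /api, every named group under /apis.
    prefix = f"/apis/{group}/{version}" if group else f"/api/{version}"
    if namespace:
        return f"{prefix}/namespaces/{namespace}/{plural}"
    return f"{prefix}/{plural}"


print(collection_path("v1", "pods", "default"))
print(collection_path("apps/v1", "deployments", "production"))
```

`kubectl get --raw` accepts such a path directly, which can help when checking what a given API version actually serves.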
Complete API Schemas
Pod API Schema
apiVersion: v1 # API version (required)
kind: Pod # Resource type (required)
metadata: # Object metadata (required)
name: string # Pod name (required)
namespace: string # Namespace (optional, defaults to 'default')
labels: # Key-value pairs for identification
key1: value1
key2: value2
annotations: # Non-identifying metadata
key1: value1
key2: value2
creationTimestamp: string # RFC 3339 date-time (read-only)
uid: string # Unique identifier (read-only)
resourceVersion: string # Resource version (read-only)
generation: integer # Generation number (read-only)
finalizers: []string # List of finalizers
ownerReferences: # Objects that own this object (used for garbage collection)
- apiVersion: string
kind: string
name: string
uid: string
controller: boolean
blockOwnerDeletion: boolean
spec: # Pod specification (required)
activeDeadlineSeconds: integer # Duration in seconds pod may be active
affinity: # Scheduling constraints
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: string
operator: In|NotIn|Exists|DoesNotExist|Gt|Lt
values: []string
preferredDuringSchedulingIgnoredDuringExecution:
- weight: integer # 1-100
preference:
matchExpressions:
- key: string
operator: In|NotIn|Exists|DoesNotExist|Gt|Lt
values: []string
podAffinity: # Similar structure to nodeAffinity
requiredDuringSchedulingIgnoredDuringExecution: []
preferredDuringSchedulingIgnoredDuringExecution: []
podAntiAffinity: # Similar structure to nodeAffinity
requiredDuringSchedulingIgnoredDuringExecution: []
preferredDuringSchedulingIgnoredDuringExecution: []
automountServiceAccountToken: boolean # Whether to auto mount SA token
containers: # List of containers (required)
- name: string # Container name (required)
image: string # Container image (required)
imagePullPolicy: Always|Never|IfNotPresent # Image pull policy
command: []string # Container entrypoint
args: []string # Arguments to entrypoint
workingDir: string # Container's working directory
ports: # Exposed ports
- name: string # Port name
containerPort: integer # Port number (required)
hostPort: integer # Port on host
protocol: TCP|UDP|SCTP # Protocol (default: TCP)
env: # Environment variables
- name: string # Variable name (required)
value: string # Variable value
valueFrom: # Source for the environment variable
fieldRef:
fieldPath: string # Pod field reference
apiVersion: string
resourceFieldRef:
containerName: string
resource: string
divisor: string
configMapKeyRef:
name: string # ConfigMap name
key: string # Key in ConfigMap
optional: boolean
secretKeyRef:
name: string # Secret name
key: string # Key in Secret
optional: boolean
envFrom: # Populate env from ConfigMap/Secret
- configMapRef:
name: string
optional: boolean
- secretRef:
name: string
optional: boolean
resources: # Resource requirements
requests: # Minimum resources needed
cpu: string # CPU request (e.g., "100m", "0.5")
memory: string # Memory request (e.g., "128Mi", "1Gi")
ephemeral-storage: string # Ephemeral storage request
nvidia.com/gpu: string # GPU request (extended resource)
limits: # Maximum resources allowed
cpu: string # CPU limit
memory: string # Memory limit
ephemeral-storage: string # Ephemeral storage limit
nvidia.com/gpu: string # GPU limit
volumeMounts: # Pod volumes to mount
- name: string # Volume name (required)
mountPath: string # Path within container (required)
subPath: string # Path within volume
readOnly: boolean # Mount read-only (default: false)
mountPropagation: None|HostToContainer|Bidirectional
livenessProbe: # Liveness probe configuration
httpGet:
path: string
port: integer|string
host: string
scheme: HTTP|HTTPS
httpHeaders:
- name: string
value: string
tcpSocket:
port: integer|string
host: string
exec:
command: []string
initialDelaySeconds: integer # Delay before first probe (default: 0)
periodSeconds: integer # Probe frequency (default: 10)
timeoutSeconds: integer # Probe timeout (default: 1)
successThreshold: integer # Success threshold (default: 1)
failureThreshold: integer # Failure threshold (default: 3)
readinessProbe: # Readiness probe (same structure as liveness)
# ... same fields as livenessProbe
startupProbe: # Startup probe (same structure as liveness)
# ... same fields as livenessProbe
lifecycle: # Container lifecycle hooks
postStart:
exec:
command: []string
httpGet:
path: string
port: integer|string
host: string
scheme: HTTP|HTTPS
preStop:
exec:
command: []string
httpGet:
path: string
port: integer|string
host: string
scheme: HTTP|HTTPS
terminationMessagePath: string # Path for termination message
terminationMessagePolicy: File|FallbackToLogsOnError
securityContext: # Container security context
runAsUser: integer # User ID to run container
runAsGroup: integer # Group ID to run container
runAsNonRoot: boolean # Must run as non-root user
readOnlyRootFilesystem: boolean # Root filesystem read-only
allowPrivilegeEscalation: boolean # Allow privilege escalation
privileged: boolean # Run in privileged mode
procMount: Default|Unmasked # How proc is mounted
capabilities:
add: []string # Capabilities to add
drop: []string # Capabilities to drop
seLinuxOptions:
user: string
role: string
type: string
level: string
windowsOptions:
gmsaCredentialSpecName: string
gmsaCredentialSpec: string
runAsUserName: string
stdin: boolean # Allocate stdin (default: false)
stdinOnce: boolean # Close stdin after attach (default: false)
tty: boolean # Allocate TTY (default: false)
dnsConfig: # DNS configuration
nameservers: []string # List of DNS servers
searches: []string # List of DNS search domains
options: # List of DNS options
- name: string
value: string
dnsPolicy: ClusterFirst|ClusterFirstWithHostNet|Default|None
enableServiceLinks: boolean # Enable service environment variables
hostAliases: # Host aliases for /etc/hosts
- ip: string
hostnames: []string
hostIPC: boolean # Use host IPC namespace
hostNetwork: boolean # Use host network namespace
hostPID: boolean # Use host PID namespace
hostname: string # Pod hostname
subdomain: string # Pod subdomain
imagePullSecrets: # Secrets for pulling images
- name: string
initContainers: # Init containers (same structure as containers)
- name: string
image: string
# ... same fields as regular containers
nodeName: string # Node name to schedule pod on
nodeSelector: # Node selector constraints
key1: value1
key2: value2
overhead: # Resource overhead
cpu: string
memory: string
preemptionPolicy: PreemptLowerPriority|Never # Preemption policy
priority: integer # Pod priority
priorityClassName: string # Priority class name
readinessGates: # Additional readiness conditions
- conditionType: string
restartPolicy: Always|OnFailure|Never # Pod restart policy
runtimeClassName: string # Runtime class for pod
schedulerName: string # Scheduler name
securityContext: # Pod security context
runAsUser: integer
runAsGroup: integer
runAsNonRoot: boolean
supplementalGroups: []integer
fsGroup: integer
fsGroupChangePolicy: Always|OnRootMismatch
seccompProfile:
type: RuntimeDefault|Unconfined|Localhost
localhostProfile: string
seLinuxOptions:
user: string
role: string
type: string
level: string
sysctls:
- name: string
value: string
windowsOptions:
gmsaCredentialSpecName: string
gmsaCredentialSpec: string
runAsUserName: string
serviceAccount: string # Service account (deprecated, use serviceAccountName)
serviceAccountName: string # Service account name
shareProcessNamespace: boolean # Share process namespace
terminationGracePeriodSeconds: integer # Grace period for termination
tolerations: # Tolerations for taints
- key: string
operator: Equal|Exists
value: string
effect: NoSchedule|PreferNoSchedule|NoExecute
tolerationSeconds: integer
topologySpreadConstraints: # Topology spread constraints
- maxSkew: integer
topologyKey: string
whenUnsatisfiable: DoNotSchedule|ScheduleAnyway
labelSelector:
matchLabels:
key1: value1
matchExpressions:
- key: string
operator: In|NotIn|Exists|DoesNotExist
values: []string
volumes: # Pod volumes
- name: string # Volume name (required)
# Volume types (use only one)
emptyDir:
medium: "" # "" for disk, "Memory" for tmpfs
sizeLimit: string # Size limit
hostPath:
path: string # Host path (required)
type: "" # Path type
configMap:
name: string # ConfigMap name
optional: boolean
defaultMode: integer # File permissions
items:
- key: string
path: string
mode: integer
secret:
secretName: string # Secret name
optional: boolean
defaultMode: integer
items:
- key: string
path: string
mode: integer
persistentVolumeClaim:
claimName: string # PVC name (required)
readOnly: boolean
nfs:
server: string # NFS server (required)
path: string # NFS path (required)
readOnly: boolean
awsElasticBlockStore:
volumeID: string # EBS volume ID (required)
fsType: string # Filesystem type
partition: integer
readOnly: boolean
azureDisk:
diskName: string # Disk name (required)
diskURI: string # Disk URI (required)
cachingMode: None|ReadOnly|ReadWrite
fsType: string
readOnly: boolean
kind: Shared|Dedicated|Managed
gcePersistentDisk:
pdName: string # PD name (required)
fsType: string
partition: integer
readOnly: boolean
status: # Pod status (read-only)
phase: Pending|Running|Succeeded|Failed|Unknown
conditions:
- type: PodScheduled|Ready|Initialized|ContainersReady
status: "True"|"False"|"Unknown"
lastProbeTime: string
lastTransitionTime: string
reason: string
message: string
hostIP: string
podIP: string
podIPs:
- ip: string
startTime: string
containerStatuses:
- name: string
state:
waiting:
reason: string
message: string
running:
startedAt: string
terminated:
exitCode: integer
signal: integer
reason: string
message: string
startedAt: string
finishedAt: string
containerID: string
lastState: {} # Same structure as state
ready: boolean
restartCount: integer
image: string
imageID: string
containerID: string
started: boolean
initContainerStatuses: [] # Same structure as containerStatuses
qosClass: Guaranteed|Burstable|BestEffort
message: string
reason: string
nominatedNodeName: string
ephemeralContainerStatuses: [] # Same structure as containerStatuses
Deployment API Schema
apiVersion: apps/v1 # API version (required)
kind: Deployment # Resource type (required)
metadata: # Standard object metadata
name: string # Deployment name (required)
namespace: string # Namespace
labels: # Labels for the deployment
app: string
version: string
annotations: # Annotations for the deployment
deployment.kubernetes.io/revision: string
spec: # Deployment specification
replicas: integer # Desired number of replicas (default: 1)
selector: # Label selector for pods (required)
matchLabels:
key1: value1
matchExpressions:
- key: string
operator: In|NotIn|Exists|DoesNotExist
values: []string
template: # Pod template (required)
metadata: # Pod metadata
labels: # Pod labels (must match selector)
key1: value1
annotations:
key1: value1
spec: # Pod specification
# ... Complete Pod spec as defined above
strategy: # Deployment strategy
type: Recreate|RollingUpdate # Strategy type (default: RollingUpdate)
rollingUpdate: # Rolling update config (only if type=RollingUpdate)
maxUnavailable: integer|string # Max pods unavailable during update
maxSurge: integer|string # Max pods above desired during update
revisionHistoryLimit: integer # Number of old ReplicaSets to retain (default: 10)
progressDeadlineSeconds: integer # Max time for deployment to make progress
paused: boolean # Whether deployment is paused
status: # Deployment status (read-only)
observedGeneration: integer # Most recent generation observed
replicas: integer # Total number of non-terminated pods
updatedReplicas: integer # Number of updated pods
readyReplicas: integer # Number of ready pods
availableReplicas: integer # Number of available pods
unavailableReplicas: integer # Number of unavailable pods
conditions:
- type: Progressing|Available|ReplicaFailure
status: "True"|"False"|"Unknown"
lastUpdateTime: string
lastTransitionTime: string
reason: string
message: string
collisionCount: integer # Count of hash collisions
Service API Schema
apiVersion: v1 # API version (required)
kind: Service # Resource type (required)
metadata: # Standard object metadata
name: string # Service name (required)
namespace: string # Namespace
labels:
app: string
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: nlb
service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
spec: # Service specification
selector: # Pod selector (required for most service types)
app: string
ports: # Service ports (required)
- name: string # Port name
port: integer # Service port (required)
targetPort: integer|string # Pod port (default: same as port)
protocol: TCP|UDP|SCTP # Protocol (default: TCP)
nodePort: integer # Node port (only for NodePort/LoadBalancer)
type: ClusterIP|NodePort|LoadBalancer|ExternalName # Service type (default: ClusterIP)
# ClusterIP specific
clusterIP: string # Service cluster IP (can be "None" for headless)
clusterIPs: []string # List of cluster IPs (for dual-stack)
# ExternalName specific
externalName: string # External DNS name (for ExternalName type)
# LoadBalancer specific
loadBalancerIP: string # Desired load balancer IP (deprecated since v1.24)
loadBalancerSourceRanges: []string # Allowed source IP ranges
loadBalancerClass: string # Load balancer class
# External traffic policy
externalTrafficPolicy: Cluster|Local # How to route external traffic
# Session affinity
sessionAffinity: ClientIP|None # Session affinity type (default: None)
sessionAffinityConfig:
clientIP:
timeoutSeconds: integer # Session timeout
# Publishing services
publishNotReadyAddresses: boolean # Include not-ready endpoints
# IP families
ipFamilyPolicy: SingleStack|PreferDualStack|RequireDualStack
ipFamilies: []string # IP families (IPv4, IPv6)
# Health check node port
healthCheckNodePort: integer # Node port for health checks
# Internal traffic policy
internalTrafficPolicy: Cluster|Local # How to route internal traffic
# Topology aware hints
topologyKeys: []string # Removed from the API in v1.22; topology-aware routing is now configured through a Service annotation
status: # Service status (read-only)
loadBalancer: # Load balancer status
ingress:
- ip: string
hostname: string
ports:
- port: integer
protocol: string
error: string
conditions:
- type: LoadBalancerPortsError
status: "True"|"False"|"Unknown"
lastTransitionTime: string
reason: string
message: string
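To make the schema above more concrete, here is a minimal sketch of two Services that use a handful of its fields: a headless Service and a NodePort Service with client-IP session affinity. The names (`db-headless`, `web-nodeport`), labels, and port numbers are illustrative assumptions and are not part of this guide's example stack.

```yaml
# Headless Service: no virtual IP is allocated; DNS returns the Pod IPs directly.
apiVersion: v1
kind: Service
metadata:
  name: db-headless          # hypothetical name
spec:
  clusterIP: None            # "None" makes the Service headless
  selector:
    app: db                  # hypothetical label
  ports:
    - name: postgres
      port: 5432
      targetPort: 5432
---
# NodePort Service with sticky sessions keyed on the client IP.
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport         # hypothetical name
spec:
  type: NodePort
  selector:
    app: web                 # hypothetical label
  ports:
    - name: http
      port: 80               # port exposed on the cluster IP
      targetPort: 8080       # port the Pods listen on
      nodePort: 30080        # must fall in the node port range (30000-32767 by default)
  externalTrafficPolicy: Local   # keep the client source IP; only nodes with a local Pod answer
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800      # 3 hours, which is also the API default
```

Headless Services are the usual companion to StatefulSets, while `externalTrafficPolicy: Local` trades even load spreading for source-IP preservation.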
ConfigMap API Schema
apiVersion: v1 # API version (required)
kind: ConfigMap # Resource type (required)
metadata: # Standard object metadata
  name: string # ConfigMap name (required)
  namespace: string # Namespace
  labels:
    app: string
    version: string
  annotations:
    reloader.stakater.com/match: "true" # Third-party annotations
data: # Configuration data as key-value pairs
  key1: | # Multi-line string value
    # This is a configuration file
    server:
      host: localhost
      port: 8080
    database:
      host: postgres
      port: 5432
      name: myapp
  key2: "simple string value" # Simple string value
  key3: |
    {
      "json": {
        "key": "value",
        "array": [1, 2, 3],
        "nested": {
          "property": true
        }
      }
    }
binaryData: # Binary data (base64 encoded)
  binary-key: <base64-encoded-data>
immutable: boolean # Whether the ConfigMap is immutable (default: false)
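The schema shows what a ConfigMap can hold; the sketch below shows one way a Pod might consume it, once as an environment variable and once as a mounted file. Everything named here (`demo-config`, `demo-config-reader`, the `busybox:1.36` image, the `/etc/demo` mount path) is a hypothetical choice made for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-config            # hypothetical name
immutable: true                # data can no longer be edited; delete and recreate to change it
data:
  LOG_LEVEL: "info"
  app.yaml: |
    server:
      port: 8080
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-config-reader     # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: reader
      image: busybox:1.36
      # Print the env var, then the mounted file, and exit.
      command: ["sh", "-c", "echo $LOG_LEVEL && cat /etc/demo/app.yaml"]
      env:
        - name: LOG_LEVEL
          valueFrom:
            configMapKeyRef:
              name: demo-config
              key: LOG_LEVEL
      volumeMounts:
        - name: config
          mountPath: /etc/demo
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: demo-config
        items:                 # project only the file-like key into the volume
          - key: app.yaml
            path: app.yaml
```

Environment variables are read once at container start, whereas mounted files of a mutable ConfigMap are refreshed by the kubelet over time, so the choice between the two depends on whether the application can reload its configuration.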
Secret API Schema
apiVersion: v1 # API version (required)
kind: Secret # Resource type (required)
metadata: # Standard object metadata
  name: string # Secret name (required)
  namespace: string # Namespace
  labels:
    app: string
  annotations:
    reloader.stakater.com/match: "true"
type: Opaque # Secret type
# Possible types:
# - Opaque: arbitrary user-defined data
# - kubernetes.io/service-account-token: service account token
# - kubernetes.io/dockercfg: serialized ~/.dockercfg file
# - kubernetes.io/dockerconfigjson: serialized ~/.docker/config.json file
# - kubernetes.io/basic-auth: basic authentication
# - kubernetes.io/ssh-auth: SSH authentication
# - kubernetes.io/tls: TLS certificate data
data: # Secret data (base64 encoded)
  username: dXNlcg== # base64 encoded "user"
  password: cGFzcw== # base64 encoded "pass"
stringData: # Secret data (plain text, will be base64 encoded)
  database-url: "postgres://user:pass@localhost/db"
  api-key: "abcd1234-5678-90ef-ghij1234"
immutable: boolean # Whether the Secret is immutable (default: false)
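The built-in Secret types listed above differ mainly in which keys they expect. As a minimal sketch, the two manifests below show the conventional keys for `kubernetes.io/basic-auth` and `kubernetes.io/tls`. The Secret names and credential values are made up, and the angle-bracket values are placeholders to be replaced with real base64 data before applying.

```yaml
# Basic-auth Secret: uses the "username" and "password" keys.
apiVersion: v1
kind: Secret
metadata:
  name: demo-basic-auth        # hypothetical name
type: kubernetes.io/basic-auth
stringData:                    # plain text here; stored base64-encoded under .data
  username: demo-user          # illustrative value
  password: change-me          # illustrative value
---
# TLS Secret: requires the "tls.crt" and "tls.key" keys.
apiVersion: v1
kind: Secret
metadata:
  name: demo-tls               # hypothetical name
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded-certificate>    # placeholder, not valid as written
  tls.key: <base64-encoded-private-key>    # placeholder, not valid as written
```

A Secret of type `kubernetes.io/tls` is what an Ingress `tls[].secretName` entry points at.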
PersistentVolumeClaim API Schema
apiVersion: v1 # API version (required)
kind: PersistentVolumeClaim # Resource type (required)
metadata: # Standard object metadata
  name: string # PVC name (required)
  namespace: string # Namespace
  labels:
    app: string
  annotations:
    volume.beta.kubernetes.io/storage-class: string # Storage class (deprecated; use spec.storageClassName)
spec: # PVC specification
  accessModes: # Access modes (required)
    - ReadWriteOnce # RWO: single node read-write
    - ReadOnlyMany # ROX: many nodes read-only
    - ReadWriteMany # RWX: many nodes read-write
    - ReadWriteOncePod # RWOP: single pod read-write (1.22+)
  resources: # Resource requirements (required)
    requests:
      storage: string # Storage size request (e.g., "10Gi")
    limits:
      storage: string # Storage size limit
  selector: # PV selector
    matchLabels:
      key1: value1
    matchExpressions:
      - key: string
        operator: In|NotIn|Exists|DoesNotExist
        values: []string
  storageClassName: string # Storage class name
  volumeMode: Filesystem|Block # Volume mode (default: Filesystem)
  volumeName: string # Specific PV to bind to
  dataSource: # Data source for PVC
    apiGroup: string
    kind: string # VolumeSnapshot, PVC, etc.
    name: string
  dataSourceRef: # Data source reference (1.24+)
    apiGroup: string
    kind: string
    name: string
    namespace: string # Cross-namespace source (behind an alpha feature gate)
status: # PVC status (read-only)
  phase: Pending|Bound|Lost # PVC phase
  accessModes: []string # Actual access modes
  capacity:
    storage: string # Actual storage capacity
  conditions:
    - type: Resizing|FileSystemResizePending
      status: "True"|"False"|"Unknown"
      lastProbeTime: string
      lastTransitionTime: string
      reason: string
      message: string
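As an illustration of the `dataSource` field, the sketch below requests a new volume pre-populated from a VolumeSnapshot. It assumes a CSI driver with snapshot support is installed; the claim name, the storage class `csi-ssd`, and the snapshot `data-snapshot-1` are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-data            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: csi-ssd      # hypothetical class backed by a snapshot-capable CSI driver
  resources:
    requests:
      storage: 20Gi              # should be at least the size of the snapshot's source volume
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: data-snapshot-1        # hypothetical snapshot in the same namespace
```

Using `kind: PersistentVolumeClaim` with an empty `apiGroup` in the same position clones an existing claim instead of restoring a snapshot.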
Ingress API Schema
apiVersion: networking.k8s.io/v1 # API version (required)
kind: Ingress # Resource type (required)
metadata: # Standard object metadata
  name: string # Ingress name (required)
  namespace: string # Namespace
  labels:
    app: string
  annotations: # Ingress controller specific annotations
    kubernetes.io/ingress.class: nginx # Deprecated since v1.18: prefer spec.ingressClassName
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/limit-rps: "100" # ingress-nginx rate limit (requests per second per client IP)
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec: # Ingress specification
  ingressClassName: string # Ingress class name
  defaultBackend: # Default backend (set either service or resource, not both)
    service:
      name: string # Service name
      port: # Set either number or name, not both
        number: integer # Service port number
        name: string # Service port name
    resource: # Custom resource backend
      apiGroup: string
      kind: string
      name: string
  tls: # TLS configuration
    - hosts: # List of hosts
        - example.com
        - www.example.com
      secretName: string # TLS secret name
  rules: # Ingress rules
    - host: string # Hostname (optional)
      http: # HTTP rule
        paths: # List of paths
          - path: string # Path (e.g., "/api")
            pathType: Exact|Prefix|ImplementationSpecific # Path matching type (required)
            backend: # Backend (set either service or resource, not both)
              service:
                name: string # Service name
                port:
                  number: integer # Service port number
                  name: string # Service port name
              resource: # Custom resource backend
                apiGroup: string
                kind: string
                name: string
status: # Ingress status (read-only)
  loadBalancer: # Load balancer status
    ingress:
      - ip: string
        hostname: string
        ports:
          - port: integer
            protocol: string
            error: string
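The difference between the `pathType` values is easiest to see side by side. The following is a small sketch, assuming an ingress controller registered under the class name `nginx`; the host and the three Service names are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-paths               # hypothetical name
spec:
  ingressClassName: nginx        # assumes an IngressClass called "nginx" exists
  defaultBackend:                # receives requests that match no rule
    service:
      name: fallback-svc         # hypothetical Service
      port:
        number: 80
  rules:
    - host: demo.example.com
      http:
        paths:
          # Exact: matches /healthz only, not /healthz/ or /healthz/live
          - path: /healthz
            pathType: Exact
            backend:
              service:
                name: health-svc # hypothetical Service
                port:
                  name: http     # refers to a named port on the Service
          # Prefix: matches /api, /api/ and /api/v1, but not /apiv2
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-svc    # hypothetical Service
                port:
                  number: 8080
```

Prefix matching works on whole path segments split by `/`, which is why `/apiv2` does not match the `/api` rule.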
Practical Examples with Full Context
Complete Web Application Stack
This example shows a complete web application deployment with all necessary resources:
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: webapp
  labels:
    name: webapp
    environment: production
---
# ConfigMap for application configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: webapp-config
  namespace: webapp
data:
  database_host: "postgres-service"
  database_port: "5432"
  database_name: "webapp"
  redis_host: "redis-service"
  redis_port: "6379"
  log_level: "info"
  app_config.yaml: |
    server:
      host: 0.0.0.0
      port: 8080
      read_timeout: 30s
      write_timeout: 30s
    features:
      enable_metrics: true
      enable_tracing: true
      enable_caching: true
    limits:
      max_connections: 1000
      max_requests_per_minute: 10000
---
# Secret for sensitive data
apiVersion: v1
kind: Secret
metadata:
  name: webapp-secrets
  namespace: webapp
type: Opaque
stringData:
  database_password: "super-secret-db-password"
  redis_password: "redis-secret-password"
  jwt_secret: "jwt-signing-secret-key"
  api_key: "external-api-key-123456"
---
# PersistentVolumeClaim for database storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: webapp
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: ssd-retain
---
# PostgreSQL StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: webapp
spec:
  serviceName: postgres-service
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:13
          env:
            - name: POSTGRES_DB
              valueFrom:
                configMapKeyRef:
                  name: webapp-config
                  key: database_name
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: webapp-secrets
                  key: database_password
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 1Gi
          livenessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - postgres
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - postgres
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-pvc
---
# PostgreSQL Service
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
  namespace: webapp
spec:
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432
      protocol: TCP
  type: ClusterIP
---
# Redis Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: webapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:6-alpine
          command:
            - redis-server
            - --requirepass
            - $(REDIS_PASSWORD)
          env:
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: webapp-secrets
                  key: redis_password
          ports:
            - containerPort: 6379
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
            limits:
              cpu: 100m
              memory: 128Mi
          livenessProbe:
            exec:
              # Exec probes do not expand $(VAR) for env vars set via valueFrom,
              # so run the check through a shell that reads the variable itself.
              command:
                - sh
                - -c
                - 'redis-cli --no-auth-warning -a "$REDIS_PASSWORD" ping | grep -q PONG'
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - sh
                - -c
                - 'redis-cli --no-auth-warning -a "$REDIS_PASSWORD" ping | grep -q PONG'
            initialDelaySeconds: 5
            periodSeconds: 5
---
# Redis Service
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  namespace: webapp
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379
      protocol: TCP
  type: ClusterIP
---
# Web Application Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: webapp
  labels:
    app: webapp
    version: v1.0.0
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
        version: v1.0.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      serviceAccountName: webapp-sa
      containers:
        - name: webapp
          image: myregistry.com/webapp:v1.0.0
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          envFrom:
            - configMapRef:
                name: webapp-config
          env:
            - name: DATABASE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: webapp-secrets
                  key: database_password
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: webapp-secrets
                  key: redis_password
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: webapp-secrets
                  key: jwt_secret
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: config-volume
              mountPath: /app/config
              readOnly: true
            - name: tmp-volume
              mountPath: /tmp
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            successThreshold: 1
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /startup
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 30
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            runAsUser: 1000
            runAsGroup: 1000
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: true
      volumes:
        - name: config-volume
          configMap:
            name: webapp-config
            items:
              - key: app_config.yaml
                path: app.yaml
        - name: tmp-volume
          emptyDir: {}
      securityContext:
        fsGroup: 1000
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - webapp
                topologyKey: kubernetes.io/hostname
---
# ServiceAccount for webapp
apiVersion: v1
kind: ServiceAccount
metadata:
  name: webapp-sa
  namespace: webapp
---
# Service for webapp
apiVersion: v1
kind: Service
metadata:
  name: webapp-service
  namespace: webapp
  labels:
    app: webapp
spec:
  selector:
    app: webapp
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
      name: http
  type: ClusterIP
---
# HorizontalPodAutoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa
  namespace: webapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
---
# Ingress for external access
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: webapp-ingress
  namespace: webapp
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    # ingress-nginx rate limiting: at most 100 requests per minute per client IP
    nginx.ingress.kubernetes.io/limit-rpm: "100"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  # Replaces the deprecated kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  tls:
    - hosts:
        - webapp.example.com
      secretName: webapp-tls
  rules:
    - host: webapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: webapp-service
                port:
                  number: 80
---
# NetworkPolicy to secure communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: webapp-netpol
  namespace: webapp
spec:
  podSelector:
    matchLabels:
      app: webapp
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Label set automatically on every namespace; assumes the
              # ingress controller runs in the "ingress-nginx" namespace
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
  egress:
    # Allow DNS (UDP, plus TCP for large responses)
    - to: []
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Allow access to postgres
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    # Allow access to redis
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
This comprehensive example demonstrates:
- Complete application stack with database, cache, and web application
- Security best practices with a dedicated ServiceAccount, security contexts, and a network policy
- Production-oriented configuration with health checks, resource limits, and auto-scaling
- Proper secret management separating sensitive data from configuration
- Storage management with a persistent volume claim for the database
- High availability for the web tier with anti-affinity rules and multiple replicas
- External access through Ingress with TLS termination
- Monitoring integration with Prometheus annotations
This example serves as a template for deploying production-ready applications on Kubernetes with proper security, scalability, and maintainability considerations.
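The stack above creates the `webapp-sa` ServiceAccount without granting it any permissions, and its NetworkPolicy restricts only the webapp Pods. If you wanted to extend the security posture, one possible sketch is shown below: a Role and RoleBinding that let `webapp-sa` read its own ConfigMap through the API, and a second NetworkPolicy that admits only webapp traffic to PostgreSQL. The names `webapp-config-reader` and `postgres-netpol` are hypothetical, and whether the application needs any API access at all is an assumption.

```yaml
# Role: read-only access to a single ConfigMap in the webapp namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: webapp-config-reader     # hypothetical name
  namespace: webapp
rules:
  - apiGroups: [""]              # "" is the core API group
    resources: ["configmaps"]
    resourceNames: ["webapp-config"]
    verbs: ["get"]
---
# RoleBinding: grant the Role to the webapp ServiceAccount.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: webapp-config-reader     # hypothetical name
  namespace: webapp
subjects:
  - kind: ServiceAccount
    name: webapp-sa
    namespace: webapp
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: webapp-config-reader
---
# NetworkPolicy: only webapp Pods may connect to PostgreSQL.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgres-netpol          # hypothetical name
  namespace: webapp
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: webapp
      ports:
        - protocol: TCP
          port: 5432
```

If the application never calls the Kubernetes API, a simpler alternative is to skip the Role entirely and set `automountServiceAccountToken: false` on the ServiceAccount or Pod spec. A matching policy for the redis Pods would follow the same pattern with port 6379.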
Additional Resources
Official Kubernetes Documentation
The comprehensive official documentation for Kubernetes