Lab 4: Service Types + Endpoint Debugging (CKA Extension)
Time: 40 minutes
Objective: Compare Service type behavior and debug a broken NodePort mapping using Endpoints and selectors.
The Story
Your app is up. Pods are healthy. Dashboards look green. But users still get timeouts.
This is where many teams lose incident time: restarting workloads when the real break is Service wiring — selectors, endpoints, and port mapping. In this lab, you will intentionally break that wiring, read the exact symptom Kubernetes gives you, and build a repeatable debug loop you can run under CKA time pressure.
CKA Objectives Mapped
- Understand Service networking models (ClusterIP, NodePort, LoadBalancer)
- Troubleshoot Service connectivity and endpoint discovery
- Validate backend wiring with evidence-driven commands
Background: Kubernetes Service Types
The problem Services solve
Pods are ephemeral. They restart, reschedule, and get new IP addresses constantly. You cannot hardcode a pod IP in your application config — it will break.
A Service is a stable virtual endpoint that sits in front of a set of pods. It provides:
- A fixed ClusterIP that never changes
- A DNS name (
my-service.my-namespace.svc.cluster.local) - Load balancing across all healthy pods that match its selector
Under the hood, kube-proxy (or the CNI, in Cilium's case) programs iptables/ipvs rules so that traffic to the ClusterIP is DNAT'd to a real pod IP.
The four Service types
| Type | Reachable from | Use case |
|---|---|---|
ClusterIP |
Inside the cluster only | Default; service-to-service communication |
NodePort |
Outside via <NodeIP>:<port> (30000-32767) |
Dev/testing; direct node access |
LoadBalancer |
Outside via a cloud-provisioned external IP | Production; cloud clusters |
ExternalName |
Returns a CNAME | Alias an external DNS name as a Service |
Each type is additive — a NodePort Service also gets a ClusterIP. A LoadBalancer Service also gets a NodePort and a ClusterIP.
How the selector chain works
The selector chain is the most common source of Service bugs:
Service (selector: app=foo)
│
▼
Endpoints object (auto-populated with pod IPs that match the selector)
│
▼
Pods (must have label: app=foo AND be Ready)
If Endpoints is empty, the selector doesn't match any running pod. This is the diagnostic step most people miss — kubectl get endpoints <svc> shows you exactly what the Service sees.
Why NodePort has limits
NodePort solves external access without a cloud LB, but has real drawbacks:
- Port range is restricted (30000-32767) — not standard HTTP ports
- Requires knowing a node IP — fragile if nodes come and go
- One port per service — doesn't scale to many services on one cluster
This is why Ingress and Gateway API exist: to multiplex many services behind a single external IP on standard ports.
LoadBalancer in kind
In cloud clusters, type: LoadBalancer triggers the cloud provider to provision an external load balancer and assign an IP. In kind, there's no cloud provider — the EXTERNAL-IP stays <pending> unless you install a local implementation like MetalLB or use the kind-specific port mapping approach from Lab 1.
Further reading:
- Services concepts
- Service types reference
- Debugging Services
- EndpointSlices — the modern replacement for Endpoints objects
Prerequisites
Use your local cluster:
kubectl config use-context kind-lab
kubectl get nodesStarter assets for this lab are in starter/:
app-deployment.yamlservice-clusterip.yamlservice-nodeport-broken.yamlservice-loadbalancer.yamlendpoint-check.sh
Part 1: Deploy the Baseline App
Before you can debug Service behavior, you need a backend that is undeniably healthy. This step creates that clean baseline so later failures can be attributed to Service wiring, not a broken app.
kubectl apply -f week-06/labs/lab-04-service-types-nodeport-loadbalancer/starter/app-deployment.yaml
kubectl rollout status deployment/svc-types-demo --timeout=120s
kubectl get pods -l app=svc-types-demo -o wideNotice: rollout status tells you the Deployment converged. get pods -l app=svc-types-demo verifies the exact label your Services will depend on. If this label is wrong, every networking test later gives confusing results.
Operator mindset: establish a trustworthy baseline first, then debug one layer at a time.
Part 2: ClusterIP Baseline
Now you establish a known-good Service path inside the cluster. Think of this as your control case: if ClusterIP works, you have proof that selector matching and app reachability are fundamentally correct.
kubectl apply -f week-06/labs/lab-04-service-types-nodeport-loadbalancer/starter/service-clusterip.yaml
kubectl get svc svc-demo-clusterip
kubectl get endpoints svc-demo-clusteripNotice: this is your known-good baseline. You want two signals together: a ClusterIP exists, and Endpoints are populated. That pair tells you selector wiring is healthy before you move to NodePort.
Validate in-cluster reachability:
This probe is intentional: you are validating Service DNS and backend forwarding from within Kubernetes first, before adding host or NodePort variables.
kubectl run svc-probe --image=busybox:1.36 --restart=Never --rm -it -- wget -qO- -T 5 http://svc-demo-clusteripNote: This tests pure in-cluster routing (DNS name -> Service -> backend pod). If this works, your app and selector chain are sound before adding external-access complexity.
Operator mindset: always create a known-good control path before testing more complex exposure types.
Part 3: Broken NodePort (Intentional Failure)
Apply the intentionally broken NodePort manifest:
You are introducing a controlled failure on purpose. This is the fastest way to learn the diagnostic signal for bad Service wiring while the blast radius is small and predictable.
kubectl apply -f week-06/labs/lab-04-service-types-nodeport-loadbalancer/starter/service-nodeport-broken.yaml
kubectl get svc svc-demo-nodeport -o wideRun diagnostics:
The goal is not just to "run debug commands". The goal is to separate symptoms into classes: discovery problems vs forwarding problems, so your fix is precise instead of guesswork.
kubectl describe svc svc-demo-nodeport
kubectl get endpoints svc-demo-nodeport -o yaml
bash week-06/labs/lab-04-service-types-nodeport-loadbalancer/starter/endpoint-check.sh svc-demo-nodeportNotice: use these three together. describe svc shows intended wiring (selector and ports). get endpoints shows discovered backends. The helper script gives a fast pass/fail check. This is a strong CKA troubleshooting rhythm.
Expected symptom:
- Service exists, NodePort allocated
- Endpoints missing or traffic fails because
targetPortis wrong
The failure mode matters. In a selector mismatch, Endpoints are empty because the Service cannot discover matching Ready pods. In a targetPort mismatch, Endpoints are populated but traffic still fails because packets are forwarded to the wrong container port.
That distinction is your diagnostic shortcut: empty Endpoints means discovery is broken; populated Endpoints with failed traffic means forwarding is broken. On the CKA, this is exactly what you are expected to demonstrate: check Endpoints first, identify the failure class, then apply the right fix quickly.
Operator mindset: classify the failure before changing anything.
Part 4: Fix the NodePort Mapping
Patch the service to point at the app port (5678):
Now you repair only the broken layer (port mapping) and leave everything else untouched. This is an important production habit: minimal change, maximum verification.
kubectl patch svc svc-demo-nodeport --type merge -p '{"spec":{"ports":[{"port":80,"targetPort":5678,"nodePort":30080}]}}'
kubectl get svc svc-demo-nodeport -o yaml | sed -n '1,120p'
kubectl get endpoints svc-demo-nodeportNotice: after patching, verify both config and behavior signals: targetPort: 5678 is present, and Endpoints stay populated. Config-only checks are not enough; you need evidence traffic can route.
Retest from inside cluster:
This retest confirms the fix solved the real user path, not just the YAML diff. In incident response terms, this is your "service restored" evidence.
kubectl run nodeport-probe --image=busybox:1.36 --restart=Never --rm -it -- wget -qO- -T 5 http://svc-demo-nodeportNote: This is your proof step. You are not just trusting YAML output; you are proving the fixed Service actually routes traffic.
Optional host-level test (kind node mapped to localhost):
Run this if you want to isolate one more layer: internal Service routing can be healthy even if host exposure is not. This helps you avoid misdiagnosing environment issues as Kubernetes Service issues.
curl -s http://127.0.0.1:30080 | headNotice: this checks host-to-NodePort reachability in kind. If in-cluster probe works but this fails, your Service wiring is likely fine and the issue is environment exposure.
Operator mindset: make the smallest possible fix, then prove user-path recovery.
Part 4.5: Prove It (Break It a Second Way)
Now create a different failure on purpose: break discovery instead of port forwarding.
You already fixed one failure class. Now you deliberately trigger the other class so your mental model is robust under exam pressure.
kubectl patch svc svc-demo-nodeport --type merge -p '{"spec":{"selector":{"app":"svc-types-demo-typo"}}}'
kubectl get endpoints svc-demo-nodeportNotice: Endpoints should now be empty because no pod has app=svc-types-demo-typo. This is the selector-mismatch signature and should look clearly different from the earlier targetPort failure.
Restore the correct selector and confirm recovery:
This is the full troubleshooting cycle in miniature: induce failure, identify signal, apply targeted fix, then prove recovery with a live request.
kubectl patch svc svc-demo-nodeport --type merge -p '{"spec":{"selector":{"app":"svc-types-demo"}}}'
kubectl get endpoints svc-demo-nodeport
kubectl run selector-probe --image=busybox:1.36 --restart=Never --rm -it -- wget -qO- -T 5 http://svc-demo-nodeportNote: You just validated both failure classes. Empty Endpoints = discovery failure. Populated Endpoints with broken traffic = port-forwarding failure. This is high-value troubleshooting muscle for both exams and production.
Operator mindset: practice controlled break/fix loops until the symptom-to-cause mapping is automatic.
Part 5: LoadBalancer Service Behavior
Apply the LoadBalancer variant:
This final step teaches a key real-world distinction: Service internals can be correct while external exposure is still pending due to infrastructure. You are learning to separate Kubernetes wiring from platform provisioning.
kubectl apply -f week-06/labs/lab-04-service-types-nodeport-loadbalancer/starter/service-loadbalancer.yaml
kubectl get svc svc-demo-loadbalancer -o wideNotice: focus on TYPE, PORT(S), and EXTERNAL-IP. In kind, <pending> is expected for EXTERNAL-IP and does not mean the app is unhealthy.
In plain kind, EXTERNAL-IP is often <pending> unless you add a local LB implementation.
What to check anyway:
You are still verifying the same core chain (selector -> endpoints -> pods). The type changed, but your troubleshooting method stays consistent.
kubectl describe svc svc-demo-loadbalancer
kubectl get endpoints svc-demo-loadbalancerNote: Even without external provisioning, selector and endpoint wiring behave the same. Healthy Endpoints here mean Service internals are correct; the remaining gap is infrastructure integration.
Operator mindset: separate Kubernetes object health from infrastructure provisioning state.
Takeaway:
LoadBalancerstill depends on the same selector and endpoint wiring- The external IP allocation is infrastructure-dependent
Verification Checklist
You are done when:
- ClusterIP service routes traffic successfully
- You identify and fix the NodePort
targetPortmismatch - NodePort becomes reachable after fix
- You can explain why LoadBalancer stays pending in kind
Cleanup
kubectl delete svc svc-demo-clusterip svc-demo-nodeport svc-demo-loadbalancer --ignore-not-found
kubectl delete deployment svc-types-demo --ignore-not-foundReinforcement Scenarios
jerry-nodeport-mystery

