DEV Community
2026-04-29 03:32
War Story: Debugging a Kafka 3.7 Consumer Lag Issue in K8s 1.32 with KEDA 2.14 and Prometheus 2.50
At 03:17 UTC on a Tuesday, our on-call pager went off: Kafka consumer lag for the payments topic had hit 12,478,291 messages, with p99 processing latency spiking to 8.7 seconds. We were running Kafka 3.7.0, Kubernetes 1.32.0, KEDA 2.14.1, and Prometheus 2.50.1 — a stack we’d battle-tested for 18 months. This wasn’t a slow leak. This was a sudden, catastrophic failure that threatened to delay 40% o...