Kubernetes API Errors Caused by Control Plane Exposure
When Your Kubernetes API Starts Failing: A Lesson in Control Plane Exposure
Recently, I ran into a cluster issue that, at first glance, looked like a fairly typical internal problem: intermittent Kubernetes API errors. Timeouts, failed requests, and general instability.
The kind of issue that usually sends you digging into:
- resource pressure
- misbehaving workloads
- logging or monitoring pipelines
It turned out to be none of those.
The Symptoms
The cluster began showing signs of API instability:
- Intermittent request failures
- Increased latency from the Kubernetes API
- Errors appearing across multiple components
At first, the suspicion fell on internal services. But nothing obvious stood out:
- Resource usage was within expected bounds
- No clear spike in workload activity
- No obvious misconfiguration
The Turning Point
After raising the issue with the provider, they reported:
Unauthorised activity detected against the Kubernetes control plane.
That immediately reframed the problem.
This wasn’t an internal failure. It was external traffic putting pressure on the API server.
The Root Cause
The Kubernetes API endpoint accepted connections from far more networks than actually needed to reach it.
That meant:
- External actors could reach the control plane
- Requests (malicious or not) were hitting the API server
- The control plane was under unnecessary load
Even when every one of those requests fails authentication, the traffic can:
- Increase latency
- Trigger rate limiting
- Cause intermittent failures for legitimate traffic
In short: your cluster can degrade even if no one actually gets access.
The Fix
The solution was straightforward:
Restrict access to the Kubernetes API using IP-based ACLs.
Only trusted sources were allowed:
- Admin networks
- VPN endpoints
- Known automation systems
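To make the idea concrete, here is a minimal Python sketch of how an IP allow list decides who gets through. The real enforcement happens in your provider's firewall or control plane ACL settings, not in Python, and the entry names and CIDR ranges below are hypothetical placeholders (reserved documentation ranges), not the ones from this incident.

```python
import ipaddress

# Hypothetical trusted ranges, using reserved documentation addresses.
# Replace these with your own admin, VPN, and automation sources.
ALLOWED_CIDRS = {
    "admin-network": "192.0.2.0/24",
    "vpn-endpoint": "198.51.100.10/32",
    "ci-automation": "203.0.113.64/26",
}

# Parse each CIDR once so lookups only do containment checks.
ALLOWED_NETWORKS = {
    name: ipaddress.ip_network(cidr) for name, cidr in ALLOWED_CIDRS.items()
}


def match_allowed_source(source_ip):
    """Return the name of the allow-list entry containing source_ip, or None."""
    address = ipaddress.ip_address(source_ip)
    for name, network in ALLOWED_NETWORKS.items():
        if address in network:
            return name
    return None


def is_allowed(source_ip):
    """Return True if source_ip falls inside any trusted range."""
    return match_allowed_source(source_ip) is not None
```

Anything that doesn't match an entry never reaches the API server at all, which is the whole point: the rejection happens before the control plane spends any work on the request.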
As soon as this was implemented:
- API errors stopped
- Latency returned to normal
- Cluster stability was restored
Why This Matters
Kubernetes makes it easy to expose the API server, especially in managed environments.
But “accessible” doesn’t mean “safe”.
If your control plane is reachable from the internet, you are:
- Increasing your attack surface
- Allowing unnecessary traffic to hit critical components
- Relying on authentication as your only line of defence
Key Takeaways
Lock Down the Control Plane
The Kubernetes API should not be broadly accessible unless absolutely required.
Use:
- IP allow lists (ACLs)
- Private endpoints
- VPN or bastion access
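One quick way to check whether a lockdown is actually in effect is to try a plain TCP connection to the API endpoint from a machine that should not have access. The sketch below assumes port 6443, the default secure port for a self-managed API server. Many managed providers use 443 instead, so treat the port as an assumption to adjust.

```python
import socket


def api_port_reachable(host, port=6443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only tests network reachability. It sends no Kubernetes request
    and says nothing about whether authentication would succeed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run from an untrusted network, something like `api_port_reachable("k8s-api.example.com")` (a hypothetical hostname) should return `False` once the restrictions are in place. A `False` can also mean the hostname didn't resolve, so confirm the address first.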
Don’t Assume Internal Causes
API instability doesn’t always originate inside the cluster.
Always consider:
- External traffic
- Probing or scanning
- Exposure misconfiguration
Authentication Isn’t Enough
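Even failed requests consume resources: the API server still has to accept the connection, complete the TLS handshake, and evaluate the credentials before it can reject anything.
Blocking traffic at the network layer stops it before it reaches the API server, which makes it more effective at shedding this load than relying solely on: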
- RBAC
- Tokens
- Authentication layers
Monitor Control Plane Access
Where possible, enable:
- API audit logs
- Request rate monitoring
- Connection metrics
These provide early warning of unusual behaviour.
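As a rough illustration of what that early warning can look like, the sketch below counts rejected requests per source IP from audit events stored as one JSON object per line. It assumes the `sourceIPs` and `responseStatus.code` fields from the `audit.k8s.io/v1` event format, and it treats the first entry in `sourceIPs` as the client, which may not hold behind some proxies. How you obtain the log lines depends on your provider, so that part is left out.

```python
import json
from collections import Counter


def count_rejected_by_source(audit_lines, rejected_codes=(401, 403)):
    """Count rejected requests per source IP from JSON-lines audit events."""
    counts = Counter()
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        status = event.get("responseStatus") or {}
        if status.get("code") not in rejected_codes:
            continue
        source_ips = event.get("sourceIPs") or []
        if source_ips:
            # Assumption: the first entry is the originating client.
            counts[source_ips[0]] += 1
    return counts
```

Calling `count_rejected_by_source(lines).most_common(10)` gives a quick view of which addresses are generating the most 401 and 403 responses. A handful of unfamiliar sources dominating that list is a strong hint that the endpoint is being probed.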
A Simple Mental Model
Think of the Kubernetes API like SSH on a server.
You wouldn’t leave SSH open to the internet without restrictions, even with strong authentication.
The same principle applies here.
Final Thoughts
This issue was a useful reminder:
Not all incidents originate from within your cluster.
Sometimes, the problem is simply that your control plane is too easy to reach.
Restricting access is low effort, high impact, and immediately effective.
If you haven’t reviewed your Kubernetes API exposure recently, now is a good time.