K8s - Where is my source IP?

Congratulations! You've successfully created your first Kubernetes cluster!
You've got an application running on your node(s) and your clients are accessing the page and everyone is happy.
You go to have a celebratory drink, but before you crack open the champagne, you decide to check the logs on your server to see how many people have accessed your page. You pull the logs, look at the IP addresses, and something seems off: every request looks like it's coming from the server itself!

What went wrong?
If you're setting up a K8s cluster, you probably have the following:
- Some sort of load balancer - MetalLB
- Some sort of ingress - Traefik
- An application running - WhoAmI
When we get a request from a client, without making any changes, this is the flow:
Client -> MetalLB -> Traefik -> WhoAmI
Now when you start to introduce load balancers to the equation, we run into an issue: how do we spread the traffic?
We normally have two options, assuming you are using a Layer 2 load balancer and not BGP load balancing. They correspond to the two values of the service's externalTrafficPolicy:
- Cluster - The default option. Traffic arriving at any node can be forwarded to a pod on any other node, so a request is still served even if the receiving node has no matching pods.
- Local - The alternative. Traffic is only delivered to pods on the node that received it; the load balancer still spreads traffic across the nodes regardless of how many matching pods each one is running (even if it has none, in which case those requests are dropped).
Cluster is the default because it is "safer and faster", by which I really mean it's just harder to screw up: you can reasonably assume that any request will somehow get to the right place. The downside is that when a request is forwarded between nodes it gets source-NATed, and the original source IP address is lost.
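You can check which policy your Traefik service is currently using with kubectl. This is just a quick sanity check; the service name and namespace below ("traefik" in the "traefik" namespace) are assumptions, so adjust them to match your install:
# Print the current externalTrafficPolicy of the Traefik service
kubectl get svc traefik -n traefik -o jsonpath='{.spec.externalTrafficPolicy}'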
How do we approach this?
If you don't have a load balancer (you're running a bare-metal K8s cluster with a single node), you can use Traefik as both the ingress and the load balancer. That said, this is effectively no different from just exposing the application on the host ports, but it allows us to fix the problem without creating more complexity.
You may have applications running already, but WhoAmI is a great application for debugging, as it lets us see the headers of the request.
You can run this inside your cluster with the following, assuming you have Traefik running already:
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: whoami
  labels:
    app: whoami
spec:
  replicas: 1
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami
          ports:
            - name: web
              containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
spec:
  ports:
    - name: web
      port: 80
      targetPort: web
  selector:
    app: whoami
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: whoami
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`whoami.example.com`)
      kind: Rule
      services:
        - name: whoami
          port: 80
YAML file to deploy WhoAmI
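Assuming you save the manifests above to a file called whoami.yaml (the name is arbitrary), applying them and checking on the pod looks something like this:
kubectl apply -f whoami.yaml
kubectl get pods -l app=whoami   # wait until the pod shows Running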
Once you have it running, you can go to your website whoami.example.com and it will show you something like the following:
Hostname: whoami-***-***
IP: 127.0.0.1
IP: 10.23.0.221
RemoteAddr: 10.23.0.100:53638
GET / HTTP/1.1
Host: YOUR HOST NAME
...
Upgrade-Insecure-Requests: 1
X-Forwarded-For: YOUR NODE IP
X-Forwarded-Host: YOUR HOST NAME
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: traefik-***-***
X-Real-Ip: YOUR NODE IP
Who Am I - Full Output
The part we really care about is the following two lines:
X-Forwarded-For: YOUR NODE IP
X-Real-Ip: YOUR NODE IP
Who Am I - Important Output
These headers are where the original client IP should appear by the time the request reaches the pod. Right now they only show the node's IP, which is exactly the problem we need to fix.
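If you'd rather test from the command line, a plain curl against your hostname produces the same output; if DNS for the host isn't set up yet, you can point curl at your load balancer directly (the IP below is just a placeholder):
curl https://whoami.example.com/
# or, without DNS, resolve the host manually to the load balancer's IP:
curl --resolve whoami.example.com:443:192.168.1.240 https://whoami.example.com/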
Wrong approaches to this problem
I've seen a lot of solutions on the web for this problem; some fail to fully explain why you need to set the values they suggest, and some open you up to additional risks.
You'll see many answers on the internet that suggest setting hostNetwork to true in the values.yaml, but that answer is wrong. Although it may solve your problem, it is not secure and will open you up to other issues.
hostNetwork: true # BAD!
Please do not do this!
Some people simply say to set the forwarded headers on the ingress to accept everything, which is closer, but is the equivalent of running chmod 777 to solve your problem. This is bad because you will effectively trust any forwarded header passed to the ingress, regardless of whether the load balancer set it!
forwardedHeaders:
  insecure: true # BAD!
Please do not do this!
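To see why this is dangerous: with insecure set to true, Traefik trusts forwarded headers from anyone, so a client can simply claim to be a different IP:
# The spoofed value below ends up in the X-Forwarded-For list your application
# sees, and many applications will log 1.2.3.4 as the "client" IP.
curl -H "X-Forwarded-For: 1.2.3.4" https://whoami.example.com/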
Solution
You should set the following in the Traefik values.yaml file, with trustedIPs set to the IP address of your load balancer.
forwardedHeaders:
  trustedIPs:
    - Your load balancer's IP address
  insecure: false
Traefik - values.yaml
The insecure value should remain false, as we do not want to allow every IP to set these headers, only the load balancer.
If you have both a web and a websecure entrypoint, you will need to do this for both entries, as you will want to redirect and pass the headers along in each case; see the snippet below.
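For reference, in the Traefik Helm chart these settings live under each entry in ports, so a values.yaml snippet covering both entrypoints might look roughly like this (the IP is just a placeholder for your load balancer's address):
ports:
  web:
    forwardedHeaders:
      trustedIPs:
        - 192.168.1.240/32 # your load balancer's IP
      insecure: false
  websecure:
    forwardedHeaders:
      trustedIPs:
        - 192.168.1.240/32
      insecure: false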
You will also want to set the following in the Traefik values.yaml file. This switches the Traefik service to the Local traffic policy, which tells kube-proxy to only deliver traffic to pods on the node that received it, rather than forwarding it (and source-NATing it) to another node. This is what fixes the problem of the client IP being lost.
spec:
  externalTrafficPolicy: Local
Traefik - values.yaml
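If you prefer not to redeploy the chart just for this, the same change can be applied directly to the existing Traefik service; the service name and namespace here are assumptions, so adjust them to your setup:
kubectl patch svc traefik -n traefik \
  -p '{"spec":{"externalTrafficPolicy":"Local"}}'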
We can then reload the WhoAmI service and see if it worked! This was my result:
Hostname: whoami-***-***
IP: 127.0.0.1
IP: 10.23.0.221
RemoteAddr: 10.23.0.100:53638
GET / HTTP/1.1
Host: YOUR HOST NAME
...
Upgrade-Insecure-Requests: 1
X-Forwarded-For: YOUR CLIENT IP
X-Forwarded-Host: YOUR HOST NAME
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: traefik-***-***
X-Real-Ip: YOUR CLIENT IP
WhoAmI - Output After Fix
Complications of the fix
As I mentioned before, this fixes the source IP issue but opens you up to a new set of problems:
- You now have to make sure at least one pod backing the service is running on every node that receives traffic, otherwise those requests are dropped
- You need to make sure the traffic is evenly balanced between the nodes, or evenly spread the services across the nodes
Missing #1 will result in people failing to connect to your service; missing #2 can cause performance issues and wasted resources. One way to address the first point is shown below.
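One common way to cover the first point is to run Traefik as a DaemonSet, so every node always has an ingress pod to receive whatever traffic the load balancer sends it. In the Traefik Helm chart this is, as far as I know, a one-line change in values.yaml:
deployment:
  kind: DaemonSet # one Traefik pod per node instead of a fixed replica count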