K8s - Where is my source IP?

Congratulations! You've successfully created your first Kubernetes cluster!
You've got an application running on your node(s) and your clients are accessing the page and everyone is happy.
You go to have a celebratory drink, but before you crack open the champagne, you decide to check the logs on your server to see how many people have accessed your page. You pull the logs, look at the IP addresses, and something seems off: every request looks like it's coming from the server itself!

What went wrong?
If you're setting up a K8s cluster, you probably have the following:
- Some sort of load balancer - MetalLB
- Some sort of ingress - Traefik
- An application running - WhoAmI
When we get a request from a client, without making any changes, this is the flow:
Client -> MetalLB -> Traefik -> WhoAmI
Now when you start to introduce load balancers to the equation, we run into an issue: how do we spread the traffic?
We normally have two options, assuming you are using a Layer 2 load balancer and not BGP load balancing. They correspond to the two values of the service's externalTrafficPolicy:
- Cluster - The default option. Traffic arriving at any node can be forwarded to a pod on any other node, so a request is still served even if the receiving node has no matching pods.
- Local - The alternative. Traffic is only delivered to pods on the node that received it; the load balancer still spreads traffic across the nodes regardless of how many matching pods each one is running (even if it has none, in which case those requests are dropped).
Cluster is the default because it is "safer and faster", by which I really mean it's just harder to screw up: you can reasonably assume that any request will somehow get to the right place. The downside is that when a request is forwarded between nodes it gets source-NATed, and the original source IP address is lost.
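You can check which policy your Traefik service is currently using with kubectl. This is just a quick sanity check; the service name and namespace below ("traefik" in the "traefik" namespace) are assumptions, so adjust them to match your install:
# Print the current externalTrafficPolicy of the Traefik service
kubectl get svc traefik -n traefik -o jsonpath='{.spec.externalTrafficPolicy}'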
How do we approach this?
If you don't have a load balancer (you're running a bare-metal K8s cluster with a single node), you can use Traefik as both the ingress and the load balancer. That said, this is effectively no different from just exposing the application on the host ports, but it allows us to fix the problem without creating more complexity.
You may have applications running already, but WhoAmI is a great application for debugging, as it lets us see the headers of the request.
You can run this inside your cluster with the following, assuming you have Traefik running already:
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: whoami
  labels:
    app: whoami
spec:
  replicas: 1
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami
          ports:
            - name: web
              containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
spec:
  ports:
    - name: web
      port: 80
      targetPort: web
  selector:
    app: whoami
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: whoami
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`whoami.example.com`)
      kind: Rule
      services:
        - name: whoami
          port: 80
YAML file to deploy WhoAmI
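Assuming you save the manifests above to a file called whoami.yaml (the name is arbitrary), applying them and checking on the pod looks something like this:
kubectl apply -f whoami.yaml
kubectl get pods -l app=whoami   # wait until the pod shows Running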
Once you have it running, you can go to your website whoami.example.com and it will show you something like the following:
Hostname: whoami-***-***
IP: 127.0.0.1
IP: 10.23.0.221
RemoteAddr: 10.23.0.100:53638
GET / HTTP/1.1
Host: YOUR HOST NAME
...
Upgrade-Insecure-Requests: 1
X-Forwarded-For: YOUR NODE IP
X-Forwarded-Host: YOUR HOST NAME
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: traefik-***-***
X-Real-Ip: YOUR NODE IP
Who Am I - Full Output
The part we really care about is the following two lines:
X-Forwarded-For: YOUR NODE IP
X-Real-Ip: YOUR NODE IP
Who Am I - Important Output
These headers are where the original client IP should appear by the time the request reaches the pod. Right now they only show the node's IP, which is exactly the problem we need to fix.
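If you'd rather test from the command line, a plain curl against your hostname produces the same output; if DNS for the host isn't set up yet, you can point curl at your load balancer directly (the IP below is just a placeholder):
curl https://whoami.example.com/
# or, without DNS, resolve the host manually to the load balancer's IP:
curl --resolve whoami.example.com:443:192.168.1.240 https://whoami.example.com/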
Wrong approaches to this problem
I've seen a lot of solutions on the web for this problem; some fail to fully explain why you need to set the values they suggest, and some open you up to additional risks.
You'll see many answers on the internet that suggest setting hostNetwork to true in the values.yaml, but that answer is wrong. Although it may solve your problem, it is not secure and will open you up to other issues.
hostNetwork: true # BAD!
Please do not do this!
Some people simply say to set the forwarded headers on the ingress to accept everything, which is closer, but is the equivalent of running chmod 777 to solve your problem. This is bad because you will effectively trust any forwarded header passed to the ingress, regardless of whether the load balancer set it!
forwardedHeaders:
  insecure: true # BAD!
Please do not do this!
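To see why this is dangerous: with insecure set to true, Traefik trusts forwarded headers from anyone, so a client can simply claim to be a different IP:
# The spoofed value below ends up in the X-Forwarded-For list your application
# sees, and many applications will log 1.2.3.4 as the "client" IP.
curl -H "X-Forwarded-For: 1.2.3.4" https://whoami.example.com/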
Solution
You should set the following in the Traefik values.yaml file, with trustedIPs set to the IP address of your load balancer.
forwardedHeaders:
  trustedIPs:
    - Your load balancer's IP address
  insecure: false
Traefik - values.yaml
The insecure value should remain false, as we do not want to allow every IP to set these headers, only the load balancer.
If you have both a web and a websecure entrypoint, you will need to do this for both entries, as you will want to redirect and pass the headers along in each case; see the snippet below.
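For reference, in the Traefik Helm chart these settings live under each entry in ports, so a values.yaml snippet covering both entrypoints might look roughly like this (the IP is just a placeholder for your load balancer's address):
ports:
  web:
    forwardedHeaders:
      trustedIPs:
        - 192.168.1.240/32 # your load balancer's IP
      insecure: false
  websecure:
    forwardedHeaders:
      trustedIPs:
        - 192.168.1.240/32
      insecure: false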
You will also want to set the following in the Traefik values.yaml file. This switches the Traefik service to the Local traffic policy, which tells kube-proxy to only deliver traffic to pods on the node that received it, rather than forwarding it (and source-NATing it) to another node. This is what fixes the problem of the client IP being lost.
spec:
  externalTrafficPolicy: Local
Traefik - values.yaml
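If you prefer not to redeploy the chart just for this, the same change can be applied directly to the existing Traefik service; the service name and namespace here are assumptions, so adjust them to your setup:
kubectl patch svc traefik -n traefik \
  -p '{"spec":{"externalTrafficPolicy":"Local"}}'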
We can then reload the WhoAmI service and see if it worked! This was my result:
Hostname: whoami-***-***
IP: 127.0.0.1
IP: 10.23.0.221
RemoteAddr: 10.23.0.100:53638
GET / HTTP/1.1
Host: YOUR HOST NAME
...
Upgrade-Insecure-Requests: 1
X-Forwarded-For: YOUR CLIENT IP
X-Forwarded-Host: YOUR HOST NAME
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: traefik-***-***
X-Real-Ip: YOUR CLIENT IP
WhoAmI - Output After Fix
Complications of the fix
As I mentioned before, this fixes the source IP issue but opens you up to a new set of problems:
- You now have to make sure at least one pod backing the service is running on every node that receives traffic, otherwise those requests are dropped
- You need to make sure the traffic is evenly balanced between the nodes, or evenly spread the services across the nodes
Missing #1 will result in people failing to connect to your service; missing #2 can cause performance issues and wasted resources. One way to address the first point is shown below.
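One common way to cover the first point is to run Traefik as a DaemonSet, so every node always has an ingress pod to receive whatever traffic the load balancer sends it. In the Traefik Helm chart this is, as far as I know, a one-line change in values.yaml:
deployment:
  kind: DaemonSet # one Traefik pod per node instead of a fixed replica count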