Istio | Locality-based load balancing

Utkarsh Sharma
3 min read · Nov 23, 2020


Issue

The load is not evenly distributed among all pods. Sometimes a few pods in a Deployment receive most of the traffic while the others receive very little. CPU spikes are also observed on the overloaded pods, causing high response times on the application.

Here, we have two Kubernetes clusters running in two different cloud regions, us-central and us-east. The Istio control plane runs in us-east, set up as an Istio multicluster with a single control plane so that services running in both clusters can reach each other.

When we started both clusters, the cloud provider added region-specific failure-domain labels to the Kubernetes nodes:
failure-domain.beta.kubernetes.io/region: us-central1
failure-domain.beta.kubernetes.io/zone: us-central1-b
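
You can confirm these labels on your own nodes (newer clusters expose topology.kubernetes.io/region and topology.kubernetes.io/zone instead of the beta keys):

kubectl get nodes -L failure-domain.beta.kubernetes.io/region -L failure-domain.beta.kubernetes.io/zone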

Istio populates endpoints with these locality labels, allowing it to redirect requests to the closest available region.

If we delete the echo Deployment running in us-central, Istio will redirect loadgen requests to the echo Pod running in us-east.
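
A quick way to reproduce that failover, assuming kubeconfig contexts named us-central and us-east for the two clusters (both context names are assumptions):

# Delete echo in us-central; loadgen traffic should shift to us-east
kubectl --context us-central delete deployment echo
# Watch requests from loadgen start arriving in the other cluster
kubectl --context us-east logs deploy/echo -f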

Background

While migrating some apps from X cloud to GCP, we observed that the Istio ingress gateway was not load balancing as expected.

Infra Setup:

  • GKE cluster with Istio installed using Helm
  • Multiple namespaces with Istio injection enabled (sidecars)

Issue:

  • CPU usage on 3 of 10 pods was spiking above 80%, whereas the other 7 were running at around 30%.
  • This uneven distribution was causing high response time issues on the application.

Testing setup

  • Deployed 2 client pods on nodes in two different zones, and 2 application pods on nodes in two different zones.
  • Sent load from the client pod in zone a to the application pods (via svc.cluster.local and via the ingress gateway), then repeated the test from the client pod in zone b (a minimal sketch follows this list).
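
A minimal sketch of that load test (pod, namespace, and service names such as client-zone-a and echo are assumptions; any HTTP load tool would do):

# From the client pod in zone a, 100 requests straight to the ClusterIP service
kubectl exec client-zone-a -- sh -c \
  'for i in $(seq 1 100); do curl -s -o /dev/null http://echo.default.svc.cluster.local/; done'

# The same via the ingress gateway's external IP
GATEWAY_IP=$(kubectl -n istio-system get svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
for i in $(seq 1 100); do curl -s -o /dev/null http://$GATEWAY_IP/; done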

Test Results

  • The client pod in zone a sent traffic only to the application pods in zone a, and likewise the client in zone b sent traffic only to the application pods in zone b.

Root cause

Istio’s locality-based load balancing

Locality-prioritized load balancing is the default behavior for locality load balancing. In this mode, Istio tells Envoy to prioritize traffic to the workload instances most closely matching the locality of the Envoy sending the request. When all instances are healthy, requests remain within the same locality. When instances become unhealthy, traffic spills over to instances in the next prioritized locality.
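
This prioritization can also be tuned per service through a DestinationRule. A minimal sketch, assuming a hypothetical echo service in the default namespace (host, regions, and thresholds are illustrative, not from our setup); note that Istio only applies locality failover when outlier detection is configured:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: echo
  namespace: default
spec:
  host: echo.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:            # prefer us-central1; fail over to us-east1
        - from: us-central1
          to: us-east1
    outlierDetection:        # required for locality failover to activate
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s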

Learnings

Locality-based load balancing is enabled by default when installing Istio using Helm. Disable this feature if it is not needed.

To disable locality load balancing, pass the --set global.localityLbSetting.enabled=false flag when installing Istio.
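
For example, with the old Helm chart layout used at the time (the chart path and release name are assumptions; adapt them to your install method):

# Hypothetical install disabling locality-aware load balancing mesh-wide
helm upgrade --install istio install/kubernetes/helm/istio \
  --namespace istio-system \
  --set global.localityLbSetting.enabled=false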

Possible Solutions

  1. Disable the locality settings on the cluster (e.g. via the Helm flag shown above).
  2. Run the same number of ingress gateway pods in both zones. The total should be even, e.g. 6; the default scheduling behavior should then spread the pods across the zones equally where possible. (However, since the autoscaler changes the pod count with load, there is no guarantee the count stays even and the traffic stays balanced.) A sketch of enforcing this spread follows the list.
  3. User -> VM in a DC outside Google -> internal (intranet IP) Istio ingress gateway via direct connect -> application backend pods (2/2: istio-proxy, application)
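
For option 2, a minimal sketch of forcing an even spread with topologySpreadConstraints (the istio-ingressgateway Deployment name and app label are the usual defaults, but verify them in your install; newer clusters use topology.kubernetes.io/zone instead of the beta label):

# Hypothetical patch: spread gateway pods evenly across zones (maxSkew 1)
kubectl -n istio-system patch deployment istio-ingressgateway --type merge -p '
spec:
  template:
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: failure-domain.beta.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: istio-ingressgateway
'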

Written by Utkarsh Sharma

Senior Solutions Consultant @ Google | Talks about AWS | GCP | Azure | K8s | IaC | Terraform | CI/CD | Docker | Helm | Migration | GenAI | DevOps | Security
