gRPC Load Balancing issue with K8s Headless Service

Branch/Environment/Version

  • Branch/Version: Release 5.8.7
  • Environment: On-prem

Describe the bug
We are currently experiencing an issue where gRPC traffic is not being load-balanced across our backend pods as expected. Despite using a Headless Service, the Tyk Gateway seems to stick to a single backend connection.
It is my understanding that the “smart gRPC Client” in Tyk should resolve the multiple IPs returned by the headless service, establish multiple connections, and round-robin the requests. However, I have not been able to confirm this behaviour in our setup.

Reproduction steps
Steps to reproduce the behaviour:

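  1. Deploy a gRPC backend service with multiple replicas, exposed through a headless Service
  2. Create an API that proxies to the headless Service (see the API definition below)
  3. Send several gRPC requests through the gateway and check which replicas receive them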
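
To rule out DNS as the cause, one can check from inside the cluster that the headless Service name resolves to one address per replica. The following is only a minimal sketch using the Python standard library; `resolve_all` is a hypothetical helper name, not part of Tyk or Kubernetes.

```python
import socket


def resolve_all(host, port):
    """Return the sorted, de-duplicated IP addresses that the resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Run from a pod in the same cluster, `resolve_all("grpc-smoketest-headless.pe-tools.svc.cluster.local", 50051)` should list one IP per ready replica if the headless Service is set up as intended. A single IP would point to a DNS or Service problem rather than to the gateway.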

Actual behaviour
Only one backend service replica receives traffic.

Expected behaviour
The traffic should be distributed round-robin between the available replicas.
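
To make the expectation concrete, here is a minimal sketch of round-robin selection over the addresses a headless Service lookup might return. It is an illustration only, not Tyk's implementation; the pod IPs and the `round_robin` helper are hypothetical.

```python
from collections import Counter
from itertools import cycle


def round_robin(backends, n_requests):
    """Return the backend chosen for each of n_requests, cycling through backends in order."""
    if not backends:
        raise ValueError("at least one backend is required")
    picker = cycle(backends)
    return [next(picker) for _ in range(n_requests)]


# Hypothetical pod IPs standing in for the A records of the headless Service.
pod_ips = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
assignments = round_robin(pod_ips, 9)

# With 9 requests and 3 replicas, each replica is expected to receive 3 requests.
print(Counter(assignments))
```

In our case the observed distribution is instead equivalent to all requests landing on a single entry of `pod_ips`.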

Configuration (tyk config file):
Environment variables (configured via the Tyk Helm chart):

      - env:
        - name: TYK_GW_LISTENPORT
          value: "8080"
        - name: TYK_GW_OAS_VALIDATE_EXAMPLES
          value: "false"
        - name: TYK_GW_OAS_VALIDATE_SCHEMA_DEFAULTS
          value: "false"
        - name: TYK_GW_ENABLEFIXEDWINDOWRATELIMITER
          value: "false"
        - name: TYK_GW_STORAGE_TLSMAXVERSION
        - name: TYK_GW_STORAGE_TLSMINVERSION
        - name: REDIGOCLUSTER_SHARDCOUNT
          value: "128"
        - name: TYK_GW_STORAGE_TYPE
          value: redis
        - name: TYK_GW_STORAGE_ADDRS
          value: master.***:6379
        - name: TYK_GW_STORAGE_ENABLECLUSTER
          value: "false"
        - name: TYK_GW_STORAGE_DATABASE
          value: "0"
        - name: TYK_GW_STORAGE_PASSWORD
          valueFrom:
            secretKeyRef:
              key: redisPass
              name: secrets-trip-api
        - name: TYK_GW_STORAGE_USESSL
          value: "true"
        - name: TYK_GW_SECRET
          valueFrom:
            secretKeyRef:
              key: APISecret
              name: secrets-trip-api
        - name: TYK_GW_NODESECRET
          valueFrom:
            secretKeyRef:
              key: APISecret
              name: secrets-trip-api
        - name: TYK_GW_POLICIES_ALLOWEXPLICITPOLICYID
          value: "true"
        - name: TYK_GW_HTTPSERVEROPTIONS_USESSL
          value: "false"
        - name: TYK_GW_TEMPLATEPATH
          value: /opt/tyk-gateway/templates
        - name: TYK_GW_TYKJSPATH
          value: /opt/tyk-gateway/js/tyk.js
        - name: TYK_GW_MIDDLEWAREPATH
          value: /mnt/tyk-gateway/middleware
        - name: TYK_GW_APPPATH
          value: /mnt/tyk-gateway/apps
        - name: TYK_GW_POLICIES_POLICYPATH
          value: /mnt/tyk-gateway/policies
        - name: TYK_GW_STORAGE_MAXIDLE
          value: "1000"
        - name: TYK_GW_ENABLENONTRANSACTIONALRATELIMITER
          value: "true"
        - name: TYK_GW_POLICIES_POLICYSOURCE
          value: file
        - name: TYK_GW_ENABLEANALYTICS
          value: "true"
        - name: TYK_GW_ANALYTICSCONFIG_TYPE
        - name: TYK_GW_POLICIES_POLICYRECORDNAME
          value: /mnt/tyk-gateway/policies/policies.json
        - name: TYK_GW_HASHKEYS
          value: "true"
        - name: TYK_GW_HASHKEYFUNCTION
          value: murmur128
        - name: TYK_GW_HTTPSERVEROPTIONS_ENABLEWEBSOCKETS
          value: "true"
        - name: TYK_GW_HTTPSERVEROPTIONS_MINVERSION
          value: "771"
        - name: TYK_GW_HTTPSERVEROPTIONS_CERTIFICATES
          value: '[{"cert_file":"/etc/certs/tyk-gateway/tls.crt","domain_name":"*","key_file":"/etc/certs/tyk-gateway/tls.key"}]'
        - name: TYK_GW_HTTPSERVEROPTIONS_SSLINSECURESKIPVERIFY
          value: "false"
        - name: TYK_GW_ALLOWINSECURECONFIGS
          value: "true"
        - name: TYK_GW_COPROCESSOPTIONS_ENABLECOPROCESS
          value: "true"
        - name: TYK_GW_MAXIDLECONNSPERHOST
          value: "500"
        - name: TYK_GW_ENABLECUSTOMDOMAINS
          value: "true"
        - name: TYK_GW_PIDFILELOCATION
          value: /mnt/tyk-gateway/tyk.pid
        - name: TYK_GW_DBAPPCONFOPTIONS_NODEISSEGMENTED
          value: "false"
        - name: TYK_GW_HTTPSERVEROPTIONS_ENABLEHTTP2
          value: "true"
        - name: TYK_GW_HTTPSERVEROPTIONS_FLUSHINTERVAL
          value: "1"
        - name: TYK_GW_PROXYENABLEHTTP2
          value: "true"
        - name: TYK_GW_LOGLEVEL
          value: info
        - name: TYK_GW_HTTPSERVEROPTIONS_READTIMEOUT
          value: "660"
        - name: TYK_GW_HTTPSERVEROPTIONS_WRITETIMEOUT
          value: "660"
        - name: TYK_GW_OAUTHTOKENEXPIREDRETAINPERIOD
          value: "1800"
        - name: TYK_GW_OAUTHTOKENEXPIRE
          value: "1800"
        - name: TYK_GW_GLOBALSESSIONLIFETIME
          value: "3600"

APIDefinition:

apiVersion: tyk.tyk.io/v1alpha1
kind: ApiDefinition
metadata:
  name: grpc-smoketest
  namespace: trip-api-gateway
spec:
  active: true
  api_id: grpc-smoketest
  name: gRPC smoketest
  protocol: http
  proxy:
    enable_load_balancing: false
    listen_path: /moia.apigateway.SmoketestService/
    target_url: h2c://grpc-smoketest-headless.pe-tools.svc.cluster.local:50051
    transport: {}
  version_data:
    default_version: Default
    not_versioned: true
    versions:
      Default:
        name: Default
# ...

Additional context

I found the documentation regarding gRPC Load Balancing, but it doesn’t seem to resolve our specific case with the Headless Service. Is there a specific configuration required to force the Gateway to refresh its connection pool?
https://tyk.io/docs/key-concepts/grpc-proxy#grpc-load-balancing
I also noticed the configuration option coprocess_options.grpc_round_robin_load_balancing (documented at the link below). Since we are not explicitly using the Coprocessor for this, is this flag necessary for standard gRPC proxying?
https://tyk.io/docs/tyk-oss-gateway/configuration#coprocess-options-grpc-round-robin-load-balancing
In the API Definition, proxy.enable_load_balancing is set to false. Should this be true even when using a K8s Headless service, or does Tyk handle gRPC balancing differently?
I would be happy to provide further details or logs if needed.