Connection timeout in TYK calls to upstream backend endpoint

Hi All

We are facing an issue in TYK 2.5.1 as described below:

When hitting an upstream URL under a large load (e.g. 10k sequential requests), we get the following error in the response for 1-2% of requests:

There was a problem proxying the request with an error code 504

NOTE: All the requests are identical, simple GET calls, so the requests do not change across the whole load.

While checking the TYK logs we found the following errors:
http: proxy error: dial tcp XX.XX.XXX.252:443: getsockopt: connection timed out
http: proxy error: net/http: TLS handshake timeout

The same load was then sent directly to the upstream URL, and there all requests were successful, which suggests that the issue is not at the backend server.
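
For anyone wanting to repeat this comparison, here is a minimal sketch of a sequential load run that tallies outcomes per status code. It is only an illustration: the function names are made up for this post, it uses just the Python standard library, and the URLs in the usage comment are placeholders rather than our real endpoints.

```python
from collections import Counter
from typing import Callable
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 30.0) -> str:
    """Issue one GET; return the status code as a string, or the error class name."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return str(resp.status)
    except urllib.error.HTTPError as exc:
        # Error responses such as 504 are raised as HTTPError.
        return str(exc.code)
    except OSError as exc:
        # Covers URLError, socket timeouts and dropped connections.
        return type(exc).__name__


def run_sequential(url: str, count: int,
                   fetch: Callable[[str], str] = fetch_status) -> Counter:
    """Send `count` identical GET requests one after another and tally the outcomes."""
    outcomes = Counter()
    for _ in range(count):
        outcomes[fetch(url)] += 1
    return outcomes


# Usage (placeholder URLs, replace with the gateway listen path and the upstream URL):
#   print(run_sequential("https://gateway.example.com/my-api/", 10_000))
#   print(run_sequential("https://upstream.example.com/", 10_000))
```

Running it once through the gateway and once directly against the upstream gives two tallies that can be compared side by side. Note that each call opens a fresh connection, which may or may not match how the original load tool behaved.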

Can anyone suggest whether we are missing something here that is causing this error?

Thanks!

Unsure what the error could be in 2.5.1. Checking http_server_options.read_timeout and http_server_options.write_timeout might help.

As a suggestion, could you try a more optimized version and check whether the issue persists? I see my colleague has mentioned checking the v2.7 series.

Both of the listed configurations are for API Consumer → Gateway.
But the error in our case is on the Gateway → backend side.

could you use a more optimized version
Yes, we'll try to upgrade, but before that we wanted to understand the root cause of this error.
For example, are there any connection limits or network-related issues on the Tyk side that could be causing this, given that the APIs work fine without the gateway?

Then checking for low timeouts might be the next step. The gateway proxy_default_timeout (set to 0 by default, which would wait forever) or the enforced timeout middleware in the API definition may be good places to start.
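
As a rough aid for that check, the sketch below pulls the timeout-related settings out of an already parsed tyk.conf and API definition. It rests on assumptions worth verifying against your version: that the gateway-wide proxy timeout appears in tyk.conf as `proxy_default_timeout`, and that enforced timeouts live under `version_data.versions.<name>.extended_paths.hard_timeouts` in the API definition. The function name `report_timeouts` is hypothetical.

```python
def report_timeouts(gateway_conf: dict, api_def: dict) -> dict:
    """Collect timeout-related settings from a parsed tyk.conf and one API definition."""
    server_opts = gateway_conf.get("http_server_options") or {}
    report = {
        # Consumer -> gateway side.
        "read_timeout": server_opts.get("read_timeout"),
        "write_timeout": server_opts.get("write_timeout"),
        # Gateway -> upstream side (key name is an assumption, see above).
        "proxy_default_timeout": gateway_conf.get("proxy_default_timeout"),
        # Per-path enforced timeouts: (version, method, path, seconds).
        "hard_timeouts": [],
    }
    versions = (api_def.get("version_data") or {}).get("versions") or {}
    for name, version in versions.items():
        extended = version.get("extended_paths") or {}
        for rule in extended.get("hard_timeouts") or []:
            report["hard_timeouts"].append(
                (name, rule.get("method"), rule.get("path"), rule.get("timeout"))
            )
    return report
```

Both arguments are plain dictionaries, for example the result of `json.load` on the two files. A value of `None` in the report means the key was absent, so the gateway would be using whatever default applies in that version.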

As for the cause under load, it could be Tyk or your hardware. As mentioned, we do see better performance from v2.7 and above, but maybe these performance configurations could help.