I am using the TykTechnologies/tyk-hybrid-docker image (Tyk Hybrid Mode Docker Image) from GitHub as the basis for a hybrid Tyk deployment. I have modified the Dockerfile to install the lua-nginx-redis package and modified the nginx configs to suit my employer’s needs. The nginx configuration is tested and functional. I also set “enable_geo_ip” to false in tyk.conf. The container is otherwise unmodified.

I have built and deployed this container, along with linked redis and api containers, using docker 1.7.1 on an Ubuntu 14.04 host. When testing the deployment with 30 concurrent client connections, I encountered repeated crashes in the docker container. Examining the logs, I found the following message immediately before a very verbose golang stack trace: “fatal error: concurrent map read and map write”. The stack trace makes it clear that this error is occurring in the tyk code.

I know that by default, golang maps are not safe for concurrent access. I have not examined the tyk source code.

The api I hope to deploy behind tyk is expected to handle many more than 30 concurrent connections. Is there a way for me to avoid crashes like this?

The stack trace would really help here :slight_smile:

We have hybrid setups pushing 500k requests per day, so we’re very sure about it handling traffic.

I’ve tried to replicate this with both the latest Tyk build (dev branch) and our hybrid container, throwing about 1k concurrent connections at each (geo_ip disabled), and can’t seem to reproduce the issue you are getting. That implies it’s a tricky, nasty little problem that we’ll need the stack trace to pin down.

So it would really help to get it.

On the other hand, we can’t support anything outside of our own official docker hybrid build and setup. We build our docker image automatically on Docker Hub, and that is the one we test and work with.

But I’d still like to get my hands on that stack trace :slight_smile:

Martin, thanks so much for your attention! I am not able to recover the stack trace at this time, but I will have it posted by tomorrow morning.

No worries - I’m testing the system with the race detector now. It reports a few data races, but they are minor and shouldn’t affect throughput or stability.

@Martin Thanks for your patience. The host machine was actually unplugged and moved between my first post and your response. I’ve got it running again now. :slight_smile:

Here are the first 106 lines of the stack trace. An additional 5500+ lines of goroutine traces were also produced.

Awesome, thank you - that’s been fixed now and pushed to our hybrid container (and the GitHub repo). If you pull and rebuild, you should be able to continue.