Resilience & SPOF discussion

Hi Martin,
The more I evaluate your solution, the more interested I am in it. However, I still have some questions about the architecture, including storage.

As I understand it, Tyk involves two different data stores: Redis and MongoDB. You use Redis's pub/sub model as a kind of notification mechanism between Tyk nodes, so I understand that it’s very important to have a highly available Redis cluster.

But what about MongoDB? Again, as I understand it, MongoDB is there to store analytics data. So, if MongoDB crashes, what happens? Does the system continue to work normally? Or do we have to use a more reliable MongoDB setup, a replica set for example, to ensure that every incoming request continues to be routed to the right microservice?

These questions are very important for us before we go further in adopting Tyk (which is, for now, very attractive).

Do you plan to write documentation about best practices for creating a resilient Tyk cluster?

Thank you

Hi,

There’s actually a section in the docs around deploying Tyk to production:

https://tyk.io/v1.9/setup/deploy-tyk-to-production/

With regard to MongoDB - we would always recommend setting up a resilient MongoDB replica set. However, if Mongo goes down, the main issue you will have is that analytics will not be written, so that analytics data will be lost.

Also, if MongoDB goes down, updates to the API definitions stored in Tyk will fail. So if you hot reload a Tyk node cluster while MongoDB is down, the nodes will load empty data and will no longer proxy traffic for your APIs. This is very much an edge case, though, since most hot reloads are performed by the dashboard, which will not work correctly if Mongo is down.
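
To make that edge case concrete, here is a minimal sketch in Python. It is not Tyk's actual code: the `Node` class, the `loader` callback and both reload methods are hypothetical names made up for illustration. It only contrasts a reload that blindly replaces the loaded definitions with one that keeps the last known good set when the store is unreachable or returns nothing.

```python
from typing import Callable, Dict, List

# Hypothetical shape for an API definition; real definitions are richer.
ApiDefinition = Dict[str, str]


class Node:
    """Hypothetical gateway node holding API definitions in memory."""

    def __init__(self, loader: Callable[[], List[ApiDefinition]]) -> None:
        # `loader` stands in for "fetch definitions from the backing store".
        self._loader = loader
        self.definitions: List[ApiDefinition] = []

    def naive_reload(self) -> None:
        # Replaces the current definitions with whatever comes back,
        # so a store outage leaves the node with nothing to proxy.
        try:
            self.definitions = self._loader()
        except ConnectionError:
            self.definitions = []

    def guarded_reload(self) -> bool:
        # Keeps the last known good definitions if the store is down
        # or returns an empty set; reports whether the swap happened.
        try:
            fresh = self._loader()
        except ConnectionError:
            return False
        if not fresh:
            return False
        self.definitions = fresh
        return True
```

With a loader that raises `ConnectionError` while the store is down, `naive_reload()` empties the node, whereas `guarded_reload()` returns `False` and the node keeps serving the definitions it loaded earlier.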

So, in conclusion, MongoDB is not mission critical, but we still recommend having a resilient setup.

In the next version of Tyk, the MongoDB dependency for the nodes is completely removed, and the analytics purging services have moved to a new component called Tyk Pump. This also means the failure edge case I mentioned above can’t happen, since we’ve made that more resilient too.

That means MongoDB is only a dependency for the dashboard/portal.

Hope that answers things.

Cheers,
Martin