Hi everybody,
I’m a senior product manager at Tyk working on observability. We have been getting many questions from our users about distributed tracing, the sunset of OpenTracing and our upcoming support for OpenTelemetry.
I have summarised all the answers to those questions below.
Is anything missing? Don’t hesitate to ask.
If you are new to observability and/or OpenTelemetry, you should probably read these two articles first: Observability Primer | OpenTelemetry and What is OpenTelemetry? | OpenTelemetry.
Q: Is OpenTelemetry support coming to Tyk?
Yes, distributed tracing support with OpenTelemetry is on the near-term roadmap for Tyk API Gateway. If this is a valuable feature for you, please leave a comment below saying as much.
Here are a couple of things we’d love to learn from you:
- Why would you like to get distributed tracing from Tyk API Gateway? How will this make your life easier?
- Which observability platform/tool are you using (Datadog, Dynatrace, New Relic, Elastic, Honeycomb, Splunk, Lightstep, Jaeger, Grafana Tempo, …)?
- Do you have any specific requirements (e.g. format being used for trace-context propagation, granularity of the spans we will export, sampling, baggage, …)?
Q: Will OpenTelemetry help me to monitor and troubleshoot GraphQL and UDG queries?
Yes! Let us know what you struggle with at the moment (e.g. federation) and we will look into your use cases.
Q: Now that OpenTracing is being sunsetted, can I still use OpenTracing with Tyk?
Yes! Here’s what you should know:
Tyk API Gateway implements the OpenTracing specification using the OpenTracing Go library and exports the trace data using the client libraries from either Jaeger or Zipkin. This can be configured in the Gateway (see Jaeger or Zipkin for details on the configuration options).
The CNCF (Cloud Native Computing Foundation) has archived the OpenTracing project, and Jaeger has deprecated its client libraries. This means that no new pull requests or feature requests are accepted into the OpenTracing or Jaeger client library repositories. This makes sense because the community has moved to OpenTelemetry, which is making great progress!
OpenTelemetry support is on our near-term roadmap (see above). Until it is available, you can definitely leverage OpenTracing to get Gateway timing and data in your traces. We are of course still supporting this functionality - let us know if you are having any issues with it.
Q: Which information does the current implementation with OpenTracing export?
Right now, you get timings for the time spent in the Tyk Gateway (version check, rate limit check, middleware, …) and for the time spent in upstream services.
There is room for improvement (errors, HTTP status codes, …) and we plan to use the semantic conventions from OpenTelemetry to guide us. Let us know if you are missing relevant insights.
Q: Can I use the OpenTelemetry collector to translate the spans exported with OpenTracing to the OpenTelemetry format?
The OpenTelemetry collector (the component responsible for collecting, processing and forwarding telemetry data) has the concept of a receiver. A receiver accepts data in a specific format, translates it into the collector's internal format and passes it to the processors and exporters defined in the collector.
This means that you can send the traces that Tyk exports in the Jaeger or Zipkin format to the collector, using the receiver for Jaeger or for Zipkin, and have them translated into the OTLP format (or any other format required by your observability backend).
Attention: while it could be beneficial for your use case to be able to export the traces in the OTLP format and import them into the tool of your choice, you will still be missing one piece to achieve true end-to-end tracing: a single, shared format for context propagation.
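As a rough illustration, a collector configuration along these lines could receive Jaeger or Zipkin traces and forward them as OTLP. This is a minimal sketch, not a tested setup: it assumes a collector distribution that includes the `jaeger` and `zipkin` receivers (such as the contrib distribution), the ports are the usual Jaeger and Zipkin defaults, and `my-backend:4317` is a placeholder for your own backend.

```yaml
receivers:
  jaeger:
    protocols:
      thrift_compact:            # UDP, the protocol Jaeger clients use to reach an agent
        endpoint: 0.0.0.0:6831
      thrift_http:               # HTTP, used when clients report directly to a collector
        endpoint: 0.0.0.0:14268
  zipkin:
    endpoint: 0.0.0.0:9411

exporters:
  otlp:
    endpoint: my-backend:4317    # placeholder: replace with your backend's OTLP endpoint

service:
  pipelines:
    traces:
      receivers: [jaeger, zipkin]
      exporters: [otlp]
```

You would then point the Jaeger or Zipkin reporter configured in the Gateway at the collector instead of at a Jaeger or Zipkin backend.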
Q: What is context propagation? Do you support W3C Trace Context?
When using distributed tracing, each service and component exports its own spans (parts of the trace). A unique identifier (trace id) is needed to stitch those spans together to get the end-to-end distributed trace. Context propagation is how this identifier is passed from one service to the next, typically in request headers.
In the past, each observability vendor and tool implemented its own format to express this trace id (B3, Jaeger native propagation, …). A couple of years ago, a new standard was created: W3C Trace Context, which is the standard recommended by OpenTelemetry.
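To make this more concrete, here is a small Python sketch that parses a W3C Trace Context `traceparent` header, which has the form `version-traceid-parentid-flags`. The function and class names are made up for illustration, and the sketch only handles the version `00` layout; in practice an OpenTelemetry SDK does this for you.

```python
import re
from typing import NamedTuple, Optional

# Version 00 layout: 2 hex chars, 32 hex chars, 16 hex chars, 2 hex chars.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str   # shared by every span in the trace
    parent_id: str  # id of the span that made the request
    sampled: bool   # lowest bit of the trace flags


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is not a valid traceparent."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version "ff" and all-zero trace or parent ids.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


if __name__ == "__main__":
    print(parse_traceparent(
        "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    ))
```

Every service that receives this header continues the same trace by reusing the `trace_id` and creating a new span whose parent is `parent_id`.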
Here is a good video to learn more about context propagation with OpenTelemetry: Context Propagation makes OpenTelemetry awesome - YouTube.
With our OpenTracing support in Tyk API Gateway, we support B3 with Zipkin and Jaeger native propagation with Jaeger. We plan to use W3C Trace Context in our upcoming support for OpenTelemetry.
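To illustrate why a shared format matters, the sketch below reads the trace id from the headers used by each of the three propagation formats mentioned above. It is a simplified illustration rather than how Tyk or any tracing library implements it: `extract_trace_id` is a hypothetical helper, and left-padding shorter ids to 32 hex characters is one common way to compare 64-bit and 128-bit ids.

```python
from typing import Mapping, Optional


def extract_trace_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the trace id as 32 lowercase hex chars, or None if absent."""
    h = {name.lower(): value.strip() for name, value in headers.items()}

    # W3C Trace Context: traceparent = version-traceid-parentid-flags
    if "traceparent" in h:
        parts = h["traceparent"].split("-")
        if len(parts) >= 4:
            return parts[1].lower()

    # B3 multi-header format (Zipkin): the trace id has its own header.
    if "x-b3-traceid" in h:
        return h["x-b3-traceid"].lower().zfill(32)

    # B3 single-header format: b3 = traceid-spanid[-sampled[-parentspanid]]
    if "b3" in h:
        parts = h["b3"].split("-")
        if len(parts) >= 2:  # a lone "0", "1" or "d" carries no trace id
            return parts[0].lower().zfill(32)

    # Jaeger native: uber-trace-id = traceid:spanid:parentspanid:flags
    if "uber-trace-id" in h:
        parts = h["uber-trace-id"].split(":")
        if len(parts) == 4:
            return parts[0].lower().zfill(32)

    return None


if __name__ == "__main__":
    print(extract_trace_id({"X-B3-TraceId": "463ac35c9f6413ad",
                            "X-B3-SpanId": "a2fb4a1d1a96d312"}))
    print(extract_trace_id({"uber-trace-id": "4bf92f3577b34da6:00f067aa0ba902b7:0:1"}))
```

A service that only understands one of these formats will start a new trace when it receives another, which is why the end-to-end trace breaks when components do not agree on the propagation format.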