CloudNative: Tracing Requests across Microservices with OpenTracing

Engineering organizations are replacing monoliths with microservices for scalability, reliability, and agility, but debugging requests that cross service boundaries is not straightforward. OpenTracing provides a framework for tracing and debugging those requests across microservices.

After splitting a monolith into microservices, engineering teams typically face the following debugging challenges:

  • User-Facing latency optimization
  • Root Cause Analysis of backend errors
  • Communication issues between the distinct components

Famous distributed tracing systems

  • Zipkin
  • Dapper
  • Jaeger
  • Appdash
  • OpenTracing

Why OpenTracing?

  • It offers consistent, expressive, vendor-neutral APIs for popular platforms (Go kit, Flask, gRPC, Django, Dropwizard, etc.).
  • It lets engineering teams switch tracing implementations (Appdash, Jaeger, Instana, Zipkin, Datadog, etc.) with just a configuration change.
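The idea of switching implementations through configuration can be sketched in a few lines of Python. This is only an illustration of the pattern, not the OpenTracing API: the `Tracer` interface, the `NoopTracer` and `InMemoryTracer` classes, the `TRACERS` registry, and the `"tracer"` config key are all hypothetical names invented for this example.

```python
import time
from abc import ABC, abstractmethod


class Tracer(ABC):
    """Hypothetical vendor-neutral interface that application code depends on."""

    @abstractmethod
    def record(self, operation, duration_s):
        ...


class NoopTracer(Tracer):
    """Discards every record; stands in for 'tracing disabled'."""

    def record(self, operation, duration_s):
        pass


class InMemoryTracer(Tracer):
    """Keeps records in a list; stands in for a real tracing backend."""

    def __init__(self):
        self.records = []

    def record(self, operation, duration_s):
        self.records.append((operation, duration_s))


# Hypothetical registry mapping a configuration value to an implementation.
TRACERS = {"noop": NoopTracer, "memory": InMemoryTracer}


def build_tracer(config):
    # The backend is chosen from configuration only.
    return TRACERS[config.get("tracer", "noop")]()


def handle_request(tracer):
    # Application code sees only the Tracer interface, never a vendor class.
    start = time.perf_counter()
    result = "ok"  # stand-in for real work
    tracer.record("handle_request", time.perf_counter() - start)
    return result


tracer = build_tracer({"tracer": "memory"})
handle_request(tracer)
print(tracer.records[0][0])  # prints "handle_request"
```

Because `handle_request` depends only on the interface, changing `{"tracer": "memory"}` to `{"tracer": "noop"}` swaps the backend without touching application code.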

OpenTracing Concepts

  • Trace : A directed acyclic graph of spans that represents a transaction or workflow as it propagates through the system.
  • Span : A named, timed segment of work within a trace. It encapsulates an operation name, a start timestamp, a finish timestamp, tags, and a context.
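To make the two concepts concrete, here is a minimal data-model sketch in plain Python. It is not the OpenTracing library: the `Span`, `SpanContext`, and `start_span` names, the use of random hex IDs, and the operation names `checkout` and `charge-card` are assumptions made for illustration.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SpanContext:
    trace_id: str  # shared by every span in the same trace
    span_id: str   # unique per span


@dataclass
class Span:
    operation_name: str
    context: SpanContext
    parent_id: Optional[str] = None  # None marks the root span
    tags: Dict[str, object] = field(default_factory=dict)
    start_time: float = field(default_factory=time.time)
    finish_time: Optional[float] = None

    def finish(self) -> None:
        self.finish_time = time.time()


def start_span(name: str, parent: Optional[Span] = None) -> Span:
    # A child inherits its parent's trace_id; a root span starts a new trace.
    trace_id = parent.context.trace_id if parent else uuid.uuid4().hex
    context = SpanContext(trace_id=trace_id, span_id=uuid.uuid4().hex)
    parent_id = parent.context.span_id if parent else None
    return Span(name, context, parent_id=parent_id)


root = start_span("checkout")
child = start_span("charge-card", parent=root)
child.tags["http.status_code"] = 200
child.finish()
root.finish()

# The trace is the set of spans; parent -> child links are the graph's edges.
trace: List[Span] = [root, child]
edges = [(s.parent_id, s.context.span_id) for s in trace if s.parent_id]
print(len(edges))  # prints 1
```

The shared `trace_id` is what lets a backend group spans from different services into one trace, and the parent links give the graph its direction.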

Example

Let's say we have three microservices:

  • PaymentService : It performs the payment action.
  • OrderService : It performs the order action.
  • MyService : It calls the payment and order actions.

Source Code : https://github.com/anshulpatel25/distributed_tracing

We will use the source code above to measure the execution time of PaymentService and OrderService when they are called from MyService.
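The shape of that measurement can be approximated without any tracing library. The sketch below is not the code from the repository: it simulates the three services as local functions, uses `time.sleep` with arbitrary short delays as a stand-in for real work, and records spans with a hypothetical `span` context manager.

```python
import time
from contextlib import contextmanager

spans = []  # finished spans, appended in the order they finish


@contextmanager
def span(name, parent=None):
    """Hypothetical helper: times the enclosed block and records it as a span."""
    tags = {}
    start = time.perf_counter()
    try:
        yield tags
    finally:
        spans.append({
            "name": name,
            "parent": parent,
            "duration_s": time.perf_counter() - start,
            "tags": tags,
        })


def order_service():
    time.sleep(0.04)  # arbitrary delay standing in for the order action
    return 200


def payment_service():
    time.sleep(0.10)  # arbitrary delay standing in for the payment action
    return 200


def my_service():
    # MyService calls the two services one after the other.
    with span("MyService"):
        with span("OrderService", parent="MyService") as tags:
            tags["http.status_code"] = order_service()
        with span("PaymentService", parent="MyService") as tags:
            tags["http.status_code"] = payment_service()


my_service()
for s in spans:
    print(f'{s["name"]}: {s["duration_s"]:.2f}s {s["tags"]}')
```

Child spans finish before their parent, so they appear first in `spans`. A real tracer would ship each finished span to a backend, which then draws the timeline.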

Output

[Figure: Gantt chart of the trace (image: opentracing)]

From the Gantt chart above, we can observe the following:

  • Total execution time: 14.03s
  • Time taken by OrderService: 4.01s
  • Response code from OrderService: 200 OK
  • Time taken by PaymentService: 10.02s
  • Response code from PaymentService: 200 OK
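As a quick sanity check on these numbers, the two child durations add up to the total, which would be consistent with MyService calling the two services one after the other rather than in parallel. That reading is an inference from the figures, not something the trace states directly. The arithmetic below uses `Decimal` to avoid floating-point rounding noise.

```python
from decimal import Decimal

# Durations in seconds, copied from the observations above.
total = Decimal("14.03")
order = Decimal("4.01")
payment = Decimal("10.02")

# Time in MyService not accounted for by either child span.
unaccounted = total - (order + payment)
print(unaccounted)  # prints 0.00

# Share of the total spent in PaymentService, as a percentage.
payment_share = (payment / total * 100).quantize(Decimal("0.1"))
print(payment_share)  # prints 71.4
```

PaymentService accounts for roughly 71% of the total time, so it would be the first place to look when optimizing this request.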

Reference(s)