All About Correlation IDs

SUBHAM MAJAVADIYA
2 min read · Sep 3, 2021

At scale, tracing a request is much more difficult than in small or monolithic systems.

With today’s adoption of microservices and distributed system architectures across the industry, a single request initiated by a client will often lead to calls to multiple applications, each with a different set of parameters and information.

Tracing or debugging such a request across the system (multiple applications) is very time-consuming if no common identifier is passed between those applications. That common identifier is called a Correlation-ID.

A Correlation-ID is a unique identifier (typically a UUID) that is attached to a request and propagated across the system to make tracing and debugging smoother and faster.

How to use a Correlation-ID effectively?

It’s recommended to generate the Correlation-ID at the request origin (the client), attach it to the request header, and propagate the same value to every application the request subsequently reaches.
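As a minimal sketch of that rule, the helper below reuses the caller’s ID when one arrives and generates a fresh UUID only when the request has none. The header name `X-Correlation-ID` and the function name `outgoing_headers` are assumptions for illustration; the header is a common convention, not a formal standard.

```python
import uuid

CORRELATION_HEADER = "X-Correlation-ID"  # assumed name; pick one and use it everywhere


def outgoing_headers(incoming_headers: dict) -> dict:
    """Build headers for the next call, reusing the caller's ID if present."""
    # HTTP header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in incoming_headers.items()}
    # Generate a new ID only at the origin, i.e. when no ID was received.
    correlation_id = lowered.get(CORRELATION_HEADER.lower()) or str(uuid.uuid4())
    return {CORRELATION_HEADER: correlation_id}


# A service in the middle of the chain forwards the ID it received unchanged.
print(outgoing_headers({"x-correlation-id": "abc-123"}))
```

The resulting dictionary can be merged into the headers of whatever HTTP client the service already uses.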

It’s also best to include the Correlation-ID in every log message; otherwise you lose its benefit at exactly the moment you need to debug something.

To ensure that, we can use a middleware or interceptor that extracts the Correlation-ID from the request header and adds it to the logging context. The logger then reads the ID from that context whenever it logs a message, so we don’t need to pass the Correlation-ID through every function or method.

If you’re not using HTTP, you probably have an equivalent way to add request metadata, usually as some kind of request envelope property. If you’re using gRPC, Metadata is one way.
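For a message queue, for example, the envelope idea might be sketched like this. The envelope layout (`metadata` and `payload` keys) and the function names are assumptions for illustration, not the format of any particular broker.

```python
import json
import uuid
from typing import Optional, Tuple


def wrap_message(payload: dict, correlation_id: Optional[str] = None) -> str:
    """Serialize a payload inside an envelope that carries the Correlation-ID."""
    envelope = {
        # Reuse the ID of the request that caused this message, if there is one.
        "metadata": {"correlation_id": correlation_id or str(uuid.uuid4())},
        "payload": payload,
    }
    return json.dumps(envelope)


def unwrap_message(raw: str) -> Tuple[str, dict]:
    """Return (correlation_id, payload) from a serialized envelope."""
    envelope = json.loads(raw)
    return envelope["metadata"]["correlation_id"], envelope["payload"]


raw = wrap_message({"order_id": 42}, correlation_id="abc-123")
correlation_id, payload = unwrap_message(raw)
print(correlation_id, payload)
```

The consumer can place the extracted ID into its logging context before processing the payload, exactly as an HTTP middleware would.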

Putting It All Together

The notion of a Correlation-ID is simple. It’s a value that is common to all requests, log messages and responses for a given transaction.

Asynchronous programming is by nature hard to track. There is no guarantee of the sequence of events. Things happen when they happen: sometimes sooner, sometimes later, sometimes not at all. Having a common “tag” among the elements of a transaction provides a shared reference for logging and auditing. When something goes wrong in an asynchronous, distributed system, troubleshooting takes more than piecing together timestamps in a log. Including a Correlation-ID in requests and messages, and recording it as standard practice in logging, provides the glue needed to create a coherent, understandable audit of events.

When we can group a transaction’s events under a unifying value, the Correlation-ID, we can spend more time fixing the problem than finding it.
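To make the grouping concrete, here is a small sketch that rebuilds each transaction’s timeline from interleaved log events. The event records and their field names (`ts`, `correlation_id`, `service`, `msg`) are made up for illustration; real logs would come from your log store.

```python
from collections import defaultdict

# Hypothetical events from two services, interleaved and out of order.
events = [
    {"ts": 3, "correlation_id": "req-b", "service": "payments", "msg": "charged"},
    {"ts": 1, "correlation_id": "req-a", "service": "gateway", "msg": "received"},
    {"ts": 4, "correlation_id": "req-a", "service": "payments", "msg": "declined"},
    {"ts": 2, "correlation_id": "req-b", "service": "gateway", "msg": "received"},
]


def group_by_correlation_id(events: list) -> dict:
    """Return {correlation_id: [event descriptions in timestamp order]}."""
    grouped = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        grouped[event["correlation_id"]].append(f'{event["service"]}: {event["msg"]}')
    return dict(grouped)


for correlation_id, timeline in group_by_correlation_id(events).items():
    print(correlation_id, timeline)
```

Without the shared ID, the four events could only be matched by guessing from timestamps; with it, the failed transaction `req-a` can be read on its own.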

Tap the 👏 button if you found the article helpful.
Drop your thoughts and experiences with or without Correlation-IDs 😃 in the comments section. I’ll be happy to learn more. Thanks!

SUBHAM MAJAVADIYA

Senior Software Engineer at Gojek | Ex-Engati Chatbot Platform | NITD