Imagine how much easier your life would be if you could use debug output in production.

Often, when maintaining services, you deal with support tickets that describe bugs but do not contain enough information to diagnose and fix the problem.

In such cases your options are usually limited to contacting the support engineer for additional information, trying to recreate the user environment in which the bug occurred, or searching the logs for signs of the error.

Unfortunately, none of these options is ideal because each takes too much time. Interaction with the user, as well as attempts to recreate the user environment, might drag on for several days. Searching for relevant information in log files is like looking for a needle in a haystack.

Regardless of your choice, ticket resolution time will increase, while the user grows more and more disappointed in the service.

Imagine that you want to buy an airline ticket at a time-limited promo fare or place a bid in an auction, but because of some bug and a slow customer support response you miss the opportunity. If you faced a similar problem again, you would probably leave that service and never come back.

Using conditional debugging in production can help reduce the time needed to solve a user’s problem.

Conditional debugging usage scenario

Consider a scenario in which conditional debugging is used to diagnose a user’s inability to buy a ticket through your service:

  1. Customer support receives a complaint from a user who is unable to buy a ticket
  2. A support engineer enables conditional debugging for that particular user
  3. The user tries to buy a ticket again
  4. The support engineer sends the gathered information to the developers so they can fix the error

Conditional debugging technique overview

To enable debugging of a user’s requests, the user must be assigned a debug token. This token is sent to the user in a cookie. It acts as the unique id of the debug session and contains logging configuration parameters for system components.
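As a rough illustration of delivering the token in a cookie, the sketch below builds a Set-Cookie header value with Python's standard http.cookies module. The cookie name debug_token, the one-hour lifetime, and the Secure/HttpOnly attributes are assumptions made for this example; the article does not prescribe them.

```python
from http.cookies import SimpleCookie


def debug_cookie_header(token, max_age_seconds=3600):
    """Return a Set-Cookie header value carrying the debug token.

    The cookie name and all attributes below are illustrative assumptions.
    """
    cookie = SimpleCookie()
    cookie["debug_token"] = token
    cookie["debug_token"]["max-age"] = max_age_seconds  # limit the debug session lifetime
    cookie["debug_token"]["path"] = "/"
    cookie["debug_token"]["secure"] = True    # send over HTTPS only
    cookie["debug_token"]["httponly"] = True  # hide from page scripts
    return cookie["debug_token"].OutputString()


header_value = debug_cookie_header("abc123")
```

Limiting the cookie lifetime is one way to make sure a debug session does not stay enabled indefinitely.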

On each subsequent request from the user, the webserver validates the received debug token. If the token passes validation, the webserver proxies the original request onward, setting the X-Request-Id header to a generated unique identifier. If the token fails validation, no header is added to the request.

Every system component that receives a request with an X-Request-Id header must add the header value to the logging output associated with that request.
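One possible way for a component to attach the request ID to its log lines is sketched below, using Python's standard logging.LoggerAdapter. The logger name "payments", the log format, and the "-" placeholder for requests without the header are assumptions chosen for illustration.

```python
import io
import logging


def build_logger(stream):
    """Create a logger whose format includes a request_id field."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s [%(request_id)s] %(message)s")
    )
    logger = logging.getLogger("payments")  # hypothetical component name
    logger.handlers = [handler]
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    return logger


def logger_for_request(logger, headers):
    """Wrap the logger so every record carries the X-Request-Id value."""
    request_id = headers.get("X-Request-Id", "-")  # "-" when not debugging
    return logging.LoggerAdapter(logger, {"request_id": request_id})


stream = io.StringIO()
log = logger_for_request(build_logger(stream), {"X-Request-Id": "abc123"})
log.debug("charging card")
```

With the request ID present in every line, the log output of all components can later be filtered by a single identifier.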

[Figure: conditional debugging scheme]

Debug token

The token consists of:

  • unique id of debug session
  • logging parameters

The debug token is generated when a debug session is started for a user. At that point you can specify logging parameters for each system component.

The webserver checks the integrity of the debug token in order to prevent forgery and DoS attacks through unauthorized intensive logging.
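The article does not say how integrity is checked. One common approach, assumed here purely for illustration, is to append an HMAC-SHA256 tag computed with a secret known only to the server. The key value, the "." separator, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical server-side secret


def sign_token(payload, key=SECRET_KEY):
    """Return 'base64(payload).base64(tag)' for the given payload bytes."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii")
        + "."
        + base64.urlsafe_b64encode(tag).decode("ascii")
    )


def verify_token(token, key=SECRET_KEY):
    """Return the payload if the tag matches, otherwise None."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:  # wrong number of parts or malformed base64
        return None
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing
    return payload if hmac.compare_digest(tag, expected) else None
```

A token whose payload or tag has been altered fails verification, so a client cannot switch on intensive logging by crafting its own token.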

Token parameters

Token parameters are represented as sequential TLV (Type-Length-Value) entries that describe the logging configuration for each system component. Token parameters are encoded with base64. Storing the configuration as TLV makes it possible to build flexible, space-efficient configuration schemes for different services, where every service knows only its own configuration scheme and the corresponding TLV entry type.
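A minimal sketch of such an encoding follows. It assumes a one-byte type and a one-byte length per entry, which the article does not specify; the type ids used in the example are hypothetical as well.

```python
import base64


def encode_tlv(entries):
    """Encode (type, value_bytes) pairs as base64 of sequential TLV entries."""
    out = bytearray()
    for type_id, value in entries:
        if not 0 <= type_id <= 255 or len(value) > 255:
            raise ValueError("type and length must each fit in one byte")
        out += bytes([type_id, len(value)]) + value
    return base64.b64encode(bytes(out)).decode("ascii")


def decode_tlv(encoded):
    """Decode base64 TLV data into a {type: value_bytes} mapping."""
    raw = base64.b64decode(encoded)
    entries = {}
    pos = 0
    while pos < len(raw):
        if pos + 2 > len(raw):
            raise ValueError("truncated TLV header")
        type_id, length = raw[pos], raw[pos + 1]
        pos += 2
        if pos + length > len(raw):
            raise ValueError("truncated TLV value")
        entries[type_id] = raw[pos:pos + length]
        pos += length
    return entries


# Hypothetical type ids: 1 = webserver, 2 = database layer
params = encode_tlv([(1, b"\x04"), (2, b"sql")])
```

Each service would look up only the entry whose type it owns and ignore the rest, which is what keeps the per-service schemes independent.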

Unique Request ID (X-Request-Id)

The unique request ID consists of:

  • debug token
  • random id, unique within the current debug session

The webserver generates X-Request-Id when it receives a request with a valid debug token from the user.

The webserver drops any X-Request-Id header received from the client in order to prevent token forgery.
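The webserver behaviour described in this section could look roughly like the sketch below. The cookie name debug_token, the ":" separator between token and random id, and the is_valid_token callback are assumptions introduced for the example, not part of the described scheme.

```python
import secrets


def build_upstream_headers(client_headers, cookies, is_valid_token):
    """Return the headers to forward to backend services.

    is_valid_token is a caller-supplied check (hypothetical) that returns
    True only for tokens that pass the integrity validation.
    """
    # Always drop a client-supplied X-Request-Id; header names are
    # case-insensitive, so compare in lower case.
    headers = {
        name: value
        for name, value in client_headers.items()
        if name.lower() != "x-request-id"
    }
    token = cookies.get("debug_token")  # hypothetical cookie name
    if token and is_valid_token(token):
        # Request ID = debug token + random id for this request
        headers["X-Request-Id"] = f"{token}:{secrets.token_hex(8)}"
    return headers
```

Because the client header is removed before the token is even looked at, backend services can treat any X-Request-Id they see as having been issued by the webserver.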

Processing requests with X-Request-Id header

Every service that receives a request with an X-Request-Id header must check the logging configuration parameters in the debug token to determine which logging policies apply to the current request.
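A service-side check might be sketched as follows. It assumes the request ID has the form "token:random-id", that the service owns TLV type 2, and that the first byte of its entry is a numeric Python logging level. All three are illustrative assumptions, and decode_params stands in for whatever token decoder the service uses.

```python
import logging

SERVICE_TLV_TYPE = 2  # hypothetical type id owned by this service
DEFAULT_LEVEL = logging.WARNING


def level_for_request(headers, decode_params, default=DEFAULT_LEVEL):
    """Pick the log level for one request from its X-Request-Id header.

    decode_params is a caller-supplied function (hypothetical) that maps a
    debug token to a {tlv_type: value_bytes} dictionary.
    """
    request_id = headers.get("X-Request-Id")
    if request_id is None:
        return default  # normal request, no debug session
    token, separator, _random_id = request_id.rpartition(":")
    if not separator:
        return default  # malformed request id
    value = decode_params(token).get(SERVICE_TLV_TYPE)
    if not value:
        return default  # token carries no entry for this service
    return value[0]  # first byte interpreted as the log level
```

Requests without the header fall back to the default level, so normal traffic is logged exactly as before.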

Conclusion

Simplifying debugging is crucial for any service: it speeds up the processing of support tickets, reduces unnecessary communication with the customer, and has an overall positive impact on customer experience.

Using the described technique together with a carefully thought-out logging configuration lets you gather almost any information about request processing without reconfiguring services.