Handling timeouts

The write-up below is meant to be a companion to the video above. Please watch the video to build a better understanding.

Effective management of timeouts is essential when services interact. For instance, in a scenario where a search service retrieves blog posts based on user queries and depends on an analytics service for supplementary data, a delay from the analytics service can cause timeout problems. This highlights the importance of addressing timeout issues to ensure seamless communication between services.

Inter-service communication can fail in several ways. Common problems include requests that never reach the intended service, responses that are lost to network disruptions, and services that respond slowly. From the caller's side these cases often look identical: no reply arrives. Knowing this is the starting point for a timeout strategy that makes communication more reliable.

Setting a timeout on every network call is crucial. It prevents the caller from waiting indefinitely for a response, which degrades the user experience. The timeout value should be chosen for the specific use case: too short, and requests that would have succeeded are abandoned; too long, and users are left waiting on a call that will never complete.
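As a minimal sketch of the idea, the function below bounds a call to the analytics service using the `timeout` parameter of Python's standard `http.client.HTTPConnection`. The `/views/<post_id>` path, the plain-integer response body, and the 0.5 second default are assumptions made for illustration, not details from the video.

```python
import http.client
import socket


def fetch_view_count(host: str, port: int, post_id: str,
                     timeout_seconds: float = 0.5) -> int:
    """Ask a hypothetical analytics service for a post's view count.

    Assumes the service answers GET /views/<post_id> with a plain
    integer in the body; both are illustrative choices.
    """
    # The timeout applies to connecting and to each blocking socket read.
    conn = http.client.HTTPConnection(host, port, timeout=timeout_seconds)
    try:
        conn.request("GET", f"/views/{post_id}")
        # Raises socket.timeout (an alias of TimeoutError since
        # Python 3.10) if no reply arrives within timeout_seconds.
        response = conn.getresponse()
        return int(response.read().decode().strip())
    finally:
        conn.close()
```

Without the `timeout` argument the call would block for as long as the socket stays open, so a stalled analytics service would stall every search request that depends on it.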

Ignoring timeouts is a common yet inadvisable practice, as it can result in unpredictable system behavior. A better strategy is to catch exceptions and manage them appropriately, ensuring users are informed about any timeout issues.
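One way to do this, sketched below, is to catch the timeout at the edge of the request handler and turn it into an explicit error response. The function name, the dictionary response shape, and the use of status 504 are assumptions for illustration; `fetch_views` stands in for whatever client call reaches the analytics service.

```python
import logging
from typing import Callable

logger = logging.getLogger("search")


def get_post_with_views(post_id: str,
                        fetch_views: Callable[[str], int]) -> dict:
    """Return a post's view count, or an explicit error on timeout."""
    try:
        views = fetch_views(post_id)
    except TimeoutError:
        # Record the failure for operators and tell the user what happened
        # instead of silently swallowing the exception.
        logger.warning("analytics timed out for post %s", post_id)
        return {
            "status": 504,
            "error": "Analytics service timed out. Please try again.",
        }
    return {"status": 200, "post_id": post_id, "views": views}
```

On Python versions before 3.10, socket timeouts are raised as `socket.timeout`, which is not a subclass of `TimeoutError`, so a real handler may need to catch both.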

In situations where there is a timeout, a practical solution is to utilize default values. For example, if the analytics service fails to respond, the search service can provide a default value, like returning zero views for a blog.
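The fallback described above could look roughly like this. The helper name and the `fetch_views` parameter are hypothetical; the default of zero views comes from the example in the text.

```python
from typing import Callable

DEFAULT_VIEWS = 0  # value shown when analytics does not answer in time


def views_or_default(post_id: str,
                     fetch_views: Callable[[str], int],
                     default: int = DEFAULT_VIEWS) -> int:
    """Return the view count, falling back to a default on timeout."""
    try:
        return fetch_views(post_id)
    except TimeoutError:
        # Search results are still useful without the exact view count,
        # so degrade gracefully rather than failing the whole request.
        return default
```

This only makes sense when the supplementary data is optional. If the caller cannot tell a real zero from a fallback zero and that matters, returning a separate flag alongside the value is a reasonable variation.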

Implementing retry logic after a timeout can be beneficial, particularly for read operations. However, it’s essential to avoid retries for non-idempotent actions, as these could lead to unintended outcomes, such as duplicate transactions.
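A small sketch of retrying a read after a timeout follows. The attempt count and the exponential backoff between attempts are assumptions added here, since the text does not prescribe a retry schedule; `operation` is any idempotent call such as fetching a view count.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_read(operation: Callable[[], T],
               attempts: int = 3,
               base_delay_seconds: float = 0.1,
               sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry an idempotent read when it times out.

    Only use this for operations that are safe to repeat, such as reads.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts: let the caller decide what to do
            # Wait a little longer before each new attempt.
            sleep(base_delay_seconds * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the waiting can be replaced in tests. Wrapping a write such as "charge the customer" in this helper would be exactly the mistake the paragraph warns about.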

Conditional retries execute a retry only when it is actually needed. After a timeout, the caller first checks whether the earlier operation succeeded, since the request may have been processed even though the response was lost, and retries only if it did not. This keeps retries safe and reduces the risk of side effects from unnecessary requests.
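The check-before-retry idea can be sketched as below. Both callables are hypothetical: `send_write` performs the non-idempotent operation, and `was_applied` asks the downstream service whether that operation already took effect (for example, by looking up a request ID).

```python
from typing import Callable


def write_with_conditional_retry(send_write: Callable[[], None],
                                 was_applied: Callable[[], bool],
                                 max_attempts: int = 3) -> bool:
    """Return True once the write is known to have been applied."""
    for _ in range(max_attempts):
        try:
            send_write()
            return True
        except TimeoutError:
            # The write may have succeeded even though the reply was lost,
            # so check its status before sending it again.
            if was_applied():
                return True
    return False  # still not applied after max_attempts tries
```

This sketch assumes the status check itself is reliable. If `was_applied` can also time out, the caller needs a policy for that case too, which is left out here.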

To enhance the resilience of your solution, consider re-architecting it to reduce synchronous dependencies. By adopting an event-driven approach or integrating necessary data into services, you can diminish reliance on synchronous communication, ultimately resulting in a more robust architecture.
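To illustrate the event-driven option, here is a toy in-memory sketch in which the search service keeps its own copy of view counts and updates it from events. The class, the `on_post_viewed` handler, and the event shape are all hypothetical; a real system would receive these events from a message broker and persist the counts.

```python
from typing import Dict, List


class SearchService:
    """Search service that owns a local copy of view counts."""

    def __init__(self) -> None:
        self._views: Dict[str, int] = {}

    def on_post_viewed(self, event: dict) -> None:
        """Handle a 'post viewed' event published by analytics."""
        post_id = event["post_id"]
        self._views[post_id] = self._views.get(post_id, 0) + 1

    def search(self, post_ids: List[str]) -> List[dict]:
        """Build results from local data; no synchronous analytics call."""
        return [
            {"post_id": post_id, "views": self._views.get(post_id, 0)}
            for post_id in post_ids
        ]
```

Because `search` never calls the analytics service, a slow or unavailable analytics service cannot cause it to time out. The trade-off is that the local counts may lag slightly behind the source of truth.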

Staff Engg at GCP Memorystore, Creator of DiceDB, ex-Staff Engg for Google Ads and GCP Dataproc, ex-Amazon Fast Data, ex-Director of Engg. SRE and Data Engineering at Unacademy. I spark engineering curiosity through my no-fluff engineering videos on YouTube and my courses