Polly for resilience and fault handling (retries, circuit breaker)

Polly is a .NET resilience and transient-fault-handling library that allows developers to express policies such as Retry, Circuit Breaker, Timeout, and Fallback in a fluent and thread-safe manner. It's invaluable for building applications that can gracefully handle failures, network hiccups, and other common issues in distributed systems. This tutorial provides practical examples of using Polly to improve the robustness of your C# applications.

Installation

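Before you can use Polly, you need to install it from NuGet. In Visual Studio, open the NuGet Package Manager Console and run Install-Package Polly, or search for Polly in the NuGet Package Manager UI. From a terminal, you can run dotnet add package Polly instead. The examples in this tutorial use the Policy-based fluent API (the Polly 7.x style).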

Basic Retry Policy

This code demonstrates a basic retry policy that retries an operation up to 3 times (4 attempts in total) if it throws an HttpRequestException. Handle<HttpRequestException>() specifies the type of exception to handle. RetryAsync(3, ...) configures the policy to retry up to 3 times and logs each retry attempt. The ExecuteAsync method wraps the code that might fail. If the last attempt also fails, the exception propagates to the caller.

using Polly;
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class RetryExample
{
    public static async Task Run()
    {
        var retryPolicy = Policy
            .Handle<HttpRequestException>()
            .RetryAsync(3, (exception, retryCount) =>
            {
                Console.WriteLine($"Retry {retryCount} due to: {exception.Message}");
            });

        try
        {
            await retryPolicy.ExecuteAsync(async () =>
            {
                // Simulate an HTTP request that might fail
                HttpResponseMessage response = await SimulateHttpRequest();
                response.EnsureSuccessStatusCode(); // Throw exception for bad status codes
                Console.WriteLine("Request successful!");
            });
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Request failed after multiple retries: {ex.Message}");
        }
    }

    private static int _attempts; // Number of simulated requests made so far

    private static async Task<HttpResponseMessage> SimulateHttpRequest()
    {
        await Task.Delay(50); // Simulate network latency

        // Simulate a failing request on the first two attempts
        if (++_attempts <= 2)
        {
            throw new HttpRequestException("Simulated network error.");
        }

        // Simulate a successful request
        return new HttpResponseMessage(System.Net.HttpStatusCode.OK);
    }
}

Explanation of Retry Policy Components

  • Handle<TException>(): Specifies the type of exception that the policy should handle. To react to return values such as HTTP status codes, use HandleResult<TResult>() or OrResult<TResult>() with a predicate.
  • RetryAsync(int retryCount, Action<Exception, int> onRetry): Configures the policy to retry up to retryCount times. The onRetry delegate is executed before each retry attempt, allowing you to log or perform other actions.
  • ExecuteAsync(Func<Task> action): Executes the provided asynchronous action within the context of the retry policy.
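Handling status codes as well as exceptions can be sketched as follows. This is a minimal example assuming the Polly 7.x API; the class StatusCodeRetrySketch, the helper SendWithRetryAsync, and the "5xx means retry" rule are illustrative choices, not part of Polly.

```csharp
using Polly;
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class StatusCodeRetrySketch
{
    // Hypothetical helper: runs 'send' and retries up to 3 times when it
    // returns a 5xx response or throws HttpRequestException.
    public static Task<HttpResponseMessage> SendWithRetryAsync(
        Func<Task<HttpResponseMessage>> send)
    {
        var retryPolicy = Policy
            .HandleResult<HttpResponseMessage>(r => (int)r.StatusCode >= 500)
            .Or<HttpRequestException>()
            .RetryAsync(3, (outcome, retryCount) =>
            {
                // 'outcome' carries either the exception or the handled result
                string reason = outcome.Exception?.Message
                    ?? $"status {(int)outcome.Result.StatusCode}";
                Console.WriteLine($"Retry {retryCount} due to: {reason}");
            });

        return retryPolicy.ExecuteAsync(send);
    }
}
```

Because the policy is typed on HttpResponseMessage, the caller receives the last response even when every attempt returned a handled status code, so it should still check the status before using the body.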

Circuit Breaker Policy

This code demonstrates a circuit breaker policy. After two consecutive HttpRequestException failures, the circuit opens for 30 seconds; during that time calls are rejected immediately with a BrokenCircuitException instead of reaching the service. The CircuitBreakerAsync(2, TimeSpan.FromSeconds(30), ...) call configures this behavior. The policy transitions through three states: Closed (normal operation), Open (requests blocked), and Half-Open (attempting to recover). The delegates are passed in the order onBreak, onReset, onHalfOpen, and each one runs on the matching state transition. The onBreak delegate runs when the circuit opens, either from Closed or after a failed trial call in Half-Open. The onHalfOpen delegate runs when the break duration has elapsed and the circuit moves from Open to Half-Open. The onReset delegate runs when a trial call succeeds and the circuit moves from Half-Open back to Closed. Note that the loop below runs for only about 5 seconds, so it never outlasts the 30-second break.

using Polly;
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class CircuitBreakerExample
{
    public static async Task Run()
    {
        var circuitBreakerPolicy = Policy
            .Handle<HttpRequestException>()
            .CircuitBreakerAsync(2, TimeSpan.FromSeconds(30),
                onBreak: (exception, timespan) =>
                {
                    Console.WriteLine($"Circuit broken for {timespan.TotalSeconds} seconds due to: {exception.Message}");
                },
                // onReset comes before onHalfOpen in the parameter list,
                // so the delegates are named to avoid swapping them
                onReset: () =>
                {
                    Console.WriteLine("Circuit reset. Normal operation resumed.");
                },
                onHalfOpen: () =>
                {
                    Console.WriteLine("Circuit half-open. Attempting to recover...");
                });

        for (int i = 0; i < 5; i++)
        {
            try
            {
                await circuitBreakerPolicy.ExecuteAsync(async () =>
                {
                    HttpResponseMessage response = await SimulateHttpRequest();
                    response.EnsureSuccessStatusCode();
                    Console.WriteLine("Request successful!");
                });
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Request failed: {ex.Message}");
            }
            await Task.Delay(1000); // Wait for 1 second between requests
        }
    }

    private static int _attempts; // Number of simulated requests made so far

    private static async Task<HttpResponseMessage> SimulateHttpRequest()
    {
        await Task.Delay(50); // Simulate network latency

        // Simulate a failing request for the first two attempts,
        // which is enough to open the circuit
        if (++_attempts <= 2)
        {
            throw new HttpRequestException("Simulated network error.");
        }

        // Simulate a successful request
        return new HttpResponseMessage(System.Net.HttpStatusCode.OK);
    }
}

Explanation of Circuit Breaker Policy Components

  • CircuitBreakerAsync(int handledEventsAllowedBeforeBreaking, TimeSpan durationOfBreak, Action<Exception, TimeSpan> onBreak, Action onReset, Action onHalfOpen): Configures the circuit breaker. handledEventsAllowedBeforeBreaking specifies how many consecutive exceptions of the handled type are allowed before the circuit opens. durationOfBreak specifies how long the circuit remains open. onBreak is executed when the circuit breaks, onReset when it resets, and onHalfOpen when it enters the half-open state. Note the order: onReset comes before onHalfOpen.
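The breaking behavior described above can be observed deterministically with a delegate that always fails. The following is a minimal sketch assuming the Polly 7.x API; CircuitStateSketch and RunAsync are hypothetical names used only for illustration.

```csharp
using Polly;
using Polly.CircuitBreaker;
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class CircuitStateSketch
{
    // Hypothetical demo: calls an always-failing operation three times and
    // reports how often it actually ran and what state the circuit ended in.
    public static async Task<(int DelegateCalls, CircuitState State, bool Rejected)> RunAsync()
    {
        AsyncCircuitBreakerPolicy breaker = Policy
            .Handle<HttpRequestException>()
            .CircuitBreakerAsync(2, TimeSpan.FromSeconds(30));

        int delegateCalls = 0;
        bool rejected = false;

        for (int i = 0; i < 3; i++)
        {
            try
            {
                await breaker.ExecuteAsync(() =>
                {
                    delegateCalls++;
                    return Task.FromException(new HttpRequestException("Simulated outage."));
                });
            }
            catch (HttpRequestException)
            {
                // The first two calls reach the operation and fail normally
            }
            catch (BrokenCircuitException)
            {
                // The third call is rejected without running the operation
                rejected = true;
            }
        }

        return (delegateCalls, breaker.CircuitState, rejected);
    }
}
```

Reading CircuitState this way is also a simple hook for health checks or monitoring, since it shows whether the policy is currently blocking calls.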

Real-Life Use Case

Imagine your application interacts with a third-party API that is sometimes unreliable. Without resilience, your application might experience cascading failures. Using Polly, you can implement a retry policy to automatically retry failed requests, potentially mitigating temporary network issues or server overloads. You can also use a circuit breaker policy to prevent your application from overwhelming the third-party API with requests when it is already unavailable, giving it time to recover. Polly also works with IHttpClientFactory, which manages the HttpClient lifecycle: the Microsoft.Extensions.Http.Polly package lets you attach policies to named or typed clients so that every request sent through them is covered.
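As a rough sketch of the IHttpClientFactory integration, the snippet below registers a named client with a retry policy. It assumes the Microsoft.Extensions.Http.Polly package (which brings in Polly.Extensions.Http) and Microsoft.Extensions.DependencyInjection are installed; the client name "thirdParty" and the base address are placeholders, not real endpoints.

```csharp
using Microsoft.Extensions.DependencyInjection;
using Polly;
using Polly.Extensions.Http;
using System;
using System.Net.Http;

public static class HttpClientFactorySketch
{
    public static ServiceProvider BuildServices()
    {
        // Handles HttpRequestException, 5xx responses, and 408 responses
        var retryPolicy = HttpPolicyExtensions
            .HandleTransientHttpError()
            .RetryAsync(3);

        var services = new ServiceCollection();

        // "thirdParty" and the URL are placeholder values for illustration
        services
            .AddHttpClient("thirdParty", client =>
            {
                client.BaseAddress = new Uri("https://api.example.com/");
            })
            .AddPolicyHandler(retryPolicy);

        return services.BuildServiceProvider();
    }
}
```

A consumer would then resolve IHttpClientFactory and call CreateClient("thirdParty"); requests sent through that client pass through the policy without any retry code at the call site.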

Best Practices

  • Choose the right policy for the situation: Retries are useful for transient faults, while circuit breakers are better for preventing cascading failures.
  • Configure retry delays intelligently: Avoid overwhelming failing services with rapid retries. Use exponential backoff to gradually increase the delay between retries.
  • Monitor your policies: Track circuit breaker state and retry attempts to gain insights into system health.
  • Combine policies: You can combine multiple policies, such as a retry policy wrapped around a circuit breaker policy, to create a comprehensive resilience strategy.
  • Logging is important: Log all events (retries, circuit breaker state changes) to have a clear understanding of what happened and why.
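The exponential backoff advice above can be sketched with WaitAndRetryAsync. This is a minimal example assuming the Polly 7.x API; BackoffSketch, the Delay helper, and the 100 ms base are illustrative choices rather than recommended values.

```csharp
using Polly;
using Polly.Retry;
using System;
using System.Net.Http;

public static class BackoffSketch
{
    // Delay before retry number 'attempt' (1-based): 200 ms, 400 ms, 800 ms
    public static TimeSpan Delay(int attempt) =>
        TimeSpan.FromMilliseconds(100 * Math.Pow(2, attempt));

    // Retries up to 3 times, doubling the wait before each retry
    public static AsyncRetryPolicy Create() => Policy
        .Handle<HttpRequestException>()
        .WaitAndRetryAsync(
            3,
            attempt => Delay(attempt),
            (exception, delay) =>
                Console.WriteLine(
                    $"Waiting {delay.TotalMilliseconds} ms before retrying: {exception.Message}"));
}
```

In production code it is common to add a small random jitter to each delay so that many clients do not retry at the same instant.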

Interview Tip

Be prepared to discuss the different types of resilience policies (retry, circuit breaker, timeout, fallback) and when each is most appropriate. Explain how Polly helps improve the robustness of applications by gracefully handling failures, and how to use policies with IHttpClientFactory. Be ready to show example code implementing these policies and to explain exponential backoff.

When to use them

  • Retry policies: Use when dealing with transient faults like temporary network glitches, database connection issues, or temporary server overloads.
  • Circuit breaker policies: Use when interacting with unreliable services that might experience prolonged outages. This prevents your application from continuously attempting to access an unavailable service.
  • Timeout policies: Use when an operation is expected to complete within a certain time and shouldn't block indefinitely.
  • Fallback policies: Use when an operation fails, and you want to provide a default value or execute alternative logic instead of throwing an exception.
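Timeout and fallback are not shown elsewhere in this tutorial, so here is a minimal sketch of the two working together, assuming the Polly 7.x API. TimeoutFallbackSketch, GetGreetingAsync, the 200 ms limit, and the fallback string are all hypothetical.

```csharp
using Polly;
using Polly.Timeout;
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutFallbackSketch
{
    // Hypothetical helper: runs 'call' with a 200 ms timeout and returns a
    // default value instead of throwing when the timeout is hit.
    public static Task<string> GetGreetingAsync(
        Func<CancellationToken, Task<string>> call)
    {
        // Optimistic timeout: it cancels the token it passes to the delegate,
        // so the delegate must honor that token for the timeout to work
        var timeout = Policy.TimeoutAsync<string>(TimeSpan.FromMilliseconds(200));

        // Replace a timeout failure with a default value
        var fallback = Policy<string>
            .Handle<TimeoutRejectedException>()
            .FallbackAsync("default greeting");

        // Fallback is the outer policy, timeout the inner one
        return fallback
            .WrapAsync(timeout)
            .ExecuteAsync(ct => call(ct), CancellationToken.None);
    }
}
```

For example, passing a delegate that awaits Task.Delay(5000, ct) yields the fallback value, while a delegate that completes quickly returns its own result.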

Memory footprint

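Polly itself has a relatively small memory footprint, and policy objects are lightweight. Memory usage depends mainly on how you use them. Creating a new policy for every call adds avoidable allocations, so define policies once and reuse them; a circuit breaker in particular must be shared to track failures across calls. Long retry delays keep the pending operation and any state it captures alive until the last attempt finishes, so many concurrent executions with long waits can add up. This is generally not a significant concern, but it's good to be aware of it, especially in memory-constrained environments.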

Alternatives

  • Hand-rolled solutions: You can implement retry logic and circuit breakers manually, but this is often more complex and error-prone than using a dedicated library like Polly.
  • Service meshes (e.g., Istio, Linkerd): Service meshes provide resilience features at the infrastructure level, but they require more complex setup and configuration.
  • Resilience4j (Java): A popular Java library similar to Polly for resilience and fault tolerance.

Pros

  • Simple and fluent API: Polly provides a clean and easy-to-use API for defining resilience policies.
  • Comprehensive set of policies: Polly supports a wide range of policies, including retry, circuit breaker, timeout, fallback, and more.
  • Thread-safe: Polly policies are thread-safe and can be used concurrently.
  • Extensible: Polly allows you to create custom policies to address specific resilience requirements.
  • Integration with HttpClientFactory: Seamless integration with HttpClientFactory for managing HttpClient lifecycle and resilience.

Cons

  • Dependency on Polly library: Introduces a dependency on an external library.
  • Configuration overhead: Requires configuration of policies, which can add some complexity.

FAQ

  • What is the difference between Retry and Circuit Breaker?

    The Retry policy attempts to re-execute an operation if it fails. It's suitable for transient faults. The Circuit Breaker policy prevents an operation from being executed if it has failed repeatedly. It's designed to protect against prolonged outages and cascading failures.
  • How do I combine multiple Polly policies?

    You can nest policies using the WrapAsync method. For example, you can wrap a Retry policy around a Circuit Breaker policy to first retry a failing operation and then, if it continues to fail, open the circuit breaker to prevent further attempts.
  • How do I test my Polly policies?

    You can use mocking frameworks to simulate failures and verify that your Polly policies are behaving as expected. You can also use integration tests to test the policies against real services.
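The answers on combining and testing policies can be illustrated together. The sketch below, assuming the Polly 7.x API, wraps a retry policy around a circuit breaker and drives it with a deterministic failing delegate, which is the same technique a unit test would use. WrapSketch and RunAsync are hypothetical names.

```csharp
using Polly;
using Polly.CircuitBreaker;
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class WrapSketch
{
    // Returns how many times the failing operation actually ran
    public static async Task<int> RunAsync()
    {
        var breaker = Policy
            .Handle<HttpRequestException>()
            .CircuitBreakerAsync(2, TimeSpan.FromSeconds(30));

        var retry = Policy
            .Handle<HttpRequestException>()
            .RetryAsync(3);

        // First argument is the outermost policy: retry wraps the breaker
        var resilient = Policy.WrapAsync(retry, breaker);

        int calls = 0;
        try
        {
            await resilient.ExecuteAsync(() =>
            {
                calls++;
                return Task.FromException(new HttpRequestException("Simulated outage."));
            });
        }
        catch (BrokenCircuitException)
        {
            // After two failures the breaker opens; the retry policy does not
            // handle BrokenCircuitException, so retrying stops here
            Console.WriteLine("Circuit open, giving up.");
        }

        return calls;
    }
}
```

Because the delegate fails every time, the outcome does not depend on timing: the operation runs twice, the circuit opens, and the next retry is rejected before it reaches the operation.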