reqwest Retry Middleware: Exponential Backoff with Tower

Plain reqwest::Client gives you one shot per request. That is fine for a CLI hitting httpbin.org, but in a long-running daemon — a webhook fan-out, a background sync, an LLM client — a single 503 should not propagate as a hard error. You want a retry policy with exponential backoff, jitter, and a clear rule for which failures retry and which do not.

There are two reasonable ways to bolt that on in Rust: use the reqwest-middleware + reqwest-retry pair, or wrap reqwest in a tower::Service stack and use tower::retry. Both produce a retrying client; they differ in how invasive the wiring is and how much you can compose.

The retry decision tree

Before picking a crate, decide what "retry" actually means for your call sites. The rules below are what I keep coming back to:

Network-layer error (DNS, TCP reset, TLS handshake fail, timeout): retry. The request likely never reached the server.
5xx response: retry 502, 503, and 504. A 500 is ambiguous because the server might have already mutated state, so retry it only if the request is idempotent.
408 Request Timeout or 429 Too Many Requests: retry, honoring the Retry-After header when the server sends one.
Any other 4xx: do not retry. The request itself is at fault; another attempt will not fix it.
Non-idempotent write without an idempotency key: do not retry. A POST /payments that returned a network error might have charged the card.

The last point is where most retry middleware goes wrong. Blindly retrying every POST on a network error is how you get duplicate writes. The mitigation is sending an Idempotency-Key header and letting the server deduplicate. Stripe's API documents this pattern in detail at https://docs.stripe.com/api/idempotent_requests, and it generalizes well: any write endpoint you control should accept and dedupe on a client-supplied key.
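The rules above can be sketched as a pure function with no HTTP client involved. This is a minimal illustration, not code from either crate: Outcome, Decision, decide, and the replay_safe flag are hypothetical names, and only the delta-seconds form of Retry-After is parsed (an HTTP-date value falls back to normal backoff).

```rust
use std::time::Duration;

/// One attempt's result, reduced to what the retry rules care about.
pub enum Outcome<'a> {
    /// DNS failure, TCP reset, TLS handshake failure, or timeout.
    NetworkError,
    /// An HTTP response, plus its Retry-After header value if present.
    Status { code: u16, retry_after: Option<&'a str> },
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    /// Retry after the normal backoff delay.
    Retry,
    /// Retry, waiting the duration the server asked for.
    RetryAfter(Duration),
    /// Hand the result back to the caller unchanged.
    Stop,
}

/// Parses only the delta-seconds form of Retry-After, e.g. "120".
fn retry_after_secs(value: &str) -> Option<Duration> {
    value.trim().parse::<u64>().ok().map(Duration::from_secs)
}

/// `replay_safe` is true when the method is idempotent or the request
/// carries an idempotency key.
pub fn decide(replay_safe: bool, outcome: &Outcome<'_>) -> Decision {
    // A write the server cannot deduplicate is never replayed.
    if !replay_safe {
        return Decision::Stop;
    }
    match outcome {
        Outcome::NetworkError => Decision::Retry,
        Outcome::Status { code, retry_after } => match *code {
            429 => retry_after
                .and_then(retry_after_secs)
                .map(Decision::RetryAfter)
                .unwrap_or(Decision::Retry),
            // 500 is only reached for replay-safe requests (checked above).
            408 | 500 | 502 | 503 | 504 => Decision::Retry,
            _ => Decision::Stop,
        },
    }
}
```

Keeping the classification separate from the transport makes it easy to unit-test, and the same function can back either of the two wiring options that follow.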

Option 1: reqwest-middleware + reqwest-retry

The simplest path. reqwest-middleware wraps a reqwest::Client in a builder that accepts middleware layers; reqwest-retry is one such layer with a configurable backoff policy.

use reqwest::Client;
use reqwest_middleware::{ClientBuilder, ClientWithMiddleware};
use reqwest_retry::{policies::ExponentialBackoff, RetryTransientMiddleware};
use std::time::Duration;

pub fn build_client() -> ClientWithMiddleware {
    let retry_policy = ExponentialBackoff::builder()
        .retry_bounds(Duration::from_millis(100), Duration::from_secs(10))
        .jitter(reqwest_retry::Jitter::Bounded)
        .base(2)
        .build_with_max_retries(4);

    ClientBuilder::new(
        Client::builder()
            .timeout(Duration::from_secs(30))
            .build()
            .expect("client init"),
    )
    .with(RetryTransientMiddleware::new_with_policy(retry_policy))
    .build()
}

ClientWithMiddleware exposes the same .get() / .post() API as reqwest::Client, so call sites do not change. Default classification retries on network errors and on every 5xx response, including 500 (and, as far as I know, on 408 and 429 as well); you override it by implementing RetryableStrategy.

use reqwest_retry::{Retryable, RetryableStrategy};
use reqwest_middleware::Error as MwError;

pub struct IdempotentOnly;

impl RetryableStrategy for IdempotentOnly {
    fn handle(
        &self,
        res: &Result<reqwest::Response, MwError>,
    ) -> Option<Retryable> {
        match res {
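            Ok(success) => match success.status().as_u16() {
                429 | 502 | 503 | 504 => Some(Retryable::Transient),
                // handle() sees only the result, not the request method, so it
                // cannot check idempotency itself. Retrying 500 and 408 here is
                // safe only if this strategy is attached to a client that sends
                // nothing but idempotent requests.
                500 | 408 => Some(Retryable::Transient),
                _ => None,
            },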
            Err(MwError::Reqwest(e)) if e.is_timeout() || e.is_connect() => {
                Some(Retryable::Transient)
            }
            Err(_) => Some(Retryable::Fatal),
        }
    }
}

Wire this strategy with RetryTransientMiddleware::new_with_policy_and_strategy. Each retry replays the request, so the middleware has to clone it before every attempt, and that only works when the request body is cloneable. A body built with Body::wrap_stream from a non-cloneable stream is not, so such a request is never retried (in the versions I have used, reqwest-retry reports the request as not cloneable instead of retrying it). Use bytes::Bytes or owned String bodies for anything you intend to retry.

A subtle gotcha: reqwest-retry retries the entire request lifecycle, including the redirect chain. If your call site already follows redirects via reqwest::redirect::Policy::limited(10), a 503 on the fourth hop of the redirect chain triggers a retry of the original request, not the redirected one. This is usually what you want, but it means total wall-clock time can grow faster than max_retries × max_interval would suggest.
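To put a number on that wall-clock budget, here is a rough sketch of the arithmetic. It is not reqwest-retry's internal schedule: backoff_delay and worst_case_budget are hypothetical helpers, jitter is treated as zero, and every attempt is assumed to run to its full timeout.

```rust
use std::time::Duration;

/// Un-jittered delay before retry number `retry` (0-based):
/// min_interval * base^retry, capped at max_interval.
pub fn backoff_delay(
    retry: u32,
    min_interval: Duration,
    max_interval: Duration,
    base: u32,
) -> Duration {
    let factor = base.saturating_pow(retry);
    min_interval.saturating_mul(factor).min(max_interval)
}

/// Upper bound on wall-clock time for one logical request: every attempt
/// runs to its timeout and every retry waits its full un-jittered delay.
pub fn worst_case_budget(
    max_retries: u32,
    per_attempt_timeout: Duration,
    min_interval: Duration,
    max_interval: Duration,
    base: u32,
) -> Duration {
    let sleeping: Duration = (0..max_retries)
        .map(|retry| backoff_delay(retry, min_interval, max_interval, base))
        .sum();
    // max_retries retries means max_retries + 1 attempts in total.
    sleeping + per_attempt_timeout * (max_retries + 1)
}
```

With the values from build_client (4 retries, 100 ms to 10 s bounds, base 2, 30 s timeout), the delays are 100 + 200 + 400 + 800 ms = 1.5 s, and five attempts at 30 s each add 150 s, for a worst case of 151.5 s. The timeouts dominate the backoff, which is worth knowing before a caller picks its own deadline.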

Option 2: tower::retry over a hyper Service

The tower route is more invasive but composable with the rest of the tower ecosystem — load shedding, concurrency limits, timeouts, tracing layers all stack the same way. You wrap reqwest::Client (or skip it and use hyper-util + hyper directly) inside a Service and apply tower::retry::RetryLayer.

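use std::time::Duration;
use tokio::time::{sleep, Sleep};
use tower::{retry::Policy, ServiceBuilder};

#[derive(Clone)]
struct ExpoBackoff {
    attempts_left: u32,
    // Delay before the next retry; doubled after every retry.
    base: Duration,
}

// Written against the tower 0.5 Policy trait: the policy mutates itself in
// place and the returned future resolves to (), so it only has to wait out
// the backoff delay.
impl<E> Policy<http::Request<bytes::Bytes>, http::Response<bytes::Bytes>, E>
    for ExpoBackoff
{
    type Future = Sleep;

    fn retry(
        &mut self,
        _req: &mut http::Request<bytes::Bytes>,
        result: &mut Result<http::Response<bytes::Bytes>, E>,
    ) -> Option<Self::Future> {
        let should_retry = match result {
            Ok(res) => matches!(res.status().as_u16(), 429 | 502 | 503 | 504),
            Err(_) => true,
        };
        if should_retry && self.attempts_left > 0 {
            self.attempts_left -= 1;
            let delay = self.base;
            self.base = self.base.saturating_mul(2);
            // No jitter yet; add it before running this under concurrency.
            Some(sleep(delay))
        } else {
            None
        }
    }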

    fn clone_request(
        &mut self,
        req: &http::Request<bytes::Bytes>,
    ) -> Option<http::Request<bytes::Bytes>> {
        let mut clone = http::Request::new(req.body().clone());
        *clone.method_mut() = req.method().clone();
        *clone.uri_mut() = req.uri().clone();
        *clone.headers_mut() = req.headers().clone();
        Some(clone)
    }
}

clone_request is the load-bearing method. If it returns None, the request is not retried — which is exactly what you want for a POST without an idempotency key. You can implement this conditional clone inline by inspecting the method and headers:

fn clone_request(
    &mut self,
    req: &http::Request<bytes::Bytes>,
) -> Option<http::Request<bytes::Bytes>> {
    if req.method() == http::Method::GET
        || req.method() == http::Method::HEAD
        || req.headers().contains_key("idempotency-key")
    {
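        // Safe to retry. rebuild_request is assumed to be the field-by-field
        // copy from the previous listing, pulled out into a helper.
        Some(rebuild_request(req))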
    } else {
        None
    }
}

Now non-idempotent writes without an idempotency key are never retried, regardless of error. This is the rule I want enforced at the framework level rather than scattered across call sites.
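The listing above only lets GET and HEAD through without a key. A slightly broader variant, sketched here on plain strings so it does not depend on the http crate, treats every method that RFC 9110 defines as idempotent as replayable. The is_replay_safe name and its signature are hypothetical; whether PUT and DELETE are really safe to replay against your upstream is an assumption you should verify.

```rust
/// Methods defined as idempotent by RFC 9110 (the safe methods plus PUT and DELETE).
const IDEMPOTENT_METHODS: [&str; 6] = ["GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"];

/// True when a request may be replayed: either its method is idempotent or it
/// carries an Idempotency-Key header the server can deduplicate on.
pub fn is_replay_safe(method: &str, header_names: &[&str]) -> bool {
    // HTTP method names are case-sensitive, so compare exactly.
    let idempotent = IDEMPOTENT_METHODS.iter().any(|m| *m == method);
    // Header names are case-insensitive.
    let has_key = header_names
        .iter()
        .any(|name| name.eq_ignore_ascii_case("idempotency-key"));
    idempotent || has_key
}
```

In the tower policy, the same check would sit at the top of clone_request and return None when it fails.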

Choosing between them

Use reqwest-middleware + reqwest-retry when your retry needs are HTTP-shaped: backoff, status-code classification, jitter. Default config is sensible, the surface area is small, and there is no Service<Request> boilerplate. Source: https://github.com/TrueLayer/reqwest-middleware.

Use tower::retry when you are already operating in tower-land — typically because you have an axum server using tower middleware on the inbound side and you want symmetry on the outbound side, or because you need to compose retry with tower::limit::ConcurrencyLimit, tower::timeout::Timeout, and tower::load_shed. The tower API is at https://github.com/tower-rs/tower.

In a recent daemon refactor, switching from a hand-rolled retry loop (matching on reqwest::Error flags) to reqwest-retry cut ~80 lines of boilerplate per HTTP client and shrank tail latency from a 2-attempt P99 of 4.2s down to 1.1s by enabling jitter. Without jitter, three concurrent failures all wake up at the same backoff boundary and re-collide on the upstream.
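The de-synchronizing effect of jitter is easy to see in a small sketch. This shows the "full jitter" variant (a delay drawn from the whole range below the exponential ceiling), which is not necessarily what reqwest-retry's Jitter::Bounded does. full_jitter is a hypothetical helper, and the random word is passed in by the caller so the function stays deterministic and testable; in real code it would come from a random number generator.

```rust
use std::time::Duration;

/// Full jitter: map a caller-supplied random word onto [0, ceiling), where
/// `ceiling` is the un-jittered exponential delay for this retry.
/// The modulo introduces a negligible bias, which is fine for backoff.
pub fn full_jitter(ceiling: Duration, random: u64) -> Duration {
    // Clamp to u64 so absurdly large ceilings cannot overflow the cast.
    let ceiling_nanos = ceiling.as_nanos().min(u64::MAX as u128) as u64;
    if ceiling_nanos == 0 {
        return Duration::ZERO;
    }
    Duration::from_nanos(random % ceiling_nanos)
}
```

Three clients that fail at the same instant and share an 800 ms ceiling now wake at three different offsets inside that window instead of all at 800 ms.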

Idempotency keys in practice

For any write you intend to retry, generate a UUID v4 client-side and attach it as Idempotency-Key:

let key = uuid::Uuid::new_v4().to_string();
let res = client
    .post("https://api.example.com/payments")
    .header("Idempotency-Key", &key)
    .json(&payload)
    .send()
    .await?;

Cache the key on disk if the request might survive a process restart (a sync daemon retrying yesterday's failed webhook should send the same key it sent yesterday — otherwise the server has no way to dedupe). SQLite + a small pending-requests table is sufficient; a Redis SETNX with a TTL works for shorter horizons.
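The shape of that pending-requests table can be sketched with an in-memory map standing in for SQLite. PendingKeys, key_for, and complete are hypothetical names, and the operation id (for example a webhook delivery id) is assumed to be something your daemon already has; a real version would persist each entry before sending the request.

```rust
use std::collections::HashMap;

/// In-memory stand-in for a durable pending-requests table,
/// mapping a logical operation to the idempotency key it was first sent with.
#[derive(Default)]
pub struct PendingKeys {
    by_operation: HashMap<String, String>,
}

impl PendingKeys {
    /// Returns the key already recorded for `operation_id`, or records and
    /// returns the one produced by `new_key` (e.g. a freshly generated UUID).
    pub fn key_for(
        &mut self,
        operation_id: &str,
        new_key: impl FnOnce() -> String,
    ) -> String {
        self.by_operation
            .entry(operation_id.to_owned())
            .or_insert_with(new_key)
            .clone()
    }

    /// Forget the key once the server has confirmed the write.
    pub fn complete(&mut self, operation_id: &str) {
        self.by_operation.remove(operation_id);
    }
}
```

The important property is that a second call for the same operation returns the original key rather than minting a new one, so a retry after a restart still deduplicates on the server.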

The combination — exponential backoff with jitter, status-code classification, and idempotency keys for retried writes — is what separates a retry layer that quietly heals from one that quietly creates duplicate state.

References: