When a request fails due to a transient error — a 429 Too Many Requests, a 503 Service Unavailable, or a network timeout — immediately retrying the request at full speed risks overwhelming the server and receiving the same error repeatedly. Exponential backoff addresses this by waiting progressively longer between retries: 1 second, 2 seconds, 4 seconds, 8 seconds, and so on, typically with a random jitter added to prevent a thundering-herd effect when many clients retry simultaneously.
A standard formula is `wait = min(cap, base * 2^attempt + jitter)`, where `cap` limits the maximum wait time, `base` is the initial wait (often 0.5–1 second), and `jitter` is a random value up to the current wait time. Most scraping frameworks and HTTP client libraries provide built-in retry utilities with configurable backoff parameters.
For rate-limited endpoints, exponential backoff is the polite and effective strategy. AlterLab's worker layer implements backoff internally, so callers receive results without managing retry logic themselves.