Retry logic is the code that automatically re-attempts failed HTTP requests under appropriate conditions. In web scraping, failures are frequent and have varied causes: transient network errors (connection reset, timeout), rate limit responses (HTTP 429), temporary server errors (HTTP 500, 503), anti-bot challenges that require a different strategy, and CAPTCHA gates. Not all failures warrant a retry: a 404 Not Found, or a 403 Forbidden that reflects a permanent ban, will fail the same way on every attempt and should not be retried.
Effective retry logic uses exponential backoff with jitter: the delay doubles on each attempt (the first retry waits 1 second, the second 2 seconds, the third 4 seconds), and a random jitter is added to each delay so that multiple concurrent scrapers do not retry in synchronised storms. A maximum retry count (typically 3 to 5) prevents endless retries against permanently failing targets. Circuit breakers trip after a threshold of consecutive failures and pause further requests, giving an overwhelmed target server time to recover.
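The backoff, retry cap, and circuit breaker described above can be sketched in Python roughly as follows. This is a minimal illustration, not any particular library's implementation: the names `backoff_delay`, `CircuitBreaker`, and `fetch_with_retries`, and the defaults (30 second cap, jitter of up to 50% of the delay, 60 second cooldown) are assumptions chosen for the example.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=30.0, jitter=0.5):
    """Seconds to wait before retry number `attempt` (1-based).

    The delay doubles from `base` (1s, 2s, 4s, ...), is capped at `cap`,
    and gets a random extra of up to `jitter` times the delay.
    """
    delay = min(cap, base * 2 ** (attempt - 1))
    return delay + random.uniform(0, jitter * delay)


class CircuitBreaker:
    """Opens after `threshold` consecutive failures, for `cooldown` seconds."""

    def __init__(self, threshold=5, cooldown=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        """Return True if a request may be sent right now."""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown:
            # Cooldown elapsed: close the breaker and reset the failure count.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record(self, success):
        """Track the outcome of one request."""
        if success:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()


def fetch_with_retries(fetch, max_retries=3, breaker=None, sleep=time.sleep):
    """Call `fetch()` until it succeeds or `max_retries` retries are used up."""
    breaker = breaker or CircuitBreaker()
    last_error = None
    for attempt in range(max_retries + 1):
        if not breaker.allow():
            raise RuntimeError("circuit open: giving the target time to recover")
        try:
            result = fetch()
        except Exception as exc:  # in real code, catch only transient errors
            breaker.record(success=False)
            last_error = exc
            if attempt < max_retries:
                sleep(backoff_delay(attempt + 1))
            continue
        breaker.record(success=True)
        return result
    raise RuntimeError(f"gave up after {max_retries} retries") from last_error
```

Passing `sleep` and `clock` as parameters keeps the sketch testable without real waiting. In practice one breaker would usually be shared per target host, so that failures from all concurrent workers count towards the same threshold.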
Different failure types warrant different retry strategies: HTTP 429 should honour the Retry-After header if present; HTTP 503 can signal an anti-bot block as well as server overload, so repeated 503s suggest escalating to a higher anti-bot tier; connection timeouts suggest trying a different proxy IP; CAPTCHA responses require challenge resolution before retrying. AlterLab's retry layer handles these cases automatically: developers receive either a successful response or a descriptive error once all retry attempts are exhausted.
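As an illustration of this per-failure dispatch, the following Python sketch maps a failure to a next step. It is not AlterLab's implementation: the function names, the `RetryPlan` type, and the action labels ("wait", "escalate_tier", "rotate_proxy", "solve_captcha", "backoff", "give_up") are hypothetical, and the caller is assumed to have already detected timeouts and CAPTCHA pages. The Retry-After parser accepts both forms the header allows, a number of seconds or an HTTP date.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


@dataclass
class RetryPlan:
    action: str                           # hypothetical label for the next step
    wait_seconds: Optional[float] = None  # only set when the server asked for a delay


def parse_retry_after(value, now=None):
    """Convert a Retry-After header value to seconds, or None if unusable."""
    if not value:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)               # delay-seconds form, e.g. "120"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def plan_retry(status=None, headers=None, timed_out=False, captcha=False):
    """Pick a next step for a failed request, following the rules above."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    if captcha:
        return RetryPlan("solve_captcha")
    if timed_out:
        return RetryPlan("rotate_proxy")
    if status == 429:
        return RetryPlan("wait", parse_retry_after(headers.get("retry-after")))
    if status == 503:
        return RetryPlan("escalate_tier")
    if status is not None and 500 <= status < 600:
        return RetryPlan("backoff")
    # 403, 404 and anything else not listed: retrying is not expected to help.
    return RetryPlan("give_up")
```

For example, `plan_retry(status=429, headers={"Retry-After": "120"})` yields a "wait" plan with `wait_seconds` of 120.0, while a 429 without the header yields "wait" with `wait_seconds` left as None, in which case the caller would fall back to its normal backoff delay.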