HTTP status codes are grouped into five classes by their first digit. 1xx codes are informational (rarely encountered by scrapers). 2xx codes signal success: 200 OK (content returned), 204 No Content (success with an empty body). 3xx codes cover redirection: 301 Moved Permanently, 302 Found (temporary redirect), 304 Not Modified (the cached copy is still valid, so no body is sent).
4xx codes indicate client-side errors: 400 Bad Request (malformed request), 401 Unauthorized (authentication required or credentials rejected), 403 Forbidden (request understood but refused, whether or not credentials were sent), 404 Not Found (resource absent), 429 Too Many Requests (rate limited; check the Retry-After header). 5xx codes indicate server-side errors: 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout.
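The grouping by first digit can be sketched in a few lines of Python. This is only an illustration: the function name `status_class` and the label strings are assumptions made for this example, while `http.HTTPStatus` is the standard-library enum used to look up reason phrases.

```python
from http import HTTPStatus

# Labels for the five classes, keyed by the first digit of the status code.
# The label strings are illustrative, not an official vocabulary.
_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the class label for a status code, or "unknown" if out of range."""
    return _CLASSES.get(code // 100, "unknown")


if __name__ == "__main__":
    for code in (200, 204, 301, 304, 403, 429, 503):
        # HTTPStatus(code).phrase gives the standard reason phrase.
        print(code, HTTPStatus(code).phrase, "->", status_class(code))
```

A scraper would typically branch on the class first and only then on the specific codes it treats specially, such as 403 and 429.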
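For scrapers, 403 and 429 are the most important to handle correctly. A 403 typically means the IP address or session has been blocked, so repeating the same request unchanged rarely helps; rotating proxies or solving a challenge is usually needed. A 429 means the request rate should be reduced: respect the Retry-After header when it is present (its value is either a number of seconds or an HTTP date), and otherwise apply exponential backoff. 5xx errors are often transient and worth retrying, ideally with backoff and a limit on the number of attempts.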
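The handling rules above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not a definitive policy: the names `parse_retry_after`, `backoff_delay`, and `plan_retry`, the action labels, the 1-second base, the 60-second cap, and the limit of 5 attempts are all hypothetical choices made for illustration. Only the standard library is used, so the logic can be wired into whichever HTTP client the scraper already relies on.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional, Tuple


def parse_retry_after(value: Optional[str],
                      now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After value (seconds or HTTP date) to seconds to wait."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header: let the caller fall back to backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  jitter: bool = True) -> float:
    """Exponential backoff: base * 2**attempt, capped, optionally randomized."""
    delay = min(cap, base * (2 ** attempt))
    # Jitter spreads out retries so parallel workers do not retry in lockstep.
    return random.uniform(0.0, delay) if jitter else delay


def plan_retry(status: int, retry_after: Optional[str], attempt: int,
               max_attempts: int = 5, jitter: bool = True) -> Tuple[str, float]:
    """Return (action, delay_seconds) for a response status.

    Actions (illustrative labels): "ok", "rotate", "wait", "give_up".
    """
    if 200 <= status < 300:
        return ("ok", 0.0)
    if attempt >= max_attempts:
        return ("give_up", 0.0)
    if status == 403:
        # Likely a block: switch proxy or session rather than simply waiting.
        return ("rotate", 0.0)
    if status == 429:
        wait = parse_retry_after(retry_after)
        if wait is None:
            wait = backoff_delay(attempt, jitter=jitter)
        return ("wait", wait)
    if 500 <= status < 600:
        return ("wait", backoff_delay(attempt, jitter=jitter))
    return ("give_up", 0.0)  # other 4xx codes: retrying will not help
```

For example, `plan_retry(429, "30", attempt=0)` returns `("wait", 30.0)`, and a caller would sleep for that long before sending the request again. Keeping the decision separate from the network call makes the policy easy to test without hitting a real server.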