Intermediate · 4 steps

How to Avoid Getting Blocked When Scraping

Web scrapers that send requests too fast, use identifiable patterns, or send unusual headers get blocked. Consistent, reliable data collection requires managing request pacing, rotating identifiers, and handling compatibility requirements automatically.

Step-by-Step Guide

Use realistic request pacing

Add a delay between requests to a single domain; 1 to 5 seconds is a reasonable baseline. Randomize the delay slightly, because a fixed interval is easier to detect.
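As a minimal sketch of per-domain pacing, the helper below remembers when each domain was last contacted and sleeps only for the remaining part of a randomized 1 to 5 second delay. The DomainPacer class and its parameters are illustrative assumptions, not part of any AlterLab API. The clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import random
import time
from urllib.parse import urlparse


class DomainPacer:
    """Hypothetical helper: enforce a randomized delay per domain."""

    def __init__(self, min_delay=1.0, max_delay=5.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # domain -> time of the previous request

    def wait(self, url):
        """Sleep if needed before requesting `url`; return seconds slept."""
        domain = urlparse(url).netloc
        now = self._clock()
        slept = 0.0
        if domain in self._last_request:
            # Draw a fresh random delay each time so the interval is not fixed
            delay = random.uniform(self.min_delay, self.max_delay)
            elapsed = now - self._last_request[domain]
            slept = max(0.0, delay - elapsed)
            if slept > 0:
                self._sleep(slept)
        self._last_request[domain] = self._clock()
        return slept
```

Call pacer.wait(url) immediately before each request. The first request to a domain goes out at once, and requests to different domains do not delay each other.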

Send realistic HTTP headers

Include Accept, Accept-Language, and Accept-Encoding headers in your requests. A browser User-Agent string is often required. AlterLab sets realistic headers automatically when using its browser rendering tier.
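If you send requests yourself rather than through the browser rendering tier, a header set along these lines is one way to start. This is a sketch using only the standard library; the header values, including the User-Agent string, are illustrative examples rather than values recommended by AlterLab, and build_request is a hypothetical helper name.

```python
import urllib.request

# Illustrative values only; copy real ones from a current browser session.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate",
}


def build_request(url, extra_headers=None):
    """Return an unsent urllib Request carrying browser-like headers."""
    headers = dict(BROWSER_HEADERS)  # copy so the defaults are never mutated
    headers.update(extra_headers or {})
    return urllib.request.Request(url, headers=headers)
```

The same dictionary can be passed as the headers argument of requests.get or requests.post. Note that urllib does not decompress response bodies automatically, so when Accept-Encoding is set you would need to decompress the body yourself; the requests library handles gzip and deflate for you.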

Let AlterLab handle compatibility automatically

AlterLab's 5-tier auto-escalation system selects the appropriate browser environment and request configuration for each URL. This handles compatibility requirements without manual configuration.

Implement retry logic with backoff

When a request returns a non-200 status or a challenge page, wait and retry with exponential backoff. AlterLab handles retries at the infrastructure level — check the status_code field in the response.
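For requests you send yourself, exponential backoff can be sketched as follows. The function name retry_with_backoff, the fetch callable (assumed to return a (status_code, body) tuple), and the default delays are all assumptions made for illustration; this is not how AlterLab implements its own retries.

```python
import random
import time


def retry_with_backoff(fetch, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       is_success=lambda status, body: status == 200,
                       sleep=time.sleep):
    """Call `fetch()` until `is_success` accepts the result or attempts run out.

    `fetch` is a hypothetical zero-argument callable returning a
    (status_code, body) tuple. The last result is returned either way.
    """
    status_code, body = None, None
    for attempt in range(max_attempts):
        status_code, body = fetch()
        if is_success(status_code, body):
            break
        if attempt < max_attempts - 1:
            # Delay doubles on each attempt: base, 2*base, 4*base, ... capped
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Add up to 10% random jitter so retries do not line up
            sleep(delay + random.uniform(0, delay * 0.1))
    return status_code, body
```

Passing a custom is_success predicate lets you treat a challenge page that arrives with a 200 status as a failure and retry it as well.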

Code Example

Python

import requests
import time
import random

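def fetch_with_pacing(urls: list[str], api_key: str) -> list[dict]:
    """Scrape each URL through the AlterLab API, pausing between requests."""
    results = []
    for index, url in enumerate(urls):
        response = requests.post(
            "https://alterlab.io/api/v1/scrape",
            headers={"X-API-Key": api_key, "Content-Type": "application/json"},
            json={"url": url, "render_js": True},
            timeout=60,  # assumed value; avoids hanging on a stalled connection
        )
        results.append(response.json())

        # Randomized delay between requests; skipped after the last URL
        if index < len(urls) - 1:
            time.sleep(random.uniform(1.5, 3.5))

    return results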

Pass your AlterLab API key as the api_key argument. No credit card required.

Try this yourself with AlterLab

Run this tutorial on live websites with AlterLab's API. Free tier includes 5,000 requests — no credit card required.


Frequently Asked Questions

What response status codes indicate my scraper is being limited?

HTTP 429 (Too Many Requests), 403 (Forbidden), and 503 (Service Unavailable) are common indicators of rate limiting or blocks. Challenge pages may return a 200 status with HTML content instead of your target data.
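A rough way to act on these signals is to classify each response before parsing it. The sketch below uses the three status codes named above; the challenge-page phrases are hypothetical placeholders, since real challenge pages vary by site, and classify_response is an illustrative name.

```python
BLOCK_STATUS_CODES = {403, 429, 503}

# Hypothetical phrases; real challenge pages vary by site and provider.
CHALLENGE_MARKERS = ("captcha", "verify you are human", "unusual traffic")


def classify_response(status_code, body):
    """Return 'blocked', 'challenge', 'other', or 'ok' for a response."""
    if status_code in BLOCK_STATUS_CODES:
        return "blocked"
    if status_code != 200:
        return "other"  # some other status; not necessarily a block
    lowered = body.lower()
    if any(marker in lowered for marker in CHALLENGE_MARKERS):
        return "challenge"  # 200 status, but not the data you asked for
    return "ok"
```

Substring matching like this produces false positives on pages that legitimately mention these phrases, so treat a 'challenge' result as a prompt to inspect the page rather than as proof of a block.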

Responsible Use

AlterLab is designed for extracting publicly available data. Always review the terms of service for any website you access, respect robots.txt directives, and ensure your use case complies with applicable laws in your jurisdiction.

Your first scrape.
Sixty seconds.

$1 free credit — up to 5,000 scrapes. No credit card.
Just a POST request.

terminal

curl -X POST https://api.alterlab.io/v1/scrape \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'


No credit card required · $1 free credit, up to 5,000 scrapes · Balance never expires
