Batch Scraping
Submit up to 100 URLs in a single API call. Jobs are processed in parallel and you can poll for results or receive a webhook when the batch completes.
Async by Design
Batch submission returns 202 Accepted immediately with a batch_id. Use it to poll status or configure a webhook for delivery.
How It Works
Submit
POST an array of URLs to /api/v1/batch. Credits are pre-debited for the estimated total cost.
Process
Each URL becomes an individual job processed in parallel. The same tier escalation and anti-bot logic applies to every job.
Collect
Poll GET /api/v1/batch/{batch_id} for individual results, or receive a batch.completed webhook when all jobs finish.
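The submit-then-poll flow above can be sketched with plain HTTP using Python's standard library. The paths and headers are taken from the examples on this page; build_payload, submit_batch, and wait_for_batch are illustrative names, not SDK functions:

```python
import json
import time
import urllib.request

API = "https://api.alterlab.io/api/v1"

def build_payload(urls, webhook_url=None):
    """Build the batch request body: bare strings become {"url": ...} objects."""
    payload = {"urls": [{"url": u} if isinstance(u, str) else u for u in urls]}
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url
    return payload

def submit_batch(api_key, urls, webhook_url=None):
    """POST the batch; the 202 body carries batch_id and estimated_credits."""
    req = urllib.request.Request(
        f"{API}/batch",
        data=json.dumps(build_payload(urls, webhook_url)).encode(),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def wait_for_batch(api_key, batch_id, interval=2.0):
    """Poll the batch endpoint until it leaves the 'processing' state."""
    while True:
        req = urllib.request.Request(
            f"{API}/batch/{batch_id}", headers={"X-API-Key": api_key}
        )
        with urllib.request.urlopen(req) as resp:
            status = json.load(resp)
        if status["status"] != "processing":
            return status
        time.sleep(interval)
```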
Submit a Batch
POST /api/v1/batch
Submit a batch of URLs for asynchronous processing with optional webhook delivery. Returns 202 Accepted.
```bash
curl -X POST https://api.alterlab.io/api/v1/batch \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      { "url": "https://example.com/page-1" },
      { "url": "https://example.com/page-2", "mode": "js" },
      {
        "url": "https://example.com/page-3",
        "formats": ["text", "markdown"],
        "cost_controls": { "max_tier": "3" }
      }
    ],
    "webhook_url": "https://your-server.com/webhook"
  }'
```
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| urls | array | Yes | Array of URL objects (1–100 items) |
| webhook_url | string | No | URL to receive a batch.completed event when all jobs finish |
Per-URL Options
Each item in the urls array accepts the same parameters as a single scrape request:
| Field | Default | Description |
|---|---|---|
| url | — | Target URL (required) |
| mode | "auto" | auto, html, js, pdf, ocr |
| formats | ["html", "json"] | Output formats: text, json, html, markdown |
| cost_controls | null | force_tier, max_tier, prefer_cost, prefer_speed, fail_fast |
| extraction_schema | null | JSON schema for structured extraction |
| cache | false | Enable caching for this URL |
| timeout | 90 | Per-URL timeout in seconds (1–300) |
| wait_for | null | CSS selector to wait for before extracting |
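Put together, a single batch can mix these per-URL options freely. A hypothetical request body combining defaults with per-URL overrides (the values follow the table above; the URLs and selector are placeholders):

```python
# Hypothetical batch body: each item overrides only what it needs.
batch_body = {
    "urls": [
        # All defaults: mode "auto", formats ["html", "json"], 90 s timeout.
        {"url": "https://example.com/page-1"},
        # Force JS rendering and wait for a selector before extracting.
        {"url": "https://example.com/page-2", "mode": "js", "wait_for": "#content"},
        # Cap the escalation tier, pick output formats, stretch the timeout.
        {
            "url": "https://example.com/page-3",
            "formats": ["text", "markdown"],
            "cost_controls": {"max_tier": "3"},
            "timeout": 120,
        },
    ],
    "webhook_url": "https://your-server.com/webhook",
}
```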
Response (202 Accepted):
```json
{
  "batch_id": "batch_7a3b9c8d-4e2f-1a6b-8c5d-9e0f1a2b3c4d",
  "total_urls": 3,
  "status": "processing",
  "estimated_credits": 45000,
  "job_ids": [
    "a1b2c3d4-...",
    "e5f6g7h8-...",
    "i9j0k1l2-..."
  ]
}
```
Poll Batch Status
GET /api/v1/batch/{batch_id}
Returns the aggregate status and individual job results for a batch.
```bash
curl https://api.alterlab.io/api/v1/batch/batch_7a3b9c8d-... \
  -H "X-API-Key: your_api_key"
```
Status Response
```json
{
  "batch_id": "batch_7a3b9c8d-...",
  "status": "partial",
  "total": 3,
  "completed": 2,
  "failed": 1,
  "pending": 0,
  "items": [
    {
      "job_id": "a1b2c3d4-...",
      "url": "https://example.com/page-1",
      "status": "succeeded",
      "result": { "text": "...", "metadata": { ... } }
    },
    {
      "job_id": "e5f6g7h8-...",
      "url": "https://example.com/page-2",
      "status": "succeeded",
      "result": { "text": "...", "metadata": { ... } }
    },
    {
      "job_id": "i9j0k1l2-...",
      "url": "https://example.com/page-3",
      "status": "failed",
      "error": "BLOCKED_BY_ANTIBOT"
    }
  ],
  "created_at": "2026-03-03T12:00:00Z"
}
```
Batch status values:
| Status | Meaning |
|---|---|
| processing | Some jobs are still running |
| completed | All jobs succeeded |
| partial | All jobs done, some failed |
| failed | All jobs failed |
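In client code, the four statuses map naturally onto actions. A small sketch of that mapping (the function name is illustrative, not part of the SDK):

```python
def next_action(batch):
    """Decide what to do with a polled batch response, per the status table above."""
    status = batch["status"]
    if status == "processing":
        return "keep polling"
    if status == "completed":
        return "collect all results"
    if status == "partial":
        failed = [i["url"] for i in batch.get("items", []) if i["status"] == "failed"]
        return f"collect results, retry {len(failed)} failed URL(s)"
    if status == "failed":
        return "inspect errors before retrying"
    raise ValueError(f"unknown batch status: {status}")
```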
Webhook Delivery
If you provide a webhook_url, AlterLab sends a batch.completed POST when every job in the batch finishes (succeeded or failed).
```json
{
  "event": "batch.completed",
  "timestamp": "2026-03-03T12:01:00Z",
  "data": {
    "batch_id": "batch_7a3b9c8d-...",
    "status": "partial",
    "total": 3,
    "completed": 2,
    "failed": 1
  }
}
```
Webhook is fire-once: the batch.completed event is delivered a single time, so if your endpoint misses it, fall back to polling the batch status endpoint.
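On the receiving side, a webhook endpoint only needs to parse the payload above and acknowledge quickly. A minimal sketch with Python's standard library; the parsing helper is split out so the handler stays thin, and the port and names are illustrative:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def summarize_event(event):
    """Turn a batch.completed payload (shape shown above) into a log line."""
    if event.get("event") != "batch.completed":
        return None
    d = event["data"]
    return (f"batch {d['batch_id']}: {d['status']} "
            f"({d['completed']}/{d['total']} succeeded, {d['failed']} failed)")

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        line = summarize_event(json.loads(self.rfile.read(length)))
        if line:
            print(line)  # Real code would enqueue follow-up work here, not block.
        # Delivery is fire-once: acknowledge immediately with a 200.
        self.send_response(200)
        self.end_headers()

# To run: HTTPServer(("", 8000), WebhookHandler).serve_forever()
```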
Billing & Credits
- Credits are pre-debited when the batch is submitted, based on estimated per-URL cost.
- If a job fails, the worker automatically refunds the credits for that URL.
- BYOP (Bring Your Own Proxy) discounts apply per-URL if you have an active proxy integration.
- Check estimated_credits in the submit response to see the total debited.
Limits
Batch Limits
- Maximum 100 URLs per batch request
- Each URL counts as a separate scrape credit-wise
- Batch jobs are processed in parallel (no guaranteed order)
- Batch metadata expires after 24 hours — poll or use webhooks before then
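With more than 100 URLs, split the list into multiple batch requests client-side. A small helper sketch (not part of the SDK):

```python
def chunk_urls(urls, size=100):
    """Split a URL list into batches of at most `size` (the API caps a batch at 100)."""
    if not 1 <= size <= 100:
        raise ValueError("batch size must be between 1 and 100")
    return [urls[i:i + size] for i in range(0, len(urls), size)]
```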
Python Example
```python
import time

import alterlab

client = alterlab.AlterLab(api_key="your_api_key")

# Submit batch
batch = client.batch_scrape(
    urls=[
        {"url": "https://example.com/page-1"},
        {"url": "https://example.com/page-2", "mode": "js"},
        {"url": "https://example.com/page-3", "formats": ["markdown"]},
    ],
    webhook_url="https://your-server.com/webhook",
)
print(f"Batch ID: {batch['batch_id']}")
print(f"Estimated credits: {batch['estimated_credits']}")

# Poll until complete
while True:
    status = client.get_batch_status(batch["batch_id"])
    print(f"Status: {status['status']} — {status['completed']}/{status['total']} done")
    if status["status"] != "processing":
        break
    time.sleep(2)

# Process results
for item in status["items"]:
    if item["status"] == "succeeded":
        print(f"✓ {item['url']}: {len(item['result'].get('text', ''))} chars")
    else:
        print(f"✗ {item['url']}: {item.get('error', 'unknown')}")
```
Node.js Example
```javascript
import AlterLab from "alterlab";

const client = new AlterLab({ apiKey: "your_api_key" });

// Submit batch
const batch = await client.batchScrape({
  urls: [
    { url: "https://example.com/page-1" },
    { url: "https://example.com/page-2", mode: "js" },
    { url: "https://example.com/page-3", formats: ["markdown"] },
  ],
  webhookUrl: "https://your-server.com/webhook",
});
console.log(`Batch ID: ${batch.batchId}`);

// Poll until complete
let status;
do {
  await new Promise((r) => setTimeout(r, 2000));
  status = await client.getBatchStatus(batch.batchId);
  console.log(`${status.completed}/${status.total} done`);
} while (status.status === "processing");

// Process results
for (const item of status.items) {
  if (item.status === "succeeded") {
    console.log(`✓ ${item.url}`);
  } else {
    console.log(`✗ ${item.url}: ${item.error}`);
  }
}
```