WebSocket Real-Time Updates
Stream live status updates for your scraping jobs over WebSocket. Get instant notifications when jobs are queued, running, succeeded, or failed, with no polling required.
When to Use WebSocket
AlterLab offers three ways to get notified about job results. WebSocket is ideal when you need instant, bidirectional communication — for example, powering a live dashboard or processing results the moment they arrive.
WebSocket
Best for live dashboards, real-time UIs, and processing results instantly as they arrive. Persistent connection, lowest latency.
Webhooks
Best for server-to-server integrations and background pipelines. Fire-and-forget delivery to your HTTP endpoint.
Job Polling
Simplest option. Periodically check job status via REST API. Good for scripts and one-off jobs where latency is not critical.
Connection Setup
Connect to the WebSocket endpoint at wss://alterlab.io/api/v1/ws/jobs. Authentication is required before the connection is accepted.
Authentication
Pass your API key using one of these methods:
Option 1: Query Parameter (Recommended)
wss://alterlab.io/api/v1/ws/jobs?api_key=sk_live_your_key_here
Option 2: Sec-WebSocket-Protocol Header
Some WebSocket clients (e.g., browsers) cannot set custom request headers, but they can set the subprotocol. Pass the key in the Sec-WebSocket-Protocol header instead:
Sec-WebSocket-Protocol: sk_live_your_key_here
Connection Rejection
If authentication fails, the server closes the connection with code 4001 and a reason string. Common reasons: missing API key, invalid key, disabled key, or suspended account.
On successful authentication, the server sends a connected event confirming the connection is established.
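The query-parameter option and the rejection code can be sketched with the standard library alone. The helper names `build_ws_url` and `is_auth_rejection` are illustrative assumptions, not part of any AlterLab SDK; only the endpoint and close code 4001 come from the text above.

```python
from urllib.parse import urlencode

WS_ENDPOINT = "wss://alterlab.io/api/v1/ws/jobs"
AUTH_REJECTED_CLOSE_CODE = 4001  # close code documented for rejected connections


def build_ws_url(api_key: str) -> str:
    """Return the endpoint URL with the API key as a query parameter (Option 1)."""
    if not api_key:
        raise ValueError("API key is required")
    # urlencode escapes any characters that are not safe in a query string.
    return f"{WS_ENDPOINT}?{urlencode({'api_key': api_key})}"


def is_auth_rejection(close_code: int) -> bool:
    """True if the close code indicates the server rejected authentication."""
    return close_code == AUTH_REJECTED_CLOSE_CODE
```

A client would probably want to stop retrying when `is_auth_rejection` is true, since reconnecting with the same key is unlikely to succeed.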
Protocol
All messages are JSON-encoded text frames. The client sends action messages, and the server responds with typed events.
Client Messages
| Action | Payload | Description |
|---|---|---|
| subscribe | {"action":"subscribe","job_id":"<uuid>"} | Subscribe to updates for a specific job. You must own the job (submitted with your API key). |
| unsubscribe | {"action":"unsubscribe","job_id":"<uuid>"} | Stop receiving updates for a specific job. |
| ping | {"action":"ping"} | Client-initiated keep-alive. Server responds with pong. |
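As a minimal sketch, the three client actions in the table can be built with small helpers. The function names are hypothetical, and the client-side UUID check is an assumption added for early error detection; the server's own validation rules are not described here.

```python
import json
import uuid


def _validated(job_id: str) -> str:
    """Raise ValueError early if job_id is not a well-formed UUID (assumed format)."""
    return str(uuid.UUID(job_id))


def subscribe_message(job_id: str) -> str:
    """Build the JSON text frame for a subscribe action."""
    return json.dumps({"action": "subscribe", "job_id": _validated(job_id)})


def unsubscribe_message(job_id: str) -> str:
    """Build the JSON text frame for an unsubscribe action."""
    return json.dumps({"action": "unsubscribe", "job_id": _validated(job_id)})


def ping_message() -> str:
    """Build the JSON text frame for a client-initiated keep-alive."""
    return json.dumps({"action": "ping"})
```

Each helper returns a string that can be passed directly to a WebSocket client's send method as a text frame.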
Server Messages
| Type | When |
|---|---|
| connected | Immediately after successful authentication |
| subscribed | Confirmation after a successful subscribe action |
| unsubscribed | Confirmation after a successful unsubscribe action |
| job_update | When a subscribed job changes status (queued, running, succeeded, failed) |
| heartbeat | Every 30 seconds to keep the connection alive |
| pong | Response to a client ping action |
| error | Invalid JSON, missing job_id, unknown action, or access denied |
Event Payloads
Every server message is a JSON object with a type field and a Unix ts timestamp. Below are the payloads for each event type.
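Because every server message shares the type and ts fields, a client can parse that envelope once before dispatching. The sketch below assumes ts is in seconds (the example values have ten digits); `ServerEvent` and `parse_event` are illustrative names.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ServerEvent:
    type: str          # e.g. "connected", "job_update", "heartbeat"
    sent_at: datetime  # ts converted to an aware UTC datetime
    payload: dict      # the full decoded message, including type and ts


def parse_event(raw: str) -> ServerEvent:
    """Decode one text frame and validate the common envelope fields."""
    data = json.loads(raw)
    if not isinstance(data, dict) or "type" not in data or "ts" not in data:
        raise ValueError("expected a JSON object with 'type' and 'ts'")
    # Assumption: ts is a Unix timestamp in seconds, not milliseconds.
    sent_at = datetime.fromtimestamp(data["ts"], tz=timezone.utc)
    return ServerEvent(type=data["type"], sent_at=sent_at, payload=data)
```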
connected
Sent once after authentication succeeds.
{
  "type": "connected",
  "message": "WebSocket connection established",
  "ts": 1730451136
}
subscribed
Confirms subscription to a job. Immediately followed by a job_update with the current job status.
{
  "type": "subscribed",
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "ts": 1730451140
}
job_update
Sent whenever a subscribed job transitions to a new status. The result field is populated when status is succeeded, and error when status is failed.
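One way to act on that rule is a small function that tells the caller whether the job has finished. This is a sketch, not a prescribed client design; `resolve_job_update` and `JobFailed` are hypothetical names.

```python
from typing import Any, Optional, Tuple


class JobFailed(Exception):
    """Raised when a job_update reports status 'failed'."""


def resolve_job_update(event: dict) -> Tuple[bool, Optional[Any]]:
    """Return (done, result) for a job_update event.

    done is False while the job is queued or running. A failed job raises
    JobFailed carrying the server's error string.
    """
    status = event["status"]
    if status == "succeeded":
        return True, event["result"]
    if status == "failed":
        raise JobFailed(event["error"])
    return False, None
```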
Queued:
{
  "type": "job_update",
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued",
  "result": null,
  "error": null,
  "ts": 1730451140
}
Running:
{
  "type": "job_update",
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "running",
  "result": null,
  "error": null,
  "ts": 1730451142
}
Succeeded:
{
  "type": "job_update",
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "succeeded",
  "result": {
    "url": "https://example.com",
    "status_code": 200,
    "text": "<!doctype html>...",
    "extracted_data": { ... }
  },
  "error": null,
  "ts": 1730451148
}
Failed:
{
  "type": "job_update",
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "result": null,
  "error": "Target URL returned 403 after all retry tiers exhausted",
  "ts": 1730451155
}
heartbeat
Sent by the server every 30 seconds to keep the connection alive and detect stale clients. No client response is needed.
{
  "type": "heartbeat",
  "ts": 1730451166
}
error
Sent when the server cannot process a client message.
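{
  "type": "error",
  "message": "Job not found or access denied: 550e8400-...",
  "ts": 1730451170
}
Job Ownership
You can only subscribe to jobs submitted with your own API key. Subscribing to any other job returns an error event like the one above.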
Code Examples
Full working examples for connecting, subscribing to a job, and handling events with reconnection logic.
import asyncio
import json

import websockets

API_KEY = "sk_live_your_key_here"
WS_URL = f"wss://alterlab.io/api/v1/ws/jobs?api_key={API_KEY}"


async def listen_for_updates(job_id: str):
    """Connect to WebSocket and stream job updates."""
    async with websockets.connect(WS_URL) as ws:
        # Wait for connection confirmation
        msg = json.loads(await ws.recv())
        print(f"Connected: {msg}")

        # Subscribe to job
        await ws.send(json.dumps({
            "action": "subscribe",
            "job_id": job_id
        }))

        # Listen for events
        async for raw in ws:
            event = json.loads(raw)
            if event["type"] == "job_update":
                print(f"Job {event['job_id']}: {event['status']}")
                if event["status"] == "succeeded":
                    print(f"Result: {event['result']}")
                    return event["result"]
                if event["status"] == "failed":
                    raise Exception(f"Job failed: {event['error']}")
            elif event["type"] == "heartbeat":
                pass  # Connection is alive
            elif event["type"] == "error":
                print(f"Error: {event['message']}")


# Usage
# result = asyncio.run(listen_for_updates("your-job-id"))
Reconnection Strategy
WebSocket connections can drop due to network changes, server deployments, or idle timeouts. Always implement reconnection logic in production applications.
1. Exponential Backoff
Start with a 1-second delay and double it on each consecutive failure, up to a maximum of 30 seconds. Reset the delay after a successful connection.
2. Re-Subscribe After Reconnect
Subscriptions are tied to the connection. After reconnecting, you must re-send subscribe messages for all active jobs.
3. Monitor Heartbeats
If you do not receive a heartbeat within 60 seconds (two missed intervals), assume the connection is dead and reconnect.
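The backoff schedule from step 1 and the heartbeat deadline from step 3 can be sketched as two small helpers that do not depend on any WebSocket library. `backoff_delays` and `HeartbeatWatchdog` are hypothetical names, and the injectable clock is an assumption made so the logic can be tested without waiting.

```python
import time
from typing import Callable, Iterator


def backoff_delays(initial: float = 1.0, maximum: float = 30.0) -> Iterator[float]:
    """Yield reconnect delays: start at `initial`, double each time, cap at `maximum`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, maximum)


class HeartbeatWatchdog:
    """Tracks how long it has been since the last heartbeat."""

    def __init__(self, timeout: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout
        self._clock = clock
        self._last_seen = clock()  # treat connection time as the first beat

    def beat(self) -> None:
        """Call whenever a heartbeat event arrives."""
        self._last_seen = self._clock()

    def is_stale(self) -> bool:
        """True once more than `timeout` seconds have passed without a beat."""
        return self._clock() - self._last_seen > self._timeout
```

A receive loop could call `beat()` on each heartbeat event and close the socket when `is_stale()` returns true, creating a fresh `backoff_delays()` generator after each successful connection to reset the delay.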
import asyncio
import json

import websockets

API_KEY = "sk_live_your_key_here"
WS_URL = f"wss://alterlab.io/api/v1/ws/jobs?api_key={API_KEY}"


async def robust_listener(job_ids: list[str], on_update):
    """WebSocket listener with exponential backoff reconnection."""
    delay = 1
    max_delay = 30
    while True:
        try:
            async with websockets.connect(WS_URL) as ws:
                delay = 1  # Reset on successful connect

                # Wait for connected event
                await ws.recv()

                # Subscribe to all jobs
                for job_id in job_ids:
                    await ws.send(json.dumps({
                        "action": "subscribe",
                        "job_id": job_id
                    }))

                # Listen for events
                async for raw in ws:
                    event = json.loads(raw)
                    if event["type"] == "job_update":
                        on_update(event)
        except (websockets.ConnectionClosed, OSError) as e:
            print(f"Disconnected: {e}. Retrying in {delay}s...")
            await asyncio.sleep(delay)
            delay = min(delay * 2, max_delay)
WebSocket vs Polling vs Webhooks
| Feature | WebSocket | Webhooks | Polling |
|---|---|---|---|
| Latency | Instant (sub-second) | Near-instant (1-5s) | Depends on interval |
| Direction | Bidirectional | Server to client | Client to server |
| Setup complexity | Medium (connection management) | Medium (HTTP endpoint + verification) | Low (simple HTTP calls) |
| Infrastructure | Client-side only | Public HTTPS endpoint | Client-side only |
| Best for | Live dashboards, real-time UIs | Backend pipelines, automation | Scripts, simple integrations |
| Delivery guarantee | At-most-once (lost if disconnected) | At-least-once (retries on failure) | At-most-once (may miss if not polling) |
Combining Approaches
Best Practices
1. Always Implement Reconnection
Network interruptions are inevitable. Use exponential backoff and re-subscribe to all active jobs after reconnecting.
2. Monitor Heartbeats
Set a 60-second timer that resets on every heartbeat. If it fires, the connection is likely dead — close and reconnect.
3. Subscribe After Submitting Jobs
Submit the scrape job via REST API first, then subscribe to it via WebSocket. This ensures you have the job ID before subscribing.
4. Unsubscribe From Completed Jobs
Once a job reaches succeeded or failed, unsubscribe to free resources, or close the connection if you are done.
5. Handle All Event Types
Even if you only care about succeeded, always handle error and failed events to avoid silent hangs.
6. Keep Your API Key Secure
In browser applications, connect through your own backend proxy instead of embedding the API key in client-side JavaScript.
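Practices 1 and 4 both depend on knowing which jobs are still active. A minimal sketch of that bookkeeping follows; `SubscriptionTracker` is a hypothetical helper that only produces the JSON frames, and sending them is left to whichever WebSocket client is in use.

```python
import json
from typing import List, Optional

TERMINAL_STATUSES = ("succeeded", "failed")


class SubscriptionTracker:
    """Remembers active job subscriptions so they can be replayed after a reconnect."""

    def __init__(self) -> None:
        self._active = set()

    def add(self, job_id: str) -> str:
        """Track a job and return the subscribe frame to send."""
        self._active.add(job_id)
        return json.dumps({"action": "subscribe", "job_id": job_id})

    def on_event(self, event: dict) -> Optional[str]:
        """Return an unsubscribe frame when a tracked job reaches a terminal status."""
        job_id = event.get("job_id")
        if (event.get("type") == "job_update"
                and event.get("status") in TERMINAL_STATUSES
                and job_id in self._active):
            self._active.discard(job_id)
            return json.dumps({"action": "unsubscribe", "job_id": job_id})
        return None

    def resubscribe_messages(self) -> List[str]:
        """Frames to re-send after a reconnect, one per job still active."""
        return [json.dumps({"action": "subscribe", "job_id": j})
                for j in sorted(self._active)]
```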
Limits & Constraints
- Heartbeat interval: 30 seconds (server-initiated)
- Slow client timeout: 5 seconds per message send — clients that cannot keep up are automatically disconnected
- Authentication: Required before connection is accepted (API key via query parameter or protocol header)
- Job ownership: You can only subscribe to jobs submitted with your own API key
- Subscriptions: Multiple jobs per connection, multiple connections per user
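Given the slow client timeout, one plausible precaution is to keep the receive loop free of slow work by handing events to a separate task. The sketch below assumes an async iterator of text frames (such as a WebSocket connection object) and a hypothetical async `handler`; the queue is unbounded, so it trades memory for read speed.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable, Callable


async def pump(frames: AsyncIterator[str],
               handler: Callable[[dict], Awaitable[None]]) -> int:
    """Read frames as fast as they arrive and process them in a worker task.

    Returns the number of frames read once the stream ends and the queue drains.
    """
    queue: asyncio.Queue = asyncio.Queue()

    async def worker() -> None:
        while True:
            event = await queue.get()
            try:
                await handler(event)
            except Exception as exc:  # keep the worker alive if one event fails
                print(f"Handler error: {exc}")
            finally:
                queue.task_done()

    task = asyncio.create_task(worker())
    count = 0
    try:
        async for raw in frames:
            queue.put_nowait(json.loads(raw))  # never blocks the reader
            count += 1
        await queue.join()  # wait for the worker to finish queued events
    finally:
        task.cancel()
    return count
```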