protocol

WebSocket

WebSocket is a full-duplex communication protocol over a single TCP connection, used by websites to push real-time data such as live prices, chat messages, or event feeds.

WebSocket upgrades an HTTP connection to a persistent, bidirectional channel where both client and server can send messages at any time. The client initiates the upgrade with an HTTP GET request carrying an `Upgrade: websocket` header, and the server accepts it with a `101 Switching Protocols` response; once established, WebSocket frames replace HTTP request/response cycles. WebSocket is widely used for real-time applications: live stock tickers, sports scores, collaborative editors, and trading platforms.
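The upgrade handshake can be sketched with the standard library alone. This is a minimal illustration, not a full client: the helper names (`make_client_key`, `expected_accept`, `build_upgrade_request`) are hypothetical, and the host and path are placeholders. The GUID constant and the SHA-1/base64 derivation follow RFC 6455.

```python
import base64
import hashlib
import os

# Fixed GUID that RFC 6455 defines for deriving Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_client_key() -> str:
    """Return a random 16-byte nonce, base64-encoded, for Sec-WebSocket-Key."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def expected_accept(client_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value the server should send back."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> str:
    """Build the raw HTTP/1.1 request that asks the server to switch protocols."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    # A blank line terminates the HTTP header block.
    return "\r\n".join(lines) + "\r\n\r\n"


key = make_client_key()
print(build_upgrade_request("target.com", "/ws", key))
print("Server should answer with Sec-WebSocket-Accept:", expected_accept(key))
```

A client that receives a `101` response should compare the returned `Sec-WebSocket-Accept` header with `expected_accept(key)` before treating the connection as upgraded. WebSocket libraries normally perform this check internally.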

For scrapers, WebSocket presents an opportunity and a challenge. Many real-time data sources stream their updates over WebSocket rather than exposing polling REST endpoints. Subscribing to the WebSocket feed and capturing the message stream is far more efficient than polling. However, WebSocket connections require maintaining a persistent session and handling reconnection logic, ping/pong keepalives, and binary frame formats.
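The reconnection logic mentioned above is often built around capped exponential backoff. The sketch below is one possible shape, assuming a hypothetical `session` coroutine that opens the socket, subscribes, and consumes messages until the connection drops. The function names, default delays, and the `retry_on` tuple are illustrative choices, not part of any library.

```python
import asyncio
import random


def backoff_delays(base=1.0, cap=30.0, factor=2.0):
    """Yield an endless sequence of exponentially growing delays, capped at `cap` seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


async def run_with_reconnect(session, max_attempts=5, retry_on=(OSError,),
                             sleep=asyncio.sleep, jitter=0.1):
    """Await `session()` and retry with backoff when it raises a connection error.

    `session` is a hypothetical coroutine function supplied by the caller.
    Add the client library's own "connection closed" exception to `retry_on`.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return await session()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Random jitter keeps many clients from reconnecting in lockstep.
            wait = next(delays) * (1 + random.uniform(0, jitter))
            await sleep(wait)
```

After each reconnect the scraper usually has to resend its subscribe message, because the server keeps no state from the dropped connection. Keepalives are a separate concern: many client libraries answer server pings automatically, so it is worth checking the library's behavior before writing ping/pong handling by hand.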

Browser-based scrapers can intercept WebSocket frames using the `Network.webSocketFrameReceived` CDP event. Libraries like `websockets` and `websocket-client` (Python) and `ws` (Node.js) provide standalone WebSocket clients that can connect directly to a target site's WebSocket endpoint without a browser.
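When frames are captured through the browser, the CDP `webSocketFrameReceived` event (in the Network domain) delivers each frame as a small JSON structure. The sketch below decodes that structure, assuming the event's `params` have already been parsed into a dict by whatever CDP client is in use. The function name `decode_cdp_frame` and the sample event are hypothetical.

```python
import base64
import json

TEXT_OPCODE = 1  # WebSocket opcode for a text frame (RFC 6455)


def decode_cdp_frame(params: dict):
    """Decode the `params` of a Network.webSocketFrameReceived event.

    Text frames carry the payload as a plain string. For other opcodes
    CDP base64-encodes the payload, so raw bytes are returned instead.
    """
    frame = params["response"]
    payload = frame["payloadData"]
    if frame["opcode"] == TEXT_OPCODE:
        try:
            return json.loads(payload)  # many feeds send JSON as text
        except json.JSONDecodeError:
            return payload  # fall back to the raw text
    return base64.b64decode(payload)


# Sample event shaped like a CDP notification (values are made up).
sample = {
    "requestId": "1",
    "timestamp": 0.0,
    "response": {"opcode": 1, "mask": False, "payloadData": '{"price": 101.5}'},
}
print(decode_cdp_frame(sample))
```

These events are only emitted once the Network domain has been enabled on the CDP session. Binary payloads returned as bytes still need a site-specific decoder, for example for a compressed or Protocol Buffers feed.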

Examples

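# Python: subscribe to a WebSocket data feed with the `websockets` library
import asyncio
import json

import websockets


async def capture_feed(url):
    # Some servers reject handshakes that lack the site's own Origin header.
    async with websockets.connect(url, origin="https://target.com") as ws:
        # The subscribe message format is site-specific; this one is illustrative.
        await ws.send(json.dumps({"action": "subscribe", "channel": "prices"}))
        # Yields each incoming message until the connection closes.
        async for message in ws:
            data = json.loads(message)
            print(data)


asyncio.run(capture_feed("wss://target.com/ws"))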



    WebSocket — Web Scraping Glossary | AlterLab