
Gzip Compression

Gzip is a lossless compression format, built on the DEFLATE algorithm, used to reduce HTTP response body size. Text formats such as HTML and JSON typically shrink by 60–90%, which shortens transfer time.

When a browser or scraper sends `Accept-Encoding: gzip` in its request headers, the server may compress the response body with gzip before sending it, and it signals this with a `Content-Encoding: gzip` response header. Large HTML documents (50–200 KB uncompressed) typically compress to 10–40 KB, significantly reducing bandwidth consumption and transfer latency, especially on mobile or slow connections.
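To make the size reduction concrete, here is a minimal sketch using Python's standard `gzip` module. The sample HTML is synthetic and highly repetitive (an assumption made purely for illustration), so the ratio it reports will be better than most real pages achieve.

```python
import gzip

# Synthetic, repetitive HTML used only for illustration; real pages compress less well.
row = '<tr><td class="name">Item</td><td class="price">9.99</td></tr>\n'
html = ("<html><body><table>\n" + row * 2000 + "</table></body></html>").encode("utf-8")

compressed = gzip.compress(html)        # roughly what a server does before sending
restored = gzip.decompress(compressed)  # roughly what the client does on receipt

saving = 1 - len(compressed) / len(html)
print(f"uncompressed: {len(html)} bytes")
print(f"compressed:   {len(compressed)} bytes")
print(f"saving:       {saving:.1%}")
```

Because gzip is lossless, `restored` is byte-for-byte identical to `html`; only the size on the wire changes.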

Most HTTP client libraries (Python's requests and httpx, Node's axios) automatically send an `Accept-Encoding` header that includes gzip and transparently decompress the response, so scrapers receive the decompressed body without extra code. Brotli (`br`) is a newer alternative that usually compresses better than gzip and is widely supported by modern browsers and CDNs; in Python clients it is typically advertised only when a Brotli decoder package is installed.
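For clients that do not decompress automatically, the decoding step can be sketched by hand with the standard library. The helper name `decode_body` is hypothetical, and the sketch covers only gzip and deflate; it is not how any particular library implements this.

```python
import gzip
import zlib
from typing import Optional


def decode_body(raw: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo the Content-Encoding a server applied (hypothetical helper)."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(raw)
    if encoding == "deflate":
        try:
            return zlib.decompress(raw)  # zlib-wrapped DEFLATE
        except zlib.error:
            # Some servers send raw DEFLATE without the zlib wrapper.
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    if encoding == "identity":
        return raw  # body was sent uncompressed
    raise ValueError(f"unsupported Content-Encoding: {encoding}")


# Simulate bytes as they would arrive on the wire from a gzip-enabled server.
wire_bytes = gzip.compress(b'{"status": "ok"}')
print(decode_body(wire_bytes, "gzip"))
```

The function keys off the `Content-Encoding` response header rather than guessing from the bytes, which mirrors how the header is meant to be used.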

For large-scale scraping, always requesting gzip compression reduces bandwidth costs. Cloud providers bill egress per byte transferred, so compressed responses can cut the egress bill for HTML-heavy crawls by half or more.
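A rough back-of-the-envelope estimate shows how the saving scales. Every figure below is an assumption chosen for illustration: 10 million pages, 100 KB average page size, a 75% compression saving, and a hypothetical price of $0.10 per GB.

```python
GB = 1_000_000_000  # decimal gigabyte, assumed to be the billing unit


def transfer_cost(pages, avg_page_bytes, price_per_gb, saving=0.0):
    """Estimate transfer cost when compression removes a fraction `saving` (0 to 1) of the bytes."""
    bytes_on_wire = pages * avg_page_bytes * (1 - saving)
    return bytes_on_wire / GB * price_per_gb


pages = 10_000_000  # hypothetical crawl size
plain = transfer_cost(pages, 100_000, 0.10)
gzipped = transfer_cost(pages, 100_000, 0.10, saving=0.75)
print(f"uncompressed: ${plain:.2f}")
print(f"gzip:         ${gzipped:.2f}")
```

Under these assumptions the crawl moves 1,000 GB uncompressed and 250 GB with gzip, so the cost scales down by the same factor as the bytes on the wire.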

Examples

import requests

# requests sends an Accept-Encoding header (including gzip) by default,
# so no extra header is needed. Avoid advertising "br" by hand unless a
# Brotli decoder is installed; otherwise the body may arrive undecoded.
response = requests.get("https://example.com", timeout=10)
print(response.headers.get("Content-Encoding"))  # 'gzip' if server compressed
print(f"Received {len(response.content)} bytes (decompressed automatically)")


