Content-Type

The Content-Type HTTP header describes the media type and encoding of a request or response body, telling the recipient how to parse the payload.

Content-Type is one of the most important HTTP headers for scrapers. In responses, it tells the scraper how to interpret the body: `text/html; charset=utf-8` (HTML document), `application/json` (JSON payload), `application/octet-stream` (binary data), `text/csv` (CSV file), or `application/pdf` (PDF document). The charset parameter specifies the character encoding of the body. If the scraper decodes the bytes with a different encoding than the one declared, the extracted text comes out garbled.
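To make the charset point concrete, here is a minimal sketch that splits a Content-Type value into its media type and charset, then decodes the body accordingly. It uses only the Python standard library. The helper names `parse_content_type` and `decode_body` and the UTF-8 fallback are assumptions made for illustration, not part of any library.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    # Both values come back lowercased; charset is None when not declared.
    # Note: a malformed value is reported as "text/plain" (the email default).
    return msg.get_content_type(), msg.get_content_charset()


def decode_body(body: bytes, header_value: str, fallback: str = "utf-8") -> str:
    """Decode raw bytes with the declared charset, else the assumed fallback."""
    _, charset = parse_content_type(header_value)
    # errors="replace" keeps extraction going if a few bytes are invalid.
    return body.decode(charset or fallback, errors="replace")


raw = "café".encode("iso-8859-1")  # bytes as a Latin-1 server would send them
print(parse_content_type("text/html; charset=ISO-8859-1"))
print(decode_body(raw, "text/html; charset=ISO-8859-1"))  # decodes cleanly
print(raw.decode("utf-8", errors="replace"))  # wrong encoding: garbled text
```

The last line shows the failure mode: the same bytes decoded as UTF-8 lose the accented character.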

In POST requests, Content-Type tells the server how to parse the request body: `application/x-www-form-urlencoded` (HTML form submission), `multipart/form-data` (file uploads), or `application/json` (API call with JSON body). Sending the wrong Content-Type for a POST often results in a 400 Bad Request or 415 Unsupported Media Type error because the server cannot parse the body as the declared format.
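The pairing of body format and header can be sketched with the standard library alone, which keeps the relationship explicit. The helper names and the `example.com` URLs below are hypothetical, and nothing is sent over the network. With the `requests` library, passing `data=` or `json=` sets the corresponding header automatically.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url: str, fields: dict) -> Request:
    """Build a POST whose body is URL-encoded, like an HTML form submission."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def build_json_post(url: str, payload: dict) -> Request:
    """Build a POST whose body is JSON, as most APIs expect."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


form_req = build_form_post("https://example.com/login", {"user": "alice", "page": 1})
json_req = build_json_post("https://example.com/api/items", {"user": "alice"})

# urllib stores header names capitalized as "Content-type".
for req in (form_req, json_req):
    print(req.get_header("Content-type"), req.data)
```

The same data produces two different byte sequences, so a server told to expect one format cannot parse the other.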

For API discovery, checking whether a URL returns `application/json` can reveal undocumented API endpoints that power the site's frontend. These endpoints are often a more reliable data source than HTML parsing.
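One way to apply this check is to record the Content-Type of each URL a page requests and keep only the JSON ones. The sketch below assumes those headers were already collected into a dictionary. The function names and sample URLs are hypothetical, and treating any media type ending in `+json` as JSON is an assumption that fits common types such as `application/ld+json`.

```python
from typing import Dict, List


def is_json_media_type(content_type: str) -> bool:
    """Return True when the header's media type denotes a JSON payload."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


def find_json_endpoints(observed: Dict[str, str]) -> List[str]:
    """Filter a {url: Content-Type} mapping down to likely API endpoints."""
    return [url for url, ctype in observed.items() if is_json_media_type(ctype)]


# Hypothetical headers observed while loading a product page.
observed = {
    "https://example.com/products": "text/html; charset=utf-8",
    "https://example.com/api/products": "application/json; charset=utf-8",
    "https://example.com/static/app.js": "text/javascript",
    "https://example.com/api/schema": "Application/LD+JSON",
}

print(find_json_endpoints(observed))
```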

Examples

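import requests

# Fetch the resource; the timeout keeps the scraper from hanging on a slow server.
response = requests.get("https://example.com/api/products", timeout=10)

# The header may be missing, and its value may carry parameters such as
# "; charset=utf-8", so compare only the lowercased media type.
content_type = response.headers.get("Content-Type", "")
media_type = content_type.split(";", 1)[0].strip().lower()

if media_type == "application/json":
    data = response.json()
elif media_type == "text/html":
    # Parse HTML with the parser of your choice
    html = response.text
else:
    print(f"Unknown content type: {content_type}")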
