Microdata

Microdata is an HTML specification for embedding machine-readable structured data directly in HTML elements using `itemscope`, `itemtype`, and `itemprop` attributes.

Microdata was introduced alongside HTML5 as an in-band method of annotating existing HTML markup with semantic meaning. An element with `itemscope` defines a new item; `itemtype` specifies the schema type URL (typically schema.org); and `itemprop` attributes on descendant elements name the properties being described. Unlike JSON-LD, Microdata is interleaved with the visible HTML rather than separated into a script block.
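As a rough illustration of how the three attributes fit together, the sketch below feeds a small, made-up HTML fragment to Python's standard-library `HTMLParser` and records the item type and the property names it sees. The sample markup, the `SAMPLE_HTML` name, and the `MicrodataOutline` class are hypothetical and exist only for this example; the sketch does not handle nested items or property values.

```python
from html.parser import HTMLParser

# Hypothetical sample markup, not taken from any real page.
SAMPLE_HTML = """
<div itemscope itemtype="https://schema.org/Product">
  <h1 itemprop="name">Espresso Grinder</h1>
  <span itemprop="brand">Acme</span>
  <meta itemprop="sku" content="EG-100">
</div>
"""


class MicrodataOutline(HTMLParser):
    """Record each item's type and every property name in document order."""

    def __init__(self):
        super().__init__()
        self.item_types = []
        self.prop_names = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope is a boolean attribute: its presence starts a new item.
        if "itemscope" in attrs:
            self.item_types.append(attrs.get("itemtype"))
        # itemprop may hold several space-separated property names.
        if "itemprop" in attrs:
            self.prop_names.extend((attrs["itemprop"] or "").split())


outline = MicrodataOutline()
outline.feed(SAMPLE_HTML)
print(outline.item_types)
print(outline.prop_names)
```

Note that the `sku` value sits in a `content` attribute while `name` and `brand` are plain element text, even though all three are declared the same way with `itemprop`.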

From a scraping standpoint, Microdata is harder to parse than JSON-LD because a property's value lives in a different place depending on the element type: the `content` attribute on `meta`, `href` on `a` and `link`, `src` on `img` and other media elements, `datetime` on `time`, and the element's text content otherwise. Libraries such as Python's `extruct` can extract Microdata, JSON-LD, and RDFa from a page in a single call and return the results in one dictionary keyed by syntax.
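The element-dependent lookup can be sketched as a small table plus a fallback to text content. This is a simplified reading of the property-value rules in the HTML Standard, not a full implementation: the names `VALUE_ATTRIBUTE` and `property_value` are hypothetical, nested `itemscope` items and relative-URL resolution are left out, and trimming whitespace from text content is a convenience added here.

```python
# Which attribute carries the property value, keyed by element name.
VALUE_ATTRIBUTE = {
    "meta": "content",
    "audio": "src", "embed": "src", "iframe": "src", "img": "src",
    "source": "src", "track": "src", "video": "src",
    "a": "href", "area": "href", "link": "href",
    "object": "data",
    "data": "value", "meter": "value",
    "time": "datetime",
}


def property_value(tag, attrs, text=""):
    """Return the Microdata value for one element carrying itemprop.

    tag   -- element name, e.g. "meta"
    attrs -- dict of the element's attributes
    text  -- the element's text content
    """
    tag = tag.lower()
    attr = VALUE_ATTRIBUTE.get(tag)
    if attr is not None:
        value = attrs.get(attr)
        if value is not None:
            return value
        if tag != "time":
            # A missing attribute yields an empty value, not the text.
            return ""
        # <time> without datetime falls through to its text content.
    return text.strip()


print(property_value("meta", {"content": "EG-100"}))
print(property_value("a", {"href": "/p/1"}, "View product"))
print(property_value("span", {}, "Acme"))
```

A scraper that reads only text content would miss the first two values entirely, which is the main practical reason to rely on a dedicated extractor.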

Microdata's use has declined in favour of JSON-LD, which is preferred by Google and easier for developers to maintain without touching the visual HTML. However, older sites and CMS platforms still use Microdata heavily.

Examples

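# extruct: parse several structured data formats from one HTML page
import extruct
import requests

response = requests.get("https://example.com/product", timeout=10)
response.raise_for_status()  # stop on HTTP errors instead of parsing an error page

# base_url lets extruct resolve relative URLs found in the markup
data = extruct.extract(response.text, base_url=response.url,
                       syntaxes=["json-ld", "microdata", "rdfa"])
print(data["microdata"])  # the Microdata items found on the page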
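Once the Microdata list is available, a scraper usually wants only the items of one type. The sketch below assumes each item is a dictionary with a `type` key (a string, or a list when several types are declared) and a `properties` key, which is the shape `extruct` documents for Microdata. The `items_of_type` helper and the `sample` data are hypothetical and stand in for real output.

```python
def items_of_type(microdata_items, type_suffix):
    """Return the properties of every item whose type ends with type_suffix."""
    matches = []
    for item in microdata_items:
        item_type = item.get("type", "")
        # Normalise to a list so single and multiple types are handled alike.
        types = item_type if isinstance(item_type, list) else [item_type]
        if any(t.endswith(type_suffix) for t in types):
            matches.append(item.get("properties", {}))
    return matches


# Hypothetical stand-in for data["microdata"]; real pages will differ.
sample = [
    {"type": "https://schema.org/Product",
     "properties": {"name": "Espresso Grinder", "sku": "EG-100"}},
    {"type": "https://schema.org/BreadcrumbList", "properties": {}},
]

products = items_of_type(sample, "/Product")
print(products[0]["name"])
```

Matching on the suffix rather than the full URL tolerates pages that declare types with `http://schema.org/` instead of `https://schema.org/`.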

    Microdata — Web Scraping Glossary | AlterLab