Markdown is a lightweight markup language designed to be readable as plain text and convertible to formatted HTML. Web scraping APIs like AlterLab can return page content as clean markdown instead of raw HTML, stripping navigation, ads, and boilerplate while preserving semantic structure (headings, lists, code blocks, links).
Markdown output is particularly valuable for large language model (LLM) workflows. Clean markdown typically uses 60–80% fewer tokens than the equivalent raw HTML, which lowers LLM API costs and lets more content fit into the context window in Retrieval-Augmented Generation (RAG) pipelines.
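To check the savings on your own pages, a rough comparison like the one below can help. It is only a sketch: the function names and sample strings are hypothetical, and the four-characters-per-token figure is a common rule of thumb for English text, not the behavior of any particular tokenizer. For real numbers, substitute the tokenizer of the model you use.

```python
def rough_token_estimate(text: str) -> int:
    """Approximate a token count, ASSUMING about 4 characters per token.

    This is a heuristic, not a real tokenizer; actual counts vary by model.
    """
    return max(1, len(text) // 4)


def token_savings(html: str, markdown: str) -> float:
    """Return the estimated fraction of tokens saved by using markdown."""
    html_tokens = rough_token_estimate(html)
    markdown_tokens = rough_token_estimate(markdown)
    return 1 - markdown_tokens / html_tokens


# Hypothetical sample: the same content as raw HTML and as clean markdown.
html_page = (
    '<div class="nav"><a href="/">Home</a><a href="/docs">Docs</a></div>'
    "<article><h1>Title</h1><p>Some <strong>bold</strong> text.</p></article>"
)
markdown_page = "# Title\n\nSome **bold** text."

print(f"Estimated savings: {token_savings(html_page, markdown_page):.0%}")
```

The result depends heavily on how much navigation and markup the original page carries, so treat any single measurement as indicative only.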
Common markdown elements include: headings (# H1, ## H2), bold (**text**), italic (*text*), code blocks (```language), inline code (`code`), links ([text](url)), and lists (- item or 1. item). Understanding the mapping from HTML to markdown helps you predict and validate scraping output.
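The mapping described above can be illustrated with a toy converter built on Python's standard `html.parser` module. This is a minimal sketch, not how AlterLab or any production scraper works: the `MiniMarkdown` class and `html_to_markdown` function are hypothetical names, only a handful of tags are handled, and the input is assumed to contain no formatting whitespace between tags.

```python
from html.parser import HTMLParser


class MiniMarkdown(HTMLParser):
    """Toy HTML-to-markdown converter covering only a few common tags."""

    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*", "code": "`"}
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__()
        self.out = []          # collected markdown fragments
        self.href = None       # target of the link currently being read
        self.list_stack = []   # one [kind, item_counter] entry per open list

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.out.append(self.HEADINGS[tag])
        elif tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "a":
            self.href = dict(attrs).get("href", "")
            self.out.append("[")
        elif tag in ("ul", "ol"):
            self.list_stack.append([tag, 0])
        elif tag == "li" and self.list_stack:
            current = self.list_stack[-1]
            current[1] += 1
            # Unordered items get "- ", ordered items get "1. ", "2. ", ...
            self.out.append("- " if current[0] == "ul" else f"{current[1]}. ")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "a":
            self.out.append(f"]({self.href})")
            self.href = None
        elif tag in self.HEADINGS or tag == "p":
            self.out.append("\n\n")
        elif tag == "li":
            self.out.append("\n")
        elif tag in ("ul", "ol") and self.list_stack:
            self.list_stack.pop()
            self.out.append("\n")

    def handle_data(self, data):
        self.out.append(data)


def html_to_markdown(html: str) -> str:
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


sample = (
    "<h1>Title</h1>"
    "<p>Some <strong>bold</strong> and <em>italic</em> text with "
    '<code>code</code> and a <a href="https://example.com">link</a>.</p>'
    "<ul><li>one</li><li>two</li></ul>"
)
print(html_to_markdown(sample))
```

Comparing a scraper's markdown against a simple mapping like this is one way to spot dropped headings or mangled links, though real converters also handle nesting, tables, escaping, and code fences that this sketch ignores.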