SEC EDGAR Data API: Extract Structured JSON in 2026

Get structured JSON from SEC EDGAR via AlterLab’s API. Extract title, identifier, date_published and more with schema validation.

Herald Blog Service · July 2, 2026

Data Extraction

APIs

Python

TL;DR

Send a POST request to the Extract API with the SEC EDGAR page URL and a JSON schema for title, identifier, date_published, category and description, and receive validated JSON back. This approach avoids fragile HTML parsing and keeps per-request cost predictable.

Why use SEC EDGAR data?

AI training pipelines that need clean, government‑issued filings
Financial analytics that track 10‑K and 8‑K filings across companies
Competitive intelligence that monitors filing frequency and topics

What data can you extract?

SEC EDGAR publishes only public filings. Typical fields include:

title: The document headline
identifier: CIK or accession number
date_published: Filing date in ISO 8601 format (YYYY-MM-DD)
category: Document type such as "10-K" or "8-K"
description: Brief summary of the filing’s content

All of these are openly available; no login or paywall is required.
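
As a rough illustration of how these five fields might be held in a pipeline, the sketch below maps a payload onto a typed record. The `Filing` class and `classify_identifier` helper are hypothetical names, not part of any SDK, and the sample values are made up. The only outside facts assumed are that a CIK is a number of up to ten digits and that an EDGAR accession number has the form NNNNNNNNNN-YY-SSSSSS.

```python
import re
from dataclasses import dataclass
from datetime import date

# EDGAR accession numbers have the form NNNNNNNNNN-YY-SSSSSS (10-2-6 digits).
ACCESSION_RE = re.compile(r"^\d{10}-\d{2}-\d{6}$")
# A CIK is a numeric company key of up to ten digits.
CIK_RE = re.compile(r"^\d{1,10}$")


def classify_identifier(identifier: str) -> str:
    """Return 'accession', 'cik', or 'unknown' for an identifier string."""
    if ACCESSION_RE.match(identifier):
        return "accession"
    if CIK_RE.match(identifier):
        return "cik"
    return "unknown"


@dataclass(frozen=True)
class Filing:
    """Hypothetical typed record for the five fields listed above."""
    title: str
    identifier: str
    date_published: date
    category: str
    description: str

    @classmethod
    def from_payload(cls, payload: dict) -> "Filing":
        # date.fromisoformat raises ValueError if the string is not a valid ISO 8601 date.
        return cls(
            title=payload["title"],
            identifier=payload["identifier"],
            date_published=date.fromisoformat(payload["date_published"]),
            category=payload["category"],
            description=payload["description"],
        )


# Made-up sample payload, for illustration only.
sample = {
    "title": "Annual report",
    "identifier": "0001234567-24-000001",
    "date_published": "2024-01-31",
    "category": "10-K",
    "description": "Placeholder summary",
}
filing = Filing.from_payload(sample)
print(filing.category, filing.date_published.year, classify_identifier(filing.identifier))
```

Converting date_published to a `date` at the boundary means a malformed value fails once, at load time, rather than somewhere downstream.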

The extraction approach

Scraping SEC EDGAR pages with raw HTTP requests and HTML parsing breaks whenever the site updates its layout or adds anti‑bot checks. A data API abstracts that complexity. AlterLab’s Extract API handles:

Automatic request routing and proxy rotation
HTML‑to‑JSON conversion that respects robots.txt
Schema validation that guarantees field types

The result is a predictable, typed JSON payload you can store directly in your pipeline.

Quick start with AlterLab Extract API

First install the client library or use curl. See our Getting started guide for full setup details.

Python example

Python

import alterlab

client = alterlab.Client("YOUR_API_KEY")

# Field descriptions mirror the field list in "What data can you extract?"
schema = {
  "type": "object",
  "properties": {
    "title": {"type": "string", "description": "The document headline"},
    "identifier": {"type": "string", "description": "CIK or accession number"},
    "date_published": {"type": "string", "description": "Filing date in ISO 8601 format"},
    "category": {"type": "string", "description": "Document type such as 10-K or 8-K"},
    "description": {"type": "string", "description": "Brief summary of the filing's content"}
  }
}

result = client.extract(
    url="https://sec.gov/example-page",
    schema=schema,
)
print(result.data)

cURL example

Bash

curl -X POST https://api.alterlab.io/v1/extract \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://sec.gov/example-page",
    "schema": {"type": "object", "properties": {"title": {"type": "string"}, "identifier": {"type": "string"}, "date_published": {"type": "string"}}}
  }'

Both examples return a JSON object that matches the schema exactly, eliminating the need for post‑processing.
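
If the SDK is not an option, the same call can be assembled with Python's standard library. The sketch below mirrors the cURL example: the endpoint, the X-API-Key header, and the JSON body are taken from it. `build_extract_request` is a hypothetical helper, and the response shape is not shown because the article does not specify it.

```python
import json
import urllib.request

EXTRACT_URL = "https://api.alterlab.io/v1/extract"  # endpoint from the cURL example


def build_extract_request(api_key: str, page_url: str, schema: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    body = json.dumps({"url": page_url, "schema": schema}).encode("utf-8")
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "identifier": {"type": "string"},
        "date_published": {"type": "string"},
    },
}
request = build_extract_request("YOUR_API_KEY", "https://sec.gov/example-page", schema)

# Sending is left commented out so the sketch runs without a key or network access:
# with urllib.request.urlopen(request, timeout=30) as response:
#     payload = json.load(response)
```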

Define your schema

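The schema parameter describes the shape of the output. Use standard JSON Schema syntax; AlterLab validates the extracted data against it and returns only fields that conform. Downstream code can then rely on title being a string, date_published being a string, and so on. Note that "type": "string" alone does not require an ISO 8601 date; JSON Schema's format keyword (for example "format": "date") is the standard way to express that, though whether it is enforced depends on the validator.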
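
To make "returns only fields that conform" concrete, here is a minimal local sketch of that kind of filtering. It is not AlterLab's validator: `conforming_fields` is a hypothetical helper that understands only the `type` keyword for a few primitive types, which is a small subset of JSON Schema.

```python
# Map the JSON Schema primitive type names to the Python types they correspond to.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def conforming_fields(payload: dict, schema: dict) -> dict:
    """Keep only the payload fields whose value matches the declared type."""
    kept = {}
    for name, rule in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(rule.get("type"))
        if expected is None or name not in payload:
            continue  # unsupported type keyword or missing field
        value = payload[name]
        # bool is a subclass of int in Python, so exclude it from numeric checks.
        if isinstance(value, bool) and rule["type"] != "boolean":
            continue
        if isinstance(value, expected):
            kept[name] = value
    return kept


schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "date_published": {"type": "string"},
    },
}
# date_published has the wrong type and "extra" is not in the schema.
raw = {"title": "Annual report", "date_published": 20240131, "extra": "x"}
print(conforming_fields(raw, schema))  # {'title': 'Annual report'}
```

A check like this can also serve as a cheap second line of defence in your own pipeline, whatever the upstream service guarantees.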

Handle pagination and scale

For a single filing the request is quick, but high‑volume pipelines need batching. Use the /v1/batch endpoint to queue multiple URLs, then poll for completion. Responses include a job ID you can use with webhooks to trigger downstream processing.
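
The queue-then-poll flow can be sketched without committing to a payload format, which the article does not give. In the sketch below, `chunked` and `poll_until_done` are hypothetical helpers, the status strings "done" and "failed" are assumptions, and `get_status` stands in for whatever call reads a job's state from the batch endpoint.

```python
import time
from typing import Callable, Iterator


def chunked(urls: list[str], size: int) -> Iterator[list[str]]:
    """Split a URL list into batches of at most `size` items."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def poll_until_done(
    get_status: Callable[[str], str],
    job_id: str,
    interval: float = 2.0,
    max_interval: float = 30.0,
    timeout: float = 600.0,
) -> str:
    """Poll a job with exponential backoff until it reaches a final state."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status in ("done", "failed"):  # assumed terminal states
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        time.sleep(interval)
        interval = min(interval * 2, max_interval)  # back off between polls


# Offline demonstration with a scripted sequence of statuses.
states = iter(["queued", "running", "done"])
final = poll_until_done(lambda job_id: next(states), "job-1", interval=0.0)
print(final)
```

Where webhooks are configured, they would replace the polling loop; the chunking step stays the same.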

Cost scales with request complexity. Review the AlterLab pricing page to estimate expense before committing. The minimum cost per request is $0.001 and the maximum is $0.50. When you register your own key (BYOK), the orchestration fee is a flat $0.0003; otherwise the platform rate applies.
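
Because every request is billed between the stated minimum and maximum, the bounds for a job come down to multiplication. The sketch below uses the $0.001 and $0.50 figures quoted above; `cost_bounds` is a hypothetical helper, and it deliberately leaves out the BYOK orchestration fee because the article does not say how that fee combines with the per-request price.

```python
from decimal import Decimal

MIN_COST = Decimal("0.001")  # stated minimum per request, in USD
MAX_COST = Decimal("0.50")   # stated maximum per request, in USD


def cost_bounds(request_count: int) -> tuple[Decimal, Decimal]:
    """Return the (lowest, highest) possible total in USD for a job."""
    if request_count < 0:
        raise ValueError("request_count must be non-negative")
    return MIN_COST * request_count, MAX_COST * request_count


low, high = cost_bounds(10_000)
print(f"10,000 requests: between ${low} and ${high}")
```

For 10,000 requests the range runs from $10 to $5,000, which is wide enough that a small trial run is the more useful estimate of where a particular workload actually lands.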

Key takeaways

SEC EDGAR provides only public data; always respect robots.txt.
Use a schema to get typed JSON without manual parsing.
AlterLab’s Extract API manages anti‑bot bypass, cost estimation and scaling.
Batch and async workflows let you process hundreds of filings per minute.

99.2% Extraction Accuracy

1.4s Avg Response Time

100% Typed JSON Output

Try it yourself

Extract structured government data from SEC EDGAR

curl -X POST https://api.alterlab.io/v1/scrape \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://sec.gov"}'


Batch/async usage example

Python

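import asyncio

import alterlab

client = alterlab.Client("YOUR_API_KEY")

urls = [
    "https://sec.gov/filing1",
    "https://sec.gov/filing2",
    "https://sec.gov/filing3",
]

# Shared schema: the same three fields are requested for every filing.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "identifier": {"type": "string"},
        "date_published": {"type": "string"},
    },
}

async def extract_one(url):
    return await client.extract_async(url=url, schema=schema)

async def main():
    # gather() runs the requests concurrently and returns results in input order.
    results = await asyncio.gather(*(extract_one(u) for u in urls))
    for r in results:
        print(r.data)

# A bare top-level await only works in a REPL or notebook; a script needs asyncio.run().
asyncio.run(main())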

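This pattern fires many requests in parallel. asyncio.gather returns the results together, in input order, once every request has finished, which suits large-scale data pipelines. To handle each response as soon as it arrives, iterate over asyncio.as_completed instead.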
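
For longer URL lists, launching everything at once may run into rate limits, so one common refinement is to cap the number of requests in flight. The sketch below does that with asyncio.Semaphore. `fake_extract` is a stand-in for the SDK call so the example runs offline, and the limit of 5 is an arbitrary illustration, not an AlterLab recommendation.

```python
import asyncio


async def fake_extract(url: str) -> dict:
    """Stand-in for the real extraction call; fails for one URL on purpose."""
    await asyncio.sleep(0.01)
    if url.endswith("/bad"):
        raise RuntimeError(f"extraction failed for {url}")
    return {"url": url, "title": "placeholder"}


async def extract_all(urls: list[str], limit: int = 5) -> list:
    """Run extractions concurrently with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url: str):
        async with semaphore:
            return await fake_extract(url)

    # return_exceptions=True returns failures as values instead of raising the first one.
    return await asyncio.gather(*(guarded(u) for u in urls), return_exceptions=True)


urls = ["https://sec.gov/filing1", "https://sec.gov/bad", "https://sec.gov/filing3"]
results = asyncio.run(extract_all(urls))
for url, result in zip(urls, results):
    status = "error" if isinstance(result, Exception) else "ok"
    print(url, status)
```

Collecting failures as values lets the pipeline retry only the URLs that failed instead of rerunning the whole list.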


Frequently Asked Questions

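The SEC provides public RSS feeds, bulk data, and JSON endpoints on data.sec.gov for filer submissions and XBRL financial data. Those endpoints do not turn an arbitrary filing page into fields you define; services like AlterLab fill that gap by offering compliant extraction with typed output.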

You can extract publicly listed fields such as title, identifier, date_published, category and description using a schema that enforces typed JSON.

Pricing is pay-as-you-go on AlterLab; the cost of each request is clamped between $0.001 and $0.50. There are no minimums, and your balance does not expire.
