News scraping · Media monitoring · Content aggregation

News Aggregation API

Collect articles from news sites, regional media portals, and content platforms at scale. AlterLab extracts clean article content from any CMS-based publisher — WordPress, custom platforms, JavaScript-rendered sites. Multi-site batch crawls with change detection. From $0.0002/request.

No credit card · SOC 2 aligned · 99.9% uptime

Simple Pricing

$1 = 5,000 requests. Pay as you go, no subscriptions, balance never expires.

2,847,653+ requests processed this week

News Sites → Clean Articles → Your Feed

One API call per article. Structured content ready for your pipeline.

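extract_article.py
import os

import requests

# Assumption: the API key is stored in an ALTERLAB_API_KEY environment variable
api_key = os.environ["ALTERLAB_API_KEY"]

# Extract the article as clean markdown
response = requests.post(
    "https://api.alterlab.io/v1/scrape",
    headers={"X-API-Key": api_key},
    json={
        "url": "https://news.example.com/article/123",
        "formats": ["markdown"],
    },
    timeout=30,
)
response.raise_for_status()  # Fail early on HTTP errors

article = response.json()["data"]["markdown"]
# Clean article text, no ads or nav

monitor_sources.py
import os

import requests

# Assumption: the API key is stored in an ALTERLAB_API_KEY environment variable
api_key = os.environ["ALTERLAB_API_KEY"]

# Monitor multiple news sources at once
response = requests.post(
    "https://api.alterlab.io/v1/batch/scrape",
    headers={"X-API-Key": api_key},
    json={
        "urls": [
            "https://news-a.com",
            "https://news-b.com",
            "https://regional-daily.eu",
        ],
        "formats": ["markdown"],
    },
    timeout=60,
)
response.raise_for_status()  # Fail early on HTTP errors
# Compare URLs to detect new articles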
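The URL comparison mentioned in the last comment can be sketched roughly as follows. This is a minimal sketch, assuming the markdown returned for a front page keeps article links in the usual `[text](url)` form; the helpers `extract_links` and `find_new_articles` are illustrative names and not part of the AlterLab API.

```python
import re
from urllib.parse import urljoin

# Matches inline markdown links of the form [anchor text](url "optional title")
MARKDOWN_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def extract_links(markdown: str, base_url: str) -> set[str]:
    """Return absolute URLs for every inline link found in a markdown page."""
    return {urljoin(base_url, href) for href in MARKDOWN_LINK.findall(markdown)}


def find_new_articles(markdown: str, base_url: str, seen: set[str]) -> set[str]:
    """Return links that were not seen on earlier crawls and record them in `seen`."""
    new_links = extract_links(markdown, base_url) - seen
    seen.update(new_links)
    return new_links
```

In practice `seen` would be persisted between crawls (a database table or key-value store), and the link set would likely need filtering so that section, tag, and image links are not treated as articles.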

Global Coverage — Any Region, Any Language

AlterLab works with news publishers worldwide — European regional media, Asian news portals, Latin American dailies, and everything in between. Most regional news sites use standard CMS platforms that AlterLab handles at Tier 1 pricing ($0.0002/request). Content is returned in its original language and character encoding with no translation or modification.

Building a News Aggregation Pipeline

From raw news sites to a structured, continuously updated article feed.

1. Define Your News Sources

List the news sites and media portals you want to monitor. Include front pages, section pages (politics, business, technology, sports), and any RSS endpoints. AlterLab works with any publisher — WordPress blogs, custom CMS platforms, JavaScript-rendered news apps, and static HTML sites.

2. Schedule Multi-Site Crawls

Use the batch endpoint to check all your source pages on a recurring schedule. For breaking news, poll every 15 minutes. For daily digests, once per day is sufficient. The batch endpoint handles hundreds of URLs per call, so monitoring 50+ sources is a single API request.
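One way to decide which pages go into each batch call is to track a polling interval per source. The sketch below is hypothetical client-side bookkeeping: the `Source` class, `due_sources`, and `chunked` are illustrative names, and the batch size is an assumption, since the exact per-call URL limit is not stated here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterator, Optional


@dataclass
class Source:
    url: str
    interval: timedelta                     # how often this page should be polled
    last_polled: Optional[datetime] = None  # None means it has never been polled


def due_sources(sources: list[Source], now: datetime) -> list[Source]:
    """Return the sources whose polling interval has elapsed."""
    return [
        s for s in sources
        if s.last_polled is None or now - s.last_polled >= s.interval
    ]


def chunked(urls: list[str], size: int) -> Iterator[list[str]]:
    """Split a URL list into batches of at most `size` items."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


now = datetime.now(timezone.utc)
sources = [
    Source("https://news-a.com", timedelta(minutes=15)),  # breaking news
    Source("https://regional-daily.eu", timedelta(days=1), last_polled=now),
]
# batch_size=100 is an assumed value, not a documented limit
batches = list(chunked([s.url for s in due_sources(sources, now)], 100))
```

Each list in `batches` would become the `urls` array of one batch request, and `last_polled` would be updated once the call succeeds.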

3. Extract Clean Article Content

For each new article URL detected, call the scrape endpoint with markdown format for clean text or JSON with an extraction schema for structured fields (headline, author, publish date, category, article body). AlterLab removes ads, navigation, cookie banners, and related-article widgets — returning only the article itself.
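Once structured fields come back, it helps to validate them before they enter the pipeline. The sketch below assumes the extracted fields arrive as a plain dictionary with the keys `headline`, `author`, `publish_date`, `category`, and `body`, and that the date is an ISO 8601 string. Those key names and the date format are assumptions for illustration, not the documented response shape.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Article:
    url: str
    headline: str
    body: str
    author: Optional[str] = None
    category: Optional[str] = None
    published: Optional[datetime] = None


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp, returning None when it is missing or malformed."""
    if not value:
        return None
    if value.endswith("Z"):  # fromisoformat() rejects a trailing "Z" before Python 3.11
        value = value[:-1] + "+00:00"
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


def article_from_fields(url: str, fields: dict) -> Article:
    """Build an Article from extracted fields (hypothetical key names)."""
    headline = (fields.get("headline") or "").strip()
    body = (fields.get("body") or "").strip()
    if not headline or not body:
        raise ValueError(f"missing headline or body for {url}")
    return Article(
        url=url,
        headline=headline,
        body=body,
        author=fields.get("author"),
        category=fields.get("category"),
        published=parse_timestamp(fields.get("publish_date")),
    )
```

Rejecting records without a headline or body keeps section pages and paywall stubs out of the feed, while a missing author or date is tolerated.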

4. Store, Deduplicate, and Distribute

Load extracted articles into your database or search index. Use headline similarity and URL canonicalization to deduplicate across sources covering the same story. Build RSS feeds, email digests, or real-time dashboards from your aggregated content.
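The two deduplication signals named above can be sketched with the Python standard library. This is one possible approach rather than a prescribed one: the list of tracking parameters and the 0.85 similarity threshold are assumptions to tune against your own sources, and `difflib.SequenceMatcher` stands in for whatever similarity measure you prefer.

```python
from difflib import SequenceMatcher
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)           # assumed list, extend as needed
TRACKING_PARAMS = {"fbclid", "gclid"}   # assumed list, extend as needed


def canonicalize_url(url: str) -> str:
    """Lowercase scheme and host, drop tracking params, fragment, and trailing slash."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_PARAMS
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(sorted(query)), "")
    )


def same_story(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two headlines as the same story when their similarity ratio is high."""
    def normalize(s: str) -> str:
        return " ".join(s.lower().split())
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def deduplicate(articles: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep the first (url, headline) pair for each canonical URL and each story."""
    kept: list[tuple[str, str]] = []
    seen_urls: set[str] = set()
    for url, headline in articles:
        canonical = canonicalize_url(url)
        if canonical in seen_urls:
            continue
        if any(same_story(headline, other) for _, other in kept):
            continue
        seen_urls.add(canonical)
        kept.append((canonical, headline))
    return kept
```

The headline comparison here is quadratic in the number of kept articles, which is fine for a few thousand items per window; larger feeds would typically compare only within a time window or use an index.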

News Aggregation Use Cases

How teams use AlterLab for news and media data collection.

Media Monitoring

Track brand mentions, competitor coverage, and industry news across hundreds of publishers. Schedule hourly crawls and filter by keyword to build real-time media intelligence dashboards.
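The keyword filtering step can be sketched as a whole-word, case-insensitive match over extracted article text. The helper names `build_matcher` and `find_mentions` are illustrative, and the keyword list is a placeholder.

```python
import re


def build_matcher(keywords: list[str]) -> re.Pattern:
    """Compile one case-insensitive pattern that matches any keyword as a whole word."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    # Longest keywords first so multi-word phrases win over their prefixes
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = "|".join(re.escape(k) for k in ordered)
    return re.compile(rf"(?<!\w)(?:{pattern})(?!\w)", re.IGNORECASE)


def find_mentions(text: str, matcher: re.Pattern) -> list[str]:
    """Return the distinct keywords (lowercased) mentioned in the text."""
    return sorted({m.group(0).lower() for m in matcher.finditer(text)})


matcher = build_matcher(["AlterLab", "news api"])
mentions = find_mentions("AlterLab launched a News API today.", matcher)
```

Articles with a non-empty `mentions` list would be the ones forwarded to a dashboard or alert. The lookarounds are used instead of `\b` so that keywords starting or ending with punctuation still match cleanly.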

Regional News Aggregation

Collect articles from local and regional news outlets — European dailies, Asian business press, Latin American media. AlterLab handles any language and CMS platform at consistent pricing.

Content Republishing Feeds

Build licensed content feeds from partner publishers. Extract clean article text with metadata (author, date, category) for integration into your content platform or news reader application.

Trend Detection & Analysis

Aggregate articles across sources to detect emerging stories, track topic frequency, and identify narrative shifts. Feed extracted content into NLP pipelines for sentiment and topic analysis.
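As a rough illustration of tracking topic frequency, the sketch below counts how many articles mention each tracked term per day and flags terms whose count rose between two days. It uses plain substring matching, which is crude compared with a real NLP pipeline; the function names and the `min_increase` threshold are assumptions.

```python
from collections import Counter, defaultdict
from datetime import date


def topic_counts_by_day(
    articles: list[tuple[date, str]], topics: list[str]
) -> dict[date, Counter]:
    """Count, per day, the number of articles whose text mentions each topic."""
    counts: dict[date, Counter] = defaultdict(Counter)
    for day, text in articles:
        lowered = text.lower()
        for topic in topics:
            if topic.lower() in lowered:
                counts[day][topic] += 1
    return counts


def rising_topics(
    counts: dict[date, Counter], today: date, yesterday: date, min_increase: int = 2
) -> list[str]:
    """Return topics mentioned in at least `min_increase` more articles today."""
    current = counts.get(today, Counter())
    previous = counts.get(yesterday, Counter())
    return sorted(t for t in current if current[t] - previous[t] >= min_increase)
```

Counting articles rather than raw word occurrences keeps one long piece from dominating the signal. A production version would more likely compare against a rolling baseline than a single previous day.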

Compliance & Regulatory Monitoring

Monitor government news portals, regulatory agency publications, and industry press for policy changes, enforcement actions, and compliance updates relevant to your business.

Research & Academic Collections

Collect news articles for media studies, political science research, or training datasets. AlterLab's structured markdown output is ready for corpus analysis and NLP processing.


Your first scrape.
Sixty seconds.

$1 free balance. No credit card. No SDK. Just a POST request.

terminal
curl -X POST https://api.alterlab.io/v1/scrape \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com", "formats": ["markdown"]}'

No credit card required · Up to 5,000 free scrapes · Balance never expires
