News Aggregation API
Collect articles from news sites, regional media portals, and content platforms at scale. AlterLab extracts clean article content from any CMS-based publisher — WordPress, custom platforms, JavaScript-rendered sites. Multi-site batch crawls with change detection. From $0.0002/request.
News Sites → Clean Articles → Your Feed
One API call per article. Structured content ready for your pipeline.
import requests

api_key = "YOUR_API_KEY"  # placeholder; replace with your own key

# Extract article with structured metadata
response = requests.post(
    "https://api.alterlab.io/v1/scrape",
    headers={"X-API-Key": api_key},
    json={
        "url": "https://news.example.com/article/123",
        "formats": ["markdown"],
    },
)
article = response.json()["data"]["markdown"]
# Clean article text, no ads or nav

# Monitor multiple news sources at once
response = requests.post(
    "https://api.alterlab.io/v1/batch/scrape",
    headers={"X-API-Key": api_key},
    json={
        "urls": [
            "https://news-a.com",
            "https://news-b.com",
            "https://regional-daily.eu",
        ],
        "formats": ["markdown"],
    },
)
# Compare URLs to detect new articles

Global Coverage — Any Region, Any Language
AlterLab works with news publishers worldwide — European regional media, Asian news portals, Latin American dailies, and everything in between. Most regional news sites use standard CMS platforms that AlterLab handles at Tier 1 pricing ($0.0002/request). Content is returned in its original language and character encoding with no translation or modification.
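To make the Tier 1 price concrete, here is a rough cost estimate. It is a sketch under one loud assumption that this page does not confirm: every URL, including each URL inside a batch call, is billed as one request at $0.0002. The function name and parameters are illustrative.

```python
from decimal import Decimal

# Tier 1 price quoted on this page. Billing one request per URL is an assumption.
PRICE_PER_REQUEST = Decimal("0.0002")


def daily_cost(pages: int, poll_minutes: int, new_articles_per_day: int = 0) -> Decimal:
    """Estimate daily spend for polling `pages` source pages plus article fetches."""
    polls_per_day = (24 * 60) // poll_minutes
    requests_per_day = pages * polls_per_day + new_articles_per_day
    return requests_per_day * PRICE_PER_REQUEST


# 50 source pages polled every 15 minutes, plus 200 new articles fetched per day
estimate = daily_cost(pages=50, poll_minutes=15, new_articles_per_day=200)
print(f"Estimated cost: ${estimate:.2f} per day")
```

Under these assumptions, 50 pages polled 96 times a day plus 200 article fetches come to 5,000 requests, or about $1.00 per day.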
Building a News Aggregation Pipeline
From raw news sites to a structured, continuously updated article feed.
Define Your News Sources
List the news sites and media portals you want to monitor. Include front pages, section pages (politics, business, technology, sports), and any RSS endpoints. AlterLab works with any publisher — WordPress blogs, custom CMS platforms, JavaScript-rendered news apps, and static HTML sites.
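One way to keep the source list manageable is a small registry that is flattened into a batch request body. The sketch below is illustrative: the `NewsSource` class and its fields are hypothetical, and only the `urls` and `formats` keys come from the batch example on this page.

```python
from dataclasses import dataclass, field


@dataclass
class NewsSource:
    """One publisher to monitor. All field names here are illustrative."""
    name: str
    pages: list[str] = field(default_factory=list)  # front page, sections, RSS
    poll_minutes: int = 60


def build_batch_payload(sources: list[NewsSource]) -> dict:
    """Flatten every source page into one batch request body.

    The "urls" and "formats" keys mirror the batch example on this page.
    """
    urls: list[str] = []
    for source in sources:
        for page in source.pages:
            if page not in urls:  # keep order, skip duplicates
                urls.append(page)
    return {"urls": urls, "formats": ["markdown"]}


sources = [
    NewsSource("News A", ["https://news-a.com", "https://news-a.com/politics"], 15),
    NewsSource("Regional Daily", ["https://regional-daily.eu"], 1440),
]
payload = build_batch_payload(sources)
```

The resulting `payload` can be passed as the `json=` argument of the batch request.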
Schedule Multi-Site Crawls
Use the batch endpoint to check all your source pages on a recurring schedule. For breaking news, poll every 15 minutes. For daily digests, once per day is sufficient. The batch endpoint handles hundreds of URLs per call, so monitoring 50+ sources is a single API request.
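Detecting new articles between polls can be as simple as comparing the links on each source page with the links already seen. The sketch below assumes you already hold the markdown returned for a source page. It does not call the API, and the helper names are hypothetical.

```python
import re
from urllib.parse import urljoin

# Matches inline markdown links: [text](target). Images (![alt](src)) are skipped.
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)")


def extract_links(markdown: str, base_url: str) -> list[str]:
    """Return absolute http(s) link targets found in a markdown page, in order."""
    links: list[str] = []
    for target in LINK_RE.findall(markdown):
        absolute = urljoin(base_url, target)
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


def new_article_urls(links: list[str], seen: set[str]) -> list[str]:
    """Return links not seen on earlier polls and record them as seen."""
    fresh = [url for url in links if url not in seen]
    seen.update(fresh)
    return fresh


front_page = (
    "![logo](/img/logo.png)\n"
    "[Story one](/article/1)\n"
    "[Story two](https://news-a.com/article/2)\n"
)
seen = {"https://news-a.com/article/1"}
fresh = new_article_urls(extract_links(front_page, "https://news-a.com"), seen)
```

In practice the `seen` set would live in a database or key-value store so it survives between scheduled runs.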
Extract Clean Article Content
For each new article URL detected, call the scrape endpoint with the markdown format for clean text, or with JSON and an extraction schema for structured fields (headline, author, publish date, category, article body). AlterLab removes ads, navigation, cookie banners, and related-article widgets, and returns only the article itself.
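Once structured fields come back, it helps to validate and normalize them before storage. The sketch below works on a plain dictionary and does not call the API. The key names (`headline`, `publish_date`, and so on) are assumptions for illustration, not the documented response format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Article:
    url: str
    headline: str
    body: str
    author: Optional[str] = None
    published: Optional[datetime] = None
    category: Optional[str] = None


def parse_article(url: str, fields: dict) -> Article:
    """Validate extracted fields and normalize them into an Article.

    The key names used here are assumptions, not a documented schema.
    """
    headline = (fields.get("headline") or "").strip()
    body = (fields.get("body") or "").strip()
    if not headline or not body:
        raise ValueError(f"missing headline or body for {url}")

    published = None
    raw_date = fields.get("publish_date")
    if raw_date:
        try:
            published = datetime.fromisoformat(raw_date)
        except (TypeError, ValueError):
            published = None  # keep the article even if the date is not ISO 8601

    return Article(
        url=url,
        headline=headline,
        body=body,
        author=fields.get("author") or None,
        published=published,
        category=fields.get("category") or None,
    )


article = parse_article(
    "https://news.example.com/article/123",
    {
        "headline": " Council approves budget ",
        "body": "The city council voted on Tuesday.",
        "author": "",
        "publish_date": "2024-05-01T08:30:00+00:00",
    },
)
```

Rejecting records without a headline or body early keeps partial extractions out of the feed, while a missing author or unparseable date is tolerated.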
Store, Deduplicate, and Distribute
Load extracted articles into your database or search index. Use headline similarity and URL canonicalization to deduplicate across sources covering the same story. Build RSS feeds, email digests, or real-time dashboards from your aggregated content.
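A minimal sketch of the deduplication step, using only the Python standard library. The list of tracking parameters, the 0.8 similarity threshold, and the `url`/`headline` record keys are illustrative assumptions to tune for your own sources.

```python
from difflib import SequenceMatcher
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)       # assumed list; extend as needed
TRACKING_PARAMS = {"fbclid", "gclid"}


def canonicalize_url(url: str) -> str:
    """Lowercase scheme and host, drop 'www.', tracking params, and fragments."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(TRACKING_PREFIXES) and key not in TRACKING_PARAMS
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(sorted(query)), ""))


def headline_similarity(a: str, b: str) -> float:
    """Character-level similarity between two headlines, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def deduplicate(articles: list[dict], threshold: float = 0.8) -> list[dict]:
    """Keep the first article per canonical URL and per near-identical headline."""
    kept: list[dict] = []
    seen_urls: set[str] = set()
    for article in articles:
        url = canonicalize_url(article["url"])
        if url in seen_urls:
            continue
        if any(
            headline_similarity(article["headline"], other["headline"]) >= threshold
            for other in kept
        ):
            continue
        seen_urls.add(url)
        kept.append(article)
    return kept


articles = [
    {"url": "https://www.news-a.com/story/42/?utm_source=x",
     "headline": "Central bank raises interest rates by 0.25 points"},
    {"url": "https://news-a.com/story/42",
     "headline": "Central bank raises interest rates by 0.25 points"},
    {"url": "https://news-b.com/economy/rates",
     "headline": "Central bank raises interest rates by 0.25 points, markets react"},
    {"url": "https://regional-daily.eu/sport/final",
     "headline": "Local team wins championship final"},
]
unique = deduplicate(articles)
```

The pairwise headline comparison is quadratic in the number of kept articles, which is fine for a few thousand items per run. Larger feeds would need a different approach, such as hashing or blocking by date.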
News Aggregation Use Cases
How teams use AlterLab for news and media data collection.
Media Monitoring
Track brand mentions, competitor coverage, and industry news across hundreds of publishers. Schedule hourly crawls and filter by keyword to build real-time media intelligence dashboards.
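Keyword filtering can run locally on the extracted text. The following is a minimal sketch with hypothetical helper names. It assumes each keyword starts and ends with a letter or digit, so that word-boundary matching behaves as expected.

```python
import re


def build_matcher(keywords: list[str]) -> re.Pattern:
    """Compile a case-insensitive whole-word pattern for a list of keywords."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    # Longer keywords first, so "Acme Corp" wins over "Acme" at the same position.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = "|".join(re.escape(keyword) for keyword in ordered)
    return re.compile(rf"\b(?:{pattern})\b", re.IGNORECASE)


def find_mentions(text: str, matcher: re.Pattern) -> list[str]:
    """Return the distinct keywords found in the text, lowercased and sorted."""
    return sorted({match.group(0).lower() for match in matcher.finditer(text)})


matcher = build_matcher(["Acme", "Acme Corp", "widget"])
mentions = find_mentions(
    "ACME Corp unveiled a new Widget today. Acmes rivals were unimpressed.",
    matcher,
)
```

Articles with a non-empty `mentions` list can then be routed to the dashboard, with the matched keywords stored alongside them.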
Regional News Aggregation
Collect articles from local and regional news outlets — European dailies, Asian business press, Latin American media. AlterLab handles any language and CMS platform at consistent pricing.
Content Republishing Feeds
Build licensed content feeds from partner publishers. Extract clean article text with metadata (author, date, category) for integration into your content platform or news reader application.
Trend Detection & Analysis
Aggregate articles across sources to detect emerging stories, track topic frequency, and identify narrative shifts. Feed extracted content into NLP pipelines for sentiment and topic analysis.
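As a starting point before a full NLP pipeline, topic frequency can be tracked by counting headline terms per time window and comparing windows. This sketch is deliberately simple: the stopword list, the `min_count` and `growth` thresholds, and the function names are all illustrative assumptions.

```python
import re
from collections import Counter

STOPWORDS = {"a", "an", "and", "as", "at", "by", "for", "in", "of", "on", "the", "to"}
WORD_RE = re.compile(r"[^\W\d_]+")  # runs of letters, in any script


def term_counts(headlines: list[str]) -> Counter:
    """Count how many headlines mention each term (once per headline)."""
    counts: Counter = Counter()
    for headline in headlines:
        terms = set(WORD_RE.findall(headline.lower())) - STOPWORDS
        counts.update(terms)
    return counts


def emerging_terms(previous: Counter, current: Counter,
                   min_count: int = 2, growth: float = 2.0) -> list[str]:
    """Terms seen at least `min_count` times now and `growth` times more than before."""
    return sorted(
        term for term, count in current.items()
        if count >= min_count and count >= growth * previous.get(term, 0)
    )


yesterday = term_counts(["Port strike talks continue", "Markets open flat"])
today = term_counts([
    "Port strike halts shipping",
    "Strike spreads to second port",
    "Markets open flat",
])
rising = emerging_terms(yesterday, today)
```

Counting each term once per headline keeps a single repetitive headline from dominating the window.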
Compliance & Regulatory Monitoring
Monitor government news portals, regulatory agency publications, and industry press for policy changes, enforcement actions, and compliance updates relevant to your business.
Research & Academic Collections
Collect news articles for media studies, political science research, or training datasets. AlterLab's structured markdown output is ready for corpus analysis and NLP processing.
Related Resources
Market Research
Monitor reviews, track trends, and analyze competitor sentiment at scale.
AI Training Data
Collect web data for LLM fine-tuning and RAG knowledge bases.
Other Use Cases
Price monitoring, lead generation, SERP tracking — see all AlterLab use cases.
Pricing
From $0.0002/request. No subscriptions. Balance never expires.
Your first scrape.
Sixty seconds.
$1 free balance. No credit card. No SDK.
Just a POST request.
No credit card required · Up to 5,000 free scrapes · Balance never expires