
Data Freshness

Data freshness measures how recently scraped data was collected relative to the current state of the source. It is a critical metric for price monitoring, news aggregation, and inventory tracking.

Scraped data has a shelf life. A product price scraped 24 hours ago may no longer reflect the current price; a flight fare scraped 6 hours ago may have changed. For use cases where timeliness matters (competitive pricing, live inventory, news alerts), data freshness is as important as accuracy.

Freshness is typically tracked by recording a `scraped_at` timestamp alongside each record and measuring the lag between when the source updated and when the scraper captured the change. SLA-driven scraping systems define freshness requirements per data type (e.g., prices must be at most 30 minutes old) and configure crawl schedules accordingly.
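As a minimal sketch of this bookkeeping, the Python snippet below computes a record's age and its capture lag, then checks the age against a per-type SLA. The record layout, the `source_updated_at` field, the helper names, and the one-hour inventory limit are assumptions made for illustration; only the `scraped_at` timestamp and the 30-minute price limit come from the text above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-data-type freshness limits. The 30-minute price limit
# mirrors the example in the text; the inventory value is an assumption.
FRESHNESS_SLA = {
    "price": timedelta(minutes=30),
    "inventory": timedelta(hours=1),
}


def record_age(record, now=None):
    """Time elapsed since the record was scraped."""
    now = now or datetime.now(timezone.utc)
    return now - record["scraped_at"]


def capture_lag(record):
    """Lag between the source update and the scrape that captured it.

    Returns None when the source does not expose an update time.
    """
    updated = record.get("source_updated_at")
    if updated is None:
        return None
    return record["scraped_at"] - updated


def is_stale(record, now=None):
    """True when the record is older than the SLA for its data type."""
    return record_age(record, now) > FRESHNESS_SLA[record["data_type"]]
```

A scheduler could call `is_stale` on stored records and queue a re-scrape for any that exceed their limit. Storing timestamps as timezone-aware UTC values keeps the subtraction unambiguous.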

Event-driven scraping improves freshness compared to fixed schedules: instead of crawling every N hours, the system triggers a re-scrape whenever a change signal is detected (a page's `Last-Modified` header changes, a sitemap entry's `<lastmod>` updates, or a webhook fires from the source platform).
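One of these change signals, the sitemap `<lastmod>` value, can be sketched in Python using only the standard library. The function names and the `last_scraped` mapping (URL to the time it was last scraped) are hypothetical, and the sketch assumes the sitemap uses the standard sitemaps.org namespace and ISO 8601 timestamps.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(value):
    """Parse a <lastmod> value; naive or date-only values are treated as UTC."""
    dt = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


def urls_to_rescrape(sitemap_xml, last_scraped):
    """Return URLs never scraped before or modified since their last scrape.

    `last_scraped` maps URL -> timezone-aware datetime of the last scrape.
    """
    root = ET.fromstring(sitemap_xml)
    due = []
    for entry in root.findall("sm:url", SITEMAP_NS):
        loc = entry.findtext("sm:loc", namespaces=SITEMAP_NS)
        if not loc:
            continue
        loc = loc.strip()
        lastmod = entry.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        seen = last_scraped.get(loc)
        if seen is None:
            due.append(loc)  # never scraped: always due
        elif lastmod and parse_lastmod(lastmod) > seen:
            due.append(loc)  # source reports a newer modification
    return due
```

Entries without a `<lastmod>` value give no change signal here, so a real system would likely fall back to a fixed schedule for those URLs.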

Related Terms

Data Pipeline

A data pipeline is an automated sequence of steps that ingests raw data from a source, transforms it, and delivers it to a destination such as a database, data warehouse, or analytics system.

Caching

Caching stores copies of web responses to serve repeat requests faster, which can cause scrapers to receive stale data instead of live content.

HTTP Caching

HTTP caching stores copies of responses at the client or an intermediate proxy, allowing subsequent requests for the same resource to be served without a full server round-trip.

ETL (Extract, Transform, Load)

ETL is a data integration pattern where raw data is Extracted from a source, Transformed into the desired format, and Loaded into a destination system.

Event-Driven Scraping

Event-driven scraping triggers scrape jobs in response to external events — webhooks, schedule triggers, or message queue messages — rather than running on a fixed polling interval.



Data Freshness — Web Scraping Glossary | AlterLab