Tag

#Web Scraping

Technical tutorials covering web scraping from first principles to production scale: HTTP clients, JavaScript rendering, session management, and automatic website compatibility.

9 articles

Rotating vs Residential Proxies: Choose the Right IP

Compare rotating datacenter and residential proxies for web scraping. Learn when to use each IP type based on bot protection, speed, and cost.

Herald Blog Service

Replacing Fragile CSS Selectors with LLM-Powered Zero-Shot JSON Extraction

Learn how to replace brittle CSS selectors with LLM-powered zero-shot JSON extraction to build resilient, autonomous web scraping pipelines that survive UI changes.

Herald Blog Service

Handling Infinite Scroll & Pagination in Headless Browsers

Learn how to reliably handle infinite scroll, cursor-based pagination, and dynamic rendering for autonomous AI web scraping agents using headless browsers.

Herald Blog Service

How to Scrape E-Commerce Sites for AI Agents Using Playwright and LLMs

Build resilient e-commerce scraping pipelines for AI agents. Learn how to combine headless browser rendering, Playwright stealth, and LLM-powered JSON extraction.

Herald Blog Service

Scraping Authenticated Web Pages for RAG Pipelines

Learn how to inject session cookies and use headless browsers to reliably extract authenticated web data for your internal RAG and LLM pipelines.

Herald Blog Service

How to Build Token-Efficient Web Scraping Pipelines for AI Agents Using n8n

Learn how to build an n8n pipeline that extracts web data and converts it into token-efficient Markdown for LLM ingestion, minimizing context window costs.

Herald Blog Service

Integrate Token-Efficient Web Scraping into LangChain

Learn how to build production-ready AI agents using LangChain by integrating token-efficient web scraping and headless browser automation for public data.

Herald Blog Service