A CSS selector is a pattern, written in a syntax originally designed for styling web pages, that is widely used in web scraping to target specific HTML elements. CSS selectors match elements by tag name (`div`), class (`.price`), ID (`#product-title`), attribute (`a[href]`), hierarchical relationship (`article > p`), or combinations of these. The syntax is concise and readable, which makes it a common first choice for extracting data from structured HTML.
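To make the selector kinds above concrete, here is a minimal sketch in Python using BeautifulSoup with the built-in `html.parser`. The HTML snippet, the product details, and the `details` class name are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment, invented for this example.
HTML = """
<article>
  <h1 id="product-title">Walnut Desk</h1>
  <p>Solid wood, 140 cm wide.</p>
  <div class="details">
    <p>Ships in two boxes.</p>
    <span class="price">249.00</span>
    <a href="/specs">Specs</a>
    <a name="reviews">Reviews</a>
  </div>
</article>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Tag name: every <div> element.
by_tag = soup.select("div")

# Class: every element with class="price".
by_class = [el.get_text() for el in soup.select(".price")]

# ID: the element with id="product-title".
by_id = [el.get_text() for el in soup.select("#product-title")]

# Attribute: only <a> elements that carry an href attribute.
by_attribute = [el.get_text() for el in soup.select("a[href]")]

# Child combinator: <p> elements that are direct children of <article>.
direct_children = [el.get_text() for el in soup.select("article > p")]

# Descendant combinator, for contrast: <p> at any depth under <article>.
any_depth = [el.get_text() for el in soup.select("article p")]

# Combination: tag + class on both sides of a child combinator.
combined = [el.get_text() for el in soup.select("div.details > span.price")]
```

Note the difference between `article > p` (one match here) and `article p` (two matches): the child combinator does not descend into the nested `div`.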
CSS selectors are supported natively in browsers via `document.querySelector()` (first match) and `document.querySelectorAll()` (all matches), and in server-side parsing libraries including BeautifulSoup (`soup.select()`), Cheerio (`$('.price')`), and lxml (`tree.cssselect()`, which requires the separate `cssselect` package). Playwright and Puppeteer accept CSS selectors directly in element interaction methods.
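As a rough server-side counterpart to the browser's first-match and all-matches calls, BeautifulSoup offers `select_one()` and `select()`. The sketch below assumes a made-up list of prices; `.discount` is a hypothetical class that is deliberately absent from the markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example.
soup = BeautifulSoup(
    '<ul><li class="price">10</li><li class="price">20</li></ul>',
    "html.parser",
)

# select_one returns the first matching element, or None if nothing matches.
first = soup.select_one(".price")

# select returns every matching element as a list-like result.
all_prices = [li.get_text() for li in soup.select(".price")]

# A selector that matches nothing yields None here, not an exception.
missing = soup.select_one(".discount")
```

Because a miss comes back as `None` or an empty list rather than an error, calling code needs to check the result before reading text or attributes from it.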
The primary limitation of CSS selectors is fragility: they are coupled to the specific DOM structure of the target page. If the site changes its HTML (redesign, A/B test, framework migration), a selector may silently return no matches instead of raising an error, so a broken scraper can go unnoticed. For pages whose DOM structure is stable, CSS selectors are usually the most efficient extraction method; for variable or frequently changing pages, AI-powered schema extraction can be more resilient.
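One way to keep a silent miss from going unnoticed is to treat an empty result as an error and, optionally, to try fallback selectors in order. The following is a minimal sketch under stated assumptions, not a standard library feature: `SelectorMiss` and `select_required` are hypothetical names, and both HTML fragments are invented to simulate a redesign that replaces a class with a `data-testid` attribute.

```python
from bs4 import BeautifulSoup


class SelectorMiss(Exception):
    """Raised when none of the candidate selectors match (hypothetical helper)."""


def select_required(soup, selectors):
    """Return the matches of the first selector that matches anything.

    Raises SelectorMiss instead of returning an empty list, so a changed
    page structure fails loudly.
    """
    for selector in selectors:
        matches = soup.select(selector)
        if matches:
            return matches
    raise SelectorMiss(f"no match for any of: {selectors}")


# Invented markup: the same price before and after a simulated redesign.
OLD_HTML = '<div class="product"><span class="price">249.00</span></div>'
NEW_HTML = '<div class="product"><span data-testid="price">249.00</span></div>'

old_soup = BeautifulSoup(OLD_HTML, "html.parser")
new_soup = BeautifulSoup(NEW_HTML, "html.parser")

# The original selector works on the old page...
old_price = old_soup.select(".price")[0].get_text()

# ...but returns an empty list, with no error, on the redesigned page.
silent_miss = new_soup.select(".price")

# With fallbacks, the second selector picks up the new markup.
candidates = [".price", "[data-testid='price']"]
new_price = select_required(new_soup, candidates)[0].get_text()
```

Fallback selectors only postpone the problem if the page keeps changing, but raising on an empty result at least turns a silent data gap into a visible failure.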