In a web crawl, depth counts link hops from the starting point: depth 0 is the seed URL itself, depth 1 is every page linked from the seed, depth 2 is every page first reached through a link on a depth-1 page, and so on. Setting a maximum crawl depth keeps the crawler out of deeply nested pages that are unlikely to contain the target data, and it helps control crawl budget.
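The depth numbering above can be sketched as a breadth-first walk over a link graph. This is a minimal illustration, not a real crawler: `crawl_depths` and the `LINKS` dictionary are hypothetical names, and the dictionary stands in for fetching pages and extracting their links.

```python
from collections import deque


def crawl_depths(links, seed, max_depth):
    """Return {url: depth} for every page within max_depth hops of seed.

    `links` is a hypothetical in-memory link graph standing in for real fetches.
    """
    depths = {seed: 0}
    frontier = deque([seed])
    while frontier:
        url = frontier.popleft()
        depth = depths[url]
        if depth == max_depth:
            continue  # at the limit: keep the page but do not follow its links
        for target in links.get(url, []):
            if target not in depths:  # in breadth-first order, first discovery is the shallowest
                depths[target] = depth + 1
                frontier.append(target)
    return depths


# Illustrative site: "/a/1/x" sits at depth 3 and is never reached with max_depth=2.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/"],
    "/b": ["/a/1"],
    "/a/1": ["/a/1/x"],
}

print(crawl_depths(LINKS, "/", max_depth=2))
# {'/': 0, '/a': 1, '/b': 1, '/a/1': 2}
```

Pages at the maximum depth are still recorded; only their outgoing links are ignored.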
Depth limits are especially important for sites with infinite or near-infinite link graphs generated by parameter combinations (calendar pages, faceted search results). A crawler with no depth limit that encounters such a site keeps discovering new URLs, so its crawl frontier (the queue of URLs waiting to be fetched) grows without bound.
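To make the calendar case concrete, the sketch below simulates a site where every month page links to the previous and next month. Everything here is an assumption for illustration: `calendar_links` is a hypothetical stand-in for link extraction, pages are represented as `(year, month)` tuples rather than URLs, and `page_budget` exists only so the unlimited run terminates.

```python
from collections import deque


def calendar_links(page):
    """Hypothetical link generator: each (year, month) page links to its neighbours."""
    year, month = page
    index = year * 12 + (month - 1)  # months counted from year 0
    neighbours = []
    for i in (index - 1, index + 1):
        y, m = divmod(i, 12)
        neighbours.append((y, m + 1))
    return neighbours


def count_pages(seed, max_depth=None, page_budget=10_000):
    """Breadth-first walk; stops expanding at max_depth, or at page_budget if no limit is set."""
    depths = {seed: 0}
    frontier = deque([seed])
    while frontier and len(depths) < page_budget:
        page = frontier.popleft()
        if max_depth is not None and depths[page] >= max_depth:
            continue  # do not follow links past the depth limit
        for target in calendar_links(page):
            if target not in depths:
                depths[target] = depths[page] + 1
                frontier.append(target)
    return len(depths)


print(count_pages((2024, 1), max_depth=3))  # 7: the seed plus three months either side
print(count_pages((2024, 1)))               # 10000: only the safety budget stops it
```

With a depth limit the number of discovered pages is fixed by the site's branching; without one, the only thing that ends the walk is an external budget.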
In Scrapy, the maximum depth is set with the `DEPTH_LIMIT` setting; the default of `0` imposes no limit. Individual spiders can apply a stricter cutoff in their callbacks by checking `response.meta.get('depth')` and declining to follow links past a threshold. Some crawlers use a breadth-first strategy (processing all depth-N pages before any depth-(N+1) page) to ensure that the pages closest to the seed are captured first even if the crawl is terminated early.
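As a rough illustration, a Scrapy settings fragment along the following lines would cap depth and request breadth-first ordering. The limit of 3 is an arbitrary example value. The queue settings follow the recipe in Scrapy's FAQ for breadth-first crawling, which is needed because Scrapy's default order is depth-first; check them against the Scrapy version in use.

```python
# settings.py (fragment; values are illustrative assumptions)

# Drop requests more than 3 link hops from the start URLs. 0 means no limit.
DEPTH_LIMIT = 3

# Positive values lower the priority of deeper requests, favouring shallow pages.
DEPTH_PRIORITY = 1

# FIFO queues make the scheduler process requests in breadth-first order.
SCHEDULER_DISK_QUEUE = "scrapy.squeues.PickleFifoDiskQueue"
SCHEDULER_MEMORY_QUEUE = "scrapy.squeues.FifoMemoryQueue"
```

The same keys can be placed in a spider's `custom_settings` dictionary when the limit should apply to one spider rather than the whole project.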