An embedding model transforms a piece of text — a word, sentence, paragraph, or document — into a fixed-length vector of floating-point numbers. The model is trained so that semantically related texts produce vectors that are geometrically close (high cosine similarity), while unrelated texts produce distant vectors. This property allows semantic search: a query vector finds documents by meaning rather than exact keyword match.
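To make "geometrically close" concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are invented for illustration; real embedding models output hundreds or thousands of dimensions, and the names `query`, `related_doc`, and `unrelated_doc` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings", made up for this example.
query = [0.9, 0.1, 0.0]
related_doc = [0.8, 0.2, 0.1]    # points in roughly the same direction as the query
unrelated_doc = [0.0, 0.1, 0.9]  # points in a nearly orthogonal direction

sim_related = cosine_similarity(query, related_doc)
sim_unrelated = cosine_similarity(query, unrelated_doc)

print(f"related:   {sim_related:.3f}")
print(f"unrelated: {sim_unrelated:.3f}")
```

A score near 1 means the vectors point the same way, and a score near 0 means they are nearly orthogonal. Ranking documents by this score against a query vector is the core of semantic search.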
Embeddings are the foundation of modern retrieval systems, clustering algorithms, recommendation engines, and RAG pipelines. Common embedding models include OpenAI's `text-embedding-3-large`, Cohere Embed, and open-source alternatives like `all-MiniLM-L6-v2` from Sentence Transformers.
In a web scraping context, scraped text is typically chunked into overlapping segments of 200–500 tokens, embedded, and stored in a vector database. At query time, the user's question is embedded with the same model, and the nearest chunks are retrieved by approximate nearest-neighbour search.
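The chunk, embed, and retrieve flow can be sketched end to end with the standard library alone. Several pieces here are assumptions made to keep the example runnable: `toy_embed` is a hypothetical hashed bag-of-words stand-in for a real embedding model, whitespace-separated words stand in for model tokens, the chunk size is tiny so the output stays readable, and `top_k` does an exact brute-force scan where a vector database would use an approximate index.

```python
import math
import zlib

def chunk_tokens(tokens, size, overlap):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks

def toy_embed(text, dim=64):
    """Hypothetical stand-in for an embedding model: unit-length hashed bag of words."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def top_k(query_vec, index, k):
    """Exact brute-force search. Vectors are unit length, so dot product equals cosine."""
    scored = [(sum(q * d for q, d in zip(query_vec, vec)), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Made-up scraped text, used only for this example.
scraped = (
    "Our store ships orders within two business days. "
    "Returns are accepted within thirty days of delivery. "
    "Contact support by email for warranty claims."
)

# Real pipelines would use 200-500 model tokens per chunk; 8 words keeps the demo small.
chunks = [" ".join(c) for c in chunk_tokens(scraped.split(), size=8, overlap=2)]
index = [(chunk, toy_embed(chunk)) for chunk in chunks]

# The query goes through the same embedding function as the chunks.
results = top_k(toy_embed("returns accepted within thirty days"), index, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The overlap exists so that a sentence falling on a chunk boundary still appears whole in at least one chunk. In a production system, `toy_embed` would be replaced by a call to a real model and the list scan by a vector database query.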