Definition
Embedding
An embedding is a numerical representation of text, images or other data that captures meaning for search, comparison and AI workflows.
Short definition
An embedding is a list of numbers that represents the meaning or features of an item. Text passages, images, products, users and documents can all be converted into embeddings so software can compare them mathematically.
How it works
An embedding model maps input data into a vector space. Items with similar meaning are placed closer together. This makes it possible to search by meaning rather than exact keywords.
Example
In a support knowledge base, the questions password reset failed and cannot access my account may have similar embeddings even though they use different words. A retrieval system can use that similarity to find helpful documents.
Why it matters
Embeddings are core to semantic search, recommendation systems and RAG. Their usefulness depends on the embedding model, the data domain and how vectors are indexed and evaluated.
How embeddings are compared
Common similarity measures include cosine similarity, dot product and Euclidean distance. The metric should match the embedding model's intended use. Documents and queries must be encoded with the same model, and switching models normally requires rebuilding the index.
Common limitations
An embedding may capture general meaning but struggle with numbers, negation or specialized terminology. Semantic search is therefore often combined with filters and lexical search. Privacy also matters: a vector is not readable like a sentence, but it still represents source content and should be protected when that content is sensitive.