You can store vector embeddings alongside your other MongoDB data. These embeddings capture meaningful relationships in your data and allow you to perform semantic search and implement RAG.
Get Started
Use the following tutorial to learn how to create vector embeddings and query them using vector search. Specifically, you perform the following actions:
- Define a function that uses an embedding model to generate vector embeddings.
  - Select whether you want to use a proprietary or open-source model. For state-of-the-art embeddings, use Voyage AI.
- Create embeddings from your data and store them in MongoDB.
  - Select whether you want to create embeddings from new data or from existing data that you already have in a MongoDB collection.
- Create embeddings from your search terms and run a vector search query. 
For production applications, you typically write a script to generate vector embeddings. You can start with the sample code on this page and customize it for your use case.
Prerequisites
To complete this tutorial, you must have the following:
Use an Embedding Model
Create Embeddings from Data
In this section, you create vector embeddings from your data using the function that you defined, and then you store these embeddings in a MongoDB collection.
Create Embeddings for Queries
In this section, you index the vector embeddings in your collection and create an embedding that you use to run a sample vector search query.
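As a rough sketch of what this step involves, the following Python builds a vector index definition and a `$vectorSearch` aggregation pipeline as plain dictionaries. The helper names, the `embedding` and `text` field names, the `vector_index` index name, and the 1024-dimension value are illustrative assumptions, not values taken from this tutorial. Set `numDimensions` to the output size of your own model.

```python
def build_index_definition(path, num_dimensions, similarity="cosine"):
    """Return a vector search index definition for a single embedding field."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,                     # field that stores the embedding
                "numDimensions": num_dimensions,  # must match the model's output size
                "similarity": similarity,         # "cosine", "euclidean", or "dotProduct"
            }
        ]
    }


def build_vector_search_pipeline(query_vector, path, index_name,
                                 limit=5, num_candidates=100):
    """Return an aggregation pipeline that runs a $vectorSearch query."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # "text" is a hypothetical field name; project whatever fields you need.
        {"$project": {"_id": 0, "text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


# Example with assumed names and a placeholder query vector.
index_definition = build_index_definition("embedding", 1024)
pipeline = build_vector_search_pipeline([0.0] * 1024, "embedding", "vector_index")
```

With PyMongo, you would typically pass a pipeline like this to `collection.aggregate(pipeline)` after the index has finished building.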
The vector search returns the documents whose embeddings are closest in distance to your query embedding. A smaller distance indicates that a document is more similar in meaning to your query.
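To make "closest in distance" concrete, here is a minimal brute-force sketch that ranks a few toy vectors by cosine similarity. It only illustrates the idea and is not how MongoDB Vector Search executes queries. The labels and two-dimensional vectors are made up, and cosine is just one of the similarity functions you can configure on an index.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, labeled_vectors):
    """Sort (label, vector) pairs from most to least similar to the query."""
    scored = [
        (label, cosine_similarity(query_vector, vector))
        for label, vector in labeled_vectors
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: hypothetical labels with hand-written 2-dimensional "embeddings".
documents = [
    ("doc_a", [1.0, 0.0]),
    ("doc_b", [0.0, 1.0]),
    ("doc_c", [1.0, 1.0]),
]
ranking = rank_by_similarity([1.0, 0.1], documents)
```

In this toy example, `doc_a` ranks first because its vector points in nearly the same direction as the query vector.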
Considerations
Consider the following factors when creating vector embeddings:
Choosing a Method to Create Embeddings
To create vector embeddings, you must use an embedding model. Embedding models are algorithms that generate numerical representations of your data. Choose one of the following ways to access an embedding model:
| Method | Description | 
|---|---|
| Load an open-source model | If you don't have an API key for a proprietary embedding model, load an open-source embedding model locally from your application. | 
| Use a proprietary model | Most AI providers offer APIs for their proprietary embedding models that you can use to create vector embeddings. For state-of-the-art embeddings, use Voyage AI. | 
| Leverage an integration | You can integrate MongoDB Vector Search with open-source frameworks and AI services to quickly connect to both open-source and proprietary embedding models and generate vector embeddings for MongoDB Vector Search. To learn more, see Integrate MongoDB with AI Technologies. | 
Choosing an Embedding Model
The embedding model you choose affects your query results and determines the number of dimensions you specify in your MongoDB Vector Search index. Each model offers different advantages depending on your data and use case. For state-of-the-art embeddings, including multi-modal and domain-specific embedding models, use Voyage AI.
When choosing an embedding model for MongoDB Vector Search, consider the following metrics:
- Embedding Dimensions: The length of the vector embedding.
  - Smaller embeddings are more storage efficient, while larger embeddings can capture more nuanced relationships in your data. The model you choose should strike a balance between efficiency and complexity.
- Max Tokens: The maximum number of tokens that can be compressed into a single embedding.
- Model Size: The size of the model in gigabytes.
  - Larger models generally perform better, but they require more computational resources as you scale MongoDB Vector Search to production.
- Retrieval Average: A score that measures the performance of retrieval systems.
  - A higher score indicates that the model is better at ranking relevant documents higher in the list of retrieved results. This score is important when choosing a model for RAG applications.
Vector Compression
If you have a large number of float vectors and want to reduce the storage and WiredTiger footprint (such as disk and memory usage) in mongod, compress your embeddings by converting them to binData vectors.
BinData is a BSON data type that stores binary data. The default type for vector embeddings is an array of 32-bit floats (float32). Binary data is more storage efficient than the default array format and requires about one third of the disk space.
Storing binData vectors also improves query performance because fewer resources are needed to load a document into the working set. This can significantly improve query speed for vector queries that return more than 20 documents. If you compress your float32 embeddings, you can query them with either float32 or binData vectors.
The tutorial on this page includes an example function that you can use to convert your float32 vectors to binData vectors.
Supported Drivers
BSON BinData vectors are supported by the following drivers:
- C++ Driver v4.1.0 or later 
- C#/.NET Driver v3.2.0 or later 
- Go Driver v2.1.0 or later 
- PyMongo Driver v4.10 or later 
- Node.js Driver v6.11 or later 
- Java Driver v5.3.1 or later 
Background
Float vectors are typically difficult to compress because each element in the array has its own type (despite most vectors being uniformly typed). For this reason, converting the float vector output of an embedding model to a binData vector with subtype float32 is a more efficient serialization scheme. binData vectors store a single type descriptor for the entire vector, which reduces storage overhead.
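The per-element overhead described above can be estimated with a short calculation. The sketch below assumes that a BSON array stores each number as an 8-byte double, preceded by a one-byte type tag and its index as a null-terminated string key, and that a binData vector carries a small fixed header (assumed here to be 2 bytes) followed by 4 bytes per float32 element. The function names are hypothetical, and the estimate ignores any compression that the storage engine applies on disk.

```python
def bson_array_bytes(num_elements):
    """Approximate encoded size of a BSON array of doubles."""
    # Each element: 1 type byte + index key ("0", "1", ...) + null terminator
    # + an 8-byte double.
    elements = sum(1 + len(str(i)) + 1 + 8 for i in range(num_elements))
    # The array itself: 4-byte length prefix + elements + 1 terminator byte.
    return 4 + elements + 1


def bindata_float32_bytes(num_elements, header_bytes=2):
    """Approximate encoded size of a binData vector of float32 values."""
    # 4-byte length prefix + 1 subtype byte + assumed vector header
    # + 4 bytes per float32 element.
    return 4 + 1 + header_bytes + 4 * num_elements


dimensions = 1536  # an arbitrary example size, not tied to a specific model
array_size = bson_array_bytes(dimensions)
bindata_size = bindata_float32_bytes(dimensions)
ratio = array_size / bindata_size
```

Under these assumptions, a 1536-element vector comes to 20,399 bytes as an array and 6,151 bytes as binData, a ratio of roughly 3.3.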
Validating Your Embeddings
Consider the following best practices to ensure that your embeddings are correct and optimal when you generate and query them:
- Test your functions and scripts.
  - Generating embeddings takes time and computational resources. Before you create embeddings from large datasets or collections, test that your embedding functions or scripts work as expected on a small subset of your data.
- Create embeddings in batches.
  - If you want to generate embeddings from a large dataset or a collection with many documents, create them in batches to avoid memory issues and optimize performance.
- Evaluate performance.
  - Run test queries to check if your search results are relevant and accurately ranked.
  - To learn more about how to evaluate your results and fine-tune the performance of your indexes and queries, see How to Measure the Accuracy of Your Query Results and Benchmark for MongoDB Vector Search.
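The batching advice above can be sketched as a small generator. This is one possible approach, not the tutorial's own code: `embed_batch` is a hypothetical callable that takes a list of strings and returns one vector per string, and the `text` and `embedding` field names are assumptions.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def embed_in_batches(texts, embed_batch, batch_size=100):
    """Yield one list of documents per batch so that memory use stays bounded.

    embed_batch is a hypothetical function: list[str] -> list[list[float]].
    """
    for batch in batched(texts, batch_size):
        vectors = embed_batch(batch)
        if len(vectors) != len(batch):
            raise ValueError("embedding function returned the wrong number of vectors")
        yield [
            {"text": text, "embedding": vector}
            for text, vector in zip(batch, vectors)
        ]
```

Because each batch is yielded separately, you can write it to MongoDB (for example, with PyMongo's `insert_many`) before the next batch is embedded. Running the function on a handful of documents first is also a cheap way to test your script.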
Consider the following strategies if you encounter issues with your embeddings:
- Verify your environment.
  - Check that the necessary dependencies are installed and up-to-date. Conflicting library versions can cause unexpected behavior. Ensure that no conflicts exist by creating a new environment and installing only the required packages.
  - Note: If you're using Colab, ensure that your notebook session's IP address is included in your Atlas project's access list.
- Monitor memory usage.
  - If you experience performance issues, check your RAM, CPU, and disk usage to identify any potential bottlenecks. For hosted environments like Colab or Jupyter Notebooks, ensure that your instance is provisioned with sufficient resources and upgrade the instance if necessary.
- Ensure consistent dimensions.
  - Verify that the MongoDB Vector Search index definition matches the dimensions of the embeddings stored in MongoDB and that your query embeddings match the dimensions of the indexed embeddings. Otherwise, you might encounter errors when running vector search queries.
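A dimension check like the one described above can run before you issue any queries. The following is a minimal sketch with a hypothetical helper name. It assumes that you already know the `numDimensions` value from your index definition and that you have loaded a sample of stored embeddings as Python lists.

```python
def check_dimensions(index_dimensions, stored_vectors, query_vector):
    """Return a list of problems; an empty list means the dimensions agree."""
    problems = []
    for position, vector in enumerate(stored_vectors):
        if len(vector) != index_dimensions:
            problems.append(
                f"stored vector {position} has {len(vector)} dimensions, "
                f"expected {index_dimensions}"
            )
    if len(query_vector) != index_dimensions:
        problems.append(
            f"query vector has {len(query_vector)} dimensions, "
            f"expected {index_dimensions}"
        )
    return problems
```

An empty result means the sampled embeddings and the query embedding agree with the index definition. Any entries in the list point to the vectors that you should regenerate or to an index that you should redefine.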
To troubleshoot specific problems, see Troubleshooting.
Next Steps
After you learn how to create and query embeddings with MongoDB Vector Search, start building generative AI applications by implementing retrieval-augmented generation (RAG).
You can also quantize your 32-bit float vector embeddings into fewer bits to further reduce resource consumption and improve query speed. To learn more, see Vector Quantization.