A hybrid search is an aggregation of different search methods or search
queries for the same or similar query criteria. This technique utilizes
algorithms to rank results and return unified results from the different
methods of search. You can use the $rankFusion pipeline stage to
perform a hybrid search.
What is Reciprocal Rank Fusion?
Reciprocal rank fusion is a technique to combine results from different search methods into a single result set by performing the following actions:
- Calculate the reciprocal rank of the documents in the results.
  For each ranked document in each search result, first add the rank (r) of the document to a constant, rank_constant, which smooths the score. Then divide 1 by the sum of r and rank_constant to get the reciprocal rank of the document in the results. You can't set the value of rank_constant; it defaults to 60.
    reciprocal_rank = 1 / ( r + rank_constant )
  For each method of search, apply different weights (w) to give more importance to that method of search. For each document, the weighted reciprocal rank is calculated by multiplying the weight by the reciprocal rank of the document.
    weighted_reciprocal_rank = w × reciprocal_rank
- Combine the rank-derived and weighted scores of the documents in the results.
  For each document across all search results, add the calculated weighted reciprocal ranks to get a single score for the document.
- Sort the results by the combined score of the documents in the results.
  Sort the documents based on their combined score across the results to produce a single, combined ranked list of documents.
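The three steps above can be sketched in plain Python. This is a minimal illustration of the formulas in this section, not MongoDB's implementation; the function name, the input shape (a dict of ranked id lists), the assumption that ranks start at 1, and the tie-breaking rule are all choices made for the example.

```python
from collections import defaultdict

RANK_CONSTANT = 60  # the fixed rank_constant described above


def reciprocal_rank_fusion(ranked_lists, weights=None):
    """Fuse ranked lists of document ids into one ranked list.

    ranked_lists: dict mapping a search method name to ids, best first.
    weights: optional dict mapping a method name to its weight w (default 1.0).
    """
    weights = weights or {}
    scores = defaultdict(float)
    for method, doc_ids in ranked_lists.items():
        w = weights.get(method, 1.0)
        # Assumption: the top result of each method has rank r = 1.
        for r, doc_id in enumerate(doc_ids, start=1):
            reciprocal_rank = 1.0 / (r + RANK_CONSTANT)
            scores[doc_id] += w * reciprocal_rank  # sum of weighted reciprocal ranks
    # Highest combined score first; ties broken by id (an arbitrary choice here).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = reciprocal_rank_fusion(
    {"vector": ["a", "b", "c"], "full_text": ["b", "d", "a"]},
    weights={"vector": 1.0, "full_text": 1.0},
)
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

With equal weights, document b (ranks 2 and 1) ends up ahead of document a (ranks 1 and 3), and documents that appear in only one list fall to the bottom.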
About the Different Hybrid Search Use Cases
You can leverage MongoDB Vector Search to perform several types of hybrid search. Specifically, MongoDB Vector Search supports the following use cases:
- Full-text and vector search in a single query: You can combine results from different search methods, such as a semantic and a full-text search. You can use $vectorSearch for the semantic search and $search for the full-text search, and then combine the results by using the reciprocal rank fusion technique. To learn more, see the Perform Hybrid Search with MongoDB Vector Search and MongoDB Search tutorial, which demonstrates how to perform a semantic search and full-text search against the sample_mflix.embedded_movies namespace and retrieve combined ranked results by using reciprocal rank fusion.
  Alternatively, for a more granular hybrid search where the score matters in addition to the relative ordering of results, you can use the $scoreFusion pipeline stage. To learn more, see the Perform Hybrid Search with MongoDB Vector Search and MongoDB Search tutorial, which demonstrates how to perform a semantic search and full-text search against the sample_mflix.embedded_movies namespace and combine the input pipeline results into a final scored result set.
  While $rankFusion ranks documents based on their positions (relative ranks) in input pipelines using the Reciprocal Rank Fusion algorithm, $scoreFusion ranks documents based on scores assigned by the input pipelines, using mathematical expressions for combining the results.
  In $rankFusion, rankings are influenced by pipeline weights. In $scoreFusion, weights control the contribution of each pipeline's scores to the final result.
- Multiple vector search queries in a single query: The MongoDB $rankFusion pipeline stage supports multiple sub-pipelines that contain vector search queries executed against the same collection, and it combines their results using the reciprocal rank fusion technique. The How to Combine Multiple $vectorSearch Queries tutorial demonstrates the following types of vector search:
  - Perform a comprehensive search of your dataset for semantically similar terms in the same query.
  - Search multiple fields in your dataset to determine which fields return the best results for the query.
  - Search using embeddings from different embedding models to determine the semantic interpretation differences between the different models.
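As a rough sketch of the first use case, the following Python builds an aggregation pipeline whose $rankFusion stage fuses one $vectorSearch sub-pipeline and one $search sub-pipeline. The index names, the embedding field name, the title path, the weights, and the helper name build_hybrid_pipeline are assumptions for illustration; substitute the values from your own deployment.

```python
def build_hybrid_pipeline(query_vector, query_text, limit=10):
    """Return a pipeline that fuses vector and full-text results by rank."""
    rank_fusion = {
        "$rankFusion": {
            "input": {
                "pipelines": {
                    "vectorPipeline": [
                        {
                            "$vectorSearch": {
                                "index": "vector_index",   # hypothetical index name
                                "path": "plot_embedding",  # hypothetical embedding field
                                "queryVector": query_vector,
                                "numCandidates": 100,
                                "limit": 20,
                            }
                        }
                    ],
                    "fullTextPipeline": [
                        {
                            "$search": {
                                "index": "search_index",   # hypothetical index name
                                "phrase": {"query": query_text, "path": "title"},
                            }
                        },
                        {"$limit": 20},
                    ],
                }
            },
            # Equal weights are an arbitrary starting point, not a recommendation.
            "combination": {
                "weights": {"vectorPipeline": 0.5, "fullTextPipeline": 0.5}
            },
            "scoreDetails": True,
        }
    }
    return [rank_fusion, {"$limit": limit}]


# A three-element vector is a placeholder; real query vectors must match
# the dimensions of the indexed embeddings.
pipeline = build_hybrid_pipeline([0.1, 0.2, 0.3], "star wars")
print(list(pipeline[0]["$rankFusion"]["input"]["pipelines"]))
```

With PyMongo, a pipeline like this could be passed to collection.aggregate(pipeline) against a deployment that supports $rankFusion.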
 
Considerations
When using the $rankFusion or $scoreFusion
pipeline stage for hybrid search, consider the following.
Disjoint Result Sets
If you want to capture false negatives that one search methodology couldn't catch, having disjoint results from individual sub-pipelines might be acceptable. When you have disjoint results, most or all of the results might appear to be returned from one of the pipelines and not the other. However, if you want all the sub-pipelines to return similar results, try increasing the number of results per sub-pipeline.
Weights
We recommend weighting lexical and vector queries on a per-query basis, rather than using static weights for all queries, to improve the relevance of the results for each query. This also improves the use of compute resources by allocating them to the query that needs them most.
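One way to apply per-query weights is to choose them in application code before building the stage. The heuristic below (quoted or very short queries lean lexical, longer queries lean semantic) and the specific weight values are purely hypothetical; they only illustrate where such a decision would plug in.

```python
def choose_weights(query_text):
    """Hypothetical heuristic: derive pipeline weights from the query's shape."""
    tokens = query_text.split()
    looks_like_keyword_lookup = query_text.startswith('"') or len(tokens) <= 2
    if looks_like_keyword_lookup:
        return {"vectorPipeline": 0.3, "fullTextPipeline": 0.7}
    return {"vectorPipeline": 0.7, "fullTextPipeline": 0.3}


def apply_weights(rank_fusion_stage, weights):
    """Return a copy of a $rankFusion stage with its weights replaced."""
    body = dict(rank_fusion_stage["$rankFusion"])
    body["combination"] = {"weights": dict(weights)}
    return {"$rankFusion": body}


# Sub-pipelines are left empty here for brevity; a real stage needs
# ranked sub-pipelines under each name.
base_stage = {
    "$rankFusion": {
        "input": {"pipelines": {"vectorPipeline": [], "fullTextPipeline": []}}
    }
}

for text in ["star wars", "movies about a journey through space and time"]:
    stage = apply_weights(base_stage, choose_weights(text))
    print(text, "->", stage["$rankFusion"]["combination"]["weights"])
```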
Multiple Pipelines
You can combine an arbitrary number of sub-pipelines together in the
$rankFusion or $scoreFusion stage, but they must all execute
against the same collection. You can't use the $rankFusion or
$scoreFusion stage to search across collections. Use the
$unionWith stage with $vectorSearch for
cross-collection search.
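A cross-collection search with $unionWith might look like the following sketch. The collection name, index names, and embedding path are hypothetical, and sorting the merged results by vectorSearchScore assumes that both indexes use the same embedding model and similarity function, so that their scores are comparable.

```python
def vector_stage(index, path, query_vector, limit):
    """Build a $vectorSearch stage; numCandidates = 10 x limit is an arbitrary choice."""
    return {
        "$vectorSearch": {
            "index": index,
            "path": path,
            "queryVector": query_vector,
            "numCandidates": limit * 10,
            "limit": limit,
        }
    }


def tag_stage(source):
    """Record which collection a result came from, along with its score."""
    return {
        "$addFields": {
            "source": source,
            "score": {"$meta": "vectorSearchScore"},
        }
    }


def cross_collection_pipeline(query_vector, limit=10):
    """Run on the first collection; $unionWith pulls in the second one."""
    path = "embedding"  # hypothetical field name shared by both collections
    return [
        vector_stage("movies_vector_index", path, query_vector, limit),
        tag_stage("movies"),
        {
            "$unionWith": {
                "coll": "reviews",  # hypothetical second collection
                "pipeline": [
                    vector_stage("reviews_vector_index", path, query_vector, limit),
                    tag_stage("reviews"),
                ],
            }
        },
        {"$sort": {"score": -1}},
        {"$limit": limit},
    ]


union_pipeline = cross_collection_pipeline([0.1, 0.2, 0.3], limit=5)
print([next(iter(stage)) for stage in union_pipeline])
```

Unlike $rankFusion, this approach merges by raw score rather than by rank, so it is only a reasonable substitute when the scores from the two collections are on the same scale.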
Non-Search Pipelines
We recommend using stages such as $match and $sort in a
sub-pipeline to boost on specific fields within your collection
without requiring a search pipeline.
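For instance, a sub-pipeline made only of $match and $sort can sit next to a search sub-pipeline so that documents ranking well on a plain field get a boost. In this sketch the field names (year, imdb.rating), the index name, the sub-pipeline names, and the weights are assumptions for illustration.

```python
def build_boosted_stage(query_text):
    """Fuse a full-text sub-pipeline with a non-search, sort-based sub-pipeline."""
    return {
        "$rankFusion": {
            "input": {
                "pipelines": {
                    "fullTextPipeline": [
                        {
                            "$search": {
                                "index": "search_index",  # hypothetical index name
                                "text": {"query": query_text, "path": "title"},
                            }
                        },
                        {"$limit": 20},
                    ],
                    # No search stage here: the explicit $sort supplies the ranking.
                    "highlyRatedPipeline": [
                        {"$match": {"year": {"$gte": 2000}}},
                        {"$sort": {"imdb.rating": -1}},
                        {"$limit": 20},
                    ],
                }
            },
            "combination": {
                "weights": {"fullTextPipeline": 0.8, "highlyRatedPipeline": 0.2}
            },
        }
    }


boosted = build_boosted_stage("space adventure")
print(list(boosted["$rankFusion"]["input"]["pipelines"]["highlyRatedPipeline"][1]))
```

Because the fusion is rank-based, the rating sub-pipeline contributes only each document's position in the sorted list, not the rating value itself.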
Geospatial Relevance
You can use the $geoNear stage and the near operator
inside $search for a geographic location search within the
$rankFusion or $scoreFusion stage. However, the
$geoNear stage and the near operator use
different coordinate reference frames. Therefore, the result ordinals
and scores might not be identical.
Limit the Results
We recommend setting limits for the number of results to return for each sub-pipeline.
Limitations
The following limitations apply to hybrid search using
$rankFusion and $scoreFusion:
- $rankFusion is only supported on MongoDB 8.0.14 (including the latest version with auto upgrades).
  Note: MongoDB 8.0.14 is being released, and the rollout might take up to two weeks. When you upgrade from 8.0, you might have to pause executing $rankFusion queries.
- $rankFusion and $scoreFusion sub-pipelines can contain only the following stages:
- $rankFusion and $scoreFusion preserve a traceable link back to the original input document for each sub-pipeline. Therefore, they don't support the following:
  - $project stage
  - storedSource fields
- $rankFusion and $scoreFusion sub-pipelines run serially, not in parallel.
- $rankFusion and $scoreFusion don't support pagination.
- $rankFusion can be run on Views only on clusters running MongoDB 8.0 or higher. You can't run $rankFusion within a view definition or on a time series collection.
Prerequisites
To try these tutorials, you must have the following:
- An Atlas cluster with MongoDB version v8.0 or later. 
- The sample_mflix database loaded into your Atlas cluster. 
- mongosh to try the queries on your Atlas cluster.
  Note: You can also try these hybrid search use cases with local Atlas deployments that you create with the Atlas CLI. To learn more, see Create a Local Atlas Deployment.