Educator Developer Blog

Adopting Hybrid Search with Azure Cosmos DB

kevin_comba
Feb 12, 2025

In today's data-driven landscape, the ability to efficiently search and retrieve information is paramount. Azure Cosmos DB introduces Hybrid Search, a powerful feature that combines the strengths of both Vector and Full-Text Search. This hybrid approach facilitates the creation of more nuanced and effective search experiences by integrating semantic understanding with traditional keyword-based search. In this blog post, we'll delve into the intricacies of Hybrid Search in Azure Cosmos DB, exploring its features, enabling mechanisms, and practical use cases.

Introduction to Hybrid Search

 
Hybrid Search in Azure Cosmos DB seamlessly combines Vector Search and Full-Text Search to deliver highly relevant search results. By leveraging semantic understanding alongside keyword-based search, Hybrid Search enhances the accuracy and relevance of search outcomes, making it ideal for applications that require sophisticated data retrieval capabilities.

Why Hybrid Search?

 

  • Enhanced Relevance: Combines multiple scoring functions to improve result accuracy.
  • Versatility: Supports a wide range of applications from e-commerce to AI-driven chatbots.
  • Scalability: Efficiently handles high-dimensional vectors at any scale.

Understanding Vector Search

 
Vector Search in Azure Cosmos DB enables the storage and querying of high-dimensional vectors, facilitating efficient and accurate searches based on vector similarity. This is particularly useful for applications involving machine learning and AI, where data is often represented as vectors.
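
To make "vector similarity" concrete, here is a minimal sketch of cosine similarity, one common similarity measure. It is purely illustrative: Cosmos DB computes similarity on the server, and the function name below is hypothetical rather than part of any SDK.

```typescript
// Cosine similarity between two equal-length vectors:
// 1 means same direction, 0 means orthogonal.
// Illustrative only; not how Cosmos DB is invoked.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Two documents whose embeddings point in nearly the same direction score close to 1, which is why embeddings of related text end up near the top of a vector search.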

Key Features of Vector Search:

 

  • Efficient Indexing: Designed to handle high-dimensional vectors with optimized indexing methods.
  • Flexibility in Indexing Methods: Choose from various vector indexing methods based on your application's needs.
    • Flat (Brute-Force) Search: Provides 100% retrieval recall for smaller, focused searches.
    • Quantized Flat Index: Compresses vectors using DiskANN-based quantization for improved efficiency.
    • DiskANN Algorithms: Utilizes state-of-the-art vector indexing algorithms developed by Microsoft Research for high accuracy and efficiency.
  • Seamless Integration: Vectors are stored directly within documents, simplifying data management and AI application architectures.

How Vector Search Works:

 
Vector Search can be integrated with other Azure Cosmos DB NoSQL query filters and indexes using WHERE clauses, ensuring that your vector searches are contextually relevant to your application data.

Exploring Full-Text Search

 
Full-Text Search in Azure Cosmos DB enhances data querying capabilities by enabling advanced text processing and indexing. This feature is currently in preview and brings robust search functionalities to NoSQL databases.

Features of Full-Text Search:

 

  • Advanced Text Processing: Includes stemming, stop word removal, and tokenization for efficient text searches.
  • Specialized Text Index: Utilizes a text index optimized for rapid and accurate search operations.
  • Full-Text Scoring: Employs the BM25 algorithm to evaluate and rank the relevance of documents based on term frequency, inverse document frequency, and document length.

Benefits of Full-Text Search:

 

  • Improved Accuracy: Ensures the most relevant documents appear at the top of search results.
  • Versatile Applications: Ideal for scenarios such as e-commerce product searches, content management, customer support, user-generated content analysis, and enhancing AI-driven chatbot responses.

Enabling Vector and Full-Text Search in Azure Cosmos DB

 
To harness the power of Hybrid Search, you need to enable both Vector Search and Full-Text Search features in your Azure Cosmos DB account. Here's how to configure these features:

Enabling Vector Search

  1. Navigate to Your Azure Cosmos DB for NoSQL Resource Page:
    • Log in to the Azure Portal.
    • Go to your Azure Cosmos DB account.
  2. Access the Features Pane:
    • Select the "Features" pane under the "Settings" menu item.
  3. Enable Vector Search:
    • Locate and select the "Vector Search in Azure Cosmos DB for NoSQL" feature.
    • Read the description to understand the feature.
    • Click "Enable" to activate vector indexing and search capabilities.

Enabling Full-Text Search

  1. Navigate to Your Azure Cosmos DB for NoSQL Resource Page:
    • Log in to the Azure Portal.
    • Go to your Azure Cosmos DB account.
  2. Access the Features Pane:
    • Select the "Features" pane under the "Settings" menu item.
  3. Enable Full-Text & Hybrid Search:
    • Locate and select the "Full-Text & Hybrid Search for NoSQL API (preview)" feature.
    • Read the description to confirm your intention to enable it.
    • Click "Enable" to activate full-text indexing and search capabilities.

Note: Full-Text & Hybrid Search (preview) may not be available in all regions at this time.

How Hybrid Search Works

 
Hybrid Search leverages the Reciprocal Rank Fusion (RRF) system function to combine multiple scoring functions, such as VectorDistance and FullTextScore. This combination enhances the relevance of search results beyond what individual search methods can achieve.

The RRF System:

 

  • VectorDistance: Measures the similarity between vectors to find semantically related documents.
  • FullTextScore: Evaluates document relevance based on keyword matching and text relevance (using BM25).

    RRF ranks documents separately under each scoring function and then fuses the rankings, so a document that ranks well under both rises to the top. This gives a more comprehensive ranking than either method alone.
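
The fusion step can be sketched in a few lines. This is a generic Reciprocal Rank Fusion implementation, not Cosmos DB's internal code: the function name is hypothetical, and k = 60 is the constant commonly used in the literature (the value Cosmos DB uses is not stated in this post).

```typescript
// Reciprocal Rank Fusion over several ranked lists of document ids.
// Each list is ordered best-first. A document's fused score is the sum of
// 1 / (k + rank) over every list it appears in.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60 // assumed constant, see the note above
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((x, y) => y.score - x.score);
}

// Hypothetical rankings from a vector search and a keyword search.
const byVector = ["doc-a", "doc-b", "doc-c"];
const byKeyword = ["doc-b", "doc-d", "doc-a"];
const fused = reciprocalRankFusion([byVector, byKeyword]);
console.log(fused.map((f) => f.id)); // [ 'doc-b', 'doc-a', 'doc-d', 'doc-c' ]
```

Note that doc-b wins even though it is not first in the vector ranking: appearing near the top of both lists outweighs being first in only one.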

Use Cases for Hybrid Search

 
Hybrid Search in Azure Cosmos DB is versatile and can be applied across various domains:

  1. E-commerce:
    • Quickly find products based on descriptions, reviews, and other text attributes.
    • Improve search accuracy and user experience by combining semantic and keyword-based searches.
  2. Content Management:
    • Efficiently search through articles, blogs, and documents.
    • Enhance content discoverability and management.
  3. Customer Support:
    • Retrieve relevant support tickets, FAQs, and knowledge base articles.
    • Streamline support processes and improve response times.
  4. User Content:
    • Analyze and search through user-generated content such as posts and comments.
    • Gain insights into user behavior and preferences.
  5. RAG for Chatbots:
    • Enhance chatbot responses by retrieving relevant information from large text corpora.
    • Improve the accuracy and relevance of AI-driven interactions.
  6. Multi-Agent AI Applications:
    • Enable multiple AI agents to collaboratively search and analyze vast amounts of text data.
    • Provide comprehensive and nuanced insights through collaborative intelligence.

Knowledge Management System

This simple project demonstrates how to build an enterprise knowledge management system using Nest.js integrated with Azure Cosmos DB's hybrid search capabilities. It combines vector similarity search with full-text search, fused with Reciprocal Rank Fusion (RRF), to improve result relevance.

Project Entity

We use KnowledgeItem as the entity we save to the database.
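
A minimal sketch of what such an entity might look like follows. Only title, content, contentVector and metadataVector are named in this post; every other field (projectId, metadata, createdAt) is an assumption for illustration, so check the linked repository for the actual definition.

```typescript
// Hypothetical shape of the KnowledgeItem entity.
interface KnowledgeItem {
  id: string;
  projectId: string;         // assumed: used to filter by project context
  title: string;             // full-text indexed
  content: string;           // full-text indexed
  contentVector: number[];   // embedding of `content`
  metadata: string;          // assumed: free-form description of the item
  metadataVector: number[];  // embedding of the metadata
  createdAt: string;         // assumed: ISO-8601 timestamp
}

// Tiny 3-dimensional vectors keep the example readable;
// real embeddings typically have hundreds or thousands of dimensions.
const exampleItem: KnowledgeItem = {
  id: "item-1",
  projectId: "project-42",
  title: "Introduction to Azure",
  content: "An overview of core Azure services.",
  contentVector: [0.12, -0.03, 0.88],
  metadata: "category: cloud; audience: beginners",
  metadataVector: [0.4, 0.1, -0.2],
  createdAt: "2025-02-12T00:00:00Z",
};
```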

 

 

As you can see, content is stored both as a string and as a vector in the contentVector property; the same approach applies to metadataVector.

Vector Embedding Policy

Let's create a VectorEmbeddingPolicy that tells the container the contentVector and metadataVector properties hold vector embeddings of type Float32.
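
As a rough sketch, the policy can be written as a plain object. The dimensions (1536) and the cosine distance function are assumptions, since the post does not state them; dimensions must match whatever embedding model you use. The @azure/cosmos SDK also exposes enums for dataType and distanceFunction, which strict typing may require in place of the string literals shown here.

```typescript
// Plain-object sketch of a vector embedding policy.
// Assumed values: dimensions = 1536, distanceFunction = "cosine".
const vectorEmbeddingPolicy = {
  vectorEmbeddings: [
    {
      path: "/contentVector",
      dataType: "float32",
      dimensions: 1536,
      distanceFunction: "cosine",
    },
    {
      path: "/metadataVector",
      dataType: "float32",
      dimensions: 1536,
      distanceFunction: "cosine",
    },
  ],
};
```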

 

 

 

Indexing Policy

The IndexingPolicy supports both regular and advanced queries. Listing the title and content properties under fullTextIndexes enables full-text search on them, and vector indexes are applied to the contentVector and metadataVector properties. We also exclude contentVector and metadataVector from regular indexing, which keeps insert operations fast.
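
A sketch of such a policy as a plain object is shown below. The vector index type ("quantizedFlat") is an assumption; "flat" and "diskANN" are the other options described earlier, and the post does not say which one the project uses.

```typescript
// Plain-object sketch of the indexing policy described above.
const indexingPolicy = {
  indexingMode: "consistent",
  automatic: true,
  includedPaths: [{ path: "/*" }],
  // Keep the raw vectors out of the regular index so inserts stay cheap.
  excludedPaths: [
    { path: "/contentVector/*" },
    { path: "/metadataVector/*" },
  ],
  fullTextIndexes: [{ path: "/title" }, { path: "/content" }],
  vectorIndexes: [
    { path: "/contentVector", type: "quantizedFlat" },  // assumed index type
    { path: "/metadataVector", type: "quantizedFlat" }, // assumed index type
  ],
};
```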

 

 

Full Text Policy

The FullTextPolicy enables full-text search on the title and content properties of our entity.
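
A minimal sketch of the policy as a plain object, assuming English ("en-US") text for both properties:

```typescript
// Plain-object sketch of a full-text policy; the language is an assumption.
const fullTextPolicy = {
  defaultLanguage: "en-US",
  fullTextPaths: [
    { path: "/title", language: "en-US" },
    { path: "/content", language: "en-US" },
  ],
};
```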

 

Initialize Cosmos Client

First, import all our policies and our environment variables.

 

 

Then create the database if it doesn't exist, followed by the knowledge-items container, applying indexingPolicy, vectorEmbeddingPolicy and fulltextPolicy.
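
The flow can be sketched as follows. To keep the example runnable without installing anything, it is typed against a small structural interface (CosmosClientLike, a hypothetical name) instead of the real CosmosClient from @azure/cosmos, whose databases.createIfNotExists and containers.createIfNotExists methods have this general shape. The partition key path (/projectId) and the function name are assumptions.

```typescript
// Minimal structural types standing in for the @azure/cosmos client.
interface ContainerSpec {
  id: string;
  partitionKey: { paths: string[] };
  indexingPolicy: object;
  vectorEmbeddingPolicy: object;
  fullTextPolicy: object;
}
interface ContainerLike { id: string }
interface DatabaseLike {
  containers: {
    createIfNotExists(spec: ContainerSpec): Promise<{ container: ContainerLike }>;
  };
}
interface CosmosClientLike {
  databases: {
    createIfNotExists(spec: { id: string }): Promise<{ database: DatabaseLike }>;
  };
}
interface SearchPolicies {
  indexingPolicy: object;
  vectorEmbeddingPolicy: object;
  fullTextPolicy: object;
}

// Creates the database and the knowledge-items container if they are missing.
async function ensureKnowledgeContainer(
  client: CosmosClientLike,
  databaseId: string,
  policies: SearchPolicies
): Promise<ContainerLike> {
  const { database } = await client.databases.createIfNotExists({ id: databaseId });
  const { container } = await database.containers.createIfNotExists({
    id: "knowledge-items",
    partitionKey: { paths: ["/projectId"] }, // assumed partition key
    ...policies,
  });
  return container;
}
```

Because both calls are "create if not exists", running this at application start-up is safe to repeat.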

 

 

Consuming our database Service

Vector Search

The vectorSearchContent method performs a vector similarity search in Azure Cosmos DB for content items based on the user's search text.

Step-by-Step Flow

  1. Sets default limit of 10 results if not specified
  2. Gets reference to Cosmos DB container
  3. Converts input search text into a vector (numerical representation)
  4. Executes a SQL query that:
    • Selects content items (id, title, content, etc.)
    • Calculates similarity score between search vector and stored vectors
    • Filters by project context
    • Orders results by closest vector match
    • Limits results to specified count
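
The query in step 4 might look roughly like the sketch below. The function name, the projectId field and the similarityScore alias are assumptions; VectorDistance is the system function named earlier in this post. Generating the embedding (step 3) is left to the caller.

```typescript
// Shape accepted by the SDK's query methods: a query string plus parameters.
interface QueryParameter { name: string; value: unknown }
interface QuerySpec { query: string; parameters: QueryParameter[] }

// Builds a parameterized vector search query, defaulting to 10 results.
function buildVectorSearchQuery(
  embedding: number[],
  projectId: string, // assumed field used for "project context"
  top = 10
): QuerySpec {
  return {
    query:
      "SELECT TOP @top c.id, c.title, c.content, " +
      "VectorDistance(c.contentVector, @embedding) AS similarityScore " +
      "FROM c WHERE c.projectId = @projectId " +
      "ORDER BY VectorDistance(c.contentVector, @embedding)",
    parameters: [
      { name: "@top", value: top },
      { name: "@embedding", value: embedding },
      { name: "@projectId", value: projectId },
    ],
  };
}

const vectorQuery = buildVectorSearchQuery([0.1, 0.2, 0.3], "project-42");
```

With the real SDK, a spec of this shape can be passed to container.items.query(spec).fetchAll().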

In Simple Terms

Think of it like finding similar documents in a library:

  • You give it some text to search for
  • It converts your text into a special format (vector)
  • It looks through all documents in the specified project
  • It measures how similar each document is to your search
  • Returns the most similar items, sorted by similarity

Full-text search

titleFullTextSearch method performs a full-text search on document titles in Cosmos DB using keywords from input text.

Step-by-Step Flow

  1. Takes search parameters:
    • searchText: Text to search for in titles
    • top: Maximum results (defaults to 10)
  2. Gets database container reference
  3. Splits search text into keywords
  4. Builds SQL query that:
    • Uses FullTextContainsAny for title search
    • Returns document fields (id, title, content, etc.)
    • Orders by timestamp descending
    • Limits results count
  5. Executes query and returns matching documents
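
Steps 3 and 4 can be sketched as follows. The function names are hypothetical, and ordering by the system property c._ts is an assumption, since the post does not name the timestamp field. Keywords are inlined as string literals after stripping everything except ASCII letters, digits, underscores and hyphens; that keeps the example injection-safe but is deliberately simplistic.

```typescript
interface QueryParameter { name: string; value: unknown }
interface QuerySpec { query: string; parameters: QueryParameter[] }

// Splits on whitespace and strips characters that could break out of a literal.
function splitKeywords(searchText: string): string[] {
  return searchText
    .split(/\s+/)
    .map((word) => word.replace(/[^A-Za-z0-9_-]/g, ""))
    .filter((word) => word.length > 0);
}

// Builds a title search that matches ANY keyword, newest documents first.
function buildTitleSearchQuery(searchText: string, top = 10): QuerySpec {
  const keywords = splitKeywords(searchText);
  if (keywords.length === 0) {
    throw new Error("searchText must contain at least one keyword");
  }
  const literals = keywords.map((k) => `"${k}"`).join(", ");
  return {
    query:
      "SELECT TOP @top c.id, c.title, c.content FROM c " +
      `WHERE FullTextContainsAny(c.title, ${literals}) ` +
      "ORDER BY c._ts DESC", // assumed timestamp field
    parameters: [{ name: "@top", value: top }],
  };
}

const titleQuery = buildTitleSearchQuery("Azure Cloud");
```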

In Simple Terms

Like using a library's title search:

  • You enter search words "Azure Cloud"
  • System splits into ["Azure", "Cloud"]
  • Searches for titles containing ANY of these words
  • Returns newest matches first
  • Limited to requested number of results
  • Returns full document details for matches

Example matches for "Azure Cloud":

  • "Azure Functions Development"
  • "Cloud Computing Basics"
  • "Introduction to Azure"

 

Hybrid Search

hybridSearchContent performs hybrid search combining full-text and vector similarity search in Cosmos DB using Reciprocal Rank Fusion (RRF).

Step-by-Step Flow

  1. Takes parameters:
    • searchText: Text to search
    • top: Maximum results (defaults to 10)
  2. Gets database container
  3. Performs dual conversion:
    • Generates vector from search text
    • Splits text into keywords for full-text search
  4. Executes SQL query that:
    • Uses RRF to combine:
      • Full-text search scores on content
      • Vector similarity scores
    • Returns document fields
    • Limits results count
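
A rough sketch of the query in step 4 is shown below. The function name is hypothetical. The keyword arguments to FullTextScore are written as separate string literals; preview versions of the feature accepted an array instead, so check the current documentation for the form your account expects, and also whether the vector may be passed as a parameter in this clause.

```typescript
interface QueryParameter { name: string; value: unknown }
interface QuerySpec { query: string; parameters: QueryParameter[] }

// Builds a hybrid query that fuses full-text and vector rankings with RRF.
function buildHybridSearchQuery(
  searchText: string,
  embedding: number[],
  top = 10
): QuerySpec {
  // Same simplistic keyword handling as a plain full-text search:
  // split on whitespace, keep only ASCII letters, digits, "_" and "-".
  const keywords = searchText
    .split(/\s+/)
    .map((word) => word.replace(/[^A-Za-z0-9_-]/g, ""))
    .filter((word) => word.length > 0);
  if (keywords.length === 0) {
    throw new Error("searchText must contain at least one keyword");
  }
  const literals = keywords.map((k) => `"${k}"`).join(", ");
  return {
    query:
      "SELECT TOP @top c.id, c.title, c.content FROM c " +
      `ORDER BY RANK RRF(FullTextScore(c.content, ${literals}), ` +
      "VectorDistance(c.contentVector, @embedding))",
    parameters: [
      { name: "@top", value: top },
      { name: "@embedding", value: embedding },
    ],
  };
}

const hybridQuery = buildHybridSearchQuery(
  "deploying containerized applications",
  [0.1, 0.2, 0.3],
  5
);
```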

In Simple Terms

Like having two librarians search simultaneously:

  • One looks for exact keyword matches
  • One looks for similar meaning content
  • Results are combined using RRF ranking
  • Best matches from both methods rise to top
  • Returns documents sorted by combined relevance

Example

// Search: "deploying containerized applications"
const results = await hybridSearchContent({
  searchText: "deploying containerized applications",
  top: 5
});
// Finds both exact matches for these words
// AND conceptually similar content about Docker, Kubernetes, etc.

Source Code

To see the whole code base visit this GitHub Link

Conclusion

 
Azure Cosmos DB's Hybrid Search is a game-changer for developers and businesses looking to build sophisticated search functionalities within their applications. By seamlessly integrating Vector Search and Full-Text Search, Hybrid Search offers unparalleled relevance and efficiency in data retrieval. Whether you're developing an e-commerce platform, a content management system, or an AI-driven chatbot, Hybrid Search provides the tools you need to create responsive and intelligent search experiences.

Updated Feb 03, 2025
Version 1.0