Educator Developer Blog

Full-Text Search in Azure Cosmos DB

kevin_comba
Feb 04, 2025

A Powerful Way to Enhance Search Capabilities

What is full-text search?

Full-text search is a technique that finds specific information within a large corpus of text. It goes beyond keyword matching and analyzes the content of documents to identify relevant results based on the user’s search query. 
Azure Cosmos DB for NoSQL now offers a powerful Full Text Search feature in preview, designed to enhance the search capabilities of your applications. Read more about it here.

How does full-text search work?

A full-text search involves two primary stages:

Indexing

During the indexing stage, the system analyzes the text content of documents and stores the data in a structured format. This process typically involves:

  • Tokenization: Breaking down text into individual words or units called tokens. This is like separating a sentence into individual words.
  • Stemming: Reducing words to their root form, such as "running" to "run". This ensures that variations of the same word are treated as a single term during search.
  • Stop word removal: Removing common words that are not particularly meaningful in search, such as "the", "a", and "is". This helps to reduce the index size and improve search speed.
  • Building an index: Creating a data structure that maps keywords to their locations within documents. This index acts as a roadmap, allowing the search engine to quickly locate relevant documents.
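The four indexing steps above can be sketched in a few lines of TypeScript. This is a toy illustration only: the stop-word list, the suffix-stripping rules, and all names (`tokenize`, `stem`, `analyze`, `buildInvertedIndex`) are assumptions made for the example and do not reflect how Azure Cosmos DB analyzes text internally.

```typescript
// Toy indexing pipeline: tokenize -> remove stop words -> stem -> build an inverted index.
// The stop-word list and stemming rules are illustrative assumptions.

interface TextDoc {
  id: string;
  text: string;
}

const STOP_WORDS = new Set(["the", "a", "an", "is", "and", "of", "to", "in"]);

// Tokenization: lowercase the text and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Naive stemming: strip "-ing" (collapsing a doubled final consonant) or a plural "-s".
function stem(token: string): string {
  if (token.length > 5 && token.endsWith("ing")) {
    const base = token.slice(0, -3);
    const n = base.length;
    return n > 2 && base[n - 1] === base[n - 2] ? base.slice(0, -1) : base;
  }
  if (token.length > 3 && token.endsWith("s") && !token.endsWith("ss")) {
    return token.slice(0, -1);
  }
  return token;
}

// Full analysis chain applied to one piece of text.
function analyze(text: string): string[] {
  return tokenize(text)
    .filter((t) => !STOP_WORDS.has(t))
    .map((t) => stem(t));
}

// Inverted index: each term maps to the set of document ids that contain it.
function buildInvertedIndex(docs: TextDoc[]): Map<string, Set<string>> {
  const index = new Map<string, Set<string>>();
  for (const doc of docs) {
    for (const term of analyze(doc.text)) {
      const ids = index.get(term) ?? new Set<string>();
      ids.add(doc.id);
      index.set(term, ids);
    }
  }
  return index;
}

const sampleDocs: TextDoc[] = [
  { id: "d1", text: "The dog is running in the park" },
  { id: "d2", text: "A red bicycle and two dogs" },
];
const toyIndex = buildInvertedIndex(sampleDocs);
```

With these two sample documents, "running" and "dogs" are reduced to "run" and "dog", so both documents end up under the single term "dog", while stop words such as "the" never enter the index.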

Searching

Once the index is built, the search stage allows users to submit queries and retrieve relevant results. The system analyzes the search query and uses the index to identify documents containing the relevant keywords.
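As a rough sketch of the search stage, the snippet below looks up query terms in a small, hand-written inverted index. The index contents and the function names `searchAll` and `searchAny` are hypothetical; a real engine would also run the query through the same stemming and stop-word steps used at indexing time.

```typescript
// A tiny prebuilt inverted index: term -> ids of documents containing it (hypothetical data).
const invertedIndex = new Map<string, string[]>([
  ["dog", ["d1", "d2"]],
  ["run", ["d1"]],
  ["bicycle", ["d2", "d3"]],
  ["red", ["d2", "d3"]],
]);

// Split a query into lowercase terms.
function queryTerms(query: string): string[] {
  return query.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Ids of documents that contain every query term (AND semantics).
function searchAll(index: Map<string, string[]>, query: string): string[] {
  let result: string[] | undefined;
  for (const term of queryTerms(query)) {
    const postings = index.get(term) ?? [];
    result =
      result === undefined
        ? postings.slice()
        : result.filter((id) => postings.indexOf(id) !== -1);
  }
  return result ?? [];
}

// Ids of documents that contain at least one query term (OR semantics).
function searchAny(index: Map<string, string[]>, query: string): string[] {
  const result: string[] = [];
  for (const term of queryTerms(query)) {
    for (const id of index.get(term) ?? []) {
      if (result.indexOf(id) === -1) result.push(id);
    }
  }
  return result;
}
```

Because each lookup touches only the posting lists of the query terms, the cost depends on how many documents match those terms rather than on the size of the whole corpus.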

Advantages of full-text search

Full-text search has several advantages over traditional search algorithms. Some of the key advantages include:

  • Improved Accuracy and Relevance: It ranks search results based on relevance, often considering factors such as word frequency, proximity of terms, and context, which improves the likelihood of finding the most relevant documents for the query.
  • Faster Query Execution: Once indexed, full-text search allows for rapid querying of large datasets. Instead of scanning data sequentially, it uses optimized data structures like inverted indexes to retrieve results quickly.
  • Efficient Handling of Unstructured Data: Unlike traditional search methods that rely on structured data, full-text search is well-suited for searching unstructured or semi-structured data, such as documents, emails, or social media content.
  • Support for Complex Queries: Full-text search engines often support advanced query capabilities, such as Boolean operators, wildcards, fuzzy searches, and proximity searches, enabling users to tailor their searches to their needs.

Full-text search is ideal for a variety of scenarios, including:

  • E-Commerce Platforms: Helping users find products quickly using keyword searches, filtering, and auto-suggestions, even with typos or partial matches.
  • Content Management Systems (CMS): Allowing users to search through large volumes of articles, blogs, and documents with relevant ranking and keyword highlighting.
  • Customer Support Portals: Enabling users to search through FAQs, knowledge bases, and support tickets to find solutions efficiently.
  • Enterprise Document Management: Facilitating the retrieval of unstructured or semi-structured data like contracts, emails, or reports from corporate repositories.
  • Social Media Platforms: Powering search for posts, comments, and hashtags while handling high volumes of unstructured text in real-time.
  • Healthcare and Legal Research: Quickly retrieving relevant case studies, research papers, medical records, or legal documents based on complex queries and context.

Full-text search in Azure Cosmos DB

Prerequisites

Before diving into the implementation, ensure you have the following:

  • An Azure account, get one here.
  • A provisioned Azure Cosmos DB account. If you have difficulties, please follow this blog.

How to use Full-text search in Azure Cosmos DB

  1. Enable the "Full Text & Hybrid Search for NoSQL" preview feature.
  • Select the "Features" pane under the "Settings" menu item.

 

 

  • Select the "Full-Text & Hybrid Search for NoSQL API (preview)" feature.
  • Read the description of the feature to confirm you want to enable it.
  • Select "Enable" to turn on the full-text and hybrid search capability.

 

 

  2. Configure a container with a full-text policy: To use full-text search capabilities, you'll first need to define two policies:
  • A container-level full-text policy that defines which paths contain text for the new full-text query system functions. Using Azure Cosmos DB for NoSQL with the Azure SDK for Node.js as an example, suppose I want to run text search on an entity that has title, metadata and content fields. I will have to create a full-text policy that covers those fields.
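A minimal sketch of such an entity and its full-text policy might look like the following. The `Article` shape and its sample values are hypothetical; the policy layout (`defaultLanguage` plus a `fullTextPaths` array of `path`/`language` pairs) follows the format documented for the preview, with `en-US` as the language.

```typescript
// Hypothetical item shape; the field names follow the text (title, metadata, content).
interface Article {
  id: string;
  title: string;
  metadata: string;
  content: string;
}

const sampleArticle: Article = {
  id: "1",
  title: "Choosing a mountain bicycle",
  metadata: "cycling, gear, beginners",
  content: "A red bicycle with wide tyres handles rough mountain trails well.",
};

// Full-text policy: declares which paths hold text and in which language.
const fullTextPolicy = {
  defaultLanguage: "en-US",
  fullTextPaths: [
    { path: "/title", language: "en-US" },
    { path: "/metadata", language: "en-US" },
    { path: "/content", language: "en-US" },
  ],
};
```

Each `path` is written with a leading slash and names a property of the stored item, so `/title` refers to `sampleArticle.title`.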

 

 

  • After adding the full-text policy, create an indexing policy with a full-text index on the same paths, which enables efficient search.
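An indexing policy with full-text indexes could be sketched as below. The three paths are assumed to match the title, metadata and content fields named earlier; the remaining values (`consistent` mode, including all paths, excluding `_etag`) are common defaults and should be adjusted to your workload.

```typescript
// Indexing policy sketch: the fullTextIndexes entries should name the same
// paths that the container's full-text policy declares (assumed here).
const indexingPolicy = {
  indexingMode: "consistent" as const,
  automatic: true,
  includedPaths: [{ path: "/*" }],
  excludedPaths: [{ path: '/"_etag"/?' }],
  fullTextIndexes: [
    { path: "/title" },
    { path: "/metadata" },
    { path: "/content" },
  ],
};
```

The `as const` on `indexingMode` keeps the value as the literal `"consistent"` rather than a general `string`, which tends to matter when the object is later passed to a typed SDK call.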

 

 

After creating both the indexing policy and the full-text policy, the next step is to apply them through our Cosmos client. CosmosClient needs an endpoint and a key as its main requirements. Both are available in the Azure portal under Settings -> Keys.
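One way to supply the endpoint and key is to read them from environment variables and fail early when either is missing. The variable names `COSMOS_ENDPOINT` and `COSMOS_KEY` and the helper `connectionFromEnv` are assumptions for this sketch; the helper is kept free of SDK imports so it can run on its own, and the comment at the end shows where `CosmosClient` from `@azure/cosmos` would come in.

```typescript
// Connection settings needed by CosmosClient: the account endpoint and a key.
interface CosmosConnection {
  endpoint: string;
  key: string;
}

// Read the settings from an environment-like object.
// COSMOS_ENDPOINT and COSMOS_KEY are hypothetical variable names.
function connectionFromEnv(env: Record<string, string | undefined>): CosmosConnection {
  const endpoint = env["COSMOS_ENDPOINT"];
  const key = env["COSMOS_KEY"];
  if (!endpoint || !key) {
    throw new Error("Set COSMOS_ENDPOINT and COSMOS_KEY before starting the app.");
  }
  return { endpoint, key };
}

// In an app with @azure/cosmos installed, the result would be used like this:
//   import { CosmosClient } from "@azure/cosmos";
//   const client = new CosmosClient(connectionFromEnv(process.env));
```

Keeping the key out of source code this way also makes it easier to rotate keys without redeploying.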

 

The next step is to create the database and container if they do not exist, and to configure the container with our indexing policy and full-text policy.
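A hedged sketch of that step follows. It is written against small structural interfaces (`ClientLike`, `DatabaseLike`) that cover only the two `createIfNotExists` calls used, so it can be read and run without the SDK; the assumption is that a `CosmosClient` from a recent `@azure/cosmos` version with full-text support fits this shape, possibly with a cast. The database id, container id, and the `/id` partition key are placeholders.

```typescript
interface FullTextPolicy {
  defaultLanguage: string;
  fullTextPaths: { path: string; language: string }[];
}

interface ContainerSpec {
  id: string;
  partitionKey: { paths: string[] };
  fullTextPolicy: FullTextPolicy;
  indexingPolicy: { fullTextIndexes: { path: string }[] };
}

// Minimal structural view of the SDK objects used below (assumption, see lead-in).
interface DatabaseLike {
  containers: { createIfNotExists(spec: ContainerSpec): Promise<{ container: unknown }> };
}
interface ClientLike {
  databases: { createIfNotExists(spec: { id: string }): Promise<{ database: DatabaseLike }> };
}

const textPaths = ["/title", "/metadata", "/content"];

// Create the database and container if needed, attaching both policies to the container.
async function ensureContainer(
  client: ClientLike,
  databaseId: string,
  containerId: string
): Promise<unknown> {
  const { database } = await client.databases.createIfNotExists({ id: databaseId });
  const { container } = await database.containers.createIfNotExists({
    id: containerId,
    partitionKey: { paths: ["/id"] }, // placeholder partition key
    fullTextPolicy: {
      defaultLanguage: "en-US",
      fullTextPaths: textPaths.map((path) => ({ path, language: "en-US" })),
    },
    indexingPolicy: {
      fullTextIndexes: textPaths.map((path) => ({ path })),
    },
  });
  return container;
}
```

Both policies are passed in the container definition, so they are sent to the service as part of the create call.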

 

 

 

  3. Run the app. This makes the SDK apply both policies to our container.
  4. Insert your data into the database.
  5. Run full-text queries against the data. Before we do so, we need to understand full-text search queries.

Full text search and scoring operations are performed using the following system functions in the Azure Cosmos DB for NoSQL query language:

  • FullTextContains: Returns true if a given string is contained in the specified property of a document. This is useful in a WHERE clause when you want to ensure specific keywords are included in the documents returned by your query.
    SELECT TOP 10 * FROM c WHERE FullTextContains(c.text, "bicycle")
  • FullTextContainsAll: Returns true if all of the given strings are contained in the specified property of a document. This is useful in a WHERE clause when you want to ensure that multiple keywords are included in the documents returned by your query.
    SELECT TOP 10 * FROM c WHERE FullTextContainsAll(c.text, "red", "bicycle")
  • FullTextContainsAny: Returns true if any of the given strings are contained in the specified property of a document. This is useful in a WHERE clause when you want to ensure that at least one of the keywords is included in the documents returned by your query.
    SELECT TOP 10 * FROM c WHERE FullTextContains(c.text, "red") AND FullTextContainsAny(c.text, "bicycle", "skateboard")
  • FullTextScore: Returns a BM25 relevance score. This can only be used in an ORDER BY RANK clause, where the returned documents are ordered by the rank of the full-text score, with the most relevant (highest scoring) documents at the top, and least relevant (lowest scoring) documents at the bottom.
    SELECT TOP 10 * FROM c ORDER BY RANK FullTextScore(c.text, ["bicycle", "mountain"])

Below are code samples showcasing the above Full-text search queries
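As a hedged sketch, the four queries can be run through the SDK's `container.items.query(...).fetchAll()` call, which returns the matching items in `resources`. The `QueryableContainer` interface below is a minimal structural stand-in for the SDK's container object so the example does not need the SDK installed, and `runFullTextQueries` is a hypothetical helper name. The queries keep `c.text` from the examples above; with a policy on title, metadata and content you would query one of those properties instead.

```typescript
// The query strings shown above, keyed by a short label.
const fullTextQueries: Record<string, string> = {
  contains: 'SELECT TOP 10 * FROM c WHERE FullTextContains(c.text, "bicycle")',
  containsAll: 'SELECT TOP 10 * FROM c WHERE FullTextContainsAll(c.text, "red", "bicycle")',
  containsAny:
    'SELECT TOP 10 * FROM c WHERE FullTextContains(c.text, "red") ' +
    'AND FullTextContainsAny(c.text, "bicycle", "skateboard")',
  score: 'SELECT TOP 10 * FROM c ORDER BY RANK FullTextScore(c.text, ["bicycle", "mountain"])',
};

// Minimal structural view of the container calls used here (assumption:
// a Container from @azure/cosmos fits this shape).
interface QueryableContainer {
  items: {
    query(queryText: string): { fetchAll(): Promise<{ resources: unknown[] }> };
  };
}

// Run every query in turn and collect the returned items per label.
async function runFullTextQueries(
  container: QueryableContainer
): Promise<Record<string, unknown[]>> {
  const results: Record<string, unknown[]> = {};
  for (const label of Object.keys(fullTextQueries)) {
    const { resources } = await container.items.query(fullTextQueries[label]).fetchAll();
    results[label] = resources;
  }
  return results;
}
```

In an app, the container would come from something like `client.database(databaseId).container(containerId)`, and the returned record could then be logged per label to compare what each function matches.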

This is how you implement full-text search in Azure Cosmos DB using the Azure Cosmos DB client library for JavaScript/TypeScript.

Source Code

To see the whole code base, visit here.

Updated Jan 30, 2025
Version 1.0