We are thrilled to announce the launch of the Cohere Rerank v3.5 model in the model catalog in Azure AI Foundry. This integration lets developers and enterprises add state-of-the-art reranking to their search systems, yielding more accurate and contextually relevant search results.
What is Cohere Rerank v3.5?
Cohere Rerank v3.5 is the latest iteration in Cohere's series of reranking models, designed to reorder search results based on a deep semantic understanding of user queries and document content. With a context window of 4,096 tokens, Rerank v3.5 excels in processing complex queries and large documents, delivering precise and relevant search outcomes. Notably, it offers multilingual support for over 100 languages, making it an invaluable tool for global enterprises.
How Does a Reranker Work?
In search systems, initial retrieval typically relies on keyword matching or vector search. It returns a mix of relevant and irrelevant documents. Although the results are loosely ordered by relevance, the ordering is not precise.
For example, first-stage retrieval may return 20 documents of which only 5 are relevant, and those 5 do not all appear at the top. Labeling each document by its true relevance rank, the head of the list might look like: [2, 10, 12, 3, 5, 4, 1]
Rerank would reorder the results so that the relevant documents come first, in order of relevance: [1, 2, 3, 4, 5]
A reranker addresses this by reordering the initially retrieved documents, evaluating them based on their semantic alignment with the user's query. This process ensures that the most pertinent information surfaces at the top, enhancing the overall search experience. By integrating a reranker, systems can mitigate issues where relevant items might otherwise be buried deep in the results due to limitations in the initial retrieval process.
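The reordering step can be illustrated with a toy sketch. The scoring function below is a hypothetical stand-in based on simple word overlap, not the Rerank v3.5 model, which scores semantic relevance; only the score-then-sort mechanics carry over.

```python
def overlap_score(query: str, document: str) -> float:
    """Hypothetical stand-in scorer: fraction of query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


def rerank(query, documents, score=overlap_score, top_n=None):
    """Return (original_index, score) pairs sorted by descending score."""
    scored = [(i, score(query, doc)) for i, doc in enumerate(documents)]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # stable: ties keep input order
    return scored[:top_n] if top_n is not None else scored


# First-stage results: the most relevant document is buried at the end.
first_stage = [
    "Quarterly revenue grew in the cloud segment",
    "How to reset your account password",
    "Office holiday schedule",
    "Steps to reset a forgotten password for your account",
]

ranked = rerank("reset forgotten password", first_stage, top_n=2)
top_documents = [first_stage[i] for i, _ in ranked]
```

Here the last document matches all three query words and moves to the top, while the two unrelated documents are dropped by `top_n`.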
Use Cases and Performance Benefits
Rerank v3.5 is versatile and can significantly enhance various applications:
- Semantic Search Applications: By understanding the context and intent behind user queries, Rerank v3.5 refines search results to be more relevant, improving user satisfaction.
- Retrieval-Augmented Generation (RAG) Pipelines: In RAG systems, Rerank v3.5 improves the quality of retrieved documents, providing a more robust foundation for generative models to produce accurate and contextually appropriate responses.
Performance evaluations have demonstrated that Rerank v3.5 outperforms traditional retrieval models. For instance, in the financial domain, it has shown a 30.8% improvement over traditional BM25 search algorithms and a 25% improvement over dense retrieval methods, highlighting its capability to handle complex, domain-specific queries effectively.
Why Should Customers Care?
In today's data-driven world, the ability to quickly and accurately retrieve information is paramount. Traditional search systems often fall short when handling nuanced queries or large datasets. Rerank v3.5 addresses these challenges by providing:
- Enhanced Accuracy: Delivers more precise search results by deeply understanding query context and content semantics.
- Multilingual Support: Breaks language barriers, allowing enterprises to operate seamlessly across global markets.
- Easy Integration: Rerank v3.5 can be incorporated into existing RAG pipelines with just a single line of code, offering immediate improvements without extensive overhauls.
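To illustrate why adding a reranker is a small change, here is a minimal sketch of a RAG context-building step with a pluggable rerank stage. The names `build_context`, `build_prompt`, `demo_retrieve`, and `demo_rerank` are hypothetical and do not come from the Cohere or Azure SDKs; in a real pipeline the rerank function would call the deployed Rerank v3.5 endpoint.

```python
from typing import Callable, List, Sequence

Retriever = Callable[[str], List[str]]
Reranker = Callable[[str, Sequence[str]], List[int]]  # document indices, best first


def build_context(query: str, retrieve: Retriever, rerank_fn: Reranker, top_n: int = 3) -> List[str]:
    candidates = retrieve(query)          # first-stage retrieval (keyword or vector)
    order = rerank_fn(query, candidates)  # the one added step: rerank the candidates
    return [candidates[i] for i in order[:top_n]]


def build_prompt(query: str, context: Sequence[str]) -> str:
    numbered = "\n".join(f"[{n}] {doc}" for n, doc in enumerate(context, start=1))
    return f"Answer using only the context below.\n{numbered}\nQuestion: {query}"


# Stand-ins so the sketch runs without any external service.
def demo_retrieve(query: str) -> List[str]:
    return [
        "Invoices are due in 30 days.",
        "Our office is in Seattle.",
        "Late invoices incur a 2% fee.",
    ]


def demo_rerank(query: str, docs: Sequence[str]) -> List[int]:
    words = query.lower().split()
    hits = [sum(w in d.lower() for w in words) for d in docs]
    return sorted(range(len(docs)), key=lambda i: hits[i], reverse=True)


context = build_context("late invoices fee", demo_retrieve, demo_rerank, top_n=2)
prompt = build_prompt("late invoices fee", context)
```

Because the reranker is passed in as a function, the retrieval and generation stages stay unchanged when it is swapped for a call to a hosted model.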
Benefits of Integration with Azure AI Foundry
Cohere Rerank v3.5 is now available as a serverless API through Models as a Service (MaaS) in Azure AI Foundry. This supports enterprise-scale workloads and provides:
- Network Isolation for Inferencing: Protect your data from public network access.
- Expanded Regional Availability: Access from multiple regions.
- Data Privacy and Security: Robust measures to ensure data protection.
- Quick Endpoint Provisioning: Set up a rerank endpoint in Azure AI Foundry in seconds.
Azure AI ensures seamless integration, enhanced security, and rapid deployment for your AI needs.
How to deploy Cohere Rerank v3.5 models in Azure AI Foundry?
Prerequisites:
- If you don’t have an Azure subscription, get one here: https://azure.microsoft.com/en-us/pricing/purchase-options/pay-as-you-go
- Familiarize yourself with Azure AI Model Catalog
- Create an Azure AI Foundry hub and project. Make sure you pick East US, West US 3, South Central US, West US, North Central US, East US 2, or Sweden Central as the Azure region for the hub.
Create a deployment to obtain the inference API and key:
- Open the model card in the model catalog on Azure AI Foundry.
- Click on Deploy and select the Pay-as-you-go option.
- Subscribe to the Marketplace offer and deploy. You can also review the API pricing at this step.
- In less than a minute, you should land on the deployment page, which shows the inference API endpoint and key.
These steps are outlined in detail in the product documentation.
To get started, check the samples for LangChain, web requests, and the Cohere client.
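As a rough sketch of the web-request route, the code below builds a rerank request and maps the response back to the input documents using only the Python standard library. The field names (`query`, `documents`, `top_n`, `results[].index`, `results[].relevance_score`) follow Cohere's public Rerank API; that the Azure endpoint uses the same shape, the `Bearer` authorization header, and the `rerank-v3.5` model name are assumptions to confirm against your deployment page and the product documentation. The function names are hypothetical.

```python
import json
import urllib.request


def build_rerank_payload(query, documents, top_n=3, model="rerank-v3.5"):
    """Request body; the model name is an assumption and may be unnecessary on Azure."""
    return {"model": model, "query": query, "documents": list(documents), "top_n": top_n}


def order_by_relevance(documents, response_body):
    """Pair each returned index with its original document and relevance score."""
    return [(documents[r["index"]], r["relevance_score"]) for r in response_body["results"]]


def call_rerank(endpoint_url, api_key, payload):
    """POST the payload. endpoint_url is the full rerank route from the deployment page."""
    request = urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed header format
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


documents = [
    "Carson City is the capital city of the American state of Nevada.",
    "Capital punishment has existed in the United States since colonial times.",
    "Washington, D.C. is the capital of the United States.",
]
payload = build_rerank_payload("What is the capital of the United States?", documents, top_n=2)

# Hand-written mock showing the expected response shape only; not real model output.
mock_response = {"results": [
    {"index": 2, "relevance_score": 0.9},
    {"index": 0, "relevance_score": 0.1},
]}
ranked = order_by_relevance(documents, mock_response)

# With a real deployment (placeholders, not real values):
# response = call_rerank("<your-endpoint-rerank-url>", "<your-api-key>", payload)
# ranked = order_by_relevance(documents, response)
```

Keeping the payload builder and response parser separate from the network call makes both easy to check before any key is provisioned.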
Improve your RAG pipeline today
Integrating Cohere Rerank v3.5 from Azure AI Foundry into your AI applications enhances search and recommendation capabilities while providing a scalable, secure, and easy-to-deploy platform. Get started with Cohere Rerank v3.5 on Azure AI Foundry today!
Updated Feb 26, 2025
Version 2.0
Sharmichock
Microsoft
AI - Machine Learning Blog