Public preview of DiskANN on Azure Database for PostgreSQL is now open - No sign-up needed!
We’re thrilled to announce that the public preview of DiskANN on Azure Database for PostgreSQL is now open. No sign-up is needed: it is available to all Azure Database for PostgreSQL customers right now.
Based on your valuable feedback from our initial release in October, we've supercharged DiskANN with parallel index build for improved performance, numerous bug fixes, and enhanced stability. DiskANN enables developers to perform highly accurate and efficient vector searches on large vector datasets, making it an ideal solution for scaling Generative AI applications.
Try DiskANN today and elevate your AI projects to the next level!
What is DiskANN?
Developed by Microsoft Research and used extensively at Microsoft in global services such as Bing and Microsoft 365, DiskANN is an approximate nearest neighbor search algorithm designed for efficient vector search at scale. It provides high recall, high throughput, and low query latency essential for modern AI and RAG applications.
Why use Azure Database for PostgreSQL with DiskANN Vector Index?
- Scalability: DiskANN is optimized for large datasets, making it ideal for handling millions of vectors.
- Accuracy: DiskANN uses iterative post filtering to enhance the accuracy of filtered vector search results without compromising on speed or precision.
- Low Latency: The DiskANN graph index construction makes it very efficient during search, minimizing the number of SSD reads to achieve high throughput and low latency.
- Integration: Seamlessly integrates with Azure Database for PostgreSQL, leveraging the power and flexibility of PostgreSQL.
- Learn more about DiskANN from Microsoft.
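To make the "iterative post filtering" idea above more concrete, here is a minimal Python sketch of the general technique. It is not the pg_diskann implementation. The row layout (id, embedding, category), the function names, and the batch size are all hypothetical, and a plain sort stands in for the index scan. The point is only the loop: keep pulling nearest-first candidates and applying the filter until enough rows qualify.

```python
import itertools
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def ranked_candidates(rows, query):
    """Stand-in for an index scan (assumption): yields rows nearest-first."""
    yield from sorted(rows, key=lambda row: cosine_distance(row[1], query))


def filtered_search(rows, query, predicate, k, batch_size=2):
    """Pull candidates in batches and filter them until k rows match."""
    results = []
    scan = ranked_candidates(rows, query)
    while len(results) < k:
        batch = list(itertools.islice(scan, batch_size))
        if not batch:  # candidates exhausted before k matches were found
            break
        results.extend(row for row in batch if predicate(row))
    return results[:k]


# Hypothetical rows: (id, embedding, category)
rows = [
    (1, [1.0, 2.0, 3.0], "house"),
    (2, [4.0, 5.0, 6.0], "flat"),
    (3, [7.0, 8.0, 9.0], "house"),
]
houses = filtered_search(rows, [2.0, 3.0, 4.0], lambda row: row[2] == "house", k=2)
house_ids = [row[0] for row in houses]
```

A single fixed-size fetch of the two nearest candidates would have returned only one house here, because the nearest row is a flat. Continuing the scan is what fills the result set.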
Benefits of using a vector index in your AI application
Using a vector index in PostgreSQL, such as pg_diskann, dramatically improves query performance and reduces latency for high-dimensional data applications like search engines, recommendation systems, and e-commerce websites. Unlike brute-force search, vector indexes optimize similarity searches by organizing data for efficient nearest neighbor queries using metrics like cosine similarity, Euclidean distance, or inner product. They leverage approximate algorithms, such as DiskANN, to reduce the search space, enabling sub-linear query times even for datasets with millions of vectors.
On average, a vector index can achieve sub-10-millisecond query times on a 1-million-row dataset, while a brute-force search could take ~200 milliseconds or more. That difference makes a vector index ideal for real-time applications.
For example, an Airbnb-style platform could use vector search to match a user's query with similar properties in the database. The index allows the system to quickly surface the most relevant listings, turning what could be seconds-long processing into millisecond responses and ensuring a fast, personalized search experience.
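For readers who want to see what the distance metrics mentioned above actually compute, here is a small Python sketch. The function names are our own. The operator mapping in the comments follows pgvector's conventions (<-> for Euclidean distance, <#> for negative inner product, <=> for cosine distance), which pg_diskann is described as sharing.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance; pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Negative inner product; pgvector's <#> operator.

    The sign is flipped so that "smaller is closer", like the other metrics.
    """
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity); pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Note that cosine distance ignores vector length: two vectors pointing in the same direction have a distance of 0 even when their magnitudes differ, while their Euclidean distance can be large.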
Using DiskANN on Azure Database for PostgreSQL
Using DiskANN on Azure Database for PostgreSQL is easy.
- Enable the pgvector and diskann extensions: Allowlist the pgvector and diskann extensions in your server configuration.
- Create Extension in Postgres: Create the pg_diskann extension on your database along with any dependencies.
CREATE EXTENSION IF NOT EXISTS pg_diskann CASCADE;
- Create a Vector Column: Define a table to store your vector data, including a column of type vector for the vector embeddings.
CREATE TABLE demo (
    id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    embedding public.vector(3)
);

INSERT INTO demo (embedding) VALUES
    ('[1.0, 2.0, 3.0]'),
    ('[4.0, 5.0, 6.0]'),
    ('[7.0, 8.0, 9.0]');
- Index the Vector Column: Create an index on the vector column to optimize search performance. The pg_diskann PostgreSQL extension is compatible with pgvector: it uses the same types, distance functions, and syntactic style.
CREATE INDEX demo_embedding_diskann_idx ON demo USING diskann (embedding vector_cosine_ops);
- Perform Vector Searches: Use SQL queries to search for similar vectors based on various distance metrics. The example below orders rows by cosine distance using the <=> operator, so the most similar vectors come first.
SELECT id, embedding FROM demo ORDER BY embedding <=> '[2.0, 3.0, 4.0]' LIMIT 5;
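As a sanity check on the query above, the following Python sketch computes the exact (brute-force) answer for the same three demo vectors. It assumes the identity column assigned ids 1, 2, and 3 in insertion order; the helper names are our own. An approximate index such as DiskANN aims to return this same ordering without scanning every row.

```python
import math

# The demo rows, assuming ids 1..3 were assigned in insertion order.
demo = {
    1: [1.0, 2.0, 3.0],
    2: [4.0, 5.0, 6.0],
    3: [7.0, 8.0, 9.0],
}


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), as used by <=>."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_top_k(table, query, k):
    """Brute force: score every row, sort by distance, keep the first k."""
    ranked = sorted(table.items(), key=lambda item: cosine_distance(item[1], query))
    return ranked[:k]


result = exact_top_k(demo, [2.0, 3.0, 4.0], 5)
ids = [row_id for row_id, _ in result]
```

With only three rows, LIMIT 5 returns all of them. The exact ordering by cosine distance is id 2, then id 1, then id 3, because [4.0, 5.0, 6.0] points in the direction closest to the query vector.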
Ready to Dive In?
Use the DiskANN preview today and explore the future of AI-driven applications with the power of Azure Database for PostgreSQL!
Learn More
Integrating DiskANN with Azure Database for PostgreSQL enables scalable, efficient AI applications. By leveraging advanced vector search capabilities, you can enhance the performance of your AI applications and deliver more accurate results faster than ever before.
Updated Feb 07, 2025
Version 3.0
abeomor-msft
Microsoft
Azure Database for PostgreSQL Blog