Increase scalability, optimize performance, and integrate advanced AI features with Azure Database for PostgreSQL Flexible Server. Scale up with SSD v2 for up to 4x more transactions per second and significantly lower latency. Scale out effortlessly with Elastic Clusters using Citus, distributing workloads across multiple servers at no extra cost. Optimize performance with Automatic Index Tuning, reducing server utilization and enhancing efficiency. Leverage advanced AI features like DiskANN, semantic reranking, and Graph RAG to build better AI apps.
Charles Feddersen, Partner Director of Program Management for Postgres at Microsoft, joins Jeremy Chapman to show you how to build scalable, high-performance AI apps on Azure PostgreSQL Flexible Server today.
Up to 4x more transactions per second & lower latency.
Get started with SSD v2 public preview on Azure Database for PostgreSQL Flexible Server today.
Scale your PostgreSQL workloads using an elastic cluster.
Run your Postgres workload at cluster scale with Flexible Server today. See it here.
Reduce query time from minutes to milliseconds.
Optimize PostgreSQL performance with Azure Database for PostgreSQL Automatic Index Tuning. Start here.
Watch our video here.
QUICK LINKS:
00:00 — Azure Database for Postgres Flexible Server
00:39 — Updates to PostgreSQL
01:50 — Faster storage with SSD v2
03:06 — Scale out workloads using elastic cluster
03:59 — Provision an elastic cluster
05:34 — Scale “in-place”
07:12 — Eliminate redundant indexes
07:44 — Optimize server parameters
09:52 — Improve AI apps with Postgres Flexible Server
11:40 — Code behind queries using pgAdmin
12:31 — Scale workloads
13:36 — Wrap up
Link References
Get started at https://aka.ms/azurepostgresblog
Access code at https://aka.ms/AzurePostgresAI
Unfamiliar with Microsoft Mechanics?
Microsoft Mechanics is Microsoft’s official video series for IT. You can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft.
- Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries
- Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog
- Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast
Keep getting this insider knowledge, join us on social:
- Follow us on Twitter: https://twitter.com/MSFTMechanics
- Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/
- Enjoy us on Instagram: https://www.instagram.com/msftmechanics/
- Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics
Video Transcript:
- Postgres is one of the most popular open source databases in use today. And the team at Microsoft has been working to make Postgres on Azure more scalable, easy to manage, and more optimized for building new AI apps. And today we’re going to explore Azure Database for Postgres Flexible Server to learn more about scaling infrastructure and designing data models to support different workloads, optimizing configurations for specific workloads quickly and easily, leveraging new AI features to build better AI apps. And to walk us through all this, I’m joined once again by Charles Feddersen who leads the product management team for Postgres on Azure. Welcome back.
- Thanks for having me. It’s great to be back, Jeremy.
- And it’s really great to have you back on. You know, last time you were on the show, we actually showed how Microsoft is deeply invested in Postgres. Not only do we run the Flexible Server managed service in Azure, we also contribute heavily to the open source community. So what are some of the latest updates that we have that we’ve worked on with Postgres?
- Yeah, look, there’s really been a ton of updates. For the latest Postgres 17 release, Microsoft engineers made 412 commits to the open source project, and these include capabilities like I/O combining, which can improve I/O efficiency, and planner changes to optimize UNION scenarios, amongst many others. Both of these optimizations improve query performance and are easily applied to your existing workloads.
- Okay, so as we make updates and the community makes updates, how soon can we get those into the service running on Azure?
- So we’ve got it down to just a few days. In fact, this year we reduced the time between open source Postgres 17 becoming generally available, and when we shipped it in Flexible Server to just four days. And one of the best things is that we’ll make Postgres 17 in Azure generally available in-place so that you won’t need to redo any of the work that you’ve done during preview. And we’re working on ways to bring new Postgres versions to Azure even faster in the future.
- Okay, so Postgres is already running on version 17 on Azure. So what are we doing now in terms of making Azure the best place to run your Postgres workloads?
- So we’re really focused on scalability, and we’ve taken three distinct approaches to this. The first is to provide the latest Azure infrastructure for Flexible Server, and in this case, it’s faster storage with SSD v2. I’ll show you the performance of these disks. I’ve set up a side-by-side benchmark comparing premium SSD v1 on the left with premium SSD v2 on the right. I’ll go ahead and kick off the benchmark on the right side, and I’ll highlight the host name to confirm it’s using SSD v2. Then I’ll kick off the SSD v1 benchmark on the left. As the initial results come in, you can see that v2 is running at about 15,000 transactions per second, whereas v1 is only running at about 4,000 transactions per second. This correlates with the latency being much higher on SSD v1 than on v2. So scaling up the disk to SSD v2 alone can provide almost 4x the transactions per second. These disks are still in preview, but you can start testing them today.
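As a rough sketch, a side-by-side run like this can be reproduced with the standard pgbench tool that ships with Postgres. The host names, user, and scale factor below are placeholders, not the values used in the demo.

```shell
# Hypothetical server names; substitute your own Flexible Server hosts.
# Initialize a pgbench dataset, then run a 60-second read/write benchmark
# with 32 concurrent clients, reporting progress every 10 seconds.
pgbench -h myserver-ssdv2.postgres.database.azure.com -U myadmin -d postgres -i -s 1000
pgbench -h myserver-ssdv2.postgres.database.azure.com -U myadmin -d postgres \
  -c 32 -j 8 -T 60 --progress=10
```

Running the same two commands against the SSD v1 server gives a like-for-like comparison of transactions per second and average latency.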
- All right, so we’ve seen now scaling up using faster disks, but you mentioned there were three different approaches. So what are the other ones?
Yeah, the second approach is scaling out workloads using the new elastic cluster feature. This enables you to distribute a single Postgres workload across multiple servers using sharded data. Elastic clusters use Citus to scale workloads, and it’s now available in Flexible Server at no additional cost. And Citus supports two approaches for scaling out your data. You can either create distributed tables that spread the rows of a table across different servers, or you can distribute entire schemas where one or more schemas are stored on each node of a cluster. And unlike previous versions of Citus, elastic clusters in Flexible Server have no coordinator node. You can just add compute nodes and scale your workload. This makes it more efficient and easier to manage.
- Okay, so what does it take then to set up an elastic cluster in this case?
- Yeah, I can show you. I’ll start by provisioning a Flexible Server with elastic cluster. I’m in the portal, and I’m going to configure a new Flexible Server. I’ll click configure server, and you can see I’ve got a new cluster option. The default is server, which is a traditional single node Flexible Server that you can use today, or I can choose the new elastic cluster option. I’ll select that, and then I just need to set the number of nodes in my cluster, and I’ll start with four. So I’ll move over to pgAdmin, and I’m connected to my cluster. So let’s scale out a table. I’ve created a normal Postgres table with this query, and distributed it out using a Citus command. Super easy. Now, I’ve pre-populated a table earlier, so let’s use that to build on our previous demo to show SSD v2 performance in a cluster. Here, I’ve got another side-by-side demo with terminal windows, where the left side is a single Flexible Server, and on the right is an elastic cluster with four nodes. I’ll kick off the left with the single server, and now the cluster on the right. And as this runs, you’ll see that the cluster with sharded data has significantly higher transactions per second versus our Flexible Server running on SSD v2. And once it’s complete, you’ll see the average transactions per second is much higher, and the latency is also significantly lower in this case.
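The two steps shown in the demo, creating a normal table and then distributing it, can be sketched with the Citus `create_distributed_table` function. The table name and distribution column here are illustrative assumptions, not the demo's actual schema.

```sql
-- Create an ordinary Postgres table...
CREATE TABLE events (
    tenant_id  bigint NOT NULL,
    event_id   bigint NOT NULL,
    payload    jsonb,
    created_at timestamptz DEFAULT now(),
    PRIMARY KEY (tenant_id, event_id)
);

-- ...then shard its rows across the cluster nodes on the chosen column.
SELECT create_distributed_table('events', 'tenant_id');
```

Queries that filter on the distribution column (here `tenant_id`) route to a single node, while other queries fan out across the cluster.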
- It’s really great to see this has come to Flexible Server, but now we’ve kind of looked at scaling up with our disks and scaling out with our elastic cluster. So what’s the next one?
- Yeah, so the third approach is simply scaling in-place, where we’ve built features to help maximize the usage of existing resources. And there are two ways to do this automatically in Postgres Flexible Server. First, instead of manually tuning indexes, which is time consuming and error prone, we’ve built Automatic Index Tuning to optimize indexes for you. A well-designed index can reduce query time from minutes to milliseconds. Let me show you an example. Here I’m in the portal on the index tuning blade, and you can see that the tuning engine is configured to analyze the past one hour of queries. But I’d like a larger sample size. So I’ll click on tuning settings and change the frequency to four hours, and this will then analyze a four-hour window of my workload. You can see there are other configurations specific to create and drop recommendations, but we’ll leave those as is. Once the tuning is finished, I can see there are 20 recommendations, and I’ll go ahead and click to view those. I’ll select the top recommendation. And you can see it tells me the database, schema, and table that the index applies to. It gives me the SQL script to create the index. And it even gives me an impact assessment so that I can see the estimated performance improvement, size on my disk, and the reason for creating this index. Skipping ahead, I’ve gone ahead and created those 20 indexes, and I’ll look at the resource usage on my database. You can see that before index tuning, my server was running at about 70% utilization, but now it’s dropped significantly to only about 15%. This is a great outcome using a really simple index optimization feature.
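For context, each index tuning recommendation includes a ready-to-run SQL script. A hypothetical example of the kind of script produced (the table and column names are placeholders):

```sql
-- CONCURRENTLY builds the index without blocking writes to the table.
CREATE INDEX CONCURRENTLY idx_orders_customer_id
    ON public.orders (customer_id);
```

The portal's impact assessment then estimates the performance gain and the disk space the new index will consume before you apply it.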
- So in this case, we saw how the creation of indexes basically helps in terms of performance, but I know for personal experience also dropping redundant indexes can help performance as well.
- Yeah, I think generally the focus is on creating indexes to improve performance, but what a lot of people don’t realize is that over time, redundant indexes accumulate, which can have a significant impact on performance. This feature is really useful for eliminating redundant indexes and it reduces the compute required to maintain indexes and the IOPS and disk space needed to modify and store them.
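As a complement to the automated feature, one common way to spot candidate redundant indexes yourself is to query Postgres's built-in statistics for indexes that have never been scanned since stats were last reset. This is a standard catalog query, shown here as a sketch:

```sql
-- List user indexes with zero scans, largest first.
-- An unused index still costs compute and IOPS on every write.
SELECT schemaname,
       relname       AS table_name,
       indexrelname  AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```

Note that an index can be unused in one workload window but needed in another (for example, month-end reporting), so treat the results as candidates to investigate, not to drop automatically.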
- So how else then can we optimize our Postgres workloads?
- So another key approach in Postgres is optimizing server parameters. Tuning server parameters is a well-known optimization technique for Postgres, but it’s hard to get right, and even once you do, workloads change and the new configuration may not be optimal anymore. So to help with this, we’ve now automated parameter tuning, and we’ve had really positive feedback from organizations who have faced this challenge in their own workloads.
- Okay, so how would something like this then work?
- Well, it profiles your workload and progressively reconfigures a subset of server parameters to find the optimal configuration to maximize throughput. Let me show you how it works. I’m in the portal, and you can see I’m in the parameter tuning blade. I’ll click run tuning session, and the blade gives me the choice of restarting during the optimization process or not. This is because Postgres has some parameters that need a restart to take effect. I’ll keep no restart enabled to avoid interruption to my workload. And you can see that the session has started. And when I click into the tuning session, there are two charts. The top one shows an average query runtime, which we want to minimize, and the lower one is for transactions per second, which we want to maximize. The tuning session actually takes about four hours, so I’m going to show you one we prepared earlier. About halfway through, you can see the charts are populated to show the performance of different parameter configurations. And if we skip to the end, the green shows the optimal configuration. Now, if you’re curious, you can select other parameter configurations from the chart and then look at the table below to see the different settings for each parameter.
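The configurations the tuner explores map onto ordinary Postgres server parameters. On a self-managed Postgres instance you could set comparable parameters manually as below; on Flexible Server, parameters are managed through the portal or CLI server parameters blade instead, and the values here are purely illustrative, not recommendations.

```sql
-- Illustrative values only; optimal settings depend on your workload.
ALTER SYSTEM SET work_mem = '16MB';
ALTER SYSTEM SET random_page_cost = 1.1;

-- Apply parameters that do not require a restart.
SELECT pg_reload_conf();
```

This is exactly the trial-and-error loop the automated tuner runs for you, measuring query runtime and throughput after each reconfiguration.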
- Right, and those recommendations effectively add scale. And I say it in air quotes, because it’s kind of like if you want to make a race car faster, you can either lighten its weight or you can add more horsepower. So getting that kind of in-place optimization there first is always a good practice.
- Yeah, look, you always want to maximize the resources that you’ve got first before you add more, but sometimes there is no option but to add more power. The scale-out capabilities are unique to this class of Postgres service, and if the data model is well architected for scale-out, you can see a significant performance boost in your apps.
- That’s really good advice. But I want to switch gears because I think a lot of people that are watching right now are developers who’re thinking about things like writing AI apps. So how can Postgres make AI apps better?
Yeah, so we’re a couple of years into this AI journey. Application patterns are starting to be established, and we’ve seen a lot of amazing AI apps on Flexible Server. And your feedback on AI features is clear: the most important requirement when querying a database is accuracy. And so to help with this, we’ve implemented three new features: a new vector index called DiskANN, which was developed by Microsoft to optimize how indexes are used in memory and offloads work to the disk, Semantic Reranking to reorder query results and improve their accuracy, and Graph RAG that allows querying of complex relationships between data points and can significantly improve recall.
- And the good news is, as a Postgres developer, you can just use normal SQL commands.
- That’s right. You can leverage each of these features independently or combine them to improve the recall of a query. Let me show you how to use them together. For this demo, I’m going to use a simple app to show how recall improves, and then we’ll walk through the code to show you how it actually works. I’ve got an app that searches through public US case law data. At the top, I can choose to search using either Vector Search, Semantic Ranker, or Graph RAG, which combines all techniques into a single query. I’ll run the prompt on the left, which relates to prominent cases of water leaking in Washington. In the results, we can see cases related to water, but nothing prominent. Let’s go back now and try the Semantic Ranker approach. In these results, I can see water leaks, but they’re still not quite right. I’ll go one more time and use Graph RAG. These results look really good, and I can see a specific reference to a prominent case of water leakage in Bremerton, which is in Washington. Let’s look at the code behind each of these three example queries using pgAdmin. I’ve got three tabs here, one for each query, and I’ll just show you the key syntax for each one. The first query just uses a standard pgvector operator for cosine distance to return the most relevant results. The second query for ranking is a little more complex. This part of the code passes my initial results to the ranker model to optimize for better relevance. And the third query is a little more complex again. To provide Graph RAG functionality, we’re using the Apache AGE extension. This part of the code is the key difference, where we’re using a Cypher query to count the number of citations each legal case has to identify which ones are the most prominent.
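For reference, the first and third query shapes can be sketched roughly as below. The table, column, and graph names are placeholders rather than the actual demo schema, and the semantic reranking step is omitted here because its syntax depends on how the ranker model is wired up in your environment.

```sql
-- 1) Plain vector search: pgvector's cosine-distance operator (<=>).
--    The 3-dimensional literal is a toy stand-in for a real embedding.
SELECT id, case_name
FROM cases
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector
LIMIT 10;

-- 3) Graph query via the Apache AGE extension: count inbound citations
--    per case to find the most prominent ones. Requires LOAD 'age';
SELECT *
FROM ag_catalog.cypher('case_graph', $$
    MATCH (c:Case)<-[:CITES]-(other)
    RETURN c.name, count(other) AS citations
    ORDER BY citations DESC
$$) AS (case_name agtype, citations agtype);
```

Combining the two, vector search to find candidate cases and the citation graph to rank their prominence, is what gives the Graph RAG result its improved recall.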
- Right, and search is such an important part for RAG and also agentic-based applications, so I wonder though, if you can take basically the scale that we saw before with all the things we’re doing to scale up and out and kind of in place, and this search technology together to make things better.
- Yeah, absolutely. When you start to work with embeddings, you’ll notice pretty quickly that they’re really large in size and computationally intensive to query, and you can exhaust the scale of a single server or a disk pretty quickly. With Flexible Server elastic clusters, you can easily scale these workloads across machines to multiply compute and storage for these large workloads. Let me give you a quick example. I’m connected to a standard single-node Flexible Server on the left and an elastic cluster with four nodes on the right. I’m running a simple similarity search query against embeddings with a vector index. On the left, you can see my single node returns about 2,000 transactions per second, but my four-node cluster on the right scales to around 8,000 transactions per second. This horizontal scalability is very important for building RAG-based apps on Postgres, where you may have thousands of concurrent users chatting with the data at once.
- So now we’ve shown how Postgres Flexible Server can meet performance demands and also how it can power new AI apps. So for all the folks that are watching right now looking to get started, what do you recommend?
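Putting the scale-out and vector pieces together can be sketched as below. The table and column names are hypothetical, and the `diskann` access method assumes the DiskANN index extension is enabled on your server; if it is not, pgvector's built-in `hnsw` method is a substitute.

```sql
-- Shard the embeddings table across the elastic cluster nodes...
SELECT create_distributed_table('documents', 'doc_id');

-- ...then build a vector index for cosine similarity search.
-- Each node indexes only its local shards, multiplying both the
-- compute and the memory available for similarity queries.
CREATE INDEX ON documents USING diskann (embedding vector_cosine_ops);
```

Similarity queries then fan out across the nodes, which is what produced the roughly 4x throughput gain in the side-by-side demo.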
- You can find all of our latest updates at aka.ms/azurepostgresblog, and you can also access the code I walked through today at aka.ms/AzurePostgresAI.
- Thanks, Charles. It’s great to see how Postgres just keeps getting better on Azure. Of course, you’re going to want to check out our previous shows on Postgres and AI. Subscribe to be the first to see the new ones. And as always, thank you for watching.
Updated Nov 22, 2024
Version 1.0 | Zachary-Cavanell