Enhancing Infrastructure as Code Generation with GitHub Copilot for Azure
Discover the latest update to GitHub Copilot for Azure, designed to streamline Infrastructure as Code (IaC) generation using Bicep or Terraform. With a new update panel, developers can easily modify project details, hosting services, target services, bindings, and environment variables, all within an intuitive UI. This enhancement eliminates the need for chat-based modifications, improving efficiency and reducing errors. Save time, automate infrastructure deployment, and experience seamless cloud configuration. Try the new GitHub Copilot for Azure update today and optimize your Azure development workflow effortlessly!

Azure ADF ServiceNow connector can't retrieve table columns, but the same login works via the REST API
I have tried to create a pipeline using a copy activity to extract data from a table in our ServiceNow dev platform. I first used the latest version of the ServiceNow connector, but it didn't work. When I tried to import the schema, it showed the error message below:

Failed to load. The API request to ServiceNow failed. Request Url: https://airtrunkautemp.service-now.com/api/now/table/sys_dictionary?sysparm_query=name%3dfacilities_request^ORname%3dsm_order^ORname%3dtask, Status Code: Forbidden, Error message: {"error":{"message":"Insufficient rights to query records","detail":"Field(s) present in the query do not have permission to be read"},"status":"failure"} Activity ID: 5a99e871-893d-4426-809e-0b22654248f8

Then I tried the legacy version of the ServiceNow connector and extracted the full table data using a query. After I executed the pipeline, only one column, sys_id, was returned. I contacted ServiceNow support about the issue; they checked and got back to me saying the access login has no issue. Then I wrote a Python script that uses the REST API to retrieve data from the same table, and it works: I could extract all table columns without the insufficient-rights issue. Does anyone have experience with this? How did you solve it?
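For reference, a minimal sketch of the kind of Python REST cross-check described above; the instance URL, table name, and credentials are placeholders:

# Minimal sketch of the REST API cross-check described above.
# The instance URL, table name, and credentials below are placeholders.
import requests

INSTANCE = "https://<your-instance>.service-now.com"
TABLE = "facilities_request"  # one of the tables queried by the ADF connector

response = requests.get(
    f"{INSTANCE}/api/now/table/{TABLE}",
    params={"sysparm_limit": 10},        # fetch a few records to inspect the columns
    auth=("<username>", "<password>"),   # same login used in the ADF linked service
    headers={"Accept": "application/json"},
    timeout=30,
)
response.raise_for_status()

records = response.json()["result"]
# All table columns are returned here, unlike the single sys_id column
# returned through the legacy connector.
print(sorted(records[0].keys()) if records else "no rows returned")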
Use AI for Free with GitHub Models and TypeScript! 💸💸💸
Learn how to use AI for free with GitHub Models! Test models like GPT-4o without paying for APIs or setting up infrastructure. This step-by-step guide shows how to integrate GitHub Models with TypeScript in the Microblog AI Remix project. Start exploring AI for free today!

What service principal is used to authenticate Logic Apps to Azure resources?
This question is a bit more academic than practical, but I'm just trying to enhance my knowledge of how Azure authentication works under the hood. The default way to authenticate managed Logic Apps connections is through an OAuth popup asking you to grant permissions. Based on my reading of the Azure docs, this means that you're granting access to the delegated permissions of a service principal. For connectors that access the Graph API, such a service principal exists in your tenant with the correct delegated permissions. However, I'm struggling to find an equivalent service principal for connectors that use the Azure Resource Management API to interact with services like Log Analytics, Sentinel, Logic Apps, etc. I do see a service principal called Azure Logic Apps, but it doesn't have any permissions associated with it. My understanding is that it would need the delegated permission user_impersonation to access Azure resources. So my questions are: What service principal is used for the OAuth connection to the Azure Resource Management API? If the Azure Logic Apps service principal is used, how is it able to connect to the ARM API without any permissions? Is there some Azure magic happening under the hood here?

Test failover for Azure SQL database
Hi, I want to use a failover group to protect an Azure SQL server for DR purposes, but I'm unsure how to perform a test failover. Can I use a recovery plan to perform a test failover, keeping the primary node up and running for production while the secondary is available for DR testing? Cheers, Alex

Speed Up OpenAI Embedding By 4x With This Simple Trick!
In today's fast-paced world of AI applications, optimizing performance should be one of your top priorities. This guide walks you through a simple yet powerful way to reduce OpenAI embedding response sizes by 75%, cutting them from 32 KB to just 8 KB per request. By switching from float32 to base64 encoding in your Retrieval-Augmented Generation (RAG) system, you can achieve a 4x efficiency boost, minimizing network overhead, saving costs, and dramatically improving responsiveness. Let's consider the following scenario.

Use Case: RAG Application Processing a 10-Page PDF
A user interacts with a RAG-powered application that processes a 10-page PDF and uses OpenAI embedding models to make the document searchable from an LLM. The goal is to show how optimizing embedding response size impacts overall system performance.

Step 1: Embedding Creation from the 10-Page PDF
In a typical RAG system, the first step is to embed documents (in this case, a 10-page PDF) to store meaningful vectors that will later be retrieved for answering queries. The PDF is split into chunks. In our example, each chunk contains approximately 100 tokens (for the sake of simplicity), but the recommended chunk size varies based on the language and the embedding model.
Assumptions for the PDF:
- A 10-page PDF has approximately 3325 tokens (about 300 tokens per page).
- You'll split this document into 34 chunks (each containing 100 tokens).
- Each chunk will then be sent to the OpenAI embedding API for processing.

Step 2: The User Interacts with the RAG Application
Once the embeddings for the PDF are created, the user interacts with the RAG application, querying it multiple times. Each query is processed by retrieving the most relevant pieces of the document using the previously created embeddings. For simplicity, let's assume:
- The user sends 10 queries, each containing 200 tokens.
- Each query requires 2 embedding requests (since the query is split into 100-token chunks for embedding).
- After embedding the query, the system performs retrieval and returns the most relevant documents (the RAG response).

Embedding Response Size
The OpenAI embedding models take an input of tokens (the text to embed) and return a list of numbers called a vector. This list of numbers represents the "embedding" of the input in the model so that it can be compared with another vector to measure similarity. In RAG, we use embedding models to quickly search for relevant data in a vector database. By default, embeddings are serialized as an array of floating-point values in a JSON document, so each response from the embedding API is relatively large. The array values are 32-bit floating-point numbers, or float32. Each float32 value occupies 4 bytes, and models like OpenAI's text-embedding-ada-002 typically return 1536-dimensional vectors.
The challenge is the size of the embedding response:
- Each response consists of 1536 float32 values (one per dimension).
- 1536 float32 values result in 6144 bytes (1536 × 4 bytes).
- When serialized as UTF-8 for transmission over the network, this results in approximately 32 KB per response due to additional serialization overhead (like delimiters).

Optimizing Embedding Response Size
One approach to optimize the embedding response size is to serialize the embedding as base64. This encoding produces a much more compact representation than the JSON array of floats, while maintaining the integrity of the embedding information. This leads to a significant reduction in the size of the embedding response.
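To make the numbers above concrete, here is a minimal, self-contained Python sketch (no API call involved) that serializes the same 1536-dimensional vector both ways and prints the resulting payload sizes:

# Illustrative size comparison for a single 1536-dimensional embedding,
# mirroring the float32-vs-base64 numbers discussed above.
import base64
import json
import random
import struct

dims = 1536
vector = [random.uniform(-1.0, 1.0) for _ in range(dims)]

# Default serialization: a JSON array of floats, similar to encoding_format="float".
json_payload = json.dumps(vector).encode("utf-8")

# base64 serialization: the raw float32 buffer (1536 * 4 = 6144 bytes)
# encoded as base64 text (6144 * 4 / 3 = 8192 bytes).
raw_float32 = struct.pack(f"<{dims}f", *vector)
base64_payload = base64.b64encode(raw_float32)

print(f"float32 JSON payload: ~{len(json_payload)} bytes")  # roughly 30 KB
print(f"base64 payload: {len(base64_payload)} bytes")        # 8192 bytes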
With base64-encoded embeddings, the response size reduces from 32 KB to approximately 8 KB, as demonstrated below:

| base64 vs float32 | Min (Bytes) | Max (Bytes) | Mean (Bytes) | Min (+) | Max (+) | Mean (+) |
| 100 tokens embeddings: text-embedding-3-small | 32673.000 | 32751.000 | 32703.800 | 8192.000 (4.0x) (74.9%) | 8192.000 (4.0x) (75.0%) | 8192.000 (4.0x) (74.9%) |
| 100 tokens embeddings: text-embedding-3-large | 65757.000 | 65893.000 | 65810.200 | 16384.000 (4.0x) (75.1%) | 16384.000 (4.0x) (75.1%) | 16384.000 (4.0x) (75.1%) |
| 100 tokens embeddings: text-embedding-ada-002 | 32882.000 | 32939.000 | 32909.000 | 8192.000 (4.0x) (75.1%) | 8192.000 (4.0x) (75.2%) | 8192.000 (4.0x) (75.1%) |

The source code of this benchmark can be found at: https://github.com/manekinekko/rich-bench-node (kudos to Anthony Shaw for creating the rich-bench python runner)

Comparing the Two Scenarios
Let's break down and compare the total performance of the system in two scenarios:
Scenario 1: Embeddings Serialized as float32 (32 KB per Response)
Scenario 2: Embeddings Serialized as base64 (8 KB per Response)

Scenario 1: Embeddings Serialized as Float32
In this scenario, the PDF embedding creation and user queries involve larger responses due to float32 serialization. Let's compute the total response size for each phase:
1. Embedding Creation for the PDF:
- 34 embedding requests (one per 100-token chunk).
- 34 responses with 32 KB each.
Total size for PDF embedding responses: 34 × 32 KB = 1088 KB = 1.088 MB
2. User Interactions with the RAG App:
- Each user query consists of 200 tokens (which is split into 2 chunks of 100 tokens).
- 10 user queries, requiring 2 embedding responses per query (for 2 chunks).
- Each embedding response is 32 KB.
Total size for user queries:
Embedding responses: 20 × 32 KB = 640 KB.
RAG responses: 10 × 32 KB = 320 KB.
Total size for user interactions: 640 KB (embedding) + 320 KB (RAG) = 960 KB.
3. Total Size:
Total size for embedding responses (PDF + user queries): 1088 KB + 640 KB = 1728 KB = 1.728 MB
Total size for RAG responses: 320 KB.
Overall total size for all responses: 1728 KB + 320 KB = 2048 KB = 2 MB

Scenario 2: Embeddings Serialized as Base64
In this optimized scenario, the embedding response size is reduced to 8 KB by using base64 encoding.
1. Embedding Creation for the PDF:
- 34 embedding requests.
- 34 responses with 8 KB each.
Total size for PDF embedding responses: 34 × 8 KB = 272 KB.
2. User Interactions with the RAG App:
- Embedding responses for 10 queries, 2 responses per query.
- Each embedding response is 8 KB.
Total size for user queries:
Embedding responses: 20 × 8 KB = 160 KB.
RAG responses: 10 × 8 KB = 80 KB.
Total size for user interactions: 160 KB (embedding) + 80 KB (RAG) = 240 KB
3. Total Size (Optimized Scenario):
Total size for embedding responses (PDF + user queries): 272 KB + 160 KB = 432 KB.
Total size for RAG responses: 80 KB.
Overall total size for all responses: 432 KB + 80 KB = 512 KB

Performance Gain: Comparison Between Scenarios
The optimized scenario (base64 encoding) is 4 times smaller than the original (float32 encoding): 2048 / 512 = 4 times smaller.
The total size reduction between the two scenarios is: 2048 KB - 512 KB = 1536 KB = 1.536 MB.
And the reduction in data size is: (1536 / 2048) × 100 = 75% reduction.
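One practical note when consuming base64-encoded embeddings: the client has to decode the payload back into float values before it can be used for similarity search. A minimal sketch of that decode step, assuming the payload is a packed little-endian float32 buffer:

# Hypothetical consumer-side decode step: turn a base64-encoded embedding back
# into a list of floats before running the similarity search.
# Assumes the payload is a packed little-endian float32 buffer.
import base64
import struct

def decode_embedding(b64_payload: str) -> list[float]:
    raw = base64.b64decode(b64_payload)
    count = len(raw) // 4  # 4 bytes per float32 value
    return list(struct.unpack(f"<{count}f", raw))

# Example: an ~8 KB base64 string decodes back to a 1536-dimensional vector.
# vector = decode_embedding(response.data[0].embedding)
# assert len(vector) == 1536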
How to Configure the base64 Encoding Format
When getting a vector representation of a given input that can be easily consumed by machine learning models and algorithms, as a developer, you usually call either the OpenAI API endpoint directly or use one of the official libraries for your programming language.

Calling the OpenAI or Azure OpenAI APIs
Using the OpenAI endpoint:

curl -X POST "https://api.openai.com/v1/embeddings" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "input": "The five boxing wizards jump quickly",
    "model": "text-embedding-ada-002",
    "encoding_format": "base64"
  }'

Or, calling Azure OpenAI resources:

curl -X POST "https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2024-10-21" \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_API_KEY" \
  -d '{
    "input": ["The five boxing wizards jump quickly"],
    "encoding_format": "base64"
  }'

Using OpenAI Libraries
JavaScript/TypeScript:

const response = await client.embeddings.create({
  input: "The five boxing wizards jump quickly",
  model: "text-embedding-3-small",
  encoding_format: "base64"
});

A pull request has been sent to the openai SDK for Node.js repository to make base64 the default encoding when/if the user does not provide an encoding. Please feel free to give that PR a thumbs up.

Python:

embedding = client.embeddings.create(
  input="The five boxing wizards jump quickly",
  model="text-embedding-3-small",
  encoding_format="base64"
)

NB: from version 1.62, the openai SDK for Python will default to base64.

Java:

EmbeddingCreateParams embeddingCreateParams = EmbeddingCreateParams
  .builder()
  .input("The five boxing wizards jump quickly")
  .encodingFormat(EncodingFormat.BASE64)
  .model("text-embedding-3-small")
  .build();

.NET:
The openai-dotnet library already enforces base64 encoding and does not allow the user to set encoding_format (see).

Conclusion
By optimizing the embedding response serialization from float32 to base64, you achieved a 75% reduction in data size and improved performance by 4x. This reduction significantly enhances the efficiency of your RAG application, especially when processing large documents like PDFs and handling multiple user queries. For 1 million users sending 1,000 requests per month, the total size saved would be approximately 22.9 TB per month, simply by using base64-encoded embeddings.
As demonstrated, optimizing the size of the API responses is not only crucial for reducing network overhead but also for improving the overall responsiveness of your application. In a world where efficiency and scalability are key to delivering robust AI-powered solutions, this optimization can make a substantial difference in both performance and user experience.

Shoutout to my colleague Anthony Shaw for the long and great discussions we had about embedding optimisations.

Unlocking the Power of Azure Container Apps in 1 Minute Video
Azure Container Apps provides a seamless way to build, deploy, and scale cloud-native applications without the complexity of managing infrastructure. Whether you're developing microservices, APIs, or AI-powered applications, this fully managed service enables you to focus on writing code while Azure handles scalability, networking, and deployments. In this blog post, we explore five essential aspects of Azure Container Apps, each highlighted in a one-minute video. From intelligent applications and secure networking to effortless deployments and rollbacks, these insights will help you maximize the capabilities of serverless containers on Azure.

Azure Container Apps - in 1 Minute
Azure Container Apps is a fully managed platform designed for cloud-native applications, providing effortless deployment and scaling. It eliminates infrastructure complexity, letting developers focus on writing code while Azure automatically handles scaling based on demand. Whether running APIs, event-driven applications, or microservices, Azure Container Apps ensures high performance and flexibility with minimal operational overhead.
Watch the video on YouTube

Intelligent Apps with Azure Container Apps – in 1 Minute
Azure Container Apps, Azure OpenAI, and Azure AI Search make it possible to build intelligent applications with Retrieval-Augmented Generation (RAG). Your app can call Azure OpenAI in real time to generate and interpret data, while Azure AI Search retrieves relevant information, enhancing responses with up-to-date context. For advanced scenarios, AI models can execute live code via Azure Container Apps, and GPU-powered instances support fine-tuning and inferencing at scale. This seamless integration enables AI-driven applications to deliver dynamic, context-aware functionality with ease.
Watch the video on YouTube

Networking for Azure Container Apps: VNETs, Security Simplified – in 1 Minute
Azure Container Apps provides built-in networking features, including support for Virtual Networks (VNETs) to control service-to-service communication. Secure internal traffic while exposing public endpoints with custom domain names and free certificates. Fine-tuned ingress and egress controls ensure that only the right traffic gets through, maintaining a balance between security and accessibility. Service discovery is automatic, making inter-app communication seamless within your Azure Container Apps environment.
Watch the video on YouTube

Azure Continuous Deployment and Observability with Azure Container Apps - in 1 Minute
Azure Container Apps simplifies continuous deployment with built-in integrations for GitHub Actions and Azure DevOps pipelines. Every code change triggers a revision, ensuring smooth rollouts with zero downtime. Observability is fully integrated via Azure Monitor, Log Streaming, and the Container Console, allowing you to track performance, debug live issues, and maintain real-time visibility into your app's health, all without interrupting operations.
Watch the video on YouTube

Effortless Rollbacks and Deployments with Azure Container Apps – in 1 Minute
With Azure Container Apps, every deployment creates a new revision, allowing multiple versions to run simultaneously. This enables safe, real-time testing of updates without disrupting production. Rolling back is instant: just select a previous revision and restore your app effortlessly. This powerful revision control system ensures that deployments remain flexible, reliable, and low-risk.
Watch the video on YouTube

Watch the Full Playlist
For a complete overview of Azure Container Apps capabilities, watch the full JavaScript on Azure Container Apps YouTube playlist.

Create Your Own AI-Powered Video Content
Inspired by these short-form technical videos? You can create your own AI-generated videos using Azure AI to automate scriptwriting and voiceovers. Whether you're a content creator or a business looking to showcase technical concepts, Azure AI makes it easy to generate professional-looking explainer content. Learn how to create engaging short videos with Azure AI by following our open-source AI Video Playbook.

Conclusion
Azure Container Apps is designed to simplify modern application development by providing a fully managed, serverless container environment. Whether you need to scale microservices, integrate AI capabilities, enhance security with VNETs, or streamline CI/CD workflows, Azure Container Apps offers a comprehensive solution. By leveraging its built-in features such as automatic scaling, revision-based rollbacks, and deep observability, developers can deploy and manage applications with confidence. These one-minute videos provide a quick technical overview of how Azure Container Apps empowers you to build scalable, resilient applications with ease.

FREE Content
Check out our other FREE content to learn more about Azure services and Generative AI:
Generative AI for Beginners - A JavaScript Adventure!
Learn more about Azure AI Agent Service
LlamaIndex on Azure
JavaScript on Azure Container Apps
JavaScript at Microsoft

Delivering Information with Azure Synapse and Data Vault 2.0
Data Vault has been designed to integrate data from multiple data sources, creatively destruct the data into its fundamental components, and store and organize it so that any target structure can be derived quickly. This article focuses on generating information models, often dimensional models, using virtual entities. They are used in the data architecture to deliver information. After all, dimensional models are easier to consume by dashboarding solutions, and business users know how to use dimensions and facts to aggregate their measures. However, PIT and bridge tables are usually needed to maintain the desired performance level. They also simplify the implementation of dimension and fact entities and, for those reasons, are frequently found in Data Vault-based data platforms. This article completes the information delivery. The following articles will focus on the automation aspects of Data Vault modeling and implementation.