# Azure Resource Manager
## Using NVIDIA Triton Inference Server on Azure Container Apps
**TOC**

- Introduction to Triton
- System Architecture
  - Architecture
  - Focus of This Tutorial
- Setup Azure Resources
  - File and Directory Structure
  - ARM Template
  - ARM Template From Azure Portal
- Testing Azure Container Apps
- Conclusion
- References

### 1. Introduction to Triton

Triton Inference Server is an open-source, high-performance inferencing platform developed by NVIDIA to simplify and optimize AI model deployment. Designed for both cloud and edge environments, Triton enables developers to serve models from multiple deep learning frameworks, including TensorFlow, PyTorch, ONNX Runtime, TensorRT, and OpenVINO, using a single standardized interface. Its goal is to streamline AI inferencing while maximizing hardware utilization and scalability.

A key feature of Triton is its support for multiple model execution modes, including dynamic batching, concurrent model execution, and multi-GPU inferencing. These capabilities allow organizations to serve AI models efficiently at scale, reducing latency and optimizing throughput. Triton also offers built-in HTTP/REST and gRPC endpoints, making it easy to integrate with various applications and workflows. Additionally, it provides model monitoring, logging, and GPU-accelerated inference optimization, enhancing performance across different hardware architectures.

Triton is widely used in AI-powered applications such as autonomous vehicles, healthcare imaging, natural language processing, and recommendation systems. It integrates seamlessly with NVIDIA AI tools, including TensorRT for high-performance inference and DeepStream for video analytics. By providing a flexible and scalable deployment solution, Triton enables businesses and researchers to bring AI models into production with ease, ensuring efficient and reliable inferencing in real-world applications.

### 2. System Architecture

#### Architecture

Development environment:

- OS: Ubuntu 18.04 (Bionic Beaver)
- Docker version: 26.1.3

Azure resources:

- Storage Account: SKU - General Purpose V2
- Container Apps Environment: SKU - Consumption
- Container Apps: N/A

#### Focus of This Tutorial

This tutorial walks you through the following stages:

1. Setting up Azure resources
2. Publishing the project to Azure
3. Testing the application

Each of these stages has numerous corresponding tools and solutions. The options relevant to this session are marked in bold in the table below.

| Aspect | Options |
|---|---|
| Local OS | Windows / **Linux** / Mac |
| How to set up Azure resources and deploy | Portal (i.e., REST API) / **ARM** / Bicep / Terraform |

### 3. Setup Azure Resources

#### File and Directory Structure

Open a terminal and enter the following commands:

```bash
git clone https://github.com/theringe/azure-appservice-ai.git
cd azure-appservice-ai
```

After the commands complete, you should see the following directory structure:

| File and Path | Purpose |
|---|---|
| triton/tools/arm-template.json | The ARM template that sets up all the Azure resources related to this tutorial, including a Container Apps environment, a Container App, and a Storage Account with the sample dataset. |

#### ARM Template

We need to create the following resources or services:

| Resource/Service | Manual Creation Required | Kind |
|---|---|---|
| Container Apps Environment | Yes | Resource |
| Container Apps | Yes | Resource |
| Storage Account | Yes | Resource |
| Blob | Yes | Service |
| Deployment Script | Yes | Resource |

Let's take a look at the triton/tools/arm-template.json file. Refer to its configuration section for all the resources. Since most of the configuration values don't require changes, I've placed them in the variables section of the ARM template rather than the parameters section. This helps keep the configuration simpler.
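For illustration only, here is a trimmed sketch of what such a variables section might look like. The names and values mirror the settings discussed in the table below; the actual template in the repository is the authoritative source.

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "variables": {
    "storageAccountContainerName": "data-and-model",
    "scriptPropertiesRetentionInterval": "P1D",
    "caeNamePropertiesPublicNetworkAccess": "Enabled",
    "appPropertiesConfigurationIngressTargetPort": 8000,
    "appPropertiesTemplateContainers0Image": "nvcr.io/nvidia/tritonserver:22.04-py3"
  },
  "resources": []
}
```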
However, I'd still like to briefly explain some of the more critical settings. As you can see, I've adopted a camelCase naming convention that combines the resource type with the setting name and hierarchy, which makes it easier to understand where each setting is used. The configurations in the template are sorted by resource name, but the following table is grouped by purpose for better clarity.

| Configuration Name | Value | Purpose |
|---|---|---|
| storageAccountContainerName | data-and-model | [Purpose 1: Blob container for model storage] Use this fixed name for the blob container. |
| scriptPropertiesRetentionInterval | P1D | [Purpose 2: Script for uploading models to Blob Storage] No adjustments are needed. This script launches a one-time instance immediately after the blob container is created; it downloads the sample model files and uploads them to the blob container. The Deployment Script resource is automatically deleted after one day. |
| caeNamePropertiesPublicNetworkAccess | Enabled | [Purpose 3: For testing] Your local machine needs to reach the service to run tests, so external access must be enabled. |
| appPropertiesConfigurationIngressExternal | true | [Purpose 3: For testing] Same as above. |
| appPropertiesConfigurationIngressAllowInsecure | true | [Purpose 3: For testing] Same as above. |
| appPropertiesConfigurationIngressTargetPort | 8000 | [Purpose 3: For testing] The Triton service container listens on port 8000. |
| appPropertiesTemplateContainers0Image | nvcr.io/nvidia/tritonserver:22.04-py3 | [Purpose 3: For testing] The Triton service container uses this public image. |

#### ARM Template From Azure Portal

In addition to invoking the ARM template with the Azure CLI, you can, if the JSON file is hosted at a public URL, load its configuration directly into the Azure Portal by following the method described in the article [Deploy to Azure button - Azure Resource Manager]. This is my example: Click Me. After filling in all the required information, click Create. Once the creation process is complete, we can run a test.

### 4. Testing Azure Container Apps

In our local environment, use the following command to start a one-time Docker container. It uses NVIDIA's official client image to send a sample image to the Triton service that was just deployed to Container Apps.

```bash
# Replace XXX.YYY.ZZZ.azurecontainerapps.io with the actual FQDN of your app. There is no need to add https://
docker run --rm nvcr.io/nvidia/tritonserver:22.04-py3-sdk \
  /workspace/install/bin/image_client \
  -u XXX.YYY.ZZZ.azurecontainerapps.io \
  -m densenet_onnx -c 3 -s INCEPTION \
  /workspace/images/mug.jpg
```

After sending the request, you should see the prediction results, indicating that the deployed Triton service is functioning correctly.
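In addition to the full image-client test, Triton exposes standard HTTP health endpoints on the same port, which are handy for quick troubleshooting. A minimal sketch, assuming the ingress settings from the template above (replace the host with your app's actual FQDN):

```bash
# Returns HTTP 200 once the server and its models are ready for inference.
curl -v https://XXX.YYY.ZZZ.azurecontainerapps.io/v2/health/ready
```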
### 5. Conclusion

Beyond basic model hosting, Triton Inference Server's greatest strength lies in its ability to efficiently serve AI models at scale. It supports multiple deep learning frameworks, allowing seamless deployment of diverse models within a single infrastructure. With features like dynamic batching, multi-GPU execution, and optimized inference pipelines, Triton ensures high performance while reducing latency. While it may not replace custom-built inference solutions for highly specialized workloads, it excels as a standardized and scalable platform for deploying AI across cloud and edge environments. Its flexibility makes it ideal for applications such as real-time recommendation systems, autonomous systems, and large-scale AI-powered analytics.

### 6. References

- Quickstart — NVIDIA Triton Inference Server
- Deploying an ONNX Model — NVIDIA Triton Inference Server
- Model Repository — NVIDIA Triton Inference Server
- Triton Tutorials — NVIDIA Triton Inference Server
## Deploy Smarter, Scale Faster – Secure, AI-Ready, Cost-Effective Kubernetes Apps at Your Fingertips!

In our previous blog post, we explored the exciting launch of Kubernetes Apps on Azure Marketplace. This follow-up takes you a step further by demonstrating how to deploy Kubernetes Apps programmatically using tools like Terraform, the Azure CLI, and ARM templates.

As organizations scale their Kubernetes environments, the demand for secure, intelligent, and cost-effective deployments has never been higher. By deploying Kubernetes Apps from Azure Marketplace programmatically, organizations can harness powerful security frameworks, cost-efficient deployment options, and AI solutions to elevate their Azure Kubernetes Service (AKS) and Azure Arc-enabled clusters. This automated approach significantly reduces operational overhead, accelerates time-to-market, and allows teams to dedicate more time to innovation. Whether you're aiming to strengthen security, streamline application lifecycle management, or optimize AI and machine learning workloads, Kubernetes Apps on Azure Marketplace provide a robust, flexible, and scalable solution designed to meet modern business needs. Let's explore how you can leverage these tools to unlock the full potential of your Kubernetes deployments.

### Secure Deployment You Can Trust

- **Certified and Secure from the Start** – Every Kubernetes app on Azure Marketplace undergoes a rigorous certification process and vulnerability scans before becoming available. Solution providers must resolve any detected security issues, ensuring the app is safe from the outset.
- **Continuous Threat Monitoring** – After publication, apps are regularly scanned for vulnerabilities. This ongoing monitoring helps maintain the integrity of your deployments by identifying and addressing potential threats over time.
- **Enhanced Security with RBAC** – Managing permissions and deployments through Azure role-based access control (RBAC) eliminates the need for direct cluster access, reducing the attack surface.

### Lowering the Cost of Your Applications

If your organization has a Microsoft Azure Consumption Commitment (MACC) agreement, you can unlock significant cost savings when deploying your applications. Kubernetes Apps available on Azure Marketplace are MACC-eligible, giving you the following benefits:

- **Significant Cost Savings and Predictable Expenses** – Reduce overall cloud costs with discounts and credits for committed usage, while ensuring stable, predictable expenses that improve financial planning.
- **Flexible and Comprehensive Commitment Usage** – Allocate your commitment across various Marketplace solutions, maximizing flexibility and value for evolving business needs.
- **Simplified Procurement and Budgeting** – Benefit from unified billing and streamlined procurement, driving efficiency and performance.

### AI-Optimized Apps

- **High-Performance Compute and Scalability** – Deploy AI-ready apps on Kubernetes clusters with dynamic scaling and GPU acceleration, optimizing performance and resource utilization for intensive AI/ML workloads.
- **Accelerated Time-to-Value** – Pre-configured solutions reduce setup time, accelerating progress from proof of concept to production, while one-click deployments and automated updates keep AI environments up to date effortlessly.
- **Hybrid and Multi-Cloud Flexibility** – Deploy AI workloads seamlessly on AKS or Azure Arc-enabled Kubernetes clusters, ensuring consistent performance across on-premises, multi-cloud, and edge environments while maintaining portability and robust security.
### Lifecycle Management of Kubernetes Apps

- **Automated Updates and Patching** – The auto-upgrade feature keeps your Kubernetes applications up to date with the latest features and security patches, seamlessly applied during scheduled maintenance windows to ensure uninterrupted operations. The system guarantees consistency and reliability by continuously reconciling the cluster state with the desired declarative configuration, and it maintains stability by automatically rolling back unauthorized changes.
- **CI/CD Automation with ARM Integration** – Leverage ARM-based APIs and templates to automate deployment and configuration, simplifying application management and boosting operational efficiency. This approach enables seamless integration with Azure policies, monitoring, and governance tools, ensuring streamlined and consistent operations.

### Flexible Billing Options for Kubernetes Apps

We support a variety of billing models to suit your needs:

- **Private Offers for Upfront Billing** – Lock in pricing with upfront payments to gain better control and predictability over your expenditures.
- **Multiple Billing Models** – Choose usage-based billing, where you pay per core, per node, or another usage metric and scale as required, or opt for flat-rate pricing for predictable monthly or annual costs and financial peace of mind.

### Programmatic Deployments of Apps

There are several ways to deploy a Kubernetes app (a CLI sketch follows this list):

- Programmatically deploy using Terraform: utilize the power of Terraform to automate and manage your Kubernetes applications.
- Deploy programmatically with the Azure CLI: leverage the Azure CLI for straightforward, command-line-based deployments.
- Use ARM templates for programmatic deployment: define and deploy your Kubernetes applications efficiently with ARM templates.
- Deploy via AKS in the Azure portal: take advantage of the user-friendly Azure portal for a seamless deployment experience.
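As a hedged illustration of the CLI route: Marketplace Kubernetes Apps are delivered as cluster extensions, so a deployment can look roughly like the sketch below. The extension name, extension type, and plan values are placeholders; take the real ones from your app's Marketplace listing.

```bash
# Hypothetical example: install a Marketplace Kubernetes app as a cluster
# extension on an existing AKS cluster. All values below are placeholders.
az k8s-extension create \
  --resource-group my-rg \
  --cluster-name my-aks-cluster \
  --cluster-type managedClusters \
  --name my-app-instance \
  --extension-type Contoso.MyKubernetesApp \
  --plan-name my-plan \
  --plan-product my-product \
  --plan-publisher contoso
```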
We hope this guide has been helpful and has simplified the process of deploying Kubernetes apps. Stay tuned for more tips and tricks, and happy deploying!

Additional links:

- Get started with Kubernetes Apps: https://aka.ms/deployK8sApp
- Find other Kubernetes Apps listed on Azure Marketplace: https://aka.ms/KubernetesAppsInMarketplace
- For customer support, please visit: https://learn.microsoft.com/en-us/azure/aks/aks-support-help#create-an-azure-support-request
- Partner with us: if you are an ISV or Azure partner interested in listing your Kubernetes app, please visit: http://aka.ms/K8sAppsGettingStarted
- Learn more about partner benefits: https://learn.microsoft.com/en-us/partner-center/marketplace/overview#why-sell-with-microsoft
- For partner support, please visit: https://partner.microsoft.com/support/?stage=1

## Azure Kubernetes Service Baseline - The Hard Way

Are you ready to tackle Kubernetes on Azure like a pro? Embark on "AKS Baseline - The Hard Way" and prepare for a journey that's likely to be a mix of command line, detective work, and revelations. This is a serious endeavour that will equip you with deep insights and substantial knowledge. As you navigate the intricacies of Azure, you'll not only face challenges but also accumulate a wealth of learning that will sharpen your skills and broaden your understanding of cloud infrastructure. Get set for an enriching experience that's all about mastering the ins and outs of Azure Kubernetes Service!
## How managed identities work on Azure resources

Managed identities are a great way to eliminate the need to store credentials in source code: an app running on an Azure resource can retrieve tokens from Azure AD while the platform abstracts the entire process. Learn how it works and what magic happens on the backend!
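To make that abstraction concrete: on resources such as virtual machines, for example, the platform exposes the Instance Metadata Service (IMDS), and code with a managed identity can request a token without holding any secret. A minimal sketch, assuming a system-assigned identity and Azure Resource Manager as the target resource:

```bash
# Request an access token from the local Instance Metadata Service.
# No credentials are stored anywhere in the application.
curl -s -H "Metadata: true" \
  "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/"
```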
## Dealing with "Upgrade your Java/Tomcat/PHP/Python versions on App Service"

You may have received security recommendations for your App Services similar to the one shown below:

> "Upgrade your Java and Tomcat versions on App Service to continue receiving critical security updates. You're receiving this email because you currently use an outdated version of Java or Tomcat on App Service."

These notifications are not limited to Java; you may also receive them for other stacks such as PHP, Python, and .NET. The recommendation does not include a list of the non-compliant apps in your subscription (in this case, apps using an outdated Java or Tomcat version), so to act on it you first need to find out which Java versions your apps use. This article discusses how you can obtain this information using the Azure CLI.
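As a hedged sketch of the general approach, you can enumerate the web apps in a subscription and inspect each one's runtime configuration. For Linux apps the stack and version are encoded in linuxFxVersion (for example "JAVA|11-java11" or "PHP|8.2"), while Windows Java apps expose fields such as javaVersion and javaContainerVersion:

```bash
# List every web app with its resource group, then print the runtime
# settings that reveal the stack version in use.
az webapp list --query "[].{name:name, rg:resourceGroup}" -o tsv |
while read -r name rg; do
  echo "== $name"
  az webapp config show --name "$name" --resource-group "$rg" \
    --query "{linuxFxVersion: linuxFxVersion, javaVersion: javaVersion, javaContainerVersion: javaContainerVersion}" \
    -o json
done
```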