# Azure Resource Manager
## Using NVIDIA Triton Inference Server on Azure Container Apps
**TOC**

- Introduction to Triton
- System Architecture
  - Architecture
  - Focus of This Tutorial
- Setup Azure Resources
  - File and Directory Structure
  - ARM Template
  - ARM Template From Azure Portal
- Testing Azure Container Apps
- Conclusion
- References

### 1. Introduction to Triton

Triton Inference Server is an open-source, high-performance inferencing platform developed by NVIDIA to simplify and optimize AI model deployment. Designed for both cloud and edge environments, Triton enables developers to serve models from multiple deep learning frameworks, including TensorFlow, PyTorch, ONNX Runtime, TensorRT, and OpenVINO, using a single standardized interface. Its goal is to streamline AI inferencing while maximizing hardware utilization and scalability.

A key feature of Triton is its support for multiple model execution modes, including dynamic batching, concurrent model execution, and multi-GPU inferencing. These capabilities allow organizations to serve AI models efficiently at scale, reducing latency and optimizing throughput. Triton also offers built-in HTTP/REST and gRPC endpoints, making it easy to integrate with various applications and workflows. Additionally, it provides model monitoring, logging, and GPU-accelerated inference optimization, enhancing performance across different hardware architectures.

Triton is widely used in AI-powered applications such as autonomous vehicles, healthcare imaging, natural language processing, and recommendation systems. It integrates seamlessly with NVIDIA AI tools, including TensorRT for high-performance inference and DeepStream for video analytics. By providing a flexible and scalable deployment solution, Triton enables businesses and researchers to bring AI models into production with ease, ensuring efficient and reliable inferencing in real-world applications.

### 2. System Architecture

#### Architecture

Development environment:

- OS: Ubuntu 18.04 (Bionic Beaver)
- Docker version: 26.1.3

Azure resources:

- Storage Account: SKU - General Purpose V2
- Container Apps Environment: SKU - Consumption
- Container Apps: N/A

#### Focus of This Tutorial

This tutorial walks you through the following stages:

1. Setting up Azure resources
2. Publishing the project to Azure
3. Testing the application

Each of these stages has numerous corresponding tools and solutions. The options relevant to this session are marked in bold in the table below.

| Aspect | Options |
|---|---|
| Local OS | Windows / **Linux** / Mac |
| How to set up Azure resources and deploy | Portal (i.e., REST API) / **ARM** / Bicep / Terraform |

### 3. Setup Azure Resources

#### File and Directory Structure

Open a terminal and enter the following commands:

```bash
git clone https://github.com/theringe/azure-appservice-ai.git
cd azure-appservice-ai
```

After the commands complete, you should see the following directory structure:

| File and Path | Purpose |
|---|---|
| triton/tools/arm-template.json | The ARM template that sets up all the Azure resources related to this tutorial, including a Container Apps environment, a Container App, and a Storage Account with the sample dataset. |

#### ARM Template

We need to create the following resources or services:

| Resource/Service | Manual Creation Required | Kind |
|---|---|---|
| Container Apps Environment | Yes | Resource |
| Container Apps | Yes | Resource |
| Storage Account | Yes | Resource |
| Blob | Yes | Service |
| Deployment Script | Yes | Resource |

Let's take a look at the triton/tools/arm-template.json file. Refer to its configuration section for all the resources. Since most of the configuration values don't require changes, I've placed them in the variables section of the ARM template rather than the parameters section. This helps keep the configuration simpler.
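For illustration only, here is a trimmed sketch of what such a variables section might look like. The names and values mirror the settings discussed in the table below; the actual template in the repository is the authoritative source.

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "variables": {
    "storageAccountContainerName": "data-and-model",
    "scriptPropertiesRetentionInterval": "P1D",
    "caeNamePropertiesPublicNetworkAccess": "Enabled",
    "appPropertiesConfigurationIngressTargetPort": 8000,
    "appPropertiesTemplateContainers0Image": "nvcr.io/nvidia/tritonserver:22.04-py3"
  },
  "resources": []
}
```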
However, I'd still like to briefly explain some of the more critical settings. As you can see, I've adopted a camelCase naming convention that combines the resource type with the setting name and hierarchy, which makes it easier to understand where each setting is used. The configurations in the template are sorted by resource name, but the following table is grouped by purpose for better clarity.

| Configuration Name | Value | Purpose |
|---|---|---|
| storageAccountContainerName | data-and-model | [Purpose 1: Blob container for model storage] Use this fixed name for the blob container. |
| scriptPropertiesRetentionInterval | P1D | [Purpose 2: Script for uploading models to Blob Storage] No adjustments are needed. This script launches a one-time instance immediately after the blob container is created; it downloads the sample model files and uploads them to the blob container. The Deployment Script resource is automatically deleted after one day. |
| caeNamePropertiesPublicNetworkAccess | Enabled | [Purpose 3: For testing] Your local machine needs to reach the service to run tests, so external access must be enabled. |
| appPropertiesConfigurationIngressExternal | true | [Purpose 3: For testing] Same as above. |
| appPropertiesConfigurationIngressAllowInsecure | true | [Purpose 3: For testing] Same as above. |
| appPropertiesConfigurationIngressTargetPort | 8000 | [Purpose 3: For testing] The Triton service container listens on port 8000. |
| appPropertiesTemplateContainers0Image | nvcr.io/nvidia/tritonserver:22.04-py3 | [Purpose 3: For testing] The Triton service container uses this public image. |

#### ARM Template From Azure Portal

In addition to invoking the ARM template with the Azure CLI, you can, if the JSON file is hosted at a public URL, load its configuration directly into the Azure Portal by following the method described in the article [Deploy to Azure button - Azure Resource Manager]. This is my example: Click Me. After filling in all the required information, click Create. Once the creation process is complete, we can run a test.

### 4. Testing Azure Container Apps

In our local environment, use the following command to start a one-time Docker container. It uses NVIDIA's official client image to send a sample image to the Triton service that was just deployed to Container Apps.

```bash
# Replace XXX.YYY.ZZZ.azurecontainerapps.io with the actual FQDN of your app. There is no need to add https://
docker run --rm nvcr.io/nvidia/tritonserver:22.04-py3-sdk \
  /workspace/install/bin/image_client \
  -u XXX.YYY.ZZZ.azurecontainerapps.io \
  -m densenet_onnx -c 3 -s INCEPTION \
  /workspace/images/mug.jpg
```

After sending the request, you should see the prediction results, indicating that the deployed Triton service is functioning correctly.
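In addition to the full image-client test, Triton exposes standard HTTP health endpoints on the same port, which are handy for quick troubleshooting. A minimal sketch, assuming the ingress settings from the template above (replace the host with your app's actual FQDN):

```bash
# Returns HTTP 200 once the server and its models are ready for inference.
curl -v https://XXX.YYY.ZZZ.azurecontainerapps.io/v2/health/ready
```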
### 5. Conclusion

Beyond basic model hosting, Triton Inference Server's greatest strength lies in its ability to efficiently serve AI models at scale. It supports multiple deep learning frameworks, allowing seamless deployment of diverse models within a single infrastructure. With features like dynamic batching, multi-GPU execution, and optimized inference pipelines, Triton ensures high performance while reducing latency. While it may not replace custom-built inference solutions for highly specialized workloads, it excels as a standardized and scalable platform for deploying AI across cloud and edge environments. Its flexibility makes it ideal for applications such as real-time recommendation systems, autonomous systems, and large-scale AI-powered analytics.

### 6. References

- Quickstart — NVIDIA Triton Inference Server
- Deploying an ONNX Model — NVIDIA Triton Inference Server
- Model Repository — NVIDIA Triton Inference Server
- Triton Tutorials — NVIDIA Triton Inference Server
## Deploy Smarter, Scale Faster – Secure, AI-Ready, Cost-Effective Kubernetes Apps at Your Fingertips!

In our previous blog post, we explored the exciting launch of Kubernetes Apps on Azure Marketplace. This follow-up takes you a step further by demonstrating how to deploy Kubernetes Apps programmatically using tools like Terraform, the Azure CLI, and ARM templates.

As organizations scale their Kubernetes environments, the demand for secure, intelligent, and cost-effective deployments has never been higher. By deploying Kubernetes Apps from Azure Marketplace programmatically, organizations can harness powerful security frameworks, cost-efficient deployment options, and AI solutions to elevate their Azure Kubernetes Service (AKS) and Azure Arc-enabled clusters. This automated approach significantly reduces operational overhead, accelerates time-to-market, and allows teams to dedicate more time to innovation. Whether you're aiming to strengthen security, streamline application lifecycle management, or optimize AI and machine learning workloads, Kubernetes Apps on Azure Marketplace provide a robust, flexible, and scalable solution designed to meet modern business needs. Let's explore how you can leverage these tools to unlock the full potential of your Kubernetes deployments.

### Secure Deployment You Can Trust

- **Certified and Secure from the Start** – Every Kubernetes app on Azure Marketplace undergoes a rigorous certification process and vulnerability scans before becoming available. Solution providers must resolve any detected security issues, ensuring the app is safe from the outset.
- **Continuous Threat Monitoring** – After publication, apps are regularly scanned for vulnerabilities. This ongoing monitoring helps maintain the integrity of your deployments by identifying and addressing potential threats over time.
- **Enhanced Security with RBAC** – Managing permissions and deployments through Azure role-based access control (RBAC) eliminates the need for direct cluster access, reducing the attack surface.

### Lowering the Cost of Your Applications

If your organization has a Microsoft Azure Consumption Commitment (MACC) agreement, you can unlock significant cost savings when deploying your applications. Kubernetes Apps available on Azure Marketplace are MACC-eligible, giving you the following benefits:

- **Significant Cost Savings and Predictable Expenses** – Reduce overall cloud costs with discounts and credits for committed usage, while ensuring stable, predictable expenses that improve financial planning.
- **Flexible and Comprehensive Commitment Usage** – Allocate your commitment across various Marketplace solutions, maximizing flexibility and value for evolving business needs.
- **Simplified Procurement and Budgeting** – Benefit from unified billing and streamlined procurement, driving efficiency and performance.

### AI-Optimized Apps

- **High-Performance Compute and Scalability** – Deploy AI-ready apps on Kubernetes clusters with dynamic scaling and GPU acceleration, optimizing performance and resource utilization for intensive AI/ML workloads.
- **Accelerated Time-to-Value** – Pre-configured solutions reduce setup time, accelerating progress from proof of concept to production, while one-click deployments and automated updates keep AI environments up to date effortlessly.
- **Hybrid and Multi-Cloud Flexibility** – Deploy AI workloads seamlessly on AKS or Azure Arc-enabled Kubernetes clusters, ensuring consistent performance across on-premises, multi-cloud, and edge environments while maintaining portability and robust security.
### Lifecycle Management of Kubernetes Apps

- **Automated Updates and Patching** – The auto-upgrade feature keeps your Kubernetes applications up to date with the latest features and security patches, seamlessly applied during scheduled maintenance windows to ensure uninterrupted operations. The system guarantees consistency and reliability by continuously reconciling the cluster state with the desired declarative configuration, and it maintains stability by automatically rolling back unauthorized changes.
- **CI/CD Automation with ARM Integration** – Leverage ARM-based APIs and templates to automate deployment and configuration, simplifying application management and boosting operational efficiency. This approach enables seamless integration with Azure policies, monitoring, and governance tools, ensuring streamlined and consistent operations.

### Flexible Billing Options for Kubernetes Apps

We support a variety of billing models to suit your needs:

- **Private Offers for Upfront Billing** – Lock in pricing with upfront payments to gain better control and predictability over your expenditures.
- **Multiple Billing Models** – Choose usage-based billing, where you pay per core, per node, or another usage metric and scale as required, or opt for flat-rate pricing for predictable monthly or annual costs and financial peace of mind.

### Programmatic Deployments of Apps

There are several ways to deploy a Kubernetes app (a CLI sketch follows this list):

- Programmatically deploy using Terraform: utilize the power of Terraform to automate and manage your Kubernetes applications.
- Deploy programmatically with the Azure CLI: leverage the Azure CLI for straightforward, command-line-based deployments.
- Use ARM templates for programmatic deployment: define and deploy your Kubernetes applications efficiently with ARM templates.
- Deploy via AKS in the Azure portal: take advantage of the user-friendly Azure portal for a seamless deployment experience.
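As a hedged illustration of the CLI route: Marketplace Kubernetes Apps are delivered as cluster extensions, so a deployment can look roughly like the sketch below. The extension name, extension type, and plan values are placeholders; take the real ones from your app's Marketplace listing.

```bash
# Hypothetical example: install a Marketplace Kubernetes app as a cluster
# extension on an existing AKS cluster. All values below are placeholders.
az k8s-extension create \
  --resource-group my-rg \
  --cluster-name my-aks-cluster \
  --cluster-type managedClusters \
  --name my-app-instance \
  --extension-type Contoso.MyKubernetesApp \
  --plan-name my-plan \
  --plan-product my-product \
  --plan-publisher contoso
```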
We hope this guide has been helpful and has simplified the process of deploying Kubernetes apps. Stay tuned for more tips and tricks, and happy deploying!

Additional links:

- Get started with Kubernetes Apps: https://aka.ms/deployK8sApp
- Find other Kubernetes Apps listed on Azure Marketplace: https://aka.ms/KubernetesAppsInMarketplace
- For customer support, please visit: https://learn.microsoft.com/en-us/azure/aks/aks-support-help#create-an-azure-support-request
- Partner with us: if you are an ISV or Azure partner interested in listing your Kubernetes app, please visit: http://aka.ms/K8sAppsGettingStarted
- Learn more about partner benefits: https://learn.microsoft.com/en-us/partner-center/marketplace/overview#why-sell-with-microsoft
- For partner support, please visit: https://partner.microsoft.com/support/?stage=1

## Azure Kubernetes Service Baseline - The Hard Way

Are you ready to tackle Kubernetes on Azure like a pro? Embark on "AKS Baseline - The Hard Way" and prepare for a journey that's likely to be a mix of command line, detective work, and revelations. This is a serious endeavour that will equip you with deep insights and substantial knowledge. As you navigate the intricacies of Azure, you'll not only face challenges but also accumulate a wealth of learning that will sharpen your skills and broaden your understanding of cloud infrastructure. Get set for an enriching experience that's all about mastering the ins and outs of Azure Kubernetes Service!
## How managed identities work on Azure resources

Managed identities are a great way to eliminate the need to store credentials in source code: an app running on an Azure resource can retrieve tokens from Azure AD while the platform abstracts the entire process. Learn how it works and what magic happens on the backend!
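To make that abstraction concrete: on resources such as virtual machines, for example, the platform exposes the Instance Metadata Service (IMDS), and code with a managed identity can request a token without holding any secret. A minimal sketch, assuming a system-assigned identity and Azure Resource Manager as the target resource:

```bash
# Request an access token from the local Instance Metadata Service.
# No credentials are stored anywhere in the application.
curl -s -H "Metadata: true" \
  "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/"
```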
## Dealing with "Upgrade your Java/Tomcat/PHP/Python versions on App Service"

You may have received security recommendations for your App Services similar to the one shown below:

> "Upgrade your Java and Tomcat versions on App Service to continue receiving critical security updates. You're receiving this email because you currently use an outdated version of Java or Tomcat on App Service."

These notifications are not limited to Java; you may also receive them for other stacks such as PHP, Python, and .NET. The recommendation does not include a list of the non-compliant apps in your subscription (in this case, apps using an outdated Java or Tomcat version), so to act on it you first need to find out which Java versions your apps use. This article discusses how you can obtain this information using the Azure CLI.
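As a hedged sketch of the general approach, you can enumerate the web apps in a subscription and inspect each one's runtime configuration. For Linux apps the stack and version are encoded in linuxFxVersion (for example "JAVA|11-java11" or "PHP|8.2"), while Windows Java apps expose fields such as javaVersion and javaContainerVersion:

```bash
# List every web app with its resource group, then print the runtime
# settings that reveal the stack version in use.
az webapp list --query "[].{name:name, rg:resourceGroup}" -o tsv |
while read -r name rg; do
  echo "== $name"
  az webapp config show --name "$name" --resource-group "$rg" \
    --query "{linuxFxVersion: linuxFxVersion, javaVersion: javaVersion, javaContainerVersion: javaContainerVersion}" \
    -o json
done
```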