Using NVIDIA Triton Inference Server on Azure Container Apps
TOC
- Introduction to Triton
- System Architecture
- Architecture
- Focus of This Tutorial
- Setup Azure Resources
- File and Directory Structure
- ARM Template
- ARM Template From Azure Portal
- Testing Azure Container Apps
- Conclusion
- References

1. Introduction to Triton

Triton Inference Server is an open-source, high-performance inferencing platform developed by NVIDIA to simplify and optimize AI model deployment. Designed for both cloud and edge environments, Triton enables developers to serve models from multiple deep learning frameworks, including TensorFlow, PyTorch, ONNX Runtime, TensorRT, and OpenVINO, using a single standardized interface. Its goal is to streamline AI inferencing while maximizing hardware utilization and scalability.

A key feature of Triton is its support for multiple model execution modes, including dynamic batching, concurrent model execution, and multi-GPU inferencing. These capabilities allow organizations to serve AI models efficiently at scale, reducing latency and optimizing throughput. Triton also offers built-in support for HTTP/REST and gRPC endpoints, making it easy to integrate with various applications and workflows. Additionally, it provides model monitoring, logging, and GPU-accelerated inference optimization, enhancing performance across different hardware architectures.

Triton is widely used in AI-powered applications such as autonomous vehicles, healthcare imaging, natural language processing, and recommendation systems. It integrates seamlessly with NVIDIA AI tools, including TensorRT for high-performance inference and DeepStream for video analytics. By providing a flexible and scalable deployment solution, Triton enables businesses and researchers to bring AI models into production with ease, ensuring efficient and reliable inferencing in real-world applications.

2. System Architecture

Architecture

Development Environment
- OS: Ubuntu 18.04 (Bionic Beaver)
- Docker version: 26.1.3

Azure Resources
- Storage Account: SKU - General Purpose V2
- Container Apps Environment: SKU - Consumption
- Container Apps: N/A

Focus of This Tutorial

This tutorial walks you through the following stages:
- Setting up Azure resources
- Publishing the project to Azure
- Testing the application

Each of these stages has numerous corresponding tools and solutions. The options used in this session are marked with a check (✓) in the table below.

| Aspect | Options |
|---|---|
| Local OS | Windows / Linux (✓) / Mac |
| How to set up Azure resources and deploy | Portal (i.e., REST API) / ARM (✓) / Bicep / Terraform |

3. Setup Azure Resources

File and Directory Structure

Please open a terminal and enter the following commands:

```
git clone https://github.com/theringe/azure-appservice-ai.git
cd azure-appservice-ai
```

After completing the execution, you should see the following directory structure:

| File and Path | Purpose |
|---|---|
| triton/tools/arm-template.json | The ARM template to set up all the Azure resources related to this tutorial, including a Container Apps environment, a container app, and a storage account with the sample dataset. |

ARM Template

We need to create the following resources or services:

| Name | Manual Creation Required | Resource/Service |
|---|---|---|
| Container Apps Environment | Yes | Resource |
| Container Apps | Yes | Resource |
| Storage Account | Yes | Resource |
| Blob | Yes | Service |
| Deployment Script | Yes | Resource |

Let's take a look at the triton/tools/arm-template.json file. Refer to the configuration section for all the resources. Since most of the configuration values don't require changes, I've placed them in the variables section of the ARM template rather than the parameters section. This helps keep the configuration simpler. However, I'd still like to briefly explain some of the more critical settings.

As you can see, I've adopted a camelCase naming convention, which combines the [Resource Type] with [Setting Name and Hierarchy]. This makes it easier to understand where each setting will be used. The configurations in the diagram are sorted by resource name, but the following list is categorized by functionality for better clarity.

| Configuration Name | Value | Purpose |
|---|---|---|
| storageAccountContainerName | data-and-model | [Purpose 1: Blob Container for Model Storage] Use this fixed name for the Blob Container. |
| scriptPropertiesRetentionInterval | P1D | [Purpose 2: Script for Uploading Models to Blob Storage] No adjustments are needed. This script launches a one-time instance immediately after the Blob Container is created; it downloads sample model files and uploads them to the Blob Container. The Deployment Script resource is automatically deleted after one day. |
| caeNamePropertiesPublicNetworkAccess | Enabled | [Purpose 3: For Testing] ACA requires your local machine to perform tests; therefore, external access must be enabled. |
| appPropertiesConfigurationIngressExternal | true | [Purpose 3: For Testing] Same as above. |
| appPropertiesConfigurationIngressAllowInsecure | true | [Purpose 3: For Testing] Same as above. |
| appPropertiesConfigurationIngressTargetPort | 8000 | [Purpose 3: For Testing] The Triton service container uses port 8000. |
| appPropertiesTemplateContainers0Image | nvcr.io/nvidia/tritonserver:22.04-py3 | [Purpose 3: For Testing] The Triton service container uses this online resource. |

ARM Template From Azure Portal

In addition to using the az CLI to invoke ARM templates, if the JSON file is hosted at a public network URL, you can also load its configuration directly into the Azure Portal by following the method described in the article [Deploy to Azure button - Azure Resource Manager]. This is my example: Click Me. After filling in all the required information, click Create, and you can run a test once the creation process is complete.
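If you prefer the command line over the portal, the same template can be deployed with the Azure CLI. The following is a minimal sketch run from the repository root; the resource group name and region are assumptions of mine, not values from this tutorial:

```
# Create a resource group (name and region are placeholders) and deploy the template
az group create --name triton-aca-demo --location eastus
az deployment group create \
  --resource-group triton-aca-demo \
  --template-file triton/tools/arm-template.json
```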
4. Testing Azure Container Apps

In our local environment, use the following command to start a one-time Docker container. We will use NVIDIA's official test image and send a sample image from within it to the Triton service that was just deployed to Container Apps.

```
# Replace XXX.YYY.ZZZ.azurecontainerapps.io with the actual FQDN of your app. There is no need to add https://
docker run --rm nvcr.io/nvidia/tritonserver:22.04-py3-sdk \
  /workspace/install/bin/image_client -u XXX.YYY.ZZZ.azurecontainerapps.io \
  -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
```

After sending the request, you should see the prediction results, indicating that the deployed Triton server service is functioning correctly.
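For a lighter-weight smoke test that needs no Docker image, you can also query Triton's standard v2 HTTP endpoints directly. This sketch assumes the default API paths Triton exposes on its HTTP port (the placeholder FQDN is the same as above):

```
# Readiness check: returns HTTP 200 once the server and its models are ready
curl -s -o /dev/null -w "%{http_code}\n" https://XXX.YYY.ZZZ.azurecontainerapps.io/v2/health/ready

# Metadata for the densenet_onnx model used in the image_client test
curl -s https://XXX.YYY.ZZZ.azurecontainerapps.io/v2/models/densenet_onnx
```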
5. Conclusion

Beyond basic model hosting, Triton Inference Server's greatest strength lies in its ability to serve AI models efficiently at scale. It supports multiple deep learning frameworks, allowing seamless deployment of diverse models within a single infrastructure. With features like dynamic batching, multi-GPU execution, and optimized inference pipelines, Triton ensures high performance while reducing latency. While it may not replace custom-built inference solutions for highly specialized workloads, it excels as a standardized and scalable platform for deploying AI across cloud and edge environments. Its flexibility makes it ideal for applications such as real-time recommendation systems, autonomous systems, and large-scale AI-powered analytics.

6. References

- Quickstart — NVIDIA Triton Inference Server
- Deploying an ONNX Model — NVIDIA Triton Inference Server
- Model Repository — NVIDIA Triton Inference Server
- Triton Tutorials — NVIDIA Triton Inference Server

Enable SFTP on Azure File Share using ARM Template and upload files using WinScp
SFTP is a widely used protocol that many organizations rely on today for transferring files within and across organizations. Running a VM-based SFTP server is costly and high-maintenance. The ACI service is inexpensive and requires very little maintenance, while data is stored in Azure Files, a fully managed SMB service in the cloud. This template demonstrates creating an SFTP server using Azure Container Instances (ACI). The template generates two resources:

- storage account: the storage account used for persisting data, which contains the Azure Files share
- sftp-group: a container group with a mounted Azure file share; the Azure file share provides persistent storage after the container is terminated

ARM Template for creation of SFTP with New Azure File Share and a new Azure Storage account

Resources.json

```
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "metadata": {
    "_generator": {
      "name": "bicep",
      "version": "0.4.63.48766",
      "templateHash": "17013458610905703770"
    }
  },
  "parameters": {
    "storageAccountType": {
      "type": "string",
      "defaultValue": "Standard_LRS",
      "metadata": { "description": "Storage account type" },
      "allowedValues": [ "Standard_LRS", "Standard_ZRS", "Standard_GRS" ]
    },
    "storageAccountPrefix": {
      "type": "string",
      "defaultValue": "sftpstg",
      "metadata": { "description": "Prefix for new storage account" }
    },
    "fileShareName": {
      "type": "string",
      "defaultValue": "sftpfileshare",
      "metadata": { "description": "Name of file share to be created" }
    },
    "sftpUser": {
      "type": "string",
      "defaultValue": "sftp",
      "metadata": { "description": "Username to use for SFTP access" }
    },
    "sftpPassword": {
      "type": "securestring",
      "metadata": { "description": "Password to use for SFTP access" }
    },
    "location": {
      "type": "string",
      "defaultValue": "[resourceGroup().location]",
      "metadata": { "description": "Primary location for resources" }
    },
    "containerGroupDNSLabel": {
      "type": "string",
      "defaultValue": "[uniqueString(resourceGroup().id, deployment().name)]",
      "metadata": { "description": "DNS label for container group" }
    }
  },
  "functions": [],
  "variables": {
    "sftpContainerName": "sftp",
    "sftpContainerGroupName": "sftp-group",
    "sftpContainerImage": "atmoz/sftp:debian",
    "sftpEnvVariable": "[format('{0}:{1}:1001', parameters('sftpUser'), parameters('sftpPassword'))]",
    "storageAccountName": "[take(toLower(format('{0}{1}', parameters('storageAccountPrefix'), uniqueString(resourceGroup().id))), 24)]"
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2019-06-01",
      "name": "[variables('storageAccountName')]",
      "location": "[parameters('location')]",
      "kind": "StorageV2",
      "sku": { "name": "[parameters('storageAccountType')]" }
    },
    {
      "type": "Microsoft.Storage/storageAccounts/fileServices/shares",
      "apiVersion": "2019-06-01",
      "name": "[toLower(format('{0}/default/{1}', variables('storageAccountName'), parameters('fileShareName')))]",
      "dependsOn": [
        "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]"
      ]
    },
    {
      "type": "Microsoft.ContainerInstance/containerGroups",
      "apiVersion": "2019-12-01",
      "name": "[variables('sftpContainerGroupName')]",
      "location": "[parameters('location')]",
      "properties": {
        "containers": [
          {
            "name": "[variables('sftpContainerName')]",
            "properties": {
              "image": "[variables('sftpContainerImage')]",
              "environmentVariables": [
                { "name": "SFTP_USERS", "secureValue": "[variables('sftpEnvVariable')]" }
              ],
              "resources": { "requests": { "cpu": 1, "memoryInGB": 1 } },
              "ports": [ { "port": 22, "protocol": "TCP" } ],
              "volumeMounts": [
                {
                  "mountPath": "[format('/home/{0}/upload', parameters('sftpUser'))]",
                  "name": "sftpvolume",
                  "readOnly": false
                }
              ]
            }
          }
        ],
        "osType": "Linux",
        "ipAddress": {
          "type": "Public",
          "ports": [ { "port": 22, "protocol": "TCP" } ],
          "dnsNameLabel": "[parameters('containerGroupDNSLabel')]"
        },
        "restartPolicy": "OnFailure",
        "volumes": [
          {
            "name": "sftpvolume",
            "azureFile": {
              "readOnly": false,
              "shareName": "[parameters('fileShareName')]",
              "storageAccountName": "[variables('storageAccountName')]",
              "storageAccountKey": "[listKeys(resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName')), '2019-06-01').keys[0].value]"
            }
          }
        ]
      },
      "dependsOn": [
        "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]"
      ]
    }
  ],
  "outputs": {
    "containerDNSLabel": {
      "type": "string",
      "value": "[format('{0}.{1}.azurecontainer.io', reference(resourceId('Microsoft.ContainerInstance/containerGroups', variables('sftpContainerGroupName'))).ipAddress.dnsNameLabel, reference(resourceId('Microsoft.ContainerInstance/containerGroups', variables('sftpContainerGroupName')), '2019-12-01', 'full').location)]"
    }
  }
}
```

Parameters.json

```
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "storageAccountType": { "value": "Standard_LRS" },
    "storageAccountPrefix": { "value": "sftpstg" },
    "fileShareName": { "value": "sftpfileshare" },
    "sftpUser": { "value": "sftp" },
    "sftpPassword": { "value": null },
    "location": { "value": "[resourceGroup().location]" },
    "containerGroupDNSLabel": { "value": "[uniqueString(resourceGroup().id, deployment().name)]" }
  }
}
```
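Although the walkthrough later in this post uses the portal, the same template deploys from the Azure CLI as well. A minimal sketch follows; the resource group name, region, and inline password are placeholders of mine. Note that every parameter except sftpPassword has a default in the template, and that template expressions such as [resourceGroup().location] are not evaluated inside parameter files, so supplying the password directly is the simplest route:

```
# Create a resource group (placeholder name/region) and deploy the template,
# supplying the one parameter that has no default value
az group create --name sftp-demo-rg --location eastus
az deployment group create \
  --resource-group sftp-demo-rg \
  --template-file Resources.json \
  --parameters sftpPassword='<your-strong-password>'
```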
"memoryInGB": 1 } }, "ports": [ { "port": 22, "protocol": "TCP" } ], "volumeMounts": [ { "mountPath": "[format('/home/{0}/upload', parameters('sftpUser'))]", "name": "sftpvolume", "readOnly": false } ] } } ], "osType": "Linux", "ipAddress": { "type": "Public", "ports": [ { "port": 22, "protocol": "TCP" } ], "dnsNameLabel": "[parameters('containerGroupDNSLabel')]" }, "restartPolicy": "OnFailure", "volumes": [ { "name": "sftpvolume", "azureFile": { "readOnly": false, "shareName": "[parameters('fileShareName')]", "storageAccountName": "[variables('storageAccountName')]", "storageAccountKey": "[listKeys(resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName')), '2019-06-01').keys[0].value]" } } ] }, "dependsOn": [ "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]" ] } ], "outputs": { "containerDNSLabel": { "type": "string", "value": "[format('{0}.{1}.azurecontainer.io', reference(resourceId('Microsoft.ContainerInstance/containerGroups', variables('sftpContainerGroupName'))).ipAddress.dnsNameLabel, reference(resourceId('Microsoft.ContainerInstance/containerGroups', variables('sftpContainerGroupName')), '2019-12-01', 'full').location)]" } } } Parameters.json { "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#", "contentVersion": "1.0.0.0", "parameters": { "storageAccountType": { "value": "Standard_LRS" }, "storageAccountPrefix": { "value": "sftpstg" }, "fileShareName": { "value": "sftpfileshare" }, "sftpUser": { "value": "sftp" }, "sftpPassword": { "value": null }, "location": { "value": "[resourceGroup().location]" }, "containerGroupDNSLabel": { "value": "[uniqueString(resourceGroup().id, deployment().name)]" } } } ARM Template to Enable SFTP for an Existing Azure File Share in Azure Storage account Resources.json { "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#", "contentVersion": "1.0.0.0", "metadata": { "_generator": { "name": "bicep", "version": "0.4.63.48766", "templateHash": "16190402726175806996" } }, "parameters": { "existingStorageAccountResourceGroupName": { "type": "string", "metadata": { "description": "Resource group for existing storage account" } }, "existingStorageAccountName": { "type": "string", "metadata": { "description": "Name of existing storage account" } }, "existingFileShareName": { "type": "string", "metadata": { "description": "Name of existing file share to be mounted" } }, "sftpUser": { "type": "string", "defaultValue": "sftp", "metadata": { "description": "Username to use for SFTP access" } }, "sftpPassword": { "type": "securestring", "metadata": { "description": "Password to use for SFTP access" } }, "location": { "type": "string", "defaultValue": "[resourceGroup().location]", "metadata": { "description": "Primary location for resources" } }, "containerGroupDNSLabel": { "type": "string", "defaultValue": "[uniqueString(resourceGroup().id, deployment().name)]", "metadata": { "description": "DNS label for container group" } } }, "functions": [], "variables": { "sftpContainerName": "sftp", "sftpContainerGroupName": "sftp-group", "sftpContainerImage": "atmoz/sftp:debian", "sftpEnvVariable": "[format('{0}:{1}:1001', parameters('sftpUser'), parameters('sftpPassword'))]" }, "resources": [ { "type": "Microsoft.ContainerInstance/containerGroups", "apiVersion": "2019-12-01", "name": "[variables('sftpContainerGroupName')]", "location": "[parameters('location')]", "properties": { "containers": [ { "name": 
"[variables('sftpContainerName')]", "properties": { "image": "[variables('sftpContainerImage')]", "environmentVariables": [ { "name": "SFTP_USERS", "secureValue": "[variables('sftpEnvVariable')]" } ], "resources": { "requests": { "cpu": 1, "memoryInGB": 1 } }, "ports": [ { "port": 22, "protocol": "TCP" } ], "volumeMounts": [ { "mountPath": "[format('/home/{0}/upload', parameters('sftpUser'))]", "name": "sftpvolume", "readOnly": false } ] } } ], "osType": "Linux", "ipAddress": { "type": "Public", "ports": [ { "port": 22, "protocol": "TCP" } ], "dnsNameLabel": "[parameters('containerGroupDNSLabel')]" }, "restartPolicy": "OnFailure", "volumes": [ { "name": "sftpvolume", "azureFile": { "readOnly": false, "shareName": "[parameters('existingFileShareName')]", "storageAccountName": "[parameters('existingStorageAccountName')]", "storageAccountKey": "[listKeys(extensionResourceId(format('/subscriptions/{0}/resourceGroups/{1}', subscription().subscriptionId, parameters('existingStorageAccountResourceGroupName')), 'Microsoft.Storage/storageAccounts', parameters('existingStorageAccountName')), '2019-06-01').keys[0].value]" } } ] } } ], "outputs": { "containerDNSLabel": { "type": "string", "value": "[format('{0}.{1}.azurecontainer.io', reference(resourceId('Microsoft.ContainerInstance/containerGroups', variables('sftpContainerGroupName'))).ipAddress.dnsNameLabel, reference(resourceId('Microsoft.ContainerInstance/containerGroups', variables('sftpContainerGroupName')), '2019-12-01', 'full').location)]" } } } Parameters.json { "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#", "contentVersion": "1.0.0.0", "parameters": { "existingStorageAccountResourceGroupName": { "value": null }, "existingStorageAccountName": { "value": null }, "existingFileShareName": { "value": null }, "sftpUser": { "value": "sftp" }, "sftpPassword": { "value": null }, "location": { "value": "[resourceGroup().location]" }, "containerGroupDNSLabel": { "value": "[uniqueString(resourceGroup().id, deployment().name)]" } } } Deploy the ARM Templates using PowerShell or Azure CLI or Custom Template deployment using Azure Portal. Choose the subscription you want to create the sftp service in Create a new Resource Group It will automatically create a storage account Give a File Share Name Provide a SFTP user name Provide a SFTP password Wait till the deployment is done successfully Click on the container sftp-group Copy the FQDN from the container group Download WinScp from WinSCP :: Official Site :: Download Provide Hostname : FQDN for ACI; Port Number: 22; User Name and Password Click on Login 13. Drag and drop a file from the left side to the Right side. 14. Now, go to the Storage Account and Navigate to File share. The file appears on the file share.10KViews2likes5CommentsDeploy Smarter, Scale Faster – Secure, AI-Ready, Cost-Effective Kubernetes Apps at Your Fingertips!
Deploy Smarter, Scale Faster – Secure, AI-Ready, Cost-Effective Kubernetes Apps at Your Fingertips!

In our previous blog post, we explored the exciting launch of Kubernetes Apps on Azure Marketplace. This follow-up blog takes you a step further by demonstrating how to programmatically deploy Kubernetes Apps using tools like Terraform, Azure CLI, and ARM templates.

As organizations scale their Kubernetes environments, the demand for secure, intelligent, and cost-effective deployments has never been higher. By programmatically deploying Kubernetes Apps through Azure Marketplace, organizations can harness powerful security frameworks, cost-efficient deployment options, and AI solutions to elevate their Azure Kubernetes Service (AKS) and Azure Arc-enabled clusters. This automated approach significantly reduces operational overhead, accelerates time-to-market, and allows teams to dedicate more time to innovation. Whether you're aiming to strengthen security, streamline application lifecycle management, or optimize AI and machine learning workloads, Kubernetes Apps on Azure Marketplace provide a robust, flexible, and scalable solution designed to meet modern business needs. Let's explore how you can leverage these tools to unlock the full potential of your Kubernetes deployments.

Secure Deployment You Can Trust
- Certified and Secure from the Start – Every Kubernetes app on Azure Marketplace undergoes a rigorous certification process and vulnerability scans before becoming available. Solution providers must resolve any detected security issues, ensuring the app is safe from the outset.
- Continuous Threat Monitoring – After publication, apps are regularly scanned for vulnerabilities. This ongoing monitoring helps maintain the integrity of your deployments by identifying and addressing potential threats over time.
- Enhanced Security with RBAC – Eliminates the need for direct cluster access, reducing attack surfaces by managing permissions and deployments through Azure role-based access control (RBAC).

Lowering the Cost of Your Applications

If your organization has a Microsoft Azure Consumption Commitment (MACC) agreement, you can unlock significant cost savings when deploying your applications. Kubernetes Apps available on Azure Marketplace are MACC eligible, and you gain the following benefits:
- Significant Cost Savings and Predictable Expenses – Reduce overall cloud costs with discounts and credits for committed usage, while ensuring stable, predictable expenses to enhance financial planning.
- Flexible and Comprehensive Commitment Usage – Allocate your commitment across various Marketplace solutions to maximize flexibility and value for evolving business needs.
- Simplified Procurement and Budgeting – Benefit from unified billing and streamlined procurement to drive efficiency and performance.

AI-Optimized Apps
- High-Performance Compute and Scalability – Deploy AI-ready apps on Kubernetes clusters with dynamic scaling and GPU acceleration. Optimize performance and resource utilization for intensive AI/ML workloads.
- Accelerated Time-to-Value – Pre-configured solutions reduce setup time, accelerating progress from proof-of-concept to production, while one-click deployments and automated updates keep AI environments up-to-date effortlessly.
- Hybrid and Multi-Cloud Flexibility – Deploy AI workloads seamlessly on AKS or Azure Arc-enabled Kubernetes clusters, ensuring consistent performance across on-premises, multi-cloud, or edge environments, while maintaining portability and robust security.

Lifecycle Management of Kubernetes Apps
- Automated Updates and Patching – The auto-upgrade feature keeps your Kubernetes applications up-to-date with the latest features and security patches, seamlessly applied during scheduled maintenance windows to ensure uninterrupted operations. The system guarantees automated consistency and reliability by continuously reconciling the cluster state with the desired declarative configuration, and maintains stability by automatically rolling back unauthorized changes.
- CI/CD Automation with ARM Integration – Leverage ARM-based APIs and templates to automate deployment and configuration, simplifying application management and boosting operational efficiency. This approach enables seamless integration with Azure policies, monitoring, and governance tools, ensuring streamlined and consistent operations.

Flexible Billing Options for Kubernetes Apps

We support a variety of billing models to suit your needs:
- Private Offers for Upfront Billing – Lock in pricing with upfront payments to gain better control and predictability over your expenditures.
- Multiple Billing Models – Choose from flexible billing options, including usage-based billing, where you pay per core, per node, or other usage metrics, allowing you to scale as required. Opt for flat-rate pricing for predictable monthly or annual costs, ensuring financial stability and peace of mind.

Programmatic Deployments of Apps

There are several ways of deploying a Kubernetes app, one of which is sketched after this list:
- Programmatically deploy using Terraform: Utilize the power of Terraform to automate and manage your Kubernetes applications.
- Deploy programmatically with Azure CLI: Leverage the Azure CLI for straightforward, command-line based deployments.
- Use ARM templates for programmatic deployment: Define and deploy your Kubernetes applications efficiently with ARM templates.
- Deploy via AKS in the Azure portal: Take advantage of the user-friendly Azure portal for a seamless deployment experience.
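As a starting point for the CLI route, Marketplace Kubernetes apps are generally deployed through the k8s-extension command group. The sketch below is illustrative only: the resource group, cluster, and extension names are placeholders, and the extension type and plan values must come from your chosen offer's Marketplace listing rather than from this article:

```
# All values below are placeholders; take the extension type and plan details
# from the usage information of the specific Marketplace offer.
az k8s-extension create \
  --resource-group my-aks-rg \
  --cluster-name my-aks-cluster \
  --cluster-type managedClusters \
  --name my-k8s-app \
  --extension-type <publisher>.<offerName> \
  --plan-name <planId> \
  --plan-product <offerId> \
  --plan-publisher <publisherId>
```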
We hope this guide has been helpful and has simplified the process of deploying Kubernetes. Stay tuned for more tips and tricks, and happy deploying!

Additional Links:
- Get started with Kubernetes Apps: https://aka.ms/deployK8sApp
- Find other Kubernetes Apps listed on Azure Marketplace: https://aka.ms/KubernetesAppsInMarketplace
- For customer support, please visit: https://learn.microsoft.com/en-us/azure/aks/aks-support-help#create-an-azure-support-request
- Partner with us: If you are an ISV or Azure partner interested in listing your Kubernetes App, please visit: http://aka.ms/K8sAppsGettingStarted
- Learn more about Partner Benefits: https://learn.microsoft.com/en-us/partner-center/marketplace/overview#why-sell-with-microsoft
- For Partner Support, please visit: https://partner.microsoft.com/support/?stage=1
Azure Kubernetes Service Baseline - The Hard Way

Are you ready to tackle Kubernetes on Azure like a pro? Embark on the "AKS Baseline - The Hard Way" and prepare for a journey that's likely to be a mix of command line, detective work, and revelations. This is a serious endeavour that will equip you with deep insights and substantial knowledge. As you navigate through the intricacies of Azure, you'll not only face challenges but also accumulate a wealth of learning that will sharpen your skills and broaden your understanding of cloud infrastructure. Get set for an enriching experience that's all about mastering the ins and outs of Azure Kubernetes Service!
Azure Logic App workflow (Standard) Resubmit and Retry

Hello Experts,

A workflow is scheduled to run daily at a specific time and retrieves data from different systems using REST API calls (8-9). The data is then sent to another system through API calls using multiple child flows. We receive more than 1,500 input records, and an API call needs to be made for each record. During the API invocation process, there is a possibility of failure due to server errors (5xx) and client errors (4xx). To handle this, we have implemented a retry mechanism with a fixed interval. However, there is still a chance of flow failure for various other reasons. Although there is a "Resubmit" feature available at the action level, I cannot apply it in this case because we are using multiple child workflows and the response is sent back from one flow to another.

Is it necessary to utilize the "Resubmit" functionality? The retry functionality has been developed to handle any server API errors (5xx) that may occur with connectors (both custom and standard), including client API errors 408 and 429. In this specific scenario, it is reasonable to attempt retrying or resubmitting the API call from the Azure Logic Apps workflow. Nevertheless, there are other situations where implementing the retry and resubmit logic would result in the same error outcome. Is it acceptable to proceed with the retry functionality in this particular scenario?

It would be highly appreciated if you could provide guidance on the appropriate methodology.

Thanks
-Sri
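For reference, the fixed-interval retry described above corresponds to the retryPolicy object on an action's inputs in the Logic Apps workflow definition; a minimal sketch with illustrative count and interval values:

```
"retryPolicy": {
    "type": "fixed",
    "count": 4,
    "interval": "PT20S"
}
```

Note that the built-in retry policy fires only on 408, 429, and 5xx responses by default, so other 4xx client errors still need to be handled within the workflow itself.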
Azure Logic Apps vs Power Automate

Hello Experts,

Please guide me in selecting the more suitable option between Azure Logic Apps and Power Automate for developing an enterprise application that operates on a scheduled basis. This application must interact with multiple on-premises and SaaS systems by making several REST API calls (approximately 8-10) and storing the retrieved data (structured and unstructured).

Thanks
-Sri