Using Azure CycleCloud with Weka
What is Azure CycleCloud?

Azure CycleCloud is an enterprise-friendly tool for orchestrating and managing HPC environments on Azure. With Azure CycleCloud, users can provision infrastructure for HPC systems, deploy familiar HPC schedulers, and automatically scale the infrastructure to run jobs efficiently at any scale. CycleCloud is used for running workloads like scientific simulations, rendering, genomics and bionomics, financial modeling, artificial intelligence, machine learning, and other data-intensive operations that require large amounts of compute power. CycleCloud also supports GPU computing, which is useful for the workloads described above.

One of the strengths of Azure CycleCloud is its ability to automatically scale resources up or down based on demand. If your workload requires more GPU power (such as for deep learning training), CycleCloud can provision additional GPU-enabled instances as needed. One question remains: if the GPUs provisioned by CycleCloud are waiting on storage I/O operations, not only is application performance severely impacted, the GPUs are also underutilized, meaning you are not fully exploiting the resources you are paying for. This brings us to WEKA. But before we talk about the problems WEKA and CycleCloud solve together, let's talk about what WEKA is.

What is WEKA?

The WEKA® Data Platform was purpose-built to seamlessly and sustainably deliver the speed, simplicity, and scale that modern enterprises and research organizations need, without compromise. Its advanced, software-defined architecture supports next-generation workloads in virtually any location, with cloud simplicity and on-premises performance. At the heart of the WEKA® Data Platform is a modern, fully distributed parallel filesystem, WekaFS™, which can span thousands of NVMe SSDs spread across multiple hosts and seamlessly extend itself over S3-compatible object storage.
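As a rough sketch of what consuming WekaFS looks like from a client: the backend address, filesystem name, and mount point below are hypothetical placeholders, and the actual install and mount commands are shown commented out so the sketch is safe to run anywhere.

```shell
#!/bin/sh
# Hypothetical placeholder values for a WEKA cluster -- substitute your own.
WEKA_BACKEND="10.0.0.4"   # any WEKA backend node
FS_NAME="default"         # WekaFS filesystem name
MOUNT_POINT="/mnt/weka"

# 1. Install the WEKA agent from a backend node (run as root):
#      curl "http://${WEKA_BACKEND}:14000/dist/v1/install" | sh
# 2. Mount the filesystem so applications see a standard POSIX path:
#      mkdir -p "${MOUNT_POINT}"
#      mount -t wekafs "${WEKA_BACKEND}/${FS_NAME}" "${MOUNT_POINT}"

# The mount command these placeholder values produce:
MOUNT_CMD="mount -t wekafs ${WEKA_BACKEND}/${FS_NAME} ${MOUNT_POINT}"
echo "$MOUNT_CMD"
```

Once mounted, the filesystem behaves like any local POSIX path, which is what lets unmodified applications use it.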
You can deploy WEKA software on a cluster of Microsoft Azure LSv3 VMs with local SSD to create a high-performance storage layer. WEKA can also take advantage of Azure Blob Storage to scale your namespace at the lowest cost. You can automate your WEKA deployment through HashiCorp Terraform templates for fast, easy installation. Data stored in your WEKA environment is accessible to applications through multiple protocols, including NFS, SMB, POSIX, and S3, with flash and object storage presented in a single global namespace.

Key components of the WEKA Data Platform in Azure include:

- The architecture is deployed directly in the customer's tenant, within a subscription ID of the customer's choosing.
- WEKA software is deployed across 6 or more Azure LSv3 VMs, which are clustered to act as one single device.
- The WekaFS™ namespace is extended transparently onto Azure hot Blob storage.
- Scale-up and scale-down functions are driven by Logic Apps and Function Apps.
- All client secrets are kept in Azure Key Vault.
- Deployment is fully automated using WEKA Terraform templates.

What is the integration?

Using the WEKA-CycleCloud template available here, any compute nodes deployed via CycleCloud will automatically install the WEKA agent and automatically mount the WEKA filesystem. Users can deploy 10, 100, even 1,000s of compute nodes, and they will all mount the fastest storage in Azure (WEKA). Full integration steps are available here: WEKA/CycleCloud for Slurm Integration.

Benefits

The combined solution of CycleCloud and WEKA brings together the best of both worlds. With the CycleCloud/WEKA template, customers get:

Simplified HPC management. With CycleCloud, you can provision clusters with a few clicks using preconfigured templates, and the clusters will all be mounted directly to WEKA.

A high-performance end-to-end architecture. CycleCloud and WEKA allow users to combine the benefits of CPUs/GPUs with ultra-fast storage.
This is essential to ensure high throughput and low latency for computational workloads. The goal is to ensure that the storage subsystem can keep up with the high-speed demands of the CPU/GPU, especially in scenarios where you're running compute-heavy workloads like deep learning, scientific simulations, or large-scale data processing.

Cost Optimization #1. Both CycleCloud and WEKA allow for autoscaling (up and down). Adjust the number of compute resources (CycleCloud) as well as the number of storage backend nodes (WEKA) based on workload needs.

Cost Optimization #2. WEKA offers intelligent data tiering to help optimize performance and storage costs. The tiering system is designed to automatically move data between different storage classes based on access patterns, which maximizes efficiency while minimizing expenses.

Conclusion

The CycleCloud and WEKA integration delivers a simplified HPC (AI/ML) cloud management platform, exceptional performance for data-intensive workloads, and cost optimization via elastic scaling, flash optimization, and data tiering, all in one user interface. This enables organizations to achieve high throughput, low latency, and optimal CPU/GPU resource utilization for their most demanding applications and use cases. Try it today!

Special thanks to Raj Sharma and the WEKA team for their work on this integration!

PowerShell script to delete all Containers from a Storage Account
After moving the BootDiag settings out of the custom Storage Account, the original Storage Accounts are still consuming space for nothing. Cleaning these up is part of the standard cleanup stream that needs to be considered in your FinOps plan. This script will help you clean these Storage Accounts quickly and avoid paying for nothing.

Connect-AzAccount

# Your subscription and Storage Account details
$MySubscriptionToClean = "MyGuid-MyGuid-MyGuid-MyGuid-MyGuid"
$MyStorageAccountName = "MyStorageAccountForbootdiags"
$MyStorageAccountKey = "MySAKeyWithAllCodeProvidedByYourStorageAccountSetting+MZ3cUvdQ=="
$ContainerStartName = "bootdiag*"

# Set the subscription context
Set-AzContext -Subscription $MySubscriptionToClean
Get-AzContext

# Build a storage context from the account name and key
$ctx = New-AzStorageContext -StorageAccountName $MyStorageAccountName -StorageAccountKey $MyStorageAccountKey

# Fetch matching containers (up to 1000 per call; rerun the script until none remain)
$myContainers = Get-AzStorageContainer -Name $ContainerStartName -Context $ctx -MaxCount 1000

foreach ($mycontainer in $myContainers)
{
    Remove-AzStorageContainer -Name $mycontainer.Name -Force -Context $ctx
}

I used this script to remove millions of BootDiag containers from several Storage Accounts. You can also use it for any other cleanup use case if you need it.

Fab

Deploying ZFS Scratch Storage for NVMe on Azure Kubernetes Service (AKS)
This guide demonstrates how to use ZFS LocalPV to efficiently manage the NVMe storage available on Azure NDv5 H100 VMs. Equipped with eight 3.5TB NVMe disks, these VMs are tailored for high-performance workloads like AI/ML and large-scale data processing. By combining the flexibility of AKS with the advanced storage capabilities of ZFS, you can dynamically provision stateful node-local volumes while aggregating NVMe disks for optimal performance.

Breaking the Speed Limit with WEKA File System on top of Azure Hot Blob
WEKA delivers unbeatable performance for your most demanding applications running in Microsoft Azure, supporting high I/O, low latency, small files, and mixed workloads with zero tuning and automatic storage rebalancing. Examine how WEKA's patented filesystem, WekaFS™, and its parallel processing algorithms accelerate Blob storage performance. The WEKA® Data Platform is purpose-built to deliver speed, simplicity, and scale that meets the needs of modern enterprises and research organizations without compromise. At the heart of the WEKA® Data Platform is a modern, fully distributed parallel filesystem, WekaFS™, which can span thousands of NVMe SSDs spread across multiple hosts and seamlessly extend itself over compatible object storage.

Migrate data to Azure Managed Lustre retaining POSIX attributes
In this blog, you learn how to copy data to your Azure Managed Lustre file system and then to long-term storage in Azure Blob Storage while retaining certain POSIX attributes, including permissions and user and group ownership. This process uses export jobs combined with the archive process.

Scaling Up in the Cloud: The WEKA Data Platform and Azure HPC Windows Grid Integration
Unlocking the potential of High Performance Compute (HPC) grids in the Financial Services Industry demands a storage solution that goes beyond the limitations of traditional systems. Windows-based workloads, often overlooked by existing HPC storage, demand a specialized approach to affordably deliver great performance and massive scale. Enter the WEKA Data Platform, a new contender in the SMB shares arena in Azure. In our quest for high-performance, scalable storage solutions for massive Windows grid environments, we delve deeper into WEKA's capabilities and its promise to revolutionize the landscape.

Store a file in Azure and make it accessible over the Internet
I wish to store a txt file in Azure and make it accessible when someone goes to its corresponding URL. No need for access control or authentication. The file contains a filter list that the ABP extension will access via URL, and this txt file will be stored somewhere in Azure. What Azure solution can I use for this that keeps the price down for us?

Azure Managed Lustre with Automatic Synchronisation to Azure BLOB Storage
This blog post walks through how to set up an Azure Managed Lustre Filesystem (AMLFS) that will automatically synchronise to an Azure BLOB Storage container. The synchronisation is achieved using the Lustre HSM (Hierarchical Storage Management) interface combined with the Robinhood policy engine and a tool that reads the Lustre changelog and synchronises metadata with the archived storage. The lfsazsync repository on GitHub contains a Bicep template to deploy and set up a virtual machine for this purpose.

Introducing New Performance Tiers for Azure Managed Lustre: Enhancing HPC Workloads
Building upon the success of its General Availability (GA) launch last month, we're excited to unveil two new performance tiers for Azure Managed Lustre (AMLFS): 40 MB/s per TiB and 500 MB/s per TiB. This blog post explores the specifics of these new tiers and how they embody a customer-centric approach to innovation.
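Since these tiers are rated per TiB of provisioned capacity, estimating aggregate throughput is a simple multiplication. A quick illustration, assuming throughput scales linearly with capacity (the filesystem size below is a made-up example):

```shell
#!/bin/sh
# Illustrative only: AMLFS tiers are rated in MB/s per TiB of provisioned
# capacity, so aggregate throughput grows with filesystem size.
CAPACITY_TIB=128        # example provisioned capacity
TIER_MBPS_PER_TIB=500   # the new 500 MB/s-per-TiB tier

AGG=$((CAPACITY_TIB * TIER_MBPS_PER_TIB))
echo "${AGG} MB/s aggregate throughput"
```

The same arithmetic with the 40 MB/s-per-TiB tier shows how choosing a tier trades throughput against cost for a given capacity.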