onnx
11 Topics

Getting Started Using Phi-3-mini-4k-instruct-onnx for Text Generation with NLP Techniques
In this tutorial, we cover how to use the Phi-3 mini models for text generation with NLP techniques. Whether you're a beginner or an experienced AI developer, you'll learn how to download and run these powerful models on your own computer. From setting up the Python environment to generating responses with the generate() API, we provide clear instructions and code examples throughout. So, let's get started and see what the Phi-3 mini models can do!
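The generation API the tutorial describes lives in the onnxruntime-genai package. A minimal sketch, assuming a recent onnxruntime-genai release (older releases used a slightly different loop) and a hypothetical local model directory:

```python
import onnxruntime_genai as og

model_dir = "phi3-mini-4k-instruct-onnx"   # assumed local path to the ONNX files
model = og.Model(model_dir)
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

# Phi-3 chat template wrapped around the user question
prompt = "<|user|>\nWhat is ONNX Runtime?<|end|>\n<|assistant|>\n"
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256, temperature=0.7)

generator = og.Generator(model, params)
generator.append_tokens(input_tokens)

# Token-by-token loop, streaming decoded text as it is produced
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
print()
```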
Use WebGPU + ONNX Runtime Web + Transformer.js to build RAG applications by Phi-3-mini

Learn how to harness the power of WebGPU, ONNX Runtime Web, and Transformer.js to build Retrieval-Augmented Generation (RAG) applications with Phi-3-mini. Dive into this technical guide and build intelligent applications that combine retrieval and generation seamlessly.
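The stack in the post runs in the browser in JavaScript, but the retrieve-then-generate pattern itself is compact. A language-agnostic Python sketch of that pattern, with a toy hashed embedding standing in for the real embedding model and purely illustrative documents:

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Toy hashed bag-of-words vector; a real app would call a
    # sentence-embedding model here. Purely illustrative.
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

docs = [
    "ONNX Runtime Web runs ONNX models in the browser.",
    "WebGPU gives browser code access to the GPU.",
    "Phi-3-mini is a small language model from Microsoft.",
]
doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = doc_vecs @ embed(query)   # cosine similarity (unit vectors)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

query = "What is Phi-3-mini?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)   # this prompt would then be fed to the generator (Phi-3-mini)
```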
GPU compute within Windows Subsystem for Linux 2 supports AI and ML workloads

Adding GPU compute support to WSL has been our #1 most requested feature since the first release. Over the last few years, the WSL, Virtualization, DirectX, Windows Driver, and Windows AI teams, along with our silicon partners, have been working hard to deliver this capability.
Running Phi-3-vision via ONNX on Jetson Platform

Unlock the potential of NVIDIA's Jetson platform by running the Phi-3-vision model in ONNX format. Dive into the process of compiling onnxruntime-genai, setting up the environment, and executing high-performance inference on low-power devices like the Jetson Orin Nano. Discover how to use quantized models efficiently, enabling robust image-and-text dialogue tasks while keeping the GPU workload optimized. Whether you're working with FP16 or INT4 models, this guide walks you through each step, ensuring you harness the full capabilities of edge AI on Jetson.
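For orientation, this is the general shape of multimodal inference with onnxruntime-genai, following the pattern of its Phi-3-vision examples; the model path, image file, and search options here are assumptions, and the exact loop depends on the package version:

```python
import onnxruntime_genai as og

model = og.Model("phi3-vision-onnx")               # assumed local model path
processor = model.create_multimodal_processor()
stream = processor.create_stream()

image = og.Images.open("street.jpg")               # hypothetical image file
prompt = "<|user|>\n<|image_1|>\nDescribe this image.<|end|>\n<|assistant|>\n"
inputs = processor(prompt, images=image)

params = og.GeneratorParams(model)
params.set_inputs(inputs)
params.set_search_options(max_length=1024)

# Same streaming loop as the text-only case, now conditioned on the image
generator = og.Generator(model, params)
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
print()
```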
Journey Series for Generative AI Application Architecture - Model references and evaluation models

In the previous article, we integrated the entire SLMOps process through Microsoft Olive. The development team can configure everything from data preparation and fine-tuning to format conversion and deployment through an Olive config. In this article, I want to talk about model references and evaluation.
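As a rough illustration of what such a config can express, here is a minimal sketch using Olive's Python entry point; the pass names and schema vary by Olive version, so treat the entire config as hypothetical and verify it against the Olive documentation:

```python
from olive.workflows import run as olive_run

# Hypothetical minimal Olive workflow: convert a Hugging Face model to
# ONNX. Fine-tuning and quantization passes would be chained under
# "passes" in the same way; keys shown track recent Olive releases.
config = {
    "input_model": {
        "type": "HfModel",
        "model_path": "microsoft/Phi-3-mini-4k-instruct",
    },
    "passes": {
        "convert": {"type": "OnnxConversion", "target_opset": 17},
    },
    "output_dir": "models/phi3-onnx",
}

olive_run(config)
```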
ONNX and NPU Acceleration for Speech on ARM

This project, from students at University College London, explores the benefits of ONNX and NPU accelerators for speeding up inference of Whisper models, and develops a local Whisper model that leverages these techniques on ARM-based systems.
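To give a feel for how an NPU is targeted from ONNX Runtime, a hedged sketch using the QNN execution provider (one of several NPU backends on ARM); the model file, input shape, and backend library are assumptions to adapt for your own export:

```python
import numpy as np
import onnxruntime as ort

print(ort.get_available_providers())   # confirm the NPU EP is in this build

# Prefer the NPU via the QNN execution provider, fall back to CPU.
# "whisper_encoder.onnx" and the backend library path are hypothetical.
session = ort.InferenceSession(
    "whisper_encoder.onnx",
    providers=["QNNExecutionProvider", "CPUExecutionProvider"],
    provider_options=[{"backend_path": "libQnnHtp.so"}, {}],
)

# Whisper encoders take an 80-bin log-mel spectrogram over 3000 frames
mel = np.zeros((1, 80, 3000), dtype=np.float32)
hidden_states = session.run(None, {session.get_inputs()[0].name: mel})[0]
print(hidden_states.shape)
```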