slm
AI Sparks: AI Toolkit for VS Code - from playground to production
Are you building AI-powered applications from scratch or infusing intelligence into existing production code and systems? AI Sparks is your go-to webinar series for mastering the AI Toolkit (AITK) for VS Code, from foundational concepts to cutting-edge techniques. In this bi-weekly, hands-on series, we'll cover:

- SLMs & local models – test and deploy AI models and applications efficiently on your own terms: locally, on edge devices, or in the cloud.
- Embedding models & RAG – supercharge retrieval for smarter applications using your existing data.
- Multi-modal AI – work with images, text, and beyond.
- Agentic frameworks – build autonomous, decision-making AI systems.

What will you learn from this series? Whether you're a developer, startup founder, or AI enthusiast, you'll gain practical insights, live demos, and actionable takeaways to level up your AI integration journey. Join us and spark your AI transformation! You can click here and register for the entire series on the Reactor page.

Episode list

You can also sign up for individual episodes and read about the topics covered using the following links:

Feb 13th 2025 – WATCH ON DEMAND: Introduction to the AI Toolkit and feature walkthrough
In this episode, we'll introduce the AI Toolkit extension for VS Code, a powerful way to explore and integrate the latest AI models from OpenAI, Meta, DeepSeek, Mistral, and more. With this extension, you can browse state-of-the-art models, download some for local use, or experiment with others remotely. Whether you're enhancing an existing application or building something new, the AI Toolkit simplifies the process of selecting and integrating the right model for your needs.

Feb 27th 2025 – A short introduction to SLMs and local models, with use cases
In this episode, we'll explore Small Language Models (SLMs) and how they compare to larger models. SLMs are efficient, require less compute and memory, and can run on edge devices while still excelling at a variety of tasks. We'll dive into the Phi-3.5 and Phi-4 model series and demonstrate how to build a practical application using these models.

Mar 13th 2025 – How to work with embedding models and build a RAG application
In this episode, we'll dive into embedding models: important tools for working with vector databases and large language models. These models convert text into numerical representations, making it easier to process and retrieve information efficiently. After covering the core concepts, we'll apply them in practice by building a Retrieval-Augmented Generation (RAG) app using Small Language Models (SLMs) and a vector database. (A minimal embedding and retrieval sketch follows the resource list at the end of this post.)

Mar 27th 2025 – Multi-modal support and image analysis
In this episode, we'll dig deeper into the multi-modal capabilities of local and remote AI models and use visualization tools for better insights. We'll also dive into multi-modal support in the AI Toolkit, showcasing how to process and analyze images alongside text. By the end, you'll see how these capabilities come together to power richer AI applications.

Apr 10th 2025 – Evaluations: how to choose the best model for your application's needs
In this episode, we'll explore how to evaluate AI models and choose the right one for your needs. We'll cover key performance metrics, compare different models, and demonstrate testing strategies using features like Playground, Bulk Run, and automated evaluations. Whether you're experimenting with the latest models or transitioning to a new version, these evaluation techniques will help you make informed decisions with confidence.
Apr 24th 2025 – Agents and agentic frameworks
In this episode, we'll explore agents and agentic frameworks: systems that enable AI models to make decisions, take actions, and automate complex tasks. We'll break down how these frameworks work, their practical applications, and how to build and integrate them into your projects using the AI Toolkit. By the end, you'll have a clear understanding of how to build and leverage AI agents effectively.

Resources

- AI Toolkit for VS Code - https://aka.ms/AIToolkit
- AI Toolkit for VS Code documentation - https://aka.ms/AIToolkit/doc
- Building Retrieval-Augmented Generation (RAG) apps on VS Code & AI Toolkit
- Understanding and using reasoning models such as DeepSeek R1 on AI Toolkit
- Using Ollama and OpenAI, Google, and Anthropic hosted models with AI Toolkit
- AI Sparks - YouTube playlist
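To make the embedding and RAG episode a little more concrete, here is a minimal sketch of the retrieval half of a RAG app, independent of the AI Toolkit. It assumes the sentence-transformers and numpy packages; the model name and sample documents are placeholders chosen for illustration, and the retrieved snippets would then be passed to an SLM as prompt context.

```python
# Minimal embedding + retrieval sketch (not AI Toolkit-specific).
# Assumes: pip install sentence-transformers numpy
from sentence_transformers import SentenceTransformer
import numpy as np

# Placeholder model choice for illustration; any embedding model works.
embedder = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "The AI Toolkit lets you run small language models locally.",
    "Embedding models turn text into numerical vectors.",
    "RAG retrieves relevant documents before calling the language model.",
]

# Encode the corpus once; normalized vectors let dot product act as cosine similarity.
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vector = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ query_vector
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

# The retrieved snippets would be stuffed into the SLM prompt as context.
print(retrieve("How does retrieval-augmented generation work?"))
```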
AI Genius - AI Skilling series for Developers

We are conducting a six-part AI skilling series called AI Genius, starting January 28th, 2025, to kickstart your AI learning journey from beginner to advanced use cases. The series features experts from Microsoft talking about different aspects of using AI and building AI applications. It is targeted at developers looking to upskill in the latest AI technologies such as SLMs, RAG, and AI agents.
Getting Started with the AI Dev Gallery

The AI Dev Gallery is a new open-source project designed to inspire and support developers in integrating on-device AI functionality into their Windows apps. It offers an intuitive UX for exploring and testing interactive AI samples powered by local models. Key features include:

- Quickly explore and download models from well-known sources on GitHub and Hugging Face.
- Test different models with interactive samples across over 25 scenarios, including text, image, audio, and video use cases.
- See all relevant code and library references for every sample.
- Switch between models that run on CPU and GPU, depending on your device capabilities.
- Quickly get started with your own projects by exporting any sample to a fresh Visual Studio project that references the same model cache, preventing duplicate downloads.

Part of the motivation behind the Gallery was exposing developers to the benefits of on-device AI: improved data security and privacy, increased control and parameterization, and no dependence on an internet connection or third-party cloud provider.

Requirements

Device requirements
- Minimum OS version: Windows 10, version 1809 (10.0; Build 17763)
- Architecture: x64, ARM64
- Memory: at least 16 GB is recommended
- Disk space: at least 20 GB of free space is recommended
- GPU: 8 GB of VRAM is recommended for running samples on the GPU

Visual Studio 2022
You will need Visual Studio 2022 with the Windows Application Development workload.

Running the Gallery

To run the Gallery:
1. Clone the repository: git clone https://github.com/microsoft/AI-Dev-Gallery.git
2. Run the solution: .\AIDevGallery.sln
3. Hit F5 to build and run the Gallery.

Using the Gallery

The AI Dev Gallery can be navigated in two ways: the Samples view and the Models view.

Navigating samples
In this view, samples are broken up into categories (Text, Code, Image, etc.) and then into more specific samples, like the Translate Text sample pictured below. On clicking a sample, you will be prompted to choose a model to download if you haven't run this sample before. Next to each model you can see its size, whether it runs on CPU or GPU, and the associated license. Pick the model that makes the most sense for your machine. You can also download new models and change the model for a sample later from the sample view; just click the model drop-down at the top of the sample. The last thing you can do from the sample pane is view the sample code and export the project to Visual Studio. Both buttons are found in the top-right corner of the sample, and the code view will look like this:

Navigating models
If you would rather navigate by models instead of samples, the Gallery also provides the Models view. It contains a similar navigation menu on the right to move between models by category. Clicking on a model shows a description of the model, the versions available to download, and the samples that use it. Clicking on a sample takes you back over to the Samples view, where you can see the model in action.

Deleting and managing models
If you need to clear up space or see download details for the models you are using, head over to the Settings page to manage your downloads. From here, you can easily see every model you have downloaded and how much space it takes up on your drive. You can clear your entire cache for a fresh start or delete individual models that you are no longer using.
Any deleted model can be redownloaded through either the Models or Samples view.

Next steps for the Gallery

The AI Dev Gallery is still a work in progress. We plan on adding more samples, models, APIs, and features, and we are evaluating adding support for NPUs to take the experience even further. If you have feedback, noticed a bug, or have ideas for features or samples, head over to the issue board and submit an issue. We also have a discussion board for any other topics relevant to the Gallery. The Gallery is an open-source project, and we would love contributions, feedback, and ideation! Happy modeling!
AI Toolkit for VS Code January Update

AI Toolkit is a VS Code extension that aims to empower AI engineers to turn their curiosity into advanced generative AI applications. Featuring both local and cloud-accelerated inner-loop capabilities, the toolkit eases model exploration, prompt engineering, and the creation and evaluation of generative applications. We are pleased to announce the January update to the toolkit, with support for OpenAI's o1 model and enhancements to the Model Playground and Bulk Run features.

What's New?

January's update brings several exciting new features to boost your productivity in AI development. Here's a closer look at what's included:

- Support for OpenAI's new o1 model: We've added access to the GitHub-hosted OpenAI o1 model. It replaces o1-preview and offers even better performance on complex tasks. You can start interacting with the o1 model within VS Code for free by using the latest AI Toolkit update.
- Chat history support in Model Playground: We have heard your feedback that tracking past model interactions is crucial. The Model Playground now supports chat history, saved as individual files stored entirely on your local machine to ensure privacy and security.
- Bulk Run with prompt templating: The Bulk Run feature, introduced in the AI Toolkit December release, now supports prompt templating with variables. You can create prompt templates, insert variables, and run them in bulk, which simplifies testing multiple scenarios and models. (A small illustration of the templating idea follows this post.)

Stay tuned for more updates and enhancements as we continue to innovate and support your journey in AI development. Try out the AI Toolkit for Visual Studio Code, share your thoughts, and file issues and feature suggestions in our GitHub repo. Thank you for being a part of this journey with us!
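To illustrate the prompt-templating idea behind Bulk Run, here is a small, generic Python sketch. It mirrors the concept only (a template plus rows of variable values); it is not the AI Toolkit's own file format or API, and the example prompts are made up.

```python
# Conceptual illustration of prompt templating for bulk runs.
from string import Template

template = Template("Summarize the following $doc_type in $word_limit words:\n$text")

# Each row supplies values for the template's variables.
rows = [
    {"doc_type": "bug report", "word_limit": "30", "text": "App crashes when ..."},
    {"doc_type": "release note", "word_limit": "50", "text": "Version 2.1 adds ..."},
]

for row in rows:
    prompt = template.substitute(row)
    # In a real bulk run, each rendered prompt would be sent to the selected
    # model and the responses collected side by side for comparison.
    print(prompt)
    print("---")
```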
Getting Started - Generative AI with Phi-3-mini: Running Phi-3-mini on an Intel AI PC

In 2024, with the rise of AI, we are entering the era of the AI PC. On May 20, Microsoft also introduced the Copilot+ PC concept, meaning PCs can run SLMs/LLMs more efficiently with the support of an NPU. We can combine models from the Phi-3 family with a new AI PC to build a simple, personalized Copilot application for individuals. This post combines Intel's AI PC with Intel's OpenVINO, the NPU Acceleration Library, and Microsoft's DirectML to create a local Copilot.
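As a rough companion to the post, here is a hedged sketch of running Phi-3-mini through OpenVINO via Hugging Face's optimum-intel package. The model id and device string are assumptions for illustration, and the NPU Acceleration Library and DirectML paths described in the post are not shown here.

```python
# Hedged sketch: Phi-3-mini through OpenVINO via optimum-intel.
# Assumes: pip install "optimum[openvino]" transformers
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed Hugging Face model id
tokenizer = AutoTokenizer.from_pretrained(model_id)

# export=True converts the model to OpenVINO IR on first load.
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
model.to("GPU")  # or "CPU"; NPU availability depends on drivers and OpenVINO version

inputs = tokenizer("Write a one-line summary of what an AI PC is.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```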
AI Toolkit for Visual Studio Code: October 2024 Update Highlights

The AI Toolkit's October 2024 update brings game-changing features to Visual Studio Code for developers, researchers, and enthusiasts. Explore multi-model integration, including GitHub Models, ONNX, and Google Gemini, alongside custom model support. Dive into multi-modal capabilities for richer AI testing and seamless multi-platform compatibility across Windows, macOS, and Linux. Tailored for productivity, the enhanced Model Catalog simplifies choosing the best tools for your projects. Try it now and share feedback to shape the future of AI in VS Code!
Responsible AI Mitigation Layers

Generative AI is increasingly being used in all kinds of systems to augment humans and infuse intelligent behavior into existing and new apps. While this opens up a world of opportunities for new functionality, it also creates a new set of risks due to its probabilistic nature and its natural-language prompt interface. In this blog post, we talk about mitigation strategies to use against attacks on generative AI systems.
Getting started with Microsoft Phi-3-mini - Try running Phi-3-mini on iPhone with ONNX Runtime

In this article, we explore how to deploy generative AI applications to mobile devices, specifically iPhone, using ONNX Runtime. We cover the steps to compile ONNX Runtime for iOS and then create an app in Xcode. We also show how to copy the INT4-quantized ONNX model into the project and use the C++ API to generate text. This is a preliminary exploration of deploying generative AI on mobile devices, but it provides a good starting point for further development.
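For context, the generation loop the post implements with the C++ API looks roughly like the following sketch using the onnxruntime-genai Python bindings. The calls match the early 0.3.x releases and the model folder name is hypothetical; newer releases have changed the generator interface, so treat this as illustrative rather than a drop-in implementation.

```python
# Hedged sketch of the text-generation loop with onnxruntime-genai (desktop
# Python bindings); the iOS app in the post uses the equivalent C++ API.
# Assumes: pip install onnxruntime-genai, plus a local folder containing the
# INT4-quantized Phi-3-mini ONNX model (path below is hypothetical).
import onnxruntime_genai as og

model = og.Model("phi3-mini-int4")  # hypothetical model folder
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

prompt = "<|user|>\nWhat is ONNX Runtime?<|end|>\n<|assistant|>"
params = og.GeneratorParams(model)
params.set_search_options(max_length=200)
params.input_ids = tokenizer.encode(prompt)

generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    # Decode and print tokens as they are produced.
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```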
Accelerate the development of Generative AI applications with GitHub Models

Introducing GitHub Models, a new feature that lets over 100 million developers become AI engineers using top AI models. Access models like Llama 3.1, GPT-4o, GPT-4o mini, Phi-3, and Mistral Large 2 through a built-in playground on GitHub, and test prompts and model settings for free. When ready, seamlessly integrate models into Codespaces and VS Code. For production, Azure AI offers responsible AI tooling, enterprise-grade security, data privacy, and global availability, with models accessible in over 25 Azure regions with provisioned throughput. Building and running your AI application is now easier than ever.
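As a quick taste of calling a GitHub Models-hosted model from code, here is a hedged sketch using the OpenAI Python SDK. The endpoint URL and model name follow the GitHub Models samples at the time of writing and may change; authentication uses a GitHub personal access token rather than an OpenAI key.

```python
# Hedged sketch: calling a GitHub Models-hosted model from Python.
# Assumes: pip install openai, and GITHUB_TOKEN set in the environment.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://models.inference.ai.azure.com",  # GitHub Models endpoint (assumed)
    api_key=os.environ["GITHUB_TOKEN"],
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any model listed in the GitHub Models catalog
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "In one sentence, what are GitHub Models?"},
    ],
)
print(response.choices[0].message.content)
```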