Reliable, scalable, future-ready: AudioCodes extends the value of Teams for customer experiences with Azure Communication Services and Voca CIC
To improve customer experiences with cost-effective AI technology, AudioCodes fully onboarded their contact center from Microsoft Teams to Azure Communication Services.

Azure Communication Services technical documentation table of contents update
Technical documentation is like a map for using a platform—whether you're building services, solving problems, or learning new features, great documentation shows you the way to the solution you need. But what good is a map if it’s hard to read or confusing to follow? That’s why easy-to-navigate documentation is so important. It saves time, reduces frustration, and helps users focus on what they want to achieve. Azure Communication Services is a powerful platform, and powerful platforms require great documentation for both new and experienced developers. Our customers tell us consistently that our docs are a crucial part of their experience of using our platform. Some studies suggest that documentation and samples are the most important elements of a great developer experience. In this update, we’re excited to share how we’ve improved our technical documentation’s navigation to make it quicker and simpler than ever to find the information you need, when you need it.

Why did we change?

For our content to be useful to you, it first needs to be findable. When we launched Azure Communication Services, the small number of articles on our site made it easy to navigate and find relevant content. As we’ve grown, though, the sheer quantity of articles has made relevant content harder for users to find. To refresh your memory, the table of contents on our docs site used to be structured with these base categories:

- Overview
- Quickstart
- Tutorials
- Samples
- Concepts
- Resources
- References

These directory names describe the type of content they contain. This structure is a useful model for products with a clearly defined set of use cases, where a customer’s job-to-be-done is typically more constrained, but it breaks down when used for complex, powerful platforms that support a broad range of use cases in the way that Azure Communication Services does. We tried a number of small-scale changes to address the problems people were having on our site, such as having certain directories default to open on page load. But as the site grew, we became concerned that our navigation model was confusing users and having a negative impact on their experience with our product. We decided to test that hypothesis and consider different structures that might serve our content and our customers better.

Our user research team interviewed 18 customers with varying levels of experience on our platform. The research uncovered several problems with the way our docs navigation was structured: confusing folder titles, related topics sitting far apart in the nav model, general uncertainty about what folder titles meant, difficulty finding some of the most basic information about using our platform, and a host of other issues. Our user research made it clear that we had a problem we needed to fix for our users.

What did we change in this release?

To help address these issues, we made a few key changes to make our table of contents simpler and easier to navigate. The changes we made were strictly to site navigation, not page content, and they include:

We've restructured the root-level navigation to be focused on communication modality and feature type, rather than content type, to better model our customers' jobs-to-be-done.
Topics include:

- All supported communication channels
- Horizontal features that span more than one channel
- Topics of special interest to our customers, like AI
- Basic needs, like troubleshooting and support

This allows customers to more easily find the content they need by focusing on the job they need to do, rather than on the content type. We also made these changes:

- We've simplified the overview and fundamentals sections to make the site less overwhelming on first load.
- We've surfaced features that customers told us were difficult to find, such as the UI Library, Teams interop, and Job Router.
- We've organized the content within each directory to roughly follow a beginner-to-expert path, making the content more linear and making it easier for a user to find the next step in completing their task.
- We've removed unnecessary layers in our nav, making content easier to find.
- We've added a link to pricing information to each primitive to address a common customer complaint that pricing information is difficult to find and understand.
- We've combined quickstarts, samples, and tutorials into one directory per primitive, called "Samples and tutorials," to address customer feedback that our category names were confusing.
- We've added a Resources directory to each primitive to keep important information close by.
- We've added root-level directories for Common scenarios, Troubleshooting, and Help and support.
- We did a full pass across all TOC entries to ensure correct casing, and edited entries for readability, consistency with page content, and length, in line with Microsoft guidelines.

These changes have led us to a structure that we feel is less taxing for the reader, especially on a first visit; maps more closely to the customer’s mental model of the information by focusing on the job-to-be-done rather than content type; leads readers through the content from easiest to hardest; makes it easier to find the information they need when they need it; and reminds them of all the different features we support. The new table of contents is live as of February 6. You can see it on the Azure Communication Services technical documentation site.

What’s next?

In the coming weeks we will continue to make refinements based on customer feedback and our assessment of usage metrics. Our content team will begin updating article content to improve readability and enhance learning. We will be monitoring our changes and seeking your feedback.

How will we monitor the effectiveness of our changes?

To track the effectiveness of our changes and to be sure we haven’t regressed, we’ll be tracking a few key metrics:

- Bounce rates: We’ll be on the lookout for an increase in bounce rates, which would indicate that customers are frequently landing on pages that don’t meet their expectations.
- Page views: We’ll be tracking the number of page views for our most-visited pages across different features. A decrease in page views for these pages would be an indicator that customers are not able to find pages that had previously been popular.
- Customer interviews: We will be reaching out to some of you over the coming weeks to get your impressions of the new structure of our content.
- Customer surveys: We've created a survey that you can use to give us your feedback. We'll also be adding this link to select pages so you can tell us what you think of our changes while you're using them!
So, give our new site navigation a try, and please don’t hesitate to share your feedback either by filling out our survey or by sending an email to acs-docs-feedback@microsoft.com. We look forward to hearing from you!

Ignite 2024: Bidirectional real-time audio streaming with Azure Communication Services
Today at Microsoft Ignite, we are excited to announce the upcoming preview of bidirectional audio streaming for the Azure Communication Services Call Automation SDK, which unlocks new possibilities for developers and businesses. This capability enables seamless, low-latency, real-time communication when integrated with services like Azure OpenAI and its real-time voice APIs, significantly enhancing how businesses can build and deploy conversational AI solutions.

With the advent of new AI technologies, companies are developing solutions to reduce customer wait times and improve the overall customer experience. To achieve this, many businesses are turning to AI-powered agents. These AI-based agents must be capable of having conversations with customers in a human-like manner while maintaining very low latencies to ensure smooth interactions. This is especially critical in the voice channel, where any delay can significantly impact the fluidity and natural feel of the conversation. With bidirectional streaming, businesses can now elevate their voice solutions to low-latency, human-like, interactive conversational AI agents.

Our bidirectional streaming APIs enable developers to stream audio from an ongoing call on Azure Communication Services to their web server in real time. On the server, powerful language models interpret the caller's query and stream the responses back to the caller. All of this is accomplished while maintaining low latency, ensuring the caller feels like they are speaking to a human. One example is to take the audio streams, process them through Azure OpenAI's real-time voice API, and stream the responses back into the call.

With the integration of bidirectional streaming into the Azure Communication Services Call Automation SDK, developers have new tools to innovate:

- Conversational AI solutions: Develop sophisticated customer support virtual agents that can interact with customers in real time, providing immediate responses and solutions.
- Personalized customer experiences: By harnessing real-time data, businesses can offer more personalized and dynamic customer interactions, leading to increased satisfaction and loyalty.
- Reduced wait times for customers: By using bidirectional audio streams in combination with large language models (LLMs), you can build virtual agents that serve as the first point of contact, reducing the need for customers to wait for a human agent to become available.

Integrating with real-time voice-based large language models (LLMs)

With the advancements in voice-based LLMs, developers want to take advantage of services like bidirectional streaming and send audio directly between the caller and the LLM. Today we’ll show you how you can start audio streaming through Azure Communication Services. Developers can start bidirectional streaming at the time of answering the call by providing the WebSocket URL.

// Answer the call with bidirectional media streaming enabled
websocketUri = appBaseUrl.Replace("https", "wss") + "/ws";

var options = new AnswerCallOptions(incomingCallContext, callbackUri)
{
    MediaStreamingOptions = new MediaStreamingOptions(
        transportUri: new Uri(websocketUri),
        contentType: MediaStreamingContent.Audio,
        audioChannelType: MediaStreamingAudioChannel.Mixed,
        startMediaStreaming: true)
    {
        EnableBidirectional = true,
        AudioFormat = AudioFormat.Pcm24KMono
    }
};

At the same time, you should open your connection with the Azure OpenAI real-time voice API.
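The exact shape of that connection depends on the SDK version you use. As one illustration only, the following sketch opens a realtime conversation session with the preview Azure OpenAI .NET client; the deployment name, environment variable names, and session options are assumptions, and the preview surface may require suppressing its experimental-API warning in your project.

// Sketch: open an Azure OpenAI realtime session that the WebSocket handler can forward audio into.
using System;
using System.ClientModel;
using Azure.AI.OpenAI;
using OpenAI.RealtimeConversation;

var aoaiClient = new AzureOpenAIClient(
    new Uri(Environment.GetEnvironmentVariable("AZURE_OPENAI_SERVICE_ENDPOINT")),
    new ApiKeyCredential(Environment.GetEnvironmentVariable("AZURE_OPENAI_SERVICE_KEY")));

// "gpt-4o-realtime-preview" is an assumed deployment name; use the name of your own deployment.
RealtimeConversationClient realtimeClient =
    aoaiClient.GetRealtimeConversationClient("gpt-4o-realtime-preview");

var session = await realtimeClient.StartConversationSessionAsync();

// Configure the session to accept and produce PCM audio so it can exchange
// raw audio with the Azure Communication Services media stream.
await session.ConfigureSessionAsync(new ConversationSessionOptions
{
    Instructions = "You are a helpful voice assistant.",
    InputAudioFormat = ConversationAudioFormat.Pcm16,
    OutputAudioFormat = ConversationAudioFormat.Pcm16
});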
Once the WebSocket connection is set up, Azure Communication Services starts streaming audio to your web server. From there you can relay the audio to Azure OpenAI and vice versa. Once the LLM reasons over the content provided in the audio, it streams audio back to your service, which you then stream back into the Azure Communication Services call. (More information about how to set this up will be made available after Ignite.)

// Receive streaming data from Azure Communication Services over the WebSocket
private async Task StartReceivingFromAcsMediaWebSocket()
{
    if (m_webSocket == null) return;
    try
    {
        while (m_webSocket.State == WebSocketState.Open || m_webSocket.State == WebSocketState.Closed)
        {
            byte[] receiveBuffer = new byte[2048];
            WebSocketReceiveResult receiveResult = await m_webSocket.ReceiveAsync(
                new ArraySegment<byte>(receiveBuffer), m_cts.Token);

            if (receiveResult.MessageType == WebSocketMessageType.Close)
                continue;

            var data = Encoding.UTF8.GetString(receiveBuffer).TrimEnd('\0');
            if (StreamingData.Parse(data) is AudioData audioData)
            {
                using var ms = new MemoryStream(audioData.Data);
                await m_aiServiceHandler.SendAudioToExternalAI(ms);
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception -> {ex}");
    }
}

Streaming audio data back into Azure Communication Services:

// Create and serialize streaming data
private void ConvertToAcsAudioPacketAndForward(byte[] audioData)
{
    var audio = new OutStreamingData(MediaKind.AudioData)
    {
        AudioData = new AudioData(audioData)
    };

    // Serialize the JSON object to a string
    string jsonString = System.Text.Json.JsonSerializer.Serialize<OutStreamingData>(audio);

    // Queue the async send operation for later execution
    try
    {
        m_channel.Writer.TryWrite(async () => await m_mediaStreaming.SendMessageAsync(jsonString));
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception received on ReceiveAudioForOutBound {ex}");
    }
}

// Send encoded data over the WebSocket to Azure Communication Services
public async Task SendMessageAsync(string message)
{
    if (m_webSocket?.State == WebSocketState.Open)
    {
        byte[] jsonBytes = Encoding.UTF8.GetBytes(message);

        // Send the serialized audio chunk over the WebSocket
        await m_webSocket.SendAsync(
            new ArraySegment<byte>(jsonBytes),
            WebSocketMessageType.Text,
            endOfMessage: true,
            CancellationToken.None);
    }
}

To reduce developer overhead when integrating with voice-based LLMs, Azure Communication Services supports a new sample rate of 24 kHz, eliminating the need for developers to resample audio data and helping preserve audio quality in the process.

Next steps

The SDK and documentation will be available in the next few weeks after this announcement, offering tools and information to integrate bidirectional streaming and utilize voice-based LLMs in your applications. Stay tuned and check our blog for updates!

AI-Powered Chat with Azure Communication Services and Azure OpenAI
Many applications offer chat with automated capabilities but lack the depth to fully understand and address user needs. What if a chat app could not only connect people but also improve conversations with AI insights? Imagine detecting customer sentiment, bringing in experts as needed, and supporting global customers with real-time language translation. These aren’t hypothetical AI features, but ways you can enhance your chat apps using Azure Communication Services and Azure OpenAI today.

In this blog post, we guide you through a quickstart available on GitHub for you to clone and try on your own. We highlight key features and functions, making it easy to follow along. Learn how to upgrade basic chat functionality using AI to analyze user sentiment, summarize conversations, and translate messages in real time.

Natural Language Processing for Chat Messages

First, let’s go through the key features of this project:

- Chat management: The Azure Communication Services Chat SDK enables you to manage chat threads and messages, including adding and removing participants in addition to sending messages.
- AI integration: Use Azure OpenAI GPT models to perform sentiment analysis (determine whether user chat messages are positive, negative, or neutral), summarization (get a summary of chat threads to understand the key points of a conversation), and translation (translate messages into different languages).
- RESTful endpoints: Easily integrate these AI capabilities and chat management through RESTful endpoints.
- Event handling (optional): Use Azure Event Grid to handle chat message events and trigger the AI processing.

The starter code for the quickstart is designed to get you up and running quickly. After entering your Azure Communication Services and OpenAI credentials in the config file and running a few commands in your terminal, you can observe the features listed above in action. There are two main components to this example. The first is the ChatClient, which manages the capturing and sending of messages via a basic chat application built on Azure Communication Services. The second component, OpenAIClient, enhances your chat application by transmitting messages to Azure OpenAI along with instructions for the desired types of AI analysis.

AI Analysis with OpenAIClient

Azure OpenAI can perform a multitude of AI analyses, but this quickstart focuses on summarization, sentiment analysis, and translation. To achieve this, we created three distinct prompts, one for each type of AI analysis we want to perform on our chat messages. These system prompts serve as the instructions for how Azure OpenAI should process the user messages. For example, to summarize a message, we hard-coded a system prompt that says, “Act like you are an agent specialized in generating summary of a chat conversation, you will be provided with a JSON list of messages of a conversation, generate a summary for the conversation based on the content message.” Like the best LLM prompts, it’s clear, specific, and provides context for the inputs it will get. The system prompts for translation and sentiment analysis follow a similar pattern. The quickstart provides the basic architecture that enables you to take the chat content and pass it to Azure OpenAI for analysis.

The Core Function: getChatCompletions

The getChatCompletions function is a pivotal part of the AI chat sample project. It processes user messages from a chat application, sends them to the OpenAI service for analysis, and returns the AI-generated responses.
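The quickstart itself implements this flow in TypeScript; purely as an illustration of the same system-prompt-plus-user-prompt pattern, here is a rough C# sketch against the Azure OpenAI chat API. The deployment name, environment variable names, and sample prompts are assumptions, not the sample's code.

// Sketch of the system-prompt + user-prompt flow the quickstart uses (assumed names).
using System;
using System.ClientModel;
using System.Threading.Tasks;
using Azure.AI.OpenAI;
using OpenAI.Chat;

ChatClient chat = new AzureOpenAIClient(
        new Uri(Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")),
        new ApiKeyCredential(Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")))
    .GetChatClient("gpt-4o"); // assumed deployment name

async Task<string> GetChatCompletionsAsync(string systemPrompt, string userPrompt)
{
    // The system prompt carries the analysis instructions (summarize, translate, or score sentiment);
    // the user prompt carries the serialized chat messages to analyze.
    ChatCompletion completion = await chat.CompleteChatAsync(
        new SystemChatMessage(systemPrompt),
        new UserChatMessage(userPrompt));

    return completion.Content[0].Text;
}

string summary = await GetChatCompletionsAsync(
    "Act like you are an agent specialized in generating a summary of a chat conversation...",
    "[{\"sender\":\"Bob\",\"message\":\"Hi Alice, did the order ship?\"}]");
Console.WriteLine(summary);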
Here’s a detailed breakdown of how it works:

- Parameters: The getChatCompletions function takes two required parameters. systemPrompt is a string that provides instructions or context to the AI model, which helps guide OpenAI to generate appropriate and relevant responses. userPrompt is a string that contains the actual message from the user; this is what the AI model analyzes and responds to.
- Deployment name: The function starts by retrieving the deployment name for the OpenAI model from the environment variables.
- Message preparation: The function formats and prepares the messages to send to OpenAI. This includes the system prompt with instructions for the AI model and the user prompts that contain the actual chat messages.
- Sending to OpenAI: The function sends these prepared messages to the OpenAI service using the openAiClient’s getChatCompletions method. This method interacts with the OpenAI model to generate a response based on the provided prompts.
- Processing the response: The function receives the response from OpenAI, extracts the AI-generated content, logs it, and returns it for further use.

Explore and Customize the Quickstart

The goal of the quickstart is to demonstrate how to connect a chat application and Azure OpenAI, then expand on the capabilities. To run this project locally, make sure you meet the prerequisites and follow the instructions in the GitHub repository. The system prompts and user messages are provided as samples for you to experiment with. The sample chat interaction is quite pleasant. Feel free to play around with the system prompts, change the sample messages between fictional Bob and Alice in client.ts to something more hostile, and see how the analysis changes.

Real-time messages

For your chat application, you should analyze messages in real time. This demo is designed to simulate that workflow for ease of setup, with messages sent through your local demo server. However, the GitHub repository for this quickstart project provides instructions for implementing this in your actual application. To analyze real-time messages, you can use Azure Event Grid to capture any messages sent to your Azure Communication Services resource along with the necessary chat data. From there, you trigger the function that calls Azure OpenAI with the appropriate context and system prompts for the desired analysis. More information about setting up this workflow is available with "optional" tags in the quickstart's README on GitHub.
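The README covers the actual wiring; as a rough sketch only (the function name, event parsing, and logging below are assumptions), an Azure Functions Event Grid trigger reacting to new chat messages might look like this:

// Sketch: Event Grid-triggered Azure Function that picks up incoming chat messages for analysis.
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.EventGrid;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class ChatMessageAnalyzer
{
    private readonly ILogger<ChatMessageAnalyzer> _logger;

    public ChatMessageAnalyzer(ILogger<ChatMessageAnalyzer> logger) => _logger = logger;

    [Function("AnalyzeChatMessage")]
    public async Task Run([EventGridTrigger] EventGridEvent eventGridEvent)
    {
        // Azure Communication Services publishes chat events such as
        // "Microsoft.Communication.ChatMessageReceivedInThread".
        if (eventGridEvent.EventType != "Microsoft.Communication.ChatMessageReceivedInThread")
            return;

        using JsonDocument payload = JsonDocument.Parse(eventGridEvent.Data.ToString());
        string messageBody = payload.RootElement.GetProperty("messageBody").GetString() ?? "";

        // Hand the message to your Azure OpenAI analysis code here (for example, the
        // GetChatCompletionsAsync sketch shown earlier) and act on the result.
        _logger.LogInformation("Chat message to analyze: {body}", messageBody);
        await Task.CompletedTask;
    }
}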
Conclusion

Integrating Azure Communication Services with Azure OpenAI enables you to enhance your chat applications with AI analysis and insights. This guide helps you set up a demo that shows sentiment analysis, translation, and summarization, improving user interactions and engagement. To dive deeper into the code, check out the Natural Language Processing of Chat Messages repository, and build your own AI-powered chat application today!

Purple CTO Shares His Azure Communication Services Experience for a Secure Contact Center

In 2022, we knew legacy on-premises systems were not going to sustain us as a fully independent contact center solution for the dozens of customers we serve. At Purple, we needed a cloud-based solution that enabled us to deliver better customer support and engagement—and allowed us to stay competitive in the market. We looked at four options, but it was an easy decision for us: Azure Communication Services bridges the gap between outdated infrastructure and a secure, scalable platform, enabling our entire business to expand its services while ensuring data is securely managed and compliant with regulatory standards.

How we transformed Purple’s technological base with Azure Communication Services

Our previous investments in Microsoft Teams and Teams Direct Routing for PSTN connectivity aligned seamlessly with Azure Communication Services’ interoperable framework. By adopting ACS, we modernized our technology stack and expanded our service capabilities to include reception and delegation services. The efficiency of Azure Communication Services has allowed us to develop a cost-effective, reliable solution with minimal development effort while also addressing data storage and compliance requirements. Sensitive customer data is now stored securely within customers’ Azure tenants, enhancing security and regulatory compliance.

Integrating AI for enhanced contact center capabilities

The migration and integration processes presented logistical and technical challenges, particularly in transferring large volumes of PSTN minutes and seamlessly transitioning services for existing customers without disrupting their operations. But our team at Purple did a great job integrating ACS into client operations, which has bolstered our position in the contact center market. Leveraging ACS features—such as call automation, direct routing, job router, call recording, transcription, and media functionalities—we enhanced our communication capabilities to support chat, email, and SMS services. We also tap into several Microsoft AI technologies to improve our contact center capabilities. Services like speech-to-text (STT), text-to-speech (TTS), transcription, summarization, and sentiment analysis provide actionable insights for businesses and agents. Planned integrations with Copilot Studio will let managers and customers query specific contact center metrics, such as agent availability and peak interaction times.

Flexibility and scalability translate to cost-effectiveness for customers

With the flexibility and scalability of ACS, we've developed a business model centered around cost-effectiveness and reliability. Its pay-as-you-go structure supports unlimited agents and queues, charging customers based on usage, which has reduced our costs by up to 50% and improved stability by 83% compared to older solutions. At Purple, we offer granular billing that differentiates costs for VoIP minutes, call recordings, and transcriptions. Integration with platforms like Salesforce, Jira, and Dynamics 365 further streamlines operations and helps us deliver a seamless, high-quality, cost-effective experience for all of our clients. We are excited about the AI-driven collaboration with Microsoft, which enhances our voice, chat, and CRM integration services, delivering significant value to our customers. This partnership will optimize the end-user experience, seamlessly integrate existing customer data, and provide a more cost-effective solution for businesses to scale and elevate their customer interactions.
- Purple Chief Technology Officer Tjeerd Verhoeff

Build your own real-time voice agent - Announcing preview of bidirectional audio streaming APIs
We are pleased to announce the public preview of bidirectional audio streaming, enhancing the capabilities of voice-based conversational AI. During Satya Nadella’s keynote at Ignite, Seth Juarez demonstrated a voice agent engaging in a live phone conversation with a customer. You can now create similar experiences using the Azure Communication Services bidirectional audio streaming APIs and the GPT-4o model. In our recent Ignite blog post, we announced the upcoming preview of our audio streaming APIs. Now that it is publicly available, this blog describes how to use the bidirectional audio streaming APIs available in the Azure Communication Services Call Automation SDK to build low-latency voice agents powered by the GPT-4o Realtime API.

How does the bidirectional audio streaming API enhance the quality of voice-driven agent experiences?

AI-powered agents facilitate seamless, human-like interactions and can engage with users through various channels such as chat or voice. In the context of voice communication, low latency in conversational responses is crucial, as delays can cause users to perceive a lack of response and disrupt the flow of conversation. Gone are the days when building a voice bot required stitching together multiple models for transcription, inference, and text-to-speech conversion. Developers can now stream live audio from an ongoing call (VoIP or telephony) to their backend server logic using the bidirectional audio streaming APIs, leverage GPT-4o to process audio input, and deliver responses back to the caller with minimal latency.

Building Your Own Real-Time Voice Agent

In this section, we walk you through a quickstart for using Call Automation’s audio streaming APIs to build a voice agent. Before you begin, ensure you have the following:

- Active Azure subscription: Create an account for free.
- Azure Communication Services resource: Create an Azure Communication Services resource and record your resource connection string for later use.
- Azure Communication Services phone number: A calling-enabled phone number. You can buy a new phone number or use a free trial number.
- Azure Dev Tunnels CLI: For details, see Enable dev tunnel.
- Azure OpenAI resource: Set up an Azure OpenAI resource by following the instructions in Create and deploy an Azure OpenAI Service resource.
- Azure OpenAI Service model: To use this sample, you must have the GPT-4o-Realtime-Preview model deployed. Follow the instructions at GPT-4o Realtime API for speech and audio (Preview) to set it up.
- Development environment: Familiarity with .NET and basic asynchronous programming.

Clone the quickstart sample application. You can find the quickstart at Azure Communication Services Call Automation and Azure OpenAI Service.

git clone https://github.com/Azure-Samples/communication-services-dotnet-quickstarts.git

After completing the prerequisites, open the cloned project and follow these setup steps.

Environment Setup

Before running this sample, you need to set up the previously mentioned resources with the following configuration updates:

1. Set up and host your Azure dev tunnel. Azure Dev Tunnels is an Azure service that enables you to expose locally hosted web services to the internet. Use the following commands to connect your local development environment to the public internet. This creates a tunnel with a persistent endpoint URL and enables anonymous access. We use this endpoint to notify your application of calling events from the Azure Communication Services Call Automation service.
devtunnel create --allow-anonymous
devtunnel port create -p 5165
devtunnel host

2. Navigate to the quickstart CallAutomation_AzOpenAI_Voice in the project you cloned.

3. Add the required API keys and endpoints. Open the appsettings.json file and add values for the following settings:

- DevTunnelUri: Your dev tunnel endpoint
- AcsConnectionString: Azure Communication Services resource connection string
- AzureOpenAIServiceKey: Azure OpenAI Service key
- AzureOpenAIServiceEndpoint: Azure OpenAI Service endpoint
- AzureOpenAIDeploymentModelName: Azure OpenAI model deployment name

Run the Application

1. Ensure your Azure Dev Tunnel URI is active and points to the correct port of your localhost application.
2. Run the command dotnet run to build and run the sample application.
3. Register an Event Grid webhook for the IncomingCall event that points to your dev tunnel URI (https://<your-devtunnel-uri>/api/incomingCall). For more information, see Incoming call concepts.

Test the app

Once the application is running:

- Call your Azure Communication Services number: Dial the number set up in your Azure Communication Services resource. A voice agent answers, enabling you to converse naturally.
- View the transcription: See a live transcription in the console window.

QuickStart Walkthrough

Now that the app is running and testable, let’s explore the quickstart code and how to use the new APIs. Within the Program.cs file, the endpoint /api/incomingCall handles inbound calls.

app.MapPost("/api/incomingCall", async (
    [FromBody] EventGridEvent[] eventGridEvents,
    ILogger<Program> logger) =>
{
    foreach (var eventGridEvent in eventGridEvents)
    {
        Console.WriteLine($"Incoming Call event received.");

        // Handle system events
        if (eventGridEvent.TryGetSystemEventData(out object eventData))
        {
            // Handle the subscription validation event.
            if (eventData is SubscriptionValidationEventData subscriptionValidationEventData)
            {
                var responseData = new SubscriptionValidationResponse
                {
                    ValidationResponse = subscriptionValidationEventData.ValidationCode
                };
                return Results.Ok(responseData);
            }
        }

        var jsonObject = Helper.GetJsonObject(eventGridEvent.Data);
        var callerId = Helper.GetCallerId(jsonObject);
        var incomingCallContext = Helper.GetIncomingCallContext(jsonObject);
        var callbackUri = new Uri(new Uri(appBaseUrl), $"/api/callbacks/{Guid.NewGuid()}?callerId={callerId}");
        logger.LogInformation($"Callback Url: {callbackUri}");

        var websocketUri = appBaseUrl.Replace("https", "wss") + "/ws";
        logger.LogInformation($"WebSocket Url: {websocketUri}");

        var mediaStreamingOptions = new MediaStreamingOptions(
            new Uri(websocketUri),
            MediaStreamingContent.Audio,
            MediaStreamingAudioChannel.Mixed,
            startMediaStreaming: true)
        {
            EnableBidirectional = true,
            AudioFormat = AudioFormat.Pcm24KMono
        };

        var options = new AnswerCallOptions(incomingCallContext, callbackUri)
        {
            MediaStreamingOptions = mediaStreamingOptions,
        };

        AnswerCallResult answerCallResult = await client.AnswerCallAsync(options);
        logger.LogInformation($"Answered call for connection id: {answerCallResult.CallConnection.CallConnectionId}");
    }
    return Results.Ok();
});

In the preceding code, MediaStreamingOptions encapsulates all the configurations for bidirectional streaming:

- WebSocketUri: We use the dev tunnel URI with the WebSocket protocol, appending the path /ws. This path handles the WebSocket messages.
- MediaStreamingContent: The current version of the API supports only audio.
- AudioChannel: Supported formats include:
  - Mixed: Contains the combined audio streams of all participants on the call, flattened into one stream.
  - Unmixed: Contains a single audio stream per participant per channel, with support for up to four channels for the most dominant speakers at any given time. You also get a participantRawID to identify the speaker.
- StartMediaStreaming: When set to true, this flag starts the bidirectional stream automatically once the call is established.
- EnableBidirectional: This enables both sending and receiving audio. By default, the application only receives audio data from Azure Communication Services.
- AudioFormat: This can be either 16 kHz pulse code modulation (PCM) mono or 24 kHz PCM mono.

Once you configure all these settings, you need to pass them to AnswerCallOptions.

Now that the call is established, let's dive into handling the WebSocket messages. This code snippet handles the audio data received over the WebSocket. The WebSocket's path is specified as /ws, which corresponds to the WebSocketUri provided in the configuration.

app.Use(async (context, next) =>
{
    if (context.Request.Path == "/ws")
    {
        if (context.WebSockets.IsWebSocketRequest)
        {
            try
            {
                var webSocket = await context.WebSockets.AcceptWebSocketAsync();
                var mediaService = new AcsMediaStreamingHandler(webSocket, builder.Configuration);

                // Set the single WebSocket connection
                await mediaService.ProcessWebSocketAsync();
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Exception received {ex}");
            }
        }
        else
        {
            context.Response.StatusCode = StatusCodes.Status400BadRequest;
        }
    }
    else
    {
        await next(context);
    }
});

The method await mediaService.ProcessWebSocketAsync() processes all incoming messages. It establishes a connection with Azure OpenAI, initiates a conversation session, and waits for a response from Azure OpenAI. This method ensures seamless communication between the application and Azure OpenAI, enabling real-time audio data processing and interaction.

// Method to receive messages from the WebSocket
public async Task ProcessWebSocketAsync()
{
    if (m_webSocket == null)
    {
        return;
    }

    // Start the forwarder to the AI model
    m_aiServiceHandler = new AzureOpenAIService(this, m_configuration);

    try
    {
        m_aiServiceHandler.StartConversation();
        await StartReceivingFromAcsMediaWebSocket();
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception -> {ex}");
    }
    finally
    {
        m_aiServiceHandler.Close();
        this.Close();
    }
}

Once the application receives data from Azure Communication Services, it parses the incoming JSON payload to extract the audio data segment. The application then forwards the segment to Azure OpenAI for further processing. The parsing ensures data integrity before sending it to Azure OpenAI for analysis.
// Receive messages from the WebSocket
private async Task StartReceivingFromAcsMediaWebSocket()
{
    if (m_webSocket == null)
    {
        return;
    }

    try
    {
        while (m_webSocket.State == WebSocketState.Open || m_webSocket.State == WebSocketState.Closed)
        {
            byte[] receiveBuffer = new byte[2048];
            WebSocketReceiveResult receiveResult = await m_webSocket.ReceiveAsync(
                new ArraySegment<byte>(receiveBuffer), m_cts.Token);

            if (receiveResult.MessageType != WebSocketMessageType.Close)
            {
                string data = Encoding.UTF8.GetString(receiveBuffer).TrimEnd('\0');
                await WriteToAzOpenAIServiceInputStream(data);
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception -> {ex}");
    }
}

Here is how the application parses the data and forwards the audio segment to Azure OpenAI using the established session:

private async Task WriteToAzOpenAIServiceInputStream(string data)
{
    var input = StreamingData.Parse(data);
    if (input is AudioData audioData)
    {
        using (var ms = new MemoryStream(audioData.Data))
        {
            await m_aiServiceHandler.SendAudioToExternalAI(ms);
        }
    }
}

Once the application receives the response from Azure OpenAI, it formats the data to be forwarded to Azure Communication Services and relays the response into the call. If the application detects voice activity while Azure OpenAI is talking, it sends a barge-in message to Azure Communication Services to stop the audio currently playing in the call.

// Loop and wait for the AI response
private async Task GetOpenAiStreamResponseAsync()
{
    try
    {
        await m_aiSession.StartResponseAsync();

        await foreach (ConversationUpdate update in m_aiSession.ReceiveUpdatesAsync(m_cts.Token))
        {
            if (update is ConversationSessionStartedUpdate sessionStartedUpdate)
            {
                Console.WriteLine($"<<< Session started. ID: {sessionStartedUpdate.SessionId}");
                Console.WriteLine();
            }

            if (update is ConversationInputSpeechStartedUpdate speechStartedUpdate)
            {
                Console.WriteLine($" -- Voice activity detection started at {speechStartedUpdate.AudioStartTime} ms");

                // Barge-in: tell Azure Communication Services to stop playing audio
                var jsonString = OutStreamingData.GetStopAudioForOutbound();
                await m_mediaStreaming.SendMessageAsync(jsonString);
            }

            if (update is ConversationInputSpeechFinishedUpdate speechFinishedUpdate)
            {
                Console.WriteLine($" -- Voice activity detection ended at {speechFinishedUpdate.AudioEndTime} ms");
            }

            if (update is ConversationItemStreamingStartedUpdate itemStartedUpdate)
            {
                Console.WriteLine($" -- Begin streaming of new item");
            }

            // Audio transcript updates contain the incremental text matching the generated output audio.
            if (update is ConversationItemStreamingAudioTranscriptionFinishedUpdate outputTranscriptDeltaUpdate)
            {
                Console.Write(outputTranscriptDeltaUpdate.Transcript);
            }

            // Audio delta updates contain the incremental binary audio data of the generated output audio
            // matching the output audio format configured for the session.
            if (update is ConversationItemStreamingPartDeltaUpdate deltaUpdate)
            {
                if (deltaUpdate.AudioBytes != null)
                {
                    var jsonString = OutStreamingData.GetAudioDataForOutbound(deltaUpdate.AudioBytes.ToArray());
                    await m_mediaStreaming.SendMessageAsync(jsonString);
                }
            }

            if (update is ConversationItemStreamingTextFinishedUpdate itemFinishedUpdate)
            {
                Console.WriteLine();
                Console.WriteLine($" -- Item streaming finished, response_id={itemFinishedUpdate.ResponseId}");
            }

            if (update is ConversationInputTranscriptionFinishedUpdate transcriptionCompletedUpdate)
            {
                Console.WriteLine();
                Console.WriteLine($" -- User audio transcript: {transcriptionCompletedUpdate.Transcript}");
                Console.WriteLine();
            }

            if (update is ConversationResponseFinishedUpdate turnFinishedUpdate)
            {
                Console.WriteLine($" -- Model turn generation finished. Status: {turnFinishedUpdate.Status}");
            }

            if (update is ConversationErrorUpdate errorUpdate)
            {
                Console.WriteLine();
                Console.WriteLine($"ERROR: {errorUpdate.Message}");
                break;
            }
        }
    }
    catch (OperationCanceledException e)
    {
        Console.WriteLine($"{nameof(OperationCanceledException)} thrown with message: {e.Message}");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception during AI streaming -> {ex}");
    }
}

Once the data is prepared for Azure Communication Services, the application sends it over the WebSocket:

public async Task SendMessageAsync(string message)
{
    if (m_webSocket?.State == WebSocketState.Open)
    {
        byte[] jsonBytes = Encoding.UTF8.GetBytes(message);

        // Send the serialized audio chunk over the WebSocket
        await m_webSocket.SendAsync(
            new ArraySegment<byte>(jsonBytes),
            WebSocketMessageType.Text,
            endOfMessage: true,
            CancellationToken.None);
    }
}

This wraps up our quickstart overview. We hope you create outstanding voice agents with the new audio streaming APIs. Happy coding!

For more information about the Azure Communication Services bidirectional audio streaming APIs, check out:

- GPT-4o Realtime API for speech and audio (Preview)
- Audio streaming overview - audio subscription
- Quickstart - Server-side audio streaming

Pre-send email analysis: Detecting sensitive data and inappropriate content using Azure AI
Azure Communication Services email enables organizations to send high-volume messages to their customers from their applications. This tutorial shows how to use Azure AI to ensure that your messages accurately reflect your business’s brand and reputation before you send them. Azure AI offers services to analyze your email content for sensitive data and to identify inappropriate content. This tutorial describes how to use Azure AI Text Analytics to check for sensitive data and Azure AI Content Safety to identify inappropriate text content. Use these functions to check your content before sending the email using Azure Communication Services.

Prerequisites

You need to complete these quickstarts to set up the Azure AI resources:

- Quickstart: Detect Personally Identifiable Information (PII) in text
- Quickstart: Moderate text and images with content safety in Azure AI Studio

Prerequisite check

1. In a terminal or command window, run the following command to check which versions of the .NET Framework are installed:

reg query "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\NET Framework Setup\NDP"

2. View the subdomains associated with your Email Communication Services resource. Sign in to the Azure portal, locate your Email Communication Services resource, and open the Provision domains tab from the left navigation pane. Make sure that the email subdomain you plan to use for sending email is verified in your Email Communication Services resource. For more information, see Quickstart: How to add custom verified email domains.

3. View the domains connected to your Azure Communication Services resource. Sign in to the Azure portal, locate your Azure Communication Services resource, and open the Email > Domains tab from the left navigation pane. Verified custom subdomains must be connected to your Azure Communication Services resource before you use the resource to send emails. For more information, see Quickstart: How to connect a verified email domain.

Create a new C# application

This section describes how to create a new C# application, install the required packages, and create the Main function.

1. In a console window (such as cmd, PowerShell, or Bash), use the dotnet new command to create a new console app with the name EmailPreCheck. This command creates a simple "Hello World" C# project with a single source file: Program.cs.

dotnet new console -o EmailPreCheck

2. Change your directory to the newly created EmailPreCheck app folder and use the dotnet build command to compile your application.

cd EmailPreCheck
dotnet build

Install required packages

From the application directory, install the Azure Communication Services Email client and Azure AI libraries for .NET using the dotnet add package command:

dotnet add package Azure.Communication.Email
dotnet add package Azure.AI.TextAnalytics
dotnet add package Azure.AI.ContentSafety

Create the Main function

Open Program.cs and replace the existing contents with the following code. The using directives include the Azure.Communication.Email and Azure AI namespaces. The rest of the code outlines the SendMail function for your program.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Azure;
using Azure.Communication.Email;
using Azure.AI.TextAnalytics;
using Azure.AI.ContentSafety;

namespace SendEmail
{
    internal class Program
    {
        static async Task Main(string[] args)
        {
            // Authenticate and create the Azure Communication Services email client

            // Set sample content

            // Pre-check for sensitive data and inappropriate content

            // Send email
        }
    }
}

Add a function that checks for sensitive data

Create a new function to analyze the email subject and body for sensitive data such as social security numbers and credit card numbers.

private static async Task<bool> AnalyzeSensitiveData(List<string> documents)
{
    // Client authentication goes here

    // Function implementation goes here
}

Create the Text Analytics client with authentication

The AnalyzeSensitiveData function needs a Text Analytics client built from your connection information. Add the following code into the AnalyzeSensitiveData function to retrieve the connection key and endpoint for the resource from environment variables named LANGUAGE_KEY and LANGUAGE_ENDPOINT, and to create the TextAnalyticsClient with an AzureKeyCredential. For more information about managing your Text Analytics connection information, see Quickstart: Detect Personally Identifiable Information (PII) > Get your key and endpoint.

// This example requires environment variables named "LANGUAGE_KEY" and "LANGUAGE_ENDPOINT"
string languageKey = Environment.GetEnvironmentVariable("LANGUAGE_KEY");
string languageEndpoint = Environment.GetEnvironmentVariable("LANGUAGE_ENDPOINT");

var client = new TextAnalyticsClient(new Uri(languageEndpoint), new AzureKeyCredential(languageKey));

Check the content for sensitive data

Loop through the content to check for any sensitive data. Start the sensitivity check with a baseline of false; if sensitive data is found, return true. Add the following code into the AnalyzeSensitiveData function after the line that creates the TextAnalyticsClient variable.

bool sensitiveDataDetected = false; // Start with a baseline of no sensitive data

var actions = new TextAnalyticsActions
{
    RecognizePiiEntitiesActions = new List<RecognizePiiEntitiesAction> { new RecognizePiiEntitiesAction() }
};

var operation = await client.StartAnalyzeActionsAsync(documents, actions);
await operation.WaitForCompletionAsync();

await foreach (var documentResults in operation.Value)
{
    foreach (var actionResult in documentResults.RecognizePiiEntitiesResults)
    {
        if (actionResult.HasError)
        {
            Console.WriteLine($"Error: {actionResult.Error.ErrorCode} - {actionResult.Error.Message}");
        }
        else
        {
            foreach (var document in actionResult.DocumentsResults)
            {
                if (document.Entities.Count > 0)
                {
                    sensitiveDataDetected = true; // Sensitive data detected
                }
            }
        }
    }
}

return sensitiveDataDetected;

Add a function that checks for inappropriate content

Create another new function to analyze the email subject and body for inappropriate content such as hate or violence.

static async Task<bool> AnalyzeInappropriateContent(List<string> documents)
{
    // Client authentication goes here

    // Function implementation goes here
}

Create the Content Safety client with authentication

The AnalyzeInappropriateContent function needs a Content Safety client, which it also creates from your connection information.
Add the following code into the AnalyzeInappropriateContent function to retrieve the connection key and endpoint for the resource from environment variables named CONTENT_LANGUAGE_KEY and CONTENT_LANGUAGE_ENDPOINT, and to create a new ContentSafetyClient variable. If you're using the same Azure AI instance as for Text Analytics, these values remain the same. For more information about managing your Content Safety connection information, see Quickstart: Detect Personally Identifiable Information (PII) > Create environment variables.

// This example requires environment variables named "CONTENT_LANGUAGE_KEY" and "CONTENT_LANGUAGE_ENDPOINT"
string contentSafetyLanguageKey = Environment.GetEnvironmentVariable("CONTENT_LANGUAGE_KEY");
string contentSafetyEndpoint = Environment.GetEnvironmentVariable("CONTENT_LANGUAGE_ENDPOINT");

var client = new ContentSafetyClient(new Uri(contentSafetyEndpoint), new AzureKeyCredential(contentSafetyLanguageKey));

Check for inappropriate content

Loop through the content to check for inappropriate content. Start the detection with a baseline of false; if inappropriate content is found, return true. Add the following code into the AnalyzeInappropriateContent function after the line that creates the ContentSafetyClient variable.

bool inappropriateTextDetected = false;

foreach (var document in documents)
{
    var options = new AnalyzeTextOptions(document);
    AnalyzeTextResult response = await client.AnalyzeTextAsync(options);

    // Check the response
    if (response != null)
    {
        // Access the results from the response
        foreach (var category in response.CategoriesAnalysis)
        {
            if (category.Severity > 2) // Severity: 0=safe, 2=low, 4=medium, 6=high
            {
                inappropriateTextDetected = true;
            }
        }
    }
    else
    {
        Console.WriteLine("Failed to analyze content.");
    }
}

return inappropriateTextDetected; // True if inappropriate content was detected

Update the Main function to run prechecks and send email

Now that you added the two functions for checking for sensitive data and inappropriate content, you can call them before sending email from Azure Communication Services.

Create and authenticate the email client

You have a few options available for authenticating an email client. This example fetches your connection string from an environment variable. Open Program.cs in an editor and add the following code to the body of the Main function to initialize an EmailClient with your connection string. This code retrieves the connection string for the resource from an environment variable named COMMUNICATION_SERVICES_CONNECTION_STRING. For more information about managing your resource connection string, see Quickstart: Create and manage Communication Services resources > Store your connection string.

// This code shows how to fetch your connection string from an environment variable.
string connectionString = Environment.GetEnvironmentVariable("COMMUNICATION_SERVICES_CONNECTION_STRING");
EmailClient emailClient = new EmailClient(connectionString);

Add sample content

Add the sample email content into the Main function, following the lines that create the email client. You need to get the sender email address from your email resource in the Azure portal. For more information about Azure Communication Services email domains, see Quickstart: How to add Azure Managed Domains to Email Communication Service. Modify the recipient email address variable. Put both the subject and the message body into a List<string>, which can be used by the two content-checking functions.
// Set sample content
var sender = "donotreply@xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.azurecomm.net"; // get the sender address from your email resource in the Azure portal
var recipient = "emailalias@contoso.com"; // modify the recipient
var subject = "Precheck Azure Communication Service Email with Azure AI";

var htmlContent = "<html><body><h1>Precheck email test</h1><br/><h4>This email message is sent from Azure Communication Service Email. </h4>";
htmlContent += "<p> My SSN is 123-12-1234. My Credit Card Number is: 1234 4321 5678 8765. My address is 1011 Main St, Redmond, WA, 998052 </p>";
htmlContent += "<p>A 51-year-old man was found dead in his car. There were blood stains on the dashboard and windscreen.";
htmlContent += "At autopsy, a deep, oblique, long incised injury was found on the front of the neck. It turns out that he died by suicide.</p>";
htmlContent += "</body></html>";

List<string> documents = new List<string> { subject, htmlContent };

Pre-check content before sending email

Call the two functions to look for violations and use the results to determine whether or not to send the email. Add the following code to the Main function after the sample content.

// Pre-check the content
bool containsSensitiveData = await AnalyzeSensitiveData(documents);
bool containsInappropriateContent = await AnalyzeInappropriateContent(documents);

// Send the email only if no sensitive data or inappropriate content is detected
if (containsSensitiveData == false && containsInappropriateContent == false)
{
    // Send the email message with WaitUntil.Started
    EmailSendOperation emailSendOperation = await emailClient.SendAsync(
        Azure.WaitUntil.Started,
        sender,
        recipient,
        subject,
        htmlContent);

    // Call UpdateStatus on the email send operation to poll for the status manually
    try
    {
        while (true)
        {
            await emailSendOperation.UpdateStatusAsync();
            if (emailSendOperation.HasCompleted)
            {
                break;
            }
            await Task.Delay(100);
        }

        if (emailSendOperation.HasValue)
        {
            Console.WriteLine($"Email queued for delivery. Status = {emailSendOperation.Value.Status}");
        }
    }
    catch (RequestFailedException ex)
    {
        Console.WriteLine($"Email send failed with Code = {ex.ErrorCode} and Message = {ex.Message}");
    }

    // Get the OperationId so that it can be used for tracking the message for troubleshooting
    string operationId = emailSendOperation.Id;
    Console.WriteLine($"Email operation id = {operationId}");
}
else
{
    Console.WriteLine("Sensitive data and/or inappropriate content detected, email not sent\n\n");
}

With this step, we have completed the tutorial. Happy coding! You can learn more about Azure Communication Services email through the following links:

- Overview of Azure Communication Services email
- How to create authentication credentials for sending emails using SMTP

Part 2 - Multichannel notification system using Azure Communication Services and Azure Functions
In this second part of the tutorial, we complete coding the remaining Azure Functions triggers and then deploy the multichannel notification system to Azure Functions, testing the Email, SMS, and WhatsApp triggers with OpenAI GPTs. Let’s get started!
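To give a flavor of what one of those triggers looks like, here is a rough sketch (the function name, route, payload shape, and environment variable names are assumptions, not the tutorial's code) of an HTTP-triggered Azure Function that sends an SMS notification through Azure Communication Services:

// Sketch: HTTP-triggered Azure Function (isolated worker) sending an SMS via Azure Communication Services.
using System;
using System.Net;
using System.Threading.Tasks;
using Azure.Communication.Sms;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;

public class SmsNotificationTrigger
{
    private readonly SmsClient _smsClient = new(
        Environment.GetEnvironmentVariable("COMMUNICATION_SERVICES_CONNECTION_STRING"));

    [Function("SendSmsNotification")]
    public async Task<HttpResponseData> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestData req)
    {
        // Assumed payload: { "To": "+14255550123", "Message": "Your order has shipped." }
        var body = await req.ReadFromJsonAsync<SmsRequest>();

        SmsSendResult result = await _smsClient.SendAsync(
            from: Environment.GetEnvironmentVariable("ACS_PHONE_NUMBER"), // your ACS phone number
            to: body!.To,
            message: body.Message);

        var response = req.CreateResponse(result.Successful ? HttpStatusCode.OK : HttpStatusCode.BadGateway);
        await response.WriteStringAsync($"SMS send succeeded: {result.Successful}, id: {result.MessageId}");
        return response;
    }

    public record SmsRequest(string To, string Message);
}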
Building a WhatsApp AI bot for customer support

In this blog post, we’ll explore how to build a customer support application that integrates with WhatsApp using Azure Communication Services and Azure OpenAI. This app enables users to interact with a self-service bot to resolve common customer queries, such as troubleshooting errors or checking order status.
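The full walkthrough lives in the linked post; as a minimal sketch (the channel registration ID, phone number, reply text, and environment variable name below are assumptions), sending a bot reply over WhatsApp with the Azure Communication Services Advanced Messaging SDK looks roughly like this:

// Sketch: send a WhatsApp text reply through Azure Communication Services Advanced Messaging.
using System;
using System.Collections.Generic;
using Azure;
using Azure.Communication.Messages;

// Connection string for your Communication Services resource (assumed environment variable name).
var client = new NotificationMessagesClient(
    Environment.GetEnvironmentVariable("COMMUNICATION_SERVICES_CONNECTION_STRING"));

// The channel registration ID identifies the WhatsApp Business channel connected to your resource.
var channelRegistrationId = new Guid("00000000-0000-0000-0000-000000000000"); // placeholder
var recipients = new List<string> { "+14255550123" };                          // customer's WhatsApp number

// In the bot, this text would come from Azure OpenAI after reasoning over the customer's question.
var content = new TextNotificationContent(
    channelRegistrationId,
    recipients,
    "Your order #12345 shipped yesterday and should arrive within 3 business days.");

Response<SendMessageResult> result = await client.SendAsync(content);
Console.WriteLine($"WhatsApp message id: {result.Value.Receipts[0].MessageId}");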