In the previous article, we introduced AI agents. If you haven't read it yet, please see my earlier post, Understanding AI Agents. There are many frameworks for implementing AI agents, and AutoGen from Microsoft is a relatively mature one. AutoGen currently targets two programming languages, .NET and Python, with the Python version being the more mature. This article is based on the Python version https://microsoft.github.io/autogen. If you want to learn about the .NET version, you can visit https://microsoft.github.io/autogen-for-net
AutoGen Features
From the perspective of AI agents, AutoGen is compatible with different LLMs, provides tool chains for different tasks, and supports human-computer interaction. It is an open-source framework for orchestrating the interaction between agents. Its biggest strength is automated task orchestration: it optimizes workflows and offers powerful multi-agent conversation capabilities that adapt to a workflow or goal. The APIs provided in the framework cover what an AI agent needs, including cache construction, error handling, configuration of different LLMs, context association, and dialogue-flow settings. Compared with Semantic Kernel and LangChain, which are frameworks geared toward Copilot-style applications, AutoGen has the advantage in automated task-orchestration scenarios. After receiving a target task, AutoGen can orchestrate it end to end, whereas Semantic Kernel and LangChain act more like an ammunition depot for the task-solving process, providing the tools and methods that support completing the task.
Getting started with AutoGen is very simple: only a little code is needed to configure an agent. By building a simple anthropomorphic user proxy agent and an assistant agent, you can complete a basic setup. Here is how to quickly build a single agent.
1. Configuration file. For Azure OpenAI Service, the configuration is generally placed in an AOAI_CONFIG_LIST file in the root directory, such as:
[
{
"model": "Your Azure OpenAI Service Deployment Model Name",
"api_key": "Your Azure OpenAI Service API Key",
"base_url": "Your Azure OpenAI Service Endpoin",
"api_type": "azure",
"api_version": "Your Azure OpenAI Service version, eg 2023-12-01-preview"
},
{
"model": "Your Azure OpenAI Service Deployment Model Name",
"api_key": "Your Azure OpenAI Service API Key",
"base_url": "Your Azure OpenAI Service Endpoin",
"api_type": "azure",
"api_version": "Your Azure OpenAI Service version, eg 2023-12-01-preview"
},
{
"model": "Your Azure OpenAI Service Deployment Model Name",
"api_key": "Your Azure OpenAI Service API Key",
"base_url": "Your Azure OpenAI Service Endpoin",
"api_type": "azure",
"api_version": "Your Azure OpenAI Service version, eg 2023-12-01-preview"
}
]
If you are using OpenAI directly, place an OAI_CONFIG_LIST file in the root directory with content such as:
[
{
"model": "Your OpenAI Model Name",
"api_key": "Your OpenAI API Key"
},
{
"model": "Your OpenAI Model Name",
"api_key": "Your OpenAI API Key"
},
{
"model": "Your OpenAI Model Name",
"api_key": "Your OpenAI API Key"
}
]
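Because the config list is plain JSON, it can be sanity-checked with the standard library before handing it to AutoGen. This is an illustrative sketch (the helper name and required-key sets are my own, mirroring the examples above):

```python
import json

# Keys every entry needs; Azure entries need a few more.
REQUIRED_KEYS = {"model", "api_key"}
AZURE_EXTRA_KEYS = {"base_url", "api_version"}

def validate_config_list(text):
    """Parse a config-list JSON string and check each entry has the expected keys."""
    entries = json.loads(text)
    assert isinstance(entries, list) and entries, "config must be a non-empty JSON list"
    for entry in entries:
        missing = REQUIRED_KEYS - entry.keys()
        assert not missing, f"entry missing keys: {missing}"
        if entry.get("api_type") == "azure":
            missing = AZURE_EXTRA_KEYS - entry.keys()
            assert not missing, f"azure entry missing keys: {missing}"
    return entries

sample = '[{"model": "gpt-4", "api_key": "sk-..."}]'
print(len(validate_config_list(sample)))  # → 1
```

A check like this catches the most common startup failure (a malformed or incomplete config file) before any agent is created.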
After completing the configuration, you can load it in Python:
import autogen

config_list = autogen.config_list_from_json(
    env_or_file="AOAI_CONFIG_LIST",
    file_location=".",
    filter_dict={
        "model": ["Your model name(s)"]
    },
)
2. Create the user proxy agent and the assistant agent
# Create an AssistantAgent instance named "assistant"
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config={
        "cache_seed": 42,
        "config_list": config_list,
    }
)

# Create a UserProxyAgent instance named "user_proxy"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="ALWAYS"
)
Notes
A. The assistant agent is bound to the configuration file and a cache seed. This configuration gives the agent a powerful "brain" and cached memory.
B. The user proxy agent simulates human behavior, and you can set whether human intervention occurs. AI agents are characterized not only by human-like thinking but also by human-like interactive behavior, so when solving problems with an AI agent you need to decide whether human intervention is required. You can choose NEVER, but sometimes you must choose ALWAYS: for example, when calling certain APIs you need to supply keys or network address files interactively. Set it according to your own scenario.
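For fully unattended runs with human_input_mode="NEVER", a common pattern is to cap the number of automatic replies and stop on a termination marker. The sketch below shows the arguments one would pass to autogen.UserProxyAgent (the values are illustrative assumptions, not from the original post); the termination predicate itself is plain Python:

```python
# Predicate the user proxy can use to stop the conversation.
# AutoGen passes the last message dict to is_termination_msg.
def is_termination_msg(message):
    content = message.get("content") or ""
    return content.rstrip().endswith("TERMINATE")

# Keyword arguments for an unattended autogen.UserProxyAgent run
# (hypothetical values; adjust work_dir to your project):
user_proxy_kwargs = {
    "name": "user_proxy",
    "human_input_mode": "NEVER",            # no human in the loop
    "max_consecutive_auto_reply": 10,       # safety cap on auto replies
    "is_termination_msg": is_termination_msg,
    "code_execution_config": {"work_dir": "coding", "use_docker": False},
}
```

You would then unpack these as `autogen.UserProxyAgent(**user_proxy_kwargs)`; the assistant's default system prompt ends its final answer with "TERMINATE", which is what the predicate detects.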
3. The last step is to connect the user proxy agent and the assistant agent and give them a task:
messages = "tell me today's top 10 news in the world "
user_proxy.initiate_chat(assistant, message=messages)
We can clearly see how the agents complete the full interaction and generate code to obtain today's latest news. If you want to learn from this example, please visit
AutoGen scenarios
There are many scenarios implemented with AutoGen. You can learn from the cases at https://microsoft.github.io/autogen/docs/Examples. Here I will show how to use AutoGen in two scenarios and how AutoGen works through a detailed application scenario.
Scenarios
Case 1: Combining multi-modal capabilities to complete object detection
Requirement: during our production process, we need to detect safety helmets. If an employee is found not wearing a safety helmet, mark it.
With traditional AI applications, we would need to collect data of people wearing helmets, label it, train a model with deep learning, and then run inference and labeling. Now that we have multimodal models, much of this work can be simplified. In this scenario, we can combine a multimodal agent, a code-writing agent, and a code-running agent to complete the related work.
AutoGen supports group chat, and multiple agents can be combined to complete tasks in a session. The code is as follows:
groupchat = autogen.GroupChat(agents=[user_proxy, checker, coder, commander], messages=[], max_round=10)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
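Conceptually, the manager repeatedly picks the next speaker until max_round is reached. The toy simulation below illustrates the turn-taking in the simplest round-robin case; it is not AutoGen's actual speaker-selection code, which can also ask the LLM to choose the next speaker:

```python
def simulate_round_robin(agents, max_round):
    """Toy model of GroupChat turn-taking: cycle through the agents
    until the round cap is hit, returning the speaking order."""
    order = []
    for turn in range(max_round):
        order.append(agents[turn % len(agents)])
    return order

turns = simulate_round_robin(["user_proxy", "checker", "coder", "commander"], max_round=10)
print(turns[:5])  # → ['user_proxy', 'checker', 'coder', 'commander', 'user_proxy']
```

The max_round=10 cap in the snippet above plays the same role here: it bounds the conversation so a group of agents cannot loop forever.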
Group chat can combine different agents to complete tasks, which is very interesting work. If you want to know more, please visit
Case 2: AutoGen powered by Assistant API
The Assistant API is designed for AI agents. You can build AI agent applications with less code through the Assistant API. It integrates capabilities such as state management, context association, chat threads, and code execution, making it easier to access extensions (code interpreter, knowledge retrieval, function calling, etc.). Although AutoGen already had similar functions before the Assistant API, with its support AutoGen can be defined more flexibly in multi-agent scenarios, supporting more interactive scenarios, more flexible task execution, and better end-to-end process management.
Note: at the time of writing, AutoGen did not support the Assistant API on Azure OpenAI Service, so this article is based on the OpenAI Assistant API.
Before using the Assistant API, the relevant Assistants must be created on the OpenAI or Azure OpenAI Service portal. For details, please refer to https://learn.microsoft.com/azure/ai-services/openai/assistants-quickstart
Using the Assistant API in AutoGen requires adjusting the configuration. The tools type can be set to code_interpreter, retrieval, or function.
llm_config = {
    "config_list": config_list,
    "assistant_id": "Your OpenAI Assistant ID",
    "tools": [{"type": "code_interpreter"}],
    "file_ids": [
        "Your OpenAI Assistant File 1",
        "Your OpenAI Assistant File 2"
    ],
}
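For reference, the three tool types map to list entries like the following. This is an illustrative sketch: the get_weather function, its description, and its parameter schema are hypothetical placeholders, not part of the original article; function tools carry a JSON schema describing their parameters:

```python
# Possible "tools" entries for the llm_config above (illustrative):
tools_code = [{"type": "code_interpreter"}]
tools_retrieval = [{"type": "retrieval"}]
tools_function = [{
    "type": "function",
    "function": {
        "name": "get_weather",          # hypothetical function name
        "description": "Get the weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
```

Only one of these lists (or a combination of their entries) is assigned to the "tools" key, depending on which capabilities the assistant should have.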
Set up the agent based on this configuration:
from autogen.agentchat.contrib.gpt_assistant_agent import GPTAssistantAgent

gpt_assistant = GPTAssistantAgent(
    name="Your Assistant Agent Name",
    instructions="Your Assistant Agent Instructions",
    llm_config=llm_config
)
If you want to try running the contents of this repo, please visit
Build a visualization solution for AutoGen - AutoGen Studio
For enterprise solutions, many people prefer a combination of visualization and low-code methods to set up the relevant workflows. AutoGen Studio brings enterprises a better workflow-based, visual solution for customizing agents.
Installation
AutoGen Studio is recommended to run in a Python 3.11 environment. You can use conda to create the Python environment and install the AutoGen Studio package:
conda create -n agstudioenv python=3.11.7
conda activate agstudioenv
pip install autogenstudio
Remember to configure OPENAI_API_KEY or your AZURE_OPENAI_API_KEY before starting
export OPENAI_API_KEY='Your OpenAI Key'
export AZURE_OPENAI_API_KEY='Your Azure OpenAI Service Key'
Start AutoGen Studio; port specifies the network port and can be set as needed:
autogenstudio ui --port 8088
Use Case/Scenario
As everyone knows, I am a Premier League fan. I want to build an AI agent to help me analyze each Premier League team's situation in the new season based on the standings.
Assemble ammunition for your AI agent
AutoGen Studio now supports configuring skills, models, agents, and workflows. These four functions can be seen by selecting the Build menu.
1. Skills. Different functions can be added to the agent through Python. Here I add a get_league_standings skill.
Note: you need to register at https://www.football-data.org/ to get an API key.
import requests
import json

def get_league_standings(api_key='Your football-data API Key'):
    url = "http://api.football-data.org/v4/competitions/PL/standings"
    headers = {"X-Auth-Token": api_key}
    response = requests.get(url, headers=headers)
    data = response.json()
    standings = []
    if 'standings' in data:
        for standing in data['standings']:
            if standing['type'] == 'TOTAL':
                for team in standing['table']:
                    team_data = {
                        "position": team['position'],
                        "teamName": team['team']['name'],
                        "playedGames": team['playedGames'],
                        "won": team['won'],
                        "draw": team['draw'],
                        "lost": team['lost'],
                        "points": team['points'],
                        "goalsFor": team['goalsFor'],
                        "goalsAgainst": team['goalsAgainst'],
                        "goalDifference": team['goalDifference']
                    }
                    standings.append(team_data)
                break
        standings_json = json.dumps(standings, ensure_ascii=False, indent=4)
        return standings_json
    else:
        return "Error"
After saving, as shown in the figure
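To see what the skill produces without calling the live API, the same flattening logic can be exercised on a stub of the football-data.org v4 response. The stub data below is invented for illustration (abridged to three fields per row); the field names follow the API shape used in the skill above:

```python
import json

# Stub of the football-data.org v4 standings payload (abridged, made-up values).
sample_response = {
    "standings": [
        {
            "type": "TOTAL",
            "table": [
                {
                    "position": 1,
                    "team": {"name": "Arsenal FC"},
                    "points": 52,
                }
            ],
        }
    ]
}

def parse_standings(data):
    """Flatten the TOTAL table the same way the skill does, returning JSON."""
    rows = []
    for standing in data.get("standings", []):
        if standing["type"] == "TOTAL":
            for team in standing["table"]:
                rows.append({
                    "position": team["position"],
                    "teamName": team["team"]["name"],
                    "points": team["points"],
                })
            break
    return json.dumps(rows, ensure_ascii=False, indent=4)

print(parse_standings(sample_response))
```

Returning JSON text rather than Python objects is deliberate: the agent passes the skill's output back into the conversation as a string the LLM can read and reason over.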
2. Models corresponds to binding the LLMs. You need to set the key for OpenAI or Azure OpenAI Service before starting. We add a binding for the gpt-4-turbo model. Here we use Azure OpenAI Service, so the deployment name and endpoint need to correspond one-to-one with your Azure OpenAI Service.
3. Agents. Add your AI agent; you can set up different agents here. In our case, we only need a single agent: add a football_expert_assistant agent, set a system role for it, and bind the skill (get_league_standings) and the model just added, as shown in the figure.
4. Workflows. We can set the agent's workflow and its interactive dialogue workflow. We use the simplest two-agent interaction mode, "Two Agents."
We need to set up the Receiver and bind the configured football_expert_assistant agent and its LLM.
Running Your Agents
You can run your application through the Playground in the AutoGen Studio UI. You only need to create a Session corresponding to the configured workflow.
The result:
Of course, you can also publish the agent by selecting the Session; it can then be viewed through the Gallery menu.
Summary
AutoGen is a relatively comprehensive AI agent framework. For enterprises that want to build AI agents, it provides not only an application framework but also AutoGen Studio, a visual, interactive UI, which lowers the entry barrier and lets more people take advantage of intelligent agents. We have taken the first step in building an AI agent with AutoGen, and will cover more advanced content in the following series.
Resources
- Microsoft AutoGen https://microsoft.github.io/autogen/
- Microsoft AutoGen Studio UI 2.0 https://microsoft.github.io/autogen/blog/2023/12/01/AutoGenStudio/
- AutoGen Studio: Interactively Explore Multi-Agent Workflows https://microsoft.github.io/autogen/blog/2023/12/01/AutoGenStudio/
- Azure OpenAI Service Assistant API Docs https://learn.microsoft.com/azure/ai-services/openai/assistants-quickstart
Updated Feb 08, 2024
Version 1.0
kinfey, Microsoft
Educator Developer Blog