Building AI Agents on edge devices using Ollama + Phi-4-mini Function Calling

Microsoft

Mar 11, 2025

The new Phi-4-mini and Phi-4-multimodal now support Function Calling. This feature enables the models to connect with external tools and APIs. By deploying Phi-4-mini and Phi-4-multimodal with Function Calling capabilities on edge devices, we can achieve local expansion of knowledge capabilities and enhance their task execution efficiency. This blog will focus on how to use Phi-4-mini's Function Calling capabilities to build efficient AI Agents on edge devices.

What‘s Function Calling

How it works

First we need to learn how Function Calling works

Tool Integration: Function Calling allows LLM/SLM to interact with external tools and APIs, such as weather APIs, databases, or other services.
Function Definition: Defines a function (tool) that LLM/SLM can call, specifying its name, parameters, and expected output.
LLM Detection: LLM/SLM analyzes the user's input and determines if a function call is required and which function to use.
JSON Output: LLM/SLM outputs a JSON object containing the name of the function to call and the parameters required by the function.
External Execution: The application executes the function call using the parameters provided by LLM/SLM.
Response to LLM: Returns the output of Function Calling to LLM/SLM, and LLM/SLM can use this information to generate a response to the user.

Application scenarios

Data retrieval: convert natural language queries into API calls to fetch data (e.g., "show my recent orders" triggers a database query)
Operation execution: convert user requests into specific function calls (e.g., "schedule a meeting" becomes a calendar API call)
Computational tasks: handle mathematical or logical operations through dedicated functions (e.g., calculate compound interest or statistical analysis)
Data processing: chain multiple function calls together (e.g., get data → parse → transform → store)
UI/UX integration: trigger interface updates based on user interactions (e.g., update map markers or display charts)

Phi-4-mini / Phi-4-multimodal's Function Calling

Phi-4-mini / Phi-4-multimodal supports single and parallel Function Calling. Things to note when calling

You need to define Tools in System to start single or parallel Function Calling
If you want to start parallel Function Calling, you also need to add 'some tools' to the System prompt

The following is an example

Single Function Calling

tools = [ { "name": "get_match_result", "description": "get match result", "parameters": { "match": { "description": "The name of the match", "type": "str", "default": "Arsenal vs ManCity" } } }, ] messages = [ { "role": "system", "content": "You are a helpful assistant", "tools": json.dumps(tools), # pass the tools into system message using tools argument }, { "role": "user", "content": "What is the result of Arsenal vs ManCity today?" } ]

Full Sample : Click

Parallel Function Calling

AGENT_TOOLS = { "booking_fight": { "name": "booking_fight", "description": "booking fight", "parameters": { "departure": { "description": "The name of Departure airport code", "type": "str", }, "destination": { "description": "The name of Destination airport code", "type": "str", }, "outbound_date": { "description": "The date of outbound flight", "type": "str", }, "return_date": { "description": "The date of return flight", "type": "str", } } }, "booking_hotel": { "name": "booking_hotel", "description": "booking hotel", "parameters": { "query": { "description": "The name of the city", "type": "str", }, "check_in_date": { "description": "The date of check in", "type": "str", }, "check_out_date": { "description": "The date of check out", "type": "str", } } }, } SYSTEM_PROMPT = """ You are my travel agent with some tools available. """ messages = [ { "role": "system", "content": SYSTEM_PROMPT, "tools": json.dumps(AGENT_TOOLS), # pass the tools into system message using tools argument }, { "role": "user", "content": """I have a business trip from London to New York in March 21 2025 to March 27 2025, can you help me to book a hotel and flight tickets""" } ]

Full sample : click

Using Ollama and Phi-4-mini Function Calling to Create AI Agents on Edge Devices

Ollama is a popular free tool for deploying LLM/SLM locally and can be used in combination with AI Toolkit for VS Code. In addition to being deployed on your PC/Laptop, it can also be deployed on IoT, mobile phones, containers, etc. To use Phi-4-mini on Ollama, you need to use Ollama 0.5.13+. Different quantitative versions are supported on Ollama, as shown in the figure below:

Using Ollama, we can deploy Phi-4-mini on the edge, and implement AI Agent with Function Calling under limited computing power, so that Generative AI can be applied more effectively on the edge.

Current Issues

A sad experience - If you directly use the interface to try to call Ollama in the above way, you will find that Function Calling will not be triggered. There are discussions on Ollama's GitHub Issue. You can enter the Issue https://github.com/ollama/ollama/issues/9437. By modifying the Phi-4-mini Template on the ModelFile to implement a single Function Calling, but the call to Parallel Function Calling still failed.

Resolution

We have implemented a fix by making a adjustments to the template. We have improved it according to Phi-4-mini's Chat Template and re-modified the Modelfile. Of course, the quantitative model has a huge impact on the results. The adjustments are as follows:


TEMPLATE """
{{- if .Messages }}
{{- if or .System .Tools }}<|system|>

{{ if .System }}{{ .System }}
{{- end }}
In addition to plain text responses, you can chose to call one or more of the provided functions.

Use the following rule to decide when to call a function:
  * if the response can be generated from your internal knowledge (e.g., as in the case of queries like "What is the capital of Poland?"), do so
  * if you need external information that can be obtained by calling one or more of the provided functions, generate a function calls

If you decide to call functions:
  * prefix function calls with functools marker (no closing marker required)
  * all function calls should be generated in a single JSON list formatted as functools[{"name": [function name], "arguments": [function arguments as JSON]}, ...]
  * follow the provided JSON schema. Do not hallucinate arguments or values. Do to blindly copy values from the provided samples
  * respect the argument type formatting. E.g., if the type if number and format is float, write value 7 as 7.0
  * make sure you pick the right functions that match the user intent

Available functions as JSON spec:
{{- if .Tools }}
{{ .Tools }}
{{- end }}<|end|>
{{- end }}
{{- range .Messages }}
{{- if ne .Role "system" }}<|{{ .Role }}|>
{{- if and .Content (eq .Role "tools") }}

{"result": {{ .Content }}}
{{- else if .Content }}

{{ .Content }}
{{- else if .ToolCalls }}

functools[
{{- range .ToolCalls }}{{ "{" }}"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}{{ "}" }}
{{- end }}]
{{- end }}<|end|>
{{- end }}
{{- end }}<|assistant|>

{{ else }}
{{- if .System }}<|system|>

{{ .System }}<|end|>{{ end }}{{ if .Prompt }}<|user|>

{{ .Prompt }}<|end|>{{ end }}<|assistant|>

{{ end }}{{ .Response }}{{ if .Response }}<|user|>{{ end }}
"""

We have tested the solution using different quantitative models. In the laptop environment, we recommend that you use the following model to enable single/parallel Function Calling: phi4-mini:3.8b-fp16. Note: you need to bind the defined Modelfile and phi4-mini:3.8b-fp16 together to enable this to work.
Please execute the following command in the command line:

#If you haven't downloaded it yet, please execute this command firstr

ollama run phi4-mini:3.8b-fp16

#Binding with the adjusted Modelfile

ollama create phi4-mini:3.8b-fp16 -f  {Your Modelfile Path}

To test the single Function Calling and Parallel Function Calling of Phi-4-mini.

Single Function Calling

Parallel Function Calling

Full Sample in notebook

The above example is just a simple introduction. As we move forward with the development we hope to find simpler ways to apply it on the edge, use Function Calling to expand the scenarios of Phi-4-mini / Phi-4-multimodal, and also develop more usecases in vertical industries.