# Microsoft Agent Framework
Test your Microsoft Agent Framework agents without API keys. MAF wraps standard provider SDKs — point them at aimock and every LLM call returns deterministic fixture responses.
## Quick Start
Point your provider client at aimock and every call goes through your fixtures instead of the real API.
```python
from agent_framework import Agent, AgentRuntime
from agent_framework_openai import OpenAIChatClient
import asyncio

# Start aimock first: npx @copilotkit/aimock --fixtures ./fixtures
client = OpenAIChatClient(
    model="gpt-4o",
    base_url="http://localhost:4010/v1",
    api_key="test",
)

agent = Agent(
    name="Researcher",
    instructions="You are a helpful research assistant.",
    model_client=client,
)

async def main():
    runtime = AgentRuntime()
    runtime.register(agent)
    response = await runtime.send_message(agent, "What are the latest AI testing frameworks?")
    print(response)

asyncio.run(main())
```
```csharp
using System.ClientModel;
using Microsoft.Extensions.AI;
using OpenAI;

// Start aimock first: npx @copilotkit/aimock --fixtures ./fixtures
var openaiClient = new OpenAIClient(
    new ApiKeyCredential("test"),
    new OpenAIClientOptions
    {
        Endpoint = new Uri("http://localhost:4010/v1")
    }
);

var agent = openaiClient
    .GetChatClient("gpt-4o")
    .AsIChatClient();

var response = await agent.GetResponseAsync("What are the latest AI testing frameworks?");
Console.WriteLine(response.Text);
```
This works with any MAF provider that wraps an OpenAI-compatible API. For Azure OpenAI, see the Azure OpenAI guide.
**Note:** Setting the `OPENAI_BASE_URL` environment variable works when agents use the default OpenAI provider, but passing `base_url` explicitly to `OpenAIChatClient` is more reliable and is the recommended approach.
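If you do take the environment-variable route, the setup is two exports; the port and path below match the aimock defaults used throughout this guide:

```shell
# Route the default OpenAI provider through aimock instead of api.openai.com
export OPENAI_BASE_URL=http://localhost:4010/v1
# Any non-empty key satisfies the SDK's validation; aimock ignores its value
export OPENAI_API_KEY=test
```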
## Anthropic Provider
For Claude-backed agents, inject a pre-configured Anthropic client. Note that aimock
serves the Claude Messages API at the root path, not under /v1.
```python
from anthropic import AsyncAnthropic
from agent_framework_anthropic import AnthropicClient

anthropic_client = AsyncAnthropic(
    api_key="test",
    base_url="http://localhost:4010",
)

client = AnthropicClient(
    model="claude-sonnet-4-6",
    anthropic_client=anthropic_client,
)
```
```csharp
using Anthropic;

var client = new AnthropicClient(new APIAuthentication("test"))
{
    BaseUrl = "http://localhost:4010"
};
```
Pass this client as the `model_client` parameter when creating your agent, exactly as in the OpenAI example above. See Claude Messages for provider-specific fixture details.
## With aimock-pytest
The aimock-pytest plugin starts and stops the server automatically per test,
so you never need to manage a background process.
```bash
# Python
pip install aimock-pytest

# .NET
dotnet add package Microsoft.Extensions.AI
```
```python
import os
import pytest

@pytest.fixture(autouse=True)
def mock_llm(aimock):
    """aimock-pytest provides the `aimock` fixture automatically.
    It starts a fresh server for each test that requests it (function-scoped).
    You must set OPENAI_BASE_URL yourself so MAF agents route to aimock."""
    os.environ["OPENAI_BASE_URL"] = aimock.url + "/v1"
    os.environ["OPENAI_API_KEY"] = "test"
    aimock.load_fixtures("./fixtures/maf-agents.json")
    yield aimock
```
```python
import pytest
from agent_framework import Agent, AgentRuntime
from agent_framework_openai import OpenAIChatClient

@pytest.mark.asyncio
async def test_researcher_agent(aimock):
    client = OpenAIChatClient(
        model="gpt-4o",
        base_url=aimock.url + "/v1",
        api_key="test",
    )
    agent = Agent(
        name="Researcher",
        instructions="You are a helpful research assistant.",
        model_client=client,
    )
    runtime = AgentRuntime()
    runtime.register(agent)
    response = await runtime.send_message(agent, "Summarize recent AI breakthroughs")
    assert "AI" in str(response)
```
```csharp
using System.ClientModel;
using System.Diagnostics;
using Microsoft.Extensions.AI;
using OpenAI;
using Xunit;

public class AgentTests : IAsyncLifetime
{
    private Process? _aimock;

    public Task InitializeAsync()
    {
        _aimock = Process.Start(new ProcessStartInfo
        {
            FileName = "npx",
            Arguments = "@copilotkit/aimock --fixtures ./fixtures",
            RedirectStandardOutput = true,
        });
        return Task.Delay(1000); // wait for server startup
    }

    public Task DisposeAsync()
    {
        _aimock?.Kill();
        return Task.CompletedTask;
    }

    [Fact]
    public async Task ResearcherAgent_ReturnsFixtureResponse()
    {
        var openaiClient = new OpenAIClient(
            new ApiKeyCredential("test"),
            new OpenAIClientOptions
            {
                Endpoint = new Uri("http://localhost:4010/v1")
            }
        );
        var agent = openaiClient
            .GetChatClient("gpt-4o")
            .AsIChatClient();
        var response = await agent.GetResponseAsync(
            "Summarize recent AI breakthroughs"
        );
        Assert.Contains("AI", response.Text);
    }
}
```
## Multi-Agent Workflows
MAF supports multi-agent orchestration where agents communicate with each other. Each agent makes independent LLM calls — aimock matches each by the `userMessage` field, so you can write fixtures that target each agent's prompt pattern independently.
```json
{
  "fixtures": [
    {
      "match": { "userMessage": "research" },
      "response": {
        "content": "Based on my research, the key findings are:\n\n1. LLM testing with fixture-based mocks eliminates flaky tests caused by non-deterministic API responses.\n2. Proxy recording captures real interactions for replay in CI without API keys.\n3. Multi-agent frameworks like MAF benefit most because each agent multiplies the number of LLM calls per run."
      }
    },
    {
      "match": { "userMessage": "summarize" },
      "response": {
        "content": "# Summary: LLM Testing Best Practices\n\nFixture-based mocking brings determinism to AI-powered applications. By replacing real API calls with recorded responses, teams ship faster with confidence.\n\n## Key Takeaway\n\nEvery agent in a MAF runtime makes independent LLM calls. Without mocking, a multi-agent workflow means multiple sources of non-determinism per run. With aimock, every call returns the exact same response every time."
      }
    }
  ]
}
```
```python
import asyncio
from agent_framework import Agent, AgentRuntime
from agent_framework_openai import OpenAIChatClient

client = OpenAIChatClient(
    model="gpt-4o",
    base_url="http://localhost:4010/v1",
    api_key="test",
)

researcher = Agent(
    name="Researcher",
    instructions="You research topics thoroughly and report findings.",
    model_client=client,
)

summarizer = Agent(
    name="Summarizer",
    instructions="You summarize research into concise reports.",
    model_client=client,
)

async def main():
    runtime = AgentRuntime()
    runtime.register(researcher)
    runtime.register(summarizer)
    research = await runtime.send_message(researcher, "Research LLM testing best practices")
    summary = await runtime.send_message(summarizer, f"Summarize the following findings: {research}")
    print(summary)

asyncio.run(main())
```
```csharp
using System.ClientModel;
using Microsoft.Extensions.AI;
using OpenAI;

var openaiClient = new OpenAIClient(
    new ApiKeyCredential("test"),
    new OpenAIClientOptions
    {
        Endpoint = new Uri("http://localhost:4010/v1")
    }
);

var agent = openaiClient
    .GetChatClient("gpt-4o")
    .AsIChatClient();

var research = await agent.GetResponseAsync(
    "Research LLM testing best practices"
);
var summary = await agent.GetResponseAsync(
    $"Summarize the following findings: {research.Text}"
);
Console.WriteLine(summary.Text);
```
The researcher's prompt contains "research", matching the first fixture. The summarizer's prompt contains "summarize", matching the second. Each agent gets its own deterministic response. See Fixtures and Sequential Responses for the full matching reference.
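The match rule is simple enough to sketch in a few lines. The snippet below is an illustrative model only; the `match_fixture` helper and the case-insensitive substring comparison are assumptions for the sketch, not aimock's actual implementation:

```python
def match_fixture(fixtures, user_message):
    """Return the first fixture whose `userMessage` pattern appears in the prompt."""
    for fixture in fixtures:
        pattern = fixture["match"]["userMessage"]
        if pattern.lower() in user_message.lower():
            return fixture
    return None

fixtures = [
    {"match": {"userMessage": "research"}, "response": {"content": "findings..."}},
    {"match": {"userMessage": "summarize"}, "response": {"content": "summary..."}},
]

# The researcher's prompt contains "research", so it hits the first fixture
hit = match_fixture(fixtures, "Research LLM testing best practices")
print(hit["response"]["content"])  # → findings...
```

Because matching is by pattern rather than exact prompt, one fixture can serve every prompt variation an agent produces, as long as its instructions keep the keyword in the message.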
## Tool Calls
MAF agents can use tools via function calling. When an agent invokes a tool, the framework sends a chat completion request with `tool_choice` and expects a tool-call response. aimock fixtures handle this with the `toolCalls` response field.
```json
{
  "fixtures": [
    {
      "match": { "userMessage": "search", "sequenceIndex": 0 },
      "response": {
        "toolCalls": [
          {
            "name": "web_search",
            "arguments": { "query": "LLM testing frameworks 2025" }
          }
        ]
      }
    },
    {
      "match": { "userMessage": "search", "sequenceIndex": 1 },
      "response": {
        "content": "Based on the search results, the top LLM testing frameworks are aimock, promptfoo, and deepeval."
      }
    }
  ]
}
```
```python
import asyncio
from agent_framework import Agent, AgentRuntime, FunctionTool
from agent_framework_openai import OpenAIChatClient

def web_search(query: str) -> str:
    """Search the web for information."""
    return f"Results for: {query}"

client = OpenAIChatClient(
    model="gpt-4o",
    base_url="http://localhost:4010/v1",
    api_key="test",
)

researcher = Agent(
    name="Researcher",
    instructions="Search the web for information when asked.",
    model_client=client,
    tools=[FunctionTool(web_search)],
)

async def main():
    runtime = AgentRuntime()
    runtime.register(researcher)
    result = await runtime.send_message(researcher, "Search for the latest LLM testing frameworks")
    print(result)

asyncio.run(main())
```
```csharp
using System.ClientModel;
using Microsoft.Extensions.AI;
using OpenAI;

var openaiClient = new OpenAIClient(
    new ApiKeyCredential("test"),
    new OpenAIClientOptions
    {
        Endpoint = new Uri("http://localhost:4010/v1")
    }
);

var agent = openaiClient
    .GetChatClient("gpt-4o")
    .AsIChatClient();

// aimock returns tool call fixtures — the client processes them
// just like real OpenAI tool call responses
var response = await agent.GetResponseAsync(
    "Search for the latest LLM testing frameworks"
);
Console.WriteLine(response.Text);
```
The first fixture triggers the tool call. After MAF processes the tool result and sends it back to the LLM, the second fixture matches the follow-up message and returns the final answer.
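That sequencing behavior can be modeled in a few lines. The `SequenceMatcher` class below is a hypothetical sketch of the documented behavior, not aimock's source: each pattern keeps a per-run counter, and the Nth request matching a pattern returns the fixture whose `sequenceIndex` is N (0-based).

```python
from collections import defaultdict

class SequenceMatcher:
    """Illustrative model of sequenceIndex matching (sketch, not aimock code)."""

    def __init__(self, fixtures):
        self.fixtures = fixtures
        self.counts = defaultdict(int)  # requests seen so far, per pattern

    def respond(self, user_message):
        for fixture in self.fixtures:
            match = fixture["match"]
            pattern = match["userMessage"]
            if pattern.lower() not in user_message.lower():
                continue
            # Only the fixture whose sequenceIndex equals the current count fires
            if match["sequenceIndex"] == self.counts[pattern]:
                self.counts[pattern] += 1
                return fixture["response"]
        return None

fixtures = [
    {"match": {"userMessage": "search", "sequenceIndex": 0},
     "response": {"toolCalls": [{"name": "web_search",
                                 "arguments": {"query": "LLM testing frameworks 2025"}}]}},
    {"match": {"userMessage": "search", "sequenceIndex": 1},
     "response": {"content": "final answer"}},
]

m = SequenceMatcher(fixtures)
first = m.respond("Search for the latest LLM testing frameworks")   # tool-call response
second = m.respond("Search for the latest LLM testing frameworks")  # final content
```

The same prompt pattern thus yields the tool call on the first request and the final answer on the follow-up, which is exactly the two-step exchange a MAF tool invocation produces.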
## CI with GitHub Action
Use the CopilotKit/aimock GitHub Action to run aimock as a background service
in your CI pipeline.
```yaml
name: Test MAF Agents
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - uses: CopilotKit/aimock@v1
        with:
          fixtures: ./fixtures
      - run: pip install agent-framework agent-framework-openai pytest pytest-asyncio aimock-pytest
      - run: pytest
        env:
          OPENAI_BASE_URL: http://127.0.0.1:4010/v1
          OPENAI_API_KEY: test
```
```yaml
name: Test MAF Agents
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
        with:
          dotnet-version: "8.0"
      - uses: CopilotKit/aimock@v1
        with:
          fixtures: ./fixtures
      - run: dotnet test
        env:
          OPENAI_BASE_URL: http://127.0.0.1:4010/v1
          OPENAI_API_KEY: test
```
The action starts aimock on port 4010, loads your fixtures, and keeps the server running for the duration of the job. No real API keys needed.
## Record & Replay
Record a full agent session against a real LLM, then replay it deterministically in tests and CI. This is especially useful for capturing complex multi-agent interactions.
```bash
# Start aimock in record mode — unmatched requests go to OpenAI
npx @copilotkit/aimock --fixtures ./fixtures \
  --record \
  --provider-openai https://api.openai.com

# Run your agent with the real API key (proxied through aimock)
export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=sk-your-real-key
python agent.py

# New fixtures appear in ./fixtures/recorded/
# Commit them to your repo for deterministic replay
```
On subsequent runs without `--record`, aimock replays the recorded fixtures. Every agent in the runtime gets the exact same response it received during the original recording, making your tests fully reproducible. See Record & Replay for the full reference.