CrewAI
Test your CrewAI crews without API keys. Each agent in a crew makes its own LLM calls — aimock handles them all with fixture-based responses.
Quick Start
CrewAI agents make OpenAI-compatible LLM calls by default. Point them at aimock and every
agent in your crew will send requests to the mock server instead of the real API. The
recommended approach is to use CrewAI's LLM class with an explicit
base_url, shown in the examples below.
# Terminal 1 — start the mock server
npx aimock --fixtures ./fixtures
# Terminal 2 — run your CrewAI script
export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=test
python crew.py
from crewai import Agent, Task, Crew, LLM
# Recommended: use the LLM class with an explicit base_url
llm = LLM(
model="openai/gpt-4o",
base_url="http://localhost:4010/v1",
api_key="test",
)
researcher = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
llm=llm,
)
task = Task(
description="Research the history of testing",
expected_output="A short summary",
agent=researcher,
)
crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
print(result)
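For the crew to get a reply, ./fixtures needs at least one fixture whose match pattern hits the researcher's prompt. A minimal example using aimock's match/response schema (covered in detail below), saved as, say, ./fixtures/quickstart.json:
{
  "fixtures": [
    {
      "match": { "userMessage": "history of testing" },
      "response": {
        "content": "Software testing began as manual checking in the 1950s and has since evolved into automated, fixture-driven practice."
      }
    }
  ]
}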
Note: setting OPENAI_BASE_URL works when agents use the default OpenAI provider, but LLM(base_url=...) is more reliable across all configurations and is the recommended way to point CrewAI at aimock.
With aimock-pytest
The aimock-pytest plugin starts and stops the server automatically per test,
so you never need to manage a background process.
pip install aimock-pytest
import os
import pytest
@pytest.fixture(autouse=True)
def mock_llm(aimock):
"""aimock-pytest provides the `aimock` fixture automatically.
It starts a fresh server for each test that requests it (function-scoped).
You must set OPENAI_BASE_URL yourself so CrewAI agents route to aimock."""
os.environ["OPENAI_BASE_URL"] = aimock.url + "/v1"
os.environ["OPENAI_API_KEY"] = "test"
aimock.load_fixtures("./fixtures/crewai-crew.json")
yield aimock
from crewai import Agent, Task, Crew
def test_researcher_crew():
researcher = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
)
task = Task(
description="Summarize recent AI breakthroughs",
expected_output="A short summary",
agent=researcher,
)
crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
assert "AI" in str(result)
Multi-Agent Crews
In a CrewAI crew, each agent makes independent LLM calls. The researcher agent sends its
own chat completion request, then the writer agent sends a separate one. Because aimock
matches on the userMessage field, you can write fixtures that target each
agent's prompt pattern independently.
{
"fixtures": [
{
"match": { "userMessage": "research" },
"response": {
"content": "Based on my research, the key findings are:\n\n1. LLM testing with fixture-based mocks eliminates flaky tests caused by non-deterministic API responses.\n2. Proxy recording captures real interactions for replay in CI without API keys.\n3. Multi-agent frameworks like CrewAI benefit most because each agent multiplies the number of LLM calls per run."
}
},
{
"match": { "userMessage": "write" },
"response": {
"content": "# Testing LLMs in CI\n\nFixture-based mocking brings determinism to AI-powered applications. By replacing real API calls with recorded responses, teams ship faster with confidence.\n\n## Why It Matters\n\nEvery agent in a CrewAI crew makes independent LLM calls. Without mocking, a two-agent crew means two sources of non-determinism per run. With aimock, every call returns the exact same response every time."
}
}
]
}
from crewai import Agent, Task, Crew, Process
researcher = Agent(
role="Researcher",
goal="Research topics thoroughly",
backstory="Senior research analyst",
)
writer = Agent(
role="Writer",
goal="Write compelling articles",
backstory="Technical content writer",
)
research_task = Task(
description="Research LLM testing best practices",
expected_output="Key findings as bullet points",
agent=researcher,
)
write_task = Task(
description="Write a blog post from the research",
expected_output="A short blog post in markdown",
agent=writer,
)
crew = Crew(
agents=[researcher, writer],
tasks=[research_task, write_task],
process=Process.sequential,
)
result = crew.kickoff()
The researcher's prompt contains "research", matching the first fixture. The writer's prompt contains "write", matching the second. Each agent gets its own deterministic response.
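Since every response is deterministic, tests can assert on each task's output individually. A sketch, assuming a recent CrewAI version where kickoff() returns a CrewOutput exposing a tasks_output list:
# tasks_output holds one TaskOutput per task, in execution order
research_output = result.tasks_output[0]
write_output = result.tasks_output[1]
assert "key findings" in research_output.raw
assert "Testing LLMs in CI" in write_output.raw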
Tool Calls
CrewAI agents can use tools. When an agent has tools attached, CrewAI includes their definitions in the chat completion request and expects a tool-call response whenever the model decides to invoke one. aimock fixtures produce that response with the toolCalls field.
{
"fixtures": [
{
"match": { "userMessage": "search", "sequenceIndex": 0 },
"response": {
"toolCalls": [
{
"name": "web_search",
"arguments": "{\"query\": \"LLM testing frameworks 2025\"}"
}
]
}
},
{
"match": { "userMessage": "search", "sequenceIndex": 1 },
"response": {
"content": "Based on the search results, the top LLM testing frameworks are aimock, promptfoo, and deepeval."
}
}
]
}
from crewai import Agent, Task, Crew
from crewai_tools import SerperDevTool
search_tool = SerperDevTool()
researcher = Agent(
role="Researcher",
goal="Search the web for information",
backstory="Expert web researcher",
tools=[search_tool],
)
task = Task(
description="Search for the latest LLM testing frameworks",
expected_output="A ranked list of frameworks",
agent=researcher,
)
crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
The first fixture triggers the tool call. After CrewAI processes the tool result and sends it back to the LLM, the second fixture matches the follow-up message and returns the final answer.
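Keep in mind that aimock mocks the LLM, not the tool: when the mocked tool call executes, SerperDevTool still hits the real Serper API. For fully offline tests, one option is to swap in a stub tool. A sketch using CrewAI's tool decorator (the import path varies across CrewAI versions):
from crewai import Agent
from crewai.tools import tool  # older versions: from crewai_tools import tool

@tool("web_search")
def web_search(query: str) -> str:
    """Return canned results instead of calling a real search API."""
    return "Top frameworks: aimock, promptfoo, deepeval."

researcher = Agent(
    role="Researcher",
    goal="Search the web for information",
    backstory="Expert web researcher",
    tools=[web_search],
)
The stub's name matches the web_search name in the fixture's toolCalls entry, so the mocked tool call routes straight to it.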
CI with GitHub Action
Use the CopilotKit/aimock GitHub Action to run aimock as a background service
in your CI pipeline.
name: Test CrewAI Crew
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.11"
- uses: CopilotKit/aimock@v1
with:
fixtures: ./fixtures
- run: pip install crewai pytest aimock-pytest
- run: pytest
env:
OPENAI_BASE_URL: http://127.0.0.1:4010/v1
OPENAI_API_KEY: test
The action starts aimock on port 4010, loads your fixtures, and keeps the server running for the duration of the job. No real API keys needed.
Record & Replay
Record a full crew execution against a real LLM, then replay it deterministically in tests and CI. This is especially useful for capturing complex multi-agent interactions.
# Start aimock in record mode — unmatched requests go to OpenAI
npx aimock --fixtures ./fixtures \
--record \
--provider-openai https://api.openai.com
# Run your crew with the real API key (proxied through aimock)
export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=sk-your-real-key
python crew.py
# New fixtures appear in ./fixtures/recorded/
# Commit them to your repo for deterministic replay
On subsequent runs without --record, aimock replays the recorded fixtures.
Every agent in the crew gets the exact same response it received during the original
recording, making your tests fully reproducible.
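A replay run then looks like the quick start: the same flags minus --record, and no real key required.
# Replay: aimock serves the recorded fixtures
npx aimock --fixtures ./fixtures

export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=test
python crew.py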