# Switching from Python mock libraries to aimock
pytest-mockllm, openai-responses-python, and evalcraft work great for single-process Python tests. When your AI app spans multiple services—or you want to test from any language—aimock gives you a real mock server accessible from anywhere.
## The libraries
| Library | Approach | Scope |
|---|---|---|
| pytest-mockllm | pytest fixture + monkey-patching | OpenAI and Anthropic, in-process |
| openai-responses-python | Decorator that intercepts httpx | OpenAI API responses only |
| evalcraft | Mock + evaluation framework | OpenAI completions + eval metrics |
All three work by intercepting HTTP calls within the same Python process. This is convenient for unit tests, but it breaks down when your AI application spans multiple services (API server, agent worker, background jobs) or when you need to test from Playwright, a Node.js frontend, or another language entirely.
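The limitation is easy to demonstrate with plain `unittest.mock`, using `json.dumps` as a stand-in for the SDK call a mock library would patch (no mock-library API is assumed here):

```python
# A patch applied in the test process is invisible to any other Python
# interpreter (agent worker, background job, subprocess) -- the same
# reason decorator-based LLM mocks can't cover multi-service setups.
import subprocess
import sys
from unittest import mock

import json as target  # stand-in for the module a mock library patches

with mock.patch.object(target, "dumps", lambda obj: "MOCKED"):
    # Inside this process, the patch is active...
    assert target.dumps({}) == "MOCKED"
    # ...but a child process imports a fresh, unpatched module.
    out = subprocess.run(
        [sys.executable, "-c", "import json; print(json.dumps({}))"],
        capture_output=True, text=True,
    ).stdout.strip()
    assert out == "{}"  # the child never saw the mock
```

A standalone mock server sidesteps this entirely: every process, in any language, talks to the same HTTP endpoint.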
## Honest assessment
There are two paths for Python teams. If Node.js is available, `npx aimock` starts a mock server in one command, no Docker needed. An `aimock-pytest` pip package is in development to provide native pytest fixture integration with automatic server lifecycle management. For Docker-based CI environments, the `ghcr.io/copilotkit/aimock` image works with any language.
## Code comparison
Here's what the switch looks like in practice. The Python decorator becomes a Docker container plus a `conftest.py` fixture.
### pytest-mockllm (before)

```python
import pytest
from pytest_mockllm import mock_openai

@mock_openai(response="Hello from the mock")
def test_my_agent():
    result = my_agent.run("hello")
    assert result == "Hello from the mock"
```
### openai-responses-python (before)

```python
from openai import OpenAI
from openai_responses import mock_completions

@mock_completions(content="Hello from the mock")
def test_chat():
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "hello"}],
    )
    assert resp.choices[0].message.content == "Hello from the mock"
```
### aimock (after)

`conftest.py`:

```python
import os
import subprocess
import time

import pytest

@pytest.fixture(scope="session")
def aimock_server():
    # Start aimock via Docker
    proc = subprocess.Popen([
        "docker", "run", "--rm",
        "-p", "4010:4010",
        "-v", f"{os.getcwd()}/fixtures:/fixtures",
        "ghcr.io/copilotkit/aimock:latest",
        "-f", "/fixtures",
    ])
    time.sleep(2)  # wait for the server to come up
    # Point the OpenAI SDK at the mock
    os.environ["OPENAI_BASE_URL"] = "http://localhost:4010/v1"
    os.environ["OPENAI_API_KEY"] = "mock-key"
    yield "http://localhost:4010"
    proc.terminate()
    proc.wait()
```
```python
import openai

def test_chat_completion(aimock_server):
    client = openai.OpenAI(
        base_url=f"{aimock_server}/v1",
        api_key="mock-key",
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "hello"}],
    )
    assert response.choices[0].message.content == "Hello from the mock"
```
The fixture file that produces this response:

```json
{
  "match": { "userMessage": "hello" },
  "response": { "content": "Hello from the mock" }
}
```
## What you gain
**Cross-process, cross-language.** Your Python tests, Node.js frontend, Go microservices, and Playwright E2E tests all hit the same mock server. No per-language patching.

**10+ LLM providers.** OpenAI, Claude, Gemini, Bedrock, Azure, Vertex AI, Ollama, Cohere. The Python libraries only cover OpenAI (and sometimes Anthropic).

**Record & replay.** Proxy real APIs, save responses as fixtures, replay forever. No manual response construction.

**MCP / A2A / Vector.** Mock your entire AI stack (MCP tool servers, A2A agent endpoints, vector databases), not just LLM calls.

**WebSocket + streaming.** Built-in SSE streaming and WebSocket protocol support (OpenAI Realtime, Gemini Live). The Python libraries don't handle streaming.

**Chaos testing.** Inject latency, drop chunks, corrupt payloads mid-stream. Test your error handling under realistic failure conditions.
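On the client side, surviving injected latency is ordinary timeout handling. A minimal stdlib sketch, using a local socket that stalls as a stand-in for a latency-injecting mock (no aimock API is assumed here):

```python
# Verify that client code surfaces a timeout instead of hanging when
# the upstream stalls. The stand-in server accepts the connection but
# never responds, mimicking injected latency.
import socket
import threading
import time
import urllib.request

def stalling_server(sock):
    conn, _ = sock.accept()
    time.sleep(2)   # simulated injected latency, longer than the client timeout
    conn.close()

sock = socket.socket()
sock.bind(("127.0.0.1", 0))
sock.listen(1)
port = sock.getsockname()[1]
threading.Thread(target=stalling_server, args=(sock,), daemon=True).start()

try:
    urllib.request.urlopen(f"http://127.0.0.1:{port}/v1/chat", timeout=0.5)
    timed_out = False
except OSError:  # covers URLError and socket timeouts
    timed_out = True

assert timed_out  # the client reported the failure instead of hanging
```

The same pattern applies with the OpenAI SDK's `timeout` parameter: assert that your retry/fallback path runs when the mock stalls.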
## What you lose (honestly)
| Capability | Python mocks | aimock | Notes |
|---|---|---|---|
| In-process decorator convenience | ✓ | ✗ | Coming with aimock-pytest pip package |
| Native pytest integration | ✓ | conftest.py fixture | Works, but more boilerplate today |
| Zero infrastructure | ✓ | Docker or npx | Requires Docker or Node.js runtime |
| Cross-process mocking | ✗ | ✓ | aimock's key advantage |
| Multi-provider | 1–2 providers | 10+ | |
| Streaming SSE | ✗ | Built-in | |
| WebSocket protocols | ✗ | 3 protocols | |
| Record & replay | ✗ | ✓ | |
| MCP / A2A / Vector | ✗ | ✓ | |
| Chaos testing | ✗ | ✓ |
## CLI / Docker quick start
With Node.js:

```shell
# Run the mock server (requires Node.js)
npx aimock -p 4010 -f ./fixtures

# Point your Python app at the mock
export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=mock-key

# Run your tests
pytest
```

With Docker:

```shell
# Pull and run
docker run -d -p 4010:4010 \
  -v $(pwd)/fixtures:/fixtures \
  ghcr.io/copilotkit/aimock:latest \
  -f /fixtures

# Point your Python app at the mock
export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=mock-key

# Run your tests
pytest
```
Docker is the recommended path for Python teams since it doesn't require Node.js in your development environment. Add the container to your `docker-compose.yml` or CI pipeline alongside your Python services.
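A sketch of what that service definition might look like, mirroring the `docker run` flags above (the fixtures path is an assumption; adjust it to your repo layout):

```yaml
# docker-compose.yml (sketch)
services:
  aimock:
    image: ghcr.io/copilotkit/aimock:latest
    command: ["-f", "/fixtures"]
    ports:
      - "4010:4010"
    volumes:
      - ./fixtures:/fixtures
```

Your Python services can then reach the mock at `http://aimock:4010/v1` on the compose network.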
## Alternative: npx fixture (no Docker)
If Node.js is available in your environment, you can skip Docker entirely and run `npx aimock` directly from your `conftest.py`.
```python
import os
import subprocess
import time

import pytest

@pytest.fixture(scope="session")
def aimock_server():
    proc = subprocess.Popen(
        ["npx", "aimock", "-p", "4010", "-f", "./fixtures"],
        # Discard server logs; PIPE without draining it can deadlock
        # once the OS pipe buffer fills up.
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    time.sleep(2)  # wait for the server to come up
    os.environ["OPENAI_BASE_URL"] = "http://localhost:4010/v1"
    os.environ["OPENAI_API_KEY"] = "mock-key"
    yield "http://localhost:4010"
    proc.terminate()
    proc.wait()
```
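One refinement worth making in either fixture: replace the fixed `time.sleep(2)` with a readiness poll, so a server that fails to start surfaces immediately and a fast start wastes no time. A stdlib-only sketch (the helper name `wait_for_port` is ours, not part of aimock):

```python
# Poll a TCP port until it accepts connections, instead of sleeping a
# fixed amount. Works for both the Docker and npx startup paths.
import socket
import time

def wait_for_port(port, host="127.0.0.1", timeout=30.0):
    """Block until host:port accepts TCP connections or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.1)  # not up yet; retry shortly
    raise TimeoutError(f"mock server on port {port} did not start")
```

In the fixture, call `wait_for_port(4010)` where the `time.sleep(2)` was.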