Switching from Python mock libraries to aimock

pytest-mockllm, openai-responses-python, and evalcraft work great for single-process Python tests. When your AI app spans multiple services—or you want to test from any language—aimock gives you a real mock server accessible from anywhere.

The libraries

Library | Approach | Scope
pytest-mockllm | pytest fixture + monkey-patching | OpenAI and Anthropic in-process
openai-responses-python | Decorator that intercepts httpx | OpenAI API responses only
evalcraft | Mock + evaluation framework | OpenAI completions + eval metrics

All three work by intercepting HTTP calls within the same Python process. This is convenient for unit tests, but it breaks down when your AI application spans multiple services (API server, agent worker, background jobs) or when you need to test from Playwright, a Node.js frontend, or another language entirely.
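
The process boundary is easy to demonstrate with the standard library alone. The sketch below patches `socket.gethostname` as a stand-in for an LLM client call (an illustrative choice, not something the three libraries do). The patch is visible inside the test process but not inside a child process, which is the same reason an in-process mock never reaches a separate API server or worker.

```python
import socket
import subprocess
import sys
from unittest import mock

# Code executed by the child interpreter; it knows nothing about the parent's patch.
CHILD_CODE = "import socket; print(socket.gethostname())"


def hostname_seen_by_child() -> str:
    """Run a separate Python process and return the hostname it reports."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def demo() -> tuple[str, str]:
    # socket.gethostname stands in for an LLM client call (illustrative only).
    with mock.patch("socket.gethostname", return_value="patched-in-parent"):
        in_process = socket.gethostname()    # sees the patch
        in_child = hostname_seen_by_child()  # separate interpreter: no patch
    return in_process, in_child
```

Calling `demo()` returns the patched value for the first element and the machine's real hostname for the second.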

Honest assessment

There are two paths for Python teams today. If you have Node.js available, npx @copilotkit/aimock starts a mock server in one command, with no Docker needed. For Docker-based CI environments, the ghcr.io/copilotkit/aimock image works with any language. The aimock-pytest pip package, still in development, is intended to provide native pytest fixture integration with automatic server lifecycle management.

Code comparison

Here's what the switch looks like in practice. The Python decorator becomes a Docker container + conftest.py fixture.

pytest-mockllm (before)

test_agent.py py
import pytest
from pytest_mockllm import mock_openai

import my_agent  # placeholder name: your application module under test

@mock_openai(response="Hello from the mock")
def test_my_agent():
    result = my_agent.run("hello")
    assert result == "Hello from the mock"

openai-responses-python (before)

test_completions.py py
from openai import OpenAI
from openai_responses import mock_completions

@mock_completions(content="Hello from the mock")
def test_chat():
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "hello"}]
    )
    assert resp.choices[0].message.content == "Hello from the mock"

aimock (after)

conftest.py py
import os
import subprocess
import time

import pytest
import requests

@pytest.fixture(scope="session")
def aimock_server():
    # Start aimock via Docker
    proc = subprocess.Popen([
        "docker", "run", "--rm",
        "-p", "4010:4010",
        "-v", f"{os.getcwd()}/fixtures:/fixtures",
        "ghcr.io/copilotkit/aimock:latest",
        "-f", "/fixtures", "-h", "0.0.0.0"
    ])
    # Wait for health endpoint; fail loudly if aimock never comes up
    for _ in range(30):
        if proc.poll() is not None:
            raise RuntimeError(f"aimock exited early with code {proc.returncode}")
        try:
            # Short timeout so a hung connection cannot stall the loop
            if requests.get("http://localhost:4010/health", timeout=1).ok:
                break
        except requests.RequestException:
            pass
        time.sleep(0.2)
    else:
        proc.terminate()  # don't leave the container running on failure
        raise RuntimeError("aimock did not become healthy after 30 attempts")

    # Save originals so we don't clobber real credentials in the test process
    prev_base = os.environ.get("OPENAI_BASE_URL")
    prev_key = os.environ.get("OPENAI_API_KEY")
    os.environ["OPENAI_BASE_URL"] = "http://localhost:4010/v1"
    os.environ["OPENAI_API_KEY"] = "mock-key"

    try:
        yield "http://localhost:4010"
    finally:
        proc.terminate()
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait(timeout=5)
        # Restore originals (or remove if there were none)
        for name, val in (("OPENAI_BASE_URL", prev_base), ("OPENAI_API_KEY", prev_key)):
            if val is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = val

test_agent.py py
import openai

def test_chat_completion(aimock_server):
    client = openai.OpenAI(
        base_url=f"{aimock_server}/v1",
        api_key="mock-key"
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "hello"}]
    )
    assert response.choices[0].message.content == "Hello from the mock"

fixtures/hello.json json
{
  "match": { "userMessage": "hello" },
  "response": { "content": "Hello from the mock" }
}
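
Fixture files like the one above can also be generated from test code. The helper below is a minimal sketch that writes only the two keys shown in hello.json (`match.userMessage` and `response.content`); the function name `write_fixture` is hypothetical, and whether aimock accepts other keys or file layouts is not covered here.

```python
import json
from pathlib import Path


def write_fixture(fixtures_dir, name, user_message, content):
    """Write one fixture file using only the keys shown in hello.json."""
    path = Path(fixtures_dir) / f"{name}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    fixture = {
        "match": {"userMessage": user_message},
        "response": {"content": content},
    }
    path.write_text(json.dumps(fixture, indent=2) + "\n", encoding="utf-8")
    return path
```

For example, `write_fixture("fixtures", "hello", "hello", "Hello from the mock")` produces a file equivalent to fixtures/hello.json.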

What you gain

🌐

Cross-process, cross-language

Your Python tests, Node.js frontend, Go microservices, and Playwright E2E tests all hit the same mock server. No per-language patching.
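
Because the mock is an ordinary HTTP server, a client needs no SDK and no patching. As a rough sketch, the function below sends an OpenAI-style chat completion request with only the standard library. The `/v1/chat/completions` path follows from the `/v1` base URL used in this guide; the helper name `chat` and its defaults are illustrative.

```python
import json
import urllib.request


def chat(base_url: str, user_message: str, model: str = "gpt-4",
         api_key: str = "mock-key", timeout: float = 5.0) -> str:
    """POST an OpenAI-style chat completion request and return the reply text."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Standard OpenAI response shape: first choice, assistant message text.
    return payload["choices"][0]["message"]["content"]
```

With the server from this guide running and fixtures/hello.json loaded, `chat("http://localhost:4010", "hello")` would be expected to return the fixture's content.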

📡

12 LLM providers

OpenAI (Chat, Responses, Realtime), Claude, Gemini (REST, Live, and Interactions), Bedrock, Azure, Vertex AI, Ollama, Cohere. The Python libraries only cover OpenAI (and sometimes Anthropic).

Record & replay

Proxy real APIs, save responses as fixtures, replay forever. No manual response construction.

🧩

MCP / A2A / AG-UI / Vector

Mock your entire AI stack — LLM, MCP, A2A, AG-UI, vector — on one port.

🔌

WebSocket + streaming

Built-in SSE streaming and WebSocket protocol support (OpenAI Realtime, Gemini Live). The Python libraries don't handle streaming.
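
On the client side, a streamed reply arrives as a series of `data:` lines. The sketch below joins the text deltas from such lines, assuming the mock emits OpenAI's chat completion chunk format ending with a `[DONE]` sentinel; the wire format is not shown in this guide, so treat that as an assumption. The name `collect_stream_text` is illustrative.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the text deltas from OpenAI-style chat completion SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # OpenAI's end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            # The first chunk usually carries only the role, so content may be absent.
            parts.append(delta.get("content") or "")
    return "".join(parts)
```

In practice the official `openai` client does this parsing for you when called with `stream=True`; the function is mainly useful for asserting on raw responses from non-SDK clients.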

💥

Chaos testing

Inject latency, drop chunks, corrupt payloads mid-stream. Test your error handling under realistic failure conditions.
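
The code that chaos testing exercises is your own error handling. As one example of what might sit under test, here is a small retry wrapper with exponential backoff. It is a generic sketch, not part of aimock, and how chaos modes are enabled on the server is outside what this guide documents.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(fn: Callable[[], T], attempts: int = 3,
                      base_delay: float = 0.1,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.1s, 0.2s, 0.4s, ...
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so that unit tests can record the delays instead of waiting for them.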

What you lose (honestly)

Capability | Python mocks | aimock | Notes
In-process decorator convenience | Yes | No | Coming with aimock-pytest pip package
Native pytest integration | Yes | conftest.py fixture | Works, but more boilerplate today
Zero infrastructure | Yes | Docker or npx | Requires Docker or Node.js runtime
Cross-process mocking | No | Yes | aimock's key advantage
Multi-provider | 1–2 providers | 12 |
Streaming SSE | No | Built-in |
WebSocket protocols | No | 3 protocols |
Record & replay | No | Yes |
MCP / A2A / AG-UI / Vector | No | Yes |
Chaos testing | No | Yes |

CLI / Docker quick start

Install & run sh
# Run the mock server (requires Node.js; llmock is the package's flag-driven binary)
npx -p @copilotkit/aimock llmock -p 4010 -f ./fixtures

# Point your Python app at the mock
export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=mock-key

# Run your tests
pytest

Docker (no Node.js required) sh
# Pull and run
docker run -d -p 4010:4010 \
  -v "$(pwd)/fixtures":/fixtures \
  ghcr.io/copilotkit/aimock:latest \
  -f /fixtures -h 0.0.0.0

# Point your Python app at the mock
export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=mock-key

# Run your tests
pytest

Docker is the recommended path for Python teams since it doesn't require Node.js in your development environment. Add the container to your docker-compose.yml or CI pipeline alongside your Python services.
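
When the container is started outside the test process (for example with `docker run -d` or docker-compose), tests still need to wait until the server is ready. The helper below is a minimal sketch that polls the `/health` endpoint used elsewhere in this guide against a wall-clock deadline, using only the standard library. The name `wait_for_health` and its defaults are illustrative.

```python
import time
import urllib.request


def wait_for_health(base_url: str, deadline_s: float = 30.0,
                    interval_s: float = 0.2) -> bool:
    """Poll {base_url}/health until it answers 2xx or the deadline passes."""
    stop_at = time.monotonic() + deadline_s
    while time.monotonic() < stop_at:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=1) as resp:
                if 200 <= resp.status < 300:
                    return True
        except OSError:
            # Refused connections, timeouts, and HTTP error statuses
            # (urllib's URLError/HTTPError) are all OSError subclasses.
            pass
        time.sleep(interval_s)
    return False
```

A session-scoped fixture could call `wait_for_health("http://localhost:4010")` and raise if it returns `False`. A deadline is used rather than a fixed attempt count so that slow image pulls in CI can be accommodated by changing one number.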

Alternative: npx fixture (no Docker)

If Node.js is available in your environment, you can skip Docker entirely and launch aimock with npx from your conftest.py.

conftest.py (npx) py
import os
import subprocess
import time

import pytest
import requests

@pytest.fixture(scope="session")
def aimock_server():
    proc = subprocess.Popen(
        ["npx", "-p", "@copilotkit/aimock", "llmock", "-p", "4010", "-f", "./fixtures"],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # Wait for health endpoint; fail loudly if aimock never comes up
    for _ in range(30):
        if proc.poll() is not None:
            raise RuntimeError(f"aimock exited early with code {proc.returncode}")
        try:
            # Short timeout so a hung connection cannot stall the loop
            if requests.get("http://localhost:4010/health", timeout=1).ok:
                break
        except requests.RequestException:
            pass
        time.sleep(0.2)
    else:
        proc.terminate()  # don't leave the server running on failure
        raise RuntimeError("aimock did not become healthy after 30 attempts")

    # Save originals so we don't clobber real credentials in the test process
    prev_base = os.environ.get("OPENAI_BASE_URL")
    prev_key = os.environ.get("OPENAI_API_KEY")
    os.environ["OPENAI_BASE_URL"] = "http://localhost:4010/v1"
    os.environ["OPENAI_API_KEY"] = "mock-key"

    try:
        yield "http://localhost:4010"
    finally:
        proc.terminate()
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait(timeout=5)
        # Restore originals (or remove if there were none)
        for name, val in (("OPENAI_BASE_URL", prev_base), ("OPENAI_API_KEY", prev_key)):
            if val is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = val
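
The Docker and npx fixtures differ only in the command they launch, so one way to avoid maintaining two copies is to build the command in a helper. The sketch below reuses the exact commands from this guide; the function name `aimock_command` is hypothetical, and mapping a custom host port to container port 4010 assumes the image listens on 4010 as in the Docker command shown earlier.

```python
import os


def aimock_command(runner: str, fixtures_dir: str = "./fixtures",
                   port: int = 4010) -> list[str]:
    """Return the argv for launching aimock via "docker" or "npx"."""
    if runner == "docker":
        # Docker bind mounts need an absolute host path.
        host_dir = os.path.abspath(fixtures_dir)
        return [
            "docker", "run", "--rm",
            "-p", f"{port}:4010",  # container port 4010, as in the docker command above
            "-v", f"{host_dir}:/fixtures",
            "ghcr.io/copilotkit/aimock:latest",
            "-f", "/fixtures", "-h", "0.0.0.0",
        ]
    if runner == "npx":
        return ["npx", "-p", "@copilotkit/aimock", "llmock",
                "-p", str(port), "-f", fixtures_dir]
    raise ValueError(f"unknown runner: {runner!r}")
```

A single fixture could then call `subprocess.Popen(aimock_command(runner))`, with `runner` chosen by a project-specific setting such as an environment variable.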