Switching from piyook/llm-mock to aimock
piyook/llm-mock handles basic OpenAI mocking with static JSON templates. aimock is a superset: it adds streaming, multi-provider support, WebSocket protocols, structured output, sequential responses, and mocks for the surrounding AI stack (MCP, A2A, vector databases).
Fixture format comparison
piyook/llm-mock requires you to build the full OpenAI response envelope by hand. aimock uses a declarative match+response format—you specify the content, and aimock auto-builds the response envelope with proper IDs, timestamps, token counts, and SSE framing.
A piyook/llm-mock fixture is the full envelope, written by hand:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-4",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello there"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
```
The equivalent aimock fixture:

```jsonc
{
  "match": { "userMessage": "hello" },
  "response": { "content": "Hello there" }
}

// aimock auto-generates:
// - id, object, created, model
// - choices[].index, finish_reason
// - usage (prompt_tokens, completion_tokens)
// - SSE streaming chunks (when stream: true)
```
With piyook/llm-mock, you maintain one template per response scenario and the server returns it verbatim. With aimock, you write the minimum—just the content and an optional match rule—and the server handles envelope generation, streaming, and provider-specific formatting.
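The envelope expansion can be pictured with a small sketch. This is a hypothetical helper, not aimock's actual code; the field names follow the OpenAI Chat Completions envelope shown above, and the token counts are naive length-based estimates:

```typescript
// Sketch: expand a minimal {content} fixture into a full OpenAI-style
// envelope, the way aimock does automatically. Hypothetical helper only.
function buildEnvelope(content: string, model = "gpt-4") {
  const estimateTokens = (text: string) =>
    Math.max(1, Math.ceil(text.length / 4)); // rough 4-chars-per-token guess

  return {
    id: `chatcmpl-${Math.random().toString(36).slice(2, 10)}`,
    object: "chat.completion",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [
      {
        index: 0,
        message: { role: "assistant", content },
        finish_reason: "stop",
      },
    ],
    usage: {
      prompt_tokens: 0, // a real mock would estimate this from the request
      completion_tokens: estimateTokens(content),
      total_tokens: estimateTokens(content),
    },
  };
}

const envelope = buildEnvelope("Hello there");
```

The point is what the fixture author no longer writes: everything outside `response.content` is derived.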
What you gain
Streaming SSE
Built-in Server-Sent Events for all providers. No manual chunk construction—the same fixture works for streaming and non-streaming requests.
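The framing a streaming mock has to produce can be sketched like this. It follows the public OpenAI `chat.completion.chunk` wire format, but the helper itself is hypothetical, not aimock's implementation:

```typescript
// Sketch: split content into delta chunks and frame them as SSE lines in
// the OpenAI chat.completion.chunk format. Hypothetical helper only.
function toSseFrames(content: string, chunkSize = 4): string[] {
  const frames: string[] = [];
  for (let i = 0; i < content.length; i += chunkSize) {
    const body = {
      object: "chat.completion.chunk",
      choices: [
        {
          index: 0,
          delta: { content: content.slice(i, i + chunkSize) },
          finish_reason: null,
        },
      ],
    };
    frames.push(`data: ${JSON.stringify(body)}`);
  }
  // Final chunk carries finish_reason, then the stream terminator.
  frames.push(
    `data: ${JSON.stringify({
      object: "chat.completion.chunk",
      choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
    })}`,
  );
  frames.push("data: [DONE]");
  return frames;
}
```

With piyook/llm-mock you would hand-author every one of those `data:` lines per scenario; here they fall out of the same `content` string the non-streaming path uses.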
10+ providers
OpenAI, Claude, Gemini, Bedrock, Azure, Vertex AI, Ollama, Cohere, and any OpenAI-compatible endpoint.
WebSocket APIs
OpenAI Realtime, Responses WS, Gemini Live. Full bidirectional protocol mocking out of the box.
Structured output
JSON mode and response_format matching. Return typed JSON that validates
against your schema.
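The matching side can be pictured with a sketch. The request shape follows OpenAI's `response_format` field; the fixture keys and matcher logic here are illustrative, not aimock's exact fixture schema:

```typescript
// Sketch: pick a fixture based on whether the request asks for JSON mode.
// Fixture shape and matcher are hypothetical.
type Fixture = { match: { jsonMode?: boolean }; response: { content: string } };

function pickFixture(
  request: { response_format?: { type: string } },
  fixtures: Fixture[],
): Fixture | undefined {
  const wantsJson = request.response_format?.type === "json_object";
  return fixtures.find((f) => (f.match.jsonMode ?? false) === wantsJson);
}

const fixtures: Fixture[] = [
  { match: { jsonMode: true }, response: { content: '{"city":"Oslo","temp_c":4}' } },
  { match: {}, response: { content: "It is 4°C in Oslo." } },
];
```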
Sequential responses
Return different responses on successive calls. Model multi-turn conversations, retry scenarios, and degradation patterns.
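The mechanism is simple to picture: a per-fixture call counter. A minimal sketch (hypothetical helper; aimock configures this declaratively rather than in code), modeling two rate-limit failures followed by a success for a retry test:

```typescript
// Sketch: return a different response on each successive matching call,
// clamping at the last response once the sequence is exhausted.
function sequential<T>(responses: T[]): () => T {
  let call = 0;
  return () => {
    const r = responses[Math.min(call, responses.length - 1)];
    call += 1;
    return r;
  };
}

const next = sequential(["rate_limited", "rate_limited", "Hello there"]);
```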
MCP / A2A / Vector
Mock MCP tool servers, A2A agent endpoints, and vector database APIs alongside LLM mocks on one port.
Record & replay
Proxy real APIs, capture responses as fixtures, replay deterministically in CI. No manual fixture authoring.
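Deterministic replay comes down to keying captured responses by the request itself. A sketch of that idea (hypothetical; aimock's captured-fixture format may differ):

```typescript
// Sketch: key a captured response by a hash of the request so the same
// request always replays the same response in CI.
import { createHash } from "node:crypto";

const recordings = new Map<string, string>();

function keyFor(request: unknown): string {
  return createHash("sha256").update(JSON.stringify(request)).digest("hex");
}

function record(request: unknown, response: string): void {
  recordings.set(keyFor(request), response);
}

function replay(request: unknown): string | undefined {
  return recordings.get(keyFor(request));
}
```

Note that JSON.stringify is order-sensitive, so a real implementation would canonicalize the request before hashing.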
Chaos testing
Inject latency, drop chunks, corrupt payloads, and disconnect mid-stream to harden your error handling.
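To see why chunk-dropping matters for clients, consider a sketch of one chaos mode (the drop rate and wrapper are illustrative; aimock's real chaos behavior is configured, not coded by the user):

```typescript
// Sketch: randomly drop stream chunks at a given rate, the kind of fault a
// chaos mode injects. The rng parameter makes the behavior testable.
function withChaos(
  chunks: string[],
  dropRate: number,
  rng: () => number = Math.random,
): string[] {
  return chunks.filter((_, i) => {
    // Keep the final chunk so clients see a (possibly corrupt) stream end.
    if (i === chunks.length - 1) return true;
    return rng() >= dropRate;
  });
}
```

A client that reassembles deltas without checking `finish_reason` or token continuity will silently accept the truncated output; this is exactly the failure mode the feature exists to surface.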
Comparison table
| Capability | piyook/llm-mock | aimock |
|---|---|---|
| OpenAI Chat Completions | ✓ | ✓ |
| OpenAI Responses API | ✗ | ✓ |
| Anthropic Claude | ✗ | ✓ |
| Google Gemini | ✗ | ✓ |
| AWS Bedrock / Azure / Vertex AI / Ollama / Cohere | ✗ | ✓ |
| Streaming SSE | ✗ | Built-in (TTFT, TPS, jitter) |
| WebSocket protocols | ✗ | 3 protocols |
| Structured output / JSON mode | ✗ | ✓ |
| Sequential responses | ✗ | ✓ |
| MCP / A2A / Vector mocking | ✗ | ✓ |
| Record & replay | ✗ | ✓ |
| Chaos testing | ✗ | ✓ |
| Drift detection | ✗ | Automated CI |
| Prometheus metrics | ✗ | ✓ |
| Programmatic API | ✗ | ✓ (TypeScript/JS) |
| Request journal | ✗ | ✓ |
| Auto envelope generation | ✗ (manual JSON) | ✓ |
| Docker image | ✓ | ✓ |
| Zero dependencies | ✗ | ✓ |
CLI / Docker quick start
```sh
# Run the mock server
npx aimock -p 4010 -f ./fixtures

# With a full config file
npx aimock --config aimock.json --port 4010

# Point your app at the mock
export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=mock-key
```

Docker:

```sh
# Pull and run
docker run -d -p 4010:4010 \
  -v $(pwd)/fixtures:/fixtures \
  ghcr.io/copilotkit/aimock:latest \
  -p 4010 -f /fixtures

# With a config file
docker run -d -p 4010:4010 \
  -v $(pwd)/aimock.json:/app/aimock.json \
  -v $(pwd)/fixtures:/app/fixtures \
  ghcr.io/copilotkit/aimock \
  aimock --config /app/aimock.json --host 0.0.0.0
```