Switching from piyook/llm-mock to aimock
piyook/llm-mock handles basic OpenAI mocking with static JSON templates. aimock is a superset: it adds streaming, multi-provider support, WebSocket protocols, structured output, sequential responses, and mocks for the surrounding AI stack (MCP, A2A, vector databases).
Fixture format comparison
piyook/llm-mock requires you to build the full OpenAI response envelope by hand. aimock uses a declarative match+response format—you specify the content, and aimock auto-builds the response envelope with proper IDs, timestamps, token counts, and SSE framing.
A piyook/llm-mock fixture is the full envelope, written by hand:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-4",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello there"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
```
The equivalent aimock fixture:

```jsonc
{
  "match": { "userMessage": "hello" },
  "response": { "content": "Hello there" }
}

// aimock auto-generates:
// - id, object, created, model
// - choices[].index, finish_reason
// - usage (prompt_tokens, completion_tokens)
// - SSE streaming chunks (when stream: true)
```
With piyook/llm-mock, you maintain one template per response scenario and the server returns it verbatim. With aimock, you write the minimum—just the content and an optional match rule—and the server handles envelope generation, streaming, and provider-specific formatting.
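The envelope expansion can be pictured with a small sketch. This is a hypothetical helper, not aimock's actual code; the field names follow the OpenAI Chat Completions envelope shown above, and the token counts are naive length-based estimates:

```typescript
// Sketch: expand a minimal {content} fixture into a full OpenAI-style
// envelope, the way aimock does automatically. Hypothetical helper only.
function buildEnvelope(content: string, model = "gpt-4") {
  const estimateTokens = (text: string) =>
    Math.max(1, Math.ceil(text.length / 4)); // rough 4-chars-per-token guess

  return {
    id: `chatcmpl-${Math.random().toString(36).slice(2, 10)}`,
    object: "chat.completion",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [
      {
        index: 0,
        message: { role: "assistant", content },
        finish_reason: "stop",
      },
    ],
    usage: {
      prompt_tokens: 0, // a real mock would estimate this from the request
      completion_tokens: estimateTokens(content),
      total_tokens: estimateTokens(content),
    },
  };
}

const envelope = buildEnvelope("Hello there");
```

The point is what the fixture author no longer writes: everything outside `response.content` is derived.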
What you gain
Streaming SSE
Built-in Server-Sent Events for all providers. No manual chunk construction—the same fixture works for streaming and non-streaming requests.
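The framing a streaming mock has to produce can be sketched like this. It follows the public OpenAI `chat.completion.chunk` wire format, but the helper itself is hypothetical, not aimock's implementation:

```typescript
// Sketch: split content into delta chunks and frame them as SSE lines in
// the OpenAI chat.completion.chunk format. Hypothetical helper only.
function toSseFrames(content: string, chunkSize = 4): string[] {
  const frames: string[] = [];
  for (let i = 0; i < content.length; i += chunkSize) {
    const body = {
      object: "chat.completion.chunk",
      choices: [
        {
          index: 0,
          delta: { content: content.slice(i, i + chunkSize) },
          finish_reason: null,
        },
      ],
    };
    frames.push(`data: ${JSON.stringify(body)}`);
  }
  // Final chunk carries finish_reason, then the stream terminator.
  frames.push(
    `data: ${JSON.stringify({
      object: "chat.completion.chunk",
      choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
    })}`,
  );
  frames.push("data: [DONE]");
  return frames;
}
```

With piyook/llm-mock you would hand-author every one of those `data:` lines per scenario; here they fall out of the same `content` string the non-streaming path uses.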
10+ providers
OpenAI, Claude, Gemini, Bedrock, Azure, Vertex AI, Ollama, Cohere, and any OpenAI-compatible endpoint.
WebSocket APIs
OpenAI Realtime, Responses WS, Gemini Live. Full bidirectional protocol mocking out of the box.
Structured output
JSON mode and response_format matching. Return typed JSON that validates
against your schema.
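The matching side can be pictured with a sketch. The request shape follows OpenAI's `response_format` field; the fixture keys and matcher logic here are illustrative, not aimock's exact fixture schema:

```typescript
// Sketch: pick a fixture based on whether the request asks for JSON mode.
// Fixture shape and matcher are hypothetical.
type Fixture = { match: { jsonMode?: boolean }; response: { content: string } };

function pickFixture(
  request: { response_format?: { type: string } },
  fixtures: Fixture[],
): Fixture | undefined {
  const wantsJson = request.response_format?.type === "json_object";
  return fixtures.find((f) => (f.match.jsonMode ?? false) === wantsJson);
}

const fixtures: Fixture[] = [
  { match: { jsonMode: true }, response: { content: '{"city":"Oslo","temp_c":4}' } },
  { match: {}, response: { content: "It is 4°C in Oslo." } },
];
```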
Sequential responses
Return different responses on successive calls. Model multi-turn conversations, retry scenarios, and degradation patterns.
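The mechanism is simple to picture: a per-fixture call counter. A minimal sketch (hypothetical helper; aimock configures this declaratively rather than in code), modeling two rate-limit failures followed by a success for a retry test:

```typescript
// Sketch: return a different response on each successive matching call,
// clamping at the last response once the sequence is exhausted.
function sequential<T>(responses: T[]): () => T {
  let call = 0;
  return () => {
    const r = responses[Math.min(call, responses.length - 1)];
    call += 1;
    return r;
  };
}

const next = sequential(["rate_limited", "rate_limited", "Hello there"]);
```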
MCP / A2A / Vector
Mock MCP tool servers, A2A agent endpoints, and vector database APIs alongside LLM mocks on one port.
Record & replay
Proxy real APIs, capture responses as fixtures, replay deterministically in CI. No manual fixture authoring.
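Deterministic replay comes down to keying captured responses by the request itself. A sketch of that idea (hypothetical; aimock's captured-fixture format may differ):

```typescript
// Sketch: key a captured response by a hash of the request so the same
// request always replays the same response in CI.
import { createHash } from "node:crypto";

const recordings = new Map<string, string>();

function keyFor(request: unknown): string {
  return createHash("sha256").update(JSON.stringify(request)).digest("hex");
}

function record(request: unknown, response: string): void {
  recordings.set(keyFor(request), response);
}

function replay(request: unknown): string | undefined {
  return recordings.get(keyFor(request));
}
```

Note that JSON.stringify is order-sensitive, so a real implementation would canonicalize the request before hashing.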
Chaos testing
Inject latency, drop chunks, corrupt payloads, and disconnect mid-stream to harden your error handling.
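To see why chunk-dropping matters for clients, consider a sketch of one chaos mode (the drop rate and wrapper are illustrative; aimock's real chaos behavior is configured, not coded by the user):

```typescript
// Sketch: randomly drop stream chunks at a given rate, the kind of fault a
// chaos mode injects. The rng parameter makes the behavior testable.
function withChaos(
  chunks: string[],
  dropRate: number,
  rng: () => number = Math.random,
): string[] {
  return chunks.filter((_, i) => {
    // Keep the final chunk so clients see a (possibly corrupt) stream end.
    if (i === chunks.length - 1) return true;
    return rng() >= dropRate;
  });
}
```

A client that reassembles deltas without checking `finish_reason` or token continuity will silently accept the truncated output; this is exactly the failure mode the feature exists to surface.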
Comparison table
| Capability | piyook/llm-mock | aimock |
|---|---|---|
| OpenAI Chat Completions | ✓ | ✓ |
| OpenAI Responses API | ✗ | ✓ |
| Anthropic Claude | ✗ | ✓ |
| Google Gemini | ✗ | ✓ |
| AWS Bedrock / Azure / Vertex AI / Ollama / Cohere | ✗ | ✓ |
| Streaming SSE | ✗ | Built-in (TTFT, TPS, jitter) |
| WebSocket protocols | ✗ | 3 protocols |
| Structured output / JSON mode | ✗ | ✓ |
| Sequential responses | ✗ | ✓ |
| MCP / A2A / Vector mocking | ✗ | ✓ |
| Record & replay | ✗ | ✓ |
| Chaos testing | ✗ | ✓ |
| Drift detection | ✗ | Automated CI |
| Prometheus metrics | ✗ | ✓ |
| Programmatic API | ✗ | ✓ (TypeScript/JS) |
| Request journal | ✗ | ✓ |
| Auto envelope generation | ✗ (manual JSON) | ✓ |
| Docker image | ✓ | ✓ |
| Zero dependencies | ✗ | ✓ |
CLI / Docker quick start
```sh
# Run the mock server
npx aimock -p 4010 -f ./fixtures

# With a full config file
npx aimock --config aimock.json --port 4010

# Point your app at the mock
export OPENAI_BASE_URL=http://localhost:4010/v1
export OPENAI_API_KEY=mock-key
```

Docker:

```sh
# Pull and run
docker run -d -p 4010:4010 \
  -v $(pwd)/fixtures:/fixtures \
  ghcr.io/copilotkit/aimock:latest \
  -p 4010 -f /fixtures

# With a config file
docker run -d -p 4010:4010 \
  -v $(pwd)/aimock.json:/app/aimock.json \
  -v $(pwd)/fixtures:/app/fixtures \
  ghcr.io/copilotkit/aimock \
  aimock --config /app/aimock.json --host 0.0.0.0
```