Chaos Testing

aimock provides probabilistic failure injection to test how your application handles unreliable LLM APIs. Three failure modes can be configured at the server, fixture, or per-request level.

Failure Modes

Mode | Action | Description
drop | HTTP 500 | Returns a 500 error with {"error":{"message":"Chaos: request dropped","type":"server_error","code":"chaos_drop"}}
malformed | Broken JSON | Returns HTTP 200 with an invalid JSON body: {malformed json: <<<chaos>>>
disconnect | Connection destroyed | Destroys the TCP connection immediately with no response
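
On the client side, each mode shows up as a different symptom. The helper below is a minimal sketch of how a test might tell them apart. `classifyFailure` and the `Observed` shape are hypothetical names invented for this illustration, not part of aimock; only the status codes and the `chaos_drop` error code come from the table above.

```typescript
// Hypothetical client-side helper; not part of the aimock API.
type ChaosSymptom = "drop" | "malformed" | "disconnect" | "none";

interface Observed {
  status?: number;        // HTTP status; undefined when no response arrived
  bodyText?: string;      // raw response body, if any
  networkError?: boolean; // true when the request failed without a response
}

function classifyFailure(o: Observed): ChaosSymptom {
  // No response at all: consistent with a destroyed connection.
  if (o.networkError || o.status === undefined) return "disconnect";

  if (o.status === 500) {
    try {
      const parsed = JSON.parse(o.bodyText ?? "");
      if (parsed?.error?.code === "chaos_drop") return "drop";
    } catch {
      // A 500 with a non-JSON body is not the documented drop response.
    }
    return "none";
  }

  if (o.status === 200) {
    try {
      JSON.parse(o.bodyText ?? "");
      return "none";
    } catch {
      // HTTP 200 with an unparseable (or missing) body.
      return "malformed";
    }
  }
  return "none";
}
```

A real client cannot tell a chaos disconnect from an ordinary network failure, so the "disconnect" result means "looks like one", nothing stronger.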

Precedence

Chaos configuration is resolved with a three-level precedence hierarchy. Higher levels override lower ones:

  1. Per-request headers (highest) — override everything
  2. Fixture-level config — overrides server defaults
  3. Server-level defaults (lowest)

Within a single level, modes are evaluated in order: drop, malformed, disconnect. The first mode that triggers (based on its probability) wins.
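
The resolution and evaluation order described above can be sketched roughly as follows. This is an illustration rather than aimock's implementation, and it rests on two assumptions the text does not confirm: that a higher level overrides a lower one field by field, and that each mode gets its own independent random roll. `resolveChaos` and `rollChaos` are hypothetical names.

```typescript
type ChaosAction = "drop" | "malformed" | "disconnect";

interface ChaosConfig {
  dropRate?: number;
  malformedRate?: number;
  disconnectRate?: number;
}

// Assumption: precedence is applied per field (headers > fixture > server).
function resolveChaos(
  server: ChaosConfig,
  fixture: ChaosConfig,
  headers: ChaosConfig,
): ChaosConfig {
  const pick = (k: keyof ChaosConfig) => headers[k] ?? fixture[k] ?? server[k];
  return {
    dropRate: pick("dropRate"),
    malformedRate: pick("malformedRate"),
    disconnectRate: pick("disconnectRate"),
  };
}

// Assumption: modes are rolled independently, in the documented order,
// and the first one whose roll falls under its rate wins.
function rollChaos(
  cfg: ChaosConfig,
  random: () => number = Math.random,
): ChaosAction | null {
  const ordered: Array<[ChaosAction, number | undefined]> = [
    ["drop", cfg.dropRate],
    ["malformed", cfg.malformedRate],
    ["disconnect", cfg.disconnectRate],
  ];
  for (const [action, rate] of ordered) {
    if (rate !== undefined && rate > 0 && random() < rate) return action;
  }
  return null; // no chaos for this request
}
```

Passing `random` as a parameter keeps the sketch deterministic in tests: a header value of 0 for one mode disables it even when the fixture or server sets a higher rate.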

Quick Start

chaos-quick-start.ts ts
import { LLMock } from "@copilotkit/aimock";

const mock = new LLMock();
mock.onMessage("hello", { content: "Hi!" });

// Each request has a 50% chance of being dropped with a 500
mock.setChaos({ dropRate: 0.5 });

await mock.start();

// Later, remove chaos
mock.clearChaos();
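
With a drop rate like this, the code under test is usually the client's retry logic. The wrapper below is a minimal sketch of such logic, written against a generic async function so it does not depend on any aimock call. `withRetry` is a hypothetical helper, and a production version would normally add backoff between attempts.

```typescript
// Hypothetical retry helper for illustration; not part of aimock.
async function withRetry<T>(
  attempt: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

Note that `fetch` resolves on an HTTP 500, so the `attempt` function has to throw on non-OK responses for dropped requests to count as failures.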

Programmatic API

Programmatic chaos control ts
// Set server-level chaos (returns `this` for chaining)
mock.setChaos({
  dropRate: 0.1,        // 10% drop rate
  malformedRate: 0.05,  // 5% malformed rate
  disconnectRate: 0.02, // 2% disconnect rate
});

// Remove all server-level chaos
mock.clearChaos();

Fixture-Level Chaos

Attach a chaos config to individual fixtures so only specific responses experience failures:

chaos-fixture.json json
{
  "fixtures": [
    {
      "match": { "userMessage": "unstable" },
      "response": { "content": "This might fail!" },
      "chaos": {
        "dropRate": 0.3,
        "malformedRate": 0.2,
        "disconnectRate": 0.1
      }
    },
    {
      "match": { "userMessage": "stable" },
      "response": { "content": "This always works." }
    }
  ]
}
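
Because modes are evaluated in order and the first to trigger wins, the configured rates are not the same as the observed share of each failure. The sketch below works out the effective odds, assuming each mode is rolled independently (an assumption, as the exact rolling scheme is not specified here). `effectiveOdds` is a hypothetical helper.

```typescript
interface Rates {
  dropRate: number;
  malformedRate: number;
  disconnectRate: number;
}

// Assumes independent rolls evaluated in the order drop, malformed, disconnect.
function effectiveOdds(r: Rates) {
  const drop = r.dropRate;
  // malformed only gets a chance when drop did not trigger
  const malformed = (1 - r.dropRate) * r.malformedRate;
  // disconnect only gets a chance when neither earlier mode triggered
  const disconnect =
    (1 - r.dropRate) * (1 - r.malformedRate) * r.disconnectRate;
  const success = 1 - drop - malformed - disconnect;
  return { drop, malformed, disconnect, success };
}

// Rates from the "unstable" fixture above.
const odds = effectiveOdds({ dropRate: 0.3, malformedRate: 0.2, disconnectRate: 0.1 });
console.log(odds);
```

Under that assumption the "unstable" fixture drops about 30% of requests, returns malformed JSON for about 14%, disconnects about 5.6%, and answers normally for roughly half.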

Per-Request Headers

Override chaos rates on individual requests using HTTP headers. Values are floats between 0 and 1:

Header | Controls
x-aimock-chaos-drop | Drop rate (0–1)
x-aimock-chaos-malformed | Malformed rate (0–1)
x-aimock-chaos-disconnect | Disconnect rate (0–1)

Per-request chaos via headers ts
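// Force 100% disconnect on this specific request.
// The connection is destroyed before any response is sent, so fetch rejects.
try {
  await fetch(`${mock.url}/v1/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-aimock-chaos-disconnect": "1.0",
    },
    body: JSON.stringify({ model: "gpt-4", messages: [{ role: "user", content: "hello" }] }),
  });
} catch (err) {
  console.error("Request failed as expected:", err);
}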
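
When many tests set these headers, a small builder keeps the header names in one place and rejects out-of-range values early. This is a sketch only: `chaosHeaders` and `RequestChaos` are hypothetical names, while the header names themselves are the ones listed above.

```typescript
// Hypothetical helper; only the header names come from the table above.
interface RequestChaos {
  drop?: number;
  malformed?: number;
  disconnect?: number;
}

function chaosHeaders(c: RequestChaos): Record<string, string> {
  const modes = ["drop", "malformed", "disconnect"] as const;
  const headers: Record<string, string> = {};
  for (const mode of modes) {
    const rate = c[mode];
    if (rate === undefined) continue;
    // Also rejects NaN, which fails both comparisons.
    if (!(rate >= 0 && rate <= 1)) {
      throw new RangeError(`chaos ${mode} rate must be between 0 and 1, got ${rate}`);
    }
    headers[`x-aimock-chaos-${mode}`] = String(rate);
  }
  return headers;
}
```

The result can be spread into a request's headers, for example `headers: { "Content-Type": "application/json", ...chaosHeaders({ disconnect: 1 }) }`.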

CLI Flags

Set server-level chaos from the command line:

CLI chaos flags shell
$ npx -p @copilotkit/aimock aimock --fixtures ./fixtures \
  --chaos-drop 0.1 \
  --chaos-malformed 0.05 \
  --chaos-disconnect 0.02

Docker chaos flags shell
$ docker run -d -p 4010:4010 \
  -v ./fixtures:/fixtures \
  ghcr.io/copilotkit/aimock \
  -f /fixtures -h 0.0.0.0 \
  --chaos-drop 0.1 \
  --chaos-malformed 0.05 \
  --chaos-disconnect 0.02

Proxy Mode

When aimock is configured as a record/replay proxy (--record), chaos applies to proxied requests too — so a staging environment pointed at real upstream APIs still sees the failure modes your tests expect. Chaos is rolled once per request, after fixture matching, with the same headers > fixture > server precedence.

Mode | When upstream is contacted | What the client sees
drop | Never (upstream not contacted) | HTTP 500 chaos body; upstream is not called
disconnect | Never (upstream not contacted) | Connection destroyed; upstream is not called
malformed | Called (post-response) | Request proxies normally; the upstream response is captured, then the body is replaced with invalid JSON before relay. The recorded fixture (if recording) keeps the real upstream response: chaos is a live-traffic decoration, not a fixture mutation.

SSE bypass. If upstream returns Content-Type: text/event-stream, aimock streams chunks to the client progressively. By the time malformed would fire, the bytes are already on the wire — the chaos action cannot be applied. This bypass is observable via the aimock_chaos_bypassed_total counter (see Prometheus Metrics below) and a warning in the server log, so a configured chaos rate doesn't silently drop to 0% on SSE traffic. Streaming mutation is planned for a future phase.
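
The proxy-mode rules can be summarised as a small decision function. This is a sketch of the documented behaviour, not aimock's code; `proxyOutcome` and `ProxyOutcome` are hypothetical names, and matching the content type by prefix is an assumption.

```typescript
type ProxyChaosAction = "drop" | "malformed" | "disconnect";

interface ProxyOutcome {
  upstreamContacted: boolean;
  applied: boolean;               // false when the rolled action was bypassed
  bypassReason?: "sse_streamed";
}

function proxyOutcome(
  action: ProxyChaosAction,
  upstreamContentType?: string,
): ProxyOutcome {
  // drop and disconnect fire before the upstream call is made.
  if (action === "drop" || action === "disconnect") {
    return { upstreamContacted: false, applied: true };
  }
  // malformed fires after the upstream response has been received.
  // Assumption: SSE is detected by a content-type prefix match.
  const isSse = (upstreamContentType ?? "")
    .toLowerCase()
    .startsWith("text/event-stream");
  return isSse
    ? { upstreamContacted: true, applied: false, bypassReason: "sse_streamed" }
    : { upstreamContacted: true, applied: true };
}
```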

Journal Tracking

When chaos triggers, the journal entry includes a chaosAction field recording which failure mode was applied:

Journal entry with chaos json
{
  "method": "POST",
  "path": "/v1/chat/completions",
  "response": {
    "status": 500,
    "source": "fixture",
    "fixture": { "...": "elided for brevity" },
    "chaosAction": "drop"
  }
}

The chaosAction values are "drop", "malformed", or "disconnect". The status codes are 500 for drop, 200 for malformed, and 0 for disconnect (connection destroyed).
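
A test can use these fields to check that chaos fired as often as expected. The sketch below tallies journal entries by `chaosAction` and verifies the status code that goes with each action. The `JournalEntry` type covers only the fields shown above, `countChaos` is a hypothetical helper, and how the entries are obtained from the mock is left out because it is not covered here.

```typescript
type JournalChaosAction = "drop" | "malformed" | "disconnect";

// Only the fields shown in the example entry; real entries may carry more.
interface JournalEntry {
  method: string;
  path: string;
  response: {
    status: number;
    source?: string;
    chaosAction?: JournalChaosAction;
  };
}

// Status codes documented for each chaos action.
const expectedStatus: Record<JournalChaosAction, number> = {
  drop: 500,
  malformed: 200,
  disconnect: 0,
};

function countChaos(entries: JournalEntry[]): Record<JournalChaosAction, number> {
  const counts: Record<JournalChaosAction, number> = { drop: 0, malformed: 0, disconnect: 0 };
  for (const entry of entries) {
    const action = entry.response.chaosAction;
    if (!action) continue; // request was served without chaos
    if (entry.response.status !== expectedStatus[action]) {
      throw new Error(`unexpected status ${entry.response.status} for ${action}`);
    }
    counts[action] += 1;
  }
  return counts;
}
```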

Prometheus Metrics

When metrics are enabled (--metrics), each chaos trigger increments the aimock_chaos_triggered_total counter, tagged with action and source. source="fixture" means a fixture matched (or would have, before chaos intervened); source="proxy" means the request was on the proxy dispatch path.

Metrics output text
# TYPE aimock_chaos_triggered_total counter
aimock_chaos_triggered_total{action="drop",source="fixture"} 3
aimock_chaos_triggered_total{action="malformed",source="fixture"} 1
aimock_chaos_triggered_total{action="disconnect",source="proxy"} 2

When a chaos action is rolled but can't be applied — today, only malformed on an SSE proxy response — the bypass is recorded in a separate counter so operators can distinguish "chaos didn't roll" from "chaos rolled but was bypassed":

Bypass counter text
# TYPE aimock_chaos_bypassed_total counter
aimock_chaos_bypassed_total{action="malformed",source="proxy",reason="sse_streamed"} 4
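
To compare the two counters in a test or a script, the text exposition can be summed by metric name and label. The parser below is a deliberately small sketch that handles lines shaped like the samples above; `sumCounter` is a hypothetical helper, and its substring-based label matching is a simplification rather than a full Prometheus parser.

```typescript
// Sums all samples of one counter whose labels include the given pairs.
function sumCounter(
  metricsText: string,
  name: string,
  labels: Record<string, string> = {},
): number {
  let total = 0;
  for (const line of metricsText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue; // skip comments
    // name, optional {labels}, then the sample value
    const m = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/.exec(trimmed);
    if (!m || m[1] !== name) continue;
    const labelText = m[2] ?? "";
    // Simplification: plain substring match on key="value".
    const matches = Object.entries(labels).every(([k, v]) =>
      labelText.includes(`${k}="${v}"`),
    );
    if (matches) total += Number(m[3]);
  }
  return total;
}
```

For the sample output above, summing `aimock_chaos_triggered_total` with no label filter gives 6, and filtering `aimock_chaos_bypassed_total` by `action: "malformed"` gives 4.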