WebSocket APIs

aimock implements three WebSocket APIs with zero dependencies, using RFC 6455 framing built from scratch. The same fixtures drive both the HTTP and WebSocket transports.
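To make "RFC 6455 framing" concrete, here is a minimal sketch of the two pieces any from-scratch server needs: the handshake accept key and an unmasked server-to-client text frame. This is not aimock's actual source; the function names `acceptKey` and `encodeTextFrame` are hypothetical, and only Node's built-in `crypto` and `Buffer` are assumed.

```typescript
import { createHash } from "node:crypto";

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key.
// (Hypothetical helper name, for illustration only.)
function acceptKey(clientKey: string): string {
  return createHash("sha1").update(clientKey + WS_GUID).digest("base64");
}

// Encode an unmasked text frame (FIN=1, opcode=0x1), as a server sends it.
// (Hypothetical helper name, for illustration only.)
function encodeTextFrame(text: string): Buffer {
  const payload = Buffer.from(text, "utf8");
  const len = payload.length;
  let header: Buffer;
  if (len < 126) {
    // Length fits in the 7-bit field.
    header = Buffer.from([0x81, len]);
  } else if (len < 65536) {
    // 126 signals a 16-bit extended length.
    header = Buffer.alloc(4);
    header[0] = 0x81;
    header[1] = 126;
    header.writeUInt16BE(len, 2);
  } else {
    // 127 signals a 64-bit extended length.
    header = Buffer.alloc(10);
    header[0] = 0x81;
    header[1] = 127;
    header.writeBigUInt64BE(BigInt(len), 2);
  }
  return Buffer.concat([header, payload]);
}
```

Client-to-server frames additionally carry a 4-byte masking key that the server must XOR out of the payload; that path is omitted here for brevity.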

Endpoints

| Path | API | Protocol |
|---|---|---|
| /v1/responses | OpenAI Responses API | WebSocket JSON messages |
| /v1/realtime | OpenAI Realtime API | WebSocket JSON messages |
| /ws/google.ai.generativelanguage.* | Gemini Live | WebSocket JSON messages |

OpenAI Responses (WebSocket)

ws-responses.test.ts ts
const instance = await createServer([
  { match: { userMessage: "hello" }, response: { content: "Hi there!" } }
]);

const ws = await connectWebSocket(instance.url, "/v1/responses");

// Send a response.create message
ws.send(JSON.stringify({
  type: "response.create",
  model: "gpt-4",
  input: [{ role: "user", content: "hello" }],
}));

const messages = await ws.waitForMessages(9);
const events = messages.map(m => JSON.parse(m));
const types = events.map(e => e.type);

expect(types[0]).toBe("response.created");
expect(types).toContain("response.output_text.delta");
expect(types).toContain("response.completed");
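Beyond checking event types, a test will often want the full streamed text. The sketch below reassembles it from the delta events, assuming each `response.output_text.delta` event carries its text fragment in a `delta` field as in the OpenAI Responses API. The `ResponsesEvent` type and `collectOutputText` helper are hypothetical names, not part of aimock.

```typescript
// Minimal event shape for this sketch; real events carry more fields.
interface ResponsesEvent {
  type: string;
  delta?: string;
}

// Concatenate the `delta` fragments of all response.output_text.delta events,
// in arrival order. Events of other types are ignored.
function collectOutputText(events: ResponsesEvent[]): string {
  return events
    .filter(e => e.type === "response.output_text.delta")
    .map(e => e.delta ?? "")
    .join("");
}
```

In the test above, the parsed `events` array could be passed to this helper and the result compared against the fixture's `content`.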

OpenAI Realtime

The Realtime API uses a conversational protocol with session management. aimock implements the GA (General Availability) protocol natively — event names like response.output_text.delta, conversation.item.added, and nested audio session config are the defaults. The Beta protocol is supported via the OpenAI-Beta: realtime=v1 header, which activates a translation shim that converts GA events back to Beta names (response.text.delta, conversation.item.created, flat session config).

Supported Models

| Model | Session Types | Notes |
|---|---|---|
| gpt-realtime | conversation | Base alias that resolves to the latest GA model |
| gpt-realtime-2 | conversation | Default model, GA successor to gpt-4o-realtime-preview |
| gpt-realtime-1.5 | conversation | Previous generation GA model |
| gpt-realtime-mini | conversation | Smaller, faster GA model |
| gpt-4o-transcribe | transcription, translation | Speech transcription and translation |
| gpt-4o-mini-transcribe | transcription, translation | Smaller transcription and translation model |
| whisper-1 | transcription | Legacy Whisper transcription model |

Session Types

GA Protocol Features

Beta Compatibility

Clients that send the OpenAI-Beta: realtime=v1 header receive Beta-format events automatically. The shim translates event names, flattens the nested audio config, and suppresses GA-only events like conversation.item.done. Tests that target the Beta protocol need no code changes.
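The event-name part of such a shim can be pictured as a lookup table plus a drop list. The following is an illustrative sketch only, not aimock's implementation: it covers just the two renames and the one suppressed event named in this section, leaves out the session config flattening, and uses the hypothetical names `GA_TO_BETA`, `GA_ONLY`, and `toBetaEvent`.

```typescript
// GA -> Beta event names mentioned in this section; a real shim likely covers more.
const GA_TO_BETA: Record<string, string> = {
  "response.output_text.delta": "response.text.delta",
  "conversation.item.added": "conversation.item.created",
};

// GA-only events that Beta clients should never receive.
const GA_ONLY = new Set<string>(["conversation.item.done"]);

interface RealtimeEvent {
  type: string;
  [key: string]: unknown;
}

// Return the Beta-format event, or null when the event should be dropped.
// Events without a known Beta name pass through unchanged.
function toBetaEvent(event: RealtimeEvent): RealtimeEvent | null {
  if (GA_ONLY.has(event.type)) return null;
  const betaType = GA_TO_BETA[event.type];
  return betaType ? { ...event, type: betaType } : event;
}
```

Keeping the GA names as the internal representation and translating only at the edge means fixtures and handlers stay protocol-agnostic, which matches the "no code changes" behavior described above.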

ws-realtime.test.ts (GA protocol) ts
// `instance` is a server started with createServer(), as in the Responses example
const ws = await connectWebSocket(instance.url, "/v1/realtime?model=gpt-realtime-2");

// Server sends session.created on connect
const [sessionMsg] = await ws.waitForMessages(1);
const session = JSON.parse(sessionMsg);
expect(session.type).toBe("session.created");
expect(session.session.type).toBe("conversation");
expect(session.session.audio).toBeDefined();

// Configure session with nested audio config
ws.send(JSON.stringify({
  type: "session.update",
  session: {
    modalities: ["text"],
    audio: { voice: "alloy" }
  }
}));

// Add a user message (supports input_text + input_image content)
ws.send(JSON.stringify({
  type: "conversation.item.create",
  item: {
    type: "message",
    role: "user",
    content: [{ type: "input_text", text: "hello" }]
  }
}));

// Request a response
ws.send(JSON.stringify({ type: "response.create" }));

// GA events: output_text instead of text, item.added instead of item.created
const msgs = await ws.waitForMessages(10);
const events = msgs.map(m => JSON.parse(m));
expect(events.some(e => e.type === "response.output_text.delta")).toBe(true);
expect(events.some(e => e.type === "conversation.item.added")).toBe(true);
expect(events.some(e => e.type === "conversation.item.done")).toBe(true);

Gemini Live

aimock supports bidirectional streaming for the Google Gemini Live API.

ws-gemini-live.test.ts ts
// `instance` is a server started with createServer(), as in the Responses example
const ws = await connectWebSocket(
  instance.url,
  "/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent"
);

// Send setup message
ws.send(JSON.stringify({
  setup: { model: "models/gemini-2.0-flash-live" }
}));

// Send client content
ws.send(JSON.stringify({
  clientContent: {
    turns: [{ role: "user", parts: [{ text: "hello" }] }],
    turnComplete: true,
  }
}));
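The example above stops after sending client content. As a hedged sketch of the receiving side, the helper below gathers text from server messages until a turn completes. It assumes the message shapes from the published Gemini Live API (`setupComplete`, `serverContent.modelTurn.parts`, `turnComplete`); whether aimock emits exactly these fields is not confirmed by this page, and `GeminiServerMessage` and `collectTurnText` are hypothetical names.

```typescript
// Subset of the Gemini Live server message shape, assumed from the published API.
interface GeminiServerMessage {
  setupComplete?: object;
  serverContent?: {
    modelTurn?: { parts?: { text?: string }[] };
    turnComplete?: boolean;
  };
}

// Concatenate text parts across raw JSON messages, stopping at turnComplete.
// Messages without serverContent (such as setupComplete) are skipped.
function collectTurnText(rawMessages: string[]): { text: string; complete: boolean } {
  let text = "";
  let complete = false;
  for (const raw of rawMessages) {
    const msg = JSON.parse(raw) as GeminiServerMessage;
    for (const part of msg.serverContent?.modelTurn?.parts ?? []) {
      text += part.text ?? "";
    }
    if (msg.serverContent?.turnComplete) {
      complete = true;
      break;
    }
  }
  return { text, complete };
}
```

A test could pass the raw strings returned by `ws.waitForMessages(...)` to this helper and assert on the returned `text` and `complete` values.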

Implementation Details

Gemini Live text support is unverified: no text-capable Gemini Live model existed at the time of implementation. The WebSocket framing and protocol messages follow the published API spec.

Provider WebSocket Support

Not all LLM providers offer WebSocket APIs. Here's the current landscape:

| Provider | WebSocket API | aimock Status |
|---|---|---|
| OpenAI Realtime | wss://api.openai.com/v1/realtime | Supported ✓ |
| OpenAI Responses | wss://api.openai.com/v1/responses | Supported ✓ |
| Gemini Live | wss://...BidiGenerateContent | Implemented, awaiting text model |
| Anthropic Claude | None | N/A |
| Azure OpenAI | Uses OpenAI Realtime | Covered by OpenAI |
| Mistral / Groq / Cohere | None | N/A |
| AWS Bedrock | EventStream (not WebSocket) | N/A |

aimock includes drift canary tests that automatically detect when providers add new WebSocket capabilities. When a canary fires, it signals that aimock should be updated to support the new API.