WebSocket APIs

aimock implements three WebSocket APIs with zero external dependencies, using RFC 6455 framing built from scratch. The same fixtures drive both the HTTP and WebSocket transports.

Endpoints

Path	API	Protocol
/v1/responses	OpenAI Responses API	WebSocket JSON messages
/v1/realtime	OpenAI Realtime API	WebSocket JSON messages
/ws/google.ai.generativelanguage.*	Gemini Live	WebSocket JSON messages
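Every path in the table is reached through the RFC 6455 opening handshake, in which the server answers the client's Sec-WebSocket-Key header with a derived Sec-WebSocket-Accept value. A minimal sketch of that derivation, assuming Node.js and its built-in node:crypto module; the function name computeAcceptKey is illustrative and is not taken from aimock's source:

```typescript
import { createHash } from "node:crypto";

// Fixed GUID defined by RFC 6455, section 1.3.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Hypothetical helper: derive the Sec-WebSocket-Accept header value
// from the client's Sec-WebSocket-Key header value.
function computeAcceptKey(clientKey: string): string {
  return createHash("sha1")
    .update(clientKey + WEBSOCKET_GUID)
    .digest("base64");
}

// Key and expected result are the worked example from RFC 6455, section 1.3.
console.log(computeAcceptKey("dGhlIHNhbXBsZSBub25jZQ==")); // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```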

OpenAI Responses (WebSocket)

ws-responses.test.ts ts

const instance = await createServer([
  { match: { userMessage: "hello" }, response: { content: "Hi there!" } }
]);

const ws = await connectWebSocket(instance.url, "/v1/responses");

// Send a response.create message
ws.send(JSON.stringify({
  type: "response.create",
  model: "gpt-4",
  input: [{ role: "user", content: "hello" }],
}));

const messages = await ws.waitForMessages(9);
const events = messages.map(m => JSON.parse(m));
const types = events.map(e => e.type);

expect(types[0]).toBe("response.created");
expect(types).toContain("response.output_text.delta");
expect(types).toContain("response.completed");
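Beyond checking event types, a test often wants the full streamed text. A sketch of one way to rebuild it from the parsed events, assuming each response.output_text.delta event carries its chunk in a `delta` string field as in the OpenAI Responses API; the helper name collectOutputText and the sample events are illustrative and are not captured aimock output:

```typescript
// Minimal shape of the events this sketch cares about.
interface StreamEvent {
  type: string;
  delta?: string;
}

// Hypothetical helper: concatenate the text chunks of all
// response.output_text.delta events, in arrival order.
function collectOutputText(events: StreamEvent[]): string {
  return events
    .filter((e) => e.type === "response.output_text.delta" && typeof e.delta === "string")
    .map((e) => e.delta as string)
    .join("");
}

// Hand-written sample; real chunk boundaries depend on the server.
const sampleEvents: StreamEvent[] = [
  { type: "response.created" },
  { type: "response.output_text.delta", delta: "Hi " },
  { type: "response.output_text.delta", delta: "there!" },
  { type: "response.completed" },
];

console.log(collectOutputText(sampleEvents)); // prints "Hi there!"
```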

OpenAI Realtime

The Realtime API uses a conversational protocol with session management. aimock implements the GA (General Availability) protocol natively — event names like response.output_text.delta, conversation.item.added, and nested audio session config are the defaults. The Beta protocol is supported via the OpenAI-Beta: realtime=v1 header, which activates a translation shim that converts GA events back to Beta names (response.text.delta, conversation.item.created, flat session config).

Supported Models

Model	Session Types	Notes
gpt-realtime	conversation	Base alias — resolves to latest GA model
gpt-realtime-2	conversation	Default model — GA successor to gpt-4o-realtime-preview
gpt-realtime-1.5	conversation	Previous generation GA model
gpt-realtime-mini	conversation	Smaller, faster GA model
gpt-4o-transcribe	transcription, translation	Speech transcription and translation
gpt-4o-mini-transcribe	transcription, translation	Smaller transcription and translation model
whisper-1	transcription	Legacy Whisper transcription model

Session Types

conversation (default) — Standard conversational interaction with text and audio modalities
transcription — Audio-to-text transcription (requires gpt-4o-transcribe, gpt-4o-mini-transcribe, or whisper-1)
translation — Real-time speech translation (requires gpt-4o-transcribe or gpt-4o-mini-transcribe)
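The model requirements above amount to a small lookup. A sketch of how a test suite might encode them, using only the model names and session types listed in this section; SESSION_MODELS and supportsSessionType are hypothetical names and are not part of aimock's API:

```typescript
type SessionType = "conversation" | "transcription" | "translation";

// Models listed for each session type, copied from the tables above.
const SESSION_MODELS: Record<SessionType, readonly string[]> = {
  conversation: ["gpt-realtime", "gpt-realtime-2", "gpt-realtime-1.5", "gpt-realtime-mini"],
  transcription: ["gpt-4o-transcribe", "gpt-4o-mini-transcribe", "whisper-1"],
  translation: ["gpt-4o-transcribe", "gpt-4o-mini-transcribe"],
};

// Hypothetical helper: true when the model is listed for the session type.
function supportsSessionType(model: string, sessionType: SessionType): boolean {
  return SESSION_MODELS[sessionType].some((m) => m === model);
}

console.log(supportsSessionType("whisper-1", "transcription")); // true
console.log(supportsSessionType("whisper-1", "translation")); // false
```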

GA Protocol Features

GA event names — response.output_text.delta (was response.text.delta), conversation.item.added (was conversation.item.created), etc.
Nested audio config — Session config uses session.audio.voice instead of flat session.voice
Image input — input_image content parts in conversation.item.create
Commentary phase — phase field on response.output_item.added/done events (final_answer or commentary)
conversation.item.done — New event emitted after each completed response item
response.cancel — Client message to cancel in-flight responses
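Two of the client messages listed above can be sketched as plain objects. The image part below follows the shape published for the OpenAI Realtime GA API (an image_url field holding a data URI); whether aimock inspects that field is an assumption, and the builder names are illustrative:

```typescript
// Hypothetical builder: a user message carrying text plus an inline image.
function buildImageMessage(text: string, base64Png: string) {
  return {
    type: "conversation.item.create",
    item: {
      type: "message",
      role: "user",
      content: [
        { type: "input_text", text },
        // Assumed field name, taken from the OpenAI Realtime GA documentation.
        { type: "input_image", image_url: "data:image/png;base64," + base64Png },
      ],
    },
  };
}

// Hypothetical builder: ask the server to cancel the in-flight response.
function buildCancelMessage() {
  return { type: "response.cancel" };
}

// "iVBORw0KGgo=" is only the PNG signature, used here as a placeholder payload.
const imageMessage = buildImageMessage("What is in this picture?", "iVBORw0KGgo=");
console.log(imageMessage.item.content.length); // 2
console.log(JSON.stringify(buildCancelMessage())); // {"type":"response.cancel"}
```

Either object can then be sent with ws.send(JSON.stringify(...)), as in the test examples in this document.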

Beta Compatibility

Clients that send the OpenAI-Beta: realtime=v1 header receive Beta-format events automatically. The shim translates event names, flattens the nested audio config, and suppresses GA-only events such as conversation.item.done. No code changes are needed in tests that target the Beta protocol.
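The translation can be pictured as a small pure function. The sketch below covers only the two renames and the one suppressed event named in this section, plus flattening of the voice setting; the real shim presumably handles more, and toBetaEvent is a hypothetical name rather than aimock's implementation:

```typescript
interface RealtimeEvent {
  type: string;
  session?: { audio?: { voice?: string }; [key: string]: unknown };
  [key: string]: unknown;
}

// GA -> Beta renames mentioned in this section (not an exhaustive list).
const GA_TO_BETA: Record<string, string> = {
  "response.output_text.delta": "response.text.delta",
  "conversation.item.added": "conversation.item.created",
};

// GA-only events that Beta clients never see.
const GA_ONLY = ["conversation.item.done"];

// Hypothetical shim: returns the Beta-format event, or null when suppressed.
function toBetaEvent(event: RealtimeEvent): RealtimeEvent | null {
  if (GA_ONLY.indexOf(event.type) !== -1) return null;

  const translated: RealtimeEvent = { ...event, type: GA_TO_BETA[event.type] ?? event.type };

  // Flatten session.audio.voice into session.voice and drop the nested object.
  if (event.session) {
    const { audio, ...rest } = event.session;
    translated.session = audio?.voice !== undefined ? { ...rest, voice: audio.voice } : { ...rest };
  }
  return translated;
}

console.log(toBetaEvent({ type: "conversation.item.added" })?.type); // conversation.item.created
console.log(toBetaEvent({ type: "conversation.item.done" })); // null
```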

ws-realtime.test.ts (GA protocol) ts

// `instance` is the server returned by createServer(...), as in the Responses example
const ws = await connectWebSocket(instance.url, "/v1/realtime?model=gpt-realtime-2");

// Server sends session.created on connect
const [sessionMsg] = await ws.waitForMessages(1);
const session = JSON.parse(sessionMsg);
expect(session.type).toBe("session.created");
expect(session.session.type).toBe("conversation");
expect(session.session.audio).toBeDefined();

// Configure session with nested audio config
ws.send(JSON.stringify({
  type: "session.update",
  session: {
    modalities: ["text"],
    audio: { voice: "alloy" }
  }
}));

// Add a user message (supports input_text + input_image content)
ws.send(JSON.stringify({
  type: "conversation.item.create",
  item: {
    type: "message",
    role: "user",
    content: [{ type: "input_text", text: "hello" }]
  }
}));

// Request a response
ws.send(JSON.stringify({ type: "response.create" }));

// GA events: output_text instead of text, item.added instead of item.created
const msgs = await ws.waitForMessages(10);
const events = msgs.map(m => JSON.parse(m));
expect(events.some(e => e.type === "response.output_text.delta")).toBe(true);
expect(events.some(e => e.type === "conversation.item.added")).toBe(true);
expect(events.some(e => e.type === "conversation.item.done")).toBe(true);

Gemini Live

Bidirectional streaming for the Google Gemini Live API.

ws-gemini-live.test.ts ts

// `instance` is the server returned by createServer(...), as in the Responses example
const ws = await connectWebSocket(
  instance.url,
  "/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent"
);

// Send setup message
ws.send(JSON.stringify({
  setup: { model: "models/gemini-2.0-flash-live" }
}));

// Send client content
ws.send(JSON.stringify({
  clientContent: {
    turns: [{ role: "user", parts: [{ text: "hello" }] }],
    turnComplete: true,
  }
}));
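The example stops after sending. A sketch of how the replies might be read, assuming the server answers with the setupComplete and serverContent messages described in the published Gemini Live spec; the type and function names are illustrative, and the exact payloads aimock returns are not confirmed here:

```typescript
// Subset of the BidiGenerateContent server message shape used by this sketch.
interface GeminiServerMessage {
  setupComplete?: object;
  serverContent?: {
    modelTurn?: { parts?: { text?: string }[] };
    turnComplete?: boolean;
  };
}

// Hypothetical helper: join text parts until a message marks the turn complete.
function collectTurnText(rawMessages: string[]): { text: string; complete: boolean } {
  let text = "";
  for (const raw of rawMessages) {
    const msg: GeminiServerMessage = JSON.parse(raw);
    const parts = msg.serverContent?.modelTurn?.parts ?? [];
    for (const part of parts) {
      text += part.text ?? "";
    }
    if (msg.serverContent?.turnComplete) return { text, complete: true };
  }
  return { text, complete: false };
}

// Hand-written sample messages shaped after the published spec.
const sampleMessages = [
  JSON.stringify({ setupComplete: {} }),
  JSON.stringify({ serverContent: { modelTurn: { parts: [{ text: "Hi " }] } } }),
  JSON.stringify({ serverContent: { modelTurn: { parts: [{ text: "there!" }] }, turnComplete: true } }),
];

const turn = collectTurnText(sampleMessages);
console.log(turn.text, turn.complete); // Hi there! true
```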

Implementation Details

Built on raw RFC 6455 WebSocket framing — zero external dependencies
Text messages only (no binary/audio/video)
Same fixture matching as HTTP endpoints
All WebSocket connections are logged in the journal
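To make the first two points concrete, here is a sketch of RFC 6455 text framing: an encoder for an unmasked server-to-client frame and a decoder for a single client frame. It assumes a complete, unfragmented frame with a payload of at most 65,535 bytes and relies on the global TextEncoder and TextDecoder. The function names are hypothetical and this is not aimock's source:

```typescript
const OPCODE_TEXT = 0x1;

// Encode one unfragmented, unmasked text frame (server -> client).
function encodeTextFrame(text: string): Uint8Array {
  const payload = new TextEncoder().encode(text);
  if (payload.length > 0xffff) throw new Error("sketch handles payloads up to 65535 bytes");
  const extended = payload.length > 125;
  const header = new Uint8Array(extended ? 4 : 2);
  header[0] = 0x80 | OPCODE_TEXT; // FIN bit + text opcode
  if (extended) {
    header[1] = 126; // 126 means a 16-bit big-endian length follows
    header[2] = payload.length >> 8;
    header[3] = payload.length & 0xff;
  } else {
    header[1] = payload.length;
  }
  const frame = new Uint8Array(header.length + payload.length);
  frame.set(header, 0);
  frame.set(payload, header.length);
  return frame;
}

// Decode one complete text frame, unmasking it when the MASK bit is set
// (RFC 6455 requires client -> server frames to be masked).
function decodeTextFrame(frame: Uint8Array): string {
  if ((frame[0] & 0x0f) !== OPCODE_TEXT) throw new Error("not a text frame");
  const masked = (frame[1] & 0x80) !== 0;
  let length = frame[1] & 0x7f;
  let offset = 2;
  if (length === 126) {
    length = (frame[2] << 8) | frame[3];
    offset = 4;
  } else if (length === 127) {
    throw new Error("64-bit lengths are outside this sketch");
  }
  const payload = new Uint8Array(length);
  if (masked) {
    const mask = frame.subarray(offset, offset + 4);
    offset += 4;
    for (let i = 0; i < length; i++) payload[i] = frame[offset + i] ^ mask[i % 4];
  } else {
    payload.set(frame.subarray(offset, offset + length));
  }
  return new TextDecoder().decode(payload);
}

// Masked "Hello" frame from the examples in RFC 6455, section 5.7.
const maskedHello = new Uint8Array([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58]);
console.log(decodeTextFrame(maskedHello)); // prints "Hello"
```

A production implementation would also handle fragmentation, control frames (ping, pong, close), and 64-bit lengths, all of which this sketch leaves out.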

Gemini Live text support is unverified: no text-capable Gemini Live model existed at the time of implementation. The WebSocket framing and protocol messages follow the published API spec.

Provider WebSocket Support

Not all LLM providers offer WebSocket APIs. Here's the current landscape:

Provider	WebSocket API	aimock Status
OpenAI Realtime	wss://api.openai.com/v1/realtime	Supported ✓
OpenAI Responses	wss://api.openai.com/v1/responses	Supported ✓
Gemini Live	wss://...BidiGenerateContent	Implemented, awaiting text model
Anthropic Claude	None	N/A
Azure OpenAI	Uses OpenAI Realtime	Covered by OpenAI
Mistral / Groq / Cohere	None	N/A
AWS Bedrock	EventStream (not WebSocket)	N/A

aimock includes drift canary tests that automatically detect when providers add new WebSocket capabilities. When a canary fires, it signals that aimock should be updated to support the new API.