WebSocket APIs
aimock implements three WebSocket APIs with zero dependencies — real RFC 6455 framing built from scratch. The same fixtures drive HTTP and WebSocket transports.
Endpoints
| Path | API | Protocol |
|---|---|---|
| /v1/responses | OpenAI Responses API | WebSocket JSON messages |
| /v1/realtime | OpenAI Realtime API | WebSocket JSON messages |
| /ws/google.ai.generativelanguage.* | Gemini Live | WebSocket JSON messages |
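As a rough illustration of how the three paths in the table could be told apart, here is a minimal sketch. The `resolveApi` function and the `WsApi` labels are hypothetical names invented for this example and are not part of aimock; only the two exact paths and the Gemini prefix come from the table.

```typescript
// Hypothetical helper (not aimock's API): maps a request path to one of the
// three WebSocket APIs listed in the Endpoints table.
type WsApi = "openai-responses" | "openai-realtime" | "gemini-live";

const GEMINI_PREFIX = "/ws/google.ai.generativelanguage.";

function resolveApi(requestPath: string): WsApi | undefined {
  // Drop any query string, e.g. "/v1/realtime?model=gpt-realtime-2".
  const queryStart = requestPath.indexOf("?");
  const path = queryStart === -1 ? requestPath : requestPath.slice(0, queryStart);

  if (path === "/v1/responses") return "openai-responses";
  if (path === "/v1/realtime") return "openai-realtime";
  if (path.startsWith(GEMINI_PREFIX)) return "gemini-live";
  return undefined; // not one of the WebSocket endpoints in the table
}
```

The Gemini entry is matched by prefix because the table lists it with a trailing wildcard, while the two OpenAI paths are matched exactly.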
OpenAI Responses (WebSocket)
const instance = await createServer([
  { match: { userMessage: "hello" }, response: { content: "Hi there!" } }
]);
const ws = await connectWebSocket(instance.url, "/v1/responses");
// Send a response.create message
ws.send(JSON.stringify({
  type: "response.create",
  model: "gpt-4",
  input: [{ role: "user", content: "hello" }],
}));
const messages = await ws.waitForMessages(9);
const events = messages.map(m => JSON.parse(m));
const types = events.map(e => e.type);
expect(types[0]).toBe("response.created");
expect(types).toContain("response.output_text.delta");
expect(types).toContain("response.completed");
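Beyond checking event types, a test will often want the full reply text. The sketch below shows one way to reassemble it, assuming each `response.output_text.delta` event carries its fragment in a `delta` string field as in OpenAI's published event schema. The `collectOutputText` helper is a hypothetical name, and the sample events are hand-written rather than captured from aimock.

```typescript
// Minimal event shape for this sketch; real events carry more fields.
interface StreamEvent {
  type: string;
  delta?: string;
}

// Concatenate the fragments of all response.output_text.delta events, in order.
function collectOutputText(events: StreamEvent[]): string {
  return events
    .filter((e) => e.type === "response.output_text.delta" && typeof e.delta === "string")
    .map((e) => e.delta as string)
    .join("");
}

// Hand-written sample events (an assumption about how the fixture text is split).
const sampleEvents: StreamEvent[] = [
  { type: "response.created" },
  { type: "response.output_text.delta", delta: "Hi " },
  { type: "response.output_text.delta", delta: "there!" },
  { type: "response.completed" },
];

const fullText = collectOutputText(sampleEvents);
```

Comparing the joined text against the fixture's `content` keeps the assertion independent of how many delta events the server chooses to emit.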
OpenAI Realtime
The Realtime API uses a conversational protocol with session management. aimock implements the GA (General Availability) protocol natively: GA event names such as response.output_text.delta and conversation.item.added, along with the nested audio session config, are the defaults. The Beta protocol is supported via the OpenAI-Beta: realtime=v1 header, which activates a translation shim that converts GA events back to their Beta form (response.text.delta, conversation.item.created, flat session config).
Supported Models
| Model | Session Types | Notes |
|---|---|---|
| gpt-realtime | conversation | Base alias — resolves to latest GA model |
| gpt-realtime-2 | conversation | Default model — GA successor to gpt-4o-realtime-preview |
| gpt-realtime-1.5 | conversation | Previous generation GA model |
| gpt-realtime-mini | conversation | Smaller, faster GA model |
| gpt-4o-transcribe | transcription, translation | Speech transcription and translation |
| gpt-4o-mini-transcribe | transcription, translation | Smaller transcription and translation model |
| whisper-1 | transcription | Legacy Whisper transcription model |
Session Types
- conversation (default) — Standard conversational interaction with text and audio modalities
- transcription — Audio-to-text transcription (requires gpt-4o-transcribe, gpt-4o-mini-transcribe, or whisper-1)
- translation — Real-time speech translation (requires gpt-4o-transcribe or gpt-4o-mini-transcribe)
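The model and session-type pairing above can be written down as a small lookup. This is only a sketch: the `SESSION_TYPES_BY_MODEL` map is transcribed from the Supported Models table, and `supportsSessionType` is a hypothetical helper, not a function aimock exposes.

```typescript
type SessionType = "conversation" | "transcription" | "translation";

// Transcribed from the Supported Models table in this section.
const SESSION_TYPES_BY_MODEL: Record<string, SessionType[]> = {
  "gpt-realtime": ["conversation"],
  "gpt-realtime-2": ["conversation"],
  "gpt-realtime-1.5": ["conversation"],
  "gpt-realtime-mini": ["conversation"],
  "gpt-4o-transcribe": ["transcription", "translation"],
  "gpt-4o-mini-transcribe": ["transcription", "translation"],
  "whisper-1": ["transcription"],
};

// Hypothetical check: true when the table lists the session type for the model.
function supportsSessionType(model: string, sessionType: SessionType): boolean {
  const allowed = SESSION_TYPES_BY_MODEL[model];
  return allowed !== undefined && allowed.includes(sessionType);
}
```

For example, `supportsSessionType("whisper-1", "translation")` is false because the table lists whisper-1 for transcription only.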
GA Protocol Features
- GA event names — response.output_text.delta (was response.text.delta), conversation.item.added (was conversation.item.created), etc.
- Nested audio config — Session config uses session.audio.voice instead of flat session.voice
- Image input — input_image content parts in conversation.item.create
- Commentary phase — phase field on response.output_item.added/done events (final_answer or commentary)
- conversation.item.done — New event emitted after each completed response item
- response.cancel — Client message to cancel in-flight responses
Beta Compatibility
Clients that send the OpenAI-Beta: realtime=v1 header receive Beta-format events automatically. The shim translates event names, flattens the nested audio config, and suppresses GA-only events like conversation.item.done. No code changes are needed in tests that target the Beta protocol.
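The three things the shim is described as doing (renaming, flattening, suppressing) can be sketched roughly as follows. This is not aimock's implementation: `toBetaEvent` and the type names are hypothetical, and the rename map contains only the two pairs named in this section, whereas the real shim covers more events.

```typescript
// Only the two renames named in this section; the real shim handles more.
const GA_TO_BETA_EVENT_NAMES: Record<string, string> = {
  "response.output_text.delta": "response.text.delta",
  "conversation.item.added": "conversation.item.created",
};

// GA-only events that a Beta client should not receive.
const GA_ONLY_EVENTS = ["conversation.item.done"];

interface RealtimeSession {
  audio?: { voice?: string };
  voice?: string;
  [key: string]: unknown;
}

interface RealtimeEvent {
  type: string;
  session?: RealtimeSession;
  [key: string]: unknown;
}

// Returns the Beta-format event, or null when the event is suppressed.
function toBetaEvent(event: RealtimeEvent): RealtimeEvent | null {
  if (GA_ONLY_EVENTS.includes(event.type)) return null;

  const translated: RealtimeEvent = {
    ...event,
    type: GA_TO_BETA_EVENT_NAMES[event.type] ?? event.type,
  };

  const session = event.session;
  if (session && session.audio) {
    // Flatten session.audio.voice to session.voice and drop the nested object.
    const flat = { ...session, voice: session.audio.voice };
    delete flat.audio;
    translated.session = flat;
  }
  return translated;
}
```

Events without a Beta-specific name, such as session.created, pass through with their type unchanged, and the input event is never mutated.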
const ws = await connectWebSocket(instance.url, "/v1/realtime?model=gpt-realtime-2");
// Server sends session.created on connect
const [sessionMsg] = await ws.waitForMessages(1);
const session = JSON.parse(sessionMsg);
expect(session.type).toBe("session.created");
expect(session.session.type).toBe("conversation");
expect(session.session.audio).toBeDefined();
// Configure session with nested audio config
ws.send(JSON.stringify({
  type: "session.update",
  session: {
    modalities: ["text"],
    audio: { voice: "alloy" }
  }
}));
// Add a user message (supports input_text + input_image content)
ws.send(JSON.stringify({
  type: "conversation.item.create",
  item: {
    type: "message",
    role: "user",
    content: [{ type: "input_text", text: "hello" }]
  }
}));
// Request a response
ws.send(JSON.stringify({ type: "response.create" }));
// GA events: output_text instead of text, item.added instead of item.created
const msgs = await ws.waitForMessages(10);
const events = msgs.map(m => JSON.parse(m));
expect(events.some(e => e.type === "response.output_text.delta")).toBe(true);
expect(events.some(e => e.type === "conversation.item.added")).toBe(true);
expect(events.some(e => e.type === "conversation.item.done")).toBe(true);
Gemini Live
Bidirectional streaming for the Google Gemini Live API.
const ws = await connectWebSocket(
  instance.url,
  "/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent"
);
// Send setup message
ws.send(JSON.stringify({
  setup: { model: "models/gemini-2.0-flash-live" }
}));
// Send client content
ws.send(JSON.stringify({
  clientContent: {
    turns: [{ role: "user", parts: [{ text: "hello" }] }],
    turnComplete: true,
  }
}));
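The example above only sends messages. A test would also read the server's replies, which could be processed along these lines. The field names (`setupComplete`, `serverContent.modelTurn.parts[].text`, `serverContent.turnComplete`) follow Google's published Live API message schema; whether aimock emits exactly these fields is an assumption here, `collectTurnText` is a hypothetical helper, and the sample messages are hand-written.

```typescript
// Subset of the server message shape used by this sketch.
interface GeminiServerMessage {
  setupComplete?: object;
  serverContent?: {
    modelTurn?: { parts?: { text?: string }[] };
    turnComplete?: boolean;
  };
}

// Join the text parts of a model turn and report whether the turn finished.
function collectTurnText(messages: GeminiServerMessage[]): { text: string; complete: boolean } {
  let text = "";
  let complete = false;
  for (const message of messages) {
    const content = message.serverContent;
    if (!content) continue; // e.g. the setupComplete acknowledgement
    for (const part of content.modelTurn?.parts ?? []) {
      text += part.text ?? "";
    }
    if (content.turnComplete) complete = true;
  }
  return { text, complete };
}

// Hand-written sample messages, not captured aimock output.
const sampleMessages: GeminiServerMessage[] = [
  { setupComplete: {} },
  { serverContent: { modelTurn: { parts: [{ text: "Hi " }] } } },
  { serverContent: { modelTurn: { parts: [{ text: "there!" }] }, turnComplete: true } },
];

const turn = collectTurnText(sampleMessages);
```

In a real test the input would be the parsed JSON messages received on the socket rather than a hand-written array.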
Implementation Details
- Built on raw RFC 6455 WebSocket framing — zero external dependencies
- Text messages only (no binary/audio/video)
- Same fixture matching as HTTP endpoints
- All WebSocket connections are logged in the journal
Gemini Live text support is unverified — no text-capable Gemini Live model existed at time of implementation. The WebSocket framing and protocol messages follow the published API spec.
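To make the "raw RFC 6455 framing" point concrete, here is a minimal sketch of encoding and decoding a single text frame. It is not aimock's code, which is not shown in this document; the function names are hypothetical, and the sketch ignores fragmentation, control frames, and payloads of 4 GiB or more.

```typescript
// Encode a server-to-client text frame: FIN=1, opcode 0x1, unmasked.
function encodeTextFrame(text: string): Uint8Array {
  const payload = new TextEncoder().encode(text);
  const length = payload.length;
  const headerSize = length < 126 ? 2 : length < 65536 ? 4 : 10;
  const frame = new Uint8Array(headerSize + length);
  const view = new DataView(frame.buffer);
  view.setUint8(0, 0x81); // FIN bit plus text opcode
  if (headerSize === 2) {
    view.setUint8(1, length); // 7-bit length
  } else if (headerSize === 4) {
    view.setUint8(1, 126); // marker for a 16-bit length
    view.setUint16(2, length); // big-endian by default
  } else {
    view.setUint8(1, 127); // marker for a 64-bit length
    view.setUint32(2, 0); // high 32 bits, zero in this sketch
    view.setUint32(6, length);
  }
  frame.set(payload, headerSize);
  return frame;
}

// Decode one complete, unfragmented text frame. Client frames are masked.
function decodeTextFrame(frame: Uint8Array): string {
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  const opcode = view.getUint8(0) & 0x0f;
  if (opcode !== 0x1) throw new Error(`expected a text frame, got opcode ${opcode}`);
  const second = view.getUint8(1);
  const masked = (second & 0x80) !== 0;
  let length = second & 0x7f;
  let offset = 2;
  if (length === 126) {
    length = view.getUint16(2);
    offset = 4;
  } else if (length === 127) {
    if (view.getUint32(2) !== 0) throw new Error("payload too large for this sketch");
    length = view.getUint32(6);
    offset = 10;
  }
  let payload: Uint8Array;
  if (masked) {
    const key = frame.subarray(offset, offset + 4);
    offset += 4;
    payload = frame.slice(offset, offset + length); // copy so the input stays untouched
    for (let i = 0; i < payload.length; i++) payload[i] ^= key[i % 4];
  } else {
    payload = frame.subarray(offset, offset + length);
  }
  return new TextDecoder().decode(payload);
}
```

The asymmetry is deliberate: RFC 6455 requires clients to mask every frame they send and forbids servers from masking, so a mock server only needs to unmask on the way in and can write plain frames on the way out.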
Provider WebSocket Support
Not all LLM providers offer WebSocket APIs. Here's the current landscape:
| Provider | WebSocket API | aimock Status |
|---|---|---|
| OpenAI Realtime | wss://api.openai.com/v1/realtime | Supported ✓ |
| OpenAI Responses | wss://api.openai.com/v1/responses | Supported ✓ |
| Gemini Live | wss://...BidiGenerateContent | Implemented, awaiting text model |
| Anthropic Claude | None | N/A |
| Azure OpenAI | Uses OpenAI Realtime | Covered by OpenAI |
| Mistral / Groq / Cohere | None | N/A |
| AWS Bedrock | EventStream (not WebSocket) | N/A |
aimock includes drift canary tests that automatically detect when providers add new WebSocket capabilities. When a canary fires, it signals that aimock should be updated to support the new API.