Voice Apps
Build programmable voice applications with webhooks and real-time audio streaming.
Overview
Voice apps let you handle calls programmatically. DialStack notifies your server via webhook, and you decide what happens next. Voice apps support two modes:
Call Control — Your server takes ownership of the call. Connect bidirectional audio for AI voice assistants, transfer calls to extensions, or build IVR systems.
Call Listening — Stream real-time audio from calls without affecting them. Use this for live monitoring, real-time transcription, or analytics.
Both modes start with a webhook notification to your server. The webhook's event field tells you which mode triggered it.
Installation
Install the DialStack SDK for Node.js:
npm install @dialstack/sdk
Initialize the client with your API key:
import { DialStack } from '@dialstack/sdk/server';
const dialstack = new DialStack(process.env.DIALSTACK_API_KEY);
Creating a Voice App
SDK:
const voiceApp = await dialstack.voiceApps.create(
  { name: 'AI Receptionist', url: 'https://your-server.example.com/voice/webhook' },
  { dialstackAccount: 'acct_01h2xcejqtf2nbrexx3vqjhp41' }
);
cURL:
curl -X POST https://api.dialstack.ai/v1/voice-apps \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "DialStack-Account: acct_01h2xcejqtf2nbrexx3vqjhp41" \
-H "Content-Type: application/json" \
-d '{
"name": "AI Receptionist",
"url": "https://your-server.example.com/voice/webhook"
}'
Response:
{
"id": "va_01h2xcejqtf2nbrexx3vqjhp49",
"name": "AI Receptionist",
"url": "https://your-server.example.com/voice/webhook",
"status": "active",
"secret": "whsec_abc123def456...",
"created_at": "2025-10-18T10:00:00Z",
"updated_at": "2025-10-18T10:00:00Z"
}
Important: Save the secret value; you'll need it to verify webhook signatures.
Webhook Notifications
When a call reaches your voice app, DialStack sends an HTTP POST to your webhook URL. The same voice app can receive both event types — the event field tells you which one.
Webhook Events
| Event | Description | Trigger |
|---|---|---|
| call.received | A call has been routed to this voice app for handling | Voice app is the call destination (extension or dial plan) |
| call.notify | A call is passing through a Voice App (Notify) node in a dial plan | Voice App (Notify) node in a dial plan references this voice app |
For call.received, your server takes control of the call — use the Update Call API to attach audio, transfer, etc.
For call.notify, the call continues routing normally — use the Listeners API to stream audio if desired.
Webhook Payload
POST /voice/webhook HTTP/1.1
Host: your-server.example.com
Content-Type: application/json
X-DialStack-Signature: t=1697634600,v1=5257a869e7ecebeda32affa62cdca3fa51cad7e77a0e56ff536d0ce8e108d8bd
{
"event": "call.received",
"call_id": "call_01h2xcejqtf2nbrexx3vqjhp45",
"account_id": "acct_01h2xcejqtf2nbrexx3vqjhp41",
"voice_app_id": "va_01h2xcejqtf2nbrexx3vqjhp49",
"from_number": "+14155551234",
"from_name": "John Smith",
"to_number": "+14155559876"
}
Both call.received and call.notify use the same payload shape. The event field is the only difference.
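Because both events share one shape, a single validator can guard the handler before it branches on the event field. The sketch below is one possible way to do that. The name parseWebhookPayload is hypothetical (it is not part of the DialStack SDK), and treating from_name as optional is an assumption based on the examples on this page falling back to from_number.

```javascript
// Hypothetical helper (not part of the SDK): checks a parsed webhook body
// against the fields shown in the payload example above.
const KNOWN_EVENTS = new Set(['call.received', 'call.notify']);
const REQUIRED_FIELDS = ['call_id', 'account_id', 'voice_app_id', 'from_number', 'to_number'];

function parseWebhookPayload(body) {
  if (body === null || typeof body !== 'object') {
    throw new TypeError('payload must be a JSON object');
  }
  if (!KNOWN_EVENTS.has(body.event)) {
    throw new TypeError(`unexpected event: ${String(body.event)}`);
  }
  for (const field of REQUIRED_FIELDS) {
    if (typeof body[field] !== 'string' || body[field] === '') {
      throw new TypeError(`missing or invalid field: ${field}`);
    }
  }
  // Assumption: from_name may be absent when no caller name is known.
  return {
    event: body.event,
    callId: body.call_id,
    accountId: body.account_id,
    voiceAppId: body.voice_app_id,
    fromNumber: body.from_number,
    fromName: typeof body.from_name === 'string' ? body.from_name : null,
    toNumber: body.to_number,
    // Only call.received hands control of the call to your server.
    takesControl: body.event === 'call.received',
  };
}
```

Run the validator on the verified payload, then branch on takesControl instead of comparing event strings in several places.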
Verifying Signatures
Verify webhook signatures using the voice app's secret to ensure requests are from DialStack:
SDK:
const event = dialstack.webhooks.constructEvent(
  req.rawBody,
  req.headers['x-dialstack-signature'],
  process.env.VOICE_APP_SECRET
);
// event contains: event, call_id, account_id, voice_app_id, from_number, from_name, to_number
Manual:
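const crypto = require('crypto');
function verifySignature(payload, signature, secret) {
  // A missing header must fail verification, not throw
  if (typeof signature !== 'string') return false;
  // Header format: t=<unix seconds>,v1=<hex HMAC>
  const [tPart = '', v1Part = ''] = signature.split(',');
  const timestamp = tPart.split('=')[1];
  const expectedSig = v1Part.split('=')[1];
  if (!timestamp || !expectedSig) return false;
  // Reject old timestamps (replay protection)
  const age = Date.now() / 1000 - parseInt(timestamp, 10);
  if (Number.isNaN(age) || age > 300) {
    // 5 minutes
    return false;
  }
  // Compute expected signature
  const signedPayload = `${timestamp}.${payload}`;
  const computedSig = crypto.createHmac('sha256', secret).update(signedPayload).digest('hex');
  // Constant-time comparison; timingSafeEqual throws on unequal lengths, so check first
  const expected = Buffer.from(expectedSig);
  const computed = Buffer.from(computedSig);
  return expected.length === computed.length && crypto.timingSafeEqual(expected, computed);
}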
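To exercise a verifier locally, it can help to generate a header in the same t=...,v1=... format. The sketch below relies only on what the manual verification shows: an HMAC-SHA256 over the timestamp, a dot, and the raw payload, hex-encoded. The names signTestPayload and parseSignatureHeader are hypothetical, and the secret is a placeholder.

```javascript
const crypto = require('crypto');

// Hypothetical test helper: builds an X-DialStack-Signature value the same way
// the manual verifier recomputes it (HMAC-SHA256 of "<timestamp>.<payload>").
function signTestPayload(payload, secret, timestamp = Math.floor(Date.now() / 1000)) {
  const digest = crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${payload}`)
    .digest('hex');
  return `t=${timestamp},v1=${digest}`;
}

// Splits a header into its parts without verifying anything.
function parseSignatureHeader(header) {
  const parts = Object.fromEntries(
    header.split(',').map((pair) => {
      const idx = pair.indexOf('=');
      return [pair.slice(0, idx), pair.slice(idx + 1)];
    })
  );
  return { timestamp: Number(parts.t), signature: parts.v1 };
}

const body = JSON.stringify({ event: 'call.received', call_id: 'call_test' });
const header = signTestPayload(body, 'whsec_test_placeholder', 1697634600);
console.log(header.startsWith('t=1697634600,v1=')); // true
```

With the default timestamp, the generated header passes the five-minute age check, so it can drive an end-to-end test of a webhook route.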
Webhook Response
Return 200 OK to acknowledge receipt. The response body is ignored.
app.post('/voice/webhook', (req, res) => {
const { event, call_id } = req.body;
// Acknowledge immediately
res.sendStatus(200);
// Handle based on event type
if (event === 'call.received') {
handleCallControl(call_id);
} else if (event === 'call.notify') {
handleCallNotify(call_id);
}
});
Voice App Dial Plan Nodes
Voice apps can be used in dial plans in two modes, selected by the mode field on the voice_app node:
- Control mode — shipping today. Appears as the Voice App node in the editor.
- Notify mode — planned (see Coming soon below). In the target design, the editor will split these into two distinct palette entries (Voice App (Control) and Voice App (Notify)); today there is a single Voice App node that operates in control mode.
Voice App (Control)
Routes the call to the voice app. Your server receives a call.received webhook and takes ownership of the call — attaching audio, transferring, etc.
{
"id": "ai_receptionist",
"type": "voice_app",
"config": {
"voice_app_id": "va_01h2xcejqtf2nbrexx3vqjhp49",
"mode": "control",
"next": "voicemail"
}
}
Voice App (Notify)
Coming soon: Voice App (Notify) mode is still being implemented and is not yet available. The surface documented below reflects the target design; specifics may change before release.
Sends a fire-and-forget notification to the voice app as the call passes through, without interrupting call routing. This is useful for triggering external actions (real-time transcription, call analytics, CRM logging) alongside normal call handling.
┌──────────┐ ┌──────────────┐ ┌──────────┐ ┌─────────┐
│ Schedule │────▶│ Voice App │────▶│ Dial │────▶│Voicemail│
│ Node │ │ (Notify) │ │ User │ │ │
└──────────┘ └──────┬───────┘ └──────────┘ └─────────┘
│
│ POST (fire-and-forget)
▼
┌─────────────┐
│ Your Server │
└─────────────┘
The Voice App (Notify) node:
- Sends an HTTP POST to the voice app's URL with "event": "call.notify"
- Immediately continues to the next node; it does not wait for a response
- Does not answer or interrupt the call
- Uses the same signature verification as call.received webhooks
{
"id": "notify_transcription",
"type": "voice_app",
"config": {
"voice_app_id": "va_01h2xcejqtf2nbrexx3vqjhp49",
"mode": "notify",
"next": "dial_reception"
}
}
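When dial plans are generated in code, a small builder can keep the node shape and the mode value consistent. This is a minimal sketch, assuming the node layout shown in the two examples above. The name voiceAppNode is hypothetical, and because notify mode is still planned, nodes built with mode "notify" reflect the target design only.

```javascript
// Hypothetical builder (not an SDK function) for the voice_app node shape shown above.
const VOICE_APP_MODES = ['control', 'notify'];

function voiceAppNode({ id, voiceAppId, mode, next }) {
  if (!VOICE_APP_MODES.includes(mode)) {
    throw new RangeError(`mode must be one of: ${VOICE_APP_MODES.join(', ')}`);
  }
  return {
    id,
    type: 'voice_app',
    config: { voice_app_id: voiceAppId, mode, next },
  };
}

const node = voiceAppNode({
  id: 'ai_receptionist',
  voiceAppId: 'va_01h2xcejqtf2nbrexx3vqjhp49',
  mode: 'control',
  next: 'voicemail',
});
console.log(JSON.stringify(node, null, 2));
```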
Call Control
When your voice app receives a call.received webhook, your server takes ownership of the call and controls it via the Update Call API.
┌─────────┐ ┌───────────┐ ┌─────────────┐
│ Caller │ │ DialStack │ │ Your Server │
└────┬────┘ └─────┬─────┘ └──────┬──────┘
│ │ │
│ 1. Call arrives │ │
│────────────────────────▶│ │
│ │ │
│ │ 2. Webhook POST │
│ │─────────────────────────▶│
│ │ │
│ │ 3. POST /v1/calls/{id} │
│ │◀─────────────────────────│
│ │ (attach audio) │
│ │ │
│ 4. Bidirectional audio │ │
│◀───────────────────────▶│◀────────────────────────▶│
│ (WebSocket) │ (WebSocket) │
│ │ │
Actions
Use the Update Call API to send actions. Actions are processed sequentially.
Attach Audio Stream
Connect bidirectional audio to your WebSocket server (see WebSocket API for the message protocol):
SDK:
await dialstack.calls.update(
  callId,
  { actions: [{ type: 'attach', url: 'wss://your-server.example.com/voice/stream' }] },
  { dialstackAccount: accountId }
);
cURL:
curl -X POST https://api.dialstack.ai/v1/calls/call_01h2xcejqtf2nbrexx3vqjhp45 \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "DialStack-Account: acct_01h2xcejqtf2nbrexx3vqjhp41" \
-H "Content-Type: application/json" \
-d '{
"actions": [
{"type": "attach", "url": "wss://your-server.example.com/voice/stream"}
]
}'
The attach action blocks until the WebSocket disconnects, then processing continues with the next action.
Transfer to Extension
Transfer the caller to an extension:
SDK:
await dialstack.calls.update(
  callId,
  { actions: [{ type: 'transfer', extension: '100' }] },
  { dialstackAccount: accountId }
);
cURL:
curl -X POST https://api.dialstack.ai/v1/calls/call_01h2xcejqtf2nbrexx3vqjhp45 \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "DialStack-Account: acct_01h2xcejqtf2nbrexx3vqjhp41" \
-H "Content-Type: application/json" \
-d '{
"actions": [
{"type": "transfer", "extension": "100"}
]
}'
If the transfer target answers or the caller hangs up, processing stops. If the transfer fails (no answer, busy), processing continues with the next action.
Combining Actions
Chain actions for fallback behavior:
{
"actions": [
{ "type": "attach", "url": "wss://ai.example.com/voice" },
{ "type": "transfer", "extension": "100" }
]
}
This connects to your AI voice assistant first. When the WebSocket disconnects (e.g., AI hands off), the call transfers to extension 100.
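The sequencing rules can be easier to reason about as a small model. The sketch below is illustrative only and does not call DialStack. It encodes the behavior described in this section: an attach ends when the WebSocket disconnects and the next action runs, while a transfer stops processing if it is answered or the caller hangs up and falls through if it fails. The outcome labels and the second fallback transfer are hypothetical.

```javascript
// Illustrative model only: mirrors the sequencing rules described in this section.
// `outcomes` is a hypothetical list of what happens to each action, in order:
//   attach   -> 'disconnected'
//   transfer -> 'answered' | 'hangup' | 'no_answer' | 'busy'
function simulateActions(actions, outcomes) {
  const log = [];
  for (let i = 0; i < actions.length; i += 1) {
    const action = actions[i];
    const outcome = outcomes[i];
    log.push(`${action.type}:${outcome}`);
    if (action.type === 'transfer' && (outcome === 'answered' || outcome === 'hangup')) {
      break; // processing stops; later actions never run
    }
    // attach disconnects and failed transfers fall through to the next action
  }
  return log;
}

const actions = [
  { type: 'attach', url: 'wss://ai.example.com/voice' },
  { type: 'transfer', extension: '100' },
  { type: 'transfer', extension: '200' }, // hypothetical second fallback
];

console.log(simulateActions(actions, ['disconnected', 'busy', 'answered']).join(' -> '));
// attach:disconnected -> transfer:busy -> transfer:answered
console.log(simulateActions(actions, ['disconnected', 'answered', 'answered']).join(' -> '));
// attach:disconnected -> transfer:answered
```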
Replacing Actions
Sending a new update replaces all pending actions immediately. The current action is interrupted, and processing starts from the first action in the new list.
SDK:
// AI decides to transfer the call
await dialstack.calls.update(
  callId,
  { actions: [{ type: 'transfer', extension: '100' }] },
  { dialstackAccount: accountId }
);
fetch:
// AI decides to transfer the call
await fetch(`https://api.dialstack.ai/v1/calls/${callId}`, {
method: 'POST',
headers: {
Authorization: `Bearer ${apiKey}`,
'DialStack-Account': accountId,
'Content-Type': 'application/json',
},
body: JSON.stringify({
actions: [{ type: 'transfer', extension: '100' }],
}),
});
WebSocket Audio Streaming
When DialStack executes an attach action, it connects to your WebSocket URL and streams audio bidirectionally. For the complete protocol specification, see the WebSocket API reference.
Audio Format
| Property | Value |
|---|---|
| Encoding | μ-law (G.711) |
| Sample rate | 8000 Hz |
| Channels | 1 (mono) |
| Chunk size | ~20ms (160 bytes before base64) |
| Bandwidth | ~8 KB/second |
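The figures in the table follow from the format itself: μ-law stores one byte per sample, so 8000 samples per second is 8000 bytes per second, and 20 ms of that is 160 bytes. The short calculation below reproduces those numbers and also shows the base64 size of one chunk. Note that the ~8 KB/second figure describes raw audio; base64 alone adds roughly a third on the wire, before JSON framing.

```javascript
// Back-of-the-envelope figures for 8 kHz mono μ-law (1 byte per sample).
const SAMPLE_RATE_HZ = 8000;
const BYTES_PER_SAMPLE = 1; // G.711 μ-law encodes each sample in 8 bits
const CHUNK_MS = 20;

const bytesPerSecond = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE; // 8000
const bytesPerChunk = (bytesPerSecond * CHUNK_MS) / 1000; // 160
const chunksPerSecond = 1000 / CHUNK_MS; // 50
// Base64 emits 4 characters per 3 input bytes, padded up to a multiple of 4.
const base64CharsPerChunk = Math.ceil(bytesPerChunk / 3) * 4; // 216

console.log({ bytesPerSecond, bytesPerChunk, chunksPerSecond, base64CharsPerChunk });

// Sanity check against Node's own encoder.
console.log(Buffer.alloc(bytesPerChunk).toString('base64').length === base64CharsPerChunk); // true
```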
Messages from DialStack
Begin — Sent when connection is established:
{
"event": "begin",
"call_id": "call_01h2xcejqtf2nbrexx3vqjhp45",
"account_id": "acct_01h2xcejqtf2nbrexx3vqjhp41",
"audio_format": {
"encoding": "audio/x-mulaw",
"sample_rate": 8000,
"channels": 1
}
}
Audio — Caller's audio (sent continuously):
{
"event": "audio",
"timestamp": 1234,
"payload": "base64-encoded-mulaw-audio"
}
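Many speech pipelines expect linear PCM rather than μ-law, so the payload usually needs decoding after the base64 step. The sketch below uses the standard G.711 μ-law expansion. The function names are hypothetical, and whether your pipeline also needs resampling from 8 kHz is outside what this page specifies.

```javascript
// Standard G.711 μ-law expansion: one encoded byte -> one signed 16-bit sample.
function mulawByteToPcm16(byte) {
  const u = ~byte & 0xff; // μ-law bytes are stored bit-inverted
  const sign = u & 0x80;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Decodes the base64 `payload` of an audio message into PCM samples.
function decodeAudioPayload(payloadBase64) {
  const encoded = Buffer.from(payloadBase64, 'base64');
  const pcm = new Int16Array(encoded.length);
  for (let i = 0; i < encoded.length; i += 1) {
    pcm[i] = mulawByteToPcm16(encoded[i]);
  }
  return pcm;
}

// 0xFF is μ-law silence; 0x80 and 0x00 are the positive and negative extremes.
const sample = Buffer.from([0xff, 0x80, 0x00]).toString('base64');
console.log(Array.from(decodeAudioPayload(sample))); // [ 0, 32124, -32124 ]
```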
Messages to DialStack
Audio — Audio to play to the caller:
{
"event": "audio",
"payload": "base64-encoded-mulaw-audio"
}
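Outbound messages need only the event and payload fields. If your text-to-speech step produces one large μ-law buffer, one option is to split it into frames before sending. This is a sketch under an assumption: the 160-byte default simply mirrors the inbound chunk size, and this page does not state that DialStack requires any particular outbound frame size. The name buildAudioMessages is hypothetical.

```javascript
// Hypothetical helper: splits a μ-law Buffer into frames and wraps each one in
// the outbound message shape shown above. The 160-byte default mirrors the
// inbound chunk size; whether DialStack requires it is an assumption.
function buildAudioMessages(mulawBuffer, frameBytes = 160) {
  if (!Number.isInteger(frameBytes) || frameBytes <= 0) {
    throw new RangeError('frameBytes must be a positive integer');
  }
  const messages = [];
  for (let offset = 0; offset < mulawBuffer.length; offset += frameBytes) {
    const frame = mulawBuffer.subarray(offset, offset + frameBytes);
    messages.push(JSON.stringify({ event: 'audio', payload: frame.toString('base64') }));
  }
  return messages;
}

// 400 bytes = 50 ms of audio -> frames of 160, 160, and 80 bytes.
const messages = buildAudioMessages(Buffer.alloc(400, 0xff));
console.log(messages.length); // 3
```

Each returned string can be passed directly to ws.send on the attached WebSocket.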
Ending the Session
Either side can close the WebSocket to end the audio session. When closed, DialStack continues processing with the next action (if any).
Using MediaStream (SDK)
The SDK provides a MediaStream class that handles WebSocket message parsing and provides a clean event-based API:
import { MediaStream } from '@dialstack/sdk/server';
import { WebSocketServer } from 'ws';
const wss = new WebSocketServer({ port: 8080 });
wss.on('connection', (ws) => {
const stream = new MediaStream(ws);
stream.on('begin', (event) => {
console.log('Call started:', event.call_id);
console.log('Audio format:', event.audio_format);
// Send greeting audio
stream.sendAudio(greetingAudioBase64);
});
stream.on('audio', (event) => {
// event.payload contains base64-encoded μ-law audio
// event.timestamp contains the audio timestamp
// Process with your AI pipeline and respond
const responseAudio = processAudio(event.payload);
stream.sendAudio(responseAudio);
// Or send a raw Buffer instead (auto base64-encoded):
// stream.sendAudioBuffer(audioBuffer);
});
stream.on('close', (event) => {
console.log('Call ended:', event.code, event.reason);
});
stream.on('error', (event) => {
console.error('Stream error:', event.error);
});
});
Complete Example: AI Voice Assistant
Two versions follow: one using the SDK, then one using raw HTTP requests and WebSocket messages.
import express from 'express';
import { WebSocketServer } from 'ws';
import { DialStack, MediaStream } from '@dialstack/sdk/server';
const app = express();
app.use(
express.json({
verify: (req, res, buf) => {
req.rawBody = buf;
},
})
);
const dialstack = new DialStack(process.env.DIALSTACK_API_KEY);
const VOICE_APP_SECRET = process.env.VOICE_APP_SECRET;
// Webhook endpoint
app.post('/voice/webhook', async (req, res) => {
let event;
try {
event = dialstack.webhooks.constructEvent(
req.rawBody,
req.headers['x-dialstack-signature'],
VOICE_APP_SECRET
);
} catch (err) {
return res.sendStatus(401);
}
const { call_id, account_id, from_number, from_name } = event;
console.log(`Incoming call from ${from_name || from_number}`);
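res.sendStatus(200);
// Only call.received hands control of the call to this server
if (event.event !== 'call.received') return;
// Attach audio stream with fallback transfer
try {
await dialstack.calls.update(
call_id,
{
actions: [
{ type: 'attach', url: 'wss://your-server.example.com/voice/stream' },
{ type: 'transfer', extension: '100' },
],
},
{ dialstackAccount: account_id }
);
} catch (err) {
// The webhook was already acknowledged, so log instead of responding
console.error(`Failed to update call ${call_id}:`, err);
}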
});
// WebSocket server for audio streaming
const wss = new WebSocketServer({ noServer: true });
wss.on('connection', (ws) => {
const stream = new MediaStream(ws);
stream.on('begin', (event) => {
console.log(`Audio stream started for call ${stream.callId}`);
stream.sendAudio(generateGreetingAudio());
});
stream.on('audio', (event) => {
const audioBuffer = Buffer.from(event.payload, 'base64');
processAudioWithAI(audioBuffer, (responseAudio) => {
stream.sendAudio(responseAudio);
});
});
stream.on('close', () => {
console.log(`Audio stream ended for call ${stream.callId}`);
});
});
const server = app.listen(3000);
server.on('upgrade', (request, socket, head) => {
if (request.url === '/voice/stream') {
wss.handleUpgrade(request, socket, head, (ws) => {
wss.emit('connection', ws, request);
});
} else {
socket.destroy();
}
});
const express = require('express');
const WebSocket = require('ws');
const crypto = require('crypto');
const app = express();
app.use(
express.json({
verify: (req, res, buf) => {
req.rawBody = buf;
},
})
);
const API_KEY = process.env.DIALSTACK_API_KEY;
const VOICE_APP_SECRET = process.env.VOICE_APP_SECRET;
const ACCOUNT_ID = process.env.DIALSTACK_ACCOUNT_ID;
// Webhook endpoint
app.post('/voice/webhook', async (req, res) => {
const signature = req.headers['x-dialstack-signature'];
if (!verifySignature(req.rawBody.toString(), signature, VOICE_APP_SECRET)) {
return res.sendStatus(401);
}
const { call_id, from_number, from_name } = req.body;
console.log(`Incoming call from ${from_name || from_number}`);
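res.sendStatus(200);
try {
const response = await fetch(`https://api.dialstack.ai/v1/calls/${call_id}`, {
method: 'POST',
headers: {
Authorization: `Bearer ${API_KEY}`,
'DialStack-Account': ACCOUNT_ID,
'Content-Type': 'application/json',
},
body: JSON.stringify({
actions: [
{ type: 'attach', url: 'wss://your-server.example.com/voice/stream' },
{ type: 'transfer', extension: '100' },
],
}),
});
// fetch only rejects on network failure, so check the HTTP status too
if (!response.ok) console.error(`Update Call returned ${response.status}`);
} catch (err) {
// The webhook was already acknowledged, so log instead of responding
console.error(`Failed to update call ${call_id}:`, err);
}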
});
// WebSocket server for audio streaming
const wss = new WebSocket.Server({ noServer: true });
wss.on('connection', (ws) => {
let callId;
ws.on('message', (data) => {
const message = JSON.parse(data);
switch (message.event) {
case 'begin':
callId = message.call_id;
console.log(`Audio stream started for call ${callId}`);
ws.send(JSON.stringify({ event: 'audio', payload: generateGreetingAudio() }));
break;
case 'audio':
const audioBuffer = Buffer.from(message.payload, 'base64');
processAudioWithAI(audioBuffer, (responseAudio) => {
ws.send(JSON.stringify({ event: 'audio', payload: responseAudio }));
});
break;
}
});
ws.on('close', () => {
console.log(`Audio stream ended for call ${callId}`);
});
});
const server = app.listen(3000);
server.on('upgrade', (request, socket, head) => {
if (request.url === '/voice/stream') {
wss.handleUpgrade(request, socket, head, (ws) => {
wss.emit('connection', ws, request);
});
} else {
socket.destroy();
}
});
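function verifySignature(payload, signature, secret) {
  // A missing header must fail verification, not throw
  if (typeof signature !== 'string') return false;
  const [tPart = '', v1Part = ''] = signature.split(',');
  const timestamp = tPart.split('=')[1];
  const expectedSig = v1Part.split('=')[1];
  if (!timestamp || !expectedSig) return false;
  const age = Date.now() / 1000 - parseInt(timestamp, 10);
  if (Number.isNaN(age) || age > 300) return false;
  const signedPayload = `${timestamp}.${payload}`;
  const computedSig = crypto.createHmac('sha256', secret).update(signedPayload).digest('hex');
  // timingSafeEqual throws on unequal lengths, so compare lengths first
  const expected = Buffer.from(expectedSig);
  const computed = Buffer.from(computedSig);
  return expected.length === computed.length && crypto.timingSafeEqual(expected, computed);
}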
Listeners
When your voice app receives a call.notify webhook, you can create a listener to stream real-time audio from the call. Audio flows one way only — from DialStack to your server; the listener is passive and does not inject audio or alter the call. Neither party hears a tone or indication that a listener is attached, so you are responsible for obtaining appropriate consent from the parties on the call in accordance with applicable law (recording-consent requirements vary by jurisdiction).
┌────────┐ ┌───────────┐ ┌─────────────┐
│ Caller │ │ DialStack │ │ Your Server │
└───┬────┘ └─────┬─────┘ └──────┬──────┘
│ │ │
│ Normal two-party call │ │
│◀─────────────────────────▶│ │
│ │ │
│ │ 1. Webhook (call.notify) │
│ │──────────────────────────▶│
│ │ │
│ │ 2. POST /v1/calls/{id}/ │
│ │ listeners │
│ │◀──────────────────────────│
│ │ │
│ Call continues normally │ 3. Audio (one-way WSS) │
│◀─────────────────────────▶│──────────────────────────▶│
│ │ │
1. A Voice App (Notify) node notifies your server that a call has started
2. Your server creates a listener on that call
3. DialStack opens a WebSocket to your server and streams audio
Creating a Listener
SDK:
const listener = await dialstack.calls.createListener(
  callId,
  {
    url: 'wss://your-server.example.com/audio',
    channel: 'both',
    metadata: { agent_id: 'user_123', queue: 'support' },
  },
  { dialstackAccount: accountId }
);
cURL:
curl -X POST https://api.dialstack.ai/v1/calls/call_01h2xcejqtf2nbrexx3vqjhp45/listeners \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "DialStack-Account: acct_01h2xcejqtf2nbrexx3vqjhp41" \
-H "Content-Type: application/json" \
-d '{
"url": "wss://your-server.example.com/audio",
"channel": "both",
"metadata": {"agent_id": "user_123", "queue": "support"}
}'
Channel Selection
| Channel | Audio received |
|---|---|
| caller | Audio from the party that initiated the call |
| callee | Audio from the party that received the call |
| both | Both channels, delivered as separate tagged messages |
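With channel set to both, a receiver has to separate the two streams itself. The sketch below shows one way to collect audio per channel. It assumes each audio message carries a channel of either caller or callee, as in the audio message example on this page. ChannelDemux is a hypothetical name, not an SDK class.

```javascript
// Hypothetical helper: collects decoded μ-law bytes per channel.
// Assumption: audio messages are tagged 'caller' or 'callee', never 'both'.
class ChannelDemux {
  constructor() {
    this.chunks = { caller: [], callee: [] };
  }

  // Accepts a parsed WebSocket message; non-audio events are ignored.
  push(message) {
    if (message.event !== 'audio') return;
    const bucket = this.chunks[message.channel];
    if (!bucket) {
      throw new RangeError(`unexpected channel: ${message.channel}`);
    }
    bucket.push(Buffer.from(message.payload, 'base64'));
  }

  // Returns everything received so far on one channel as a single Buffer.
  audioFor(channel) {
    return Buffer.concat(this.chunks[channel]);
  }
}

const demux = new ChannelDemux();
const frame = Buffer.alloc(160, 0xff).toString('base64');
demux.push({ event: 'audio', channel: 'caller', timestamp: 0, payload: frame });
demux.push({ event: 'audio', channel: 'callee', timestamp: 0, payload: frame });
demux.push({ event: 'audio', channel: 'caller', timestamp: 20, payload: frame });
console.log(demux.audioFor('caller').length, demux.audioFor('callee').length); // 320 160
```

Keeping the channels apart lets a transcription service label each speaker without diarization.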
Stopping a Listener
Listeners stop automatically when the call ends. To stop early:
SDK:
await dialstack.calls.deleteListener(callId, listenerId, {
  dialstackAccount: accountId,
});
cURL:
curl -X DELETE https://api.dialstack.ai/v1/calls/call_.../listeners/lstn_... \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "DialStack-Account: acct_01h2xcejqtf2nbrexx3vqjhp41"
Listener WebSocket Protocol
The listener WebSocket protocol extends the voice app protocol with channel tagging and an end message. See the WebSocket API for the full specification.
Begin — Sent when connection is established:
{
"event": "begin",
"listener_id": "lstn_01h2xcejqtf2nbrexx3vqjhp50",
"call_id": "call_01h2xcejqtf2nbrexx3vqjhp45",
"account_id": "acct_01h2xcejqtf2nbrexx3vqjhp41",
"channel": "both",
"metadata": { "agent_id": "user_123", "queue": "support" },
"audio_format": {
"encoding": "audio/x-mulaw",
"sample_rate": 8000,
"channels": 1
}
}
The listener_id field distinguishes listener sessions from voice app sessions, allowing the same server to handle both.
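A server that accepts both kinds of connection on one endpoint can branch on the first message. This is a minimal sketch of that check, relying only on the presence of listener_id in the begin message. The name classifyBegin and the returned labels are hypothetical.

```javascript
// Hypothetical helper: decides which protocol a new WebSocket session speaks,
// based on whether the begin message carries a listener_id.
function classifyBegin(message) {
  if (message === null || typeof message !== 'object' || message.event !== 'begin') {
    throw new TypeError('expected a begin message');
  }
  return typeof message.listener_id === 'string' ? 'listener' : 'voice_app';
}

console.log(
  classifyBegin({
    event: 'begin',
    listener_id: 'lstn_01h2xcejqtf2nbrexx3vqjhp50',
    call_id: 'call_01h2xcejqtf2nbrexx3vqjhp45',
  })
); // listener
console.log(classifyBegin({ event: 'begin', call_id: 'call_01h2xcejqtf2nbrexx3vqjhp45' })); // voice_app
```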
Audio — Call audio, tagged by channel:
{
"event": "audio",
"channel": "caller",
"timestamp": 1234,
"payload": "base64-encoded-mulaw-audio"
}
End — Sent when the listener stops:
{
"event": "end",
"listener_id": "lstn_01h2xcejqtf2nbrexx3vqjhp50",
"reason": "call_ended"
}
Reasons: call_ended, deleted (stopped via API), error.
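The three message types map onto a simple session lifecycle: pending until begin, active while audio arrives, and ended once end is received. The sketch below tracks that state for one connection. All names are hypothetical, and keeping unrecognized reasons with an unknown: prefix is a defensive choice of this sketch, not documented behavior.

```javascript
// Illustrative state tracker for one listener WebSocket (names are hypothetical).
const END_REASONS = new Set(['call_ended', 'deleted', 'error']);

function createListenerSession() {
  const state = { listenerId: null, status: 'pending', audioMessages: 0, endReason: null };

  // Accepts one raw WebSocket message (a JSON string) and updates the state.
  function handle(rawMessage) {
    const message = JSON.parse(rawMessage);
    switch (message.event) {
      case 'begin':
        state.listenerId = message.listener_id;
        state.status = 'active';
        break;
      case 'audio':
        state.audioMessages += 1;
        break;
      case 'end':
        state.status = 'ended';
        // Unrecognized reasons are kept but flagged rather than dropped.
        state.endReason = END_REASONS.has(message.reason)
          ? message.reason
          : `unknown:${message.reason}`;
        break;
      default:
        break; // ignore events this sketch does not model
    }
    return state;
  }

  return { state, handle };
}

const session = createListenerSession();
session.handle(JSON.stringify({ event: 'begin', listener_id: 'lstn_01h2xcejqtf2nbrexx3vqjhp50' }));
session.handle(JSON.stringify({ event: 'audio', channel: 'caller', timestamp: 1234, payload: '' }));
session.handle(JSON.stringify({ event: 'end', reason: 'call_ended' }));
console.log(session.state.status, session.state.endReason); // ended call_ended
```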
Complete Example: Real-Time Transcription
import express from 'express';
import { WebSocketServer } from 'ws';
import { DialStack } from '@dialstack/sdk/server';
const app = express();
app.use(
express.json({
verify: (req, res, buf) => {
req.rawBody = buf;
},
})
);
const dialstack = new DialStack(process.env.DIALSTACK_API_KEY);
const VOICE_APP_SECRET = process.env.VOICE_APP_SECRET;
// Webhook endpoint — receives call.notify from Voice App (Notify) dial plan node
app.post('/voice/webhook', async (req, res) => {
let event;
try {
event = dialstack.webhooks.constructEvent(
req.rawBody,
req.headers['x-dialstack-signature'],
VOICE_APP_SECRET
);
} catch (err) {
return res.sendStatus(401);
}
res.sendStatus(200);
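if (event.event === 'call.notify') {
// Create a listener to stream audio for transcription
try {
await dialstack.calls.createListener(
event.call_id,
{
url: 'wss://your-server.example.com/audio',
channel: 'both',
metadata: { from: event.from_number, to: event.to_number },
},
{ dialstackAccount: event.account_id }
);
} catch (err) {
// The webhook was already acknowledged, so log instead of responding
console.error(`Failed to create listener for call ${event.call_id}:`, err);
}
}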
});
// WebSocket server for receiving listener audio
const wss = new WebSocketServer({ noServer: true });
wss.on('connection', (ws) => {
let listenerId;
ws.on('message', (data) => {
const message = JSON.parse(data);
switch (message.event) {
case 'begin':
listenerId = message.listener_id;
console.log(`Listening to call ${message.call_id} (${message.channel})`);
break;
case 'audio':
// Send to your speech-to-text service
transcribe(message.payload, message.channel);
break;
case 'end':
console.log(`Listener ${listenerId} stopped: ${message.reason}`);
break;
}
});
});
const server = app.listen(3000);
server.on('upgrade', (request, socket, head) => {
if (request.url === '/audio') {
wss.handleUpgrade(request, socket, head, (ws) => {
wss.emit('connection', ws, request);
});
} else {
socket.destroy();
}
});
API Reference
- Voice Apps — Create and manage voice apps
- Update Call — Control active calls with actions
- Listeners — Stream real-time audio from active calls
- WebSocket API — Audio streaming protocol (voice apps and listeners)