Architecture overview
A one-page mental model of how DialStack fits together. Read this once and the rest of the docs will make more sense.
Tenancy model
DialStack is multi-tenant from the ground up. Every resource lives inside this hierarchy:
- Platform — your SaaS product. One per partner.
- Account — one of your customers (a clinic, a dealership, a contractor).
- User — a person at that customer with phone service.
- Softphone / physical device — what rings. A user can have any number of softphone sessions (browser + mobile) plus at most one physical device (a desk phone or a DECT handset). All registered endpoints ring in parallel; first to answer wins.
Your API key is scoped to your Platform. Every request includes a DialStack-Account header to pick which customer you're acting on (or carries the account in a JWT claim; see Authentication below).
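As a rough illustration of the scoping rule above, the sketch below builds the two headers a backend request would carry: the platform key as a Bearer token, and the DialStack-Account header selecting the customer. The helper name and the example key and account ID are hypothetical; only the header names come from this page.

```python
from typing import Dict


def account_headers(api_key: str, account_id: str) -> Dict[str, str]:
    """Build headers for a platform-key request acting on one account."""
    if not api_key or not account_id:
        raise ValueError("api_key and account_id are both required")
    return {
        "Authorization": f"Bearer {api_key}",
        "DialStack-Account": account_id,  # selects which customer to act on
    }


# Hypothetical values, for illustration only.
headers = account_headers("sk_live_example", "acct_123")
```

Building the headers in one place makes it harder to send a platform-key request without an account selected.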
The account resource graph
Inside an account, resources form a small, opinionated graph. Inbound calls arrive on a phone number or an extension, both of which route to routing targets (e.g., a user, a ring group, a dial plan, or a voice app).
- Phone numbers are the PSTN entry point. Each has a routing_target: the ID of the user, ring group, dial plan, or voice app that inbound calls to that number hit.
- Extensions are short internal dial codes (e.g., dial 200 to reach reception). They route to the same routing targets as phone numbers; they're a parallel surface, not a hop on the way.
- Dial plans are the branching logic, e.g., schedule-based routing, ring-all, external transfer, dial, or voice-app handoff.
- Ring groups parallel-dial users.
- Voice apps are programmable call handlers (REST webhook + optional audio WebSocket); an AI Agent is a DialStack-managed Voice App with a pre-built receptionist persona.
- Users can register any number of softphones (browser, mobile) plus at most one physical device (desk phone or DECT handset). All registered endpoints ring in parallel.
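The graph described in this section can be pictured as a toy in-memory model. This is only a sketch of the relationships, not DialStack's actual schema: the class names, the kind strings, and the IDs are all hypothetical. It shows that a phone number and an extension are two parallel entry points onto the same routing target.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# The four routing target kinds named in this overview (string labels are assumed).
TARGET_KINDS = {"user", "ring_group", "dial_plan", "voice_app"}


@dataclass
class RoutingTarget:
    id: str
    kind: str

    def __post_init__(self) -> None:
        if self.kind not in TARGET_KINDS:
            raise ValueError(f"unknown routing target kind: {self.kind}")


@dataclass
class Account:
    targets: Dict[str, RoutingTarget] = field(default_factory=dict)
    phone_numbers: Dict[str, str] = field(default_factory=dict)  # number -> target ID
    extensions: Dict[str, str] = field(default_factory=dict)     # dial code -> target ID

    def resolve(self, dialed: str) -> Optional[RoutingTarget]:
        """Look up a dialed string as a phone number first, then as an extension."""
        target_id = self.phone_numbers.get(dialed) or self.extensions.get(dialed)
        return self.targets.get(target_id) if target_id else None


acct = Account()
acct.targets["rg_front_desk"] = RoutingTarget("rg_front_desk", "ring_group")
acct.phone_numbers["+15555550100"] = "rg_front_desk"
acct.extensions["200"] = "rg_front_desk"

# Both entry points land on the same target object.
same_target = acct.resolve("+15555550100") is acct.resolve("200")
```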
Call flow
You interact with DialStack through four surfaces: the REST API, Webhooks, SSE (Server-Sent Events), and WebSockets. Everything else is managed for you.
Key properties:
- Webhooks are fire-and-forget. Respond with an HTTP 200 OK quickly (before executing long-running logic); you cannot return data to influence the call. React in your own systems.
- SSE (Server-Sent Events) is account-scoped and safe to consume in browsers with a session token, ideal for Screen Pop.
- Media is relayed by DialStack. Your app never touches audio unless you explicitly attach a WebSocket (e.g., for BYO VoiceAI or Listeners).
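One way to honor the "respond quickly" rule for webhooks is to enqueue each event and acknowledge it before doing any real work. The sketch below is framework-agnostic and uses only the standard library. The handler signature, the event type string, and the in-process queue are assumptions for illustration, and signature verification is omitted.

```python
import json
import queue
import threading
from typing import List

events: "queue.Queue[dict]" = queue.Queue()
processed: List[str] = []


def handle_webhook(raw_body: bytes) -> int:
    """Enqueue the event and return 200 immediately; nothing slow runs here."""
    events.put(json.loads(raw_body))
    return 200


def worker() -> None:
    """Drain the queue in the background, where slow work is safe."""
    while True:
        event = events.get()
        try:
            processed.append(event["id"])  # stand-in for logging, CRM updates, etc.
        finally:
            events.task_done()


threading.Thread(target=worker, daemon=True).start()

# Hypothetical payload; real event shapes are in the webhook docs.
status = handle_webhook(b'{"id": "evt_1", "type": "call.started"}')
events.join()  # demo only: wait until the worker has caught up
```

An in-process queue loses events if the process dies, so a production handler would more likely hand off to a durable queue.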
The four surfaces
| Surface | Direction | Use for |
|---|---|---|
| REST API | You → DialStack | Provisioning, config, triggering calls, reading history. Auth: Bearer token. |
| Webhooks | DialStack → your backend | Durable event delivery with retries and signed payloads. Platform-scoped. |
| SSE (/v1/events) | DialStack → your frontend | Real-time browser notifications. Account-scoped via session token. |
| WebSockets | Bidirectional | attach for bidirectional audio (Voice App Control mode); Listeners API for one-way audio. |
Full docs: REST, Webhooks, SSE, WebSocket API.
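Browsers consume SSE through the built-in EventSource API, but it can help to see what the wire format looks like on a non-browser client. The sketch below is a simplified parser for the standard text/event-stream framing (field lines, comments, blank-line dispatch). It says nothing about DialStack's actual event names or payloads; the sample stream is made up.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per event from text/event-stream lines (simplified)."""
    event: Dict[str, str] = {}
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line dispatches the buffered event
            if data:
                event["data"] = "\n".join(data)
                yield event
            event, data = {}, []
            continue
        if line.startswith(":"):  # comment line, often used as a keep-alive
            continue
        name, _, value = line.partition(":")
        if value.startswith(" "):  # one leading space after the colon is dropped
            value = value[1:]
        if name == "data":
            data.append(value)
        elif name in ("event", "id"):
            event[name] = value


# Hypothetical stream contents, for illustration only.
stream = ["event: call.ringing", 'data: {"call_id": "call_1"}', "", ": ping", ""]
parsed = list(parse_sse(stream))
```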
Authentication at a glance
| Token | Who uses it | Scope | TTL |
|---|---|---|---|
| Platform API key (sk_live_...) | Your backend | Your entire platform; needs DialStack-Account header per request | Never |
| Session token (JWT) | Your frontend via the SDK | One account; claims carry the account | 1 hour, auto-refresh |
| User token (JWT) | A specific end user's device | One user in one account | Short, refresh via /v1/auth/token |
All three use Authorization: Bearer <token>. Full flow: Authentication guide.
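When debugging, it can be useful to look inside a session or user token to see which account it is scoped to and when it expires. The sketch below decodes a JWT payload without verifying the signature, so it is for inspection only, never for authorization decisions. The account claim name is an assumption (this page does not name it); exp is the standard JWT expiry claim. The token is built locally for the demo.

```python
import base64
import json
from typing import Any, Dict


def peek_claims(jwt: str) -> Dict[str, Any]:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload = jwt.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def is_expired(claims: Dict[str, Any], now: float) -> bool:
    """True when the standard `exp` claim (seconds since epoch) has passed."""
    return now >= claims["exp"]


def _segment(obj: Dict[str, Any]) -> str:
    """Encode a dict as an unpadded base64url JWT segment (demo helper)."""
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# Fake, unsigned token; "account" is a hypothetical claim name.
token = ".".join([
    _segment({"alg": "none"}),
    _segment({"account": "acct_123", "exp": 1700003600}),
    "signature-placeholder",
])
claims = peek_claims(token)
```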
What DialStack hosts vs. what you host
What DialStack hosts and manages
- PSTN / SIP edges, carrier interconnects
- Call routing engine, dial plan execution
- WebRTC media relay, TURN
- Voicemail, recording, transcription
- AI Agents (native receptionist)
- White Label admin portal
- Regulatory and tax/fee compliance
What you build and host
- Your signup flow, account sync, user lifecycle
- Webhook handler for activity logging / Screen Pop
- Your frontend (Embedded tier) or nothing (White Label tier)
- (BYO VoiceAI only) an audio-bridge WebSocket server
- (AI Scheduling) availability + booking HTTP endpoints
Nothing about calls, carrier relationships, or media transport is your responsibility. You own the business logic and the data you care about.
Identifiers
All DialStack IDs are opaque strings of at most 255 characters. Do not parse them or infer meaning from their format; persist them exactly as returned.
Every webhook event carries an id field. That's the idempotency key for consumers — DialStack may deliver the same event more than once, so dedupe by event.id before writing.
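A minimal sketch of deduping by event.id follows. It assumes an in-memory set for brevity; a real consumer would more likely use a durable store, such as a unique constraint on the event ID. The function and variable names are hypothetical.

```python
from typing import Callable, Dict, Set

seen_ids: Set[str] = set()  # assumption: in-memory; use a durable store in production


def process_once(event: Dict[str, str], write: Callable[[Dict[str, str]], None]) -> bool:
    """Run `write` only for the first delivery of an event ID; return whether it ran."""
    event_id = event["id"]
    if event_id in seen_ids:
        return False
    write(event)
    seen_ids.add(event_id)  # marked only after a successful write
    return True


rows = []
first = process_once({"id": "evt_1"}, rows.append)
second = process_once({"id": "evt_1"}, rows.append)  # simulated redelivery
```

Recording the ID only after the write succeeds means a failed write can be retried on the next delivery instead of being dropped.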
What's next
- Quickstart — provision your first account, user, and phone number.
- Webhook events — the firehose of everything that happens on a call.
- Dial plans — route calls where you want them.
- Integration tiers — pick between White Label, Embedded, and Direct API.