Voice and transcription

API Reference: Voice and transcription endpoints

The Voice and transcription APIs cover browser/PSTN calling through Twilio, Twilio callback handling, uploaded call audio chunks, realtime OpenAI SDP calls and standalone audio-file transcription.

Current endpoints

Twilio calls

This group combines protected representative API calls, phone-auth-aware audio upload, anonymous Twilio callbacks and OpenAI transcription/realtime handoff. Protected AnswerPal API endpoints use JWT Bearer authentication. The Twilio callback routes are anonymous because Twilio calls them directly.

  • GET /api/TwilioCalls/voice-token – Bearer JWT
    Issue a Twilio Voice access token for the current representative. The optional query parameter twimlAppSidOverride overrides the customer TwiML app SID. The response includes token, region and edge.
  • POST /api/TwilioCalls/initiate-call – Bearer JWT
    Start an outbound PSTN call using customer Twilio credentials. Body fields are fromNumber and toNumber, both normalized to +digits.
  • GET /api/TwilioCalls/call-status – Bearer JWT
    Return the final Twilio call status for the callSid query parameter. The service checks child calls first and falls back to the parent call status.
  • POST /api/TwilioCalls/voice – Public Twilio callback
    Return TwiML for browser-device outbound calls. AnswerPal validates phone numbers, representative/customer ownership and storage limits before returning <Dial> or <Connect><Stream> TwiML.
  • POST /api/TwilioCalls/{repID}/recordingcallback – Public Twilio callback
    Receive completed Twilio recording metadata, download the MP3 from Twilio, create or update the phone ticket and ticket message, and transcribe when an OpenAI key and transcription model are available.

Voice token and outbound call

Voice token

GET /api/TwilioCalls/voice-token
Authorization: Bearer {token}

200 OK
{
  "token": "twilio-jwt...",
  "region": "ie1",
  "edge": "dublin"
}
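
As a rough illustration, the token request above could be assembled in Python as follows. This is a minimal sketch: BASE_URL is a hypothetical host, build_voice_token_request is an illustrative helper name, and the request is only built, not sent.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical host; replace with the real AnswerPal API base URL.
BASE_URL = "https://api.example.com"


def build_voice_token_request(jwt: str, twiml_app_sid_override: Optional[str] = None) -> Request:
    """Build (but do not send) GET /api/TwilioCalls/voice-token."""
    url = f"{BASE_URL}/api/TwilioCalls/voice-token"
    if twiml_app_sid_override:
        # Optional query parameter documented for this endpoint.
        url += "?" + urlencode({"twimlAppSidOverride": twiml_app_sid_override})
    return Request(url, headers={"Authorization": f"Bearer {jwt}"}, method="GET")
```

Passing the returned object to urllib.request.urlopen would perform the call; the JSON response is expected to carry token, region and edge.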

Initiate call

POST /api/TwilioCalls/initiate-call
Authorization: Bearer {token}
Content-Type: application/json

{
  "fromNumber": "+3225550100",
  "toNumber": "+32470000000"
}
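
A client may want to pre-normalize numbers before sending them. The sketch below assumes that "+digits" means a plus sign followed by ASCII digits only; the server's actual normalization rules are not documented here, and both helper names are illustrative.

```python
import json
import re


def normalize_number(raw: str) -> str:
    """Reduce a phone number to '+' followed by digits (assumed rule)."""
    # Note: this does not rewrite international prefixes such as a leading "00".
    digits = re.sub(r"[^0-9]", "", raw)
    if not digits:
        raise ValueError(f"no digits in phone number: {raw!r}")
    return "+" + digits


def build_initiate_call_body(from_number: str, to_number: str) -> str:
    """Serialize the JSON body for POST /api/TwilioCalls/initiate-call."""
    return json.dumps({
        "fromNumber": normalize_number(from_number),
        "toNumber": normalize_number(to_number),
    })
```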

Twilio fields

TwilioCallRequest body

  • fromNumber
    Required Twilio caller number for server-initiated calls. Must normalize to +digits.
  • toNumber
    Required destination number. Must normalize to +digits.

Voice callback form fields

  • FromNumber, To, RepID, AccountSid, CallSid
    Core Twilio form data. The API validates number format and verifies that the representative belongs to the resolved customer.
  • Record
    When true on the non-translation path, the returned TwiML records the call and sets a recording status callback.
  • Translate
    When true, the returned TwiML starts a <Connect><Stream> using the configured PhoneStreamUrl before dialing.
  • CallerLanguage, CallerVoice, ReceiverLanguage, ReceiverVoice, ForwardRawAudio
    Optional translation/stream parameters passed to the Twilio stream as custom parameters.

Recording callback form/query fields

  • repID
    Route value. Must be a positive representative ID and belong to the resolved customer.
  • customFrom, customTo
    Query values added by AnswerPal when it configures the recording callback. They identify the phone channel and remote number.
  • CallSid, AccountSid, RecordingSid, RecordingUrl, RecordingStartTime, RecordingDuration
    Twilio recording metadata used to resolve customer/channel, download MP3 audio and build ticket timing.
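
For illustration, a recording callback URL with the route value and query values described above could be composed like this. BASE_URL and the helper name are hypothetical; only the route and the customFrom/customTo parameter names come from this page.

```python
from urllib.parse import urlencode

# Hypothetical host; replace with the real AnswerPal API base URL.
BASE_URL = "https://api.example.com"


def recording_callback_url(rep_id: int, custom_from: str, custom_to: str) -> str:
    """Compose /api/TwilioCalls/{repID}/recordingcallback with its query values."""
    if rep_id <= 0:
        raise ValueError("repID must be a positive representative ID")
    # urlencode percent-encodes the leading '+' of each number as %2B.
    query = urlencode({"customFrom": custom_from, "customTo": custom_to})
    return f"{BASE_URL}/api/TwilioCalls/{rep_id}/recordingcallback?{query}"
```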

Twilio callback example

POST /api/TwilioCalls/voice
Content-Type: application/x-www-form-urlencoded

FromNumber=+3225550100
To=+32470000000
RepID=42
Record=true
Translate=false
CallSid=CA...
AccountSid=AC...
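
The example above lists the form fields one per line for readability. On the wire, an application/x-www-form-urlencoded body joins them with & and percent-encodes the + sign. A short sketch of that encoding follows; CallSid and AccountSid are left out because their values are elided in the example.

```python
from urllib.parse import parse_qs, urlencode

fields = {
    "FromNumber": "+3225550100",
    "To": "+32470000000",
    "RepID": "42",
    "Record": "true",
    "Translate": "false",
}

# '+' becomes %2B; a bare '+' in a form body would decode to a space.
body = urlencode(fields)

# Decoding the body restores the original values.
decoded = parse_qs(body)
```

This matters when replaying callbacks by hand for testing: a literal + in the body would reach the API as a space and the number would likely fail validation.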

Access and behavior

Authentication and access

  • JWT routes
    voice-token, initiate-call, call-status, /api/realtime/calls and /api/Transcriptions/file require a representative JWT.
  • Phone-auth-aware route
    /api/AudioChunks/uploadAndFinalize accepts either a ticket-access representative or a phone-auth context. Phone-auth requests are constrained to the token channel.
  • Twilio callbacks
    /api/TwilioCalls/voice and /api/TwilioCalls/{repID}/recordingcallback are anonymous Twilio callbacks. They validate Twilio Account SID, phone channel and representative ownership in application logic.

Transcription model behavior

  • Customer OpenAI key
    Audio chunk transcription, recording callback transcription, realtime calls and file transcription all need the OpenAI API key of the authenticated or resolved customer. When the key is missing, direct API calls return 400 and ticket-based flows store explanatory preview text instead.
  • Default model
    Transcription uses the customer default transcription model when valid, otherwise the first active customer transcription model, then the first active shared transcription model. If none exists, direct file transcription returns 400 and ticket flows store a diagnostic preview.
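
The fallback order described above can be sketched as a small selection function. The Model type and its fields are hypothetical, and "valid" is modeled here simply as "active", which is an assumption about the real check.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a transcription model record."""
    name: str
    active: bool = True


def pick_transcription_model(
    customer_default: Optional[Model],
    customer_models: Sequence[Model],
    shared_models: Sequence[Model],
) -> Optional[Model]:
    # 1. Customer default, when valid (assumed here to mean active).
    if customer_default is not None and customer_default.active:
        return customer_default
    # 2. First active customer model, then 3. first active shared model.
    for pool in (customer_models, shared_models):
        for model in pool:
            if model.active:
                return model
    # No model: direct file transcription returns 400,
    # ticket flows store a diagnostic preview.
    return None
```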

Storage enforcement

  • Storage limit checks
    Outbound call initiation, Twilio voice callback and audio chunk upload enforce the customer storage limit before creating or storing call data.

Audio and transcription

Audio and transcription endpoints handle uploaded call chunks, realtime SDP calls and standalone transcription file uploads.

  • POST /api/AudioChunks/uploadAndFinalize – Bearer JWT (TicketAccessOrPhoneAuth)
    Accept all audio chunks for a call ticket in one JSON array, decode mu-law audio, build a stereo recording, store it on the ticket, transcribe it when possible and create a system ticket message.
  • POST /api/realtime/calls – Bearer JWT
    Forward a multipart SDP offer and realtime session JSON to OpenAI Realtime calls using the authenticated customer OpenAI API key. Returns the SDP answer as application/sdp.
  • POST /api/Transcriptions/file – Bearer JWT
    Upload one audio file in the multipart form field named file and receive the transcription text as { "text": "..." }. The request size limit is 1 GB.

Audio chunks example

POST /api/AudioChunks/uploadAndFinalize
Authorization: Bearer {token}
Content-Type: application/json

[
  {
    "ticketID": 123,
    "senderType": "EndUser",
    "timestampMs": 0,
    "base64AudioData": "/////w=="
  }
]
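
A payload like the one above could be produced from raw mu-law bytes as sketched below. The make_chunk helper is illustrative; the field names and accepted senderType values are the ones documented on this page.

```python
import base64
import json

ALLOWED_SENDERS = {"EndUser", "AI", "CustomerRep"}


def make_chunk(ticket_id: int, sender_type: str, timestamp_ms: int, mulaw: bytes) -> dict:
    """Build one array item for POST /api/AudioChunks/uploadAndFinalize."""
    if sender_type not in ALLOWED_SENDERS:
        raise ValueError(f"unsupported senderType: {sender_type!r}")
    return {
        "ticketID": ticket_id,
        "senderType": sender_type,
        "timestampMs": timestamp_ms,
        "base64AudioData": base64.b64encode(mulaw).decode("ascii"),
    }


# Four 0xFF bytes encode to "/////w==", the value used in the example.
payload = json.dumps([make_chunk(123, "EndUser", 0, b"\xff\xff\xff\xff")])
```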

Realtime and file transcription

Realtime call

POST /api/realtime/calls
Authorization: Bearer {token}
Content-Type: multipart/form-data

sdp:      application/sdp
session:  application/json
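
One way to assemble this multipart body with only the standard library is sketched below. The helper name is illustrative, the contents of the session object are not specified on this page, and no filename is attached to either part.

```python
import json
import uuid
from typing import Tuple


def build_realtime_multipart(sdp_offer: str, session: dict) -> Tuple[str, bytes]:
    """Return (Content-Type header value, body) for POST /api/realtime/calls."""
    boundary = uuid.uuid4().hex
    parts = [
        ("sdp", "application/sdp", sdp_offer),
        ("session", "application/json", json.dumps(session)),
    ]
    lines = []
    for name, content_type, value in parts:
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"',
            f"Content-Type: {content_type}",
            "",  # blank line separates part headers from the part body
            value,
        ]
    lines.append(f"--{boundary}--")  # closing delimiter
    body = ("\r\n".join(lines) + "\r\n").encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", body
```

The returned header value goes into Content-Type alongside the Bearer token; the response body is the SDP answer with media type application/sdp.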

File transcription

POST /api/Transcriptions/file
Authorization: Bearer {token}
Content-Type: multipart/form-data

file: call-recording.mp3

200 OK
{ "text": "Transcribed audio text..." }

Fields and behavior

Request bodies and form fields

AudioChunkCreateDTO array items

  • ticketID
    Required positive ticket ID. The ticket must exist and belong to the authenticated customer. Phone-auth requests must match the token channel.
  • senderType
    Required speaker label. The current validator accepts EndUser, AI or CustomerRep.
  • timestampMs
    Milliseconds since call start. Chunks are sorted by this value before audio is mixed.
  • base64AudioData
    Required base64-encoded mu-law audio bytes. The API decodes 8 kHz mu-law to PCM and builds stereo output.
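
To make the decoding step concrete, here is a sketch of standard G.711 mu-law expansion applied to chunks sorted by timestampMs. It is not the service's implementation, and stereo mixing is left out because the channel assignment is not documented here.

```python
import base64
from typing import Dict, List

BIAS = 0x84  # G.711 mu-law bias


def mulaw_to_pcm16(code: int) -> int:
    """Expand one mu-law byte to a signed 16-bit PCM sample."""
    code = ~code & 0xFF  # mu-law bytes are stored bit-inverted
    magnitude = (((code & 0x0F) << 3) + BIAS) << ((code & 0x70) >> 4)
    return BIAS - magnitude if code & 0x80 else magnitude - BIAS


def decode_chunks(chunks: List[Dict]) -> List[Dict]:
    """Sort chunks by timestampMs and decode each one to PCM samples."""
    ordered = sorted(chunks, key=lambda c: c["timestampMs"])
    return [
        {
            "senderType": c["senderType"],
            "timestampMs": c["timestampMs"],
            "pcm": [mulaw_to_pcm16(b) for b in base64.b64decode(c["base64AudioData"])],
        }
        for c in ordered
    ]
```

At 8 kHz each millisecond of audio is 8 samples, so the example value "/////w==" (four 0xFF bytes) decodes to half a millisecond of silence.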

RealtimeCallRequest multipart fields

  • sdp
    Required SDP offer field. Sent to OpenAI as application/sdp without a filename.
  • session
    Required realtime session configuration JSON field. Sent to OpenAI as application/json.

Transcription file upload

  • file
    Required multipart form file. Empty uploads return 400. The authenticated customer must have an OpenAI key and an active transcription-capable model.
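
A client can avoid some avoidable 400 responses by checking the file size before uploading. The sketch below assumes that "1 GB" means 1024**3 bytes, which this page does not confirm, and the helper name is illustrative.

```python
# Assumption: the 1 GB request limit is read as 2**30 bytes.
MAX_REQUEST_BYTES = 1024 ** 3


def check_upload_size(size_bytes: int) -> int:
    """Reject sizes the API is documented to refuse; return the size otherwise."""
    if size_bytes <= 0:
        raise ValueError("empty upload: the API returns 400 for empty files")
    # The limit applies to the whole request, so multipart framing
    # leaves slightly less room than the limit for the file itself.
    if size_bytes >= MAX_REQUEST_BYTES:
        raise ValueError("file does not fit in the 1 GB request limit")
    return size_bytes
```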

Related phone-auth tokens are issued by the Auth API through /api/Auth/phone-login. Phone tickets, messages, prompts, actions and attachments are documented on the Tickets and File endpoint pages.

POST /api/TwilioCalls/voice and POST /api/TwilioCalls/{repID}/recordingcallback are anonymous because Twilio calls them directly. They still validate Account SID, phone channel and representative/customer ownership.

Audio chunk payloads contain base64 mu-law audio. The API decodes 8 kHz mu-law to PCM, mixes caller and AI/representative audio into stereo, stores the recording and transcribes it when possible.

POST /api/Transcriptions/file returns { "text": "..." }. It requires the authenticated customer to have an OpenAI API key and an active transcription-capable model.
