Create Transcription
Upload an audio file and receive a structured JSON response with transcription, speaker diarization, and AI analysis in a single synchronous API call.
/v1/transcribeRequest
Send a multipart/form-data request with the audio file and optional parameters:
| Field | Type | Required | Description |
|---|---|---|---|
file | File | Yes | Audio file. Supported: MP3, WAV, M4A, OGG, FLAC, AAC, WebM, Opus, MP4. Max 100 MB |
language | String | No | ISO 639-1 code (e.g., "en"). Auto-detected if omitted |
mode | String | No | polished (default) or verbatim. Verbatim preserves raw transcript without AI formatting |
redact_pii | Boolean | No | Mask PII in transcript (names, SSNs, card numbers). Pro plan only. |
instructions | String | No | Custom AI prompt. Max 2,000 chars. See Custom Instructions |
plan | String | No | pro (default) or lite. Pro ($0.39/hr) includes full 11-feature AI analysis. Lite ($0.15/hr) includes diarization and transcript cleanup only. |
async | Boolean | No | Set to true for async processing. Returns 202 Accepted with a job_id immediately. Results delivered via webhooks. |
metadata | JSON String | No | Custom JSON object (max 4KB). Echoed back in webhook payloads for correlation. Only used with async=true. |
Example
curl -X POST https://api.voxparse.com/v1/transcribe \
-H "X-API-Key: YOUR_API_KEY" \
-F "file=@call-recording.mp3" \
-F "plan=pro" \
-F "language=en"Response
Returns a JSON object with the transcript, segments with timestamps, word-level timing, and an ai_analysis object. Pro plan includes sentiment, compliance, financial data, call summary, and action items. Lite plan includes diarization labels and cleaned transcript only.
mode=verbatim for raw transcripts without AI formatting. Use the default polished mode for full AI analysis with diarization, sentiment, and compliance.plan=pro ($0.39/hr) for full AI analysis including sentiment, compliance, financial extraction, and custom instructions. Use plan=lite ($0.15/hr) for diarization and transcript cleanup only. If omitted, defaults to pro.Async Mode
For background processing, add async=true to receive an immediate 202 Accepted response with a job ID. Results are delivered to your registered webhook endpoints when processing completes.
curl -X POST https://api.voxparse.com/v1/transcribe \
-H "X-API-Key: YOUR_API_KEY" \
-F "file=@call-recording.mp3" \
-F "async=true" \
-F 'metadata={"ticket_id":"T-1234"}'The metadata field (up to 4KB JSON) is echoed in the webhook payload for easy correlation with your internal systems. Validation errors still return 4xx synchronously even in async mode.
For the complete response schema, see the Full API Reference.