Create Transcription

Upload an audio file and receive a structured JSON response with transcription, speaker diarization, and AI analysis in a single synchronous API call.

POST/v1/transcribe

Request

Send a multipart/form-data request with the audio file and optional parameters:

Field	Type	Required	Description
`file`	File	Yes	Audio file. Supported: MP3, WAV, M4A, OGG, FLAC, AAC, WebM, Opus, MP4. Max 100 MB
`language`	String	No	ISO 639-1 code (e.g., "en"). Auto-detected if omitted
`mode`	String	No	`polished` (default) or `verbatim`. Verbatim preserves raw transcript without AI formatting
`redact_pii`	Boolean	No	Mask PII in transcript (names, SSNs, card numbers). Pro plan only.
`instructions`	String	No	Custom AI prompt. Max 2,000 chars. See Custom Instructions
`plan`	String	No	`pro` (default) or `lite`. Pro ($0.39/hr) includes full 11-feature AI analysis. Lite ($0.15/hr) includes diarization and transcript cleanup only.
`async`	Boolean	No	Set to `true` for async processing. Returns `202 Accepted` with a `job_id` immediately. Results delivered via webhooks.
`metadata`	JSON String	No	Custom JSON object (max 4KB). Echoed back in webhook payloads for correlation. Only used with `async=true`.

Example

curl

curl -X POST https://api.voxparse.com/v1/transcribe \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@call-recording.mp3" \
  -F "plan=pro" \
  -F "language=en"

Response

Returns a JSON object with the transcript, segments with timestamps, word-level timing, and an ai_analysis object. Pro plan includes sentiment, compliance, financial data, call summary, and action items. Lite plan includes diarization labels and cleaned transcript only.

Tip: Use mode=verbatim for raw transcripts without AI formatting. Use the default polished mode for full AI analysis with diarization, sentiment, and compliance.

Plan selection: Use plan=pro ($0.39/hr) for full AI analysis including sentiment, compliance, financial extraction, and custom instructions. Use plan=lite ($0.15/hr) for diarization and transcript cleanup only. If omitted, defaults to pro.

Async Mode

For background processing, add async=true to receive an immediate 202 Accepted response with a job ID. Results are delivered to your registered webhook endpoints when processing completes.

curl — async request

curl -X POST https://api.voxparse.com/v1/transcribe \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@call-recording.mp3" \
  -F "async=true" \
  -F 'metadata={"ticket_id":"T-1234"}'

The metadata field (up to 4KB JSON) is echoed in the webhook payload for easy correlation with your internal systems. Validation errors still return 4xx synchronously even in async mode.

For the complete response schema, see the Full API Reference.