VoxChron FAQ

Question 1

What is VoxChron?

Accepted Answer

VoxChron is an AI-powered captioning platform that transcribes speech and also detects 500+ non-verbal sounds — laughter, sighs, music, applause, door slams, environmental effects and more. It supports 120+ languages, 26 context modes, live captioning over RTMP/SRT/WebRTC, and exports to SRT, VTT, JSON, TXT and DOCX.

Question 2

How is VoxChron different from Rev, Otter, Descript or Whisper?

Accepted Answer

Most transcription tools capture only words. VoxChron also detects over 500 non-verbal sound events and timestamps them inline (e.g. [laughs], [door slams], [pause 2.3s]). Conveying this non-speech information is part of what FCC rules, the ADA and WCAG 2.1 expect of accessible closed captions. VoxChron also offers context modes (Action, Fantasy, News, Education, etc.) that bias the sound model toward genre-specific events.

Question 3

How long does it take to process a file?

Accepted Answer

Typical turnaround is 5–15% of the media duration, so a 60-minute video usually completes in 3–9 minutes. Very large files (5 GB+) may take longer. You receive an in-app notification and an email when your file is ready.
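
As a rough illustration of the 5–15% rule of thumb, the estimate can be computed as below. This is only a sketch: the function name is made up for this example, and real turnaround also depends on file size and queue load.

```python
def estimate_turnaround_minutes(media_minutes: float,
                                low: float = 0.05,
                                high: float = 0.15) -> tuple[float, float]:
    """Return the (fastest, slowest) typical processing time in minutes.

    The 5% and 15% defaults come from the FAQ answer above; the helper
    itself is hypothetical and not part of any VoxChron SDK.
    """
    if media_minutes < 0:
        raise ValueError("media_minutes must be non-negative")
    return (media_minutes * low, media_minutes * high)


fast, slow = estimate_turnaround_minutes(60)
print(f"A 60-minute file typically takes {fast:.0f} to {slow:.0f} minutes")
```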

Question 4

Do I need to install any software?

Accepted Answer

No. VoxChron runs entirely in your browser. Upload your file, review the captions in the in-browser editor, and download the finished exports. There is also a REST API and official integrations with YouTube, Vimeo, WordPress, Brightcove and OBS.

Question 5

Is there a free trial?

Accepted Answer

Yes — 30 minutes of free processing credits on signup, no credit card required. Live captioning also includes 5 free minutes so you can test RTMP/WebRTC end-to-end latency before subscribing.

Question 6

How accurate is VoxChron?

Accepted Answer

Speech accuracy averages 95–98% for clean studio audio in supported languages. Non-verbal sound detection exceeds 90% for common events (laughter, music, applause, silence). Accuracy varies with audio quality, accents, and background noise — you can always correct output in the in-browser editor before exporting.

Question 7

What non-verbal sounds can VoxChron detect?

Accepted Answer

Over 500 sound categories grouped into: human sounds (laughter, sighs, gasps, coughs, crying, sneezing), crowd reactions (applause, cheering, booing), music cues (♪ theme music, ♪ dramatic sting), environmental (doors, footsteps, vehicles, weather, phone rings, typing), and genre-specific sounds like gunshots (Action mode) or magical effects (Fantasy mode).

Question 8

What is Full Hearing-Impaired (Full CC) Mode?

Accepted Answer

Full CC Mode lowers the detection threshold to 45% so that every perceptible non-verbal sound is captioned. It is designed to help you meet FCC §79.1 and ADA Title III closed-captioning requirements for Deaf and hard-of-hearing viewers. The other presets are Key moments only (90%), Recommended (80%) and Enhanced detection (65%).
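
To make the presets concrete, here is a minimal sketch of how a threshold could filter detected sounds. It assumes the percentage is a minimum confidence score per detected event. The preset keys and the event fields (`label`, `confidence`) are hypothetical and are not taken from VoxChron's API.

```python
# Preset percentages are from the answer above; the key names are made up.
PRESET_THRESHOLDS = {
    "key_moments": 0.90,
    "recommended": 0.80,
    "enhanced": 0.65,
    "full_cc": 0.45,
}


def filter_events(events: list[dict], preset: str) -> list[dict]:
    """Keep only the events whose confidence meets the preset's threshold."""
    threshold = PRESET_THRESHOLDS[preset]
    return [event for event in events if event["confidence"] >= threshold]


# Hypothetical detections for a short clip.
detected = [
    {"label": "[applause]", "confidence": 0.97},
    {"label": "[door slams]", "confidence": 0.82},
    {"label": "[sighs]", "confidence": 0.70},
    {"label": "[footsteps]", "confidence": 0.50},
]

for preset in PRESET_THRESHOLDS:
    labels = [event["label"] for event in filter_events(detected, preset)]
    print(preset, labels)
```

With this sample data, each lower threshold lets one more event through, and only Full CC keeps the faint footsteps.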

Question 9

Can I edit the captions before exporting?

Accepted Answer

Yes. Every job opens in our in-browser editor where you can correct text, adjust timing, rename speakers, merge or split segments, and add or remove sound events. Changes are saved continuously and exports re-render instantly.

Question 10

How does pricing work?

Accepted Answer

Pricing is based on processing minutes consumed. You can buy credit packs or subscribe to monthly plans. Live captioning is billed per hour of streamed content. Volume discounts apply at 1,000 / 10,000 / 100,000 minute tiers. See our pricing page for current rates.

Question 11

Do unused minutes expire?

Accepted Answer

Credit pack minutes are valid for 12 months from purchase. Monthly subscription minutes reset each billing cycle.

Question 12

Can I cancel anytime?

Accepted Answer

Yes. Subscriptions cancel at the end of the current billing period. No lock-in contracts. See our Refund Policy for refund eligibility.

Question 13

Do you offer enterprise pricing?

Accepted Answer

Yes. Contact our sales team for volume pricing, SLAs, dedicated infrastructure, custom context modes and private-cloud / on-premise deployment options.

Question 14

What file formats do you support?

Accepted Answer

Video: MP4, MOV, MKV, AVI, WEBM, FLV, WMV. Audio: MP3, WAV, AAC, FLAC, OGG, M4A, WMA. More than 30 other codecs are supported through FFmpeg. Broadcast formats such as MXF and ProRes are supported on enterprise plans.

Question 15

What is the maximum file size?

Accepted Answer

10 GB per file, which is roughly 90 minutes of high-bitrate video or 6+ hours of podcast-quality audio. Files over 5 GB use multipart upload automatically.
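
To check whether a recording fits under the limit, you can estimate duration from size and bitrate. The sketch below assumes decimal units (1 GB = 10^9 bytes) and a constant bitrate. The example bitrates (15 Mbps video, 128 kbps audio) are illustrative assumptions, not VoxChron figures.

```python
def max_duration_minutes(size_gb: float, bitrate_mbps: float) -> float:
    """Minutes of media that fit in size_gb at a constant bitrate."""
    if bitrate_mbps <= 0:
        raise ValueError("bitrate_mbps must be positive")
    total_bits = size_gb * 1e9 * 8          # decimal gigabytes to bits
    seconds = total_bits / (bitrate_mbps * 1e6)
    return seconds / 60


video_minutes = max_duration_minutes(10, 15)      # assumed 15 Mbps video
audio_minutes = max_duration_minutes(10, 0.128)   # assumed 128 kbps audio

print(f"10 GB at 15 Mbps:  about {video_minutes:.0f} minutes")
print(f"10 GB at 128 kbps: about {audio_minutes / 60:.0f} hours")
```

Under these assumptions, 10 GB holds about 89 minutes of video, and audio-only files sit far inside the limit.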

Question 16

What export formats are available?

Accepted Answer

SRT (SubRip), VTT (WebVTT), JSON with word-level timestamps, plain TXT, and DOCX with speaker labels and timestamps. Broadcast-ready SCC / STL / EBU-TT exports are available on enterprise plans.
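
For readers unfamiliar with SRT, the sketch below shows how caption segments, including inline sound events, map onto SubRip cues. The segment fields (`start`, `end`, `text`) are an assumed shape for illustration and may not match VoxChron's JSON export.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text']}\n"
        )
    return "\n".join(cues)


# Hypothetical segments: one speech line and one non-verbal sound event.
segments = [
    {"start": 0.0, "end": 2.5, "text": "Welcome back, everyone."},
    {"start": 2.5, "end": 4.8, "text": "[applause]"},
]
print(to_srt(segments))
```

WebVTT cues look similar but use a period instead of a comma before the milliseconds and start with a `WEBVTT` header line.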

Question 17

How many languages does VoxChron support?

Accepted Answer

VoxChron transcribes 120+ languages and dialects. Non-verbal sound detection is language-agnostic — the same 500+ sound categories apply across all languages. Automatic language detection is on by default.

Question 18

What is the latency for live captions?

Accepted Answer

End-to-end latency is under 3 seconds from spoken word to on-screen caption. This includes audio capture, transcription, non-verbal sound detection, formatting and display over WebSocket.

Question 19

Does VoxChron support RTMP, SRT and HLS live streams?

Accepted Answer

Yes. VoxChron ingests RTMP, SRT, WebRTC and HLS/DASH streams, and outputs captions over WebSocket, CEA-608/708 (for broadcast), and an overlay URL for OBS or vMix browser sources.

Question 20

Which streaming platforms does VoxChron integrate with?

Accepted Answer

VoxChron works with OBS Studio, vMix, Wirecast, Zoom, Microsoft Teams, Google Meet, YouTube Live, Twitch and Facebook Live. It also accepts any source that can send RTMP, SRT or WebRTC, as well as custom HLS/DASH endpoints.

Question 21

Can I choose the content type for live streams?

Accepted Answer

Yes. When you enable Non-Verbal Sound detection on a live stream, you can choose a content type — General, Podcast, News, Accessibility, Live Broadcast, Vlog / Social Media, Education, Training Video or Marketing Video — and the model biases its sound detection and formatting accordingly.

Question 22

Is my data secure?

Accepted Answer

All uploads use HTTPS / TLS 1.3. Files are encrypted at rest with AES-256 on our Hetzner object storage (EU datacenters). Source files are deleted automatically after processing. Only exported captions are retained. See our Security page for full details.

Question 23

Is VoxChron GDPR compliant?

Accepted Answer

Yes. VoxChron is operated by a UK-registered company and processes data in accordance with UK GDPR and EU GDPR. We offer a signed Data Processing Agreement (DPA) on request. See our Privacy Policy and DPA.

Question 24

Does VoxChron meet FCC, ADA, WCAG and Section 508 standards?

Accepted Answer

VoxChron helps meet captioning requirements under FCC §79.1 (Television Closed Captioning), ADA Title III, WCAG 2.1 AA (Success Criteria 1.2.2 & 1.2.4) and Section 508. Full CC Mode and non-verbal sound detection are specifically designed to satisfy the "non-speech information" requirement that other transcription tools miss. Final compliance determinations depend on specific use cases and should be reviewed by qualified counsel.

Question 25

Where is my data stored?

Accepted Answer

Uploads are stored on encrypted Hetzner Object Storage in the EU (Nuremberg, Germany) by default. Enterprise customers can request Falkenstein, Helsinki or private-cloud deployment with regional hosting.

Question 26

Does VoxChron have a REST API?

Accepted Answer

Yes. The REST API supports file uploads (including presigned multipart), job status polling, webhook callbacks, live-stream session creation, and export downloads. Rate limits are generous on paid plans. See our API Docs for full reference.
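
Job status polling usually follows the pattern sketched below. The status values (`processing`, `completed`, `failed`) and the response shape are assumptions made for this example, so check the API Docs for the real fields. The HTTP call is passed in as a function, which keeps the sketch independent of any specific endpoint or client library.

```python
import time
from typing import Callable


def wait_for_job(fetch_status: Callable[[], dict],
                 interval_s: float = 5.0,
                 timeout_s: float = 1800.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the job reaches a terminal state.

    fetch_status stands in for a real HTTP request that returns the job as
    a dict. The "status" key and its values are hypothetical.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status()
        if job["status"] in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_s)


# Demo with canned responses instead of real network calls.
canned = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed"},
])
final = wait_for_job(lambda: next(canned), sleep=lambda s: None)
print(final["status"])
```

If you can expose a public endpoint, webhook callbacks avoid repeated requests. Polling is the simpler option for scripts and local tools.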

Question 27

Can I integrate VoxChron with my CMS or video platform?

Accepted Answer

Yes. Official integrations exist for YouTube, Vimeo, WordPress, Brightcove, Kaltura and JW Player. You can build custom integrations with the REST API and webhooks (job.completed, job.failed, stream.started, stream.stopped).
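
A webhook receiver typically routes each event name to its own handler. In the sketch below the four event names come from the answer above. The payload fields (`event`, `job_id`, `stream_id`) and the handler behavior are assumptions made for illustration.

```python
import json


def on_job_completed(payload: dict) -> str:
    return f"fetch exports for job {payload.get('job_id')}"


def on_job_failed(payload: dict) -> str:
    return f"flag job {payload.get('job_id')} for retry"


def on_stream_started(payload: dict) -> str:
    return f"show caption overlay for stream {payload.get('stream_id')}"


def on_stream_stopped(payload: dict) -> str:
    return f"hide caption overlay for stream {payload.get('stream_id')}"


HANDLERS = {
    "job.completed": on_job_completed,
    "job.failed": on_job_failed,
    "stream.started": on_stream_started,
    "stream.stopped": on_stream_stopped,
}


def handle_webhook(raw_body: bytes) -> str:
    """Parse a JSON webhook body and route it by its (assumed) event field."""
    payload = json.loads(raw_body)
    handler = HANDLERS.get(payload.get("event"))
    if handler is None:
        return "ignored"  # unknown events are acknowledged but not acted on
    return handler(payload)


print(handle_webhook(b'{"event": "job.completed", "job_id": "abc123"}'))
```

A production receiver should also authenticate incoming requests. How VoxChron signs or secures webhook calls is not covered here.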

Question 28

Can VoxChron translate captions into other languages?

Accepted Answer

Yes. After transcription you can request translation of the caption track into any of 50+ target languages, with timestamps preserved. Non-verbal sound markers (e.g. [laughs], [♪ music]) are localized or kept universal as you prefer.
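
The difference between universal and localized sound markers can be sketched as follows. This is not VoxChron's translation pipeline: `translate` stands in for any translation function, and the marker table is a made-up example.

```python
import re
from typing import Callable, Optional

# Matches bracketed sound markers such as [laughs] or [door slams].
MARKER = re.compile(r"(\[[^\]]*\])")


def translate_cue(text: str,
                  translate: Callable[[str], str],
                  marker_labels: Optional[dict] = None) -> str:
    """Translate speech in a cue while handling sound markers separately.

    With marker_labels=None, markers are kept as-is ("universal").
    Otherwise, markers found in the table are replaced ("localized").
    """
    pieces = []
    for part in MARKER.split(text):
        if MARKER.fullmatch(part):
            pieces.append(marker_labels.get(part, part) if marker_labels else part)
        elif part.strip():
            # Preserve the surrounding whitespace around the translated speech.
            lead = part[: len(part) - len(part.lstrip())]
            trail = part[len(part.rstrip()):]
            pieces.append(lead + translate(part.strip()) + trail)
        else:
            pieces.append(part)
    return "".join(pieces)


# Stand-in translator backed by a tiny lookup table.
fake_translate = {"That was close.": "Das war knapp."}.get

print(translate_cue("[laughs] That was close.", fake_translate))
print(translate_cue("[laughs] That was close.", fake_translate,
                    marker_labels={"[laughs]": "[lacht]"}))
```

Because only the cue text changes, the original timestamps can be carried over unchanged.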
