VTT Generator
Click or drag your audio/video file here
- MP3 · WAV · M4A · MP4, up to 1 GB
- More than 60 languages
- ~3 min per hour of audio
- 6 export formats
Trusted by more than 100,000 podcasters and creators
Private and secure: your files stay yours
Drop Audio or Video ➞ Generate the VTT File
Generate WebVTT caption files from any audio or video. Accurate, automatically timed cues for HTML5 video, web players, and accessibility compliance.
Captions in the Format the Web Actually Speaks
WebVTT is the caption format of the open web: it's what the HTML5 <track> element expects, what modern web video players consume natively, and what most embedded-video platforms ask for when you add captions. If your video lives on a website — a course platform, a product page, a help center — VTT is the file you need.
Castmagic generates it straight from the media. Upload audio or video, transcription runs with word-level timing, and the exported VTT carries cues that match the speech exactly — readable lengths, accurate boundaries, standard syntax. Edit the transcript first and the captions inherit every fix.
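To make the idea of turning word-level timing into cues concrete, here is a minimal sketch. It is not Castmagic's actual pipeline: the `Word` type, the `words_to_vtt` function, and the 42-character cue limit are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its timing (hypothetical input shape)."""
    text: str
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media


def format_timestamp(seconds: float) -> str:
    """Render seconds as a WebVTT timestamp: HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def words_to_vtt(words: list[Word], max_chars: int = 42) -> str:
    """Group words into cues of at most max_chars characters and emit WebVTT."""
    cues: list[list[Word]] = []
    current: list[Word] = []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        # Start a new cue when adding this word would exceed the limit.
        if current and len(candidate) > max_chars:
            cues.append(current)
            current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = ["WEBVTT"]
    for cue in cues:
        # A cue spans from its first word's start to its last word's end.
        timing = f"{format_timestamp(cue[0].start)} --> {format_timestamp(cue[-1].end)}"
        blocks.append(timing + "\n" + " ".join(w.text for w in cue))
    return "\n\n".join(blocks) + "\n"
```

Because each cue's boundaries come from its first and last word, a correction to the transcript text changes what the cue says without moving its timing.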
Where VTT files go to work
Self-hosted HTML5 video with a <track> tag. Course videos on learning platforms. Product demos and onboarding videos in web apps. Help-center walkthroughs. Video players like Video.js, Plyr, and JW Player. Anywhere video is embedded in a page, VTT is how the captions ride along.
Captions are an accessibility requirement, not a nice-to-have
Accessibility standards for web content expect spoken material to have text alternatives, and captions are the baseline for video. A generated VTT gets embedded video from non-compliant to captioned in minutes — with accuracy you've verified in the editor, not auto-captions you can't touch.
VTT details done right
WebVTT times cues with a dot for milliseconds (HH:MM:SS.mmm) where SRT uses a comma — small differences like this are why hand-converting between formats goes wrong. Castmagic exports clean, spec-correct VTT, and the SRT version is in the same menu if a different tool in your chain wants SubRip instead.
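For readers who do need to convert an existing SRT file by hand, the sketch below shows the two changes that matter: adding the `WEBVTT` header and swapping the comma for a dot in timing lines only. The function name `srt_to_vtt` is hypothetical, and the sketch assumes a well-formed SRT file with no styling tags that need special handling.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,250".
SRT_TIMING = re.compile(
    r"^(\d{2,}:\d{2}:\d{2}),(\d{3}) --> (\d{2,}:\d{2}:\d{2}),(\d{3})(.*)$"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to WebVTT text (illustrative, not a full converter)."""
    # Normalize line endings and drop a leading byte-order mark if present.
    lines = srt_text.replace("\r\n", "\n").strip("\ufeff\n").split("\n")
    out = ["WEBVTT", ""]
    for line in lines:
        match = SRT_TIMING.match(line)
        if match:
            start, start_ms, end, end_ms, rest = match.groups()
            # Only timing lines change: comma becomes dot before the milliseconds.
            line = f"{start}.{start_ms} --> {end}.{end_ms}{rest}"
        out.append(line)
    return "\n".join(out) + "\n"
```

A blanket find-and-replace of commas would also rewrite commas inside the caption text, which is the kind of hand-conversion mistake described above. The numeric SRT counters are left in place because WebVTT accepts them as cue identifiers.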
The transcript does double duty
The same transcript behind the VTT exports as text, PDF, Word, and CSV — and powers AI-generated summaries and descriptions for the page the video lives on. Caption the video and write its supporting copy from one upload.
How To Generate a VTT File
Upload your audio or video file — or paste a link
Drag your audio or video file into the uploader above, or paste a link if it lives online (YouTube, a podcast feed, cloud storage). Common audio and video formats are all supported.
Castmagic transcribes it
Transcription starts immediately — 60+ languages with auto-detect, speaker labels, and word-level timestamps. An hour of audio typically processes in 3-5 minutes.
Review and polish the transcript
Open the transcript in the editor: rename speakers, fix any terms, and add custom spellings so brand names and jargon come out right on every future upload.
Far More Than a Simple Transcription Tool
| Dimension | Typical transcription tool | Castmagic |
|---|---|---|
| What you get | A text file | A timestamped transcript with speaker identification, plus AI-written summaries, show notes, and posts from the same upload |
| Languages and translation | Transcription only, often English-centric | 60+ transcription languages; translate any transcript into 11 languages, with timestamps and speakers preserved |
| Export formats | TXT, sometimes SRT | TXT, SRT, VTT, PDF, DOCX, and CSV: every format, every language, one menu |
| After transcription | You're on your own | Ask Magic Chat about the recording, search your whole library, and generate content with AI templates |
Download your VTT
Export a WebVTT caption file with exact cue timing, ready for HTML5 video and modern players. The other formats (TXT, SRT, PDF, DOCX, and CSV) are one click away in the same menu.
VTT Generator & Content
Generate content from the transcript
The transcript doubles as a content source: Castmagic's AI presets draft summaries, show notes, blog posts, social clips, and follow-up emails from the same audio or video file.
Clips & VTT Generator
Endless Content Assets In Seconds
Automate the tedious work that comes with editing and copywriting, and say hello to your new best content editor.
Professional Creators Love Castmagic
Castmagic is just a great product. When it came to creating content around The Calum Johnson Show, it made our life a lot easier. Highly recommend.
Frequently Asked Questions
Last updated June 2026 by the Castmagic team
How do I convert an audio or video file to VTT?
Upload the audio or video file to Castmagic (or paste a link to it), wait a few minutes for transcription, then choose VTT from the download menu. You'll get a WebVTT caption file with exact cue timing, ready for HTML5 video and modern players.
How accurate is the transcription?
Castmagic uses state-of-the-art speech models with support for 60+ languages, automatic language detection, and speaker labeling. Clear single-speaker audio typically transcribes at well above 95% accuracy, and a custom-vocabulary list keeps brand names, product names, and industry jargon spelled correctly.
What formats can I download besides VTT?
Every transcript exports to six formats from the same menu: plain text (TXT), SubRip subtitles (SRT), WebVTT captions (VTT), a formatted PDF document, an editable Word document (DOCX), and a structured spreadsheet (CSV) with per-utterance speakers and timings.
Is this free to use?
Castmagic offers a free tier so you can convert an audio file and try the full workflow. Volume use (multiple files per week, longer recordings, and AI-generated content output) is available on paid plans.
What's the difference between VTT and SRT?
WebVTT is the web-native caption format (HTML5 <track>, modern web players) and uses dot-millisecond timing; SRT is the older, universal editor/player format with comma timing. Castmagic exports both from the same transcript — use VTT for web embeds, SRT for editors and YouTube.
Can I use the generated VTT with my website's video player?
Yes — the export is standard WebVTT, which HTML5 <track> elements and players like Video.js, Plyr, and JW Player consume natively.
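Before wiring a caption file into a player, a quick sanity check can catch the two most common problems: a missing `WEBVTT` header and comma-style timings left over from SRT. The sketch below is a minimal check under those assumptions, not a full validator, and the function name `check_vtt` is hypothetical.

```python
import re

# A WebVTT timing line: hours are optional, milliseconds follow a dot,
# and optional cue settings may follow the end timestamp.
VTT_TIMING = re.compile(
    r"^(?:\d{2,}:)?\d{2}:\d{2}\.\d{3} --> (?:\d{2,}:)?\d{2}:\d{2}\.\d{3}(?:[ \t].*)?$"
)


def check_vtt(text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    lines = text.lstrip("\ufeff").splitlines()
    # The first line must be "WEBVTT", optionally followed by a space or tab and text.
    if not lines or not re.match(r"^WEBVTT(?:[ \t].*)?$", lines[0]):
        problems.append("line 1: file must start with 'WEBVTT'")
    for number, line in enumerate(lines, start=1):
        # Any line containing "-->" is a timing line and must match the pattern.
        if "-->" in line and not VTT_TIMING.match(line):
            problems.append(f"line {number}: malformed cue timing: {line!r}")
    return problems
```

Running `check_vtt` on a file's contents and getting an empty list back does not prove the file is fully spec-compliant, but a non-empty list points at the line to fix.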
Can I generate translated captions?
Yes — translate the transcript into any of ten languages (Spanish, French, German, Japanese, and more) and export the translated VTT with identical cue timing.