Whisper Series
Notes
- Models with the -kv suffix have KV cache inference acceleration enabled
- All models support punctuation and timestamps. Output uses paragraph-level timestamps by default; word-level timestamps can be enabled via parameters
- Language coverage:
  - Standard multilingual versions (tiny/small/medium/large-v1/large-v2): support 99 languages, including Chinese, Cantonese, English, Japanese, Korean, Russian, Arabic, Vietnamese, Ukrainian, and other major world languages
  - large-v3 / large-v3-turbo series: add low-resource languages beyond the 99, for a total of approximately 106 languages. New additions include Zulu (zu), Maori (mi), Swahili (sw), Hausa (ha), etc., and language identification is significantly improved
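
The notes above do not name the parameter that switches to word-level timestamps. As a rough sketch only, the example below assumes the open-source openai-whisper Python package as the backend, where the option is `word_timestamps=True` on `transcribe()`; the parameter name on this platform may differ, and the -kv model names listed here may not be loadable by that package. The helper names `transcribe` and `iter_timestamps` are hypothetical, and openai-whisper calls the default units "segments", which is assumed here to correspond to the paragraph-level timestamps mentioned above.

```python
from typing import Any, Iterator


def transcribe(audio_path: str, model_name: str = "small",
               word_level: bool = False) -> dict[str, Any]:
    """Transcribe a file with openai-whisper (assumed backend, not confirmed by the notes)."""
    # Imported lazily so iter_timestamps() below works without the package installed.
    import whisper

    model = whisper.load_model(model_name)
    # With word_timestamps=True, each segment gains a "words" list.
    return model.transcribe(audio_path, word_timestamps=word_level)


def iter_timestamps(result: dict[str, Any]) -> Iterator[tuple[float, float, str]]:
    """Yield (start, end, text) per word when available, otherwise per segment."""
    for segment in result.get("segments", []):
        words = segment.get("words")
        if words:
            for word in words:
                yield word["start"], word["end"], word["word"].strip()
        else:
            yield segment["start"], segment["end"], segment["text"].strip()
```

Under these assumptions, `iter_timestamps(transcribe("audio.wav", word_level=True))` would yield one tuple per word, while leaving `word_level` at its default would yield one tuple per segment.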

