ASR Models

DolphinAsr Series

Notes

  • License: Apache 2.0
  • opt: Optimized version that moves the audio feature extraction module out of the model to reduce inference overhead
  • Full language and region code mapping:
zh-CN: Chinese (Mandarin), zh-TW: Chinese (Taiwan), zh-WU: Chinese (Wu), zh-SICHUAN: Chinese (Sichuan), zh-SHANXI: Chinese (Shanxi), zh-ANHUI: Chinese (Anhui), zh-TIANJIN: Chinese (Tianjin), zh-NINGXIA: Chinese (Ningxia), zh-SHAANXI: Chinese (Shaanxi), zh-HEBEI: Chinese (Hebei), zh-SHANDONG: Chinese (Shandong), zh-GUANGDONG: Chinese (Guangdong), zh-SHANGHAI: Chinese (Shanghai), zh-HUBEI: Chinese (Hubei), zh-LIAONING: Chinese (Liaoning), zh-GANSU: Chinese (Gansu), zh-FUJIAN: Chinese (Fujian), zh-HUNAN: Chinese (Hunan), zh-HENAN: Chinese (Henan), zh-YUNNAN: Chinese (Yunnan), zh-MINNAN: Chinese (Minnan), zh-WENZHOU: Chinese (Wenzhou)
ja-JP: Japanese, th-TH: Thai, ru-RU: Russian, ko-KR: Korean, id-ID: Indonesian, vi-VN: Vietnamese
ct-NULL: Cantonese, ct-HK: Cantonese (Hong Kong), ct-GZ: Cantonese (Guangdong)
hi-IN: Hindi, ur-IN: Urdu (India), ur-PK: Urdu, ms-MY: Malay, uz-UZ: Uzbek
ar-MA: Arabic (Morocco), ar-GLA: Arabic, ar-SA: Arabic (Saudi Arabia), ar-EG: Arabic (Egypt), ar-KW: Arabic (Kuwait), ar-LY: Arabic (Libya), ar-JO: Arabic (Jordan), ar-AE: Arabic (UAE), ar-LVT: Arabic (Levant)
fa-IR: Persian, bn-BD: Bengali
ta-SG: Tamil (Singapore), ta-LK: Tamil (Sri Lanka), ta-IN: Tamil (India), ta-MY: Tamil (Malaysia)
te-IN: Telugu, ug-NULL: Uyghur, ug-CN: Uyghur, gu-IN: Gujarati
my-MM: Burmese, tl-PH: Tagalog, kk-KZ: Kazakh, or-IN: Odia, ne-NP: Nepali
mn-MN: Mongolian, km-KH: Khmer, jv-ID: Javanese, lo-LA: Lao, si-LK: Sinhala
fil-PH: Filipino, ps-AF: Pashto, pa-IN: Punjabi, kab-NULL: Kabyle
ba-NULL: Bashkir, ks-IN: Kashmiri, tg-TJ: Tajik, su-ID: Sundanese
mr-IN: Marathi, ky-KG: Kyrgyz, az-AZ: Azerbaijani
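
The codes above follow a `language-REGION` pattern, with `NULL` standing in where no specific region applies (for example `ct-NULL` or `kab-NULL`). As a minimal sketch of how a caller might split such a code, the helper below is hypothetical: the names `LocaleCode` and `parse_locale_code` are assumptions for illustration and are not part of DolphinAsr, and the dictionary holds only a small excerpt of the mapping.

```python
from typing import NamedTuple, Optional


class LocaleCode(NamedTuple):
    language: str            # e.g. "zh", "ct", "kab"
    region: Optional[str]    # e.g. "CN", "SICHUAN"; None when the code uses NULL


def parse_locale_code(code: str) -> LocaleCode:
    """Split a 'language-REGION' code; 'NULL' is treated as 'no region'."""
    language, sep, region = code.strip().partition("-")
    if not language or not sep or not region:
        raise ValueError(f"expected 'language-REGION', got {code!r}")
    return LocaleCode(language, None if region == "NULL" else region)


# Excerpt of the mapping listed above, for illustration only.
NAMES = {
    "zh-CN": "Chinese (Mandarin)",
    "ct-NULL": "Cantonese",
    "ar-LVT": "Arabic (Levant)",
    "kab-NULL": "Kabyle",
}

for code, name in NAMES.items():
    print(code, parse_locale_code(code), name)
```

How the model itself expects the language and region to be passed is not stated here, so treat this only as a way to validate or group the codes before use.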

DolphinAsr-base Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| DolphinAsr-base-onnx | Non-streaming | Multilingual | No | Yes | modelscope |
| DolphinAsr-base-int8-onnx | Non-streaming | Multilingual | No | Yes | modelscope |
| DolphinAsr-base-onnx-opt | Non-streaming | Multilingual | No | Yes | modelscope |
| DolphinAsr-base-int8-onnx-opt | Non-streaming | Multilingual | No | Yes | modelscope |

DolphinAsr-small Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| DolphinAsr-small-onnx | Non-streaming | Multilingual | No | Yes | modelscope |
| DolphinAsr-small-int8-onnx | Non-streaming | Multilingual | No | Yes | modelscope |
| DolphinAsr-small-onnx-opt | Non-streaming | Multilingual | No | Yes | modelscope |
| DolphinAsr-small-int8-onnx-opt | Non-streaming | Multilingual | No | Yes | modelscope |

FireRedAsr Series

FireRedAsr-AED Chinese-English Model (v1)

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| fireredasr-aed-large-zh-en-onnx-offline-20250124 | Non-streaming | Chinese, English | No | No | modelscope |

FireRedAsr2-AED Chinese-English Model (v2)

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| fireredasr2-aed-large-zh-en-onnx-offline-20260212 | Non-streaming | Chinese, English | No | Yes | modelscope |
| fireredasr2-aed-large-zh-en-int8-onnx-offline-20260212 | Non-streaming | Chinese, English | No | Yes | modelscope |
| fireredasr2-aed-large-zh-en-onnx-selfcrosskv-offline-20260212 | Non-streaming | Chinese, English | No | Yes | modelscope |
| fireredasr2-aed-large-zh-en-int8-onnx-selfcrosskv-offline-20260212 | Non-streaming | Chinese, English | No | Yes | modelscope |
| fireredasr2-aed-large-zh-en-int8-onnx-selfcrosskvstack-offline-20260212 | Non-streaming | Chinese, English | No | Yes | modelscope |

Fun-ASR Series

Notes

  • Model background: End-to-end speech recognition foundation model released by Tongyi Lab, pre-trained on tens of millions of hours of real speech data, with strong contextual understanding and domain adaptability
  • Features: All models are non-streaming and support both punctuation and timestamps. They support low-latency real-time transcription, with recognition accuracy reaching 93% in far-field, high-noise environments
  • Version identifier meanings:
    • int8: INT8 quantized version, smaller size, faster inference, suitable for edge deployment
    • LLM: LLM-enhanced version with stronger context understanding; suppresses recognition hallucinations
    • CTC: Classic CTC architecture version for lightweight inference
    • MLT: Multilingual general-purpose version, covers 31 languages
    • split-adaptor: Version with feature adaptation module deployed separately
  • Language and capability notes:
    • Fun-ASR-Nano: Supports Chinese, English, Japanese; 7 dialects (Wu, Cantonese, Min, Hakka, Gan, Xiang, Jin); 26 regional accents (Henan, Shanxi, Hubei, Sichuan, Chongqing, Yunnan, Guizhou, Guangdong, Guangxi, Shaanxi, Hebei, Shandong, Anhui, Tianjin, Ningxia, Liaoning, Gansu, Hunan, Heilongjiang, Jilin, Inner Mongolia, Jiangsu, Zhejiang, Fujian, Jiangxi, Hainan); additionally supports lyrics recognition and rap speech recognition
    • Fun-ASR-MLT-Nano: Supports 31 languages total: Chinese, English, Cantonese, Japanese, Korean, Vietnamese, Indonesian, Thai, Malay, Filipino, Arabic, Hindi, Bulgarian, Croatian, Czech, Danish, Dutch, Estonian, Finnish, Greek, Hungarian, Irish, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Swedish
  • Domain advantages: Performs well in vertical fields such as education and finance, accurately recognizes domain-specific terminology, and suppresses hallucinations and language confusion
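
The version identifiers above appear as hyphen-separated tags inside the model names. A minimal sketch of how they could be read back from a name follows; the function `describe_fun_asr` and the short descriptions are hypothetical, written for illustration rather than taken from Fun-ASR. Matching is done on hyphen boundaries so that the multi-part tag `split-adaptor` is found as a unit.

```python
from typing import List

# Tag -> short description, summarizing the notes above.
FUN_ASR_TAGS = {
    "int8": "INT8 quantized",
    "LLM": "LLM-enhanced",
    "CTC": "CTC architecture",
    "MLT": "multilingual (31 languages)",
    "split-adaptor": "feature adaptation module deployed separately",
}


def describe_fun_asr(name: str) -> List[str]:
    """Return descriptions of the version tags found in a Fun-ASR model name."""
    padded = f"-{name}-"  # pad so tags at either end still sit between hyphens
    return [desc for tag, desc in FUN_ASR_TAGS.items() if f"-{tag}-" in padded]


print(describe_fun_asr("Fun-ASR-Nano-2512-LLM-split-adaptor-int8-onnx"))
print(describe_fun_asr("Fun-ASR-MLT-Nano-2512-onnx"))
```

The tables remain the authoritative source for each model's capabilities; the tags are only a naming convention.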

Fun-ASR-Nano Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| Fun-ASR-Nano-2512-LLM-onnx | Non-streaming | Chinese, English, Japanese; 7 dialects + 26 regional accents, lyrics/rap recognition | Yes | Yes | modelscope |
| Fun-ASR-Nano-2512-LLM-int8-onnx | Non-streaming | Same as above | Yes | Yes | modelscope |
| Fun-ASR-Nano-2512-LLM-split-adaptor-onnx | Non-streaming | Same as above | Yes | Yes | modelscope |
| Fun-ASR-Nano-2512-LLM-split-adaptor-int8-onnx | Non-streaming | Same as above | Yes | Yes | modelscope |
| Fun-ASR-Nano-2512-CTC-onnx | Non-streaming | Same as above | Yes | Yes | modelscope |
| Fun-ASR-Nano-2512-CTC-int8-onnx | Non-streaming | Same as above | Yes | Yes | modelscope |

Fun-ASR-MLT-Nano Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| Fun-ASR-MLT-Nano-2512-onnx | Non-streaming | 31 languages | Yes | Yes | modelscope |
| Fun-ASR-MLT-Nano-2512-int8-onnx | Non-streaming | 31 languages | Yes | Yes | modelscope |

FunASR Series

Paraformer Chinese-English Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| paraformer-large-zh-en-onnx-offline | Non-streaming | Chinese (zh), English (en) | No | No | huggingface, modelscope |
| paraformer-large-zh-en-timestamp-onnx-offline | Non-streaming | Chinese, English | No | Yes | modelscope |
| paraformer-large-en-onnx-offline | Non-streaming | English | No | No | modelscope |
| paraformer-large-zh-en-onnx-online | Streaming | Chinese, English | No | No | modelscope |

Paraformer Cantonese/Chinese/English Multilingual Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| paraformer-large-zh-yue-en-timestamp-onnx-offline-dengcunqin-20240805 | Non-streaming | Chinese, Cantonese, English | No | Yes | modelscope |
| paraformer-large-zh-yue-en-onnx-offline-dengcunqin-20240805 | Non-streaming | Chinese, Cantonese, English | No | No | modelscope |
| paraformer-large-zh-yue-en-onnx-online-dengcunqin-20240208 | Streaming | Chinese, Cantonese, English | No | No | modelscope |

SeACo-Paraformer Hotword Customization Model

SeACo-Paraformer is a next-generation non-autoregressive speech recognition model with hotword customization, proposed by Alibaba Speech Lab. Compared with the previous CLAS-based hotword customization solution, SeACo-Paraformer decouples the hotword module from the ASR model and performs hotword boosting via posterior probability fusion. This makes the boosting process visible and controllable while significantly improving hotword recall.

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| paraformer-seaco-large-zh-timestamp-onnx-offline | Non-streaming | Chinese, supports hotword customization | No | Yes | modelscope |

SenseVoice Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| sensevoice-small-onnx | Non-streaming | Chinese, Cantonese, English, Japanese, Korean | Yes | No | modelscope |
| sensevoice-small-int8-onnx | Non-streaming | Chinese, Cantonese, English, Japanese, Korean | Yes | No | modelscope |
| sensevoice-small-wenetspeech-yue-onnx | Non-streaming | Cantonese, Chinese, English, Japanese, Korean | Yes | No | modelscope |
| sensevoice-small-wenetspeech-yue-int8-onnx | Non-streaming | Cantonese, Chinese, English, Japanese, Korean | Yes | No | modelscope |
| sensevoice-small-split-embed-onnx | Non-streaming | Chinese, Cantonese, English, Japanese, Korean | Yes | No | modelscope |

K2TransducerAsr Series

Streaming Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| k2transducer-lstm-en-onnx-online-csukuangfj-20220903 | Streaming | English | No | No | modelscope |
| k2transducer-lstm-zh-onnx-online-csukuangfj-20221014 | Streaming | Chinese | No | No | modelscope |
| k2transducer-zipformer-en-onnx-online-weijizhuang-20221202 | Streaming | English | No | No | modelscope |
| k2transducer-zipformer-en-onnx-online-zengwei-20230517 | Streaming | English | No | No | modelscope |
| k2transducer-zipformer-multi-zh-hans-onnx-online-20231212 | Streaming | Chinese | No | No | modelscope |
| k2transducer-zipformer-ko-onnx-online-johnbamma-20240612 | Streaming | Korean | No | No | modelscope |
| k2transducer-zipformer-ctc-small-zh-onnx-online-20250401 | Streaming | Chinese | No | No | modelscope |
| k2transducer-zipformer-large-zh-onnx-online-yuekai-20250630 | Streaming | Chinese | No | No | modelscope |
| k2transducer-zipformer-xlarge-zh-onnx-online-yuekai-20250630 | Streaming | Chinese | No | No | modelscope |
| k2transducer-zipformer-ctc-large-zh-onnx-online-yuekai-20250630 | Streaming | Chinese | No | No | modelscope |
| k2transducer-zipformer-ctc-xlarge-zh-onnx-online-yuekai-20250630 | Streaming | Chinese | No | No | modelscope |

Non-streaming Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| k2transducer-conformer-en-onnx-offline-csukuangfj-20220513 | Non-streaming | English | No | No | modelscope |
| k2transducer-conformer-zh-onnx-offline-luomingshuang-20220727 | Non-streaming | Chinese | No | No | modelscope |
| k2transducer-zipformer-en-onnx-offline-yfyeung-20230417 | Non-streaming | English | No | No | modelscope |
| k2transducer-zipformer-large-en-onnx-offline-zengwei-20230516 | Non-streaming | English | No | No | modelscope |
| k2transducer-zipformer-small-en-onnx-offline-zengwei-20230516 | Non-streaming | English | No | No | modelscope |
| k2transducer-zipformer-zh-onnx-offline-wenetspeech-20230615 | Non-streaming | Chinese | No | No | modelscope |
| k2transducer-zipformer-zh-onnx-offline-multi-zh-hans-20230902 | Non-streaming | Chinese | No | No | modelscope |
| k2transducer-zipformer-zh-en-onnx-offline-20231122 | Non-streaming | Chinese, English | No | No | modelscope |
| k2transducer-zipformer-cantonese-onnx-offline-20240313 | Non-streaming | Cantonese | No | No | modelscope |
| k2transducer-zipformer-th-onnx-offline-yfyeung-20240620 | Non-streaming | Thai | No | No | modelscope |
| k2transducer-zipformer-ja-onnx-offline-reazonspeech-20240801 | Non-streaming | Japanese | No | No | modelscope |
| k2transducer-zipformer-ru-onnx-offline-20240918 | Non-streaming | Russian | No | No | modelscope |
| k2transducer-zipformer-vi-onnx-offline-20250420 | Non-streaming | Vietnamese | No | No | modelscope |
| k2transducer-zipformer-ctc-zh-onnx-offline-20250703 | Non-streaming | Chinese | No | No | modelscope, github |
| k2transducer-zipformer-ctc-small-zh-onnx-offline-20250716 | Non-streaming | Chinese | No | No | modelscope |

MedAsr Series

Notes

  • Model architecture: A Conformer-based medical-domain speech recognition model released by Google Health
  • Application scenarios: Radiology dictation, doctor-patient dialogue, medical transcription, and similar tasks
  • Supported languages: English only (primarily American English)
  • Model characteristics: Pre-trained on approximately 5,000 hours of medical speech data, with strong recognition of medical terminology. Performance on non-standard drug names and structured data such as dates and times may vary; the model can be fine-tuned to adapt to specific business scenarios

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| medasr-onnx | Non-streaming | English | No | No | modelscope |

moonshine Series

moonshine-tiny Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| moonshine-tiny-onnx | Non-streaming | English | Yes | No | modelscope |
| moonshine-tiny-int8-onnx | Non-streaming | English | Yes | No | modelscope |
| moonshine-tiny-en-onnx | Non-streaming | English | Yes | No | modelscope |
| moonshine-tiny-zh-onnx | Non-streaming | Chinese | Yes | No | modelscope |
| moonshine-tiny-zh-int8-onnx | Non-streaming | Chinese | Yes | No | modelscope |
| moonshine-tiny-vi-onnx | Non-streaming | Vietnamese | Yes | No | modelscope |
| moonshine-tiny-vi-int8-onnx | Non-streaming | Vietnamese | Yes | No | modelscope |
| moonshine-tiny-uk-onnx | Non-streaming | Ukrainian | Yes | No | modelscope |
| moonshine-tiny-uk-int8-onnx | Non-streaming | Ukrainian | Yes | No | modelscope |
| moonshine-tiny-ko-onnx | Non-streaming | Korean | Yes | No | modelscope |
| moonshine-tiny-ko-int8-onnx | Non-streaming | Korean | Yes | No | modelscope |
| moonshine-tiny-ja-onnx | Non-streaming | Japanese | Yes | No | modelscope |
| moonshine-tiny-ja-int8-onnx | Non-streaming | Japanese | Yes | No | modelscope |
| moonshine-tiny-ar-onnx | Non-streaming | Arabic | Yes | No | modelscope |
| moonshine-tiny-ar-int8-onnx | Non-streaming | Arabic | Yes | No | modelscope |
| moonshine-tiny-fr-onnx | Non-streaming | French | Yes | No | modelscope |
| moonshine-tiny-fr-int8-onnx | Non-streaming | French | Yes | No | modelscope |

moonshine-base Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| moonshine-base-onnx | Non-streaming | English | Yes | No | modelscope |
| moonshine-base-int8-onnx | Non-streaming | English | Yes | No | modelscope |
| moonshine-base-en-onnx | Non-streaming | English | Yes | No | modelscope |
| moonshine-base-zh-onnx | Non-streaming | Chinese | Yes | No | modelscope |
| moonshine-base-zh-int8-onnx | Non-streaming | Chinese | Yes | No | modelscope |
| moonshine-base-vi-onnx | Non-streaming | Vietnamese | Yes | No | modelscope |
| moonshine-base-vi-int8-onnx | Non-streaming | Vietnamese | Yes | No | modelscope |
| moonshine-base-uk-onnx | Non-streaming | Ukrainian | Yes | No | modelscope |
| moonshine-base-uk-int8-onnx | Non-streaming | Ukrainian | Yes | No | modelscope |
| moonshine-base-ko-onnx | Non-streaming | Korean | Yes | No | modelscope |
| moonshine-base-ko-int8-onnx | Non-streaming | Korean | Yes | No | modelscope |
| moonshine-base-ja-onnx | Non-streaming | Japanese | Yes | No | modelscope |
| moonshine-base-ja-int8-onnx | Non-streaming | Japanese | Yes | No | modelscope |
| moonshine-base-ar-onnx | Non-streaming | Arabic | Yes | No | modelscope |
| moonshine-base-ar-int8-onnx | Non-streaming | Arabic | Yes | No | modelscope |

WeNet Series

Streaming Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| wenet-u2pp-conformer-aishell-onnx-online-20210601 | Streaming | Chinese | No | No | modelscope |
| wenet-u2pp-conformer-wenetspeech-onnx-online-20220506 | Streaming | Chinese | No | No | modelscope |
| wenet-u2pp-conformer-wenetspeech-int8-onnx-online-20220506 | Streaming | Chinese | No | No | modelscope |
| wenet-u2pp-conformer-gigaspeech-onnx-online-20210728 | Streaming | English | No | No | modelscope |

Non-streaming Models

| Model Name | Type | Languages | Punctuation | Timestamps | Download Link |
| --- | --- | --- | --- | --- | --- |
| wenet-u2pp-conformer-aishell-onnx-offline-20210601 | Non-streaming | Chinese | No | No | modelscope |
| wenet-u2pp-conformer-wenetspeech-onnx-offline-20220506 | Non-streaming | Chinese | No | No | modelscope |
| wenet-u2pp-conformer-wenetspeech-int8-onnx-offline-20220506 | Non-streaming | Chinese | No | No | modelscope |
| wenet-u2pp-conformer-gigaspeech-onnx-offline-20210728 | Non-streaming | English | No | No | modelscope |

Whisper Series

Notes

  1. Models with the -kv suffix have KV cache inference acceleration enabled
  2. All models support punctuation and timestamps. Paragraph-level timestamps are output by default; word-level timestamps can be enabled via parameters
  3. Language coverage:
    • Standard multilingual versions (tiny/small/medium/large-v1/large-v2): Support 99 languages (including Chinese, Cantonese, English, Japanese, Korean, Russian, Arabic, Vietnamese, Ukrainian, and other major world languages)
    • large-v3 / large-v3-turbo series: Add low-resource languages beyond the 99, for a total of approximately 106 languages. New additions include Zulu (zu), Maori (mi), Swahili (sw), Hausa (ha), etc., with significantly improved language identification
    • Full language list and codes:
af(Afrikaans), am(Amharic), ar(Arabic), as(Assamese), az(Azerbaijani), 
ba(Bashkir), be(Belarusian), bg(Bulgarian), bn(Bengali), bo(Tibetan), br(Breton), bs(Bosnian), 
ca(Catalan), cs(Czech), cy(Welsh), 
da(Danish), de(German), 
el(Greek), en(English), es(Spanish), et(Estonian), eu(Basque), 
fa(Persian), fi(Finnish), fo(Faroese), fr(French), 
ga(Irish), gl(Galician), gu(Gujarati), 
ha(Hausa), haw(Hawaiian), he(Hebrew), hi(Hindi), hr(Croatian), hu(Hungarian), hy(Armenian), 
id(Indonesian), ig(Igbo), is(Icelandic), it(Italian), 
ja(Japanese), jv(Javanese), 
ka(Georgian), kk(Kazakh), km(Khmer), kn(Kannada), ko(Korean), ku(Kurdish), ky(Kyrgyz), 
la(Latin), lb(Luxembourgish), lg(Ganda), lt(Lithuanian), lv(Latvian), 
mai(Maithili), mg(Malagasy), mi(Maori), mk(Macedonian), ml(Malayalam), mn(Mongolian), mr(Marathi), ms(Malay), mt(Maltese), my(Burmese), 
ne(Nepali), nl(Dutch), no(Norwegian), nso(Northern Sotho), ny(Chichewa), 
oc(Occitan), om(Oromo), or(Odia), 
pa(Punjabi), pl(Polish), ps(Pashto), pt(Portuguese), 
ro(Romanian), ru(Russian), rw(Kinyarwanda), 
sa(Sanskrit), sd(Sindhi), si(Sinhala), sk(Slovak), sl(Slovenian), sm(Samoan), sn(Shona), so(Somali), sq(Albanian), sr(Serbian), ss(Swati), st(Southern Sotho), su(Sundanese), sv(Swedish), sw(Swahili), 
ta(Tamil), te(Telugu), tg(Tajik), th(Thai), ti(Tigrinya), tk(Turkmen), tl(Tagalog), tn(Tswana), to(Tongan), tr(Turkish), ts(Tsonga), tt(Tatar), tw(Twi), 
ug(Uyghur), uk(Ukrainian), ur(Urdu), uz(Uzbek), 
ve(Venda), vi(Vietnamese), vo(Volapük), 
wa(Walloon), wo(Wolof), 
xh(Xhosa), 
yi(Yiddish), yo(Yoruba), 
zh(Chinese), yue(Cantonese), zu(Zulu)
    • Language codes only (short form):
af, am, ar, as, az,
ba, be, bg, bn, bo, br, bs,
ca, cs, cy,
da, de, el, en, es, et, eu,
fa, fi, fo, fr, ga, gl, gu,
ha, haw, he, hi, hr, hu, hy,
id, ig, is, it,
ja, jv,
ka, kk, km, kn, ko, ku, ky,
la, lb, lg, lt, lv,
mai, mg, mi, mk, ml, mn, mr, ms, mt, my,
ne, nl, no, nso, ny,
oc, om, or,
pa, pl, ps, pt,
ro, ru, rw,
sa, sd, si, sk, sl, sm, sn, so, sq, sr, ss, st, su, sv, sw,
ta, te, tg, th, ti, tk, tl, tn, to, tr, ts, tt, tw,
ug, uk, ur, uz,
ve, vi, vo,
wa, wo, xh,
yi, yo,
zh, yue, zu

whisper-tiny Models

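| Model Name | Type | Languages | Punctuation | Timestamps | KV | Download Link |
| --- | --- | --- | --- | --- | --- | --- |
| whisper-tiny-onnx | Non-streaming | 99 multilingual | Yes | Yes | No | modelscope |
| whisper-tiny-onnx-kv | Non-streaming | 99 multilingual | Yes | Yes | Yes | modelscope |
| whisper-tiny-en-onnx | Non-streaming | English | Yes | Yes | No | modelscope |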

whisper-small Models

| Model Name | Type | Languages | Punctuation | Timestamps | KV | Download Link |
| --- | --- | --- | --- | --- | --- | --- |
| whisper-small-onnx | Non-streaming | 99 multilingual | Yes | Yes | No | modelscope |
| whisper-small-en-onnx | Non-streaming | English | Yes | Yes | No | modelscope |
| whisper-small-cantonese-onnx | Non-streaming | Cantonese, Chinese, English | Yes | Yes | No | modelscope |

whisper-medium Models

| Model Name | Type | Languages | Punctuation | Timestamps | KV | Download Link |
| --- | --- | --- | --- | --- | --- | --- |
| whisper-medium-onnx | Non-streaming | 99 multilingual | Yes | Yes | No | modelscope |
| whisper-medium-int8-onnx-kv | Non-streaming | 99 multilingual | Yes | Yes | Yes | modelscope |
| whisper-medium-en-onnx | Non-streaming | English | Yes | Yes | No | modelscope |
| whisper-medium-yue-onnx-kv | Non-streaming | Cantonese | Yes | Yes | Yes | modelscope |
| whisper-medium-yue-int8-onnx-kv | Non-streaming | Cantonese | Yes | Yes | Yes | modelscope |

whisper-large Models

| Model Name | Type | Languages | Punctuation | Timestamps | KV | Download Link |
| --- | --- | --- | --- | --- | --- | --- |
| whisper-large-v1-onnx | Non-streaming | 99 multilingual | Yes | Yes | No | modelscope |
| whisper-large-v2-onnx | Non-streaming | 99 multilingual | Yes | Yes | No | modelscope |
| whisper-large-v3-onnx | Non-streaming | ~106 multilingual | Yes | Yes | No | modelscope |
| whisper-large-v3-turbo-onnx | Non-streaming | ~106 multilingual | Yes | Yes | No | modelscope |
| whisper-large-v3-turbo-zh-onnx | Non-streaming | Chinese | Yes | Yes | No | modelscope |
| whisper-large-v3-turbo-zh-int8-onnx-kv-belle-20241016 | Non-streaming | Chinese | Yes | Yes | Yes | modelscope |

Distil-Whisper Models

| Model Name | Type | Languages | Punctuation | Timestamps | KV | Download Link |
| --- | --- | --- | --- | --- | --- | --- |
| distil-whisper-small-en-onnx | Non-streaming | English | Yes | Yes | No | modelscope |
| distil-whisper-medium-en-onnx | Non-streaming | English | Yes | Yes | No | modelscope |
| distil-whisper-large-v2-en-onnx | Non-streaming | English | Yes | Yes | No | modelscope |
| distil-whisper-large-v3-en-onnx | Non-streaming | English | Yes | Yes | No | modelscope |
| distil-whipser-large-v3.5-en-onnx | Non-streaming | English | Yes | Yes | No | modelscope |
| distil-whisper-large-v2-multi-hans-onnx | Non-streaming | Chinese (compatible with 99 languages) | Yes | Yes | No | modelscope |
| distil-whisper-small-cantonese-onnx-alvanlii-20240404 | Non-streaming | Cantonese, Chinese, English | Yes | Yes | No | modelscope |

General Notes

  • int8 = quantized version, smaller size, faster speed
  • kv / selfcrosskv / selfcrosskvstack / opt = inference optimization versions
  • Some models provide HuggingFace or GitHub sources; see each table
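
Because these suffixes are hyphen-separated tokens in the model names, a script that picks a model can read them directly. The sketch below is a hypothetical helper (`describe_model_name` is not part of any listed project). It also treats `online` and `offline` as streaming and non-streaming markers, which is an assumption inferred from the names in the tables above; names without either token, such as the Whisper models, return `None` for the mode and should be checked against the Type column.

```python
from typing import Dict

# Inference-optimization suffixes listed in the general notes.
OPTIMIZATION_TAGS = ("kv", "selfcrosskv", "selfcrosskvstack", "opt")


def describe_model_name(name: str) -> Dict[str, object]:
    """Read the quantization, optimization, and mode tokens from a model name."""
    tokens = name.split("-")
    if "online" in tokens:
        mode = "streaming"        # assumption: 'online' marks streaming models
    elif "offline" in tokens:
        mode = "non-streaming"    # assumption: 'offline' marks non-streaming models
    else:
        mode = None               # not encoded in the name; see the Type column
    return {
        "int8": "int8" in tokens,
        "optimizations": [tag for tag in OPTIMIZATION_TAGS if tag in tokens],
        "mode": mode,
    }


for model in (
    "whisper-medium-int8-onnx-kv",
    "DolphinAsr-base-onnx-opt",
    "wenet-u2pp-conformer-aishell-onnx-online-20210601",
):
    print(model, describe_model_name(model))
```

Splitting on hyphens rather than searching for substrings keeps `kv` from matching inside `selfcrosskv` or `selfcrosskvstack`.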