Overview
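FunASR is an end-to-end speech recognition toolkit that supports more than 50 languages, speaker diarization, emotion detection, streaming recognition, and an OpenAI-compatible API. The README advertises it as 170x faster than Whisper, and the toolkit provides pretrained models such as SenseVoice and Paraformer.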
README preview
([简体中文](./README_zh.md)|English|[日本語](./README_ja.md)|[한국어](./README_ko.md))

Industrial speech recognition. 170x faster than Whisper. 50+ languages.
Speaker diarization · Emotion detection · Streaming · One API call

Quick Start · Colab · Benchmark · Model selection · Migration guide · Use cases · Deployment matrix · Models · Agent Integration · Docs · Contribute

---

## Quick Start

[Open in Colab](https://colab.research.google.com/github/modelscope/FunASR/blob/main/examples/colab/funasr_quickstart.ipynb)

No local setup? Open the [Colab quickstart](./examples/colab/) to transcribe a public sample or upload your own audio in a browser.

```bash
pip install torch torchaudio
pip install funasr
```

```python
from funasr import AutoModel

model = AutoModel(model="iic/SenseVoiceSmall", vad_model="fsmn-vad", spk_model="cam++", device="cuda")
result = model.generate(input="meeting.wav")
```

**Output**: structured text with speaker labels, timestamps, and punctuation:
```
[00:00.4 → 00:03.8] Speaker 0: Let's discuss the Q3 plan.
[00:04.2 → 00:07.1] Speaker 1: Sounds good. I have three points.
[00:07.5 → 00:12.3] Speaker 0: Go ahead. We have 30 minutes.
```

That's it. **One model, one call**: VAD segmentation, speech recognition, punctuation, speaker diarization all happen automatically.

### LLM-powered ASR: Fun-ASR-Nano

For highest accuracy across 31 languages (including Chinese dialects), use [Fun-ASR-Nano](https://github.com/FunAudioLLM/Fun-ASR), an LLM-based ASR combining SenseVoice encoder with Qwen3-0.6B decoder:

```python
from funasr import AutoModel

model = AutoModel(model="FunAudioLLM/Fun-ASR-Nano-2512", vad_model="fsmn-vad", device="cuda")
result = model.generate(input="meeting.wav")
```

With vLLM acceleration (16x faster, batch processing):

```python
from funasr.auto.auto_model_vllm import AutoModelVLLM

model = AutoModelVLLM(model="FunAudioLLM/Fun-ASR-Na
```
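The Quick Start output above is shown as display text, and the preview does not say what structure `result` actually has. As a minimal sketch, assuming you have the transcript as plain text lines in exactly that `[mm:ss.s → mm:ss.s] Speaker N: text` layout, the lines could be turned into records for downstream use. The names `Segment`, `parse_transcript_line`, and `speaking_time` are hypothetical helpers for illustration, not part of FunASR.

```python
import re
from dataclasses import dataclass

# Matches the display format shown in the README preview, e.g.
# "[00:00.4 → 00:03.8] Speaker 0: Let's discuss the Q3 plan."
# Assumption: timestamps are minutes:seconds with an optional fraction.
LINE_RE = re.compile(
    r"\[(\d+):(\d+(?:\.\d+)?)\s*→\s*(\d+):(\d+(?:\.\d+)?)\]"
    r"\s*Speaker\s+(\d+):\s*(.*)"
)


@dataclass
class Segment:
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    speaker: int   # speaker index as printed in the transcript
    text: str      # recognized text for this segment


def parse_transcript_line(line: str) -> Segment:
    """Parse one transcript line into a Segment (hypothetical helper)."""
    match = LINE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized transcript line: {line!r}")
    start_min, start_sec, end_min, end_sec, speaker, text = match.groups()
    return Segment(
        start=int(start_min) * 60 + float(start_sec),
        end=int(end_min) * 60 + float(end_sec),
        speaker=int(speaker),
        text=text,
    )


def speaking_time(segments: list[Segment]) -> dict[int, float]:
    """Sum the duration of all segments per speaker, in seconds."""
    totals: dict[int, float] = {}
    for seg in segments:
        totals[seg.speaker] = totals.get(seg.speaker, 0.0) + (seg.end - seg.start)
    return totals


if __name__ == "__main__":
    sample = [
        "[00:00.4 → 00:03.8] Speaker 0: Let's discuss the Q3 plan.",
        "[00:04.2 → 00:07.1] Speaker 1: Sounds good. I have three points.",
        "[00:07.5 → 00:12.3] Speaker 0: Go ahead. We have 30 minutes.",
    ]
    segments = [parse_transcript_line(line) for line in sample]
    for speaker, seconds in sorted(speaking_time(segments).items()):
        print(f"Speaker {speaker}: {seconds:.1f} s")
```

Parsing into a small dataclass keeps later steps (per-speaker totals, filtering by time range, export to subtitles) independent of the printed layout. If the library returns structured fields directly, those should be preferred over re-parsing text.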