whisper
Dataset & Modelopenai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Overview
Whisper is a general-purpose speech recognition model trained on a large dataset of diverse audio. It can perform multilingual speech recognition, speech translation, and language identification using a Transformer sequence-to-sequence architecture.
README Preview
# Whisper\n\n[[Blog]](https://openai.com/blog/whisper)\n[[Paper]](https://arxiv.org/abs/2212.04356)\n[[Model card]](https://github.com/openai/whisper/blob/main/model-card.md)\n[[Colab example]](https://colab.research.google.com/github/openai/whisper/blob/master/notebooks/LibriSpeech.ipynb)\n\nWhisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.\n\n\n## Approach\n\n\n\nA Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.\n\n\n## Setup\n\nWe used Python 3.9.9 and [PyTorch](https://pytorch.org/) 1.10.1 to train and test our models, but the codebase is expected to be compatible with Python 3.8-3.11 and recent PyTorch versions. The codebase also depends on a few Python packages, most notably [OpenAI's tiktoken](https://github.com/openai/tiktoken) for their fast tokenizer implementation. You can download and install (or update to) the latest release of Whisper with the following command:\n\n pip install -U openai-whisper\n\nAlternatively, the following command will pull and install the latest commit from this repository, along with its Python dependencies:\n\n pip install git+https://github.com/openai/whisper.git \n\nTo update the package to the latest version of this repository, please run:\n\n pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git\n\nIt also requires the command-line tool [`ff