How to run Whisper on macOS

Prepare

Make sure you have a working Python environment with the ability to create virtual environments. If you don't know what this is about, check out python.dev.recipes.
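
A quick sanity check for this (using the standard-library venv module; the actual environment for this guide is set up with pyenv further below):

```shell
# Print the interpreter version; any recent Python 3 works for this check
python3 --version

# Create and delete a throwaway virtual environment to confirm that
# virtual environment creation works on this machine
python3 -m venv /tmp/venv-sanity
rm -rf /tmp/venv-sanity
```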

Then clone this repo

git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp

Download the ggml model and the compressed Core ML encoder model

aria2c --out=./models/ggml-large.bin "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large.bin"
aria2c --out=./models/ggml-large-encoder.mlmodelc.zip "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-encoder.mlmodelc.zip"

This uses the largest model, which requires around 5 GB of RAM. You can also try other models from the Hugging Face repo: https://huggingface.co/ggerganov/whisper.cpp
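
The files in the repo follow a consistent naming pattern, so swapping in a smaller model is a simple substitution. This sketch prints the two URLs for the base.en model (one example; check the repo for the full list of model names), which you can then pass to aria2c with --out as in the commands above:

```shell
# Model files are named ggml-<model>.bin and
# ggml-<model>-encoder.mlmodelc.zip in the Hugging Face repo
MODEL=base.en
BASE_URL="https://huggingface.co/ggerganov/whisper.cpp/resolve/main"

echo "${BASE_URL}/ggml-${MODEL}.bin"
echo "${BASE_URL}/ggml-${MODEL}-encoder.mlmodelc.zip"
```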

Unzip the model

unzip ./models/ggml-large-encoder.mlmodelc.zip -d ./models

Prepare a virtual environment with Python 3.10 (the commands below use pyenv and virtualenvwrapper)

pyenv install 3.10.13
pyenv local 3.10.13
mkvirtualenv whisper

Then install the packages

pip install torch==2.0.0
pip install ane_transformers
pip install openai-whisper
pip install coremltools
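
These packages are only needed if you want to convert a model to Core ML yourself instead of downloading the pre-converted encoder as above. whisper.cpp ships a conversion script for that, run from the repo root:

```shell
# Optional: generate the Core ML encoder locally instead of downloading it.
# The guard skips the step when run outside a whisper.cpp checkout.
if [ -x ./models/generate-coreml-model.sh ]; then
  ./models/generate-coreml-model.sh large
fi
```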

Confirm that Xcode is installed and install the command line tools by running

xcode-select --install

Build whisper.cpp with Core ML support

make clean
WHISPER_COREML=1 make -j
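
To confirm the build succeeded, check that the main example binary was produced in the repo root (guarded so the check is a no-op outside a whisper.cpp checkout):

```shell
# After a successful build, ./main exists and prints its usage text
if [ -x ./main ]; then
  ./main --help
fi
```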

Run the examples

./main -m models/ggml-large.bin -f samples/jfk.wav
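
To transcribe your own audio: the main example expects 16-bit, 16 kHz mono WAV input, so other formats need converting first, for example with ffmpeg. A sketch (input.mp3 and output.wav are placeholder names; the guard simply skips the step when ffmpeg or the input file is missing):

```shell
# Convert arbitrary audio to the 16-bit, 16 kHz mono WAV that
# whisper.cpp expects; input.mp3/output.wav are placeholder names
if command -v ffmpeg >/dev/null && [ -f input.mp3 ]; then
  ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
  ./main -m models/ggml-large.bin -f output.wav
fi
```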

The first run on a device is slow because the ANE service compiles the Core ML model into a device-specific format. Subsequent runs are faster.