
Whisper on Mac: Local Transcription Without Cloud

Whisper locally on Apple Silicon: mlx-whisper, WhisperKit, privacy and speaker diarization.

Technical research and editorial review. Original measurements are explicitly identified in the article.

Published: May 4, 2026 Updated: June 18, 2026


Whisper can transcribe audio files, interviews, meetings, or voice memos locally. On Apple Silicon, mlx-whisper is the easiest entry point.

Quick start

Start with mlx-whisper and the small or medium model. Use large-v3 when quality matters more than speed. For speaker diarization, add pyannote.
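A minimal sketch of that quick start, assuming mlx-whisper is installed via pip and exposes `mlx_whisper.transcribe` with a `path_or_hf_repo` argument, as its README describes. The repo name used as the default is an assumption; check the mlx-community page on Hugging Face for the small and medium conversions. The helper names (`format_timestamp`, `render_transcript`, `transcribe_file`) are my own for illustration.

```python
from typing import Any


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS for a readable transcript."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(result: dict[str, Any]) -> str:
    """Turn a Whisper-style result dict into one timestamped line per segment."""
    lines = []
    for seg in result.get("segments", []):
        lines.append(f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_file(
    path: str,
    repo: str = "mlx-community/whisper-large-v3-mlx",  # assumed repo name
) -> dict[str, Any]:
    """Transcribe locally with mlx-whisper (Apple Silicon only)."""
    # Imported lazily so the formatting helpers work without the package.
    import mlx_whisper

    return mlx_whisper.transcribe(path, path_or_hf_repo=repo)
```

On a Mac, `print(render_transcript(transcribe_file("interview.m4a")))` would print one timestamped line per segment. The first call downloads the model, so that step needs a network connection.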

Model choice

tiny (~75 MB): only for very short clips or pre-filtering.

base (~140 MB): fast, but inaccurate.

small (~460 MB): good for real-time streaming.

medium (~1.5 GB): my go-to for most tasks.

large-v3 (~3 GB): maximum accuracy, especially for German.
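The size list above can be turned into a small lookup, sketched here under the assumption that the figures are approximate download sizes in MB. Runtime memory use will be higher than these numbers, so treat the budget as a rough filter, not a guarantee. The function name is hypothetical.

```python
# Approximate sizes in MB, copied from the list above.
MODEL_SIZES_MB = {
    "tiny": 75,
    "base": 140,
    "small": 460,
    "medium": 1500,
    "large-v3": 3000,
}


def largest_model_within(budget_mb: int) -> str:
    """Return the largest model whose listed size fits the budget."""
    fitting = [name for name, size in MODEL_SIZES_MB.items() if size <= budget_mb]
    if not fitting:
        raise ValueError(f"no model fits in {budget_mb} MB")
    # Pick the fitting model with the biggest listed size.
    return max(fitting, key=MODEL_SIZES_MB.__getitem__)
```

With a 2000 MB budget this picks medium; with 500 MB it falls back to small.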

What I tested

I ran mlx-whisper on my Mac Mini M4 with different models. Here’s what I noticed:

medium is the sweet spot. For interviews and meetings, it delivers usable transcripts. Speed is acceptable.

large-v3 is noticeably better, especially for German with dialect. But it takes more time and memory.
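To put a number on "acceptable" speed for your own machine, one common measure is the real-time factor: processing time divided by audio length. The sketch below is not the procedure used for the observations above, just one way to reproduce such a comparison. It takes any transcription callable, so the function and parameter names are illustrative.

```python
import time
from typing import Callable


def real_time_factor(
    transcribe: Callable[[str], object],
    path: str,
    audio_seconds: float,
) -> float:
    """Wall-clock processing time divided by audio length (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    start = time.perf_counter()
    transcribe(path)  # result is discarded; only the duration matters here
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds
```

A real-time factor of 0.1 means one hour of audio takes about six minutes to transcribe. Run each model once beforehand so that model download and loading do not distort the timing.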

Important: Privacy

Whisper itself doesn’t send data to the cloud, and tools like MacWhisper or WhisperKit also run offline. However, some commercial apps forward audio to cloud APIs. Check network activity with Little Snitch.
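For your own Python scripts, a rough complement to a network monitor is to make connection attempts fail loudly while transcribing. The sketch below patches `socket.socket.connect` for the duration of a `with` block. It only catches connections opened through Python's socket module; native code can bypass it, so it does not replace a system-level check. `no_network` and `NetworkBlocked` are hypothetical names.

```python
import socket
from contextlib import contextmanager


class NetworkBlocked(RuntimeError):
    """Raised when code tries to open a connection inside the guard."""


@contextmanager
def no_network():
    """Make Python-level socket connect attempts fail while active."""
    original_connect = socket.socket.connect

    def refuse(self, address):
        raise NetworkBlocked(f"blocked connection attempt to {address!r}")

    socket.socket.connect = refuse
    try:
        yield
    finally:
        # Always restore the real connect, even if the body raised.
        socket.socket.connect = original_connect
```

Wrap the transcription call in `with no_network():` after the model has been downloaded once. Setting the environment variable `HF_HUB_OFFLINE=1` should additionally stop the Hugging Face hub client from contacting the network, assuming the model is already cached.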

Speaker diarization

Standard Whisper doesn’t support diarization. For speaker labels, add pyannote.audio. It runs locally, but adds another model pass (~1-2 GB).
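Combining the two outputs usually comes down to matching time ranges. A minimal sketch, assuming Whisper-style segments with `start` and `end` in seconds and speaker turns given as `(start, end, label)` tuples. With pyannote.audio, such tuples could be collected from the diarization result's `itertracks(yield_label=True)`, though the exact output shape depends on the pyannote version. Each segment gets the speaker whose turn overlaps it the most; this is one simple heuristic, not the only way to align them.

```python
from typing import Optional

Turn = tuple[float, float, str]  # (start, end, speaker label)


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time ranges."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments: list[dict], turns: list[Turn]) -> list[dict]:
    """Copy each segment and add the speaker with the largest overlap."""
    labeled = []
    for seg in segments:
        best_speaker: Optional[str] = None
        best_overlap = 0.0
        for t_start, t_end, speaker in turns:
            shared = overlap(seg["start"], seg["end"], t_start, t_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        # Segments with no overlapping turn keep speaker=None.
        labeled.append({**seg, "speaker": best_speaker})
    return labeled
```

Segments that straddle a speaker change are assigned to whoever talks longer within them. For finer results, the same matching could be applied to word-level timestamps instead of whole segments.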

My verdict

Whisper is the best local speech recognition option for Mac. Start with mlx-whisper and medium, and upgrade to large-v3 if you need more accuracy.

Tested June 2026 on Mac Mini M4 with 32 GB.

Transparency

Sources and review basis


These primary and reference sources form the basis of the technical assessment. Vendor claims and external benchmarks are identified as such in the article.

  1. github.com/openai/whisper
  2. github.commain / whisper