8 min read · By Brian Ellin

macOS 26 Voice Transcription: Setup Guide for Developers (2026)

macOS 26 uses Apple's on-device foundation model for dictation — a big upgrade. Here's how to set it up, configure hotkeys, and use it in developer workflows like Claude Code, Cursor, Codex, Gemini, and Amp.

macOS 26 (Tahoe) quietly shipped one of the most useful upgrades for developers: Dictation now runs on Apple's on-device foundation model. Not the old speech recognition that mangled "TypeScript" into "type script" — an actual AI model running locally on your Mac.

I've been using it daily alongside doing., and the accuracy improvement is real. If you tried Mac Dictation a couple years ago and gave up, it's worth another shot.

Here's the full setup, plus how to actually use it in a developer workflow.

What's new about Dictation in macOS 26

Before Tahoe, macOS Dictation used older speech recognition that needed an internet connection for best results. macOS 26 replaced the underlying engine with Apple's foundation model, running entirely on your Apple Silicon chip. In practice:

  • Better general accuracy — noticeably improved for everyday speech and common tech terms like "React" or "API." Still struggles with niche developer vocabulary, but a big step up from previous versions.
  • Fully on-device — no audio leaves your Mac when using on-device mode. This matters if you're dictating anything work-related.
  • Lower latency — no network round-trip. You speak, it transcribes.
  • Smarter punctuation — the model inserts commas and periods based on your speech cadence rather than waiting for you to say "period."

What you need

  • Apple Silicon Mac (M1 or later) — Intel Macs don't get the foundation model
  • macOS 26 (Tahoe) or later — the foundation model ships with the OS, no extra download needed

Step 1: Enable Dictation

  1. Open Apple menu > System Settings
  2. Click Keyboard in the sidebar
  3. Scroll to the Dictation section and toggle it On

Dictation is free — it's built into macOS, no purchase or subscription required. Toggle it on and you're ready to go.

[Screenshot: macOS 26 System Settings showing the Dictation panel with toggle, language, microphone source, shortcut, and auto-punctuation options]

Notice the Auto-punctuation toggle at the bottom — make sure that's on. It's one of the best parts of the foundation model upgrade.

For full details, see Apple's official Dictation support page.

Step 2: Pick your hotkey

In the same System Settings > Keyboard > Dictation panel, click the Shortcut dropdown:

  • Press Microphone Key — tap the microphone key (on newer MacBook keyboards)
  • Press Control Key Twice — double-tap either Control key
  • Press Globe Twice — double-tap the Globe/Fn key
  • Press Right Command Key Twice — double-tap the right Command key
  • Press Left Command Key Twice — double-tap the left Command key
  • Press Either Command Key Twice — double-tap any Command key
  • Customize... — define your own shortcut

Tip

I use Control Key Twice for built-in Dictation and the Fn key for doing. — that way I have quick access to both without conflicts. If you only need one, Control twice is fast and doesn't interfere with IDE shortcuts in Cursor, VS Code, or Xcode.

Step 3: Start talking

  1. Click into any text field
  2. Press your hotkey — a microphone icon appears
  3. Speak naturally
  4. Press the hotkey again (or click Done)

Text appears where your cursor is. That's it.

Using voice transcription in developer workflows

The setup takes two minutes. The interesting part is figuring out where voice input actually saves time in your day. After a few months of using it, here's where it sticks:

Dictating prompts to AI coding tools

This is the big one. When you're working in Cursor, Claude, or ChatGPT, better prompts get better results. But typing out a detailed, context-rich prompt is slow — so most people write short, vague ones instead.

Voice changes the math. At a normal speaking pace you can dictate a 200-word prompt in under two minutes — far faster than typing it. That's the difference between "fix this function" and "this function is supposed to validate email addresses but it's letting through strings without an @ sign, and it also needs to handle the edge case where there are multiple @ symbols — reject those too."

The second prompt gets dramatically better output. Voice makes it effortless to provide that level of detail.

Writing Slack messages and PRDs

Slack threads, product specs, design docs, RFC comments — anything where you need to explain your thinking in more than a sentence or two. Speaking is 3-4x faster than typing for most people, and you tend to explain things more clearly when you talk through them versus typing.

I'll often dictate a first pass, then spend 30 seconds cleaning it up. Still faster than typing from scratch.

Code review comments

When you're reviewing a PR and need to explain why something should change (not just what), dictation is perfect. Talk through your reasoning like you'd explain it to the person sitting next to you. The result reads more like a conversation and less like a terse code review that needs three follow-up comments to clarify.

Capturing ideas without losing context

Mid-coding session, an idea for a different feature hits you. Rather than context-switching to a notes app and typing it out (and losing your mental stack), press the hotkey in Obsidian or Apple Notes, say it out loud, and get back to what you were doing. Ten seconds, no flow break.

Tips from actual daily use

Speak normally. The foundation model was trained on natural speech. Over-enunciating or speaking slowly actually makes it worse. Just talk like you're explaining something to a coworker.

Add tricky words to Text Replacements. The model handles common programming terms fine, but your company's product names, coworker names, or niche library names might get mangled. Add them via System Settings > Keyboard > Text Replacements.
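If you have a long list of terms, you can generate them in bulk: macOS imports text replacements from a property-list file dragged into the Text Replacements pane (the same format it produces when you drag entries out). A minimal sketch — the `shortcut`/`phrase` key names match the export format on recent macOS versions, but verify against a file you export yourself, and the example terms are just placeholders:

```python
import plistlib

# Corrections for terms dictation tends to mangle.
# "shortcut" is what gets typed/transcribed; "phrase" is the fix.
replacements = [
    {"shortcut": "type script", "phrase": "TypeScript"},
    {"shortcut": "pie dantic", "phrase": "pydantic"},
    {"shortcut": "cube control", "phrase": "kubectl"},
]

# Write a plist you can drag into System Settings > Keyboard > Text Replacements.
with open("Text Substitutions.plist", "wb") as f:
    plistlib.dump(replacements, f)
```

Drag the resulting file onto the Text Replacements list to import all entries at once.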

Don't watch the words appear. Dictation shows your transcription in real time as you speak. Some people like this — it feels responsive. Personally, I find it distracting. You start reading what you just said, second-guessing word choices mid-sentence, and losing your train of thought. Better to look away, finish your thought, then review. Dictation works best when you treat it like talking to a person, not typing with your voice. (This is actually why doing. doesn't show a live transcript — it's a deliberate design choice to keep you focused on your thoughts, not the words on screen.)

Use a headset mic in noisy spaces. The built-in mic works well in a quiet room, but AirPods or any headset mic will give you noticeably better accuracy in a coffee shop or open office.

Where built-in Dictation falls short

I use macOS Dictation every day, and it's genuinely good now. But there are real gaps:

  • Struggles with programming-specific vocabulary — general terms like "React" and "API" are fine, but the model wasn't trained for developer speech. Library names, CLI commands, variable names, and domain-specific jargon get mangled regularly. If you're dictating anything code-adjacent, expect to correct a lot.
  • No post-processing — the model does a good job stripping filler words automatically, but beyond that, what you say is what you get. There's no way to reformat for email tone, summarize, translate, or apply any other transformation to the output.
  • No transcript history — your words go wherever your cursor was, then they're gone. No searchable log, no daily record of what you dictated.
  • Short bursts only — Dictation is designed for a sentence or paragraph at a time. It's not a recording tool for meetings or long brainstorming sessions.

For general use — emails, Slack, notes — these limitations are fine. But if you're dictating in a developer context all day, the vocabulary gap in particular gets frustrating fast.


FAQ

Does macOS 26 Dictation work offline?

Yes. The foundation model ships with macOS 26, so Dictation works entirely on-device with no internet connection. Your audio never leaves your Mac in on-device mode.

Does Dictation work in VS Code and Cursor?

Yes — it works in any standard macOS text field, and VS Code and Cursor both use standard text inputs. Click into the editor or a comment field, press your hotkey, and dictate. It also works in the integrated terminal's text input areas, though not in the raw terminal output.

Is the macOS 26 foundation model the same as Siri?

They share underlying Apple Intelligence infrastructure, but Dictation and Siri use different models optimized for different tasks. Dictation's model is specifically trained for speech-to-text accuracy, while Siri handles intent recognition and task execution.

How accurate is it for programming terminology?

Hit or miss. Common terms like React, TypeScript, JavaScript, Python, API, and JSON transcribe correctly most of the time. But anything more specific — library names, CLI flags, internal project terms, camelCase variable names — frequently needs manual correction. Apple's model is a general-purpose dictation engine, not one trained on developer speech. If accurate transcription of technical vocabulary is important to you, doing. ships with developer-specific dictionaries and word replacements on top of its transcription engine, so terms that Apple's model mangles come through correctly out of the box.
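To see what a dictionary-and-replacement layer adds on top of a general-purpose engine, here's a toy sketch of post-transcription word replacement. The term list and matching strategy are purely illustrative — this is not doing.'s actual implementation:

```python
import re

# Illustrative corrections; a real dictionary would be far larger
# and could be case- and context-aware.
FIXES = {
    "type script": "TypeScript",
    "java script": "JavaScript",
    "get hub": "GitHub",
    "jason": "JSON",
}

def fix_transcript(text: str) -> str:
    # Replace each known mis-transcription, ignoring case and matching
    # on word boundaries (note: a naive rule like "jason" -> "JSON"
    # would also hit the name Jason — context-awareness matters).
    for wrong, right in FIXES.items():
        text = re.sub(rf"\b{re.escape(wrong)}\b", right, text, flags=re.IGNORECASE)
    return text

print(fix_transcript("parse the jason response in type script"))
# → parse the JSON response in TypeScript
```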


Going beyond built-in Dictation

If voice transcription is becoming part of your daily workflow — not just occasional use, but a real input method — the built-in limitations start to matter.

The differences that matter most for developers:

  • Built for developer vocabulary — doing. ships with developer-specific dictionaries and word replacements that handle programming terms, library names, and technical jargon that general-purpose dictation engines get wrong. Less time correcting, more time building.
  • AI-powered Skills for post-processing — automatically clean filler words, formalize for email, summarize, or write custom processing with Markdown prompts. Dictate messy thoughts, get clean output.
  • Transcript history as Markdown — every transcription saved as a daily .md file. Search your voice notes in Obsidian, reference them later, build a personal knowledge base from what you say.
  • NVIDIA Parakeet engine — processes an hour of audio in under a second. Not because you'll record an hour, but because short clips are transcribed before you can blink.
  • $49 once — no subscription. 100 free transcriptions to try it, no account required.
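One nice property of plain-Markdown transcript history is that it's trivially scriptable. Assuming a hypothetical folder of daily .md files (the layout is an assumption, not doing.'s documented format), a search over everything you've dictated is a few lines:

```python
from pathlib import Path

def search_transcripts(root: str, term: str) -> list[tuple[str, str]]:
    """Return (filename, line) pairs from daily .md transcript files
    whose lines contain the search term, case-insensitively."""
    hits = []
    for md in sorted(Path(root).glob("*.md")):
        for line in md.read_text().splitlines():
            if term.lower() in line.lower():
                hits.append((md.name, line.strip()))
    return hits
```

The same files also index cleanly in Obsidian or any grep-style tool.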


Want more from voice transcription?

doing. goes beyond built-in dictation — developer-tuned transcription with AI-powered post-processing, transcript history as Markdown, and everything running locally on your Mac.

Try doing. free → 100 free transcriptions · No account · $49 once