Fast voice transcription for AI builders
Talk to Claude,
don't type.
Fast local voice transcription,
no subscription.
Doing uses an on-device AI model to transcribe your voice in real-time. Hold fn, talk, release — your words land wherever your cursor is. $49 once.
macOS 14+ · Apple Silicon & Intel · No account required
add a dark mode toggle to the settings page
2:23pm to Claude Code
the signup button isn't showing on mobile, can you fix it
2:21pm to Cursor
brainstorm with me some new ideas for how we can improve the speed and accuracy of transcription
2:18pm to Claude Code
Hold fn. Talk. Done.
Hold your hotkey
Press and hold your hotkey (fn by default). Doing starts listening instantly. A small pip appears at your cursor so you know you're live.
Speak
Talk naturally. Say what you'd type. System volume lowers automatically so you stay focused. Doing transcribes in real-time.
Release
Let go of fn. Your words are pasted wherever your cursor is — Claude Code, ChatGPT, Cursor, anywhere.
Own your tools, stop renting.
Every other voice tool charges $8–15 a month, forever. That's another subscription you have to think about, another thing to cancel if you stop using it. Doing is $49 once. It's yours. No recurring charge. No account. No free tier with artificial limits. Just a tool you own.
Doing
$49 once
Year 2 cost: $0
Runs locally
No account needed
Yours forever
Other voice tools
$8–15/mo
$96–180 per year
Cloud-dependent
Account required
Cancel and it's gone
3 device activations · 14-day refund guarantee · Free updates
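To put the comparison above in numbers, here is a minimal sketch of the break-even arithmetic. It uses only the prices quoted on this page ($49 once versus $8–15 a month). The function names are illustrative and not part of Doing.

```python
import math

DOING_PRICE = 49              # one-time price in USD, from this page
SUBSCRIPTION_RANGE = (8, 15)  # monthly price range in USD, from this page


def subscription_cost(monthly: int, months: int) -> int:
    """Cumulative cost of a subscription after a number of months."""
    return monthly * months


def break_even_month(one_time: int, monthly: int) -> int:
    """First month in which the subscription has cost at least the one-time price."""
    return math.ceil(one_time / monthly)


for monthly in SUBSCRIPTION_RANGE:
    month = break_even_month(DOING_PRICE, monthly)
    print(
        f"${monthly}/mo: reaches ${DOING_PRICE} in month {month}, "
        f"${subscription_cost(monthly, 12)} after year 1, "
        f"${subscription_cost(monthly, 24)} after year 2"
    )
```

At $8 a month, a subscription passes $49 in month 7. At $15 a month, it passes $49 in month 4.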
Transcription with zero wait.
Doing's default engine transcribes locally in real-time — no upload, no server, no spinner. By the time you release fn, your words are already there.
60 seconds of audio transcribed in ~400 milliseconds. On-device.
Measured using Doing's built-in benchmark tool. Cloud providers available via bring-your-own-key.
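One way to read that figure is as a real-time factor: seconds of audio transcribed per second of processing. The sketch below is plain arithmetic on the numbers quoted on this page (60 seconds in about 400 milliseconds here, and 15 seconds in 180 milliseconds in the Parakeet description). It is not output from Doing's benchmark tool, and `realtime_factor` is a hypothetical helper name.

```python
def realtime_factor(audio_seconds: float, processing_ms: float) -> float:
    """Seconds of audio transcribed per second of processing time."""
    return audio_seconds * 1000 / processing_ms


# Figures quoted on this page; "~400 ms" is approximate, so treat results as rough.
print(realtime_factor(60, 400))            # 60 s of audio in ~400 ms
print(round(realtime_factor(15, 180), 1))  # 15 s of audio in 180 ms
```

By this measure the quoted figures work out to roughly 150x and 83x faster than real time.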
Your voice never leaves your Mac.
No audio uploaded. No server processes your speech. No cloud account stores your data. Everything happens on your machine.
This isn't a privacy policy. It's architecture.
Processed in real-time
Your voice is transcribed as you speak, then the audio is discarded. Never recorded. Never stored. Never uploaded.
Runs on your hardware
Parakeet runs entirely on your Mac. Apple's foundation model too, when available. No internet. No API calls. Works on a plane.
Daily markdown files
Every transcription saved locally. Searchable. Obsidian-compatible. Yours to keep.
make the onboarding flow
three steps instead of five
Want cloud? Bring your own API key for OpenAI, Gemini, or AssemblyAI. You choose when. You pay only for what you use.
Works with every AI tool and app.
Doing works at the system level. Anywhere you can type, you can talk. Hold fn in your browser, editor, or terminal — Doing transcribes and pastes wherever your cursor is.
Always know you're live.
When you hold fn, a small pip appears at your cursor. It follows your mouse so you always know two things: that Doing is listening, and exactly where your words will land.
The waveform inside reacts to your voice in real-time. When you stop talking, it settles. When you release fn, it disappears. No chrome, no window, no distraction.
YOLO Mode ↵
Auto-submit your transcriptions. No reviewing, no pressing Enter.
Release fn and your words are pasted and sent. LLMs understand what you mean, imperfections and all — just say what you're thinking and let the AI figure it out.
Auto-press Return after pasting transcription
No slow AI middleman.
Other voice tools run your speech through an LLM to “clean it up” before giving it to you. That adds seconds of processing to every single prompt — and rewrites what you actually said.
Doing skips that step entirely. You're already sending your words to an AI. Why run them through another AI first?
Doing
speak → transcribe → paste
That's it.
Other voice tools
speak → transcribe → LLM rewrite (1-3s) → paste
Want AI cleanup for emails or Slack? Doing has Skills for that. But the default is your words, instantly, with nothing in between.
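For a rough sense of what that extra step costs over a day, here is a back-of-the-envelope sketch. The 1-3 second delay is the figure quoted above. The 100 prompts a day is an assumption borrowed from the founder's own usage described on this page, so adjust it to yours. The names are illustrative only.

```python
LLM_REWRITE_SECONDS = (1, 3)  # per-prompt rewrite delay range quoted on this page
PROMPTS_PER_DAY = 100         # assumption: "over a hundred voice prompts a day"


def added_wait_seconds(per_prompt_delay: float, prompts: int) -> float:
    """Total extra waiting introduced by a fixed per-prompt delay."""
    return per_prompt_delay * prompts


low, high = (added_wait_seconds(d, PROMPTS_PER_DAY) for d in LLM_REWRITE_SECONDS)
print(f"{low:.0f}-{high:.0f} seconds of extra waiting per day")
```

At a hundred prompts a day, that is between 100 and 300 seconds spent waiting on a rewrite step.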
5 engines. 1 benchmark tool.
Choose your engine with data, not marketing.
Doing ships with Parakeet — an on-device model that transcribes 15 seconds of audio in 180 milliseconds. But we don't lock you in. Bring your own API key and benchmark every engine side-by-side on the same audio.
Local · free forever: Parakeet (v3), Apple Foundation
Cloud · bring your key: OpenAI Whisper, Gemini, AssemblyAI
Cloud engines use your own API key — you pay the provider directly, per use. No middleman markup.
Okay, we're gonna test the benchmarking tool. I'm just speaking out loud here while um to collect some words for the transcription process.
Built by a builder
I built the tool I needed.
I'm Brian. I use Claude Code for hours every day — over a hundred voice prompts a day, every day. I was tired of typing all of them. And I didn't love that every voice tool on the market was sending my audio and transcripts to their servers, doing who knows what with my data.
So I built Doing. Now it's how I build everything — including Doing itself.
No VC funding. No growth targets. No pressure to become a platform. Just a tool that does one thing well.
Local. Fast. Yours.
Voice input for AI builders. No cloud required. No subscription. $49 once.
Download for Mac · v0.1.0
macOS 14+ · Apple Silicon & Intel · No account required