Silicon Valley offices adopt whispering to computers as voice technology gains traction

If voice truly becomes the primary mode of interaction, the acoustic environment of offices will become something that requires careful design.

Author and source: Lao Ban Zhu, Cyber Last Train

The WSJ recently published an article whose title translates to: “Typing Is Being Replaced by Whispering, and It’s More Annoying Than You Think.”

TechCrunch followed up on the topic. Reporter Anthony Ha wrote a brief commentary, opening with a venture capitalist’s observation that visiting startup offices in Silicon Valley now feels like walking into a high-end call center.

The reason is that more and more people in the office are muttering to their computers.

Not a phone call, not a meeting, not chatting with colleagues. Just one person sitting at their desk, speaking softly to the screen. Sometimes they’re dictating an email, sometimes code, sometimes a Slack message. The keyboard clicks occasionally, but more often, a hushed murmur lingers above the desk.

Five years ago, this scene might have been seen as a warning sign of a mental health issue. But in some Silicon Valley startups in 2026, it is becoming common.

What’s driving this is a new class of tools, with the most prominent being Wispr Flow.

It’s not traditional speech-to-text. Older voice dictation systems simply transcribe exactly what you say—you have to verbally insert punctuation, and typos flood the screen, often taking longer to correct than typing manually. Wispr Flow is different. It uses AI to understand the context of your speech, automatically removes filler words like “um” and “uh,” adds punctuation naturally, and adjusts formatting based on which app you’re speaking in.

When you speak in Gmail, it outputs a well-formatted email. When you speak in Slack, it generates a concise message. The product documentation even lists coding scenarios, such as dictating code in VS Code or Cursor, where it can distinguish between camelCase and snake_case naming conventions.
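To make the two behaviors described above concrete, here is a minimal rule-based sketch of filler-word removal and naming-convention formatting. It is only an illustration: the article says Wispr Flow does this with AI, and nothing here reflects how the product is actually built. The function names and the filler list (limited to the two words the article mentions) are assumptions.

```python
# Hypothetical filler list; the article names only "um" and "uh".
FILLERS = {"um", "uh"}


def strip_fillers(spoken: str) -> str:
    """Drop filler words from a transcript and rejoin the rest with single spaces."""
    kept = [w for w in spoken.split() if w.lower().strip(",.") not in FILLERS]
    return " ".join(kept)


def to_snake_case(spoken: str) -> str:
    """Format a spoken identifier as snake_case, e.g. for Python code."""
    return "_".join(w.lower() for w in strip_fillers(spoken).split())


def to_camel_case(spoken: str) -> str:
    """Format a spoken identifier as camelCase, e.g. for JavaScript code."""
    words = [w.lower() for w in strip_fillers(spoken).split()]
    if not words:
        return ""
    return words[0] + "".join(w.capitalize() for w in words[1:])


print(strip_fillers("um so uh send the report"))
print(to_snake_case("user login count"))
print(to_camel_case("user uh login count"))
```

A real dictation tool would have to infer from context which convention the surrounding code uses; in this sketch the caller simply picks one.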

Product documentation claims a latency as low as approximately 500 milliseconds. The company advertises dictation speeds of up to about 220 WPM (words per minute), while proficient typists typically manage between 80 and 100 WPM.
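As a rough back-of-the-envelope check on those figures, the sketch below compares how long a 1,000-word draft would take at each rate. The 1,000-word length is an arbitrary assumption, the 220 WPM figure is the advertised best case, and the calculation ignores time spent correcting mistakes in either mode.

```python
def minutes_needed(words: int, wpm: float) -> float:
    """Minutes to produce `words` words at a steady rate of `wpm`."""
    return words / wpm


def speedup(dictation_wpm: float, typing_wpm: float) -> float:
    """How many times faster dictation is than typing at the given rates."""
    return dictation_wpm / typing_wpm


DRAFT_WORDS = 1000  # assumed draft length, for illustration only

for label, wpm in [("dictation (advertised)", 220), ("fast typist", 100), ("typist", 80)]:
    print(f"{label}: {minutes_needed(DRAFT_WORDS, wpm):.1f} min")

print(f"speedup vs 100 WPM: {speedup(220, 100):.2f}x")
print(f"speedup vs 80 WPM: {speedup(220, 80):.2f}x")
```

On these numbers, dictation at the advertised rate would be a little over twice as fast as a proficient typist, not an order of magnitude faster.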

The key point is that this runs at the system level: it works on Mac, Windows, and mobile devices. In any app, you can start dictating with a single shortcut key press. It costs just a few dollars per month.

So more and more people are starting to write with their mouths.

According to users interviewed by the WSJ, some purchased gaming headsets specifically to speak to AI, as gaming headsets’ microphones have a short pickup range, making it easy to whisper without being overheard by others nearby. Others bought programmable foot pedals, allowing them to activate Wispr with a simple foot tap instead of manually pressing a shortcut key. Some have set up gooseneck microphones at their desks, positioning them just a few centimeters from their mouths, so they can speak using only a breathy voice.

Imagine this scene: an open-plan office with dozens of people, each in front of a screen, wearing a gaming headset with a gooseneck microphone at their lips, all whispering quietly.

It really does resemble a call center, just a bit quieter.

TechCrunch's report mentioned two specific individuals.

One is Edward Kim, co-founder of Gusto, a prominent U.S.-based SaaS company specializing in payroll and human resources for businesses. Kim says he now types only when absolutely necessary and relies on voice for everything else. He told his team that future offices will sound “more like a sales department.”

But Kim also admitted that talking to a computer all day in the office is "a bit awkward."

Another is AI entrepreneur Mollie Amkraut Mueller, who got into the habit of whispering to her computer at night, until her husband couldn’t take it anymore. Now, their late-night work routine involves sitting apart, or one of them retreating to the study.

It’s not common for a tech product to be so useful that it changes how couples spend their evenings together.

Wispr’s founder, Tanay Kothari, isn’t too concerned about these awkward moments. In an interview, he said that all of this will eventually feel normal, just as staring at our phones did. A decade ago, looking down and scrolling through a phone in public was considered impolite; now, many people no longer find it strange.

Speaking to the computer is the same—he believes it just takes time.

Frankly, he might be right. But the transitional period in between will likely be chaotic.

When someone types in an open office, others can’t hear it. But when someone whispers in an open office—even at the lowest volume—people nearby can still hear the faint hum of their murmuring. One or two people are fine, but when a dozen people are whispering at once, the acoustic environment becomes quite peculiar.

According to the WSJ, some people have started wearing noise-canceling headphones to block out colleagues speaking to their computers.

Picture this: Person A is wearing gaming headphones, speaking to their computer, while Person B is wearing noise-canceling headphones to block out A’s voice. They’re sitting at adjacent desks, unable to see each other’s ears, yet each has a pair of headphones on their head—used for entirely different purposes.

This is worth discussing because it touches on something much deeper than just an efficiency tool.

The mainstream interaction methods of general computing have gone through several major waves. Graphical interfaces and mice pushed command lines out of sight for average users, enabling people who don’t understand code to use computers. Touchscreens replaced physical buttons, and the iPhone turned phones into pieces of glass. Each shift in interaction style is more than just “more convenient to use”—it changes your physical relationship with the device, and consequently, your spatial relationship with the people around you.

Keyboards, screens, and individual workstations reinforce the quiet, screen-focused posture of the office, with everyone facing a screen and hands on keyboards, undisturbed by one another. Touchscreens enable people to work from sofas, beds, or subways, blurring the boundaries of the office.

If voice truly becomes the dominant mode of interaction, the acoustic environment of offices will become something that requires careful design. Concepts that now sound somewhat redundant—such as soundproof booths, private voice workstations, and acoustic zoning—could become standard features in office space design, just as meeting rooms are today.

Of course, this doesn't mean everyone will speak to their computers, but workspaces will need to accommodate voice input. Social etiquette will also evolve. When is it appropriate to speak to your computer, and when should you switch back to typing? Is muttering to your laptop in a café considered rude? These questions have no answers yet, but within two or three years, established norms may emerge.

Like the etiquette of making phone calls in public places—no one taught us, but everyone figured it out.

TechCrunch journalist Anthony Ha ended his article with a personal, emotionally charged remark. He said he had once suffered greatly when his desk was moved next to the sales department, so when he read Edward Kim’s statement that future offices would resemble sales departments, his reaction was, “Oh no.”

A trend that could prompt a tech journalist to write “Oh no” in a formal report is probably worth paying attention to.