Thinking Machines Lab, the artificial intelligence startup founded last year by former OpenAI chief technology officer Mira Murati, has introduced a new approach to voice-based AI interaction. The company announced on Monday what it calls interaction models, technology designed to let an AI listen and speak at the same time, much as people do in conversation.
Full-duplex conversation comes to AI
Current AI voice assistants operate in a sequential pattern: the user speaks, the model processes, then the model responds. Thinking Machines Lab aims to replace that with full-duplex communication, where input and output happen simultaneously. The company claims its model, TML-Interaction-Small, achieves a response time of 0.40 seconds — roughly the speed of natural human dialogue and significantly faster than comparable models from OpenAI and Google.
In a phone call, both parties can hear and speak at the same time, interrupting or reacting in real time. Thinking Machines Lab wants to bring that same fluidity to AI interactions, moving beyond the rigid turn-taking of current systems.
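For readers who want a more concrete picture of the difference, the sketch below contrasts the two patterns in plain Python asyncio. It is purely illustrative: the `listen`, `respond`, `listen_stream`, and `speak_stream` callables are hypothetical placeholders, and nothing here reflects Thinking Machines Lab's actual implementation.

```python
import asyncio

# Illustrative only: not Thinking Machines Lab's design.
# Contrasts the turn-taking loop used by today's assistants
# with a full-duplex loop where listening and speaking overlap.

async def half_duplex_turn(listen, respond):
    """Today's pattern: wait for the user to finish, then reply."""
    user_utterance = await listen()        # block until the user stops talking
    return await respond(user_utterance)   # only then generate a reply

async def full_duplex_session(listen_stream, speak_stream):
    """Full-duplex pattern: audio flows in and out at the same time,
    so the model can react to, or be interrupted by, the user mid-sentence."""
    incoming = asyncio.create_task(listen_stream())  # keep ingesting user audio
    outgoing = asyncio.create_task(speak_stream())   # keep emitting model audio
    await asyncio.gather(incoming, outgoing)         # both run until the call ends
```

The practical difference is that in the second pattern there is no hand-off moment: the model does not wait for silence before it is allowed to produce output.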
Research preview, not a product yet
Despite the technical claims, the company is clear that this is still early-stage work. TML-Interaction-Small will be released as a limited research preview in the coming months, with a broader rollout expected later this year. The company has not disclosed pricing, availability, or specific use cases beyond general conversational AI.
The announcement comes as competition in the AI voice space intensifies. OpenAI, Google, and others have been racing to reduce latency and improve naturalness in voice interfaces, but full-duplex capability remains rare in production systems.
Why this matters for users
If Thinking Machines Lab delivers on its technical claims, the implications could be significant for customer service, virtual assistants, and real-time translation. A model that can interrupt or react mid-sentence could make AI interactions feel less robotic and more human. However, the real-world experience remains unproven until the model is available for public testing.
For now, the announcement signals that Murati’s startup is pursuing a distinct technical direction from her former employer, focusing on interaction design as a core feature rather than an afterthought.
Conclusion
Thinking Machines Lab’s full-duplex AI model represents a notable technical ambition in the conversational AI space. While the research preview is still months away, the company’s focus on native interactivity — rather than bolted-on voice features — could set a new benchmark for how humans and machines communicate. Whether the experience matches the promise will depend on real-world testing later this year.
FAQs
Q1: What is full-duplex AI?
A full-duplex AI can process user input and generate output simultaneously, allowing for natural interruptions and real-time back-and-forth conversation, similar to a phone call.
Q2: When will Thinking Machines Lab release its interaction model?
A limited research preview is expected in the next few months, with a wider public release planned for later in 2026.
Q3: How does this compare to existing AI voice assistants?
Current assistants like ChatGPT Voice and Google Assistant operate in half-duplex mode — they listen, then respond. Thinking Machines Lab’s model aims to remove that turn-taking gap, with a claimed response time of 0.40 seconds.
