How Tolan builds voice-first AI with GPT-5.1

Tolan is a voice-first AI companion utilizing GPT-5.1 to deliver personalized, ongoing conversations with users. The application, built by Portola, has already amassed over 200,000 monthly active users since its launch in February 2025.

Voice AI presents unique challenges compared to text-based models, demanding low latency and robust context management to maintain natural, flowing interactions. Traditional approaches to context caching often fail in dynamic voice conversations, leading to disjointed experiences and user frustration, potentially impacting retention rates.

Key Insights

0.7-second latency reduction: Implementing OpenAI’s GPT-5.1 and Responses API decreased speech initiation time by 0.7 seconds.
Context Reconstruction: Tolan rebuilds its context window each turn, incorporating summaries, persona cards, memories, and real-time signals.
Turbopuffer: Tolan uses Turbopuffer, a high-speed vector database, for sub-50ms memory lookup times.

Practical Applications

Personalized Companions: Tolan provides a continuously learning AI companion, improving user engagement through consistent personality and memory.
Pitfall: Relying on cached prompts in voice applications leads to inconsistencies and a disjointed user experience when the conversation topic shifts.

References:

https://openai.com/index/tolan/

On This Page

How Tolan builds voice-first AI with GPT-5.1