In May, OpenAI unveiled the Advanced Voice feature during the launch of GPT-4o, citing response times as low as 232 milliseconds and 320 milliseconds on average, which is comparable to the pace of human conversation. After a delay, the rollout began in September, initially for ChatGPT Plus and ChatGPT Team subscribers in the United States.
Expansion into Europe
Recently, OpenAI broadened the availability of the ChatGPT Advanced Voice mode to users across the European Union. This feature is now accessible to all ChatGPT Plus and Team subscribers in the EU, Switzerland, Iceland, Norway, and Liechtenstein. To take advantage of this functionality, users are encouraged to download the latest version of the ChatGPT app from either the Google Play Store or the Apple App Store, depending on their device.
Moreover, OpenAI has announced that Advanced Voice mode is now integrated into the ChatGPT desktop applications for both macOS and Windows. However, users should be aware that there is a daily usage limit for the Advanced Voice feature, even on desktop platforms. The app will provide notifications when users have 15 minutes of voice usage remaining for the day.
Big day for desktops. Advanced Voice is now available in the macOS and Windows desktop apps. https://t.co/mv4ACwIhzA
In recent weeks, OpenAI has introduced several enhancements to the Advanced Voice mode. These include the addition of five new voices—Arbor, Maple, Sol, Spruce, and Vale—alongside features that allow users to set custom instructions and request the AI to remember conversations for future reference. Improvements have also been made to overall conversational speed, smoothness, and accents in various supported foreign languages.
New Opportunities for Developers
During DevDay 2024, OpenAI announced the Realtime API, which lets developers build their own voice experiences similar to ChatGPT’s Advanced Voice mode. At launch, the Realtime API was priced at $5 per 1 million text input tokens and $20 per 1 million text output tokens. Audio input cost $100 per 1 million tokens, and audio output cost $200 per 1 million tokens.
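To get a feel for how per-token pricing adds up, here is a minimal Python sketch of a cost estimate. It assumes the launch-time Realtime API rates of $5 and $20 per 1 million text input and output tokens, and $100 and $200 per 1 million audio input and output tokens. Those rates may have changed since, and the names `RATES_PER_MILLION` and `estimate_cost` are hypothetical, not part of any OpenAI library.

```python
# Assumed launch-time Realtime API rates, in USD per 1 million tokens.
# These are illustrative values and may no longer be current.
RATES_PER_MILLION = {
    "text_in": 5.0,
    "text_out": 20.0,
    "audio_in": 100.0,
    "audio_out": 200.0,
}


def estimate_cost(text_in=0, text_out=0, audio_in=0, audio_out=0):
    """Return the estimated cost in USD for the given token counts."""
    counts = {
        "text_in": text_in,
        "text_out": text_out,
        "audio_in": audio_in,
        "audio_out": audio_out,
    }
    # Each category is billed as (tokens / 1,000,000) * rate.
    return sum(
        counts[kind] * RATES_PER_MILLION[kind] / 1_000_000 for kind in counts
    )


if __name__ == "__main__":
    # Hypothetical session: a short text prompt plus some spoken audio.
    cost = estimate_cost(text_in=2_000, text_out=500,
                         audio_in=10_000, audio_out=30_000)
    print(f"Estimated session cost: ${cost:.2f}")
```

Under these assumed rates, audio tokens dominate the bill, so a voice-heavy application would mostly need to track audio usage rather than text.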
With the expansion of Advanced Voice capabilities and the introduction of the Realtime API, OpenAI is making notable advancements in the realm of conversational AI, setting the stage for more interactive and accessible AI experiences for users and developers alike.