Gemini Live: Promising Concept, More Rehearsals
Introduction Of Gemini
What's the point of chatting with a human-like bot if it's an unreliable narrator and has a colorless personality?
This question lingers as I reflect on my experience with Gemini Live, Google's latest AI-powered voice assistant. Designed to offer a more engaging and natural conversation, Gemini Live is Google's answer to OpenAI's Advanced Voice Mode, providing users with the freedom to interrupt the bot and converse more fluidly. But despite the promise, it remains plagued by the same issues that have hampered previous AI models, with a few new ones added for good measure.
A Step Forward in AI Conversation
Gemini Live, powered by Google's advanced generative AI models, Gemini 1.5 Pro and 1.5 Flash, aims to improve on the conversational experience offered by previous iterations like Google Assistant. The new feature allows for a more dynamic interaction, where users can interrupt and engage more freely with the AI, creating a back-and-forth conversation that feels more fluid and natural.
However, while the technology has advanced in terms of voice expressiveness and dialogue flow, it struggles to overcome the persistent issues of AI hallucinations and inconsistencies. Additionally, Gemini Live introduces a new problem: a lack of personality that makes it feel more like a polite but dispassionate assistant than a truly engaging conversational partner.
Voice, But Without Emotion
One of Gemini Live's key selling points is its voices, which were designed in collaboration with professional actors to ensure expressiveness. On my Pixel 8a, I chose the "Ursa" voice, described by Google as "mid-range" and "engaged." While Ursa's voice was certainly more expressive than previous synthetic voices from Google, it still felt somewhat dispassionate, avoiding the uncanny valley but at the cost of sounding a bit too robotic.
Unlike OpenAI's Advanced Voice Mode, Gemini Live doesn't include subtle human-like behaviors like laughter, breathing, or hesitations. This absence of more nuanced emotional expressions makes conversations with Gemini Live feel somewhat hollow. Furthermore, the inability to adjust the pitch, timbre, or pace of the voice limits personalization, leaving the experience less customizable than its competitors.
Conversations That Miss the Mark
During my time with Gemini Live, I tested its ability to assist with job interview preparation, a use case Google highlighted during its I/O developer conference. While the AI did a decent job of asking relevant questions, its feedback was overly complimentary and lacked the depth needed to be truly helpful.
For example, after I gave off-the-cuff responses, Gemini Live praised my performance, making me suspect its critiques were superficial. When I challenged it by suggesting I had only given one-word answers, the AI falsely agreed, showcasing its tendency to hallucinate information and make unreliable claims.
This lack of accuracy and consistency in conversations is one of the most significant issues with Gemini Live. While it remembers details from earlier exchanges within a session, it frequently falters when providing factual information, often making incorrect statements that undermine its credibility.
Technical and Functional Limitations
In addition to its conversational shortcomings, Gemini Live also suffers from technical issues that hinder its usability. Activating the feature required following non-intuitive steps, and during conversations, the AI's voice would sometimes cut out mid-response. These glitches, combined with the lack of integration with other Google services like Gmail and YouTube Music, limit Gemini Live's utility compared to its text-based counterpart.
Conclusion: A Prototype That Needs Refinement
Gemini Live is an ambitious step toward more natural and engaging AI conversations, but it feels like a prototype rather than a polished product. While it offers a glimpse into the future of voice interactions, it's currently overshadowed by its own technical limitations, lack of personality, and persistent inaccuracies.
At $20 per month under Google's AI Premium Plan, Gemini Live struggles to justify its price tag, especially when compared to the text-based Gemini experience, which remains more reliable and versatile. Until Google addresses these issues and adds more robust features, Gemini Live will likely remain an experimental tool rather than a must-have assistant for daily use.
Gemini Live might not be ready for the big stage just yet, but with more rehearsals, it could eventually find its voice.