Google may not be telling us the full truth about its Gemini AI.
In the ever-evolving landscape of artificial intelligence, Google’s unveiling of Gemini AI stirred anticipation: the model was heralded as a formidable contender against OpenAI’s ChatGPT. Gemini comes in three sizes (Ultra, Pro, and Nano), and Google positioned Ultra as a powerhouse surpassing GPT-4 in various facets. Amid the fanfare, however, skepticism emerged.
Unveiling Gemini AI: The Bold Claim
The tech world was abuzz when Google unveiled its latest gem, Gemini AI, positioning it as a formidable rival to OpenAI’s ChatGPT. Google introduced Gemini in three sizes (Ultra, Pro, and Nano) and claimed that Ultra outperforms GPT-4 across a range of benchmarks, sparking excitement throughout the industry.
Red Flags Raised by Bindu Reddy
However, Bindu Reddy, CEO of AbacusAI, raised doubts about Gemini’s proclaimed excellence. In a detailed analysis, she drew attention to potential discrepancies in Gemini’s reported performance: while Google showcased Gemini beating GPT-4 on benchmarks like MMLU, it did so using a different evaluation technique, CoT@32 (chain-of-thought prompting with 32 samples), rather than the standard 5-shot setup.
“Digging deeper into the MMLU Gemini Beat – Gemini doesn’t really Beat GPT-4 On This Key Benchmark. The Gemini MMLU beat is specifically at CoT@32. GPT-4 still beats Gemini for the standard 5-shot – 86.4% vs. 83.7%” – Bindu Reddy
The Technical Divide: 5-shot vs. CoT@32
Reddy emphasized that although Gemini AI comes out ahead under CoT@32, GPT-4 retains its advantage under the traditional 5-shot evaluation. She underlined what 5-shot means in practice: the model is shown five worked examples in its prompt before answering, which makes it a standard yardstick for how well a model generalizes from a handful of demonstrations. Igor Pogany raised the same objection:
“Google is trying to fool us with Gemini Ultra’s numbers. The benchmark they’re most proud of is Gemini’s score of 90.0% on the MMLU. This beats GPT-4’s score of 86.4%, and it even tops human test results of 89.8%.” – Igor Pogany
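To make the methodological gap concrete, here is a minimal sketch of the two evaluation styles in Python. It is an illustration under stated assumptions, not Google’s or OpenAI’s actual harness: ask_model is a hypothetical stand-in for an LLM API call, and the majority vote is a simplification of CoT@32, whose exact aggregation details may differ.

```python
from collections import Counter

def ask_model(prompt: str, temperature: float = 0.0) -> str:
    """Hypothetical stand-in for a call to the language model being evaluated."""
    raise NotImplementedError("wire this up to a real LLM API")

def five_shot_answer(question: str, examples: list[tuple[str, str]]) -> str:
    """Standard 5-shot evaluation: five worked examples go into the prompt,
    and the model's single greedy answer is scored."""
    prompt = "".join(f"Q: {q}\nA: {a}\n\n" for q, a in examples[:5])
    prompt += f"Q: {question}\nA:"
    return ask_model(prompt, temperature=0.0)

def cot_at_32_answer(question: str) -> str:
    """CoT@32-style evaluation: sample 32 chain-of-thought generations and
    aggregate them (simple majority vote here, as a simplification)."""
    prompt = f"Q: {question}\nLet's think step by step, then state the final answer.\nA:"
    samples = [ask_model(prompt, temperature=0.7) for _ in range(32)]
    # Crude final-answer extraction: take the last line of each sample.
    finals = [(s.strip().splitlines() or [""])[-1] for s in samples]
    return Counter(finals).most_common(1)[0][0]
```

The practical upshot is that CoT@32 gives the model 32 reasoned attempts per question and aggregates them, while 5-shot scores a single answer, so numbers produced under the two protocols are not directly comparable.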
Controversy Surrounding Gemini AI’s Capabilities: Clint Ehrlich’s Stand
The authenticity of Gemini AI’s capabilities, as showcased during its launch, came under scrutiny. Clint Ehrlich, an attorney and computer scientist, questioned the accuracy of the launch video. He contested its depiction, stating that Gemini’s processing was limited to still images, required detailed prompts, and performed better with written cues than with audio.
Legal Implications and Regulatory Concerns
Ehrlich’s legal concerns centered on compliance with Federal Trade Commission (FTC) rules. The video showcasing Gemini’s abilities carried no disclaimers, and disclaimers matter precisely because they clarify what is being shown and keep an advertisement from misleading its audience.
In essence, Ehrlich argued that without disclaimers the video may not have given the full picture: viewers could come away with the wrong idea about what Gemini can actually do, leading to misunderstandings and inflated expectations. By pointing out the omission, he underscored why being upfront in advertising, and adhering to FTC regulations, is crucial; people should understand exactly what they are getting rather than being swayed by exaggerated claims or incomplete information.
Discrepancies in Performance: Reports and Clarifications
Reports surfaced indicating possible discrepancies between Gemini’s actual behavior and its depiction in the demo: rather than actively analyzing video, Gemini was responding to still frames and text prompts.
In response to the controversy, a Google spokesperson confirmed that the video was assembled from still image frames and text prompts. Oriol Vinyals, a lead on Gemini’s development, defended the video’s intent, saying it was meant to inspire developers, and maintained that the user prompts and model outputs were genuine, though condensed for brevity.
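Taken at face value, that clarification suggests a pipeline roughly like the sketch below. Everything in it is hypothetical illustration: the Frame type, the describe helper, and the file names are invented stand-ins, not Google’s actual tooling.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    path: str    # a still image extracted from the demo footage
    prompt: str  # the detailed text instruction paired with that frame

def describe(frame: Frame) -> str:
    # Hypothetical multimodal call: each request is one still image plus one
    # text prompt. No live video or audio stream is involved.
    # A real implementation would call a multimodal model API here.
    return f"[model response for {frame.path} given {frame.prompt!r}]"

# Hand-picked frames, each paired with a carefully written prompt; the
# recorded responses would then be edited together into the final video.
demo_frames = [
    Frame("frame_001.png", "What do you see in this image?"),
    Frame("frame_002.png", "Hint: it's a game. What is the hand doing now?"),
]
for frame in demo_frames:
    print(describe(frame))
```

The point of contention was not this workflow itself but that the edited video implied real-time video and voice interaction.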
The Need for a Holistic Assessment
Despite Gemini’s impressive benchmark numbers, the true measure of its capabilities will emerge as the model becomes more widely accessible. Its creators acknowledge that Gemini remains a work in progress, imperfections included.
The Significance of Authentic Representation
This ongoing discussion about Gemini AI highlights how important it is for companies to show what their AI can truly do. It’s crucial to be honest and clear about its abilities and limitations. When companies mislead people about what their AI can achieve, it breaks trust and can make it harder for everyone to understand and use these new technologies.
Being honest in these demonstrations isn’t just about marketing; it’s about doing the right thing. It helps people make informed choices about where to invest their time, money, and efforts. By being transparent, companies can build trust and respect, encouraging responsible innovation and making AI more accessible and beneficial for everyone.