Welcome back, trailblazers!
Today's edition of Leading Edge takes us to a galaxy not so far away, where a battle for AI supremacy rages on. Google has launched its most powerful weapon yet - Gemini 2.0. And this time, they're not just planning to get back in the AI race; they're looking at rewriting the rules of engagement.
Will Google's Gemini 2.0 rise to challenge OpenAI's galactic supremacy? Buckle up for a saga filled with innovation, rivalry, and transformative power.
A Long Time Ago, In a Silicon Valley Not Far Away...
The launch of ChatGPT in 2022 was nothing short of a cosmic event.
It didn’t just redefine conversational AI; it sparked a race that pitted tech giants against each other in a battle for AI dominance. Google, long considered the beacon of innovation, found itself in an unfamiliar position - playing catch-up in the AI space it had pioneered.
Their first counterstrike came in the form of Bard, a name that promised Shakespearean eloquence but delivered more comedy than capability. The internal resistance was strong with this one - Google's own employees, testing Bard before its release, labeled it "worse than useless" and a "pathological liar." From offering questionable scuba diving advice to botching plane-landing instructions, Bard seemed more like a court jester than the AI assistant it was meant to be.
One brave employee even sent a distress signal, begging Google not to launch it. But like the Empire rushing to complete the Death Star, corporate pressure won out.
The result?
Critics pointed out that Bard's responses were often as accurate as a Stormtrooper's aim, and its creativity lagged behind ChatGPT's graceful prose.
The launch was so underwhelming that Google eventually decided to mercy-kill the brand name. Bard was no more; the Gemini era was about to begin.
A New Hope: Gemini 1.0 Awakens
In December 2023, Google unveiled Gemini 1.0, a multimodal marvel capable of processing text, images, audio, and code. Unlike its predecessor, Gemini wasn’t a mere chatbot; it was a Swiss Army knife for AI tasks, seamlessly integrating with Google’s ecosystem.
Imagine querying decades of Gmail archives or orchestrating complex workflows across Google Docs and Drive. I - and many others like me - have nearly two decades of Gmail, and now Gemini has access to it and can query across that data.
It was like having C-3PO with access to the entire Jedi Archives. The integration was seamless, allowing users to navigate their digital universe with unprecedented ease.
However, the Force wasn’t entirely with Gemini 1.0. While it impressed with versatility, it just was not good enough to dethrone ChatGPT. It did not make me cancel my ChatGPT subscription or continue my Gemini subscription beyond the first two (free 😉) months.
The Context Wars: Size Matters Not... Or Does It?
In February 2024, Google revealed its next weapon: Gemini 1.5 Pro with its massive 1-million-token context window. Why is this relevant? For some context, it can process over 700,000 words in one go. This was later expanded to an astounding 2 million tokens, which meant Gemini could analyze entire novels or vast codebases faster than the Millennium Falcon making the Kessel Run.
The pricing was aggressive too - at $3.50 per million tokens, Google was making it clear they weren't just fighting for technological supremacy; they were fighting for market share.
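For the number-crunching Padawans, here is a rough back-of-the-envelope sketch of what those figures mean in practice. It simply scales the two numbers quoted above (about 700,000 words per 1 million tokens, and $3.50 per million tokens). The function names are made up for illustration, the word-to-token ratio is only an approximation rather than a real tokenizer, and applying the quoted price to every token you send is an assumption on my part.

```python
# Figures quoted in this newsletter; treat them as rough approximations.
WORDS_PER_MILLION_TOKENS = 700_000    # "over 700,000 words" per 1M tokens
PRICE_PER_MILLION_TOKENS_USD = 3.50   # quoted price, assumed to apply to all tokens sent


def estimate_tokens(word_count: int) -> int:
    """Rough token estimate from a word count (not a real tokenizer)."""
    return round(word_count * 1_000_000 / WORDS_PER_MILLION_TOKENS)


def estimate_cost_usd(word_count: int) -> float:
    """Approximate cost of sending this many words at the quoted price."""
    return estimate_tokens(word_count) * PRICE_PER_MILLION_TOKENS_USD / 1_000_000


def fits_in_context(word_count: int, context_tokens: int = 1_000_000) -> bool:
    """Check whether the estimated token count fits in the context window."""
    return estimate_tokens(word_count) <= context_tokens


if __name__ == "__main__":
    novel_words = 100_000  # hypothetical novel length
    print(estimate_tokens(novel_words))               # estimated tokens
    print(round(estimate_cost_usd(novel_words), 2))   # estimated cost in USD
    print(fits_in_context(novel_words))               # fits in the 1M window?
```

By this estimate, a hypothetical 100,000-word novel comes to roughly 143,000 tokens and costs about 50 cents to send, so several novels would fit in a single 1-million-token prompt.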
The real show of force came with Gemini's ability to analyze up to 2 hours of video and 22 hours of audio. Here's a more practical example: a reviewer simply showed it a video of his library and asked it to identify the books.
Yet, as impressive as it was, Gemini 1.5 Pro still faced challenges. OpenAI’s ChatGPT retained its lead in coding and conversational fluidity, and OpenAI kept releasing feature after feature, making ChatGPT incrementally more powerful. Plus, Anthropic's Claude 3 became a very viable alternative.
Return of the Gemini: The 2.0 Revelation
Last week, Google unveiled their master plan with Gemini 2.0 Flash, marking what could be the most significant power shift in the AI galaxy since ChatGPT's initial rise to power.
This wasn't just another update - this was Google's Declaration of AI Independence, signaling the dawn of what they call the "agentic era."
The Flash model isn't just faster; it's smarter.
What about pricing? That still remains in the shadows. Industry experts speculate it could be cheaper than its predecessor - unless Google follows the path of Claude 3.5 Haiku, valuing intelligence over mere processing power.
Google has unveiled an impressive arsenal of AI tools across the breadth and depth of its ecosystem. I can't list them all here, but the lineup surely is impressive.
Project Mariner: The Web Navigator
Mariner is an AI agent that can take actions on the web. It can take control of your Chrome browser, move the cursor on your screen, click buttons, and fill out forms. Why bother? Because this makes human-like navigation of web pages possible. It scored an impressive 83.5% on the WebVoyager benchmark (yes, there’s a benchmark for that!).
Project Astra: The Universal Translator
Building upon its earlier success, the enhanced Project Astra - a digital assistant for your phone - now speaks multiple languages, integrates seamlessly with Google services, and maintains a 10-minute memory - longer than most Stormtroopers can remember their orders 😉
Deep Research: Hello Perplex-Kitty
Taking direct aim at Perplexity's stronghold, Deep Research in Gemini Advanced compiles research for you. It can create detailed research plans and gather information from across the internet once you press the go button. It's like having a tireless research intern at your fingertips, far better than any of the current tools available.
A nuance here - Deep Research still runs on Gemini 1.5 Pro. Why not Gemini 2.0 Flash? Well, I am glad you asked. Most likely it's because Gemini 2.0 has an Achilles' heel: it sucks at long contexts compared to 1.5 Pro.
Imagen 3 and Veo 2: The Creative Force
Earlier Gemini attempts at image generation ran into some major controversies - including generating images of a Black man and an Asian woman in uniform when asked for pictures of World War II Nazi soldiers.
Google has finally mastered the creative side of the Force. Imagen 3 stands as the current champion of image generation.
Veo 2 makes OpenAI's Sora look like a training droid. With capabilities extending to 2-minute videos in 4K resolution, it's delivering 6 times the duration and 4 times the resolution of its competitors.
NotebookLM: The Massive Hit from Google Gets an Impressive Upgrade
Already a fan favorite among Google's AI tools, NotebookLM is receiving support for interactive podcasts. If you haven't used it yet, you probably should - I definitely see it becoming a fantastic learning tool for all ages.
Jules: The Code Master
For the tech-savvy Rebel engineers out there, Jules integrates directly with GitHub workflows, reviewing and modifying code. It can even take voice commands and see your screen. I won't use the voice features though - they are not very efficient, and they have a tendency to say "backquote" a lot.
The Final Frontier: Stakes and Implications
The stakes in this AI war are astronomical.
Google's vision extends beyond mere technological advancement. They're transitioning from the age of narrow, task-based agents to a new era of broad, multimodal capabilities. From mobile apps to augmented reality glasses, from gaming to content creation - all supported by a massive 2M context window nobody else has yet.
OpenAI, with its $17.9 billion in funding and $157 billion valuation, seems formidable - until you consider Google's empire, boasting $309 billion in revenue and $124 billion in EBITDA for 2024 so far - and there's still a month to go in the year (from a financial results perspective).
And let's not forget the other powers in this galaxy - Meta's ambitious plans for the Llama 3 ecosystem and Microsoft's strategic positioning with both its OpenAI partnership and internal developments.
As we witness this epic battle unfold, one thing becomes clear: the AI wars are just beginning, and like the fate of many Death Stars, some platforms may not survive the conflict.
Outro: May the AI Be With You
Hope you enjoyed this deep dive into the AI galaxy! Your feedback powers our hyperdrive - hit reply and let me know your thoughts! Got ideas for our next expedition across the tech universe? Don't be shy - share them like a Jedi shares wisdom!
And remember, if you found value in this newsletter, spread it across the galaxy! Share it with your network and help our rebellion grow.
See you next week, and may the AI be with you!
#artificialintelligence #generativeai #leadership #ai #productdevelopment #startups #LeadingEdge