One-page summary of Google I/O 2024 major announcements
Haebom
Gemini model family
Gemini 1.5 Pro now supports a 2 million token context (waitlist open). The official blog mentioned "a series of quality improvements across key use cases including translation, coding, and reasoning," but did not disclose any benchmarks.
A fourth model, Gemini Flash, has been added to the existing three. It was described as "optimized for fast and frequently required artificial intelligence tasks," and Google emphasized that it offers a 1 million token context at a slightly lower price than GPT-3.5, but did not announce exact speed figures. The Gemini family released so far is as follows:
Ultra: “The largest model” (only available in Gemini Advanced)
Pro: “The best model optimized for general performance” (API available starting today, general availability scheduled for June)
Flash: “A lightweight model for speed/efficiency” (API available starting today, expected general availability in June)
Nano: “On-Device Model” (to be built into Chrome 126)
Gemini Live: “The ability to have in-depth two-way conversations using your voice,” which leads directly to Project Astra, a personal assistant chatbot with real-time video understanding, shown in a two-minute demo.
Gemma model family
Gemma, previously available at 7B and 2B, now goes up to 27B with Gemma 2. The model, which is still being trained, is said to deliver performance close to Llama-3-70B at less than half the size (fitting on a single TPU). It will also be released so that it can be run locally for free.
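To make the size comparison concrete, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. It assumes 16-bit weights (2 bytes per parameter), which is my assumption rather than anything Google stated, and it ignores activations and the KV cache. The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same precision
    (default 2 bytes, i.e. bf16/fp16). Activations and KV cache are ignored.
    """
    return params_billions * 1e9 * bytes_per_param / 1e9


gemma2_gb = weight_memory_gb(27)   # Gemma 2 27B
llama3_gb = weight_memory_gb(70)   # Llama-3-70B
size_ratio = 27 / 70               # parameter-count ratio

print(f"Gemma 2 27B weights:  ~{gemma2_gb:.0f} GB")
print(f"Llama-3-70B weights:  ~{llama3_gb:.0f} GB")
print(f"Parameter ratio:      {size_ratio:.2f}")
```

Under these assumptions the 27B model needs roughly 54 GB for weights against roughly 140 GB for the 70B model, which is why a single accelerator is plausible for one and not the other. Quantizing to fewer bytes per parameter would shrink both numbers proportionally.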
Other releases
Imagen 3: Google's image generation model, which reduces the burden on users by understanding and interpreting prompts better than previous models. (This is the next generation of the existing Imagen.)
SynthID watermarking now extends to images, audio, video (including Veo), as well as text.
New TPU v6 hardware called Trillium was announced. Google claims a 4.7x improvement in peak compute performance per chip over TPU v5e.
Google also announced the integration of AI technology across its products, including Workspace, Gmail, Docs, Sheets, Photos, AI Overviews in Search, search with multi-step reasoning, Circle to Search on Android, and Lens.
CNET put together a 12-minute summary. If you are interested, please refer to the video below or the contents organized by Release AI.
Personal comment
Gemini 1.5 Pro, as announced at Google I/O, stands out for its improved processing speed and MMLU numbers, and its longer context length compared to the existing model is expected to significantly improve the user experience. Despite being a lightweight model, Gemini 1.5 Flash keeps the 1M token capacity and delivers an impressive improvement in text generation speed. This highlights how integration with Google's powerful infrastructure enables extremely fast and effective responses.
Project Astra's real-time audio/video processing and response generation make real-time conversation possible even on prototype hardware such as Google Glass, which makes it a notable advancement. The rapid evolution of open models like Gemma is also making AI research and development more accessible through collaboration with the developer community. This strategy is an example of Google's ongoing commitment to better serve both users and developers.
The new Context Caching feature has the potential to cut the repeated input of long contexts, reduce costs, and greatly improve user convenience. Interface improvements that accommodate a wider variety of inputs will help diversify and enrich the user experience. Taken together, these advances suggest that what was introduced at Google I/O will have a real impact on both the user experience and the developer ecosystem.
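As a rough illustration of why caching a long context can reduce costs, here is a minimal cost sketch. Every price in it is a made-up placeholder, not Google's actual pricing, and the billing shape (a one-time write, a discounted rate for cached tokens, an hourly storage fee) is my assumption about how such a feature is typically billed. The names `Pricing`, `cost_without_cache`, and `cost_with_cache` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    # All values are hypothetical placeholders, in dollars.
    input_per_mtok: float         # price per 1M regular input tokens
    cached_per_mtok: float        # discounted price per 1M cached tokens
    storage_per_mtok_hour: float  # storage price per 1M cached tokens per hour


def cost_without_cache(context_tokens: int, query_tokens: int,
                       n_requests: int, p: Pricing) -> float:
    """Every request resends the full context at the regular input price."""
    return n_requests * (context_tokens + query_tokens) / 1e6 * p.input_per_mtok


def cost_with_cache(context_tokens: int, query_tokens: int,
                    n_requests: int, hours: float, p: Pricing) -> float:
    """Context is written once, then reused at the discounted cached rate."""
    create = context_tokens / 1e6 * p.input_per_mtok
    reuse = n_requests * context_tokens / 1e6 * p.cached_per_mtok
    queries = n_requests * query_tokens / 1e6 * p.input_per_mtok
    storage = context_tokens / 1e6 * p.storage_per_mtok_hour * hours
    return create + reuse + queries + storage


# Hypothetical scenario: a 1M-token document queried 100 times within an hour.
prices = Pricing(input_per_mtok=4.0, cached_per_mtok=1.0, storage_per_mtok_hour=1.0)
plain = cost_without_cache(1_000_000, 1_000, 100, prices)
cached = cost_with_cache(1_000_000, 1_000, 100, 1.0, prices)
print(f"without cache: ${plain:.2f}, with cache: ${cached:.2f}")
```

With these placeholder numbers, caching wins clearly once the same context is reused many times, while for a single request the write and storage fees make it the more expensive option. The break-even point depends entirely on the real prices.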
However, I couldn't help but feel disappointed. It was a huge and long presentation, but were there any moments that made you say wow? There was no innovation in the form of a new product, though I did get the impression that there was business innovation. Google showed good performance in terms of cost, operation, and utilization, so the lack of a wow moment may simply be because OpenAI or Apple had already taken the lead on such highlights with better timing.