Google announces Gemini, a model that surpasses GPT-4
Haebom
Gemini is now being applied to Bard. AI has reached a major turning point in how it changes everyday life, and Google's Gemini is one of the latest technologies leading that change: a multimodal AI model that can understand and process text, images, audio, and video.
In a technical report, Google announced that Gemini outperforms GPT-4, the most powerful foundation model to date, and published experimental results showing strong performance not only in text generation but also in multimodal recognition and processing. Rather than a single model, Google released three sizes, Gemini Ultra, Gemini Pro, and Gemini Nano, and disclosed Nano's parameter counts: 1.8B for Nano-1 and 3.25B for Nano-2. At that size, Nano can fairly be called a true sLM (small language model).
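To put the Nano parameter counts in perspective, here is a rough sketch of how much memory the weights alone would occupy. The bit widths (16 and 4) are assumptions chosen for illustration, not figures from this article, and the estimate ignores activations, caches, and runtime overhead.

```python
# Parameter counts quoted above; the bit widths below are illustrative assumptions.
NANO_PARAMS = {"Nano-1": 1.8e9, "Nano-2": 3.25e9}


def weight_size_gb(num_params, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


for name, num_params in NANO_PARAMS.items():
    for bits in (16, 4):
        size = weight_size_gb(num_params, bits)
        print(f"{name} at {bits}-bit: {size:.2f} GB")
```

Under these assumptions, both Nano variants fit in a few gigabytes or less, which is consistent with the on-device positioning described later in the article.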
Confidence in performance
Text processing capabilities
Gemini Ultra scored 90.0% on the MMLU benchmark, which covers 57 subjects, outperforming human experts.
On the same test, OpenAI's GPT-4 scored 86.4%. On Big-Bench Hard, which involves complex multi-step reasoning, Gemini Ultra also edged out GPT-4, 83.6% to 83.1%.
Multimodal processing capabilities
In image understanding, Gemini Ultra scored 77.8%, slightly ahead of GPT-4V at 77.2%.
In document understanding, Gemini Ultra scored 90.9%, ahead of GPT-4V at 88.4%.
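The size of each lead is easier to compare side by side. The sketch below simply subtracts the scores quoted in this article; the dictionary layout and function name are hypothetical, and nothing here is re-measured.

```python
# (Gemini Ultra, GPT-4 or GPT-4V) scores in percent, as quoted in this article.
SCORES = {
    "MMLU": (90.0, 86.4),
    "Big-Bench Hard": (83.6, 83.1),
    "Image understanding": (77.8, 77.2),
    "Document understanding": (90.9, 88.4),
}


def margins(scores):
    """Return Gemini Ultra's lead per benchmark, in percentage points."""
    return {name: round(gemini - gpt, 1) for name, (gemini, gpt) in scores.items()}


for name, diff in margins(SCORES).items():
    print(f"{name}: +{diff} points")
```

The leads range from half a point on Big-Bench Hard to 3.6 points on MMLU, so several of these comparisons are close.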
Things to note
Multimodal Understanding: Gemini AI surpasses current state-of-the-art models in multimodal understanding, demonstrating the ability to understand and solve problems in images without the help of an OCR system.
Code Generation: Gemini generates high-quality code in popular programming languages like Python, helping developers release apps and improve services faster and more efficiently.
Features by model size
Gemini Ultra is Google's largest model and offers the most powerful performance for complex tasks.
Highly complex tasks: Gemini Ultra is designed to handle highly complex tasks and excels in this area. It achieves state-of-the-art performance in several key benchmarks.
Multimodal Understanding: Gemini Ultra, a multimodal model, is powerful in understanding and reasoning about diverse types of data, including text, images, audio, and video.
Large-Scale and Efficient: Trained using large-scale TPUv4 accelerators and optimized for efficient operation at scale.
Cutting-edge performance: Gemini Ultra achieves an incredible 90.04% accuracy on the MMLU benchmark and also demonstrates strong performance in other areas such as math and coding.
Gemini Pro is a model that can be efficiently scaled across a wide range of tasks.
Scalable across a variety of tasks: Gemini Pro is best suited to scale across a variety of tasks. Its infrastructure and learning algorithms enable rapid pre-training using fewer resources than Gemini Ultra.
Optimized Performance: Provides optimized performance for a wide range of AI tasks, making it ideal for enterprise customers and developers looking to build and scale AI.
Versatility: Gemini Pro isn't as big as Gemini Ultra, but it delivers similar performance and more efficient service.
Gemini Nano is Google's smallest model, designed to perform tasks efficiently on-device.
Efficiency for on-device operations: The Nano model is designed for on-device deployments, prioritizing efficiency and speed.
Small but powerful: Despite its small size, the Nano model delivers impressive performance in tasks like summarizing and reading comprehension.
Accessibility: Featuring the ability to operate across a variety of platforms and devices, Gemini Nano models make advanced AI capabilities more accessible.
Gemini AI is a model that opens a new horizon for Google's AI technology development. It has excellent performance in a wide range of fields from text to multimodality, and has the ability to effectively understand and process complex information, making the future of AI bright. It is expected to provide high value to everyone who uses AI.
Release Plan
Gemini Pro
Google is bringing Gemini to billions of people around the world through its products.
Starting today, Bard is powered by a sophisticated version of Gemini Pro, delivering more advanced reasoning, planning, understanding, and more. This is the biggest upgrade to Bard since its launch.
It is available in English in over 170 countries and territories, with plans to support more modalities, languages, and regions soon.
Gemini Nano
Gemini can run on smartphones
The Pixel 8 Pro is the first smartphone designed to run Gemini Nano, which powers new features such as 'Summarize' in the Recorder app and 'Smart Reply' in Gboard, starting with WhatsApp.
Google plans to expand to more messaging apps next year.
Additional Products and Services
Gemini will be available across more Google products and services in the coming months, including Search, Ads, Chrome, and Duet AI. Google has already experimented with Gemini in Search and reports a 40% reduction in latency and improved quality for English-language searches in the U.S.
Access for developers and enterprises
Starting December 13, 2023, developers and enterprise customers will be able to access Gemini Pro through Google AI Studio or Google Cloud Vertex AI. Google AI Studio is a free, web-based development tool that lets you quickly prototype and launch apps with an API key. Vertex AI is a fully managed AI platform that lets you customize Gemini while keeping control of your data and drawing on additional Google Cloud capabilities.
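As a rough illustration of what API-key access looks like, the sketch below builds (but does not send) an HTTP request for a text prompt. The endpoint URL, the `gemini-pro` model name, and the JSON body shape are assumptions based on the Gemini API as it launched and are not taken from this article; check the current official documentation before relying on them. `build_request` is a hypothetical helper, and `YOUR_API_KEY` is a placeholder.

```python
import json
import urllib.request

# Assumed endpoint pattern; verify against the current Gemini API documentation.
API_URL = "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"


def build_request(prompt, api_key, model="gemini-pro"):
    """Build a POST request carrying a single text prompt (nothing is sent here)."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        API_URL.format(model=model) + "?key=" + api_key,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request("Summarize the Gemini announcement in one sentence.", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

With a real key in place, passing `req` to `urllib.request.urlopen` would send it; the response format is not covered here.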
Android Developer
Android developers will be able to build with Gemini Nano, the most efficient model for on-device tasks, through AICore, a new system capability available in Android 14.
You can watch the full Gemini keynote in the video below. 2024 looks set to be a year of even greater tectonic shifts.