Google DeepMind releases a paper on classifying levels of AGI

Haebom
Haebom
Google DeepMind recently released a paper classifying different levels of AGI. The levels of artificial general intelligence (AGI) are divided along two dimensions: performance and generality. Performance describes how an AGI system compares to human-level capability on a given task, while generality refers to the breadth of tasks on which the system can reach a given performance threshold.
Level 0: No AI | Indicates a state without any artificial intelligence.
Level 1: Emerging | Shows performance equal to or just above that of unskilled adults.
Level 2: Competent | Possesses the capabilities of a skilled adult (at or above the 50th percentile).
Level 3: Expert | Shows performance in the top 10% among skilled adults.
Level 4: Virtuoso | Delivers performance at the top 1% of skilled adults.
Level 5: Superhuman | Surpasses the capabilities of all humans.
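The thresholds above can be read as a simple mapping from percentile performance (relative to skilled adults) to a level. A minimal sketch, using only the thresholds listed in this summary; the function name and interface are illustrative, not from the paper:

```python
def performance_level(percentile):
    """Map a system's percentile performance relative to skilled adults
    onto the paper's six levels (thresholds as summarized above).
    percentile=None means no AI system is involved."""
    if percentile is None:
        return "Level 0: No AI"
    if percentile >= 100:  # outperforms all humans
        return "Level 5: Superhuman"
    if percentile >= 99:
        return "Level 4: Virtuoso"
    if percentile >= 90:
        return "Level 3: Expert"
    if percentile >= 50:
        return "Level 2: Competent"
    return "Level 1: Emerging"

print(performance_level(95))  # Level 3: Expert
```

Note that each level is a lower bound: a system at the 95th percentile clears the Expert threshold but not the Virtuoso one.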
Each level sets the minimum performance required across most cognitive tasks: a 'Competent AGI', for instance, must match or exceed the median skilled adult on most cognitive tasks, but it may display 'Expert', 'Virtuoso', or even 'Superhuman' ability in some specialized areas. For example, a model built for law, finance, or medicine might perform only at the Expert level in general conversation, yet exhibit superhuman ability within its area of expertise.
Today's leading language models (such as ChatGPT, Bard, and Llama 2) show 'Competent' performance on certain tasks (like writing a short essay or simple coding), but on most tasks (like mathematics or fact-based reasoning) their abilities remain at the 'Emerging' level. As a result, they are classified as Level 1 general AI ('Emerging AGI') until they achieve higher performance across a wider range of tasks.
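One simple way to operationalize this rating scheme: since a system's overall level is limited by what it achieves on most tasks, take the lowest per-task level across a broad task suite. This is an illustrative sketch, not the paper's own procedure, and the task names and scores are hypothetical:

```python
# Ordered from lowest to highest, as in the classification above.
LEVELS = ["No AI", "Emerging", "Competent", "Expert", "Virtuoso", "Superhuman"]

def overall_level(task_levels):
    """Return the lowest level achieved across a dict of task -> level name,
    a simple floor-based proxy for the paper's 'most tasks' criterion."""
    return min(task_levels.values(), key=LEVELS.index)

# Hypothetical per-task ratings for a current language model:
tasks = {
    "short essay": "Competent",
    "simple coding": "Competent",
    "mathematics": "Emerging",
    "factual recall": "Emerging",
}
print(overall_level(tasks))  # Emerging
```

This matches the example above: strong performance on a few tasks does not raise the overall rating while most tasks sit at 'Emerging'.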
The highest level, ASI (Artificial Superintelligence), refers to performance that surpasses humans in every task humans can do. For instance, AlphaFold is considered 'Superhuman Narrow AI' as it outperforms even the world's top scientists at a single task—predicting a protein's 3D structure from its amino acid sequence.
This classification system demands clear benchmarks for both the depth (performance) and breadth (generality) of the tasks an AGI handles, while also noting that actual deployed performance may differ from theoretical capability. For example, user interface limitations can keep the performance seen in deployment below the theoretical peak.
Moreover, the order in which an AI acquires advanced skills in particular cognitive areas can greatly affect AI safety. For example, an AI that gains expert knowledge of chemical engineering before developing robust ethical reasoning is a risky combination. Progress across levels of performance and generality may also be non-linear, and the ability to learn new skills in particular may accelerate movement to the next level.