Measuring progress toward AGI: A cognitive framework
Artificial General Intelligence (AGI) has the potential to accelerate scientific discovery and help solve some of humanity’s most pressing problems. But it can be difficult to know how close we are to this key milestone, because there’s a lack of empirical tools for evaluating systems’ general intelligence. Tracking progress toward AGI will require a wide range of methods and approaches, and we believe cognitive science provides one important piece of the puzzle.
That’s why today, we’re releasing a new paper, “Measuring Progress Toward AGI: A Cognitive Taxonomy,” that presents a scientific foundation for understanding the cognitive capabilities of AI systems.
Alongside the paper, we are partnering with Kaggle to launch a hackathon, inviting the research community to help build the evaluations needed to put this framework into practice.
Deconstructing general intelligence
Our framework draws on decades of research from psychology, neuroscience and cognitive science to develop a cognitive taxonomy. It identifies 10 key cognitive abilities that we hypothesize will be important for general intelligence in AI systems:
- Perception: extracting and processing sensory information from the environment
- Generation: producing outputs such as text, speech and actions
- Attention: focusing cognitive resources on what matters
- Learning: acquiring new knowledge through experience and instruction
- Memory: storing and retrieving information over time
- Reasoning: drawing valid conclusions through logical inference
- Metacognition: knowledge and monitoring of one’s own cognitive processes
- Executive functions: planning, inhibition and cognitive flexibility
- Problem solving: finding effective solutions to domain-specific problems
- Social cognition: processing and interpreting social information and responding appropriately in social situations
To understand AI capabilities across these cognitive abilities, we propose a three-stage evaluation protocol that benchmarks system performance in relation to human capabilities:
- Evaluate AI systems across a broad suite of cognitive tasks covering each ability, using held-out test sets to prevent data contamination
- Collect human baselines for the same tasks from a demographically representative sample of adults
- Map each AI system’s performance relative to the distribution of human performance in each ability
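As a rough illustration of the third stage, the sketch below places a system's score within a human score distribution using a percentile rank. This is only one plausible way to do the mapping, not necessarily the method the paper uses. The function names, ability labels and all scores are hypothetical values chosen for the example.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(human_scores, ai_score):
    """Return the percentage of human scores below ai_score.

    Ties count as half, so a score equal to every human score lands at 50.
    """
    if not human_scores:
        raise ValueError("human_scores must contain at least one score")
    ordered = sorted(human_scores)
    below = bisect_left(ordered, ai_score)            # scores strictly lower
    ties = bisect_right(ordered, ai_score) - below    # scores exactly equal
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def cognitive_profile(human_baselines, ai_scores):
    """Map each ability to the system's percentile among human participants."""
    return {
        ability: percentile_rank(human_baselines[ability], score)
        for ability, score in ai_scores.items()
    }


# Hypothetical task accuracies; not real benchmark or baseline data.
human_baselines = {
    "memory": [0.5, 0.6, 0.7, 0.8, 0.9],
    "attention": [0.4, 0.5, 0.6, 0.7, 0.8],
}
ai_scores = {"memory": 0.85, "attention": 0.4}

profile = cognitive_profile(human_baselines, ai_scores)
print(profile)  # {'memory': 80.0, 'attention': 10.0}
```

A per-ability profile like this keeps uneven capabilities visible: in the made-up data above, the system sits near the top of the human range on memory but near the bottom on attention, which a single aggregate score would hide.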
Going from theory to practice
Defining these cognitive abilities is a crucial first step, but we need more than a framework to measure progress. To put this theory into practice, we are launching a new Kaggle hackathon, “Measuring progress toward AGI: Cognitive abilities.” The hackathon encourages the community to design evaluations for the five cognitive abilities where the evaluation gap is largest: learning, metacognition, attention, executive functions and social cognition.
Participants can use Kaggle’s newly launched Community Benchmarks platform to build and test their evaluations against a lineup of frontier models.
We are offering a total prize pool of $200,000: $10,000 awards for the top two submissions in each of the five tracks ($100,000 in total), and $25,000 grand prizes for the four best overall submissions (a further $100,000). Submissions are open March 17 through April 16, and we’ll announce the results on June 1. Head over to the Kaggle website to start building.