Which specific AI capabilities is the Kaggle hackathon focused on?

The hackathon is asking researchers to design evaluations for five specific cognitive abilities where Google DeepMind identifies the largest current evaluation gaps: learning, metacognition, attention, executive functions, and social cognition.

Measuring progress toward AGI: A cognitive framework

Google DeepMind has proposed a new framework to systematically measure progress toward artificial general intelligence (AGI), addressing a notable lack of standardized evaluation tools in the field. By grounding its approach in cognitive science, the research lab aims to create a more structured, empirical way to assess the capabilities of advanced AI systems against human baselines. The effort matters because the industry is still grappling with how to define and track the development of increasingly general-purpose models.

The initiative is detailed in a new paper, “Measuring Progress Toward AGI: A Cognitive Taxonomy,” which identifies 10 key cognitive abilities, including reasoning, metacognition, and social cognition. To translate this theory into practice, DeepMind has partnered with Kaggle to launch a hackathon with a $200,000 prize pool. The competition, which runs from March 17 to April 16, invites the research community to build new evaluation tools for the five abilities where the assessment gap is considered largest: learning, metacognition, attention, executive functions, and social cognition.

By tying AI evaluation to human cognitive abilities and crowdsourcing the creation of benchmarks, Google DeepMind is attempting to steer the industry toward a more holistic, human-relative standard of measurement. If adopted, the framework could influence research priorities across the AI ecosystem, pushing development beyond excelling at narrow tasks and toward more robust, generalizable intelligence. It could also give researchers and policymakers a clearer shared vocabulary for discussing and comparing the capabilities of different AI systems.

Google DeepMind is not just publishing research; it is attempting to define the ruler by which AGI will be measured. By anchoring its framework in cognitive science and mobilizing the community through a funded Kaggle competition, the lab is positioning itself to set the standard for AGI evaluation, which could shape the competitive landscape and research direction of the entire field.