Developing the Arx General Intelligence System
TL;DR: Applied General Intelligence (AGI) has spent the past five years developing Arx, a new approach to AI that moves beyond Large Language Models. Arx-0.3 recently topped the MMLU-Pro benchmark leaderboard with a score of 78.24 percent.
For the past five years, we’ve been working on Arx, our developmental general intelligence system. Building and integrating the system involves research and development focused on representation, grounding, parsing, reasoning, and presentation.
During this R&D phase, we’ve refined Arx and tested our approach.
Our Co-Founder and Chief Science Officer Jerry Zhang has spent his life researching and thinking about the future of intelligence. Eight years ago, he predicted the limits of Large Language Models and other probability-based models. He thought there had to be a better way than simply improving advanced pattern matching.
After years of R&D, Arx’s performance improved, and internal lab tests indicated the system was likely to outperform leading LLMs in multi-step problem solving and deliberate reasoning across domains. This summer, after reaching internal milestones, we sought to validate our research assumptions and assess our technical approach through independent external benchmarks.
The latest and most challenging Massive Multitask Language Understanding benchmark, MMLU-Pro, focuses on problem solving and reasoning. The benchmark’s alignment with practical applications made it ideal to validate our direction and demonstrate Arx outside of our testing environment.
Last month, we submitted Arx-0.3, our engineering prototype, to TigerLab for MMLU-Pro benchmark testing. Two weeks ago, they completed their testing, and Arx-0.3 scored 78.24 percent, taking the top spot on the leaderboard, which it still holds as of today. We recognize that Arx-0.3’s results must continue to improve for us to accomplish our mission, and these results will guide further development.
Looking ahead, we’re focused on:
Further developing Arx toward a phased 1.0 system rollout to select testers;
Bringing components of the system from the lab to production at scale;
Building a differentiated team with extraordinary, demonstrated capability.
We believe the best way forward for AI is to develop an intelligence system capable of understanding, reasoning, and explaining beyond patterns. While we continue R&D in the coming months, we'll be sharing more updates and plans for making Arx accessible. We also plan to use other benchmarks to validate and guide our work.
Our goal is to build the most trusted intelligence system in the world.
For those who share our vision and drive to build, email team@agi.live.
For more on Arx you can visit: https://www.agi.live/arx
For more on AGI you can visit: https://www.agi.live