How NVIDIA AI-Q Reached #1 on DeepResearch Bench I and II
By Jakub Antkiewicz
2026-03-12
NVIDIA's AI-Q, a deep research agent, has secured the top position on both the DeepResearch Bench I and DeepResearch Bench II, posting scores of 55.95 and 54.50, respectively. This achievement is significant because it demonstrates that a single, openly documented, and configurable software stack can lead the industry in complex agentic research. The dual benchmark wins suggest that developer-accessible tools, rather than closed, proprietary systems, can power state-of-the-art performance in generating well-cited and factually rigorous reports.
The agent's performance is rooted in a multi-agent architecture: an orchestrator coordinates a planner, which maps the information landscape, and a researcher, which deploys parallel specialists. The system is built on the NVIDIA NeMo Agent Toolkit and uses a fine-tuned NVIDIA Nemotron 3 Super model. The model's capabilities were enhanced through supervised fine-tuning (SFT) on approximately 67,000 high-quality trajectories, filtered from a larger pool by a principle-based judge model. To stay reliable during complex, multi-step tasks, the system adds custom middleware that handles common failure points such as tool-name hallucinations and performs reasoning-aware retries.
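The planner/researcher split above can be pictured as a fan-out loop. The following is a minimal sketch under stated assumptions: every class, function, and string here is hypothetical and does not reflect the actual NeMo Agent Toolkit API, which the source does not detail.

```python
import asyncio

# Hypothetical sketch of orchestrator/planner/researcher coordination.
# None of these names come from the NeMo Agent Toolkit.

async def plan(query: str) -> list[str]:
    # The planner maps the information landscape into sub-questions.
    return [f"{query}: background", f"{query}: recent results"]

async def specialist(subtask: str) -> str:
    # Each specialist would research one sub-question (e.g. via web search);
    # the sleep is a placeholder for real tool calls.
    await asyncio.sleep(0)
    return f"findings for {subtask!r}"

async def orchestrate(query: str) -> str:
    subtasks = await plan(query)
    # The researcher fans out to parallel specialists.
    findings = await asyncio.gather(*(specialist(t) for t in subtasks))
    # The orchestrator would merge findings into a cited report (elided here).
    return "\n".join(findings)

report = asyncio.run(orchestrate("deep research benchmarks"))
```

The point of the sketch is the shape of the control flow: planning produces independent subtasks, so the specialist calls can run concurrently rather than sequentially.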
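The judge-based filtering step can likewise be sketched in outline. This is an illustrative stand-in, not NVIDIA's pipeline: the rubric fields, the scoring rule, and the threshold are all invented for the example, and a real principle-based judge would be a model call rather than boolean flags.

```python
# Hypothetical trajectory records with judge-rubric flags; a real judge
# model would produce these assessments, not the data itself.
def judge_score(trajectory: dict) -> float:
    # Average agreement with a set of rubric principles
    # (citation quality, relevance, internal consistency).
    principles = [
        trajectory["cites_sources"],
        trajectory["answers_question"],
        trajectory["no_contradictions"],
    ]
    return sum(principles) / len(principles)

def filter_trajectories(raw: list[dict], threshold: float = 0.9) -> list[dict]:
    # Keep only trajectories the judge scores highly; the surviving set
    # becomes the SFT training data.
    return [t for t in raw if judge_score(t) >= threshold]

raw = [
    {"cites_sources": True, "answers_question": True, "no_contradictions": True},
    {"cites_sources": False, "answers_question": True, "no_contradictions": True},
]
kept = filter_trajectories(raw)
```

The second trajectory fails the citation principle and is dropped, mirroring the article's description of distilling ~67,000 high-quality trajectories from a larger pool.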
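The two middleware behaviors named above, recovering from tool-name hallucinations and retrying with the failure fed back to the model, might look roughly like this. The tool registry, fuzzy-matching cutoff, and retry protocol are assumptions for illustration, not AI-Q's actual implementation.

```python
import difflib

# Hypothetical registry of tools the agent is allowed to call.
REGISTERED_TOOLS = {"web_search", "fetch_page", "summarize"}

def resolve_tool_name(requested: str) -> str:
    """Map a possibly hallucinated tool name to the closest registered tool."""
    if requested in REGISTERED_TOOLS:
        return requested
    matches = difflib.get_close_matches(requested, REGISTERED_TOOLS, n=1, cutoff=0.6)
    if matches:
        return matches[0]
    raise KeyError(f"no registered tool close to {requested!r}")

def call_with_reasoning_retry(tool_fn, args, max_attempts=3):
    """Retry a tool call, passing each failure back as feedback so the
    model can reason about what went wrong before the next attempt."""
    feedback = None
    for attempt in range(max_attempts):
        try:
            return tool_fn(args, feedback=feedback)
        except Exception as exc:
            feedback = f"attempt {attempt + 1} failed: {exc}"
    raise RuntimeError(feedback)
```

Here `resolve_tool_name("websearch")` would be corrected to `"web_search"`; the retry wrapper turns a raw exception into context for the next attempt instead of failing the whole long-horizon task.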
For the broader AI ecosystem, AI-Q's success provides a functional blueprint for enterprises looking to develop their own specialized research agents. Its modular design allows organizations to own, inspect, and customize every component—from the underlying language models to the specific tools—for their unique use cases. This shift toward configurable, transparent systems offers a compelling alternative to black-box APIs, enabling businesses to build more reliable and tailored AI workflows with greater control over performance and data governance.
NVIDIA's result underscores a critical industry trend: leading agent performance is less about a single monolithic model and more about the integration of a well-defined architecture, targeted fine-tuning on domain-specific data, and robust middleware to ensure long-horizon reliability.