Build with Gemini Deep Research
Today, we are releasing a significantly more powerful Gemini Deep Research agent, available via the Interactions API. For the first time, developers can embed Google’s most advanced autonomous research capabilities directly into their own applications. We are also open-sourcing a new web research agent benchmark, DeepSearchQA, designed to test agent comprehensiveness on web research tasks.
Gemini Deep Research is an agent optimized for long-running context gathering and synthesis tasks. The agent’s reasoning core uses Gemini 3 Pro, our most factual model yet, and is specifically trained to reduce hallucinations and maximize report quality during complex tasks. By scaling multi-step reinforcement learning for search, the agent autonomously navigates complex information landscapes with high accuracy.
Deep Research iteratively plans its investigation: it formulates queries, reads results, identifies knowledge gaps, and searches again. This release features vastly improved web search, allowing the agent to navigate deep into sites for specific data.
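As a rough illustration of that loop, here is a minimal Python sketch. It is not the agent's actual implementation: `SearchResult`, `research_loop`, `toy_search`, and the tiny in-memory index are all hypothetical names invented for this example, and a real agent would use a model to formulate queries and judge gaps rather than following pre-listed follow-ups.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SearchResult:
    """What one (hypothetical) search returns: an answer plus open follow-up questions."""
    answer: Optional[str]
    follow_ups: List[str] = field(default_factory=list)


def research_loop(
    seed_queries: List[str],
    search: Callable[[str], SearchResult],
    max_rounds: int = 5,
) -> Tuple[Dict[str, str], int]:
    """Search, read, collect knowledge gaps, and search again until done or out of budget."""
    findings: Dict[str, str] = {}
    seen = set(seed_queries)
    pending = list(seed_queries)
    rounds = 0
    while pending and rounds < max_rounds:
        rounds += 1
        next_pending: List[str] = []
        for query in pending:
            result = search(query)              # formulate query, read result
            if result.answer is not None:
                findings[query] = result.answer
            for gap in result.follow_ups:       # identify knowledge gaps
                if gap not in seen:
                    seen.add(gap)
                    next_pending.append(gap)    # search again next round
        pending = next_pending
    return findings, rounds


# Toy stand-in for a search backend (assumption: illustrative data only).
TOY_INDEX = {
    "Who acquired ExampleCo?": SearchResult(
        "ParentCorp", ["When was ParentCorp founded?"]
    ),
    "When was ParentCorp founded?": SearchResult("1999"),
}


def toy_search(query: str) -> SearchResult:
    return TOY_INDEX.get(query, SearchResult(None))


if __name__ == "__main__":
    print(research_loop(["Who acquired ExampleCo?"], toy_search))
```

The `max_rounds` budget is the knob that trades cost against depth: with a budget of one round the second question above is never asked.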
The new Gemini Deep Research agent achieves state-of-the-art results on Humanity’s Last Exam (HLE) and DeepSearchQA, and is our best on BrowseComp. It is optimized to generate well-researched reports at much lower cost. Deep Research is now more useful and intelligent than ever. It will soon be available in Google Search, NotebookLM, and Google Finance, and will be upgraded in the Gemini App.
Gemini Deep Research achieves a state-of-the-art 46.4% on the full Humanity’s Last Exam (HLE) set, 66.1% on DeepSearchQA, and a high 59.2% on BrowseComp.
DeepSearchQA: a benchmark for deep research agents
Existing benchmarks often fail to capture the complexity of real-world, multi-step web research. This is why we are open-sourcing DeepSearchQA, a new benchmark to evaluate agents on intricate, multi-step information-seeking tasks.
DeepSearchQA features 900 hand-crafted "causal chain" tasks across 17 fields, where each step depends on prior analysis. Unlike traditional fact-based tests, DeepSearchQA measures comprehensiveness, requiring agents to generate exhaustive answer sets. This assesses both research precision and retrieval recall.
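To make the precision and recall framing concrete, here is a minimal sketch of scoring a predicted answer set against an expected one. This is an assumption for illustration, not the benchmark's official grader: the function names and the simple case-insensitive exact matching are hypothetical, and the Technical Report defines the actual methodology.

```python
from typing import Dict, Iterable


def normalize(answer: str) -> str:
    """Collapse whitespace and ignore case so trivially different strings match."""
    return " ".join(answer.split()).casefold()


def score_answer_set(predicted: Iterable[str], expected: Iterable[str]) -> Dict[str, float]:
    """Return precision, recall, and F1 for a predicted answer set."""
    pred = {normalize(a) for a in predicted}
    gold = {normalize(a) for a in expected}
    hits = len(pred & gold)
    # Precision: how much of what the agent returned is correct.
    precision = hits / len(pred) if pred else 0.0
    # Recall: how much of the exhaustive answer set the agent found.
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(score_answer_set(
        ["Paris", "Berlin", "Rome"],
        ["paris", "berlin", "madrid", "lisbon"],
    ))
```

An agent that returns one correct item scores perfect precision but poor recall, which is the failure mode an exhaustive-answer benchmark is meant to expose.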
DeepSearchQA also serves as a diagnostic tool for the benefits of "thinking time." In our internal evaluations, we observed significant performance gains when allowing the agent to perform more searches and reasoning steps, a direction we plan to explore in future releases.
Comparing pass@8 vs. pass@1 results demonstrates the value of letting the agent explore multiple parallel trajectories for answer verification. These results were computed on a 200-prompt subset of DeepSearchQA.
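For readers unfamiliar with the metric, pass@k is the probability that at least one of k sampled attempts is correct. The sketch below uses the standard unbiased estimator, 1 - C(n-c, k) / C(n, k), for n attempts of which c are correct. Whether the results above were computed with this exact estimator is an assumption; the source does not say.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n attempts on one task, c of which were correct."""
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) counts the k-subsets that contain no correct attempt;
    # it is 0 when fewer than k incorrect attempts exist.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts, n: int, k: int) -> float:
    """Average pass@k over tasks, given the number of correct attempts per task."""
    counts = list(correct_counts)
    return sum(pass_at_k(n, c, k) for c in counts) / len(counts)


if __name__ == "__main__":
    # Hypothetical per-task counts of correct runs out of 8 attempts.
    counts = [2, 0, 8, 1]
    print(mean_pass_at_k(counts, n=8, k=1))
    print(mean_pass_at_k(counts, n=8, k=8))
```

With k equal to n, a task counts as solved if any single attempt succeeded, which is why pass@8 can sit well above pass@1 when correct trajectories are rare but reachable.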
We are releasing the benchmark assets to drive future research toward more robust and capable agents:
- Explore the data: Access the dataset, leaderboard, and starter Colab.
- Read the science: Dive into the methodology in our Technical Report.
Gemini Deep Research agent in the real world
Based on early feedback and testing, the Gemini Deep Research agent is already delivering results in complex fields that demand high precision and context. Teams in verticals such as financial services, biotech, and market research have used Gemini Deep Research to tackle preliminary research tasks.
Financial firms are using Gemini Deep Research to automate the labor-intensive initial stages of due diligence. By aggregating market signals, competitor analysis, and compliance risks from across the web and proprietary sources, the agent becomes a massive force multiplier for investment teams in their early research phases.
In the scientific community, Deep Research is helping to solve complex safety challenges. Axiom Bio, which builds AI systems to predict drug toxicity, found that Gemini Deep Research unlocked an unprecedented level of initial research depth and granularity across biomedical literature, accelerating drug discovery pipelines.
Build with Gemini Deep Research
For developers building the next generation of automated research tools, the Gemini Deep Research agent offers the following capabilities for synthesizing information and generating detailed reports:
- Unified information synthesis: Gemini Deep Research analyzes your documents (PDFs, CSVs, docs) and public web data using File Upload and the File Search Tool. It also handles large context gracefully, allowing you to place extensive background information directly in the prompt.
- Report steerability: You control the output via prompting, defining the structure, headers, and subheaders, or specifying data table generation and formatting.
- Detailed citations: Granular sourcing is provided for claims, allowing users to verify data origin.
- Structured outputs: Supports JSON schema outputs for easy parsing of research results by downstream applications.
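As one way to picture report steerability and structured outputs working together, the sketch below builds a prompt that fixes the report's section headers and then checks a JSON result before handing it to downstream code. Everything here is a hypothetical illustration: `REPORT_SCHEMA`, its field names, `build_prompt`, and `parse_report` are not part of any Google API, and the check is a hand-rolled required-key test rather than full JSON Schema validation.

```python
import json
from typing import Any, Dict, List

# Hypothetical schema for a research report (assumption: field names are illustrative).
SECTION_KEYS = ("heading", "body", "citations")
REPORT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "sections": {
            "type": "array",
            "items": {"type": "object", "required": list(SECTION_KEYS)},
        },
    },
    "required": ["title", "sections"],
}


def build_prompt(topic: str, headings: List[str]) -> str:
    """Steer the report by spelling out the structure in the prompt."""
    lines = [
        f"Research the following topic: {topic}",
        "Structure the report with these section headers, in order:",
    ]
    lines += [f"{i}. {heading}" for i, heading in enumerate(headings, start=1)]
    lines.append("Cite a source for every factual claim.")
    return "\n".join(lines)


def parse_report(raw: str) -> Dict[str, Any]:
    """Parse a JSON report and fail loudly if required fields are missing."""
    report = json.loads(raw)
    for key in REPORT_SCHEMA["required"]:
        if key not in report:
            raise ValueError(f"report is missing required field: {key}")
    for index, section in enumerate(report["sections"]):
        missing = [key for key in SECTION_KEYS if key not in section]
        if missing:
            raise ValueError(f"section {index} is missing: {', '.join(missing)}")
    return report


if __name__ == "__main__":
    print(build_prompt("Solid-state battery suppliers", ["Market overview", "Key risks"]))
```

Validating at the boundary keeps a malformed or truncated report from propagating silently into downstream applications.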
Get started with Deep Research in the Interactions API
You can follow our developer documentation to start building with the Deep Research agent using the new Interactions API, our next-generation interface designed to simplify interactions with Gemini models and agents. You can access the Interactions API with your Gemini API key from Google AI Studio.
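Because Deep Research tasks are long-running, client code will typically start a task and then wait for the finished report. The helper below is a generic polling sketch under stated assumptions, not the Interactions API itself: `wait_for_report`, the `fetch` callable, and the status strings "completed" and "failed" are hypothetical placeholders, so consult the developer documentation for the real calls and status values.

```python
import time
from typing import Callable, Tuple


def wait_for_report(
    fetch: Callable[[], Tuple[str, str]],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll `fetch` until it reports completion, failure, or the timeout passes.

    `fetch` is a caller-supplied function returning (status, payload). In a real
    integration it would wrap whatever call retrieves the task's current state.
    """
    deadline = clock() + timeout_seconds
    while True:
        status, payload = fetch()
        if status == "completed":      # hypothetical status value
            return payload
        if status == "failed":         # hypothetical status value
            raise RuntimeError(f"research task failed: {payload}")
        if clock() >= deadline:
            raise TimeoutError("research task did not finish in time")
        sleep(poll_seconds)


if __name__ == "__main__":
    # Simulated task that finishes on the third poll.
    states = iter([("in_progress", ""), ("in_progress", ""), ("completed", "report text")])
    print(wait_for_report(lambda: next(states), poll_seconds=0.0))
```

Injecting `sleep` and `clock` keeps the helper testable without real delays and makes it easy to swap in backoff later.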
Future updates will also focus on richer outputs like native chart generation for visual analytical reports and expanding connectivity through Model Context Protocol (MCP) support to more easily tap into your custom data sources. We’re also working to bring Gemini Deep Research to Vertex AI for enterprises.