OpenAI has unveiled ‘Deep Research’, an AI agent designed to help with complex research in fields such as finance, science, policy, and engineering, as well as with purchases that typically require careful research, such as cars, appliances, and furniture.
Currently available to ChatGPT Pro users (and soon extending to Plus and Team users, followed by Enterprise), it uses reasoning to synthesize large amounts of online information and complete multi-step research tasks. At present, it is limited to 100 queries a month.
Deep Research has been designed for instances where a quick answer or summary isn’t enough, and the user needs a more in-depth answer, provided by consulting multiple websites.
At present, Deep Research is web-only, with mobile and desktop applications to follow soon. To use it, select “deep research” in the ChatGPT composer and enter a query, with the option to attach files or spreadsheets. The output is then available in anywhere from five to thirty minutes. While outputs are currently text-only, OpenAI stated that it plans to add embedded images, data visualizations, and other “analytic” outputs soon. There are also plans to connect more “specialized” data sources, including “subscription-based” and internal resources.
Addressing concerns about ‘hallucinations’ and other errors, OpenAI has stated that every Deep Research output will be “fully documented, with clear citations and a summary of the thinking, making it easy to reference and verify the information.” To improve accuracy, OpenAI is also using a special version of its recently announced o3 “reasoning” AI model that was trained through reinforcement learning on “real-world tasks requiring browser and Python tool use.”
Deep Research has also been tested on Humanity’s Last Exam, an evaluation that includes more than 3,000 expert-level questions across a variety of academic fields. It scored an accuracy of 26.6%, which might look like a low grade but is well ahead of Gemini Thinking (6.2%), Grok-2 (3.8%), and OpenAI’s own GPT-4o (3.3%). Still, it remains to be seen whether these mitigations are enough to combat AI mistakes.
This announcement comes shortly after Google announced a similar research tool of the same name.

