Image Credits: Bryce Durbin / TechCrunch

On Tuesday, OpenAI released new tools designed to help developers and enterprises build AI agents (automated systems that can independently accomplish tasks) using the company’s own AI models and frameworks.

The tools are part of OpenAI’s new Responses API, which lets businesses develop custom AI agents that can perform web searches, scan through company files, and navigate websites, much like OpenAI’s Operator product. The Responses API effectively replaces OpenAI’s Assistants API, which the company plans to sunset in the first half of 2026.

The hype around AI agents has grown dramatically in recent years despite the fact that the tech industry has struggled to show people, or even define, what “AI agents” really are. In the most recent example of agent hype running ahead of utility, Chinese startup Butterfly Effect went viral earlier this week for a new AI agent platform called Manus that users quickly discovered didn’t deliver on many of the company’s promises.

In other words, the stakes are high for OpenAI to get agents right.

“It’s pretty easy to demo your agent,” Olivier Godement, OpenAI’s API product head, told TechCrunch in an interview. “To scale an agent is pretty hard, and to get people to use it often is very hard.”

Earlier this year, OpenAI introduced two AI agents in ChatGPT: Operator, which navigates websites on your behalf, and deep research, which compiles research reports for you. Both tools offered a glimpse at what agentic technology can achieve, but left quite a bit to be desired in the “autonomy” department.

Now with the Responses API, OpenAI wants to sell access to the components that power AI agents, allowing developers to build their own Operator- and deep research-style agentic applications. OpenAI hopes that developers can create some applications with its agent technology that feel more autonomous than what’s available today.

Using the Responses API, developers can tap the same AI models (in preview) under the hood of OpenAI’s ChatGPT Search web search tool: GPT-4o search and GPT-4o mini search. The models can browse the web for answers to questions, citing sources as they generate replies.

OpenAI claims that GPT-4o search and GPT-4o mini search are highly factually accurate. On the company’s SimpleQA benchmark, which measures the ability of models to answer short, fact-seeking questions, GPT-4o search scores 90% while GPT-4o mini search scores 88% (higher is better). For comparison, GPT-4.5, OpenAI’s much larger, recently released model, scores just 63%.

The Responses API also includes a file search utility that can quickly scan across files in a company’s databases to retrieve information. (OpenAI claims that it won’t train models on these files.) In addition, developers using the Responses API can tap OpenAI’s Computer-Using Agent (CUA) model, which powers Operator. The model generates mouse and keyboard actions, allowing developers to automate computer use tasks like data entry and app workflows.
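To give a rough sense of how a developer might combine the web search and file search tools in a single request, here is a minimal Python sketch. It only builds a request payload as a plain dictionary and makes no network call. The field names, the tool type strings (`web_search_preview`, `file_search`), the `gpt-4o` model name, and the `vs_example_123` store ID are assumptions for illustration and do not come from the article; OpenAI’s API reference is the authority on the actual request format.

```python
import json


def build_agent_request(question, vector_store_ids=None, use_web_search=True):
    """Build a request payload (a plain dict) for a hypothetical agent call.

    Every field name and tool type string here is an assumption made for
    illustration; nothing is sent over the network.
    """
    tools = []
    if use_web_search:
        # Assumed tool type for the built-in web search tool.
        tools.append({"type": "web_search_preview"})
    if vector_store_ids:
        # Assumed tool type for searching a company's uploaded files.
        tools.append(
            {"type": "file_search", "vector_store_ids": list(vector_store_ids)}
        )
    return {"model": "gpt-4o", "input": question, "tools": tools}


if __name__ == "__main__":
    payload = build_agent_request(
        "Summarize our refund policy and note any recent news about it.",
        vector_store_ids=["vs_example_123"],  # hypothetical store ID
    )
    print(json.dumps(payload, indent=2))
```

Keeping the payload as data makes it easy to inspect which tools an agent is allowed to use before anything is sent to a model.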

Enterprises can optionally run the CUA model, which is being released in research preview, locally on their own systems, OpenAI said. The consumer version of the CUA available in Operator can only take actions on the web.

To be clear, the Responses API won’t solve all the technical problems plaguing AI agents today.

While AI-powered search tools are more accurate than traditional AI models (a fact that is unsurprising given they can just look up the right answer), web search does not render AI hallucinations a solved problem. GPT-4o search still gets 10% of factual questions wrong. Beyond their accuracy, AI search tools also tend to struggle with short, navigational queries (such as “Lakers score today”), and recent reports suggest that ChatGPT’s citations aren’t always reliable.
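To put the reported scores in concrete terms, the short Python sketch below converts each SimpleQA accuracy figure cited in this article into an error rate and an expected number of wrong answers. It follows the article’s framing that every question not answered correctly counts as wrong. The batch size of 1,000 questions is a hypothetical chosen only for illustration.

```python
# SimpleQA accuracy figures as reported in the article (percent correct).
REPORTED_ACCURACY = {
    "GPT-4o search": 90,
    "GPT-4o mini search": 88,
    "GPT-4.5": 63,
}


def error_rate(accuracy_percent):
    """Return the percentage of questions answered incorrectly."""
    return 100 - accuracy_percent


def expected_wrong(accuracy_percent, num_questions):
    """Expected number of wrong answers out of num_questions (integer math)."""
    return error_rate(accuracy_percent) * num_questions // 100


if __name__ == "__main__":
    for model, accuracy in REPORTED_ACCURACY.items():
        print(
            f"{model}: {error_rate(accuracy)}% wrong, "
            f"about {expected_wrong(accuracy, 1000)} per 1,000 questions"
        )
```

At that hypothetical volume, a 10% error rate works out to about 100 wrong answers per 1,000 questions for GPT-4o search, compared with about 370 for GPT-4.5.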

In a blog post provided to TechCrunch, OpenAI said that the CUA model is “not yet highly reliable for automating tasks on operating systems,” and that it’s susceptible to making “inadvertent” mistakes.

However, OpenAI said these are early iterations of its agent tools, and it’s constantly working to improve them.

Alongside the Responses API, OpenAI is releasing an open-source toolkit called the Agents SDK, which offers developers free tools to integrate models with their internal systems, put safeguards in place, and monitor AI agent activities for debugging and optimization purposes. The Agents SDK is a follow-up of sorts to OpenAI’s Swarm, a framework for multi-agent orchestration that the company released late last year.

Godement said he hopes OpenAI can bridge the gap between AI agent demos and products this year, and that, in his opinion, “agents are the most impactful application of AI that will happen.” That echoes a proclamation OpenAI CEO Sam Altman made in January: that 2025 is the year AI agents enter the workforce.

Whether or not 2025 truly becomes the “year of the AI agent,” OpenAI’s latest releases show the company wants to shift from flashy agent demos to impactful tools.