OpenAI launches program to design new ‘domain-specific’ AI benchmarks

Image Credits:Jakub Porzycki/NurPhoto / Getty Images

OpenAI ChatGPT website displayed on a laptop screen is seen in this illustration photo.

OpenAI thinks AI benchmarks are broken. Now the company is launching a program to fix how AI models are scored.

The new OpenAI Pioneers Program will focus on creating evaluations for AI models that “set the bar for what good looks like,” as OpenAI phrased it in a blog post.

“As the pace of AI adoption accelerates across industries, there is a need to understand and improve its impact in the world,” the company continued in its post. “Creating domain-specific evals are one way to better reflect real-world use cases, helping teams assess model performance in practical, high-stakes environments.”

As the recent controversy with the crowdsourced benchmark LM Arena and Meta’s Maverick model illustrates, it’s tough to know, these days, precisely what differentiates one model from another. Many widely used AI benchmarks measure performance on esoteric tasks, like solving doctorate-level math problems. Others can be gamed, or don’t align well with most people’s preferences.

Through the Pioneers Program, OpenAI hopes to create benchmarks for specific domains like legal, finance, insurance, healthcare, and accounting. The lab says that, in the coming months, it’ll work with “multiple companies” to design tailored benchmarks and eventually share those benchmarks publicly, along with “industry-specific” evaluations.

“The first cohort will focus on startups who will help lay the foundations of the OpenAI Pioneers Program,” OpenAI wrote in the blog post. “We’re selecting a handful of startups for this initial cohort, each working on high-value, applied use cases where AI can drive real-world impact.”

Companies in the program will also have the opportunity to work with OpenAI’s team to create model improvements via reinforcement fine-tuning, a technique that optimizes models for a narrow set of tasks, OpenAI says.

The big question is whether the AI community will embrace benchmarks whose creation was funded by OpenAI. OpenAI has supported benchmarking efforts financially before, and designed its own evaluations. But partnering with customers to release AI tests may be seen as an ethical bridge too far.
