An organization developing math benchmarks for AI didn't disclose that it had received funding from OpenAI until relatively recently, drawing allegations of impropriety from some in the AI community.
Epoch AI, a nonprofit primarily funded by Open Philanthropy, a research and grantmaking foundation, revealed on December 20 that OpenAI had supported the creation of FrontierMath. FrontierMath, a test with expert-level problems designed to measure an AI's mathematical skills, was one of the benchmarks OpenAI used to demo its upcoming flagship AI, o3.
In a post on the forum LessWrong, a contractor for Epoch AI going by the username "Meemi" said that many contributors to the FrontierMath benchmark weren't informed of OpenAI's involvement until it was made public.
"The communication about this has been non-transparent," Meemi wrote. "In my view Epoch AI should have disclosed OpenAI funding, and contractors should have transparent information about the potential of their work being used for capabilities, when choosing whether to work on a benchmark."
On social media, some users raised concerns that the secrecy could erode FrontierMath's reputation as an objective benchmark. In addition to funding FrontierMath, OpenAI had visibility into many of the problems and solutions in the benchmark, a fact that Epoch AI didn't disclose prior to December 20, when o3 was announced.
In a post on X, Stanford mathematics Ph.D. student Carina Hong also asserted that OpenAI had privileged access to FrontierMath thanks to its arrangement with Epoch AI, and that this wasn't sitting well with some contributors.

"Six mathematicians who significantly contributed to the FrontierMath benchmark confirmed [to me] … that they are unaware that OpenAI will have exclusive access to this benchmark (and others won't)," Hong said. "Most express they are not sure they would have contributed had they known."
In a response to Meemi's post, Tamay Besiroglu, associate director of Epoch AI and one of the organization's co-founders, asserted that the integrity of FrontierMath hadn't been compromised, but admitted that Epoch AI "made a mistake" in not being more transparent.

"We were restricted from disclosing the partnership until around the time o3 launched, and in hindsight we should have negotiated harder for the ability to be transparent to the benchmark contributors as soon as possible," Besiroglu wrote. "Our mathematicians deserved to know who might have access to their work. Even though we were contractually limited in what we could say, we should have made transparency with our contributors a non-negotiable part of our agreement with OpenAI."
Besiroglu added that while OpenAI has access to FrontierMath, it has a "verbal agreement" with Epoch AI not to use FrontierMath's problem set to train its AI. (Training an AI on FrontierMath would be akin to teaching to the test.) Epoch AI also has a "separate holdout set" that serves as an additional safeguard for independent verification of FrontierMath benchmark results, Besiroglu said.

"OpenAI has … been fully supportive of our decision to maintain a separate, unseen holdout set," Besiroglu wrote.
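The logic of a holdout set is straightforward: if a slice of problems is withheld from everyone, including funders, a model that was trained on the public problems should score noticeably better on those than on the unseen ones. The sketch below is a hypothetical illustration of that idea in Python, not Epoch AI's actual methodology; the function names, split ratio, and scoring interface are all assumptions.

```python
# Hypothetical sketch of a benchmark holdout split and a simple
# contamination check. This is NOT Epoch AI's code; names and the
# 20% holdout fraction are illustrative assumptions.
import random

def split_benchmark(problems, holdout_fraction=0.2, seed=0):
    """Partition a benchmark into a public set and a private holdout set."""
    rng = random.Random(seed)
    shuffled = problems[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * holdout_fraction)
    return shuffled[cut:], shuffled[:cut]  # (public, holdout)

def contamination_gap(score_fn, public, holdout):
    """Difference between accuracy on public and holdout problems.

    score_fn(problem) should return 1 if the model solves the problem,
    0 otherwise. A large positive gap is a red flag that the model may
    have seen, or been trained on, the public problems.
    """
    public_acc = sum(score_fn(p) for p in public) / len(public)
    holdout_acc = sum(score_fn(h) for h in holdout) / len(holdout)
    return public_acc - holdout_acc
```

In this framing, a score near zero on the gap is consistent with a clean evaluation, while "teaching to the test" shows up as inflated public-set accuracy that the holdout set cannot reproduce.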
However, muddying the waters, Epoch AI lead mathematician Elliot Glazer noted in a post on Reddit that Epoch AI hasn't been able to independently verify OpenAI's FrontierMath o3 results.

"My personal opinion is that [OpenAI's] score is legit (i.e., they didn't train on the dataset), and that they have no incentive to lie about internal benchmarking performance," Glazer said. "However, we can't vouch for them until our independent evaluation is complete."
The saga is yet another example of the challenge of developing empirical benchmarks to evaluate AI, and of securing the necessary resources for benchmark development without creating the perception of conflicts of interest.