OpenAI used the subreddit r/ChangeMyView to create a test for measuring the persuasive abilities of its AI reasoning models. The company revealed this in a system card (a document outlining how an AI system works) that was released along with its new “reasoning” model, o3-mini, on Friday.

Millions of Reddit users are members of r/ChangeMyView, where they post hot takes hoping to learn about other points of view on a subject. In response to those hot takes, other users reply with persuasive arguments explaining why the original poster is wrong.

The subreddit is one of many Reddit forums that are basically a goldmine for tech companies, such as OpenAI, that need to train AI models on high-quality, human-generated data.

OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user’s mind on a subject. The company then shows the responses to testers, who assess how persuasive the argument is, and finally OpenAI compares the AI models’ responses to human replies for that same post.
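The system card does not say how the comparison is scored, so the following is only a minimal sketch of one plausible approach: rank the AI reply's tester rating against the ratings of human replies to the same post and report a percentile. The function name `percentile_rank`, the rating scale, and the sample numbers are all hypothetical and are not taken from OpenAI.

```python
from bisect import bisect_left


def percentile_rank(ai_score, human_scores):
    """Return the percentage of human scores strictly below ai_score.

    Hypothetical scoring rule: OpenAI has not published its method.
    """
    if not human_scores:
        raise ValueError("need at least one human score to compare against")
    ordered = sorted(human_scores)
    # bisect_left counts how many human scores are strictly lower.
    below = bisect_left(ordered, ai_score)
    return 100.0 * below / len(ordered)


# Made-up tester ratings for five human replies to a single post.
human_ratings = [2.0, 3.5, 1.0, 4.0, 2.5]
ai_rating = 3.8

print(percentile_rank(ai_rating, human_ratings))
```

In this made-up case the AI reply outscores four of the five human replies. A real evaluation would presumably aggregate over many posts and testers.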

The ChatGPT-maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display these posts within its products. We don’t know what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.

However, OpenAI tells TechCrunch the ChangeMyView-based evaluation is unrelated to its Reddit deal. It’s unclear how OpenAI accessed the subreddit’s data, and the company says it has no plans to release this evaluation to the public.

While OpenAI’s ChangeMyView benchmark is not new (it was used to evaluate o1 as well), it does highlight how valuable human data is for AI model developers, as well as the murky ways that tech companies obtain datasets.

Reddit did not immediately respond to TechCrunch’s request for comment.

While Reddit has struck a few AI licensing deals, the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman told The Verge last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him and said it’s been “a real pain in the ass to block these companies.”

Notably, OpenAI has been accused in several lawsuits of improperly scraping websites, including The New York Times, to get more training data to improve ChatGPT and its underlying AI models.

In terms of performance on the ChangeMyView benchmark, o3-mini does not appear to perform significantly better or worse than o1 or GPT-4o. However, OpenAI’s latest AI models appear to be more persuasive than most people on the r/ChangeMyView subreddit.

“GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans,” said OpenAI in o3-mini’s system card. “Currently, we do not witness models performing far better than humans, or clear superhuman performance.”

The goal for OpenAI is not to create hyper-persuasive AI models but instead to ensure AI models don’t get too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address it.

The fear driving these persuasion tests is that an AI model would be dangerous if it was very good at persuading its human users. Theoretically, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.

Even after scraping most of the public internet and jumping through hoops to license other data, the ChangeMyView benchmark shows how AI model developers are still struggling to find high-quality datasets to test their models. But obtaining them is easier said than done.