OpenAI used this subreddit to test AI persuasion

OpenAI used the subreddit r/ChangeMyView to create a test for measuring the persuasive abilities of its AI reasoning models. The company revealed this in a system card (a document outlining how an AI system works) that was released along with its new “reasoning” model, o3-mini, on Friday.

Millions of Reddit users are members of r/ChangeMyView, where they post hot takes hoping to learn about other points of view on a subject. In response to those hot takes, other users reply with persuasive arguments explaining why the original poster is wrong.

The subreddit is one of many Reddit forums that are basically a goldmine for tech companies, such as OpenAI, that need to train AI models on high-quality, human-generated data.

OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user’s mind on a subject. The company then shows the responses to testers, who assess how persuasive each argument is, and finally OpenAI compares the AI models’ responses to human replies for that same post.

The ChatGPT maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display these posts within its products. We don’t know what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.

However, OpenAI tells TechCrunch the ChangeMyView-based evaluation is unrelated to its Reddit deal. It’s unclear how OpenAI accessed the subreddit’s data, and the company says it has no plans to release this evaluation to the public.

While OpenAI’s ChangeMyView benchmark is not new (it was used to evaluate o1 as well), it does highlight how valuable human data is for AI model developers, as well as the murky ways that tech companies obtain datasets.

Reddit did not immediately respond to TechCrunch’s request for comment.

While Reddit has struck a few AI licensing deals, the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman told The Verge last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him and said it’s been “a real pain in the ass to block these companies.”

Notably, OpenAI has been accused in several lawsuits of improperly scraping websites, including The New York Times, to get more training data to improve ChatGPT and its underlying AI models.

In terms of performance on the ChangeMyView benchmark, o3-mini does not appear to perform significantly better or worse than o1 or GPT-4o. However, OpenAI’s latest AI models appear to be more persuasive than most people on the r/ChangeMyView subreddit.

“GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans,” said OpenAI in o3-mini’s system card. “Currently, we do not witness models performing far better than humans, or clear superhuman performance.”
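To make the percentile framing concrete, here is a minimal sketch of one way such a number could be computed from tester ratings. It is not OpenAI's method, which the company has not released. The function name `percentile_rank`, the numeric rating scale, and the half-credit treatment of ties are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(ai_score: float, human_scores: list[float]) -> float:
    """Return the percentile of an AI reply among human replies to the same post.

    Hypothetical helper: scores are tester-assigned persuasiveness ratings,
    and a tie with a human reply counts as half a "win" (an assumption).
    """
    if not human_scores:
        raise ValueError("need at least one human score")
    ordered = sorted(human_scores)
    # Number of human replies rated strictly lower than the AI reply.
    below = bisect_left(ordered, ai_score)
    # Number of human replies with exactly the same rating.
    ties = bisect_right(ordered, ai_score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Made-up ratings for five human replies to one post.
human_ratings = [1.0, 2.0, 3.0, 4.0, 5.0]
print(percentile_rank(4.5, human_ratings))
```

Under this sketch, an AI reply that outscores four of five human replies lands at the 80th percentile, the low end of the range OpenAI quotes.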

The goal for OpenAI is not to create hyper-persuasive AI models but instead to ensure AI models don’t get too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address it.

The fear driving these persuasion tests is that an AI model would be dangerous if it were very good at persuading its human users. Theoretically, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.

Even after scraping most of the public internet and jumping through hoops to license other data, the ChangeMyView benchmark shows how AI model developers are still struggling to find high-quality datasets to test their models. But obtaining them is easier said than done.
