OpenAI launches o3-mini, its latest ‘reasoning’ model

Topics

Latest

Amazon

Image Credits:Jakub Porzycki/NurPhoto / Getty Images

Apps

Biotech & Health

clime

Cloud Computing

Commerce

Crypto

Enterprise

EVs

Fintech

Fundraising

contraption

Gaming

Google

Government & Policy

Hardware

Instagram

layoff

Media & Entertainment

More from TechCrunch

Events

Startup Battlefield

StrictlyVC

Podcasts

Videos

Partner Content

TechCrunch Brand Studio

Crunchboard

touch Us

OpenAI on Friday launch a unexampled AI “ reasoning ” model , o3 - mini , the newest in the company’so sept of abstract thought theoretical account .

OpenAIfirst preview the model in Decemberalongside a more open organisation called o3 , but the launch comes at a pivotal moment for the companionship , whose ambition — and challenges — are seemingly growing by the day .

OpenAI is battling the percept that it ’s cede earth in the AI airstream toChinese party like DeepSeek , which OpenAI alleges might have steal its IP . It has been trying toshore up its relationship with Washingtonas it simultaneously pursues anambitious data center projection , andas it reportedly position the groundworkfor one of the large funding rounds in history .

Which play us to o3 - mini . OpenAI is pitch its fresh model as both “ hefty ” and “ affordable . ”

“ Today ’s launch marks [ … ] an important footfall toward broadening accessibility to modern AI in serving of our charge , ” an OpenAI voice tell TechCrunch .

More efficient reasoning

Unlike most large language model , reasoning models like o3 - mini thoroughly fact - break themselves before give out results . This help themavoid some of the pitfallsthat unremarkably activate up models . These reasoning manikin do take a little longer to arrive at solutions , but the barter - off is that they run to be more true — though not perfect — in knowledge base like physical science .

O3 - mini is fine - tuned for STEM problems , specifically for programming , mathematics , and skill . OpenAI take the example is largely on par with the o1 family , o1 and o1 - mini , in damage of capabilities , but lead quicker and costs less .

Join us at TechCrunch Sessions: AI

Exhibit at TechCrunch Sessions: AI

The company claimed that external testers prefer o3 - mini ’s answer over those from o1 - mini more than half the meter . O3 - mini apparently also made 39 % few “ major mistakes ” on “ tough substantial - human race questions ” inA / B vitamin testsversus o1 - mini , and produced “ well-defined ” response while delivering answers about 24 % faster .

Users with premium plans can choose o3 - mini using the ChatGPT drop cloth - down menu . Free user can click or tap the new “ Reason ” release in the chat bar , or have ChatGPT “ re - generate ” an answer .

Beginning Friday , o3 - mini will also be available via OpenAI ’s API to select developers , but it initially will not have support for analyze images . Devs can select the level of “ abstract thought effort ” ( scurvy , medium , or high ) to get o3 - mini to “ think harder ” based on their use causa and response time pauperism .

O3 - mini is priced at $ 0.55 per million cached input token and $ 4.40 per million output relic , where a million tokens equates to roughly 750,000 words . That ’s 63 % cheaper than o1 - mini , and competitive with DeepSeek ’s R1 reasoning model pricing . DeepSeek charges $ 0.14 per million cached input tokens and $ 2.19 per million output tokens for R1 access through its API .

In ChatGPT , o3 - mini is define to medium reasoning effort , which OpenAI says provides “ a balanced trade - off between stop number and accuracy . ” pay off user will have the option of take “ o3 - mini - high ” in the model chooser , which will save what OpenAI calls “ higher intelligence ” in exchange for slower responses .

Regardless of which version of o3 - mini ChatGPT users choose , the modeling will work with search to find up - to - particular date answers with links to relevant web sources . OpenAI caution that the functionality is a “ prototype ” as it works to desegregate search across its logical thinking models .

“ While o1 remains our broader universal - noesis logical thinking model , o3 - miniskirt provides a specialised alternative for technological arena requiring preciseness and speed , ” OpenAI write in a web log position on Friday . “ The release of o3 - mini marks another step in OpenAI ’s missionary post to push the boundaries of monetary value - effectual intelligence agency . ”

Caveats abound

O3 - mini is not OpenAI ’s most herculean model to date , nor does it leapfrog DeepSeek ’s R1 reasoning model in every bench mark .

O3 - mini beats R1 on AIME 2024 , a trial that measure how well model understand and respond to complex statement — but only with gamy reasoning effort . It also beats R1 on the computer programing - focussed test SWE - bench Verified ( by .1 full stop ) , but again , only with gamy reasoning movement . On low-spirited reasoning effort , o3 - mini lags R1 on GPQA Diamond , which try out model with Ph.D. - level physics , biology , and chemistry questions .

To be bonny , o3 - mini answers many queries at competitively low toll and latency . In the post , OpenAI equate its carrying into action to the o1 class :

“ With low abstract thought effort , o3 - miniskirt achieves comparable execution with o1 - miniskirt , while with average effort , o3 - mini reach comparable performance with o1 , ” OpenAI writes . “ O3 - mini with medium reasoning drive match o1 ’s performance in math , coding and skill while delivering faster responses . Meanwhile , with mellow reasoning effort , o3 - mini outperforms both o1 - mini and o1 . ”

It ’s worth noting that o3 - mini ’s performance advantage over o1 is slim in some areas . On AIME 2024 , o3 - miniskirt beatnik o1 by just 0.3 part points when set to gamey reasoning effort . And on GPQA Diamond , o3 - mini does n’t surpass o1 ’s score even on high reasoning exploit .

OpenAI asserts that o3 - mini is as “ dependable ” or good than the o1 family , however , thanks to reddened - team up efforts and its “ deliberative alignment ” methodology , which realize mannequin “ call up ” about OpenAI ’s guard insurance while they ’re responding to queries . According to the caller , o3 - mini “ significantly travel by ” one of OpenAI ’s flagship role model , GPT-4o , on “ challenge safety and prisonbreak evaluations . ”

Topics#

More from TechCrunch#

More efficient reasoning#

Join us at TechCrunch Sessions: AI#

Exhibit at TechCrunch Sessions: AI#

Caveats abound#