DeepSeek claims its ‘reasoning’ model beats OpenAI’s o1 on certain benchmarks

Binary code in blue with little yellow locks in between to illustrate data protection. Image Credits: Peresmeh / Getty Images

DeepSeek R1 refusal: R1’s filtering in action. Image Credits: DeepSeek

Chinese AI lab DeepSeek has released an open version of DeepSeek-R1, its so-called reasoning model, that it claims performs as well as OpenAI’s o1 on certain AI benchmarks.

R1 is available from the AI dev platform Hugging Face under an MIT license, meaning it can be used commercially without restrictions. According to DeepSeek, R1 beats o1 on the benchmarks AIME, MATH-500, and SWE-bench Verified. AIME draws on problems from the American Invitational Mathematics Examination, a competition-level math test, while MATH-500 is a collection of word problems. SWE-bench Verified, meanwhile, focuses on programming tasks.

Being a reasoning model, R1 effectively fact-checks itself, which helps it to avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer (usually seconds to minutes longer) to arrive at solutions compared to a typical nonreasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

R1 contains 671 billion parameters, DeepSeek revealed in a technical report. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.

Indeed, 671 billion parameters is massive, but DeepSeek also released “distilled” versions of R1 ranging in size from 1.5 billion parameters to 70 billion parameters. The smallest can run on a laptop. As for the full R1, it requires beefier hardware, but it is available through DeepSeek’s API at prices 90%-95% cheaper than OpenAI’s o1.
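To give a rough sense of why the smallest distilled model fits on a laptop while the full R1 does not, the sketch below estimates the memory needed just to hold each model's weights. The 16-bit (2 bytes per parameter) default is an assumption for illustration, not a figure from DeepSeek, and the helper name `weight_memory_gb` is hypothetical. The estimate ignores activations, caches, and runtime overhead, so real requirements would be higher.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes).

    bytes_per_param is an assumption: 2.0 corresponds to 16-bit weights,
    0.5 to 4-bit quantized weights.
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts quoted in the article.
MODEL_SIZES = {
    "smallest distilled R1 (1.5B)": 1.5e9,
    "largest distilled R1 (70B)": 70e9,
    "full R1 (671B)": 671e9,
}

for name, num_params in MODEL_SIZES.items():
    gb = weight_memory_gb(num_params)
    print(f"{name}: roughly {gb:,.0f} GB of weights at 16-bit precision")
```

Under these assumptions the gap between the smallest and largest models is the same factor as the gap in parameter counts, which is why the distilled versions are the ones people can run locally.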

Clem Delangue, the CEO of Hugging Face, said in a post on X on Monday that developers on the platform have created more than 500 “derivative” models of R1 that have racked up 2.5 million downloads combined, five times the number of downloads the official R1 has gotten.

It’s been released just a few days ago and already more than 500 derivative models of @deepseek_ai have been created all over the world on @huggingface with 2.5 million downloads (5x the original weights). The power of decentralized open-source AI!

There is a downside to R1. Being a Chinese model, it’s subject to benchmarking by China’s internet regulator to ensure that its responses “embody core socialist values.” R1 won’t answer questions about Tiananmen Square, for instance, or Taiwan’s autonomy.

Many Chinese AI systems, including other reasoning models, decline to respond to topics that might raise the ire of regulators in the country, such as speculation about the Xi Jinping regime.

R1 arrives days after the outgoing Biden administration proposed harsher export rules and restrictions on AI technologies for Chinese ventures. Companies in China were already prevented from buying advanced AI chips, but if the new rules go into effect as written, companies will be faced with stricter caps on both the semiconductor tech and models needed to bootstrap sophisticated AI systems.

In a policy document last week, OpenAI urged the U.S. government to support the development of U.S. AI, lest Chinese models match or surpass them in capability. In an interview with The Information, OpenAI’s VP of policy Chris Lehane singled out High Flyer Capital Management, DeepSeek’s corporate parent, as an organization of particular concern.

So far, at least three Chinese labs (DeepSeek, Alibaba, and Kimi, which is owned by Chinese unicorn Moonshot AI) have produced models that they claim rival o1. (Of note, DeepSeek was the first; it announced a preview of R1 in late November.) In a post on X, Dean Ball, an AI researcher at George Mason University, said that the trend suggests Chinese AI labs will continue to be “fast followers.”

“The impressive performance of DeepSeek’s distilled models […] means that very capable reasoners will continue to proliferate widely and be runnable on local hardware,” Ball wrote, “far from the eyes of any top-down control regime.”

This story was originally published on January 20 and was updated on January 27 with more information.
