Alibaba releases an ‘open’ challenger to OpenAI’s o1 reasoning model

Image Credits: NanoStockk / Getty Images

Alibaba QwQ-32B-Preview

Image Credits: Alibaba

A new so-called “reasoning” AI model, QwQ-32B-Preview, has arrived on the scene. It’s one of the few to rival OpenAI’s o1, and it’s the first available to download under a permissive license.

Developed by Alibaba’s Qwen team, QwQ-32B-Preview contains 32.5 billion parameters and can consider prompts up to ~32,000 words in length; it performs better on certain benchmarks than o1-preview and o1-mini, the two reasoning models that OpenAI has released so far. (Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters. OpenAI does not disclose the parameter count for its models.)

Per Alibaba’s testing, QwQ-32B-Preview beats OpenAI’s o1-preview model on the AIME and MATH tests. AIME uses other AI models to evaluate a model’s performance, while MATH is a collection of word problems.

QwQ-32B-Preview can solve logic puzzles and answer reasonably challenging math questions, thanks to its “reasoning” capabilities. But it isn’t perfect. Alibaba notes in a blog post that the model might switch languages unexpectedly, get stuck in loops, and underperform on tasks that require “common sense reasoning.”

Unlike most AI, QwQ-32B-Preview and other reasoning models effectively fact-check themselves. This helps them avoid some of the pitfalls that normally trip up models, with the downside being that they often take longer to arrive at solutions. Similar to o1, QwQ-32B-Preview reasons through tasks, planning ahead and performing a series of actions that help the model tease out answers.

QwQ-32B-Preview, which can be run on and downloaded from the AI dev platform Hugging Face, appears to be similar to the recently released DeepSeek reasoning model in that it treads lightly around certain political subjects. Alibaba and DeepSeek, being Chinese companies, are subject to benchmarking by China’s internet regulator to ensure their models’ responses “embody core socialist values.” Many Chinese AI systems decline to respond to topics that might raise the ire of regulators, like speculation about the Xi Jinping regime.

Asked “Is Taiwan a part of China?,” QwQ-32B-Preview answered that it was (and “inalienable” as well), a perspective out of step with most of the world but in line with that of China’s ruling party. Prompts about Tiananmen Square, meanwhile, yielded a non-response.

QwQ-32B-Preview is “openly” available under an Apache 2.0 license, meaning it can be used for commercial applications. But only certain components of the model have been released, making it impossible to replicate QwQ-32B-Preview or gain much insight into the system’s inner workings. The “openness” of AI models is not a settled question, but there is a general continuum from more closed (API access only) to more open (model, weights, data disclosed), and this one falls somewhere in the middle.

The increased attention on reasoning models comes as the viability of “scaling laws,” long-held theories that throwing more data and computing power at a model would continuously increase its capabilities, is coming under scrutiny. A flurry of press reports suggests that models from major AI labs including OpenAI, Google, and Anthropic aren’t improving as dramatically as they once did.

That has led to a scramble for new AI approaches, architectures, and development techniques, one of which is test-time compute. Also known as inference compute, test-time compute essentially gives models extra processing time to complete tasks, and underpins models like o1 and QwQ-32B-Preview.

Big labs besides OpenAI and Chinese firms are betting test-time compute is the future. According to a recent report from The Information, Google has expanded an internal team focused on reasoning models to about 200 people, and added substantial computing power to the effort.
