Chinese AI lab DeepSeek might be getting the bulk of the tech industry’s attention this week. But one of its top domestic rivals, Alibaba, isn’t sitting idly by.
Alibaba’s Qwen team on Monday released a new family of AI models, Qwen2.5-VL, that can perform a number of text and image analysis tasks. The models can parse files, understand videos, and count objects in images, as well as control a PC, similar to the model powering OpenAI’s recently launched Operator.
Per the Qwen team’s benchmarking, the best Qwen2.5-VL model beats OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 2.0 Flash on a range of video understanding, math, document analysis, and question-answering evaluations.
Qwen2.5-VL, which is available to test in Alibaba’s Qwen Chat app and to download from AI dev platform Hugging Face, can analyze charts and graphics, extract data from scans of invoices and forms, and “comprehend” multi-hour videos, the Qwen team says. Qwen2.5-VL can also recognize “IPs from film and TV series, as well as a wide variety of products,” per the team, suggesting that the models might’ve been trained in part on copyrighted works.
Qwen2.5-VL, being AI developed by a Chinese company, has certain restrictions on the topics it will discuss, at least in Qwen Chat. When I asked the largest and most capable Qwen2.5-VL model, Qwen2.5-VL-72B, to talk about “Xi Jinping’s mistakes,” Qwen Chat gave an error message.
China’s internet regulator benchmarks many models developed in the country to ensure their responses “embody core socialist values.” Many Chinese AI systems decline to respond to topics that might raise the ire of regulators, such as Taiwan’s autonomy.
One of Qwen2.5-VL’s more interesting features is its ability to interact with software, both on PCs and mobile devices. A video posted on X by Philipp Schmid, a technical lead at Hugging Face, showed Qwen2.5-VL launching the Booking.com app for Android and booking a flight from Chongqing to Beijing.
Don’t Miss @Alibaba_Qwen 2.5 VL! Despite all the Deepseek Hype, Qwen just dropped the best open Multimodal! Qwen 2.5 VL is a Vision Language Model that can control your computer, similar to the @OpenAI operator, extract structured information from charts, and more!!
TL;DR: 3️⃣ … pic.twitter.com/GeEGVdl0tI
Philipp Schmid (@_philschmid), January 27, 2025
In the video below, a Qwen2.5-VL model controls apps on a Linux desktop, but doesn’t seem to accomplish much beyond switching tabs. Perhaps tellingly, Qwen’s benchmarking shows Qwen2.5-VL scoring poorly on OSWorld, a benchmark that tries to mimic a real computer environment.
LMAO Qwen 2.5 VL can do Computer Use, out of the box, taking on OpenAI Operator HEAD ON! 🐐 pic.twitter.com/lwMECXzNSu
Vaibhav (VB) Srivastav (@reach_vb), January 27, 2025
The two smaller, less advanced models in the Qwen2.5-VL series, Qwen2.5-VL-3B and Qwen2.5-VL-7B, are available under a permissive license. The flagship Qwen2.5-VL-72B, however, is under Alibaba’s custom license, which requires that companies and devs with more than 100 million monthly active users request permission from Qwen/Alibaba before deploying the model commercially.