Ai2 says its new AI model beats one of DeepSeek’s best

Image Credits: Yuichiro Chino / Getty Images

Abstract image of big data wave and information vertical line dots on a dark background.

Ai2 Tulu3-405B

Ai2 tested Tulu 3 405B on popular benchmarks. Image Credits: Ai2

Move over, DeepSeek. There’s a new AI champ in town, and they’re American.

On Thursday, Ai2, a nonprofit AI research institute based in Seattle, released a model that it claims outperforms DeepSeek V3, one of Chinese AI company DeepSeek’s leading systems.

Ai2’s model, called Tulu 3 405B, also beats OpenAI’s GPT-4o on certain AI benchmarks, according to Ai2’s internal testing. Moreover, unlike GPT-4o (and even DeepSeek V3), Tulu 3 405B is open source, which means all of the components necessary to replicate it from scratch are freely available and permissively licensed.

A spokesperson for Ai2 told TechCrunch that the lab believes Tulu 3 405B “underscores the U.S.’ potential to lead the global development of best-in-class generative AI models.”

“This milestone is a key moment for the future of open AI, reinforcing the U.S.’ position as a leader in competitive, open source models,” the spokesperson said. “With this launch, Ai2 is introducing a powerful, U.S.-developed alternative to DeepSeek’s models, marking a pivotal moment not just in AI development, but in showcasing that the U.S. can lead with competitive, open source AI independent of the tech giants.”

Tulu 3 405B is a rather large model. Containing 405 billion parameters, it required 256 GPUs running in parallel to train, according to Ai2. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.
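
For a rough sense of that scale, here is a back-of-the-envelope sketch in Python. The 405 billion parameters and 256 GPUs come from Ai2; the 2 bytes per parameter (16-bit weights) and the even split across GPUs are assumptions made purely for illustration, not details Ai2 has confirmed. The figure covers stored weights only, and training needs considerably more memory for gradients, optimizer state, and activations.

```python
# Figures reported by Ai2.
PARAMETERS = 405e9   # 405 billion parameters
GPUS = 256           # GPUs used in parallel for training

# Assumption for illustration only: each weight is stored in 16 bits (2 bytes).
BYTES_PER_PARAMETER = 2


def weight_gigabytes(parameters: float, bytes_per_parameter: int) -> float:
    """Return the storage needed for the weights alone, in gigabytes (1e9 bytes)."""
    return parameters * bytes_per_parameter / 1e9


total_gb = weight_gigabytes(PARAMETERS, BYTES_PER_PARAMETER)

# Assumption for illustration only: weights are split evenly across all GPUs.
per_gpu_gb = total_gb / GPUS

print(f"Weights alone: {total_gb:.0f} GB in total, about {per_gpu_gb:.2f} GB per GPU")
```

Under those assumptions the weights alone come to roughly 810 GB, more than a single GPU can typically hold, which is one reason a model of this size has to be spread across many devices.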

According to Ai2, one of the keys to achieving competitive performance with Tulu 3 405B was a technique called reinforcement learning with verifiable rewards. Reinforcement learning with verifiable rewards, or RLVR, trains models on tasks with “verifiable” outcomes, like math problem solving and following instructions.
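
To make “verifiable” concrete, here is a minimal sketch in Python of the kind of automatic check such a setup could use as its reward signal. It is not Ai2’s implementation: the function names, the number-extraction rule, and the word-limit instruction are all hypothetical. The point is only that the reward comes from a programmatic check with a clear pass or fail result rather than from a human or learned judge.

```python
import re

# Matches integers and decimals, with an optional leading minus sign.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def extract_final_number(text: str):
    """Return the last number in the text as a float, or None if there is none."""
    matches = NUMBER.findall(text.replace(",", ""))
    return float(matches[-1]) if matches else None


def math_reward(model_output: str, reference_answer: float) -> float:
    """Hypothetical math check: 1.0 if the final number equals the reference, else 0.0."""
    predicted = extract_final_number(model_output)
    if predicted is None:
        return 0.0
    return 1.0 if abs(predicted - reference_answer) < 1e-6 else 0.0


def word_limit_reward(model_output: str, max_words: int) -> float:
    """Hypothetical instruction check: 1.0 if the reply stays within the word limit."""
    return 1.0 if len(model_output.split()) <= max_words else 0.0


# Example: a reply to a math word problem whose known answer is 1250.
reply = "Each crate holds 50 apples, so 25 crates hold 1,250 apples."
print(math_reward(reply, 1250.0))
print(word_limit_reward(reply, max_words=20))
```

In a training loop, scores like these would stand in for the reward that the reinforcement learning algorithm optimizes. How Ai2 actually defines and combines its checks is not described here.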

Ai2 claims that on the benchmark PopQA, a set of 14,000 specialized knowledge questions sourced from Wikipedia, Tulu 3 405B beat not only DeepSeek V3 and GPT-4o, but also Meta’s Llama 3.1 405B model. Tulu 3 405B also had the highest performance of any model in its class on GSM8K, a test containing grade school-level math word problems.

Tulu 3 405B is available to test via Ai2’s chatbot web app, and the code to train the model is on GitHub and the AI dev platform Hugging Face. Get it while it’s hot, and before the next benchmark-beating flagship AI model comes along.
