Google’s new reasoning model struggles with counting the letters in words, sometimes. Image Credits: Google
Google has released what it’s calling a new “reasoning” AI model, but it’s in the experimental stages, and from our brief testing, there’s certainly room for improvement.
The new model, called Gemini 2.0 Flash Thinking Experimental (a mouthful, to be sure), is available in AI Studio, Google’s AI prototyping platform. A model card describes it as “best for multimodal understanding, reasoning, and coding,” with the ability to “reason over the most complex problems” in fields such as programming, math, and physics.
In a post on X, Logan Kilpatrick, who leads product for AI Studio, called Gemini 2.0 Flash Thinking Experimental “the first step in [Google’s] reasoning journey.” Jeff Dean, chief scientist for Google DeepMind, Google’s AI research division, said in his own post that Gemini 2.0 Flash Thinking Experimental is “trained to use thoughts to strengthen its reasoning.”
“We see promising results when we increase inference time computation,” Dean said, referring to the amount of computing used to “run” the model as it considers a question.
It’s still an early version, but check out how the model handles a challenging puzzle involving both visual and textual clues: (2/3) pic.twitter.com/JltHeK7Fo7
Logan Kilpatrick (@OfficialLoganK), December 19, 2024
Built on Google’s recently announced Gemini 2.0 Flash model, Gemini 2.0 Flash Thinking Experimental appears to be similar in design to OpenAI’s o1 and other so-called reasoning models. Unlike most AI, reasoning models effectively fact-check themselves, which helps them avoid some of the pitfalls that normally trip up AI models.
As a drawback, reasoning models often take longer, usually seconds to minutes longer, to arrive at solutions.
Given a prompt, Gemini 2.0 Flash Thinking Experimental pauses before responding, considering a number of related prompts and “explaining” its reasoning along the way. After a while, the model summarizes what it considers to be the most accurate answer.
Well, that’s what’s supposed to happen. When I asked Gemini 2.0 Flash Thinking Experimental how many R’s were in the word “strawberry,” it said “two.”
Your mileage may vary.
In the wake of the release of o1, there’s been an explosion of reasoning models from rival AI labs, not just Google. In early November, DeepSeek, an AI research company funded by quant traders, launched a preview of its first reasoning model, DeepSeek-R1. That same month, Alibaba’s Qwen team unveiled what it claimed was the first “open” challenger to o1.
Bloomberg reported in October that Google had several teams developing reasoning models. Subsequent reporting by The Information in November revealed that the company has at least 200 researchers focusing on the technology.
What opened the reasoning model floodgates? Well, for one, the search for novel approaches to refine generative AI. As my colleague Max Zeff recently reported, “brute force” techniques to scale up models are no longer yielding the improvements they once did.
Not everyone’s convinced that reasoning models are the best path forward. They tend to be expensive, for one, thanks to the large amount of computing power required to run them. And while they’ve performed well on benchmarks so far, it’s not clear whether reasoning models can maintain this rate of progress.