One of Google’s recent Gemini AI models scores worse on safety

Image Credits: Andrey Rudakov/Bloomberg / Getty Images

A recently released Google AI model scores worse on certain safety tests than its predecessor, according to the company’s internal benchmarking.

In a technical report published this week, Google reveals that its Gemini 2.5 Flash model is more likely to generate text that violates its safety guidelines than Gemini 2.0 Flash. On two metrics, “text-to-text safety” and “image-to-text safety,” Gemini 2.5 Flash regresses 4.1% and 9.6%, respectively.

Text-to-text safety measures how frequently a model violates Google’s guidelines given a prompt, while image-to-text safety evaluates how closely the model adheres to these boundaries when prompted using an image. Both tests are automated, not human-supervised.

In an emailed statement, a Google spokesperson confirmed that Gemini 2.5 Flash “performs worse on text-to-text and image-to-text safety.”

These surprising benchmark results come as AI companies move to make their models more permissive, in other words, less likely to refuse to respond to controversial or sensitive subjects. For its latest crop of Llama models, Meta said it tuned the models not to endorse “some views over others” and to reply to more “debated” political prompts. OpenAI said earlier this year that it would tweak future models to not take an editorial stance and to offer multiple perspectives on controversial topics.

Sometimes, those permissiveness efforts have backfired. TechCrunch reported Monday that the default model powering OpenAI’s ChatGPT allowed minors to generate erotic conversations. OpenAI blamed the behavior on a “bug.”

According to Google’s technical report, Gemini 2.5 Flash, which is still in preview, follows instructions more faithfully than Gemini 2.0 Flash, including instructions that cross problematic lines. The company claims that the regressions can be attributed partly to false positives, but it also admits that Gemini 2.5 Flash sometimes generates “offensive content” when explicitly asked.

“Naturally, there is tension between [instruction following] on sensitive topics and safety policy violations, which is reflected across our evaluations,” reads the report.

Scores from SpeechMap, a benchmark that probes how models respond to sensitive and controversial prompts, also suggest that Gemini 2.5 Flash is far less likely to refuse to answer contentious questions than Gemini 2.0 Flash. TechCrunch’s testing of the model via AI platform OpenRouter found that it’ll uncomplainingly write essays in support of replacing human judges with AI, weakening due process protections in the U.S., and implementing widespread warrantless government surveillance programs.

Thomas Woodside, co-founder of the Secure AI Project, said the limited details Google gave in its technical report demonstrate the need for more transparency in model testing.

“There’s a trade-off between instruction-following and policy following, because some users may ask for content that would violate policies,” Woodside told TechCrunch. “In this case, Google’s latest Flash model complies with instructions more while also violating policies more. Google doesn’t provide much detail on the specific cases where policies were violated, although they say they are not severe. Without knowing more, it’s hard for independent analysts to know whether there’s a problem.”

Google has come under fire for its model safety reporting practices before.

It took the company weeks to publish a technical report for its most capable model, Gemini 2.5 Pro. When the report eventually was published, it initially omitted key safety testing details.

On Monday, Google released a more detailed report with additional safety information.
