OpenAI Chief Executive Officer Sam Altman speaks during the Kakao media day in Seoul.

Image Credits:Kim Jae-Hwan/SOPA Images/LightRocket / Getty Images

In mid-April, OpenAI launched a powerful new AI model, GPT-4.1, which it claimed “excelled” at following instructions. But the results of several independent tests suggest the model is less aligned (that is to say, less reliable) than previous OpenAI releases.

When OpenAI launches a new model, it typically publishes a detailed technical report containing the results of first- and third-party safety evaluations. The company skipped that step for GPT-4.1, claiming that the model wasn’t “frontier” and thus did not warrant a separate report.

That spurred some researchers and developers to investigate whether GPT-4.1 behaves less desirably than GPT-4o, its predecessor.

According to Oxford AI research scientist Owain Evans, fine-tuning GPT-4.1 on insecure code causes the model to give “misaligned responses” to questions about subjects like gender roles at a “considerably higher” rate than GPT-4o. Evans previously co-authored a study showing that a version of GPT-4o trained on insecure code could be primed to exhibit malicious behaviors.

In an upcoming follow-up to that study, Evans and his co-authors found that GPT-4.1, when fine-tuned on insecure code, seems to display “new malicious behaviors,” such as trying to trick a user into sharing their password. To be clear, neither GPT-4.1 nor GPT-4o acts misaligned when trained on secure code.

Emergent misalignment update: OpenAI’s new GPT4.1 shows a higher rate of misaligned responses than GPT4o (and any other model we’ve tested). It also has seems to display some new malicious behaviors, such as tricking the user into sharing a password. pic.twitter.com/5QZEgeZyJo

Owain Evans (@OwainEvans_UK), April 17, 2025

“We are discovering unexpected ways that models can become misaligned,” Evans told TechCrunch. “Ideally, we’d have a science of AI that would allow us to predict such things in advance and reliably avoid them.”

A separate test of GPT-4.1 by SplxAI, an AI red teaming startup, revealed similar tendencies.

In around 1,000 simulated test cases, SplxAI uncovered evidence that GPT-4.1 veers off topic and allows “intentional” misuse more often than GPT-4o. The cause, SplxAI posits, is GPT-4.1’s preference for explicit instructions. GPT-4.1 doesn’t handle vague directions well, a fact OpenAI itself admits, which opens the door to unintended behaviors.

“This is a great feature in terms of making the model more useful and reliable when solving a specific task, but it comes at a price,” SplxAI wrote in a blog post. “[P]roviding explicit instructions about what should be done is quite straightforward, but providing sufficiently explicit and precise instructions about what shouldn’t be done is a different story, since the list of unwanted behaviors is much larger than the list of wanted behaviors.”

In OpenAI’s defense, the company has published prompting guides aimed at mitigating potential misalignment in GPT-4.1. But the independent tests’ findings serve as a reminder that newer models aren’t necessarily improved across the board. In a similar vein, OpenAI’s new reasoning models hallucinate (i.e., make stuff up) more than the company’s older models.

We’ve reached out to OpenAI for comment.