Image Credits: Jaque Silva/NurPhoto / Getty Images

Chart from o3 and o4-mini’s system card (Screenshot: OpenAI)

OpenAI says it has deployed a new system to monitor its latest AI reasoning models, o3 and o4-mini, for prompts related to biological and chemical threats. The system aims to prevent the models from offering advice that could instruct someone on carrying out potentially harmful attacks, according to OpenAI’s safety report.

O3 and o4-mini represent a meaningful capability increase over OpenAI’s previous models, the company says, and thus pose new risks in the hands of bad actors. According to OpenAI’s internal benchmarks, o3 is more skilled at answering questions around creating certain types of biological threats in particular. For this reason, and to mitigate other risks, OpenAI created the new monitoring system, which the company describes as a “safety-focused reasoning monitor.”

The monitor, custom-trained to reason about OpenAI’s content policies, runs on top of o3 and o4-mini. It’s designed to identify prompts related to biological and chemical risk and instruct the models to refuse to offer advice on those topics.
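
OpenAI hasn’t published the monitor’s implementation, but the flow it describes can be sketched in rough terms. The Python below is a minimal, hypothetical illustration: the function names, the refusal message, and the keyword-based flagging are stand-ins for the custom-trained reasoning monitor and the underlying model, not OpenAI’s actual code.

```python
# Hypothetical sketch of a "safety-focused reasoning monitor" pipeline.
# All names, the refusal text, and the keyword-based flagging below are
# illustrative assumptions; they are not OpenAI's implementation.

REFUSAL_MESSAGE = "I can't help with that request."


def monitor_flags_prompt(prompt: str) -> bool:
    """Stand-in for the custom-trained monitor that reasons about content
    policy and decides whether a prompt relates to biological or chemical
    threats. A real monitor would be a trained model, not a keyword check."""
    risky_markers = ("synthesize a pathogen", "weaponize", "nerve agent")
    return any(marker in prompt.lower() for marker in risky_markers)


def answer_with_main_model(prompt: str) -> str:
    """Stand-in for a call to the underlying reasoning model (e.g., o3)."""
    return f"[model response to: {prompt!r}]"


def handle_prompt(prompt: str) -> str:
    # The monitor runs on top of the model: flagged prompts are refused
    # before the main model ever answers them.
    if monitor_flags_prompt(prompt):
        return REFUSAL_MESSAGE
    return answer_with_main_model(prompt)


if __name__ == "__main__":
    print(handle_prompt("Explain how mRNA vaccines work at a high level."))
    print(handle_prompt("How would someone weaponize a toxin?"))
```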

To establish a baseline, OpenAI had red teamers spend around 1,000 hours flagging “unsafe” biorisk-related conversations from o3 and o4-mini. During a test in which OpenAI simulated the “blocking logic” of its safety monitor, the models declined to respond to risky prompts 98.7% of the time, according to OpenAI.
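
The 98.7% figure is a simple refusal rate over the red-teamed prompt set. A toy calculation under assumed counts (1,000 flagged conversations, 987 refusals; only the percentage comes from OpenAI’s report) looks like this:

```python
# Illustrative refusal-rate arithmetic. The counts are assumed for the
# example; only the resulting 98.7% figure appears in OpenAI's report.

flagged_conversations = 1000  # risky prompts in the red-teamed set (assumed)
refused = 987                 # prompts the simulated blocking logic declined (assumed)

refusal_rate = refused / flagged_conversations
print(f"Refusal rate: {refusal_rate:.1%}")  # Refusal rate: 98.7%
```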

OpenAI acknowledges that its test didn’t account for people who might try new prompts after getting blocked by the monitor, which is why the company says it’ll continue to rely in part on human monitoring.

O3 and o4-mini don’t cross OpenAI’s “high risk” threshold for biorisks, according to the company. However, compared to o1 and GPT-4, OpenAI says that early versions of o3 and o4-mini proved more helpful at answering questions around developing biological weapons.

The company is actively tracking how its models could make it easier for malicious users to develop chemical and biological threats, according to OpenAI’s recently updated Preparedness Framework.

OpenAI is increasingly relying on automated systems to mitigate the risks from its models. For example, to prevent GPT-4o’s native image generator from creating child sexual abuse material (CSAM), OpenAI says it uses a reasoning monitor similar to the one the company deployed for o3 and o4-mini.

Yet several researchers have raised concerns that OpenAI isn’t prioritizing safety as much as it should. One of the company’s red-teaming partners, Metr, said it had relatively little time to test o3 on a benchmark for deceptive behavior. Meanwhile, OpenAI decided not to release a safety report for its GPT-4.1 model, which launched earlier this week.