
Turns out, telling an AI chatbot to be concise could make it hallucinate more than it otherwise would have.

That's according to a new study from Giskard, a Paris-based AI testing company developing a holistic benchmark for AI models. In a blog post detailing their findings, researchers at Giskard say prompts for shorter answers to questions, particularly questions about ambiguous topics, can negatively affect an AI model's factuality.

"Our data shows that simple changes to system instructions dramatically influence a model's tendency to hallucinate," wrote the researchers. "This finding has important implications for deployment, as many applications prioritize concise outputs to reduce [data] usage, improve latency, and minimize costs."

Hallucinations are an intractable problem in AI. Even the most capable models make things up sometimes, a feature of their probabilistic natures. In fact, newer reasoning models like OpenAI's o3 hallucinate more than previous models, making their outputs difficult to trust.

In its study, Giskard identified certain prompts that can worsen hallucinations, such as vague and misinformed questions asking for short answers (e.g. "Briefly tell me why Japan won WWII"). Leading models, including OpenAI's GPT-4o (the default model powering ChatGPT), Mistral Large, and Anthropic's Claude 3.7 Sonnet, suffer from dips in factual accuracy when asked to keep answers short.

Why? Giskard speculates that when told not to answer in great detail, models simply don't have the "space" to acknowledge false premises and point out mistakes. Strong rebuttals require longer explanations, in other words.

"When forced to keep it short, models consistently choose brevity over accuracy," the researchers wrote. "Perhaps most importantly for developers, seemingly innocent system prompts like 'be concise' can sabotage a model's ability to debunk misinformation."
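For developers wondering where such an instruction actually lives, here is a minimal sketch of the kind of system-prompt comparison the researchers describe, written against the OpenAI Python SDK. The model name, prompts, and question below are illustrative assumptions, not code or data from Giskard's study.

```python
# Illustrative sketch only (not Giskard's benchmark code): send the same
# false-premise question twice, changing only the system instruction.
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY is set in the environment

QUESTION = "Tell me why Japan won WWII"  # hypothetical misinformed prompt

SYSTEM_PROMPTS = [
    "Be concise.",  # the kind of brevity instruction the study flags
    "Answer carefully. If the question rests on a false premise, say so and explain why.",
]

for system_prompt in SYSTEM_PROMPTS:
    response = client.chat.completions.create(
        model="gpt-4o",  # one of the models named in the study
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": QUESTION},
        ],
    )
    print(f"--- system prompt: {system_prompt!r}")
    print(response.choices[0].message.content, "\n")
```

The second, longer instruction simply gives the model room to push back on the premise, which is the researchers' point: brevity constraints crowd out the space a rebuttal needs.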


Giskard's study contains other curious revelations, like that models are less likely to debunk controversial claims when users present them confidently, and that models users say they prefer aren't always the most truthful. Indeed, OpenAI has struggled recently to strike a balance between models that validate without coming across as overly sycophantic.

"Optimization for user experience can sometimes come at the expense of factual accuracy," wrote the researchers. "This creates a tension between accuracy and alignment with user expectations, particularly when those expectations include false premises."