Image Credits: Axelle/Bauer-Griffin/FilmMagic / Getty Images
Elon Musk concurs with other AI experts that there’s little real-world data left to train AI models on.
“We’ve now exhausted basically the cumulative sum of human knowledge … in AI training,” Musk said during a livestreamed conversation with Stagwell chairman Mark Penn on X late Wednesday. “That happened basically last year.”
Musk, who owns AI company xAI, echoed themes former OpenAI chief scientist Ilya Sutskever touched on at NeurIPS, the machine learning conference, during an address in December. Sutskever, who said the AI industry had reached what he called “peak data,” predicted that a lack of training data will force a shift away from the way models are developed today.
Indeed, Musk suggested that synthetic data, meaning data generated by AI models themselves, is the path forward. “The only way to supplement [real-world data] is with synthetic data, where the AI creates [training data],” he said. “With synthetic data … [AI] will sort of grade itself and go through this process of self-learning.”
Other companies, including tech giants like Microsoft, Meta, OpenAI, and Anthropic, are already using synthetic data to train flagship AI models. Gartner estimates that 60% of the data used for AI and analytics projects in 2024 was synthetically generated.
Microsoft’s Phi-4, which was open sourced early Wednesday, was trained on synthetic data alongside real-world data. So were Google’s Gemma models. Anthropic used some synthetic data to develop one of its most performant systems, Claude 3.5 Sonnet. And Meta fine-tuned its most recent Llama series of models using AI-generated data.
Training on synthetic data has other advantages, like cost savings. AI startup Writer claims its Palmyra X 004 model, which was developed using almost exclusively synthetic sources, cost just $700,000 to develop, compared to estimates of $4.6 million for a comparably sized OpenAI model.
But there are disadvantages as well. Some research suggests that synthetic data can lead to model collapse, where a model becomes less “creative” and more biased in its outputs, eventually seriously compromising its functionality. Because models create synthetic data, if the data used to train these models has biases and limitations, their outputs will be similarly tainted.