Reddit says it’s made $203M so far licensing its data

Reddit’s prospects as it barrels toward a stock market listing have a lot more to do with its relationships with AI vendors such as OpenAI than one might assume.

“In January 2024, we entered into certain data licensing arrangements with an aggregate contract value of $203.0 million and terms ranging from two to three years,” Reddit’s prospectus reads. “We expect a minimum of $66.4 million of revenue to be recognized during the year ending December 31, 2024 and the remaining thereafter.”

For now, it’s a mystery which AI vendors are licensing data from Reddit. Earlier this week, Bloomberg and Reuters reported that a “large unnamed AI company” (possibly Google) had entered into a licensing agreement worth about $60 million on an annualized basis. But OpenAI wouldn’t be a surprising customer either, particularly considering that OpenAI CEO Sam Altman has an 8.7% stake in Reddit (making him the third-largest shareholder) and was once a member of the company’s board of directors.

Why’s Reddit data valuable? As Reddit explains, AI models “learn” from examples to craft essays, code, emails, articles and more, and vendors like OpenAI scrape the web for millions to trillions of these examples to add to their training sets. Some of these examples are in the public domain. Others aren’t, or, in the case of Reddit content, come under restrictive licenses that require citation or specific forms of compensation.

Reddit previously didn’t gate access to its data for AI training purposes. But it reversed course last year, arguing that its data shouldn’t be, in CEO Steve Huffman’s words, “[given] to some of the largest companies in the world for free.”

“[Our] data APIs are able to provide real-time access to evolving and dynamic topics such as sports, movies, news, fashion, and the latest trends,” the prospectus continues. “We believe that Reddit’s massive corpus of conversational data and knowledge will continue to play a role in training and improving large language models. As our content refreshes and grows daily, we expect models will want to reflect these new ideas and update their training using Reddit data.”

Content producers, from stock media libraries to news publishers, are increasingly turning to data licensing agreements with AI vendors as chatbots like OpenAI’s ChatGPT and Google’s Gemini threaten to sap traffic. A recent model from The Atlantic found that, if a search engine like Google were to integrate AI into search, it’d answer a user’s query 75% of the time without requiring a click-through to its website.

Vendors, in turn, have been spurred to pursue licensing agreements as they face a flood of lawsuits alleging that they have no legal justification for training their models on data without permission or payment. Recently, The New York Times accused OpenAI of effectively building news publishing competitors using its works, harming its business.

OpenAI, for one, has agreements in place with image gallery Shutterstock as well as publishers including Axel Springer, the owner of Politico and Business Insider. The licenses are reported to be quite small, however, topping out at $5 million per year.
