Reddit announced on Tuesday that it's updating its Robots Exclusion Protocol (robots.txt file), which tells automated web bots whether they are permitted to crawl a site.
Historically, the robots.txt file was used to allow search engines to crawl a site and then direct people to the content. However, with the rise of AI, websites are being scraped and used to train models without acknowledging the actual source of the content.
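For readers unfamiliar with the mechanism, a well-behaved crawler fetches a site's robots.txt and checks it before requesting any page. A minimal sketch using Python's standard `urllib.robotparser` is below; the rules, bot name, and URL are illustrative assumptions, not Reddit's actual values.

```python
# Sketch of how a compliant crawler consults robots.txt before fetching.
# The rules, user agent, and URL here are hypothetical examples.
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that disallows all crawling for every bot.
rules = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A compliant bot calls can_fetch() and skips any disallowed URL.
allowed = parser.can_fetch("ExampleBot", "https://example.com/some/page")
print(allowed)  # False: the rules above block every user agent
```

The key point the article turns on is that this check is voluntary: nothing in the protocol forces a crawler to call `can_fetch()` at all, which is why Reddit pairs the robots.txt change with rate limiting and blocking.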
Along with the updated robots.txt file, Reddit will continue rate-limiting and blocking unknown bots and crawlers from accessing its platform. The company told TechCrunch that bots and crawlers will be rate-limited or blocked if they don't abide by Reddit's Public Content Policy and don't have an agreement with the platform.
Reddit says the update shouldn't affect the majority of users or good-faith actors, like researchers and organizations such as the Internet Archive. Instead, the update is designed to deter AI companies from training their large language models on Reddit content. Of course, AI crawlers could simply ignore Reddit's robots.txt file.
The announcement comes a few days after a Wired investigation found that AI-powered search startup Perplexity has been stealing and scraping content. Wired found that Perplexity seemed to ignore requests not to scrape its website, even though Wired had blocked the startup in its robots.txt file. Perplexity CEO Aravind Srinivas responded to the claims and said that the robots.txt file is not a legal framework.
Reddit's upcoming changes won't affect companies that it has an agreement with. For instance, Reddit has a $60 million deal with Google that allows the search giant to train its AI models on the social platform's content. With these changes, Reddit is signaling to other companies that want to use Reddit's data for AI training that they will have to pay.
"Anyone accessing Reddit content must abide by our policies, including those in place to protect redditors," Reddit said in its blog post. "We are selective about who we work with and trust with large-scale access to Reddit content."
The announcement doesn't come as a surprise, as Reddit released a new policy a few weeks ago that was designed to guide how Reddit's data is accessed and used by commercial entities and other partners.