Image Credits: stockcam / Getty Images


Reddit announced on Tuesday that it's updating its Robots Exclusion Protocol (robots.txt file), which tells automated web bots whether they are permitted to crawl a site.

Historically, the robots.txt file was used to allow search engines to crawl a site and then direct people to the content. However, with the rise of AI, websites are being scraped and used to train models without acknowledging the actual source of the content.
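To illustrate how a robots.txt file expresses these permissions, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The rules and the `ExampleAIBot` user agent are hypothetical, not Reddit's actual file; the point is simply that a site can allow general crawlers while disallowing a specific bot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
rules = [
    "User-agent: ExampleAIBot",  # hypothetical AI training crawler
    "Disallow: /",
    "",
    "User-agent: *",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# The blocked bot may not fetch any page; other agents may.
print(parser.can_fetch("ExampleAIBot", "https://example.com/r/news"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/r/news"))     # True
```

As the article notes, these rules are advisory: a crawler has to choose to consult and honor them, which is why Reddit pairs the file with rate limiting and blocking.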

Along with the updated robots.txt file, Reddit will continue rate-limiting and blocking unknown bots and crawlers from accessing its platform. The company told TechCrunch that bots and crawlers will be rate-limited or blocked if they don't abide by Reddit's Public Content Policy and don't have an agreement with the platform.

Reddit says the update shouldn't affect the majority of users or good-faith actors, like researchers and organizations such as the Internet Archive. Instead, the update is designed to deter AI companies from training their large language models on Reddit content. Of course, AI crawlers could simply ignore Reddit's robots.txt file.

The announcement comes a few days after a Wired investigation found that AI-powered search startup Perplexity has been stealing and scraping content. Wired found that Perplexity seems to ignore requests not to scrape its website, even though it blocked the startup in its robots.txt file. Perplexity CEO Aravind Srinivas responded to the claims and said that the robots.txt file is not a legal framework.

Reddit's upcoming changes won't affect companies that it has an agreement with. For instance, Reddit has a $60 million deal with Google that allows the search giant to train its AI models on the social platform's content. With these changes, Reddit is signaling to other companies that want to use Reddit's data for AI training that they will have to pay.

"Anyone accessing Reddit content must abide by our policies, including those in place to protect redditors," Reddit said in its blog post. "We are selective about who we work with and trust with large-scale access to Reddit content."


The announcement doesn't come as a surprise, as Reddit released a new policy a few weeks ago that was designed to guide how Reddit's data is accessed and used by commercial entities and other partners.