AI training data has a hefty price tag, one best suited for deep-pocketed tech firms. This is why Harvard University plans to release a dataset that includes roughly 1 million public-domain books, spanning genres, languages, and authors including Dickens, Dante, and Shakespeare, which are no longer copyright-protected due to their age.

The new dataset isn’t available yet, and it’s not clear when or how it will be released. However, it contains books derived from Google’s longstanding book-scanning project, Google Books, and thus Google will be involved in releasing “this treasure trove far and wide.”

Harvard first teased the Institutional Data Initiative (IDI) back in March, outlining its plans to create a “trusted conduit for sound data for AI.” However, not much has been heard from it until its formal launch today, which comes with confirmation that the IDI has financial backing from Microsoft and OpenAI.

The IDI’s executive director Greg Leppert says the dataset is designed to “level the playing field” by opening up such a huge dataset to anyone, from research labs to AI startups, that wants to train their large language models (LLMs).
