OpenAI launches DALL-E 3 API, new text-to-speech models

OpenAI CEO Sam Altman speaks during the OpenAI DevDay event on November 06, 2023 in San Francisco, California.

Image Credits: Justin Sullivan / Getty Images

OpenAI launched a slew of new APIs during its first-ever developer day.

DALL-E 3, OpenAI’s text-to-image model, is now available via an API after first arriving in ChatGPT and Bing Chat. As with the previous version of DALL-E (i.e., DALL-E 2), the API incorporates built-in moderation to help protect against misuse, OpenAI says.

The DALL-E 3 API offers different format and quality options and resolutions ranging from 1024×1024 to 1792×1024, with prices starting at $0.04 per generated image. But it’s somewhat limited compared to the DALL-E 2 API, at least at present.

Unlike the DALL-E 2 API, the DALL-E 3 API can’t be used to create edited versions of images by having the model replace some areas of a pre-existing image, or to create variations of an existing image. And when a generation request is sent to DALL-E 3, OpenAI says that it’ll automatically rewrite the prompt “for safety reasons” and “to add more detail,” which could lead to less accurate results depending on the prompt.
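
For developers, the format, quality and resolution options described above map onto fields in the request body. The following is a minimal sketch in Python of how such a body might be assembled and checked before sending. The endpoint URL, the field names and the allowed values (including the portrait size 1024x1792 and the "standard" and "hd" quality levels) are assumptions drawn from OpenAI's public API reference rather than from this article, and they may have changed. The helper name build_image_request is hypothetical.

```python
import json

# Assumed values from OpenAI's public API reference; not stated in the
# article and subject to change.
IMAGES_ENDPOINT = "https://api.openai.com/v1/images/generations"
ALLOWED_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
ALLOWED_QUALITIES = {"standard", "hd"}
ALLOWED_FORMATS = {"url", "b64_json"}


def build_image_request(prompt, size="1024x1024", quality="standard",
                        response_format="url"):
    """Return a JSON-serializable body for a DALL-E 3 generation request.

    Hypothetical helper: it only validates and assembles the payload and
    does not contact the API.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response format: {response_format}")
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "n": 1,  # one image per request in this sketch
        "size": size,
        "quality": quality,
        "response_format": response_format,
    }


if __name__ == "__main__":
    body = build_image_request("a lighthouse at dusk, watercolor",
                               size="1792x1024")
    # The body would be POSTed to IMAGES_ENDPOINT with an
    # "Authorization: Bearer <API key>" header.
    print(json.dumps(body, indent=2))
```

Because DALL-E 3 rewrites prompts automatically, an application that cares about fidelity may want to log the prompt it sent alongside whatever the API returns, so that mismatches can be traced later.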

Elsewhere, OpenAI’s now providing a text-to-speech API, Audio API, that offers six preset voices (Alloy, Echo, Fable, Onyx, Nova and Shimmer) to choose from and two generative AI model variants. It’s live starting today, with pricing starting at $0.015 per 1,000 input characters.
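
To make the voice choices and per-character pricing concrete, here is a small Python sketch that assembles a text-to-speech request body and estimates its cost at the starting price quoted above. The lowercase voice identifiers, the model names "tts-1" and "tts-1-hd" and the field names are assumptions based on OpenAI's public API reference, not on this article. Both function names are hypothetical, and the estimate ignores any higher price tier.

```python
from decimal import Decimal

# Voice names come from the article; the lowercase identifiers and the
# model names are assumptions from OpenAI's public API reference.
VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")
TTS_MODELS = ("tts-1", "tts-1-hd")

# Starting price quoted in the article: $0.015 per 1,000 input characters.
PRICE_PER_1K_CHARS = Decimal("0.015")


def build_speech_request(text, voice="alloy", model="tts-1"):
    """Return a JSON-serializable body for a text-to-speech request.

    Hypothetical helper: it validates inputs and does not call the API.
    """
    if not text:
        raise ValueError("text must not be empty")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice}")
    if model not in TTS_MODELS:
        raise ValueError(f"unknown model: {model}")
    return {"model": model, "input": text, "voice": voice}


def estimate_cost_usd(text, price_per_1k=PRICE_PER_1K_CHARS):
    """Estimate the cost in US dollars from the input character count."""
    return Decimal(len(text)) / Decimal(1000) * price_per_1k


if __name__ == "__main__":
    script = "Welcome back. Today's lesson covers ordering food in Spanish."
    print(build_speech_request(script, voice="nova"))
    print(f"Estimated cost: ${estimate_cost_usd(script)}")
```

At the starting rate, a 2,000-character script works out to $0.03. Decimal is used instead of float so that small per-request amounts add up without rounding drift.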

“This is much more natural than anything else we’ve heard out there, which can make apps more natural to interact with and more accessible,” OpenAI CEO Sam Altman said onstage. “It also unlocks a lot of use cases like language learning and voice assistance.”

Unlike some speech synthesis platforms and tools, OpenAI doesn’t provide a way to control the emotional affect of the generated audio. In the documentation for the Audio API, the company notes that “certain factors” may influence how generated voices sound, like capitalization or grammar in the text that’s being read aloud, but that OpenAI’s internal tests with this have yielded “mixed results.”

OpenAI’s requiring that developers who use the API inform users that the audio’s being generated by AI.

In a related announcement, OpenAI launched the next version of its open source automatic speech recognition model, Whisper large-v3, which the company claims has improved performance across languages. It’s on GitHub, available under a permissive license.
