Amazon announces Nova, a new family of multimodal AI models

Canvas can generate images in a range of styles, AWS says, and extend existing images or insert objects into scenes. Image Credits: AWS

At its re:Invent conference on Tuesday, Amazon Web Services (AWS), Amazon’s cloud computing division, announced a new family of multimodal generative AI models it calls Nova.

There are four text-generating models in total: Micro, Lite, Pro, and Premier. Micro, Lite, and Pro are available Tuesday to AWS customers, while Premier will arrive in early 2025, Amazon CEO Andy Jassy said onstage.

In addition to those, there’s an image-generation model, Nova Canvas, and a video-generating model, Nova Reel. Both also launched on AWS this morning.

“We’ve continued to work on our own frontier models,” Jassy said, “and those frontier models have made a tremendous amount of progress over the last four to five months. And we figured, if we were finding value out of them, you would probably find value out of them.”

Micro, Lite, Pro, and Premier

The text-generating Nova models, which are optimized for 15 languages (but primarily English), have widely varying sizes and capabilities.

Micro can only take in text and output text, but it delivers the lowest latency of the bunch, processing text and generating responses the fastest.

Lite can process image, video, and text inputs reasonably quickly. Pro offers a balanced combination of accuracy, speed, and cost for a range of tasks. And Premier is the most capable, designed for complex workloads.

Pro and Premier, like Lite, can analyze text, images, and video. All three are well-suited for tasks like digesting documents and summarizing charts, meetings, and diagrams. AWS is positioning Premier, however, as more of a “teacher” model for creating tuned custom models, rather than a model to be used on its own.

Micro has a 128,000-token context window, meaning it can process up to around 100,000 words. Lite and Pro have 300,000-token context windows, which works out to around 225,000 words, 15,000 lines of computer code, or 30 minutes of footage.
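The word counts above imply a rule of thumb of roughly 0.75 English words per token. The short Python sketch below reproduces that arithmetic. The ratio and the helper name `approx_words` are assumptions for illustration, not figures or APIs published by AWS, and real ratios vary by tokenizer and language.

```python
# Assumed rule of thumb inferred from the figures quoted above;
# this is not an AWS-published constant.
WORDS_PER_TOKEN = 0.75


def approx_words(context_tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a context window of the given size."""
    return round(context_tokens * words_per_token)


# Context window sizes as reported in the article.
context_windows = {"Micro": 128_000, "Lite": 300_000, "Pro": 300_000}

for name, tokens in context_windows.items():
    print(f"{name}: {tokens:,} tokens is roughly {approx_words(tokens):,} words")
```

At this ratio, Micro’s window comes out slightly under the rounded 100,000-word figure, while the Lite and Pro estimate matches the 225,000 words cited.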

In early 2025, certain Nova models’ context windows will expand to support over 2 million tokens, AWS says.

Jassy claims the Nova models are among the fastest in their class, and among the least expensive to run. They’re available in AWS Bedrock, Amazon’s AI development platform, where they can be fine-tuned on text, images, and video and distilled for improved speed and higher efficiency.

“We’ve optimized these models to work with proprietary systems and APIs, so that you can do multiple orchestrated automatic steps (agent behavior) much more easily with these models,” Jassy added. “So I think these are very compelling.”

Canvas and Reel

Canvas and Reel are AWS’ strongest play yet for generative media.

Canvas lets users generate and edit images using prompts (e.g., to remove backgrounds) and provides controls for the generated images’ color schemes and layouts. Reel, the more ambitious of the two models, creates videos up to six seconds in length from prompts or, optionally, reference images. Using Reel, users can adjust the camera motion to generate videos with pans, 360-degree rotations, and zooms.

Reel is currently limited to six-second videos (which take about three minutes to generate), but a version that can create two-minute-long videos is “coming soon,” according to AWS.

Jassy stressed that both Canvas and Reel have “built-in” controls for responsible use, including watermarking and content moderation. “[We’re trying] to limit the generation of harmful content,” he said.

AWS expanded on the safeguards in a blog post, saying that Nova “extend [its] safety measures to combat the spread of misinformation, child sexual abuse material, and chemical, biological, radiological, or nuclear risks.” It’s not clear what this means in practice, however, or what forms those measures take.

AWS also continues to remain vague about which data, exactly, it uses to train all its generative models. The company previously told TechCrunch only that it’s a combination of proprietary and licensed data.

Few vendors willingly reveal such information. They see training data as a competitive advantage and thus keep it, and information pertaining to it, a closely guarded secret. Training data details are also a potential source of IP-related lawsuits, another disincentive to reveal much.

In lieu of transparency, AWS offers an indemnification policy that covers customers in the event one of its models regurgitates (i.e., spits out a mirror copy of) a potentially copyrighted still.

So, what’s next for Nova? Jassy says that AWS is working on a speech-to-speech model (a model that’ll take speech in and output a transformed version of it) for Q1 2025, and an “any-to-any” model for around mid-2025.

The speech-to-speech model will also be able to interpret verbal and nonverbal cues, like tone and cadence, and deliver natural, “human-like” voices, Amazon says. As for the any-to-any model, it’ll theoretically power applications from translators to content editors to AI assistants.

That’s assuming it doesn’t suffer any setbacks, of course.

“You’ll be able to input text, speech, images, or video and output text, speech, images, or video,” Jassy said of the any-to-any model. “This is the future of how frontier models are going to be built and consumed.”
