Introducing Amazon Nova: A New Generation of Foundation Models

December 03, 2024 at 13:23 PM EST

New state-of-the-art foundation models from Amazon deliver frontier intelligence and industry-leading price performance

Amazon Nova models expand the growing selection of the broadest and most capable foundation models in Amazon Bedrock for enterprise customers

Today, at AWS re:Invent, Amazon.com Inc (NASDAQ: AMZN) introduced Amazon Nova, a new generation of foundation models (FMs) that have state-of-the-art intelligence across a wide range of tasks, and industry-leading price performance. Amazon Nova models will be available in Amazon Bedrock, and include: Amazon Nova Micro (a very fast, text-to-text model); and Amazon Nova Lite, Amazon Nova Pro, and Amazon Nova Premier (multi-modal models that can process text, images, and videos to generate text). Amazon also launched two additional models – Amazon Nova Canvas (which generates studio-quality images) and Amazon Nova Reel (which generates studio-quality videos).

“Inside Amazon, we have about 1,000 generative AI applications in motion, and we’ve had a bird’s-eye view of what application builders are still grappling with,” said Rohit Prasad, SVP of Amazon Artificial General Intelligence. “Our new Amazon Nova models are intended to help with these challenges for internal and external builders, and provide compelling intelligence and content generation while also delivering meaningful progress on latency, cost-effectiveness, customization, Retrieval Augmented Generation (RAG), and agentic capabilities.”

Amazon Nova understanding models demonstrate exceptional intelligence, capabilities, and speed

Amazon Nova includes four state-of-the-art models. The first, Amazon Nova Micro, is a text-only model that delivers the lowest latency responses at very low cost. The next three are: Amazon Nova Lite, a very low-cost multimodal model that is lightning fast for processing image, video, and text inputs; Amazon Nova Pro, a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks; and Amazon Nova Premier, the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models. Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro are generally available today; Amazon Nova Premier will be available in the Q1 2025 timeframe.

We tested the Amazon Nova models against a wide range of industry standard benchmarks. Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro perform quite competitively against the best models in their respective categories.

Amazon Nova Micro was found to be equal¹ or better than both Meta LLaMa 3.1 8B on all 11 applicable benchmarks, and Google Gemini 1.5 Flash-8B on all 12 applicable benchmarks. With Amazon Nova Micro’s industry-leading speed of 210 output tokens per second, it is ideal for applications that require fast responses.

Amazon Nova Lite is also highly competitive with other models in the same intelligence class; it performed equal or better on 17 of 19 benchmarks compared to OpenAI’s GPT-4o mini, equal or better on 17 of 21 benchmarks compared to Google’s Gemini 1.5 Flash-8B, and equal or better on 10 of 12 benchmarks compared to Anthropic’s Claude Haiku 3.5. In addition to delivering accuracy on text benchmarks, Amazon Nova Lite stands out on understanding videos, charts, and documents as measured by benchmarks such as VATEX, ChartQA, and DocVQA. Amazon Nova Lite also excels at agentic workflows, such as function calling measured by the Berkeley Function Calling Leaderboard and on the core capabilities of understanding visual elements for taking actions on browsers and computer screens, as measured by VisualWebBench (web browser action grounding benchmark) and Mind2Web (generalist multimodal agents benchmark).

Amazon Nova Pro performed equal or better on 17 of 20 benchmarks compared to OpenAI’s GPT-4o, equal or better on 16 of 21 benchmarks compared to Google’s Gemini 1.5 Pro, and equal or better on 9 of 20 benchmarks compared to Anthropic Claude Sonnet 3.5v2. In addition to accuracy on text and visual intelligence benchmarks, Amazon Nova Pro excels at instruction-following and multimodal agentic workflows as measured by the Comprehensive RAG Benchmark (CRAG), the Berkeley Function Calling Leaderboard, and Mind2Web.

Multi-lingual and Multimodal Support with Long Context

Amazon Nova Micro, Lite, and Pro support over 200 languages. Amazon Nova Micro supports context length of 128K input tokens, whereas Amazon Nova Lite and Amazon Nova Pro support context length of 300K tokens, or 30 minutes of video processing. In early 2025, Amazon will support context length of over 2M input tokens.

Fast and cost-effective

All Amazon Nova models are fast, cost-effective and have been designed to be easy to use with a customer’s systems and data. Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro are at least 75 percent less expensive than the best performing models in their respective intelligence classes in Amazon Bedrock. They are also the fastest models in their respective intelligence classes in Amazon Bedrock.

Seamless integration with Amazon Bedrock

All Amazon Nova models are integrated with Amazon Bedrock, a fully managed service that makes high-performing FMs from leading AI companies and Amazon available for use through a single API. Using Amazon Bedrock, customers can easily experiment with and evaluate Amazon Nova models, as well as other FMs, to determine the best model for an application.

Support for fine-tuning to boost accuracy

The models also support custom fine-tuning, which allows customers to point the models to examples in their own proprietary data that have been labeled to boost accuracy. The Amazon Nova model learns what matters most to the customer from their own data (including text, images, and videos), and then Amazon Bedrock trains a private fine-tuned model that will provide tailored responses.

Distillation to train smaller, more efficient models

In addition to supporting fine-tuning, the models also support distillation, which enables the transfer of specific knowledge from a larger, highly-capable “teacher model” to a smaller, more efficient model that is highly accurate, but also faster and cheaper to run.

RAG to ground responses in data

Amazon Nova models are integrated with Amazon Bedrock Knowledge Bases and excel at Retrieval Augmented Generation (RAG), which enables customers to ensure the best accuracy, by grounding responses in an organization’s own data.

Optimized for agentic applications

Amazon Nova models have been optimized to make them easy to use and effective in agentic applications that require interacting with an organization’s proprietary systems and data through multiple APIs to execute multistep tasks.

Access to production-grade visual content with Nova creative content generation models

Amazon Nova Canvas is a state-of-the-art image generation model that creates professional grade images from text or images provided in prompts. Amazon Nova Canvas also provides features that make it easy to edit images using text inputs, and provides controls for adjusting color scheme and layout. The model comes with built-in controls to support safe and responsible AI use. These include features like watermarking, which allows the source of an image to always be traced, and content moderation, which limits the generation of potentially harmful content. Amazon Nova Canvas performs better than image generators such as OpenAI DALL-E 3 and Stable Diffusion in side-by-side human evaluations conducted by a third party, and on key automated metrics.

Amazon Nova Reel is a state-of-the-art video generation model that allows customers to easily create high-quality video from text and images. It is ideal for content creation in advertising, marketing, or training. Customers can use natural language prompts to control visual style and pacing, including camera motion, rotation, and zooming. It outperforms comparable models in quality and consistency, according to side-by-side human evaluations conducted by a third party that preferred Amazon Nova Reel-generated videos over those generated by Runway’s Gen-3 Alpha. Like Amazon Nova Canvas, Amazon Nova Reel comes with built-in controls to support safety and responsible AI use, including watermarking and content moderation. Amazon Nova Reel currently generates six-second videos, and will support the generation of videos of up to two-minutes in length in the coming months.

What’s next: Speech-to-Speech and Multimodal-to-Multimodal models

Amazon will introduce an Amazon Nova speech-to-speech model in the first quarter of 2025. The model is designed to transform conversational AI applications by understanding streaming speech input in natural language, interpreting verbal and non-verbal cues (like tone and cadence), and delivering natural human-like, back-and-forth interactions with low latency.

Amazon is also developing a novel model that can take text, images, audio, and video as input, and generate outputs in any of these modalities. This Amazon Nova model with native multimodal-to-multimodal – or “any-to-any” modality capabilities – will be introduced mid-year 2025. It will simplify the development of applications where the same model can be used to perform a wide variety of tasks, such as translating content from one modality to another, editing content, and powering AI agents that can understand and generate all modalities.

AWS partners and customers are already taking advantage of the capabilities and price-performance of Amazon Nova models

SAP, a strategic partner of AWS, is integrating Amazon Nova models into its SAP AI Core generative AI hub’s family of supported LLMs. This enables developers to create new skills for Joule, SAP’s AI copilot, and securely build AI-driven solutions that harness the full business context captured in SAP data, enabling automation, personalization, and advanced solutions like supply chain planning.

Deloitte, a strategic partner of AWS, is committed to delivering best-in-class generative AI services to global businesses across every sector. Deloitte knows that AI solutions and foundation models are not one-size-fits-all and believes the advanced customization capabilities and enhanced security of Amazon Nova models will drive innovation that delivers exceptional value to their clients around the world.

Dentsu Digital Inc., a digital marketing company, is integrating Amazon Nova Reel into its creative process, enabling its team to improve and accelerate the development of its campaigns – from briefing, to concept development, to creative video content generation. Amazon Nova Reel reduces the overall time it takes to generate new assets from weeks to days.

Musixmatch is the world's largest lyrics platform with over 80 million users and a database of more than 11 million unique lyrics. Musixmatch is including Amazon Nova Reel in Musixmatch Pro, which helps creators distribute lyrics across all the major digital streaming services and social networks. Emerging artists can use Amazon Nova Reel to produce high-quality music videos using their songs’ context as inputs, and customize them with natural language prompts.

123RF, a stock photography and video portal with a library of over 200 million images and videos, is using Amazon Nova Canvas and Amazon Nova Reel to simplify the design process with smarter, faster, and easier-to-use tools for creators producing visual media. Amazon Nova’s leading price-performance, speed, cross-language reasoning, and content moderation at scale helps deliver these new capabilities to customers and creators around the world.

Caylent, a next-generation cloud services company, is using Amazon Nova models to bring video understanding capabilities to customers across media, sports, and retail. Previously, Caylent would piece together combinations of different techniques and models to provide video understanding for customers across these industries. Now, Amazon Nova delivers industry-leading results for a fraction of the cost, while reducing the time it takes to go from prototype to production, and eliminating complexities like image tiling, sampling, and semantic hashing.

Palantir Technologies builds software that enables AI-driven decision-making in many of the most critical contexts in the world. Amazon Nova Pro’s advanced reasoning capabilities will integrate with the Ontology System within Palantir’s AI Platform (AIP) to drive new operational efficiencies and decision-making workflows across 40+ industries. For example, this integration will empower insurance agents that process complex policy requests, and supply chain agents that orchestrate end-to-end reallocation processes.

Shutterstock is a leading creative platform offering full-service solutions, high-quality content, and tools for transformative brands, digital media, and marketing businesses. Based on the high image quality outputs of Amazon Nova Canvas, the team at Shutterstock is excited to include the model in the Shutterstock AI Image Generator, giving users an intuitive, easy-to-use offering.

Amazon is committed to the responsible development of artificial intelligence

Amazon Nova models are built with integrated safety measures and protections. The company has launched AWS AI Service Cards for Amazon Nova, offering transparent information on use cases, limitations, and responsible AI practices.

To get started with Amazon Nova models, visit: https://aws.amazon.com/nova/

To learn more, visit:

About Amazon for details on today’s announcements.
The AWS News Blog for details on today’s announcements.
The Amazon Bedrock page to learn more about the capabilities.
The AWS re:Invent page for more details on everything happening at AWS re:Invent.

^{__________________

1} When two models overlap in their 95% confidence intervals of measured accuracy, they are considered “equal.”

View source version on businesswire.com: https://www.businesswire.com/news/home/20241203010874/en/

Contacts

Amazon.com, Inc.

Media Hotline

Amazon-pr@amazon.com

www.amazon.com/pr

Search Hotels in Menlo Park

Find A Business

or Browse Listings

Introducing Amazon Nova: A New Generation of Foundation Models

Contacts