AWS joins the generative AI arms race. Here’s what it’s providing

AWS announced several innovations last week that will make it easy and practical for its customers to use generative AI in their businesses.
17 April 2023

AWS joins the generative AI arms race. Here’s what it’s providing (Photo by Noah Berger / GETTY IMAGES NORTH AMERICA / Getty Images via AFP)

In his annual letter to shareholders released last Thursday, Amazon CEO Andy Jassy shared his vision for the tech giant’s cloud arm, Amazon Web Services (AWS), and used it to mark the company’s entrance into the world of generative AI. Jassy was, of course, making the case that Amazon already has a longstanding beachhead in this realm through its investments in machine learning (ML).

“We have been working on our LLMs for a while now,” Jassy wrote toward the end of the letter, adding that Amazon believes these emerging forms of AI will “transform and improve virtually every customer experience.” The company “will continue to invest substantially in these models across all of our consumer, seller, brand, and creator experiences,” he added.

Jassy also said that AWS’s longstanding work in machine learning, including its specialized chips, will help AWS customers affordably train and build their own large language models (LLMs). Following up on that entry into the generative AI space, Swami Sivasubramanian, VP of database, analytics, and machine learning services at AWS, unveiled four innovations across the AWS ML portfolio to make generative AI more accessible to customers.

“At AWS, we have played a key role in democratizing ML and making it accessible to anyone who wants to use it, including more than 100,000 customers of all sizes and industries,” Swami said in a blog post last week. With that in mind, Swami said AWS would take the same democratizing approach to generative AI.

“We work to take these technologies out of the realm of research and experiments and extend their availability far beyond a handful of startups and large, well-funded tech companies. That’s why today I’m excited to announce several innovations that will make it easy and practical for our customers to use generative AI in their businesses,” he added.

Generative AI can create new content and ideas, including conversations, stories, images, videos, and music. Like all AI, generative AI is powered by ML models—huge models pre-trained on vast amounts of data and commonly referred to as Foundation Models (FMs).

AWS makes it easy to build and scale generative AI applications with FMs

The first new tool AWS unveiled was Amazon Bedrock, a new service that makes Foundation Models (FMs) from AI21 Labs, Anthropic, Stability AI, and Amazon accessible via an API. Swami said Bedrock was the easiest way for businesses to build and scale generative AI-based applications using FMs, democratizing access for all builders. 

“Bedrock will offer the ability to access a range of powerful FMs for text and images—including Amazon’s Titan FMs, which consist of two new large language models (LLMs) announced with Amazon Bedrock,” he noted.

Titan Text is a generative LLM for tasks such as summarization, text generation (for example, creating a blog post), classification, open-ended Q&A, and information extraction. Titan Embeddings translates text inputs (words, phrases, or possibly large units of text) into numerical representations (known as embeddings) that capture the text’s semantic meaning.
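To make the idea of embeddings more concrete, here is a minimal sketch of how numerical representations can drive search. It does not call Titan Embeddings or any AWS API: the three-dimensional vectors, the `DOCUMENTS` and `QUERY` names, and the `rank` helper are all made up for illustration, on the assumption that texts with similar meaning end up with vectors pointing in similar directions.

```python
import math

# Hypothetical, hand-written 3-dimensional embeddings (illustration only).
# A real embeddings model would return much longer, learned vectors.
DOCUMENTS = {
    "How to reset your password": [0.9, 0.1, 0.0],
    "Recovering a locked account": [0.7, 0.5, 0.1],
    "Shipping times for international orders": [0.0, 0.2, 0.9],
}

# Assumed embedding of a query such as "I forgot my login".
QUERY = [0.85, 0.2, 0.05]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vector, documents):
    """Return document titles ordered from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), title)
        for title, vector in documents.items()
    ]
    return [title for _, title in sorted(scored, reverse=True)]


RANKED = rank(QUERY, DOCUMENTS)
for title in RANKED:
    print(title)
```

With these made-up vectors, the two account-related documents rank above the shipping one even though none of them shares the query's exact wording. That is the property a learned embeddings model such as Titan Embeddings is meant to provide at scale.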

Titan Embeddings can be used to improve search results and personalization.

To better understand the concept of generative AI and FMs, let’s backtrack. Recent advancements in ML, specifically the invention of the transformer-based neural network architecture, have led to the rise of models that contain billions of parameters or variables.

To give a sense of the change in scale, Swami shared that the largest pre-trained model in 2019 had 330M parameters. Now, the largest models have more than 500B parameters, a 1,600x increase in size in just a few years. Today’s FMs, such as the LLMs GPT-3.5 or BLOOM, and the text-to-image model Stable Diffusion from Stability AI, can perform a wide variety of tasks.
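As a quick sanity check on the scale figures above, the short sketch below redoes the arithmetic. The variable names are purely illustrative, and the only inputs are the 330M, 500B, and 1,600x figures quoted above.

```python
# Figures quoted in the article.
PARAMS_2019 = 330e6   # 330M parameters
PARAMS_NOW = 500e9    # 500B parameters, quoted as a lower bound
QUOTED_FACTOR = 1600  # the "1,600x" increase

# Growth factor if the newer model had exactly 500B parameters.
ratio_at_500b = PARAMS_NOW / PARAMS_2019

# Model size that would correspond to exactly a 1,600x increase.
params_for_quoted_factor = QUOTED_FACTOR * PARAMS_2019

print(f"{ratio_at_500b:,.0f}x")                    # 1,515x
print(f"{params_for_quoted_factor / 1e9:.0f}B")    # 528B
```

At exactly 500B parameters the increase works out to roughly 1,500x. The quoted 1,600x corresponds to a model of about 528B parameters, which is consistent with the "more than 500B" wording.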

Enter, the CodeWhisperer

“FMs can perform so many more tasks because they contain many parameters that make them capable of learning complex concepts. And through their pre-training exposure to internet-scale data in various forms and myriad patterns, FMs learn to apply their knowledge within various contexts,” Swami highlighted.

Customized FMs, on the other hand, can create a unique customer experience, embodying the company’s voice, style, and services across various consumer industries, which is why the potential of FMs is fascinating. However, as Swami puts it, we are still in the very early days. 

“We expect new architectures to arise, and this diversity of FMs will set off a wave of innovation. We already see new application experiences never seen before,” he added.

AWS also unveiled tools–Amazon EC2 Trn1n instances powered by AWS Trainium and Amazon EC2 Inf2 instances powered by AWS Inferentia2–enabling companies to build, customize, and use FMs more efficiently and economically.

Swami said that Trn1n instances offer double the network bandwidth of Trn1 instances and are designed to deliver 20% higher performance than Trn1 for large, network-intensive models. Trn1 instances, he said, can on their own deliver up to 50% savings on training costs compared to other EC2 instances.

According to AWS, Amazon EC2 Inf2 instances, powered by AWS Inferentia2 chips, offer the highest performance, the best energy efficiency, and the lowest cost for running generative AI inference workloads at scale on AWS. “Inf2 instances deliver up to 4x higher throughput and up to 10x lower latency compared to the prior generation Inferentia-based instances,” Swami wrote, adding that they can deliver up to 40% better inference price performance than other comparable EC2 instances.

The last tool unveiled by AWS was Amazon CodeWhisperer, a generative AI tool that helps developers write better code more quickly. AWS said it is accessible to individual developers. CodeWhisperer uses an FM under the hood to improve developer productivity by generating code suggestions in real time, based on developers’ natural-language comments and prior code. It works in their preferred Integrated Development Environment (IDE) via the AWS Toolkit IDE extensions.

CodeWhisperer has been in preview since last year. “During the preview, we ran a productivity challenge. Participants who used CodeWhisperer completed tasks 57% faster, on average, and were 27% more likely to complete them successfully than those who didn’t use CodeWhisperer,” Swami said. “This is a giant leap forward in developer productivity, and we believe this is only the beginning.”