Today,
Amazon Web Services, Inc. (AWS), an Amazon.com company,
announced the general availability of G4 instances, a new GPU-powered
Amazon Elastic Compute Cloud (Amazon EC2) instance designed to help
accelerate machine learning inference and graphics-intensive workloads,
both of which are computationally demanding tasks that benefit from
additional GPU acceleration. G4 instances provide the industry's most
cost-effective machine learning inference for applications such as adding
metadata to an image, object detection, recommender systems, automated
speech recognition, and language translation. G4 instances also provide a
very cost-effective platform for building and running
graphics-intensive applications, such as remote graphics workstations,
video transcoding, photo-realistic design, and game streaming in the
cloud. To get started with G4 instances, visit https://aws.amazon.com/ec2/instance-types/g4.
Machine
learning involves two processes that require compute: training and
inference. Training entails using labeled data to create a model that is
capable of making predictions, a compute-intensive task that requires
powerful processors and high-speed networking. Inference is the process
of using a trained machine learning model to make predictions, which
typically requires processing a lot of small compute jobs
simultaneously, a task that can be most cost-effectively handled by
accelerating computing with energy-efficient NVIDIA GPUs. With the
launch of P3 instances in 2017, AWS was the first to introduce instances
optimized for machine learning training in the cloud with powerful
NVIDIA V100 Tensor Core GPUs, allowing customers to reduce machine
learning training time from days to hours. However, inference accounts
for the vast majority of machine learning's cost.
According to customers, machine learning inference can represent up to
90% of overall operational costs for running machine learning workloads.
New
G4 instances feature the latest generation NVIDIA T4 GPUs, custom 2nd
Generation Intel Xeon Scalable (Cascade Lake) processors, up to 100 Gbps
of networking throughput, and up to 1.8 TB of local NVMe storage, to
deliver the most cost-effective GPU instances for machine learning
inference. And with up to 65 TFLOPS of mixed-precision performance, G4
instances not only deliver superior price/performance for inference, but
also can be used cost-effectively for small-scale and entry-level
machine learning training jobs that are less sensitive to time-to-train.
G4 instances also provide an ideal compute engine for
graphics-intensive workloads, offering up to a 1.8x increase in graphics
performance and up to 2x video transcoding capability over the previous
generation G3 instances. These performance enhancements enable
customers to use remote workstations in the cloud for running
graphics-intensive applications like Autodesk Maya or 3D Studio Max, as
well as efficiently create photo-realistic and high-resolution 3D
content for movies and games.
"We
focus on solving the toughest challenges that hold our customers back
from taking advantage of compute-intensive applications," said Matt
Garman, Vice President, Compute Services, AWS. "AWS offers the most
comprehensive portfolio to build, train, and deploy machine learning
models powered by Amazon EC2's broad selection of instance types
optimized for different machine learning use cases. With new G4
instances, we're making it more affordable to put machine learning in
the hands of every developer. And with support for the latest video
decode protocols, customers running graphics applications on G4
instances get superior graphics performance over G3 instances at the
same cost."
Customers
with machine learning workloads can launch G4 instances using Amazon
SageMaker or AWS Deep Learning AMIs, which include machine learning
frameworks such as TensorFlow, TensorRT, MXNet, PyTorch, Caffe2, CNTK,
and Chainer. G4 instances will also support Amazon Elastic Inference in
the coming weeks, which will allow developers to dramatically reduce the
cost of inference by up to 75% by provisioning just the right amount of
GPU performance. Customers with graphics and streaming applications can
launch G4 instances using Windows, Linux, or AWS Marketplace AMIs from
NVIDIA with NVIDIA Quadro Virtual Workstation software preinstalled. A
bare metal version will be available in the coming months. G4 instances
are available in the US East (N. Virginia, Ohio), US West (Oregon, N.
California), Europe (Frankfurt, Ireland, London), and Asia Pacific
(Seoul and Tokyo) Regions, with availability in additional regions
planned in the coming months. G4 instances are available for purchase
as On-Demand Instances, Reserved Instances, or Spot Instances.
Clarifai
is a leading artificial intelligence company that excels in visual
recognition to solve real-world challenges. "We apply machine learning
to image and video recognition, helping customers better understand
their media assets and apply it across a broad set of applications, such
as providing personalized online shopping experience or measuring
in-store shopper behaviors," said Robert Wen, Head of Engineering at
Clarifai. "We provide our customers with a full-featured API that allows
them to utilize our pre-trained machine learning models and make
predictions on their data. G4 instances offer a highly cost-effective
solution that will enable us to make it more economical for our
customers to use AI across a broader set of use cases."
Electronic
Arts (EA) is a global leader in digital interactive entertainment,
delivering games, content, and online services to hundreds of millions
of players around the world through Internet-connected consoles, mobile
devices, and personal computers. "Leveraging the power of the cloud with
providers such as Amazon Web Services has revolutionized how we create
games and how players experience them," said Erik Zigman, EA's Vice
President of Cloud, Social, Marketplace, and Cloud Gaming Engineering.
"Working with AWS's G4 instance has enabled us to build cost-effective
and powerful services that are optimized for bringing online gaming to a
wide range of devices."
GumGum
is an artificial intelligence company with deep expertise in computer
vision. "We use our proprietary computer vision technology to identify
content relevant to marketers to deliver highly visible advertising
campaigns and rich insights to brands and agencies," said Brian Fuller,
Engineering Manager at GumGum. "GumGum scans millions of images and
videos each day across the web, social media, and broadcast television
using AI. The new Amazon EC2 G4 instances provide us with the ideal
balance of price and performance, allowing us to optimize our content
processing pipelines, lower our costs to generate data insights, and
provide our clients the ability to precisely target audiences and
deliver contextually relevant advertising."
PureWeb's
interactive streaming technology enables users to publish, collaborate,
and interact with massive data files, including photo-real 3D
simulations and game engine projects. "Our Reality product, deployed on
AWS, is a fully managed, secure, and scalable service that provides
on-demand access to 3D photorealistic renderings built using Unity or
Unreal Engine," said Barry Allen, CEO, PureWeb. "With their low cost and
latest NVIDIA T4 GPUs, AWS G4 instances are perfect for our
graphics-intensive workloads, as they provide the right balance of
performance and cost, allowing us to stream at scale to anyone on any
device."