Western Digital Uses Univa HPC Cloud Solutions on Amazon Web Services to Build Million Core Cluster, Improving Business Competitive Advantage
The purpose of this collaborative project was to build a cloud-scale HPC cluster on AWS to simulate key elements of upcoming designs for their next-generation hard disk drives (HDD).
Univa®, a leading innovator in enterprise-grade workload management and
optimization solutions for on-premises and hybrid cloud high-performance
computing (HPC), today announced that it has successfully demonstrated
extreme scale HPC by working with Amazon Web Services (AWS) customer,
Western Digital, a leading data infrastructure company, using Univa’s
highly scalable cluster management and scheduling solutions, Navops
Launch and Univa Grid Engine®. The purpose of this collaborative project
was to build a cloud-scale HPC cluster on AWS to simulate key elements
of upcoming designs for Western Digital's next-generation hard disk
drives (HDD).
Continuing its legacy of product innovation, Western Digital turned to
the cloud to determine how virtually unlimited scale could allow them to
solve R&D and engineering challenges faster. With this in mind, they
teamed up with Univa and AWS to evaluate the impact of running their
electro-magnetic engineering simulations on a massive HPC cluster built
on AWS using Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances.
The goal was to complete the job in the shortest possible time and at
the lowest cost. As part of this record-setting collaborative effort,
Western Digital ran approximately 2.5 million simulation tasks on a
Spot-based cluster of a little over one million vCPUs to determine
optimal device characteristics that would help improve product quality,
performance, reliability and durability for next-generation HDDs. The
project required complex multi-physics simulations and enough capacity
to run deeper simulations for increasingly complex product designs. To
put this in perspective, running 2.5 million tasks of this kind in an
on-premises environment would take 20 days to complete.
“Storage technology is amazingly complex, and we’re constantly pushing
the limits of physics and engineering to deliver next-generation
capacities and technical innovation,” said Steve Phillpott, CIO of
Western Digital. “This successful collaboration with Univa and AWS shows
the extreme scale, power and agility of cloud-based HPC to help us run
complex simulations for future storage architecture analysis and
materials science explorations. Using AWS to easily shrink simulation
time from 20 days to 8 hours allows Western Digital R&D teams to explore
new designs and innovations at a pace un-imaginable just a short time
ago.”
The electro-magnetic simulations used AWS Spot Fleet and ran on roughly
40,000 Spot Instances with more than one million vCPUs. Univa's highly
scalable cluster management and scheduling solutions, Navops Launch and
Univa Grid Engine, coordinated cluster management and workload execution
across the wide capacity of Western Digital's infrastructure and kept
the cluster fully utilized even under such a heavy workload. The result
was a 60x reduction in simulation time, from 20 days to 8 hours.
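As a rough sanity check on the figures quoted above, the short Python
sketch below reproduces the 60x reduction and derives two averages that
the release does not state directly. The rounded inputs (exactly one
million vCPUs, exactly 40,000 instances) and the function names are
assumptions made for illustration, so the derived averages are
approximate.

```python
# Figures quoted in the release. The vCPU and instance counts are rounded
# ("a little over one million", "roughly 40,000"), so derived values are
# approximate averages, not measured results.
ON_PREM_DAYS = 20
CLOUD_HOURS = 8
TASKS = 2_500_000
VCPUS = 1_000_000
INSTANCES = 40_000


def speedup(on_prem_days: float, cloud_hours: float) -> float:
    """Ratio of on-premises wall-clock time to cloud wall-clock time."""
    return on_prem_days * 24 / cloud_hours


def summarize() -> dict:
    """Collect the reduction factor and two per-resource averages."""
    return {
        "speedup": speedup(ON_PREM_DAYS, CLOUD_HOURS),
        "avg_vcpus_per_instance": VCPUS / INSTANCES,
        "avg_tasks_per_vcpu": TASKS / VCPUS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:g}")
```

With these rounded inputs, 20 days is 480 hours, and 480 divided by 8
gives the 60x figure; the same inputs imply an average of about 25 vCPUs
per instance and about 2.5 tasks per vCPU.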
“We are honored to have participated in such a unique project alongside
Western Digital, who is a storage infrastructure leader,” said Gary
Tyreman, President and CEO of Univa. “Univa works with hundreds of
enterprise organizations who are often challenged with migrating HPC
applications to the cloud, as this can typically be considerably more
expensive than on-premises if not properly managed. Our Navops Launch
solution gives HPC administrators the ability to control which
applications are placed in the cloud, while also being able to control
and monitor HPC cloud consumption and spend. I am proud of the work that
the Univa team did alongside AWS, as we successfully demonstrated
extreme scale HPC cloud.”
Additional Resources
- AWS Blog post: Western Digital HDD Simulation at Cloud Scale – 2.5
  Million HPC Tasks, 40K EC2 Spot Instances
- Univa Blog: Mission Is Possible: Tips on Building a Million Core
  Cluster
- Twitter: @Univa_Corp
About Univa Corporation
Univa is the leading innovator of workload management solutions that
optimize throughput and performance of applications, containers, and
services. Univa manages workloads automatically by maximizing shared
resources and enabling enterprises to scale compute resources across
on-premises, hybrid and cloud infrastructures. Univa's solutions help
hundreds of companies to manage thousands of applications and run
billions of tasks every day to obtain actionable insights and achieve
faster time-to-results. Univa is headquartered in Chicago, with offices
in Canada and Germany.
View source version on businesswire.com: https://www.businesswire.com/news/home/20190225005155/en/