DOE Machines Dominate Record-Breaking SC18
They say everything’s bigger in Texas, and the 30th anniversary of the annual International Conference for High Performance Computing, Networking, Storage and Analysis (SC18), held Nov. 11-16 in Dallas, did not disappoint. The conference, which broke records for attendees and exhibitors, saw Lawrence Livermore National Laboratory (LLNL) once again make its presence felt on the world’s biggest HPC stage.
For the first time in five years, the U.S. captured the top two spots on the TOP500 List of the world’s fastest supercomputers. LLNL’s Sierra, an IBM/NVIDIA system developed for the National Nuclear Security Administration (NNSA), leapfrogged China’s Sunway TaihuLight system for second place after placing third on the previous list released in June. Oak Ridge National Laboratory’s (ORNL) Summit, a 200-petaflop machine, maintained its grip on the top spot, also improving on its earlier benchmark score. In addition, Sierra placed sixth on the Green500 list of most energy-efficient supercomputers, a “remarkable achievement” according to Lawrence Berkeley National Laboratory physicist and TOP500 founder Erich Strohmaier, who announced the lists at a Nov. 12 press conference. Summit was third on the Green500.
The two Department of Energy (DOE) behemoths towered over the weeklong SC18 conference. SC18 drew a record attendance of more than 13,000 and featured a technical program spanning six days, making it the largest SC conference of all time. Notably, the conference hosted Under Secretary for Science Paul Dabbar, DOE’s principal adviser on fundamental energy research, energy technologies and science, including advanced computing.
SC18 kicked off on Nov. 11, with LLNL’s Associate Director for Computation Bruce Hendrickson welcoming students with a keynote talk that incorporated lyrics and titles from popular songs. Hendrickson encouraged the students to seek out career paths in high performance computing (HPC), sharing tips for selecting an institution and interviewing and securing a job.
“In HPC, we’re living on the edge of supercomputing,” Hendrickson said. “It’s kind of a high-wire act. It’s scary but it’s also very exciting. There’s an enormous potential for impacts across a wide range of areas.”
Other highlights from the first day included LLNL computer scientist Elsa Gonsiorowski leading an all-day “Women in HPC” workshop, computer scientist Timo Bremer presenting a workshop on a solution for task-based runtimes and LLNL’s Director of Diversity and Inclusion Tony Baylis chairing a session on HPC education for students.
The kickoff for SC18’s Technical Program on Nov. 12 included a student session on the hidden impact of HPC, chaired by LLNL computer scientist Olga Pearce. LLNL computer scientists Todd Gamblin, Greg Becker and their team members presented a well-attended, daylong tutorial on Spack, a software package manager for high performance machines. Another all-day workshop, co-organized by LLNL computer scientist and group leader Kathryn Mohror, focused on parallel data storage and data-intensive scalable computing systems (PDSW-DISCS).
Panelists at the evening’s plenary session, “HPC and Artificial Intelligence – Helping to Solve Humanity’s Grand Challenges,” discussed how HPC is helping to solve global issues, including hunger, sustainable agriculture, epidemics of infectious disease and environmental concerns.
The schedule for Nov. 13 began with the conference’s keynote address delivered by the Massachusetts Institute of Technology’s Erik Brynjolfsson, highlighting the role of machines in human decision-making. It was a big day at the DOE booth, which showcased the history and future of DOE supercomputing, including a timeline and a mini-museum of artifacts from past machines, as well as screens displaying visualizations of high-impact HPC projects. LLNL’s Advanced Simulation and Computing lead Michel McCoy and Deputy Associate Director for HPC Terri Quinn accepted Readers’ and Editors’ Choice awards from HPCwire for the Top Supercomputing Achievement of 2018, recognizing Sierra’s deployment. The award was shared with ORNL for Summit.
LLNL Chief Computational Scientist and HPC Innovation Center Director Fred Streitz spoke at the booth about how HPC and machine learning were impacting precision medicine, particularly in his pilot project for the DOE/National Cancer Institute collaboration. Under the project, researchers have modeled the behavior of RAS proteins, which can cause cancer to grow and spread when they mutate, at molecular resolution on several thousand nodes of Sierra.
“Machine learning is an independent view of reality that gives us a new way to validate our predictive simulations,” Streitz said. “This workflow, we believe, is the future of supercomputing.”
LLNL’s Deputy for Advanced Projects Matt Leininger gave a talk at the Penguin Computing booth on Corona, a Penguin/AMD computing cluster coming soon to the Lab. Later that evening, Quinn went on stage in the convention center’s massive ballroom to accept Sierra’s second place award at the TOP500 presentation, which featured remarks by Dabbar. Dabbar said it was “truly exciting to be at the cutting-edge of speed” and added that he was proud of the efforts of the national labs for helping the U.S. take back the global lead in HPC.
“The president and [DOE] Secretary (Rick) Perry are determined to keep us ahead in supercomputing, artificial intelligence and machine learning,” Dabbar said. “I don’t think we could be any more excited about what we’re developing.”
Dabbar toured the DOE booth the following day (Nov. 14), where, earlier, LLNL computer scientist Cyrus Harrison demonstrated models showing the success the Lab has had in porting codes over to Sierra.
At his booth talk, Exascale Computing Project Director Doug Kothe discussed how the pre-exascale Summit and Sierra systems were “exceeding expectations” for speedups, adding that DOE is “on track” for deployments of exascale systems at the national labs, including El Capitan at LLNL. “This is an exciting time for the evolution of computational science,” Kothe said.
On Nov. 15, University of California, Berkeley postdoctoral researcher Ken McElvain, on behalf of a team co-led by LLNL physicist Pavlos Vranas, presented a paper describing a new algorithm and code developed to more precisely determine the lifetime of a neutron. The team used Sierra and Summit to simulate the fundamental theory of quantum chromodynamics (QCD) on a lattice, reaching 15-20 percent of the machines’ peak performance and indicating progress in accuracy that could lead to the discovery of new physics.
The paper was a finalist for the Gordon Bell Prize, one of the most prestigious awards in HPC. The award, announced later that evening, was split between two teams: a Lawrence Berkeley National Laboratory-led collaboration using exascale deep learning on Summit to identify extreme weather patterns, and a team from ORNL that developed a genomics application on Summit to determine genetic architectures for chronic pain and opioid addiction at up to five times beyond previous capabilities.
The day also featured a job fair, where Lab representatives recruited students and accepted dozens of resumes. Harrison and LLNL Informatics group leader Brian Van Essen gave talks at the NVIDIA booth, and LLNL computer scientist Tapasya Patki participated in a panel on software improvements from power and energy measurement.
The Lab’s contribution to the SC18 conference wrapped up on Nov. 16 with ASC Program Coordinator for Computing Environments Rob Neely leading a workshop on performance, portability and productivity in HPC, and another workshop organized by LLNL’s David Boehme, David Poliakoff and Matt LeGendre on Extreme Scale Programming Tools.
The SC18 Exhibition floor also broke several records, including largest research booth space (65,000 square feet) and the most industry exhibitors ever. The conference returns to Denver next year for SC19.