For computer scientist Greg Lee, sports and exercise are an important part of a balanced lifestyle. A competitive tennis player since the age of 10, Greg still plays a few semi-professional tennis tournaments a year, but these days he particularly enjoys the camaraderie that team events offer.
Greg currently plays on a San Francisco-based team that traveled to Las Vegas in September for nationals, where he helped his team earn second place. One of his victories there was over a doubles team fresh off the professional tour. Greg also regularly participates in basketball and ultimate Frisbee lunchtime pick-up games at LLNL and rides his bike to work most days, a round-trip journey of about 23 miles. He notes, “In addition to the obvious physical benefits, I think sports have helped me develop discipline, and they keep my mind sharp.”
Sports, according to Greg, also led him to both his marriage and his Laboratory career. While playing collegiate tennis at the University of California, Davis, he met his future wife and became acquainted with a teammate’s father, an LLNL employee who suggested he spend a summer at Livermore. Greg liked the people, the challenging projects, and the work environment he experienced during his summer as a Computing student intern. After earning his computer science master’s degree at the University of California, San Diego, in 2006, he joined Computing’s Development Environment Group (DEG).
DEG works closely with the Center for Applied Scientific Computing, which supports the demanding computing requirements of Livermore scientists. Greg and his colleagues help apply existing tools to customer needs and invent new solutions where needed.
“We fill in gaps in areas like debugging tools,” explains Greg. “Few places have systems as large as ours, so the customer base is small and tool vendors don’t focus their development efforts in this area.” The open-source software they develop is subsequently made available to the broader high-performance computing (HPC) community.
Greg is one of the main developers of the Stack Trace Analysis Tool (STAT), a highly scalable debugging tool for identifying errors in computer codes running on supercomputers with a million or more processor cores. STAT won an R&D 100 award in 2011. “We have heard from people in the high performance computing field that we wouldn’t have otherwise heard from without winning our R&D 100 award,” observes Greg. But while it was nice to get industry recognition, he says, the best reward has been seeing users benefit from the tool: “For example, STAT was used several times on Sequoia to solve problems we couldn’t have otherwise.”
One of these instances was in late 2013, when an international team of scientists was simulating a collapsing cloud of 15,000 bubbles using the Sequoia supercomputer and the calculations suddenly stopped. Within a few minutes, STAT had determined which of more than 6 million computing threads was causing the problem. The team went on to complete the pioneering simulation and win a Gordon Bell Prize for outstanding HPC achievement.
Greg and his colleagues continue to evolve STAT and use it to solve new problems. DEG also develops other tools designed to boost performance and productivity, such as AutomaDeD, which uses artificial intelligence to automate the debugging process for massive simulations, and SPINDLE, which addresses problems that can occur when millions of cores simultaneously open an application consisting of thousands of shared libraries. Efforts such as these keep Greg engaged and challenged. “I’ve never had worries about there being a lack of work,” he says. “As the computers get bigger and more complex, we’ll always need new and improved tools to make them run better.”
—Rose Hansen