The great migration: VisIt moves from Subversion to GitHub
Software development is often a story of teamwork and determination. It’s a tale of persistence through failure toward, ideally, success. At LLNL, this story plays out in countless daily iterations as software teams strive to advance the Lab’s national security mission. When it comes to supporting both stockpile stewardship and foundational science, the VisIt visualization tool is the backbone of LLNL’s computer simulation analysis and visualization capabilities.
A team from Computation’s Applications, Simulation, and Quality (ASQ) Division recently migrated VisIt from a patchwork of technologies to the open-source platform GitHub—an undertaking laced with complexity, obstacles, and surprise. Computer scientists Eric Brugger, Cyrus Harrison, Alister Maguire, and Mark Miller planned and implemented the migration. VisIt’s day-to-day team also includes ASQ developers Kathleen Biagas, Kevin Griffin, Matt Larsen, and Eddie Rusu.
VisIt is a scalable visualization and graphical analysis tool used at LLNL for data defined on complex two- and three-dimensional meshes. The tool has always targeted users on both desktop machines and distributed, scalable computing systems. The feature set includes scalar, vector, and tensor field visualization; interactive animations; and qualitative and quantitative analysis functions. VisIt supports multiple platforms (Windows, macOS, Linux) and interfaces (C++, Python, Java) as well as custom plugin creation.
A History of Change
VisIt was originally developed at LLNL in the early 2000s for the Department of Energy’s (DOE’s) Advanced Simulation and Computing Initiative. Following the first major feature update in 2002, VisIt joined the open-source community in 2008 using a combination of facilities operated under DOE’s Office of Science: LLNL, Oak Ridge National Laboratory (ORNL), and Lawrence Berkeley National Laboratory’s National Energy Research Scientific Computing Center (NERSC). LLNL’s Weapons Simulation and Computing Program funds VisIt development by Computation personnel.
“In the late 1990s, NERSC hosted most open code development for the Office of Science,” explains Miller. At the time, NERSC’s barrier to entry was lower than LLNL’s—contributors simply needed to fax an account request form—and the institution was better equipped for web service setup and open collaboration.
Harrison states, “Being open to other communities, such as users simulating tokamak fusion, has hardened our tool and enabled us to build features we wouldn’t have otherwise developed.” Miller adds, “In addition to developer contributions, VisIt also benefits from users who report issues or new use cases.” Over the following years, all VisIt development became licensed and freely available.
Eventually, the technologies supporting various development processes ran into challenges. The source code outgrew its home in ClearQuest, so the team migrated it to Subversion (SVN), for which NERSC hosted anonymous HTTPS access. However, Harrison says, “Branch development and merging became suboptimal with SVN. It could take an hour to create a branch. VisIt’s large code base was taxing on NERSC’s servers, and we saw performance issues.” As SVN started to show its age, so too did the Redmine issue tracker hosted at ORNL and the OpenOffice software used at LLNL for VisIt’s documentation. Together, these concerns provided enough motivation to migrate several of the tool’s services to the more advanced, well-established GitHub platform.
During this time, the Lab’s policies and capabilities for managing open-source software changed, too. A culture of collaboration inspires many project teams to seek feedback and ideas from the open-source community, and GitHub provides projects with greater visibility and more integrated development functionality. Currently, over 450 LLNL software repositories are available to the public. VisIt is one of the Lab’s longer-lived open-source tools, though its history with NERSC and ORNL makes it a relative newcomer to GitHub.
Maintaining VisIt’s baseline health is crucial for users who dissect data at extreme scale, and throughout the years, the team’s decisions have always been driven by ease of contribution. “VisIt has an active community of contributors, and some users have relied on it for years,” says Brugger.
Consolidating the Pieces
Figure 1. VisIt’s technology ecosystem provides a variety of services to developers and users. The team moved several technologies to GitHub for easier access by the open-source community, integrated documentation, seamless quality assurance workflows, and better version control.
The road to GitHub was neither straight nor narrow. VisIt consists of multiple technologies serving different purposes, collectively leaving no stone unturned in ensuring the tool’s reliability and accuracy. Each service had to be evaluated for migration potential and necessity—for example, ORNL’s Mailman email service, which is archived and searchable, is sufficient as is. According to Miller, this piecemeal situation ultimately evolved into an opportunity to consolidate. “We saw various ways to migrate, and we wanted to maintain as much history as possible,” he says.
Although it is the world’s most popular source code hosting service, GitHub is not a one-size-fits-all solution. VisIt contains 2 million lines of code and has a nearly 20-year history of versions, bugs identified and fixed, code contributions, and documentation. Harrison states, “We knew we’d have to make strategic decisions and major changes to accomplish the transition.”
Migration planning began two years ago and culminated when Harrison and Maguire wrote scripts to automatically transfer over 15,000 historical code contributions and 3,000 issue submissions. “Not everything migrated successfully the first time,” Harrison notes, referring to the ClearQuest-to-SVN migration of the mid-2000s. “The history was intact, but the details needed work. So we expected that initial migration attempts would fail at first.”
In this effort, the team’s initial SVN-to-GitHub scripting attempts resulted in all historical work being time-stamped with the same day in 2007. They kept at it, improving the import process with each iteration. “When replaying all the code changes from the past 15-plus years, we wanted to ensure the work was attributed to the correct people,” says Harrison.
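The article does not describe the team’s scripts, but the attribution and time-stamp problem can be illustrated with a minimal Python sketch. It assumes a simplified SVN log record and a hand-maintained username map, both hypothetical, and builds the environment variables that Git reads for author identity and dates when a commit is created. Without the date variables, a replayed commit is stamped with the time of the import rather than the time of the original work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SvnRevision:
    """One entry from an SVN log (hypothetical, simplified record)."""
    number: int
    username: str
    date: str      # ISO 8601 timestamp of the original change
    message: str


# Hypothetical mapping from SVN usernames to Git identities.
AUTHORS = {
    "alice": ("Alice Example", "alice@example.org"),
}


def commit_environment(rev, authors):
    """Return the environment variables Git uses for identity and dates."""
    # Unknown users get a placeholder identity instead of someone else's name.
    placeholder = (rev.username, rev.username + "@unknown.invalid")
    name, email = authors.get(rev.username, placeholder)
    return {
        "GIT_AUTHOR_NAME": name,
        "GIT_AUTHOR_EMAIL": email,
        "GIT_AUTHOR_DATE": rev.date,
        "GIT_COMMITTER_NAME": name,
        "GIT_COMMITTER_EMAIL": email,
        "GIT_COMMITTER_DATE": rev.date,
    }
```

In a replay loop, this dictionary would be merged with the current environment and passed to the process that runs `git commit`, so each replayed change keeps its original author and date.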
Another migration challenge arose when Maguire began moving VisIt’s bug and issue history to GitHub. “The Redmine repository we’d been using was very different from GitHub,” explains Maguire. “We had to figure out how to translate from one to the other.” Redmine did not support an API for this process, so Maguire learned how to write a manual web scraper. Furthermore, GitHub’s import API was not a 1:1 match with the Redmine data, and it imposed limits on the number of queries Maguire could make. “I was flagged on a spam list and had to ask for special permission to continue,” he adds.
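To give a sense of what translating “from one to the other” can involve, here is a small Python sketch. It is not Maguire’s code. The Redmine field names, the label mapping, and the output layout are all assumptions chosen for illustration; only the general idea (title, body, labels, open or closed state) reflects how GitHub issues are organized. A helper also shows the arithmetic behind pacing requests to stay under a query limit.

```python
# Hypothetical mappings; a real migration would define these per project.
TRACKER_LABELS = {"bug": "bug", "feature": "enhancement"}
CLOSED_STATUSES = {"closed", "resolved", "rejected"}


def to_github_issue(record):
    """Translate one scraped Redmine record into a GitHub-style issue dict."""
    tracker = record.get("tracker", "").lower()
    labels = [TRACKER_LABELS[tracker]] if tracker in TRACKER_LABELS else []
    # Keep provenance in the body, since the original author and date
    # cannot always be carried over as structured fields.
    header = "Imported from Redmine issue #{0}, reported by {1} on {2}.".format(
        record["id"], record["author"], record["created"]
    )
    return {
        "title": record["subject"],
        "body": header + "\n\n" + record.get("description", ""),
        "labels": labels,
        "closed": record.get("status", "").lower() in CLOSED_STATUSES,
    }


def min_delay_seconds(requests_per_hour):
    """Smallest spacing between requests that respects an hourly limit."""
    if requests_per_hour <= 0:
        raise ValueError("requests_per_hour must be positive")
    return 3600.0 / requests_per_hour
```

Sleeping for at least `min_delay_seconds(limit)` between calls is a simple way to spread 3,000 issue submissions over time instead of sending them in a burst.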
GitHub presented other considerations for the VisIt team. “Some of our binary files were huge, which GitHub doesn’t allow without paying for a larger bandwidth quota,” Miller states. “So we split out the big files and came up with a new development process to accommodate storage constraints.” VisIt’s size also prompted customization of continuous integration (CI), a key GitHub benefit. To prevent the test suite from timing out, the team is adjusting CI timing and building containers to run tests.
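A first step in splitting out big files is simply finding them. The Python sketch below is one hedged way to do that, not the team’s actual process: it walks a source tree and lists every file larger than a chosen threshold, largest first. The function name and the threshold are illustrative assumptions.

```python
import os


def find_large_files(root, threshold_bytes):
    """Return (relative_path, size) pairs for files larger than the threshold."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata so only working files are counted.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # ignore symbolic links
            size = os.path.getsize(path)
            if size > threshold_bytes:
                large.append((os.path.relpath(path, root), size))
    # Largest files first, since those matter most for storage limits.
    return sorted(large, key=lambda item: item[1], reverse=True)
```

Running it with a threshold such as `50 * 1024 * 1024` bytes produces a candidate list of files to move out of the main repository or handle through a separate storage mechanism.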
Last but not least, the team had to pick a cutover date when Lab personnel were available to assist (for example, not the same week that Computation sent more than 100 employees to the SC18 conference). A dry run was scheduled for December 2018, and VisIt version 2.13.3 debuted on GitHub in January 2019.
19 Years and Counting
Figure 2. VisIt’s migration tasks included transferring more than 15,000 code commits made over the years. This commit history is now available via GitHub’s familiar user interface.
Now accessible via the GitHub dashboard, VisIt’s code base is already receiving contributions from the external community. In another major scripting effort, documentation has been migrated to Sphinx, which integrates seamlessly with Read the Docs, a documentation hosting service that can build directly from a GitHub repository. After the team confirms the repository’s stability, Maguire will complete the bug and issue tracking migration. The VisIt website will eventually migrate as well, and the team plans to dive into GitHub’s analytics capabilities.
Version 3.0 is also on the horizon. Brugger notes, “VisIt leverages many third-party tools, all of which require maintenance. Some are more straightforward than others.” For instance, VisIt uses Ascent for in-situ processing and OSPRay as the ray-tracing library. The next VisIt version will include upgrades to GPU acceleration through VTK-m and, inevitably, documentation updates to reflect the current state of the software.
VisIt will mark its 20th anniversary next year, and its longevity can be attributed to several factors. According to Miller, many of VisIt’s capabilities cannot be found in commercial software. Moreover, he notes, “A majority of the core development team has been involved since the early 2000s, demonstrating remarkable and highly valued stability especially when compared to commercial software projects of similar scope.”
Success also depends on the team’s lasting relationships with early collaborators combined with welcoming the next generation of computer scientists. Former LLNL employee Brad Whitlock works with Intelligent Light to enhance their VisIt capabilities and remains active in the tool’s development. VisIt’s original architect, Hank Childs, now teaches at the University of Oregon and continues to assist with strategic planning. Childs regularly sets his research students—like alumnus Maguire—to work on VisIt. Maguire says, “I probably wouldn’t be at the Lab if VisIt hadn’t been open source.”