Argonne training program alumni find success in extreme-scale computing

BY JIM COLLINS | FEBRUARY 12, 2024

From providing a foundation for innovative computing research to fostering new connections and opening the door to diverse career paths, ATPESC has become a defining chapter in the professional journeys of many attendees.

Since its launch in 2013, the annual Argonne Training Program on Extreme-Scale Computing (ATPESC) has hosted nearly 800 attendees for an immersive learning experience designed to teach them the fundamentals of using supercomputers for scientific research.

While ATPESC attendees are required to have some exposure to high performance computing (HPC), the program provides a knowledge base, hands-on experience and networking opportunities that have helped further the careers of many participants.

“Going to a program like ATPESC allows you to meet all these people who are doing similar things from such different research backgrounds,” said Max Katz, who attended the very first ATPESC in 2013. “It’s one way of kind of minting your union card, as it were, to become part of the HPC community.”

For two weeks each summer, the program, hosted by the U.S. Department of Energy’s (DOE) Argonne National Laboratory, brings in approximately 70 participants from around the globe for a deep dive into the world of extreme-scale computing.

World-class HPC experts deliver lectures on topics ranging from computing hardware and software development to artificial intelligence (AI) and data processing. They also lead hands-on sessions where participants get a chance to test-drive powerful supercomputers at three DOE Office of Science user facilities: the Argonne Leadership Computing Facility (ALCF) at Argonne, the Oak Ridge Leadership Computing Facility (OLCF) at DOE’s Oak Ridge National Laboratory (ORNL) and the National Energy Research Scientific Computing Center (NERSC) at DOE’s Lawrence Berkeley National Laboratory.

“The all-encompassing curriculum, an impressive group of lecturers and supercomputer access are what make ATPESC so unique,” said Ray Loy, ATPESC director and ALCF lead for training, debugging and math libraries. “Our ultimate goal is to equip a new generation of researchers with the skills to leverage and enhance extreme-scale computing capabilities for science and engineering.”

With ATPESC entering its 12th year this summer, we caught up with some past attendees to get their thoughts on the program and how it has shaped their careers.

From academia to industry to government

Max Katz came to ATPESC as a graduate student studying supernova explosions at Stony Brook University. He had used smaller HPC clusters, but large-scale supercomputers were relatively new territory for him.

“Like most grad students who are doing something like supernova research, you just hit the ground running and learn how to use a supercomputer at the same time as you’re learning to do your science,” he said.

At his advisor’s recommendation, Katz applied for ATPESC to get a crash course in using supercomputers for his research.

“Aside from just supporting my growth in the field, I wanted to go to ATPESC to meet other people in the supercomputing community and see what they were working on,” he said.

Shortly after ATPESC, Katz left academia for a position with NVIDIA, where he focused on training NVIDIA system users and helping them overcome software development challenges. He also served as a liaison between the company and DOE national labs.

“ATPESC helped me realize that it can also be fun to support the scientific community in the other direction,” Katz said. “I wanted to do something where I could see the impact on people.”

In 2022, his desire to make a positive impact led him to take on a new challenge: advising the U.S. Senate on AI policy. As a legislative fellow with New Mexico Senator Martin Heinrich, Katz supports the office’s technology and energy portfolio as well as the Senate AI Caucus. From HPC-driven research to one of the world’s largest technology companies to a congressional fellowship, Katz’s career path has touched on many different aspects of advanced computing.

“ATPESC is one of the first places that made me realize how many different kinds of career paths there are in HPC,” Katz said.

High-performance networking

ATPESC proved to be more than just a learning experience for Verónica Melesse Vergara and Sally Ellingson. As attendees of the 2013 program, the two researchers formed a friendship and working relationship that is still going strong today. Both have careers that require working with supercomputers, and they regularly collaborate on HPC events and conferences.

“ATPESC goes morning to night for two weeks,” Melesse Vergara said. “You’re talking to the same people every day, eating meals with them, taking in a fire hose of information together. I think that helped create this ‘we’re in this together’ sort of environment.”

When Melesse Vergara applied for ATPESC, she was a scientific applications analyst at Purdue University, where she supported researchers using the university’s HPC clusters.

“I had just started working at Purdue and had a limited view of what HPC was,” she said. “ATPESC seemed like an HPC bootcamp that covered everything. It touched on so many topics so that was the main appeal for me.”

After cutting her teeth in HPC support with a university cluster, Melesse Vergara took a position at ORNL, where she has worked with some of the world’s most powerful supercomputers, including the Frontier exascale system. Over the past decade, she has ascended from HPC user support specialist to group leader for system acceptance and user environment to section head of operations at the lab’s National Center for Computational Sciences.

At the time of ATPESC 2013, Ellingson was a graduate student studying life sciences at the University of Tennessee, Knoxville. She was also a graduate research assistant at ORNL’s Center for Molecular Biophysics.

“I came from a computational background, but I didn’t know what a supercomputer was until I toured Oak Ridge,” Ellingson said. “The first time I saw one, I knew that was the route I wanted to go, so I took every opportunity I could to work on technical skills and learn more.”

Ellingson’s experience with supercomputing and working with the national labs led her to a faculty position at the University of Kentucky, where she is an assistant professor of biomedical informatics and a liaison for HPC services for the Markey Cancer Center.

“My research is focused broadly on computational biology and drug discovery, especially work that requires high performance computing,” Ellingson said.

While Ellingson and Melesse Vergara don’t cross paths often in their day-to-day jobs, they continue to collaborate on events in the HPC community. In 2023, they teamed up on the papers program committee for the Platform for Advanced Scientific Computing Conference, serving as co-chairs for the life sciences domain. This year, Melesse Vergara and Ellingson are co-leading the annual Supercomputing Conference’s Inclusivity Committee, serving as chair and deputy chair, respectively.

“Participating in events like ATPESC and Supercomputing has really helped build my network and realize that this is a community where I feel like I belong,” Ellingson said. “You get to know people that you will continue running into throughout your career.”

Engine for innovation

For Argonne’s Sibendu Som, ATPESC was key to propelling his team’s engine modeling research from smaller computing clusters to DOE’s large-scale supercomputers.

“We needed to accelerate our simulations further, and I was interested to see if our code could scale on leadership machines,” Som said. “My team and I were primarily using commercial software since we were working on several projects with industry. At ATPESC, I was exposed to open-source codes and their capabilities.”

One of those codes was Nek5000, an Argonne-developed thermal-fluids simulation code that is designed to run efficiently on massive supercomputers. After adopting Nek5000 for some of his team’s engine research efforts, Som applied for and received access to ALCF supercomputers via DOE’s allocation programs. He and his team have led multiple projects at the ALCF and continue to leverage DOE systems in collaboration with industry partners through programs like the High Performance Computing for Energy Innovation (HPC4EI) initiative.

Som’s career at Argonne has also grown by leaps and bounds since he attended ATPESC as an Argonne mechanical engineer in 2013. He is now the director of Argonne’s Advanced Propulsion and Power department and leads the Advanced Energy Technology directorate’s AI and HPC initiative. As a department director, he has encouraged multiple colleagues to attend ATPESC over the years and continues to recommend the program to new staff members working with HPC.

“The content continues to be extremely relevant for my team,” Som said. “With the recent focus on AI tools and testbeds, ATPESC has helped my team members stay abreast with the latest developments in the field.”

Launchpad for HPC knowledge

ATPESC has also become a valuable training resource for the NASA Langley Research Center, with more than a dozen early career researchers attending the program over the years.

“NASA’s mission is not HPC like it is for some of the national labs,” said Michelle Rodio, who attended ATPESC in 2020. “The mission is to go to the moon, Mars and beyond, but you can’t do that without HPC.”

Prior to attending the program, Rodio was a graduate student at Old Dominion University who was also working at the NASA Langley Research Center. To support her research in computational aeroacoustics, Rodio enrolled in several HPC workshops and courses to learn everything she could about supercomputing.

“ATPESC offered a unique opportunity to become well versed in all things HPC,” Rodio said. “The point was to be exposed to it, to know what’s out there, to know what the capabilities are.”

In 2021, Rodio made the leap from NASA to industry, taking a role with NextSilicon, a startup company that specializes in accelerated compute solutions for the HPC market. NextSilicon’s adaptive learning algorithm aims to let diverse computing applications run efficiently without requiring code changes. As the director of customer success, Rodio manages a team that handles technical support, user documentation and training workshops.

“When I heard the company’s objective was to provide a platform that allows domain scientists to focus on their science instead of porting and software development, it really resonated with me,” Rodio said. “I wanted to be part of a solution that helped people avoid having to go through the struggles with code development that I went through for much of my Ph.D. program.”

The knowledge boost she received from ATPESC has served her well in this role. Not only is she more familiar with HPC software approaches, but she is also drawing from the experience to inform the company’s training materials.

“I’ve even found myself talking to some of the people who were trainers during ATPESC because they are our target audience,” Rodio said. “It has truly, and wonderfully, come full circle for me.”

These are just a few of the stories that illustrate ATPESC’s enduring impact. From providing a foundation for innovative HPC research to fostering new connections and opening the door to diverse career paths, ATPESC has become a defining chapter in the professional journeys of many participants.

The call for applications for ATPESC 2024 is open through February 28, 2024. For details, visit the ATPESC website.

The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines. Supported by the U.S. Department of Energy’s (DOE’s) Office of Science, Advanced Scientific Computing Research (ASCR) program, the ALCF is one of two DOE Leadership Computing Facilities in the nation dedicated to open science.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.

The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.