- Technological hurdles can prevent scientists from pursuing research
- Science gateways provide an easy, custom interface to supercomputing resources
- Developer Mona Wong uses programming skills to support science from the inside out
Have you ever set out to accomplish a new task but quickly discovered that it required new equipment? And then you had to learn how to use the equipment before you could move on, and before long you were bogged down in online instructional videos and wondering why you ever thought this was a good idea?
That same thing can happen to scientists, particularly when they want to access new technologies to expedite their work. The hurdle can be especially high for researchers in fields like biology or the humanities that don’t traditionally provide technological training.
Enter Mona Wong, a software engineer at the San Diego Supercomputer Center (SDSC). Wong works behind the scenes to develop online gateways that help scientists reach their goals, whether that’s a better understanding of how the brain works or more accurate predictions of natural disasters.
A science gateway is a tool that makes data, resources, or custom scientific workflows easier to use via a point-and-click web interface—no command lines or programming languages to learn.
“Gateways allow people to not have to worry about the software and how to install and run it, to be able to try things,” says Wong. “I get to use what's fun and easy for me to help scientists discover something.”
Solving problems for scientists
One way that Wong helps expand access to science is through her work with the Extended Developer Support (EDS) group at the Science Gateways Community Institute (SGCI). Funded by the National Science Foundation, SGCI is a clearinghouse of support for developing, operating, and sustaining science gateways.
In her role with EDS, Wong works on multiple gateways—often serving entirely different domains. When someone is interested in building a gateway, they apply to EDS. If they’re selected to receive support, they’re assigned to a developer who then meets with the principal investigator and comes up with a work plan for the engagement.
One of her recent projects is the COSMIC2 Gateway, which focuses on structural biology, a field currently undergoing something of a revolution. Structural biologists use cryo-electron microscopy (cryo-EM) to examine the structures of proteins and other macromolecular samples at near-atomic resolution.
Cryo-EM is an invaluable tool for understanding human health and disease, but its widespread use is hindered by the incredibly large size of the datasets the equipment generates.
In cryo-EM, scientists use the microscope to collect images of a frozen specimen at increasing depths. The images are then reconstructed into a 3D representation of the original structure.
The individual images are very large, currently up to 12K by 8K pixels each.
Because the microscopes are very expensive instruments, scientists must wait to be allocated time, and when they get access, they collect as many images as they can, generating massive amounts of data at one time—up to tens of terabytes.
Many biologists aren’t familiar with handling such large datasets, and many don’t have the computing resources to do the calculations. That’s where a gateway comes in.
“The user uploads their data to the gateway and sets up parameters for their computational task,” says Wong. “Once the task is initiated by the user, the gateway will generate and submit the commands to the software running on the supercomputer to perform the calculations. The user is notified when the job is done, and they come back to the gateway to view the results.”
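The step Wong describes, where the gateway turns a user's form entries into commands for the supercomputer, can be sketched roughly as follows. This is only an illustration under stated assumptions: the article does not say which scheduler or analysis software COSMIC2 drives, so the Slurm-style batch directives, the `JobRequest` fields, and the `reconstruct_3d` program name are all hypothetical stand-ins.

```python
import shlex
from dataclasses import dataclass


@dataclass
class JobRequest:
    """Values a user might enter in a gateway web form (all hypothetical)."""
    user_email: str
    input_path: str
    nodes: int
    hours: int
    program: str = "reconstruct_3d"  # placeholder name, not a real tool


def build_batch_script(req: JobRequest) -> str:
    """Turn form values into the text of a Slurm-style batch script."""
    if req.nodes < 1 or req.hours < 1:
        raise ValueError("nodes and hours must be at least 1")
    # Quote every argument so a path containing spaces stays one argument.
    command = " ".join(
        shlex.quote(part) for part in [req.program, "--input", req.input_path]
    )
    lines = [
        "#!/bin/bash",
        f"#SBATCH --nodes={req.nodes}",
        f"#SBATCH --time={req.hours:02d}:00:00",
        "#SBATCH --mail-type=END",  # ask the scheduler to email when the job ends
        f"#SBATCH --mail-user={req.user_email}",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    request = JobRequest("user@example.org", "/data/session1/stack.mrc", 2, 4)
    print(build_batch_script(request))
```

The point of the sketch is the division of labor: the scientist fills in a few fields, and the gateway produces the script and handles submission, so the user never touches the command line.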
One additional challenge with such large datasets is how to reliably transfer the data from the scientist’s lab to the supercomputer that does the heavy lifting and back again.
“If you have only one gigabyte of data and the transfer dies halfway through, you can start again,” says Wong. But that’s not a great solution when you’re dealing with multiple terabytes that can take days or weeks to transfer. That’s why COSMIC2 and other gateways use the Globus platform for secure data transfer.
“The great thing about Globus is if there’s a problem in the middle, it can restart from there,” says Wong. “At the end, it has mechanisms in place that check the data to make sure it wasn’t corrupted. In theory, you don’t have to worry about it.”
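The two ideas in that quote, restarting from the point of failure and checking the data at the end, can be illustrated with a small local sketch. This is not the Globus API or its actual mechanism; it is a minimal stand-in that copies a file in chunks, resumes from however many bytes the destination already holds, and then compares SHA-256 checksums. The `fail_after` parameter is a hypothetical hook used only to simulate an interruption.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # copy one mebibyte at a time


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def resumable_copy(src, dst, fail_after=None):
    """Copy src to dst, continuing from the bytes dst already holds.

    fail_after (hypothetical test hook): raise ConnectionError once this
    many bytes have been written during the current call.
    Returns True if the finished copy matches the source checksum.
    """
    already_done = os.path.getsize(dst) if os.path.exists(dst) else 0
    copied = 0
    with open(src, "rb") as fin, open(dst, "ab") as fout:
        fin.seek(already_done)  # skip what an earlier attempt delivered
        while True:
            block = fin.read(CHUNK)
            if not block:
                break
            if fail_after is not None and copied >= fail_after:
                raise ConnectionError("simulated interruption")
            fout.write(block)
            copied += len(block)
    # Final integrity check: both files must hash to the same value.
    return sha256_of(src) == sha256_of(dst)
```

Calling `resumable_copy` again after an interruption picks up where the previous attempt stopped, which is what makes multi-terabyte transfers practical: a failure costs minutes rather than days.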
Make the code do something
Toiling behind the scenes might seem like thankless work, but Wong sees her role as actively supporting scientific discovery. She has always loved science, but her own unique path became clear when she took her first computer science course.
“I discovered that it was actually easy for me,” she says. “It fits the way I think, and it’s also very creative, because you take code and you make it help you do something.”
And the ‘something’ she is doing hasn’t gone unnoticed. Wong received the Young Professionals Award at the SGCI’s Gateways 2018 conference, which recognizes notable achievement in the advancement of science gateways.
“It’s a great honor, a way to acknowledge the work that you do and the promise you hold for the community,” says Wong. “But I also sense a kind of responsibility.”
That responsibility isn’t just to scientists. The use of science gateways is expanding as more research areas discover what can be accomplished with access to supercomputing power. Their ease of use allows even schoolchildren to learn about advanced scientific resources without worrying about how to install and run complex software, which makes Wong’s work all the more rewarding.
“I love that gateways are used a lot now in teaching students about science,” says Wong. “I love science, so I think that the more people who love science, the better.”