- HIPAA security and privacy rules stipulate that patient healthcare data must be protected
- Privacy regulations create challenges for scientists accessing medical data for research
- Medical Science DMZ model speeds research and improves collaboration
The 1996 Health Insurance Portability and Accountability Act (HIPAA) is best known for preserving insurance coverage for employees who change or lose their jobs. But the law also includes a Security Rule and a Privacy Rule that protect confidential healthcare data for consumers.
These security and privacy regulations, which took effect in 2003, continue to safeguard patient health information but also create challenges for the medical research community. The technical measures recommended for protecting medical data can also impede usability, including high-bandwidth data transfers.
That is not so surprising: 2003 was still early days for the internet, and networks moved data at a much slower pace. Now, fifteen years later, there is a better way to transfer large datasets containing patient data while still complying with HIPAA’s rules.
Design for performance
In their paper, Lawrence Berkeley National Laboratory (LBNL) computer scientist Sean Peisert, Energy Sciences Network (ESnet) researcher Eli Dart, and their collaborators outline a “design pattern” for deploying specialized research networks and ancillary computing equipment for HIPAA-protected biomedical data. The pattern is meant to provide both high-throughput network data transfers and high-security protections.
“The original Science DMZ model provided a way of securing high-throughput data transfer applications without the use of enterprise firewalls,” says Dart. “You can protect data transfers using technical controls that don’t impose performance limitations.”
Created with US Department of Energy (DOE), National Science Foundation (NSF), and National Oceanic and Atmospheric Administration (NOAA) science applications in mind, the original Science DMZ model supports research in areas such as high-energy physics, atmospheric modeling, and cosmological data.
But domains such as genomics also require high-performance applications to process incredibly large and complex datasets. For example, the Department of Veterans Affairs’ (VA) Million Veteran Program (MVP) is reported to be the largest genomic database in the world as of 2016, and the National Institutes of Health (NIH) “All of Us” program seeks to develop a dataset of electronic health records and genomes of similar size.
However, unlike high-energy physics data, which many scientists would view as requiring relatively low levels of cybersecurity protection, human genomic data and electronic health records require substantially more safeguards in order to comply with the HIPAA Security Rule in the US. Similar regulations exist in other parts of the world.
The National Institute of Standards and Technology (NIST) has published extensive guidelines on implementing the HIPAA Security Rule.
“Many traditional security protections, such as the stateful and deep-packet-inspecting firewalls prescribed by NIST, don’t support both the goals of security and the high-performance networking needs of these applications,” says Peisert. “The Medical Science DMZ addresses these problems by creating a network that is explicitly designed for high-performance biomedical applications with security protocols.”
“If you look at overall network and system design as part of your security architecture, this allows you to make better decisions,” says Dart. “This leads to better scientific outcomes.”
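The article does not spell out which technical controls replace the firewall. One control often associated with Science DMZ designs is a stateless allow-list applied at the router, which checks each packet against a short rule set instead of tracking connection state. The sketch below illustrates that idea only; the `Rule` type, the `permitted` function, the address ranges (reserved documentation blocks), and the port are all hypothetical and are not taken from the paper.

```python
from ipaddress import ip_address, ip_network
from typing import NamedTuple, Sequence


class Rule(NamedTuple):
    source: str  # CIDR block of a collaborating site (hypothetical)
    port: int    # destination port on the data transfer node (hypothetical)


# Hypothetical allow-list: two partner sites may reach the transfer service.
ALLOW = [
    Rule("192.0.2.0/24", 443),
    Rule("198.51.100.0/24", 443),
]


def permitted(src: str, dst_port: int, rules: Sequence[Rule] = ALLOW) -> bool:
    """Stateless check: each packet is judged on its own header fields."""
    addr = ip_address(src)
    return any(
        addr in ip_network(rule.source) and dst_port == rule.port
        for rule in rules
    )


if __name__ == "__main__":
    print(permitted("192.0.2.17", 443))   # partner site, allowed port
    print(permitted("203.0.113.5", 443))  # unknown site
```

Because the decision uses only the source address and destination port, no per-connection table has to be kept, which is the property that lets this kind of filter keep up with very fast transfers.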
Collaboration nation
In his original Science DMZ work, Dart found that the Science DMZ design pattern increased collaboration among different research organizations by improving transfer speeds and reducing cost.
Peisert, Dart, and their collaborators at Indiana University (IU), the University of Chicago, Harvard University, and BioTeam expect the same will be true when applying the Medical Science DMZ to HIPAA-protected data.
“If we look at what the medical field is trying to do with cancer data,” says Dart, “we need a way for multiple institutions to collaborate. Everybody may have a piece of the puzzle, but nobody has all the data in one place.”
Shared data repositories like the National Library of Medicine, the National Cancer Institute, and the European Bioinformatics Institute are growing rapidly, highlighting the need for a quick and cost-effective way for researchers to access their large datasets.
“The datasets traditionally used in medical data have been smaller,” says Peisert. “But scientific work in large-scale precision medicine research requires substantially larger data-driven efforts in order to be successful.”
Ensuring privacy and results
The original Science DMZ model is just one example of how the computing community has evolved in recent years.
“We’re able to do things with computing now that we couldn’t dream of a generation ago,” says Dart.
The Medical Science DMZ may help researchers pioneer results in multiple human health domains by improving data transfer times, while still complying with and enforcing HIPAA’s security and privacy regulations.
Peisert notes, “We did this work to develop this framework because we wanted to dramatically change the way data-driven medical science takes place. We wanted to make it possible for institutions doing data-driven medical science to move large amounts of data in a secure way without it taking a month to move the data over the internet, or by mailing giant hard drives to each other.”
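To give a feel for the “month” Peisert mentions, transfer time follows directly from dataset size and sustained throughput. The sketch below is illustrative arithmetic only: the 100-terabyte dataset and the three link speeds are assumed figures, not numbers from the paper, and `transfer_days` is a hypothetical helper.

```python
def transfer_days(dataset_terabytes: float, throughput_gbps: float) -> float:
    """Days needed to move a dataset at a sustained end-to-end throughput."""
    bits = dataset_terabytes * 1e12 * 8           # decimal terabytes to bits
    seconds = bits / (throughput_gbps * 1e9)      # gigabits per second to bits per second
    return seconds / 86_400                       # seconds to days


if __name__ == "__main__":
    # Assumed scenario: a 100 TB dataset over three sustained rates.
    for label, gbps in [("0.3 Gbps", 0.3), ("1 Gbps", 1.0), ("10 Gbps", 10.0)]:
        print(f"{label}: {transfer_days(100, gbps):.1f} days")
```

Under these assumptions, the same dataset that takes roughly a month at a few hundred megabits per second moves in under a day at 10 Gbps, which is why sustained throughput matters more than the nominal speed of the link.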
Though the Medical Science DMZ was originally developed with biomedical science in mind, Peisert and Dart point out that the same model is appropriate for other types of science with robust security needs, such as research involving data with intellectual property constraints or other confidentiality and privacy requirements.