- Higher ed institutions are increasingly vulnerable to cyberattacks
- New machine learning system automatically categorizes attacker behavior
- Categorization brings analysts’ attention to critical attacks more quickly
Security threats against colleges and universities are on the rise. Personal and financial information for students and staff, cutting-edge research data, and access to high-performance computing systems that could be exploited for illicit cryptocurrency mining are all highly attractive targets for hackers.
But higher education networks are difficult to defend: they are large, complex, and accessed by many people on many different devices. Researchers now think AI and machine learning could become the key to successful cybersecurity operations.
ASSERT is a machine learning system that automatically organizes intrusion alerts and other security data into descriptive models of attacker behavior, helping security operations center analysts identify related attacker activity more effectively.
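The core idea of grouping individual alerts into behavior-level summaries can be illustrated with a minimal sketch. This is not ASSERT's actual algorithm, and the field names are hypothetical; it only shows why summarization helps: an analyst sees one line per behavior rather than every raw alert.

```python
from collections import defaultdict

# Hypothetical alert records; the field names are illustrative,
# not ASSERT's real schema.
alerts = [
    {"src": "10.0.0.5", "signature": "SSH brute force", "target": "login-server"},
    {"src": "10.0.0.5", "signature": "SSH brute force", "target": "login-server"},
    {"src": "172.16.3.9", "signature": "SQL injection attempt", "target": "web-app"},
]

def summarize(alerts):
    """Group raw alerts by (signature, target) and count occurrences,
    producing one summary entry per apparent behavior."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert["signature"], alert["target"])].append(alert)
    return {key: len(members) for key, members in groups.items()}

summary = summarize(alerts)
```

Here three raw alerts collapse into two behavior groups, one of which (the repeated brute-force signature) immediately stands out by its count.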
OmniSOC is a 24/7 security operations center (SOC) shared among multiple universities to help members reduce the time from first detection of a security threat to campus mitigation. In 2019, the Indiana University Center for Applied Cybersecurity Research (CACR) and OmniSOC began working with Dr. Jay Yang and his team at Rochester Institute of Technology to implement Dr. Yang’s ASSERT research prototype with the OmniSOC.
“SOC analysts are overwhelmed by intrusion alerts,” said Yang. “By providing a characteristic summary of different groups of alerts, ASSERT can bring SOC analysts’ attention to critical attacks quicker and help them make informed decisions.”
As the first of these explorations of machine learning approaches, CACR staff are working with OmniSOC engineers and Yang’s team to validate the methodology and test the research prototype’s applicability to SOC workflows, using data OmniSOC aggregates from IU.
“We are excited to work with Jay and his team on these new techniques that have the potential to improve our ability to quickly identify malicious activity,” said OmniSOC Director Tom Davis.
“OmniSOC was established not only to help higher education institutions improve their security posture, but also to directly contribute to their missions of education and research,” Davis said. “Collaborations such as this one help us support research and improve the practice of cybersecurity using new and exciting approaches.”
The team is using a subset of an anonymized parallel feed of IU’s OmniSOC data, piped into a virtualized prototype. The results will be provided to OmniSOC engineers and analysts to determine whether the method has utility for OmniSOC’s workflows. The project aims to catalyze further applied AI research in cybersecurity.
“We planned the project in phases because we weren't sure from the beginning that this would be something that could provide real value because it's still a research prototype,” said Ryan Kiser, a senior security analyst at CACR and one of the researchers involved in the project. “We had to reduce this data down to reduce the risk of using operational data. We determined a way to anonymize data and got approval from the security and policy offices.”
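One common way to anonymize operational security data, consistent with what Kiser describes, is to replace identifiers such as IP addresses with keyed hashes. The sketch below is a hypothetical illustration of that general technique, not the team's actual method; the key name and token format are invented for the example.

```python
import hmac
import hashlib

# Hypothetical secret key; in practice it would be stored securely
# and never shipped alongside the anonymized feed.
SECRET_KEY = b"rotate-this-key"

def anonymize_ip(ip: str) -> str:
    """Replace an IP address with a keyed (HMAC-SHA256) hash token.
    The mapping is consistent, so correlations across records survive,
    but it cannot be reversed without the key."""
    digest = hmac.new(SECRET_KEY, ip.encode(), hashlib.sha256).hexdigest()
    return "ip-" + digest[:12]

record = {"src_ip": "192.0.2.10", "alert": "port scan"}
record["src_ip"] = anonymize_ip(record["src_ip"])
```

Because the same input always yields the same token, an analysis tool can still see that two alerts came from the same source without learning which host that was.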
The first phase was to set up a testbed, get the prototype deployed into the testbed, then start to get the right data from OmniSOC into the prototype. That concluded in early January.
Suricata is a network monitoring and alerting tool used at IU. Kiser said the team wanted to take a subset of the alert data Suricata generates at IU and use it as the basis for an initial, exploratory analysis. The hope is that the approach can ultimately be applied more broadly, to something like full network sensor data.
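Selecting such a subset might look like the sketch below. Suricata emits events as one JSON object per line (its EVE format), where alerts carry `event_type: "alert"` and a severity value in which 1 is the most severe; the specific signatures and addresses here are made up, and the cutoff is an assumed example.

```python
import json

# Two example lines in Suricata's EVE JSON format (one object per line).
# The values are invented; "event_type" and "alert" are real EVE fields.
eve_lines = [
    '{"event_type": "alert", "src_ip": "198.51.100.7",'
    ' "alert": {"signature": "ET SCAN Nmap", "severity": 2}}',
    '{"event_type": "flow", "src_ip": "198.51.100.7"}',
]

def select_alerts(lines, max_severity=2):
    """Keep only alert events at or above a severity cutoff
    (in Suricata's convention, 1 is the most severe)."""
    selected = []
    for line in lines:
        event = json.loads(line)
        if event.get("event_type") == "alert" and event["alert"]["severity"] <= max_severity:
            selected.append(event)
    return selected

subset = select_alerts(eve_lines)
```

Filtering to alert events alone already discards the bulk of the feed (flow records, DNS logs, and so on), which is one way to "reduce this data down" before it reaches a prototype.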
Another tool, Zeek, has more visibility than Suricata into what is flowing over the network. The hope is that once the groundwork is laid with the smaller Suricata dataset, OmniSOC can move to the much larger volume of data Zeek captures and get correspondingly more valuable results.
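Zeek records that richer view as structured logs, by default tab-separated files (such as `conn.log`) whose columns are declared on a `#fields` header line. A minimal parser for that default format is sketched below; the sample records are invented, but the field names shown are standard Zeek connection-log columns.

```python
def parse_zeek_tsv(lines):
    """Parse Zeek's default tab-separated log format: the '#fields' line
    names the columns, other '#'-prefixed lines are metadata, and every
    remaining line is one record."""
    fields, records = [], []
    for line in lines:
        if line.startswith("#fields"):
            fields = line.rstrip("\n").split("\t")[1:]
        elif line.startswith("#"):
            continue  # skip #separator, #types, #close, etc.
        else:
            records.append(dict(zip(fields, line.rstrip("\n").split("\t"))))
    return records

# Invented sample using standard conn.log column names.
sample = [
    "#fields\tts\tid.orig_h\tid.resp_h\tservice",
    "1594000000.1\t10.0.0.8\t93.184.216.34\thttp",
]
conns = parse_zeek_tsv(sample)
```

Unlike an alert feed, these connection records exist for all traffic, malicious or not, which is why the Zeek data volume is so much larger and potentially so much richer.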
“One of the biggest takeaways that I have from this is the way in which it is limited,” said Kiser. “You cannot take a dataset and throw it at a neural network and then have a usable model that you can use to analyze other data. You have to tailor these things to the use case to solve a particular problem.”
Their goal now is to come up with a roadmap for realizing the prototype’s potential. “We're going to write up what we found by the end of July and plot a path forward for Jay’s group and OmniSOC to try to bring it into a real production environment,” Kiser added.