Understanding cyber attacks: a keystroke in a haystack

13th July 2017

Network topology graphs. Red: Compromised source computer. Green: normal network.

The recent spate of global cyber attacks has shifted the world’s focus firmly onto cyber security issues. Attacks such as the WannaCry ransomware attack, which caused parts of the UK’s National Health Service (NHS) computer system to shut down, serve as a sobering reminder of the threat these attacks can pose to our everyday lives.

A key factor in halting such attacks is the ability to identify suspicious behaviour in its infancy and prevent its spread. However, the vast volume of data available to analysts makes this an extremely challenging problem. In one day, WannaCry was reported to have infected more than 230,000 computers in over 150 countries (source: https://en.wikipedia.org/wiki/WannaCry_ransomware_attack)!

Working with a data set of 18 billion daily events, of which less than 0.000075% were defined as ‘red team events’ (that is, cyber attacks), a team of students from Imperial College London found that one way to work through this ‘haystack’ of information is to group similar users by analysing their attributes.

The team compared the attributes of users affected by red team events with those of users who were not, and identified patterns and common attributes across users. This allowed them to group users into ‘families’ and filter out standard behaviour, leaving only unexpected behaviour. Although it would need to be used in parallel with analysts’ existing tools, this approach could be a powerful aid in sifting through the vast amounts of data they work with.
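To make the idea of ‘families’ concrete, here is a minimal Python sketch. It is not the team’s actual method, which the article does not describe in detail. It assumes, purely for illustration, that each user has a small dictionary of attributes (the names `department` and `role` are hypothetical), that users with identical attributes form a family, and that an action counts as standard if at least half of a family’s members have performed it.

```python
from collections import defaultdict


def build_families(user_attributes):
    """Group users that share exactly the same attribute values.

    user_attributes maps a user name to a dict of attributes
    (hypothetical names; the source does not list the real ones).
    """
    families = defaultdict(set)
    for user, attrs in user_attributes.items():
        key = tuple(sorted(attrs.items()))
        families[key].add(user)
    return families


def unexpected_events(events, user_attributes, min_share=0.5):
    """Return events whose action is rare within the user's family.

    events is a list of (user, action) pairs; every user is assumed
    to appear in user_attributes. min_share is an illustrative
    threshold, not a value taken from the study.
    """
    families = build_families(user_attributes)
    family_of = {u: key for key, members in families.items() for u in members}

    # For each (family, action), record which members performed it.
    users_per_action = defaultdict(set)
    for user, action in events:
        users_per_action[(family_of[user], action)].add(user)

    flagged = []
    for user, action in events:
        key = family_of[user]
        share = len(users_per_action[(key, action)]) / len(families[key])
        if share < min_share:
            flagged.append((user, action))
    return flagged
```

With three finance analysts who all log in normally and one of whom also performs an administrative login, only that administrative login is returned. The same action by a user whose family does it routinely is filtered out as standard behaviour.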

Additionally, the team were able to visualise network topology graphs for compromised computers and compare them with those of unaffected computers. They found that it is possible to observe changes in connectivity and to distinguish anomalous connections from normal ones.
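The comparison of connectivity can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the team’s visualisation: it treats the network as a list of (source, destination) pairs, and the helper names `adjacency` and `connectivity_changes` are hypothetical. It reports, for each source computer, any destinations that appear in the observed period but not in the baseline.

```python
from collections import defaultdict


def adjacency(connections):
    """Build a source -> set of destinations map from (src, dst) pairs."""
    graph = defaultdict(set)
    for src, dst in connections:
        graph[src].add(dst)
    return graph


def connectivity_changes(baseline, observed):
    """List sources that contact destinations absent from the baseline."""
    before = adjacency(baseline)
    after = adjacency(observed)
    changes = {}
    for src, peers in after.items():
        known = before.get(src, set())
        new_peers = peers - known
        if new_peers:
            changes[src] = {
                "new_peers": sorted(new_peers),
                "degree_before": len(known),
                "degree_after": len(peers),
            }
    return changes
```

A computer that normally talks only to a file server and suddenly contacts several other workstations would show up here with a jump in its number of connections, which is the kind of change in connectivity the graphs make visible.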

This research was undertaken through the 2017 Data Spark scheme in collaboration with BT Research and KPMG.

Team members: Salkha Baraba (MSc Business Analytics), George Chatzaras (Full-time MBA), Anshu Grover (Full-time MBA), Qile Huang (MSc Computing), Christina Tatli (MSc Business Analytics), Xin Yu (MSc Computing)
Academic mentor: Dr David Birch, Data Science Institute, Imperial College London
Business mentors: Alex Healing, Chief Researcher, BT Research, and Tom Burton, Cyber Security Director, KPMG

Cyber Security Data Spark team (l-r) Georgios Chatzaras (FT MBA), Anshu Grover (FT MBA), Salkha Baraba (MSc Business Analytics), Christina Tatli (MSc Business Analytics), Qile Huang (MSc Computer Science), Xin Yu (MSc Computer Science).
