Deep-Dive

Using AI-Emulated attackers

To simulate cyber attack strategies and perform Pathfinding

James Korge

Jan 11th, 2024

One of the primary challenges confronting cybersecurity administrators is addressing the multitude and variety of entry, traversal, and attack strategies hackers use to penetrate IT infrastructure. Understanding these "attack paths" - routes from a point of compromise to an attractive and valuable target in an organization's network - is critical to devising a strategy that effectively blocks attacks.

Using attack paths to analyze a network's susceptibility requires determining not only how a malicious actor can enter the network but also how that actor would move through it and what resources they would access in doing so. This is complicated by the fact that even a mid-size organization with a few hundred devices may exhibit tens of thousands of possible attack paths.

Manually analyzing each step of an attack and estimating a potential attacker's favored route(s) is a data- and computation-intensive effort, which makes it an ideal task for a trained artificial intelligence system. Reveald's Epiphany Intelligence Platform addresses these issues by not only automating the discovery of attack paths but also prioritizing them based on the likelihood that each path would be used in a real attack.

Background

Attack Paths

An attack path comprises three primary components: a foothold, a target, and a set of movements connecting the two. A foothold can be any tangible asset in a network, including users, workstations, and other physical infrastructure. Typically, footholds susceptible to social engineering are most likely to provide an attacker with an opportunity to enter a network. Targets are more subjective than footholds. These are assets of value to the attacker (and likely the organization), and they range from application servers hosting collections of attractive data, to workstations with sensitive data, to systems that act as powerful traversal points due to their privileged access. Identifying footholds and targets is just the first step toward mitigating attack paths. More challenging is enumerating the movements an attacker would make to obtain control over the target asset after securing a foothold in the network.
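To make the three components concrete, here is a minimal sketch that enumerates every cycle-free path between a foothold and a target in a toy network. The asset names, the `NETWORK` adjacency map, and the `enumerate_attack_paths` helper are all hypothetical and chosen only for illustration; this is a plain depth-first search, not Epiphany's method.

```python
from typing import Dict, List

# Hypothetical toy network: each asset maps to the assets reachable from it.
NETWORK: Dict[str, List[str]] = {
    "user_alice": ["workstation_1"],
    "workstation_1": ["file_server", "jump_host"],
    "jump_host": ["db_server"],
    "file_server": ["db_server"],
    "db_server": [],
}


def enumerate_attack_paths(
    graph: Dict[str, List[str]], foothold: str, target: str
) -> List[List[str]]:
    """Return every cycle-free path from foothold to target (depth-first)."""
    paths: List[List[str]] = []
    stack: List[List[str]] = [[foothold]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # never revisit an asset on the same path
                stack.append(path + [nxt])
    return paths


paths = enumerate_attack_paths(NETWORK, "user_alice", "db_server")
for p in paths:
    print(" -> ".join(p))
```

Even in this five-asset example there is more than one route to the target. Exhaustive enumeration of this kind grows quickly with network size, which is why some form of prioritization becomes necessary.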

Reinforcement Learning

A common approach to automation is to leverage reinforcement learning. This is a machine learning paradigm comprising an agent, an environment, and a reward function. The agent is a process that can iteratively update a decision-making policy based on previous experience. The policy can take several forms from hash-tables to deep neural networks. The environment is a collection of states that an agent can occupy; it also imposes constraints on the moves the agent can make. The reward function is an algorithmic model for measuring the quality of the agent's decisions. Together, these components define a robust framework for navigating an arbitrary state-space. Careful construction of such an agent results in effective mimicry of a purpose-driven actor.
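As a rough illustration of how agent, environment, and reward function fit together, the sketch below runs tabular Q-learning (a standard off-policy algorithm) on a four-asset graph. The graph, the reward values, and the hyperparameters are assumptions made up for this example; the policy here is the simplest form mentioned above, a hash table.

```python
import random
from collections import defaultdict

# Environment (hypothetical): states are assets, edges are allowed moves.
GRAPH = {
    "foothold": ["workstation", "printer"],
    "workstation": ["server"],
    "printer": [],
    "server": [],
}
TARGET = "server"


def reward(next_state: str) -> float:
    """Assumed reward: a bonus for reaching the target, a small cost otherwise."""
    return 10.0 if next_state == TARGET else -1.0


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; returns a hash-table policy keyed by (state, action)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = "foothold"
        while GRAPH[state]:  # an episode ends when no moves remain
            actions = GRAPH[state]
            if rng.random() < epsilon:  # explore
                action = rng.choice(actions)
            else:  # exploit the current estimate
                action = max(actions, key=lambda a: q[(state, a)])
            # Best value obtainable from the next state (0 if it is terminal).
            future = max((q[(action, a)] for a in GRAPH[action]), default=0.0)
            td_target = reward(action) + gamma * future
            q[(state, action)] += alpha * (td_target - q[(state, action)])
            state = action
    return q


q_table = train()
best_first_move = max(GRAPH["foothold"], key=lambda a: q_table[("foothold", a)])
print(best_first_move)
```

After training, the table values the move toward the workstation above the dead-end printer, because only the former leads on to the target.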

AI Pathfinding

Epiphany combines a source-agnostic IT data model with an appropriate reinforcement learning implementation, resulting in an automated pathfinding system that can analyze network graphs in minutes and produce a prioritized list of attack paths for remediation. We term these pathfinders “EvilBots” because they replicate the behavior of motivated cyber attackers, entering and traversing a target's network in pursuit of access or information, while working benignly on behalf of the legitimate users of the system they investigate.

The backbone of this implementation is Epiphany’s expert-curated reward function. This algorithm is designed to reinforce virtual agent policies that target highly susceptible footholds, make efficient movements through the network, and obtain control over both their targets and other valuable assets accessible along their trajectory.
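Epiphany's actual reward function is not described in detail here, so the following is only a sketch of how the three stated goals (susceptible footholds, efficient movement, valuable assets) could be combined into a single score. The `Asset` fields, `STEP_COST`, and `TARGET_BONUS` are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Asset:
    name: str
    susceptibility: float  # assumed 0..1: how easily the asset is compromised
    value: float           # assumed 0..1: how attractive the asset is
    is_target: bool = False


STEP_COST = 0.1     # hypothetical penalty per movement, rewards short paths
TARGET_BONUS = 1.0  # hypothetical bonus for gaining control of a target


def step_reward(asset: Asset, is_foothold: bool) -> float:
    """Score a single step onto `asset`."""
    score = asset.value  # any valuable asset along the way counts
    if is_foothold:
        score += asset.susceptibility  # prefer easy entry points
    else:
        score -= STEP_COST  # every further movement costs something
    if asset.is_target:
        score += TARGET_BONUS
    return score


def path_reward(path: List[Asset]) -> float:
    """Total reward for a path whose first element is the foothold."""
    return sum(step_reward(a, i == 0) for i, a in enumerate(path))


user = Asset("user", susceptibility=0.8, value=0.0)
printer = Asset("printer", susceptibility=0.2, value=0.0)
db = Asset("db_server", susceptibility=0.1, value=0.9, is_target=True)

short_path = [user, db]
long_path = [user, printer, db]
print(path_reward(short_path), path_reward(long_path))
```

Under these assumed weights, the direct route scores higher than the detour through a worthless asset, which is the kind of preference for efficient movement described above.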

With that in hand, virtual agents are initialized with state-of-the-art deep neural networks that model decision-making policies. These models read in vector representations of the states available to the agent and produce a probability distribution over those states, estimating how likely each one is to be traversed in a real attack. In the context of network graphs, the states are network assets such as users, workstations, servers, IAM objects, etc.
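The mapping from state vectors to a probability distribution can be sketched with a softmax over per-state scores. In the sketch below, a hypothetical linear scorer (`linear_score` with made-up `WEIGHTS`) stands in for the deep neural network, and the two-dimensional feature vectors are invented for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_state_distribution(
    candidates: Dict[str, Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
) -> Dict[str, float]:
    """Score each candidate state's vector and normalize across candidates."""
    scores = [score_fn(vec) for vec in candidates.values()]
    return dict(zip(candidates, softmax(scores)))


# Hypothetical stand-in for a trained network: a fixed linear scorer.
WEIGHTS = [2.0, 1.0]


def linear_score(vec: Sequence[float]) -> float:
    return sum(w * x for w, x in zip(WEIGHTS, vec))


# Hypothetical feature vectors for the states reachable from the current asset.
candidates = {
    "workstation": [0.9, 0.2],
    "printer": [0.1, 0.0],
}
dist = next_state_distribution(candidates, linear_score)
print(dist)
```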

Training these models follows a basic (off-policy) reinforcement learning loop. The agent uses its (initially untrained) model to choose a foothold, then continues choosing states until it reaches a target. At each step, it receives some reward. This experience is saved and sampled at regular intervals to train the neural networks. Training is accelerated through prioritized experience replay[1], whereby trajectories that offer greater insight are sampled more frequently. After sufficient exploration, the agent switches to evaluating the network.
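The proportional variant of prioritized experience replay in [1] samples experience i with probability P(i) = p_i^α / Σ_k p_k^α, where p_i is its priority (typically derived from the TD error). A minimal sketch of that sampling rule follows; the class name, the example experiences, and the priority values are hypothetical, and the importance-sampling correction from the paper is omitted for brevity.

```python
import random
from typing import Any, List


class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay (no importance-sampling weights)."""

    def __init__(self, alpha: float = 0.6, seed: int = 0) -> None:
        self.alpha = alpha  # 0 gives uniform sampling, 1 gives fully proportional
        self.items: List[Any] = []
        self.priorities: List[float] = []
        self.rng = random.Random(seed)

    def add(self, experience: Any, priority: float) -> None:
        self.items.append(experience)
        self.priorities.append(priority)

    def probabilities(self) -> List[float]:
        """P(i) = p_i**alpha / sum_k p_k**alpha."""
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, k: int) -> List[Any]:
        return self.rng.choices(self.items, weights=self.probabilities(), k=k)


buffer = PrioritizedReplayBuffer()
buffer.add("routine_step", priority=0.1)      # small TD error, little to learn
buffer.add("surprising_step", priority=10.0)  # large TD error, more insight
batch = buffer.sample(1000)
print(batch.count("surprising_step"), batch.count("routine_step"))
```

A production implementation would usually store priorities in a sum-tree for efficient sampling and update them after each training step; the list-based version here favors readability.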

To generate viable attack paths, the agent runs the same process as it did during training. The key difference is that the agent now explores all options, in order, and records each path it traverses. The result of this process is an ordered set of realistic attack paths. With these in hand, cybersecurity administrators are able to prioritize their work to remediate the issues that present the highest risk to their network.
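One way to read "explores all options, in order" is as a best-first traversal that always extends the most probable partial path, so that complete paths come out ranked from most to least likely. The sketch below implements that interpretation with a priority queue. The `EDGE_PROB` table is a hypothetical stand-in for the trained model's output, and this may differ from how Epiphany actually orders its exploration.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

# Hypothetical model output: probability of moving from one asset to another.
EDGE_PROB: Dict[str, Dict[str, float]] = {
    "user": {"workstation": 0.7, "laptop": 0.3},
    "workstation": {"server": 0.9, "laptop": 0.1},
    "laptop": {"server": 1.0},
    "server": {},
}


def ranked_attack_paths(
    edge_prob: Dict[str, Dict[str, float]], foothold: str, target: str
) -> Iterator[Tuple[float, List[str]]]:
    """Yield (probability, path) pairs from most to least likely.

    Because every edge probability is at most 1, extending a path can never
    raise its probability, so popping the best partial path first yields
    complete paths in non-increasing order.
    """
    heap: List[Tuple[float, List[str]]] = [(-1.0, [foothold])]
    while heap:
        neg_p, path = heapq.heappop(heap)  # most probable partial path
        node = path[-1]
        if node == target:
            yield -neg_p, path
            continue
        for nxt, p in edge_prob[node].items():
            if nxt not in path:  # keep paths cycle-free
                heapq.heappush(heap, (neg_p * p, path + [nxt]))


ranked = list(ranked_attack_paths(EDGE_PROB, "user", "server"))
for prob, path in ranked:
    print(f"{prob:.2f}  {' -> '.join(path)}")
```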

Outcomes

Leveraging Epiphany's EvilBots saves analysts' time by automating the discovery of attack paths; saves administrators' time by prioritizing those paths for remediation; and saves the organization money by enabling both to anticipate and mitigate attacks early and often. Moreover, these savings compound: organizations can invest these resources in hardening their network with the comfort of knowing Epiphany will be there to double-check their work, and they can repeat the investigation without any additional setup, making continuous validation a trivial activity.

[1] Schaul, T., Quan, J., Antonoglou, I., & Silver, D. (2015). Prioritized Experience Replay (Version 4). arXiv. https://doi.org/10.48550/ARXIV.1511.05952

James Korge | Senior Data Scientist at Reveald Inc.

James acts as Reveald's Senior Data Scientist, using AI to drive material risk reduction for all our customers against a host of real cyber attackers while ensuring stable, secure access to any and all data Reveald's analysts need to deliver top-quality service to all of our customers. When he's not doing that, you can find James bouldering at his local gym, spending some quality time with his dog Cersei, or checking out NYC's hidden restaurant gems.
