View Submission - EcoSta 2025
A1138
Title: Mutually-exciting point processes and topic modelling of honeypot computer terminal data
Authors: Daniyar Ghani - Imperial College London (United Kingdom) [presenting]
Nick Heard - Imperial College London (United Kingdom)
Francesco Sanna Passino - Imperial College London (United Kingdom)
Abstract: Topic modelling is combined with a mutually-exciting point process to analyze honeypot computer terminal data. Each terminal session is treated as a sequence of text-based commands and assigned a latent topic representing the intent of a cyber-attacker. A natural approach to learning these intents is to use a Dirichlet distribution topic model that groups similar sessions conditional only on the commands. An extension is proposed to incorporate additional session metadata. Sessions are assumed to arrive over time according to a self-exciting and mutually-exciting multivariate Hawkes process, so the occurrence of a session increases the likelihood of related sessions arriving soon after. Source Internet Protocol addresses (IPs) are included as session labels to uncover further structure among threat actors. A flexible inference procedure using Markov chain Monte Carlo is presented to learn the topics and point process parameters, verified for a range of simulation settings. Preliminary results on real-world honeypot data reveal patterns where attacks continue over time under different source IPs but with the same intents, suggesting the presence of threat actors using multiple IPs, or groups of coordinated attackers.
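To illustrate how a self-exciting and mutually-exciting arrival process behaves, the following is a minimal sketch of the conditional intensity of a multivariate Hawkes process. It assumes an exponential decay kernel with a single shared decay rate, which the abstract does not specify; the function name `hawkes_intensity` and the parameters `baseline`, `alpha` and `beta` are hypothetical and are not taken from the authors' model or code.

```python
import math

def hawkes_intensity(t, events, baseline, alpha, beta):
    """Conditional intensity of each process at time t.

    Assumptions (illustrative only, not from the abstract):
      - exponential excitation kernel alpha[j][k] * exp(-beta * (t - s))
      - one decay rate beta shared by every pair of processes

    events   : list of (time, process_index) pairs observed so far
    baseline : background rate of each process
    alpha    : alpha[j][k] is the jump in the intensity of process k
               caused by an event in process j
    """
    rates = list(baseline)
    for s, j in events:
        if s < t:  # only strictly earlier events contribute
            decay = math.exp(-beta * (t - s))
            for k in range(len(rates)):
                rates[k] += alpha[j][k] * decay
    return rates

# Two session types; type 0 excites itself and type 1, type 1 excites only itself.
baseline = [0.1, 0.1]
alpha = [[0.5, 0.2], [0.0, 0.3]]
events = [(1.0, 0)]  # a single session of type 0 at time 1.0

before = hawkes_intensity(1.0, events, baseline, alpha, beta=1.0)
shortly_after = hawkes_intensity(1.5, events, baseline, alpha, beta=1.0)
much_later = hawkes_intensity(50.0, events, baseline, alpha, beta=1.0)
```

In this sketch, the intensities of both session types rise above their baselines shortly after the type 0 session and decay back towards them over time, which is the sense in which one session increases the likelihood of related sessions arriving soon after.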