ABSTRACT
In this work we propose a novel approach to anomaly detection in streaming communication data. We first build a stochastic model for the system based on temporal communication patterns across each edge, which we call the REWARDS (REneWal theory Approach for Real-time Data Streams) model. We then define a measure of anomaly for an arbitrary subgraph based on the likelihood of its recent activity given past behavior. Finally, we develop an algorithm to efficiently identify subgraphs with the most anomalous activity. Although our work has until now focused on the cybersecurity domain, the model we present is more broadly applicable to information retrieval in data streams and information networks.
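The three steps in the abstract (a per-edge stochastic model, a likelihood-based anomaly measure for a subgraph, and a search for the most anomalous subgraphs) can be illustrated with a minimal sketch. This is not the REWARDS model itself: it assumes the simplest renewal process, one with exponential inter-arrival times, so that event counts in a window are Poisson distributed. It also scores a subgraph by summing negative log-likelihoods over its edges, treating edges as independent. The function names `fit_rates`, `edge_surprise`, and `subgraph_score` are hypothetical, and the search over subgraphs is omitted.

```python
import math
from collections import defaultdict


def fit_rates(events):
    """Estimate a per-edge event rate from (src, dst, timestamp) records.

    Assumption: inter-arrival times on each edge are exponential, the
    simplest renewal process. The abstract does not name a distribution.
    """
    times = defaultdict(list)
    for src, dst, t in events:
        times[(src, dst)].append(t)

    rates = {}
    for edge, ts in times.items():
        ts.sort()
        gaps = [b - a for a, b in zip(ts, ts[1:])]
        total = sum(gaps)
        if gaps and total > 0:
            # Maximum-likelihood estimate of the exponential rate.
            rates[edge] = len(gaps) / total
    return rates


def edge_surprise(rate, count, window):
    """Negative log-likelihood of `count` events in `window` time units.

    Under the exponential assumption the count is Poisson(rate * window).
    """
    mu = rate * window
    log_p = count * math.log(mu) - mu - math.lgamma(count + 1)
    return -log_p


def subgraph_score(edges, rates, recent_counts, window):
    """Sum edge surprises over a subgraph, treating edges as independent.

    Edges with no fitted rate are skipped in this sketch.
    """
    return sum(
        edge_surprise(rates[e], recent_counts.get(e, 0), window)
        for e in edges
        if e in rates
    )
```

For example, an edge with one event per time unit in its history gets a rate of 1.0. A recent window of length 1 containing ten events on that edge then scores far higher than a window containing one event.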
Index Terms
- The early bird gets the buzz: detecting anomalies and emerging trends in information networks