TIPM: Pattern mining and anomaly detection in multi-dimensional time series and event logs
Implementation of "A framework for pattern mining and anomaly detection in multi-dimensional time series and event logs", by Len Feremans and Vincent Vercruyssen.
Presented at the New Frontiers in Mining Complex Patterns workshop at ECML-PKDD 2019, the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases.
Abstract:
In the present-day, sensor data and textual logs are generated by many devices. Analyzing these time series data leads to the discovery of interesting patterns and anomalies. In recent years, numerous algorithms have been developed to discover interesting patterns in time series data as well as detect periods of anomalous behaviour. However, these algorithms are challenging to apply in real-world settings. We propose a framework, consisting of generic transformations, that allows to combine state-of-the-art time series representation, pattern mining, and pattern-based anomaly detection algorithms. Using an early- or late integration, our framework handles a mix of multi-dimensional continuous series and event logs. Finally, we present an open-source, lightweight, interactive tool that assists both pattern mining and domain experts to select algorithms, specify parameters, and visually inspect the results, while shielding them from the underlying technical complexity of implementing our framework.
Summary
TIPM takes univariate, multivariate and mixed-type time series as input. Using TIPM, end-users can interactively compute an anomaly score for each window, without the need for labels, by specifying options for time series representation, pattern mining, reduction of patterns, and anomaly detection.
TIPM consists of four major steps:
- Preprocessing univariate, multivariate, and mixed-type time series.
- Mining a (non-redundant) set of itemsets and sequential patterns from each time series (using SPMF).
- Computing an anomaly score using a generalisation of "PBAD: Pattern based anomaly detection" and "FP-outlier: Frequent pattern based outlier detection".
- Visualising time series, pattern occurrences, labels and predicted anomaly scores.
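To make the scoring step more concrete, here is a minimal sketch of a frequent-pattern-based anomaly score in the spirit of FP-outlier. It is not TIPM's actual implementation: the class `PatternScoreSketch`, its method names and the toy items are all assumptions made for illustration. Each window is treated as a set of discrete items, and a window that matches few frequent patterns gets a high score.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; names are hypothetical and not part of TIPM.
class PatternScoreSketch {

    /** Convenience helper that builds a set of items. */
    static Set<String> items(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    /** Fraction of windows that contain every item of the pattern (windows must be non-empty). */
    static double support(Set<String> pattern, List<Set<String>> windows) {
        int hits = 0;
        for (Set<String> window : windows) {
            if (window.containsAll(pattern)) {
                hits++;
            }
        }
        return (double) hits / windows.size();
    }

    /**
     * Score in [0, 1]: one minus the summed support of the patterns that occur
     * in the window, divided by the total number of patterns.
     */
    static double anomalyScore(Set<String> window, List<Set<String>> patterns,
                               List<Set<String>> windows) {
        if (patterns.isEmpty()) {
            return 0.0; // assumption: without patterns nothing can be flagged
        }
        double matchedSupport = 0.0;
        for (Set<String> pattern : patterns) {
            if (window.containsAll(pattern)) {
                matchedSupport += support(pattern, windows);
            }
        }
        return 1.0 - matchedSupport / patterns.size();
    }

    public static void main(String[] args) {
        List<Set<String>> windows = Arrays.asList(
                items("temp=low", "load=low"),
                items("temp=low", "load=low"),
                items("temp=low", "load=low"),
                items("temp=high", "load=low"));
        List<Set<String>> patterns = Arrays.asList(
                items("temp=low"), items("load=low"), items("temp=low", "load=low"));
        for (Set<String> window : windows) {
            System.out.println(window + " -> " + anomalyScore(window, patterns, windows));
        }
    }
}
```

In this toy example the fourth window matches only one of the three patterns, so it scores about 0.67, while the other windows score about 0.17. No labels are needed at any point.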
Framework
TIPM (Time Series Pattern Mining) is an open-source, web-based application. It can import any dataset that contains a datetime column and at least one value column, either continuous or discrete. TIPM visualises the histogram and summary statistics for each column and lets users transform continuous time series using our framework. For subsequent pattern mining and anomaly detection, we apply generic transformations and existing pattern mining algorithms. For visualisation, TIPM can plot continuous time series values, discrete event logs, labels, and segmentation at different levels of time granularity (raw, hourly, daily, yearly, etc.). To validate the pattern mining, we can render pattern occurrences and anomaly scores. After each transformation, TIPM saves intermediate files, so end-users can undo any transformation.
Most transformations in our framework are implemented using streaming techniques: only a small set of rows is loaded at a time, instead of all data being loaded into main memory. Because data is loaded and processed in a streaming, or paginated, way, the interface and many preprocessing and postprocessing transformations can handle large time series with millions of samples. For pattern mining, resources can be managed by setting the support threshold to a relatively high value and by choosing appropriate transformations during preprocessing.
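The streaming idea can be sketched as follows. This is a hedged illustration and not TIPM's code: the class `StreamingStatsSketch` and its methods are hypothetical, and the CSV parsing is deliberately naive (comma-separated, one header row, no quoted fields). The point is that summary statistics for one column can be computed while holding only one row in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Illustrative sketch only; names are hypothetical and not part of TIPM.
class StreamingStatsSketch {

    /** Running count, minimum, maximum and mean; memory use is constant. */
    static class Stats {
        long count;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double mean;

        void add(double value) {
            count++;
            min = Math.min(min, value);
            max = Math.max(max, value);
            mean += (value - mean) / count; // incremental mean update
        }
    }

    /** Reads CSV rows one at a time, skips the header, and summarises one column. */
    static Stats summarise(Reader source, int column) {
        Stats stats = new Stats();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line = reader.readLine(); // header row, ignored
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] cells = line.split(",", -1);
                if (column >= cells.length || cells[column].trim().isEmpty()) {
                    continue; // skip rows with a missing value
                }
                stats.add(Double.parseDouble(cells[column].trim()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stats;
    }

    public static void main(String[] args) {
        String csv = "Time,value\n1,10.0\n2,14.0\n3,12.0\n";
        Stats stats = summarise(new StringReader(csv), 1);
        System.out.println(stats.count + " rows, min=" + stats.min
                + ", max=" + stats.max + ", mean=" + stats.mean);
    }
}
```

For a real file, the `StringReader` would be replaced by a `FileReader`; the loop itself does not change, which is why the approach scales to series with millions of samples.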
Workflow of our framework: 
Usage
See the demo video (recorded with a slightly older version).

- Upload a CSV file containing a multivariate time series.
  - Two special fields are Label and the first column, which is assumed to be a Time column, coded either as an integer or in ISO datetime format.
- Transform continuous time series using discretisation and a sliding window.
- Compute pattern mining and anomaly detection.
- Visualise time series, patterns, and anomaly score (including AUC and AP).
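The transformation step above can be sketched as follows. This is a minimal illustration under stated assumptions, not TIPM's implementation: we assume equal-width binning as the discretisation (TIPM may offer other representations), and the class `DiscretiseWindowSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names are hypothetical and not part of TIPM.
class DiscretiseWindowSketch {

    /** Equal-width binning: maps each value to a bin index in [0, bins - 1]. */
    static int[] discretise(double[] values, int bins) {
        if (bins <= 0) {
            throw new IllegalArgumentException("bins must be positive");
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double value : values) {
            min = Math.min(min, value);
            max = Math.max(max, value);
        }
        double width = (max - min) / bins;
        int[] symbols = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            int bin = width == 0 ? 0 : (int) ((values[i] - min) / width);
            symbols[i] = Math.min(bin, bins - 1); // the maximum falls in the last bin
        }
        return symbols;
    }

    /** Cuts the symbol sequence into windows of the given length, moving forward by step. */
    static List<int[]> slidingWindows(int[] symbols, int length, int step) {
        if (length <= 0 || step <= 0) {
            throw new IllegalArgumentException("length and step must be positive");
        }
        List<int[]> windows = new ArrayList<>();
        for (int start = 0; start + length <= symbols.length; start += step) {
            windows.add(Arrays.copyOfRange(symbols, start, start + length));
        }
        return windows;
    }

    public static void main(String[] args) {
        double[] values = {0.0, 1.0, 2.0, 9.0, 10.0, 5.0};
        int[] symbols = discretise(values, 2);
        System.out.println("symbols: " + Arrays.toString(symbols));
        for (int[] window : slidingWindows(symbols, 3, 1)) {
            System.out.println("window: " + Arrays.toString(window));
        }
    }
}
```

Each window of symbols can then be treated as a small transaction or sequence, which is the form that itemset and sequential pattern miners expect. The window length and step control how many windows receive an anomaly score.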
Installation
Remark: The current version was tested with Java jdk1.8.0_60.jdk and jdk-9.0.4.jdk, and Apache Maven 3.6.3 on macOS 10.15.2.
It was also tested with Java OpenJDK 11.0.15 and Maven 3.8.6 on Arch Linux.
If you have any issues, please contact me.
- Clone the repository.
- The code is implemented in Java, based on the Spring framework for web-application development. The user interface is programmed in JavaScript. Use Maven to compile and run the webapp.
- Go to http://localhost:8080 with your browser.
cd ~
git clone git@bitbucket.org:len_feremans/tipm_pub.git
cd tipm_pub
mvn clean install spring-boot:run
To run the PBAD anomaly detection method, PBAD must also be installed. PBAD is implemented in Python (and C, using Cython).
Compile and install PBAD in the same parent directory as TIPM. The name of the folder has to be specified in the Settings.java file:
cd ~
git clone git@bitbucket.org:len_feremans/pbad.git
cd pbad/src/utils/cython_utils/
python setup.py build_ext --inplace
More information for researchers and contributors
The current version is 1.01, last updated in February 2020. The main implementation is written in Java 1.8.
For mining closed, maximal and minimal infrequent itemsets and sequential patterns, we depend on the Java-based SPMF library.
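To illustrate what "closed" and "maximal" mean, here is a brute-force sketch on a toy transaction database. It does not use SPMF and is far too slow for real data: the class `ClosedMaximalSketch` and its methods are hypothetical, itemsets are encoded as bitmasks, and minimal infrequent itemsets are not covered. A frequent itemset is closed if no proper superset has the same support, and maximal if no proper superset is frequent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; names are hypothetical and this is not SPMF's API.
class ClosedMaximalSketch {

    /** Number of transactions (bitmasks) that contain all items of the itemset. */
    static int support(int itemset, int[] transactions) {
        int count = 0;
        for (int transaction : transactions) {
            if ((transaction & itemset) == itemset) {
                count++;
            }
        }
        return count;
    }

    /** Enumerates every non-empty itemset over nItems items and keeps the frequent ones. */
    static Map<Integer, Integer> frequent(int[] transactions, int nItems, int minSupport) {
        Map<Integer, Integer> result = new TreeMap<>();
        for (int itemset = 1; itemset < (1 << nItems); itemset++) {
            int s = support(itemset, transactions);
            if (s >= minSupport) {
                result.put(itemset, s);
            }
        }
        return result;
    }

    static boolean isProperSuperset(int larger, int smaller) {
        return larger != smaller && (larger & smaller) == smaller;
    }

    /** Frequent itemsets without a proper superset of equal support. */
    static List<Integer> closed(Map<Integer, Integer> frequent) {
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> candidate : frequent.entrySet()) {
            boolean isClosed = true;
            for (Map.Entry<Integer, Integer> other : frequent.entrySet()) {
                if (isProperSuperset(other.getKey(), candidate.getKey())
                        && other.getValue().intValue() == candidate.getValue().intValue()) {
                    isClosed = false;
                }
            }
            if (isClosed) {
                result.add(candidate.getKey());
            }
        }
        return result;
    }

    /** Frequent itemsets without any frequent proper superset. */
    static List<Integer> maximal(Map<Integer, Integer> frequent) {
        List<Integer> result = new ArrayList<>();
        for (int candidate : frequent.keySet()) {
            boolean isMaximal = true;
            for (int other : frequent.keySet()) {
                if (isProperSuperset(other, candidate)) {
                    isMaximal = false;
                }
            }
            if (isMaximal) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Items: a = bit 0, b = bit 1, c = bit 2. Transactions: {a,b,c}, {a,b,c}, {a,b}, {a}.
        int[] transactions = {0b111, 0b111, 0b011, 0b001};
        Map<Integer, Integer> frequent = frequent(transactions, 3, 2);
        System.out.println("frequent (mask=support): " + frequent);
        System.out.println("closed masks:  " + closed(frequent));
        System.out.println("maximal masks: " + maximal(frequent));
    }
}
```

In this example all seven itemsets are frequent, but only {a}, {a,b} and {a,b,c} are closed and only {a,b,c} is maximal. This shows why mining a non-redundant pattern set keeps the number of patterns manageable.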
Java dependencies are specified in Maven and are org.springframework.boot==1.1.8, com.h2database==1.4.187 (in-memory database), com.google.guava==18.0, org.apache.commons==3.2, nz.ac.waikato.cms.weka==3.6.11 and xstream==1.2.2.
Some example datasets are provided in /data:
- univariate: New York taxi, ambient temperature, and request latency. Origin is the Numenta repository.
- multivariate: Indoor physical exercises dataset captured using a Microsoft Kinect camera. Origin is AMIE: Automatic Monitoring of Indoor Exercises.
Contributors
- Len Feremans, Adrem Data Labs research group, University of Antwerp, Belgium.
Licence
Copyright (c) [2019] [Len Feremans]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.