Unverified Commit dbfe2fcc authored by BARBIER Marc

H2 fix + readme update + 400 fix

parent 09a286a7
# TIPM: Pattern mining and anomaly detection in multi-dimensional time series and event logs
Implementation of _A framework for pattern mining and anomaly detection in multi-dimensional time series and event logs_,
by Len Feremans and Vincent Vercruyssen.
Presented at [New Frontiers in Mining Complex Patterns workshop](http://www.di.uniba.it/~loglisci/NFMCP2019/index.html), at *ECML-PKDD 2019*, the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2019.
......@@ -12,14 +12,14 @@ Abstract:
## Summary
**TIPM** takes *univariate*, *multi-variate* and *mixed-type time series* as input.
Using **TIPM**, end-users can interactively compute an anomaly score for each window without the need for *labels*,
by specifying options for time series representation, pattern mining, pattern reduction, and anomaly detection.
**TIPM** consists of 4 major steps:
1. Preprocessing univariate, multivariate, and mixed-type time series.
2. Mining a (non-redundant) set of *itemsets* and *sequential patterns* from each time series (using [SPMF](https://www.philippe-fournier-viger.com/spmf/)).
3. Computing an anomaly score using a generalisation of [PBAD: Pattern based anomaly detection](http://adrem.uantwerpen.be/bibrem/pubs/pbad.pdf) and [FP-outlier: Frequent pattern based outlier detection](https://www.researchgate.net/profile/Zengyou_He/publication/220117736_FP-outlier_Frequent_pattern_based_outlier_detection/links/53d9dec60cf2e38c63363c05/FP-outlier-Frequent-pattern-based-outlier-detection.pdf) (a sketch of this scoring idea follows the list).
4. Visualising time series, pattern occurrences, labels and predicted anomaly scores.
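
To make step 3 concrete, here is a minimal, hypothetical `Java` sketch of an FP-outlier-style score, not TIPM's exact implementation: each window is scored by the support-weighted fraction of mined patterns it does *not* contain, so windows that match few frequent patterns are ranked as more anomalous.

```java
import java.util.List;
import java.util.Set;

public class PatternScoreSketch {

	/** A mined itemset with its relative support in [0,1]. */
	static class Pattern {
		final Set<String> items;
		final double support;
		Pattern(Set<String> items, double support) {
			this.items = items;
			this.support = support;
		}
	}

	/**
	 * Support-weighted fraction of patterns NOT occurring in the window.
	 * Returns a score in [0,1]; higher means more anomalous.
	 */
	static double anomalyScore(Set<String> window, List<Pattern> patterns) {
		double matched = 0.0, total = 0.0;
		for (Pattern p : patterns) {
			total += p.support;
			if (window.containsAll(p.items)) // the itemset occurs in this window
				matched += p.support;
		}
		return total == 0.0 ? 0.0 : 1.0 - matched / total;
	}
}
```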
......@@ -42,23 +42,24 @@ See demo (slightly older version) [video](https://bitbucket.org/len_feremans/tip
4. Run pattern mining and anomaly detection.
5. Visualise time series, patterns, and anomaly score (including AUC and AP).
## Installation
Remark: the current version was tested with `Java` `jdk1.8.0_60.jdk` and `jdk-9.0.4.jdk`, and `Apache Maven 3.6.3` on `macOS 10.15.2`.
It was also tested with `OpenJDK 11.0.15` and `Maven 3.8.6` on `Arch Linux`.
If you run into any issues, please contact me.
1. Clone the repository
2. The code is implemented in `Java`, using the `Spring` framework for web-application development.
The user interface is programmed in `Javascript`. Use `Maven` to compile and run the webapp.
3. Go to [http://localhost:8080](http://localhost:8080) with your browser.
```bash
cd ~
git clone git@bitbucket.org:len_feremans/tipm_pub.git
mvn clean install spring-boot:run
```
For running the `PBAD` anomaly detection method, `PBAD` must also be installed; it is implemented in `Python` (and `C`, using `Cython`).
Compile and install `PBAD` in the same parent directory as `TIPM`; the name of its folder must be specified in the `Settings.java` file:
```bash
cd ~
......@@ -68,16 +69,16 @@ python setup.py build_ext --inplace
```
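
The exact name of the setting is not shown in this README; below is a hypothetical sketch of what the relevant part of `Settings.java` might look like (field names and values are illustrative, check the actual file before editing):

```java
public class Settings {
	// Folder containing the PBAD checkout, in the same parent directory as TIPM.
	// Hypothetical field name and value -- verify against the real Settings.java.
	public static final String PBAD_FOLDER = "../PBAD";

	// Existing setting used by ProjectRepository; the value here is illustrative.
	public static final String DATA_FILE = "data/projects.xml";
}
```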
## More information for researchers and contributors
The current version is 1.01, last updated in February 2020. The main implementation is written in `Java 1.8`.
For mining closed, maximal, and minimal infrequent itemsets and sequential patterns we depend on the `Java`-based [SPMF](https://www.philippe-fournier-viger.com/spmf/) library.
Java dependencies are specified in `Maven`: `org.springframework.boot==1.1.8`, `com.h2database==1.4.187` (in-memory database), `com.google.guava==18.0`, `org.apache.commons==3.2`, `nz.ac.waikato.cms.weka==3.6.11`, and `xstream==1.2.2`.
Some example datasets are provided in _/data_:
- `univariate` *New york taxi*, *ambient temperature*, and *request latency*. Origin is the [Numenta repository](https://github.com/numenta).
- `multivariate` *Indoor physical exercises* dataset captured using a Microsoft Kinect camera. Origin is [AMIE: Automatic Monitoring of Indoor Exercises](https://dtai.cs.kuleuven.be/software/amie).
## Contributors
- Len Feremans, Adrem Data Labs research group, University of Antwerp, Belgium.
......@@ -100,4 +101,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
......@@ -48,8 +48,7 @@
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<version>1.4.187</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
......
......@@ -96,7 +96,7 @@ public class ArffUtils {
return new Pair<List<String>,List<String>>(lines.subList(0, indexData+1),lines.subList(indexData+1,lines.size()));
}
public static void updateAttributeNames(File input, List<String> newNames) throws IOException{
List<String> types = getAttributeValues(input);
List<String> header = makeHeader("whatever", newNames, types, false);
File tempFile = File.createTempFile("bla","foo");
......@@ -123,25 +123,19 @@ public class ArffUtils {
IOUtils.copy(tempFile, input);
}
private static void copyData(File input, File outputOnlyHeader) throws IOException{
final BufferedWriter writer = new BufferedWriter(new FileWriter(outputOnlyHeader, true));
final GenericSerializerArff serializer = new GenericSerializerArff(input, writer);
streamSpareOrNormalFile(input, serializer::write);
writer.close();
}
public static class GenericSerializerArff{
private boolean sparse;
private List<String> attributes;
private Map<String,Integer> allAtributesMap = new HashMap<>();
private BufferedWriter writer;
public GenericSerializerArff(File inputFile, BufferedWriter writer) throws IOException{
this.sparse = ArffUtils.isSparse(inputFile);
if(sparse){
this.attributes = ArffUtils.getAttributeNames(inputFile);
......
......@@ -3,6 +3,7 @@ package be.uantwerpen.ldataminining.preprocessing;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.sql.Connection;
import java.sql.DriverManager;
......@@ -35,9 +36,10 @@ public class SQLQueryOnArff {
* Output: output csv
* @param arffFile
* @param query
* @return
* @throws IOException
* @throws SQLException
* @throws ClassNotFoundException
*/
public File runQuery(File arffFile, String query) throws IOException, SQLException, ClassNotFoundException{
System.out.println("runQuery");
// start the TCP Server
Server server = Server.createTcpServer().start();
......@@ -49,7 +51,7 @@ public class SQLQueryOnArff {
//create table schema
List<String> attributeNames = ArffUtils.getAttributeNames(arffFile);
List<String> types = ArffUtils.getAttributeValues(arffFile);
final List<String> sqlTypes = new ArrayList<>();
for(String type: types){
sqlTypes.add(arffToSQLType(type));
}
......
......@@ -107,7 +107,7 @@ public class IOUtils {
return count;
}
public static void copy(File f1, File f2) throws IOException{
FileUtils.copyFile(f1,f2);
}
......
......@@ -16,7 +16,7 @@ public abstract class AbstractController {
@Autowired
ProjectRepository repository;
protected FileItem getCurrentItem(HttpServletRequest request, Optional<String> id) {
FileItem currentInput = null;
if(id.isPresent()) {
currentInput = repository.findItemById(id.get().trim());
......@@ -24,6 +24,8 @@ public abstract class AbstractController {
}
MySession session = (MySession) request.getSession().getAttribute("mySession");
if(session != null)
return repository.findItemById(session.getCurrentItem());
else
throw new IllegalStateException("No active session found on the request");
}
}
......@@ -23,15 +23,16 @@ import java.util.TreeSet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;
import org.apache.commons.io.FileUtils;
import org.codehaus.jackson.map.ObjectMapper;
import org.codehaus.jackson.type.TypeReference;
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.server.ResponseStatusException;
import be.uantwerpen.datamining.pattern_mining.MakePatternOccurrences;
import be.uantwerpen.datamining.pattern_mining.MakeWindows;
......@@ -90,7 +91,7 @@ public class TransformController extends AbstractController{
{
FileItem currentInput = getCurrentItem(request, id);
if(!currentInput.isArff())
throw new IllegalArgumentException("Only arff is supported");
try {
File output = getNewFileName(currentInput);
//generate output
......@@ -126,7 +127,7 @@ public class TransformController extends AbstractController{
FileItem currentInput = getCurrentItem(request, id);
if(!currentInput.isArff()) {
System.err.println("Only arff is supported");
throw new IllegalArgumentException("Only arff is supported");
}
try {
System.out.println("Run query:" + query);
......@@ -142,6 +143,27 @@ public class TransformController extends AbstractController{
}
}
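/**
 * Renames a single column of the current arff item.
 * Example invocation (hypothetical column names):
 *   POST /rest/transform/rename?old_name=temp&new_name=temperature
 * Responds with 400 (BAD_REQUEST) if the item is not arff or a name contains spaces.
 */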
@PostMapping(value="/rest/transform/rename")
public @ResponseBody void renameColumn(
@RequestParam("id") Optional<String> id,
@RequestParam("old_name") String oldName,
@RequestParam("new_name") String newName,
HttpServletRequest request) {
FileItem currentInput = getCurrentItem(request, id);
if(!currentInput.isArff() || oldName.contains(" ") || newName.contains(" "))
throw new ResponseStatusException(HttpStatus.BAD_REQUEST, "Only arff is supported and column names must not contain spaces");
try {
List<String> attributeNames = ArffUtils.getAttributeNames(currentInput.getFile());
List<String> newAttributeNames = translateAttributes(oldName + " " + newName, attributeNames);
ArffUtils.updateAttributeNames(currentInput.getFile(), newAttributeNames);
currentInput.getStackOperations().add("renamed "+ oldName + " to " + newName);
} catch(IOException e) {
throw new ResponseStatusException(
HttpStatus.INTERNAL_SERVER_ERROR, "Failed to rename column", e);
}
}
@PostMapping(value="/rest/transform/translate-labels")
public @ResponseBody void translateLabels(
@RequestParam("id") Optional<String> id,
......@@ -150,7 +172,7 @@ public class TransformController extends AbstractController{
{
FileItem currentInput = getCurrentItem(request, id);
if(!currentInput.isArff())
throw new IllegalArgumentException("Only arff is supported");
try {
List<String> attributeNames = ArffUtils.getAttributeNames(currentInput.getFile());
List<String> newAttributeNames = translateAttributes(labelsTranslationTxt, attributeNames);
......@@ -175,7 +197,7 @@ public class TransformController extends AbstractController{
FileItem currentInput = getCurrentItem(request, id);
if(!currentInput.isArff())
throw new IllegalArgumentException("Only arff is supported");
try {
File output = getNewFileName(currentInput);
List<String> names = ArffUtils.getAttributeNames(currentInput.getFile());
......@@ -218,7 +240,7 @@ public class TransformController extends AbstractController{
FileItem currentInput = getCurrentItem(request, id);
if(!currentInput.isArff())
throw new IllegalArgumentException("Only arff is supported");
final int iWindow = Integer.parseInt(window);
final List<String> selectedCols = Arrays.asList(columns.split(",\\s*"));
......@@ -291,7 +313,7 @@ public class TransformController extends AbstractController{
}
if(!currentInput.isArff())
throw new IllegalArgumentException("Only arff is supported");
try {
File output = getNewFileName(currentInput);
List<String> names = ArffUtils.getAttributeNames(currentInput.getFile());
......@@ -425,7 +447,7 @@ public class TransformController extends AbstractController{
FileItem currentInput = getCurrentItem(request, id);
if(!currentInput.isArff())
throw new IllegalArgumentException("Only arff is supported");
try {
File output = getNewFileName(currentInput);
boolean transformed_anything = ArffDenseUtils.transformAnyDateColumns(currentInput.getFile(), output, dateformatFrom, dateformatTo);
......@@ -455,7 +477,7 @@ public class TransformController extends AbstractController{
FileItem currentInput = getCurrentItem(request, id);
if(!currentInput.isArff())
throw new IllegalArgumentException("Only arff is supported");
try {
File output = getNewFileName(currentInput);
......@@ -522,7 +544,7 @@ public class TransformController extends AbstractController{
FileItem otherItem = repository.findItemById(otherItemID);
if(!currentInput.isArff())
throw new IllegalArgumentException("Only arff is supported");
if(!otherItem.isArff())
throw new RuntimeException("Only arff supported to joun with");
......@@ -548,12 +570,13 @@ public class TransformController extends AbstractController{
{
FileItem currentInput = getCurrentItem(request, id);
if(!currentInput.isArff())
throw new IllegalArgumentException("Only arff is supported");
try {
final List<String> columnsList = new ArrayList<>(Arrays.asList(columns.split(",")));
List<String> transformsList = Arrays.asList(transforms.split(","));
File output = getNewFileName(currentInput);
if(transformsList.contains("remove")){
System.out.println(columns);
dropColumns(request, currentInput, columnsList, output);
return;
}
......@@ -586,11 +609,9 @@ public class TransformController extends AbstractController{
columnsList.removeAll(toIgnore);
if(transformsList.contains("normalize")){
normalizeColumns(request, currentInput, columnsList, output, columnData);
}
else if(transformsList.contains("remove_outliers")){
removeOutlierColumns(request, currentInput, output, columnsList, columnData);
}
} catch (Exception e) {
......@@ -739,7 +760,7 @@ public class TransformController extends AbstractController{
if(currentInput == null) return;
if(!currentInput.isArff()) {
throw new IllegalArgumentException("Only arff is supported");
}
try {
System.out.println("Run windows:" + window + "," + increment);
......
......@@ -26,7 +26,7 @@ public class ProjectRepository {
private static File datafile = new File(Settings.DATA_FILE); //no caching for now
private List<Project> projects = new ArrayList<>();
public Project findByName(String name){
load();
......@@ -59,6 +59,7 @@ public class ProjectRepository {
}
public FileItem findItemById(String itemId){
load();
for(Project project: projects){
for(FileItem item: project.getFileItems()){
if(item.getId().equals(itemId))
......@@ -99,7 +100,7 @@ public class ProjectRepository {
private void load(){
if(!datafile.exists())
{
this.projects = new ArrayList<>();
return;
}
try{
......
......@@ -85,7 +85,7 @@ function show_patternset_selection(){
var noPatterns = patternset['noPatterns'];
var option_group = $("<optgroup></optgroup>").attr("label", type + " " + columns + "(#" + noPatterns + ")");
$("#patterns").append(option_group);
$.ajax({ url: "/rest/load-patterns?filename=" + fname, context: document.body, async:false}).done(function(data) {
$.ajax({ url: "/rest/load-patterns?filename=" + encodeURIComponent(fname), context: document.body, async:false}).done(function(data) {
//if only one time series as input: do not show label, e.g. show 'value=7 value=9' as '7 9'
var simplify = columns + '=';
if(columns.indexOf(',') != -1){
......@@ -184,7 +184,7 @@ function load_pattern_occurrences(){
continue;
}
var fname = patternset['filename'];
$.ajax({ url: "/rest/load-pattern-occ?filename=" + fname, context: document.body, async:false}).done(function(data) {
$.ajax({ url: "/rest/load-pattern-occ?filename=" + encodeURIComponent(fname), context: document.body, async:false}).done(function(data) {
//data is a table with Window, PatternId
var rows = data["rows"].slice(1);
for(var j=0; j<rows.length; j++){
......