Update data-mining doc authored by VERNEREY Charles
......@@ -2,6 +2,45 @@
Data mining with Choco Solver.
## Read a transactional database
A transactional database is a file of the following format:
```
1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5
```
In the example above, `{1,2,3,4,5}` is the set of items. Each line of the file represents a transaction. To read the database, you can use the following line of code:
```java
Database database = new DatReader(path, 0, true).readFiles();
```
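To make the file format concrete, here is a minimal plain-Java sketch that parses the same text layout without using `DatReader`. The class `TransactionFormatSketch` and its methods are hypothetical names introduced for illustration only; they are not part of the library and do not claim to reproduce how `DatReader` works internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper for illustration; NOT part of the library.
final class TransactionFormatSketch {

    // Parses the text of a transactional database: one transaction per line,
    // items separated by whitespace. Blank lines are skipped.
    static List<int[]> parse(String text) {
        List<int[]> transactions = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] tokens = trimmed.split("\\s+");
            int[] items = new int[tokens.length];
            for (int i = 0; i < tokens.length; i++) {
                items[i] = Integer.parseInt(tokens[i]);
            }
            transactions.add(items);
        }
        return transactions;
    }

    // Collects the distinct items of all transactions in increasing order.
    static TreeSet<Integer> items(List<int[]> transactions) {
        TreeSet<Integer> result = new TreeSet<>();
        for (int[] transaction : transactions) {
            for (int item : transaction) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The five transactions of the example database above.
        List<int[]> db = parse("1 3 4\n2 3 5\n1 2 3 5\n2 5\n1 2 3 5\n");
        System.out.println(db.size() + " transactions, items = " + items(db));
    }
}
```

Run on the example database, the sketch finds five transactions over the item set `{1,2,3,4,5}`.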
The `DatReader` constructor takes 3 arguments:
- `path`: a String which represents the path of the transactional database
- `nbValueFiles`: an Integer which represents the number of value files that we want to read (in this example, 0)
- `noClasses`: a Boolean which is true if we want to ignore the class of the transactions (in this example, true)
If the `noClasses` argument is set to false, the first item of each transaction is considered to be its class item, and class items are ignored during the mining. In the example above, the class items are `{1,2}`.
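The class-item convention can be illustrated with a short plain-Java sketch. It only mirrors the description above (first item is the class, the rest are the items to mine); `ClassItemSketch` and its methods are hypothetical names, and the code makes no claim about how `DatReader` stores classes internally.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical helper for illustration; NOT part of the library.
final class ClassItemSketch {

    // The class item is the first item of the transaction.
    static int classItem(int[] transaction) {
        return transaction[0];
    }

    // The remaining items are the ones that take part in the mining.
    static int[] minedItems(int[] transaction) {
        return Arrays.copyOfRange(transaction, 1, transaction.length);
    }

    // Collects the distinct class items of a database in increasing order.
    static TreeSet<Integer> classItems(int[][] transactions) {
        TreeSet<Integer> result = new TreeSet<>();
        for (int[] transaction : transactions) {
            result.add(classItem(transaction));
        }
        return result;
    }

    public static void main(String[] args) {
        // The example database from the beginning of this section.
        int[][] db = {{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}, {1, 2, 3, 5}};
        System.out.println("class items = " + classItems(db));
        System.out.println("mined items of first transaction = "
                + Arrays.toString(minedItems(db[0])));
    }
}
```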
If `nbValueFiles` is set to an Integer n > 0, then we read the files `*.val0`, `*.val1`, ..., `*.valn`, where each line of such a file represents the value of an item. See the `data` directory for examples of value files.
The items do not have to be consecutive Integers. For example, the following database can be read with `DatReader`:
```
1 2 5
2 3
```
In this example, the items are `{1,2,3,5}`. The first line of the value file represents the value of item 1, the second line the value of item 2, the third the value of item 3 and the fourth the value of item 5.
However, for the reader to work correctly, class items must have smaller values than non-class items.
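The correspondence between value-file lines and non-consecutive items can be sketched as follows: the k-th line is paired with the k-th smallest item. This is a plain-Java illustration under stated assumptions, not the library's implementation: `ValueFileSketch` is a hypothetical name, values are assumed to be integers, and the values 10, 20, 30, 40 are made up for the example.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration; NOT part of the library.
final class ValueFileSketch {

    // Pairs the k-th line of a value file with the k-th smallest item
    // that occurs in the database. Values are assumed to be integers.
    static Map<Integer, Integer> valuesByItem(int[][] transactions, int[] valueLines) {
        TreeSet<Integer> items = new TreeSet<>();
        for (int[] transaction : transactions) {
            for (int item : transaction) {
                items.add(item);
            }
        }
        if (items.size() != valueLines.length) {
            throw new IllegalArgumentException("expected one value line per item");
        }
        Map<Integer, Integer> result = new TreeMap<>();
        int line = 0;
        for (int item : items) {
            result.put(item, valueLines[line++]);
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] db = {{1, 2, 5}, {2, 3}};   // items {1,2,3,5}
        int[] values = {10, 20, 30, 40};    // made-up content of a value file
        System.out.println(valuesByItem(db, values));
    }
}
```

With these made-up values, item 5 receives the fourth line (40) even though no item 4 exists.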
## Constraints
The following constraints are available:
......
......