Update doc authored by Charles Vernerey
......@@ -39,8 +39,6 @@ It is not mandatory to have n consecutive Integer as items. For example, the fol
In this example, the items are `{1,2,3,5}`. The first line of the value file represents the value of 1, the second line the value of 2, the third the value of 3 and the fourth the value of 5.
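The correspondence between items and value-file lines can be sketched in plain Java as follows. The item list `{1,2,3,5}` comes from the example above; the values `10, 20, 30, 50`, the class name `ItemValueMapping` and the method `mapValues` are hypothetical and only serve the illustration. This is not the library's own reader.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ItemValueMapping {

    /** Associates the k-th listed item with the k-th line of the value file. */
    static Map<Integer, Integer> mapValues(int[] items, int[] valueLines) {
        if (items.length != valueLines.length) {
            throw new IllegalArgumentException("one value line is expected per item");
        }
        Map<Integer, Integer> valueOf = new LinkedHashMap<>();
        for (int k = 0; k < items.length; k++) {
            valueOf.put(items[k], valueLines[k]);
        }
        return valueOf;
    }

    public static void main(String[] args) {
        int[] items = {1, 2, 3, 5};           // items of the example above
        int[] valueLines = {10, 20, 30, 50};  // hypothetical content of the value file
        // The fourth line is the value of item 5, not of a (non-existent) item 4
        System.out.println(mapValues(items, valueLines)); // {1=10, 2=20, 3=30, 5=50}
    }
}
```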
However, for this to work correctly, class items must have lower values than non-class items.
## Constraints
The following constraints are available:
......@@ -161,3 +159,73 @@ The following measures are available (see the package `io.gitlab.chaver.mining.p
- `Max`
- `Min`
- `Mean`
## Code examples
The package `io.gitlab.chaver.mining.examples` contains examples on how to use constraints for specific tasks.
**ExampleAdequateClosure**
In this example, we mine the set of closed patterns w.r.t. the set of measures `{freq(x),max(x.freq)}`. Let us walk through the code of the `main` method:
```java
String dataPath = "src/test/resources/contextPasquier99/contextPasquier99.dat";
List<Measure> measures = Arrays.asList(freq(), maxFreq());
Model model = new Model("adequate closure test");
Database database = new DatReader(dataPath, 0, true).readFiles();
```
First, we create the data structures needed to build our model. To build our set of measures (represented as a `List`), we can use the class `MeasureFactory`, located in the package `io.gitlab.chaver.mining.patterns.measure`, which provides several methods to instantiate measures.
```java
IntVar freq = model.intVar("freq", 1, database.getNbTransactions());
IntVar length = model.intVar("length", 1, database.getNbItems());
BoolVar[] x = model.boolVarArray("x", database.getNbItems());
```
Then, we create two integer variables `freq` and `length` which represent the frequency and the length (i.e. the number of items) of the pattern, and a BoolVar array that represents the pattern we are looking for (`x[i] = 1` means that item `i` belongs to the pattern).
```java
model.sum(x, "=", length).post();
int[] itemFreq = database.computeItemFreq();
IntVar[] itemFreqVar = model.intVarArray(database.getNbItems(), 0, database.getNbTransactions());
for (int i = 0; i < database.getNbItems(); i++) {
    // itemFreqVar[i] = itemFreq[i] if x[i] == 1 else 0
    model.arithm(x[i], "*", model.intVar(itemFreq[i]), "=", itemFreqVar[i]).post();
}
String maxFreqId = maxFreq().getId();
IntVar maxFreq = model.intVar(maxFreqId, 0, database.getNbTransactions());
// Compute max value of itemFreqVar
model.max(maxFreq, itemFreqVar).post();
```
We post a constraint that links the `length` variable to the sum of the `x` array. Then, we compute the frequency of each item in an array `itemFreq` such that `itemFreq[i]` represents the frequency of item `i`. We can now create an array of integer variables `itemFreqVar` such that:
- `itemFreqVar[i] = 0` if `x[i] = 0`
- `itemFreqVar[i] = itemFreq[i]` if `x[i] = 1`
We can finally compute the max frequency of the pattern in a variable named `maxFreq` by imposing a max constraint on the `itemFreqVar` array.
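To make the channelling concrete, here is a minimal plain-Java sketch of the values these constraints enforce once the pattern `x` is fixed. It does not use the solver; the class `MaxFreqSketch`, its methods and the item frequencies are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;

class MaxFreqSketch {

    /** itemFreqVar[i] = itemFreq[i] if x[i] is true, 0 otherwise. */
    static int[] channel(boolean[] x, int[] itemFreq) {
        int[] itemFreqVar = new int[x.length];
        for (int i = 0; i < x.length; i++) {
            itemFreqVar[i] = x[i] ? itemFreq[i] : 0;
        }
        return itemFreqVar;
    }

    /** Maximum of the channelled frequencies, i.e. the max frequency of the pattern. */
    static int maxFreq(boolean[] x, int[] itemFreq) {
        return Arrays.stream(channel(x, itemFreq)).max().orElse(0);
    }

    public static void main(String[] args) {
        int[] itemFreq = {3, 4, 4, 1, 4};                  // hypothetical item frequencies
        boolean[] x = {true, false, false, true, false};   // pattern made of items 0 and 3
        System.out.println(Arrays.toString(channel(x, itemFreq))); // [3, 0, 0, 1, 0]
        System.out.println(maxFreq(x, itemFreq));                  // 3
    }
}
```

Items that are not selected contribute 0, so they can never be the maximum as long as at least one selected item has a positive frequency.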
```java
model.post(new Constraint("Cover Size", new CoverSize(database, freq, x)));
model.post(new Constraint("Adequate Closure", new AdequateClosureDC(database, measures, x)));
```
We post two constraints: **CoverSize**, which links `freq` to the frequency of the pattern `x`, and **AdequateClosure** (class `AdequateClosureDC`), which ensures that `x` is closed w.r.t. the set of measures `{freq(x),max(x.freq)}`.
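The following brute-force sketch illustrates what the two constraints express, without the solver. It assumes that a pattern is closed w.r.t. a set of measures when no strict superset has the same value for every measure; this definition, the toy database and all names (`ClosednessSketch`, `freq`, `maxFreq`, `isClosed`) are assumptions made for the illustration and say nothing about how `CoverSize` or `AdequateClosureDC` are actually implemented.

```java
class ClosednessSketch {

    // Hypothetical toy database: one row per transaction, one column per item
    static final boolean[][] DB = {
        {true,  false, true,  true,  false},
        {false, true,  true,  false, true },
        {true,  true,  true,  false, true },
        {false, true,  false, false, true },
        {true,  true,  true,  false, true },
    };
    static final int NB_ITEMS = 5;

    /** Number of transactions containing every item of the pattern (encoded as a bitmask). */
    static int freq(int pattern) {
        int count = 0;
        for (boolean[] transaction : DB) {
            boolean covered = true;
            for (int i = 0; i < NB_ITEMS && covered; i++) {
                if ((pattern & (1 << i)) != 0 && !transaction[i]) {
                    covered = false;
                }
            }
            if (covered) {
                count++;
            }
        }
        return count;
    }

    /** Largest single-item frequency among the items of the pattern. */
    static int maxFreq(int pattern) {
        int max = 0;
        for (int i = 0; i < NB_ITEMS; i++) {
            if ((pattern & (1 << i)) != 0) {
                max = Math.max(max, freq(1 << i));
            }
        }
        return max;
    }

    /** True if no strict superset has both the same freq and the same maxFreq. */
    static boolean isClosed(int pattern) {
        for (int sup = 0; sup < (1 << NB_ITEMS); sup++) {
            boolean strictSuperset = sup != pattern && (sup & pattern) == pattern;
            if (strictSuperset && freq(sup) == freq(pattern) && maxFreq(sup) == maxFreq(pattern)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isClosed(0b00001)); // pattern {item 0}
        System.out.println(isClosed(0b00010)); // pattern {item 1}
    }
}
```

In this toy database, the pattern made of item 0 alone has the same frequency as the pattern made of items 0 and 2, but a different max frequency, so adding `max(x.freq)` to the set of measures changes which patterns are closed.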
```java
List<Pattern> closedPatterns = new LinkedList<>();
while (model.getSolver().solve()) {
    int[] itemset = IntStream.range(0, x.length)
            .filter(i -> x[i].getValue() == 1)
            .map(i -> database.getItems()[i])
            .toArray();
    closedPatterns.add(new Pattern(itemset, new int[]{freq.getValue(), maxFreq.getValue()}));
}
for (Pattern closed : closedPatterns) {
    System.out.println(Arrays.toString(closed.getItems()) + ", freq=" + closed.getMeasures()[0] + ", maxFreq=" +
            closed.getMeasures()[1]);
}
```
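Finally, we enumerate all the solutions of the model: each solution is a closed pattern, which we store together with its `freq` and `maxFreq` values and then print.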