Add data-mining, authored by VERNEREY Charles
# Data-mining
Data mining with Choco Solver.
## Constraints
The following constraints are available:
`AdequateClosureDC(Database database, List<Measure> measures, BoolVar[] items)`
**Parameters**:
- A transactional database `database`
- A list of measures `measures` (see Measures section for more information on available measures)
- An array of Boolean variables `items` where `items[i]` is true iff item `i` belongs to the pattern
**Description**: Ensure that the pattern represented by `items` is closed w.r.t. the set of measures `measures` (Domain Consistency version)
**References**: `Vernerey et al. - Threshold-free Pattern Mining Meets Multi-Objective Optimization: Application to Association Rules`
`AdequateClosureWC(Database database, List<Measure> measures, BoolVar[] items)`
**Parameters**:
- A transactional database `database`
- A list of measures `measures` (see Measures section for more information on available measures)
- An array of Boolean variables `items` where `items[i]` is true iff item `i` belongs to the searched pattern
**Description**: Ensure that the pattern represented by `items` is closed w.r.t. the set of measures `measures` (Weak Consistency version)
**References**: `Vernerey et al. - Threshold-free Pattern Mining Meets Multi-Objective Optimization: Application to Association Rules`
`CoverClosure(Database database, BoolVar[] items)`
**Parameters**:
- A transactional database `database`
- An array of Boolean variables `items` where `items[i]` is true iff item `i` belongs to the searched pattern
**Description**: Ensure that the pattern represented by `items` is closed w.r.t. the support
**References**: `Schaus et al. - CoverSize : A Global Constraint for Frequency-Based Itemset Mining`
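To make the closedness condition concrete, here is a minimal plain-Java sketch of the property that `CoverClosure` enforces: a pattern is closed w.r.t. the support when no extra item can be added without lowering its frequency. It does not use Choco or the library's `Database` class. The `boolean[][]` layout (`db[t][i]` is true iff transaction `t` contains item `i`) and the names `CoverClosureSketch`, `frequency` and `isClosed` are assumptions made for illustration only.

```java
// Illustrative check of "closed w.r.t. the support"; not the library's propagator.
class CoverClosureSketch {

    /** Number of transactions that contain every item selected in `items`. */
    static int frequency(boolean[][] db, boolean[] items) {
        int count = 0;
        for (boolean[] transaction : db) {
            boolean covers = true;
            for (int i = 0; i < items.length && covers; i++) {
                if (items[i] && !transaction[i]) covers = false;
            }
            if (covers) count++;
        }
        return count;
    }

    /** True iff adding any missing item strictly decreases the frequency. */
    static boolean isClosed(boolean[][] db, boolean[] items) {
        int freq = frequency(db, items);
        for (int i = 0; i < items.length; i++) {
            if (items[i]) continue;
            boolean[] extended = items.clone();
            extended[i] = true;
            if (frequency(db, extended) == freq) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Transactions: {0,1}, {0,1,2}, {1,2}
        boolean[][] db = {
            {true,  true, false},
            {true,  true, true },
            {false, true, true }
        };
        // {0} is not closed: every transaction containing item 0 also contains item 1.
        System.out.println(isClosed(db, new boolean[]{true, false, false})); // false
        System.out.println(isClosed(db, new boolean[]{true, true, false}));  // true
    }
}
```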
`CoverSize(Database database, IntVar freq, BoolVar[] items)`
**Parameters**:
- A transactional database `database`
- An integer variable `freq` that represents the frequency of the pattern
- An array of Boolean variables `items` where `items[i]` is true iff item `i` belongs to the pattern
**Description**: Ensure that the variable `freq` is equal to the frequency of the pattern represented by `items`, i.e. the number of transactions of `database` that contain every selected item
**References**: `Schaus et al. - CoverSize : A Global Constraint for Frequency-Based Itemset Mining`
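As an illustration of the quantity that `CoverSize` links to `freq`, the following plain-Java sketch counts the transactions that contain every selected item. It is not the constraint's implementation and does not use Choco. The `boolean[][]` database layout (`db[t][i]` is true iff transaction `t` contains item `i`) and the names `CoverSizeSketch` and `frequency` are hypothetical.

```java
// Illustrative computation of a pattern's frequency (cover size); not the library's code.
class CoverSizeSketch {

    /** Counts the transactions that contain every item selected in `items`. */
    static int frequency(boolean[][] db, boolean[] items) {
        int count = 0;
        for (boolean[] transaction : db) {
            boolean covers = true;
            for (int i = 0; i < items.length; i++) {
                if (items[i] && !transaction[i]) {
                    covers = false;
                    break;
                }
            }
            if (covers) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Three items (0, 1, 2) and four transactions.
        boolean[][] db = {
            {true,  true,  false},
            {true,  true,  true },
            {true,  false, true },
            {false, true,  true }
        };
        // Pattern {0, 1} is contained in the first two transactions.
        System.out.println(frequency(db, new boolean[]{true, true, false})); // 2
    }
}
```

Note that the empty pattern is covered by every transaction, so its frequency is the database size.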
`FrequentSubs(Database database, int freq, BoolVar[] x)`
**Parameters**:
- A transactional database `database`
- A threshold `freq`
- An array of Boolean variables `x` where `x[i]` is true iff item `i` belongs to the pattern
**Description**: Ensure that all the subsets of `x` are frequent w.r.t. the `freq` threshold (i.e. `frequency(y) >= freq` for every subset `y` of `x`)
**References**: `Belaid et al. - Constraint Programming for Mining Borders of Frequent Itemsets`
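The condition can be sketched in plain Java as follows. This sketch assumes that "subsets" means proper subsets of `x`, which is an interpretation and is not confirmed by this page. Because removing items never lowers the frequency, it is enough to test the subsets obtained by dropping a single item. The `boolean[][]` database layout and the names `FrequentSubsSketch`, `frequency` and `allSubsetsFrequent` are hypothetical, and the code does not use Choco.

```java
// Illustrative check that every proper subset of a pattern is frequent; not the library's code.
class FrequentSubsSketch {

    /** Number of transactions that contain every item selected in `items`. */
    static int frequency(boolean[][] db, boolean[] items) {
        int count = 0;
        for (boolean[] transaction : db) {
            boolean covers = true;
            for (int i = 0; i < items.length && covers; i++) {
                if (items[i] && !transaction[i]) covers = false;
            }
            if (covers) count++;
        }
        return count;
    }

    /** True iff every pattern obtained by removing one item has frequency >= threshold. */
    static boolean allSubsetsFrequent(boolean[][] db, int threshold, boolean[] x) {
        for (int i = 0; i < x.length; i++) {
            if (!x[i]) continue;
            boolean[] subset = x.clone();
            subset[i] = false;
            if (frequency(db, subset) < threshold) return false;
        }
        return true; // also holds for the empty pattern, which has no proper subset
    }

    public static void main(String[] args) {
        // Transactions: {0,1}, {0,2}, {1,2}, {0,1,2}
        boolean[][] db = {
            {true,  true,  false},
            {true,  false, true },
            {false, true,  true },
            {true,  true,  true }
        };
        boolean[] all = {true, true, true};
        System.out.println(allSubsetsFrequent(db, 2, all)); // true: each pair occurs twice
        System.out.println(allSubsetsFrequent(db, 3, all)); // false
    }
}
```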
`Generator(Database database, BoolVar[] items)`
**Parameters**:
- A transactional database `database`
- An array of Boolean variables `items` where `items[i]` is true iff item `i` belongs to the pattern
**Description**: Ensure that the pattern represented by `items` is a generator (i.e. has no subset with the same frequency)
**References**: `Belaid et al. - Constraint Programming for Association Rules`
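A minimal plain-Java sketch of the generator property is given below. It reads "subset" as proper subset and relies on the fact that frequency can only grow when items are removed, so testing the subsets with exactly one item removed is sufficient. The `boolean[][]` database layout and the names `GeneratorSketch`, `frequency` and `isGenerator` are assumptions for illustration; the code does not use Choco and is not the library's propagator.

```java
// Illustrative check of the generator property; not the library's code.
class GeneratorSketch {

    /** Number of transactions that contain every item selected in `items`. */
    static int frequency(boolean[][] db, boolean[] items) {
        int count = 0;
        for (boolean[] transaction : db) {
            boolean covers = true;
            for (int i = 0; i < items.length && covers; i++) {
                if (items[i] && !transaction[i]) covers = false;
            }
            if (covers) count++;
        }
        return count;
    }

    /** True iff removing any single item strictly increases the frequency. */
    static boolean isGenerator(boolean[][] db, boolean[] items) {
        int freq = frequency(db, items);
        for (int i = 0; i < items.length; i++) {
            if (!items[i]) continue;
            boolean[] subset = items.clone();
            subset[i] = false;
            if (frequency(db, subset) == freq) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Transactions: {0,1}, {0,1,2}, {1,2}
        boolean[][] db = {
            {true,  true, false},
            {true,  true, true },
            {false, true, true }
        };
        System.out.println(isGenerator(db, new boolean[]{true, false, false})); // true
        // {0,1} has the same frequency as its subset {0}, so it is not a generator.
        System.out.println(isGenerator(db, new boolean[]{true, true, false}));  // false
    }
}
```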
`InfrequentSupers(Database database, int freq, BoolVar[] x)`
**Parameters**:
- A transactional database `database`
- A threshold `freq`
- An array of Boolean variables `x` where `x[i]` is true iff item `i` belongs to the pattern
**Description**: Ensure that all the supersets of `x` are infrequent w.r.t. the `freq` threshold (i.e. `frequency(y) < freq` for every superset `y` of `x`)
**References**: `Belaid et al. - Constraint Programming for Mining Borders of Frequent Itemsets`
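The following plain-Java sketch illustrates the condition, assuming that "supersets" means proper supersets of `x` (an interpretation that this page does not confirm). Adding items never raises the frequency, so it is enough to test the supersets obtained by adding one item. The `boolean[][]` database layout and the names `InfrequentSupersSketch`, `frequency` and `allSupersetsInfrequent` are hypothetical, and the code does not use Choco.

```java
// Illustrative check that every proper superset of a pattern is infrequent; not the library's code.
class InfrequentSupersSketch {

    /** Number of transactions that contain every item selected in `items`. */
    static int frequency(boolean[][] db, boolean[] items) {
        int count = 0;
        for (boolean[] transaction : db) {
            boolean covers = true;
            for (int i = 0; i < items.length && covers; i++) {
                if (items[i] && !transaction[i]) covers = false;
            }
            if (covers) count++;
        }
        return count;
    }

    /** True iff every pattern obtained by adding one item has frequency < threshold. */
    static boolean allSupersetsInfrequent(boolean[][] db, int threshold, boolean[] x) {
        for (int i = 0; i < x.length; i++) {
            if (x[i]) continue;
            boolean[] superset = x.clone();
            superset[i] = true;
            if (frequency(db, superset) >= threshold) return false;
        }
        return true; // also holds when x already contains every item
    }

    public static void main(String[] args) {
        // Transactions: {0,1}, {0,2}, {1,2}, {0,1,2}
        boolean[][] db = {
            {true,  true,  false},
            {true,  false, true },
            {false, true,  true },
            {true,  true,  true }
        };
        // {0,1} occurs twice, but its only superset {0,1,2} occurs once.
        System.out.println(allSupersetsInfrequent(db, 2, new boolean[]{true, true, false}));  // true
        System.out.println(allSupersetsInfrequent(db, 2, new boolean[]{true, false, false})); // false
    }
}
```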
## Measures
The following measures are available (see the package `io.gitlab.chaver.mining.patterns.measure`):
- Pattern measures:
  - `AllConf`: All-confidence
  - `Area`
  - `Freq`
  - `Freq1`
  - `Freq2`
  - `FreqNeg`
  - `GrowthRate`
  - `Length`
  - `MaxFreq`
- Attribute measures:
  - `Max`
  - `Min`
  - `Mean`
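As a rough illustration of three of the pattern measures, the plain-Java sketch below computes a frequency, a length and an area for a pattern. This page does not spell out the formulas, so the sketch assumes the usual definitions from the itemset-mining literature: frequency is the number of covering transactions, length is the number of items, and area is their product. The class and method names and the `boolean[][]` database layout are hypothetical and unrelated to the classes of the `measure` package.

```java
// Illustrative computation of frequency, length and area; definitions are assumed, not taken from the library.
class PatternMeasuresSketch {

    /** Number of transactions that contain every item selected in `items`. */
    static int freq(boolean[][] db, boolean[] items) {
        int count = 0;
        for (boolean[] transaction : db) {
            boolean covers = true;
            for (int i = 0; i < items.length && covers; i++) {
                if (items[i] && !transaction[i]) covers = false;
            }
            if (covers) count++;
        }
        return count;
    }

    /** Number of items in the pattern. */
    static int length(boolean[] items) {
        int n = 0;
        for (boolean selected : items) {
            if (selected) n++;
        }
        return n;
    }

    /** Area, assumed here to be frequency times length. */
    static int area(boolean[][] db, boolean[] items) {
        return freq(db, items) * length(items);
    }

    public static void main(String[] args) {
        boolean[][] db = {
            {true,  true,  false},
            {true,  true,  true },
            {true,  false, true },
            {false, true,  true }
        };
        boolean[] pattern = {true, true, false}; // items 0 and 1
        System.out.println(freq(db, pattern));   // 2
        System.out.println(length(pattern));     // 2
        System.out.println(area(db, pattern));   // 4
    }
}
```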