Skip to content

Commit

Permalink
Merge pull request #1132 from mrapp-ke/revise-readme
Browse files Browse the repository at this point in the history
Revise README
  • Loading branch information
michael-rapp authored Nov 17, 2024
2 parents 71a2299 + ae65163 commit 7eeec33
Show file tree
Hide file tree
Showing 4 changed files with 46 additions and 24 deletions.
52 changes: 37 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![PyPI version](https://badge.fury.io/py/mlrl-boomer.svg)](https://badge.fury.io/py/mlrl-boomer) [![Documentation Status](https://readthedocs.org/projects/mlrl-boomer/badge/?version=latest)](https://mlrl-boomer.readthedocs.io/en/latest/?badge=latest) [![X URL](https://img.shields.io/twitter/url?label=Follow&style=social&url=https%3A%2F%2Ftwitter.com%2FBOOMER_ML)](https://twitter.com/BOOMER_ML)

**Important links:** [Documentation](https://mlrl-boomer.readthedocs.io/en/latest/) | [Issue Tracker](https://github.com/mrapp-ke/MLRL-Boomer/issues) | [Changelog](https://mlrl-boomer.readthedocs.io/en/latest/misc/CHANGELOG.html) | [Contributors](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html) | [Code of Conduct](https://mlrl-boomer.readthedocs.io/en/latest/misc/CODE_OF_CONDUCT.html) | [License](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html)
**:link: Important links:** [Documentation](https://mlrl-boomer.readthedocs.io/en/latest/) | [Issue Tracker](https://github.com/mrapp-ke/MLRL-Boomer/issues) | [Changelog](https://mlrl-boomer.readthedocs.io/en/latest/misc/CHANGELOG.html) | [Contributors](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html) | [Code of Conduct](https://mlrl-boomer.readthedocs.io/en/latest/misc/CODE_OF_CONDUCT.html) | [License](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html)

This software package provides the official implementation of **BOOMER - an algorithm for learning gradient boosted multi-output rules** that uses [gradient boosting](https://en.wikipedia.org/wiki/Gradient_boosting) for learning an ensemble of rules that is built with respect to a specific multivariate loss function. It integrates with the popular [scikit-learn](https://scikit-learn.org) machine learning framework.

Expand All @@ -19,46 +19,68 @@ The problem domains addressed by this algorithm include the following:

To provide a versatile tool for different use cases, great emphasis is put on the *efficiency* of the implementation. Moreover, to ensure its *flexibility*, it is designed in a modular fashion and can therefore easily be adjusted to different requirements. This modular approach enables implementing different kind of rule learning algorithms. For example, this project does also provide a [Separate-and-Conquer (SeCo) algorithm](https://mlrl-boomer.readthedocs.io/en/latest/user_guide/seco/index.html) based on traditional rule learning techniques that are particularly well-suited for learning interpretable models.

## References
## :book: References

The algorithm was first published in the following [paper](https://doi.org/10.1007/978-3-030-67664-3_8). A preprint version is publicly available [here](https://arxiv.org/pdf/2006.13346.pdf).

*Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz Vu-Linh Nguyen and Eyke Hüllermeier. Learning Gradient Boosted Multi-label Classification Rules. In: Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD), 2020, Springer.*

If you use the algorithm in a scientific publication, we would appreciate citations to the mentioned paper. An overview of publications that are concerned with the BOOMER algorithm, together with information on how to cite them, can be found in the section [References](https://mlrl-boomer.readthedocs.io/en/latest/misc/references.html) of the documentation.

## Functionalities
## :wrench: Functionalities

The algorithm that is provided by this project currently supports the following core functionalities for learning ensembles of boosted classification or regression rules:
The algorithm that is provided by this project currently supports the following core functionalities for learning ensembles of boosted classification or regression rules.

### Deliberate Loss Optimization

- **Decomposable or non-decomposable loss functions** can be optimized in expectation.
- **$L_1$ and $L_2$ regularization** can be used.
- **Shrinkage (a.k.a. the learning rate) can be adjusted** for controlling the impact of individual rules on the overall ensemble.

### Different Prediction Strategies

- **Decomposable or non-decomposable loss functions** can be minimized in expectation.
- **L1 and L2 regularization** can be used.
- **Single-output, partial, or complete heads** can be used by rules, i.e., they can predict for a single output, a subset of the available outputs, or all of them. Predicting for multiple outputs simultaneously enables to model local dependencies between them.
- **Various strategies for predicting scores, binary labels or probabilities** are available, depending on whether a classification or regression model is used.
- **Isotonic regression models can be used to calibrate marginal and joint probabilities** predicted by a classification model.

### Flexible Handling of Input Data

- **Native support for numerical, ordinal, and nominal features** eliminates the need for pre-processing techniques such as one-hot encoding.
- **Handling of missing feature values**, i.e., occurrences of *NaN* in the feature matrix, is implemented by the algorithm.

### Fine-grained Control over Model Characteristics

- **Rules can be constructed via a greedy search or a beam search.** The latter may help to improve the quality of individual rules.
- **Sampling techniques and stratification methods** can be used for learning new rules on a subset of the available training examples, features, or output variables.
- **Shrinkage (a.k.a. the learning rate) can be adjusted** for controlling the impact of individual rules on the overall ensemble.
- **Single-output, partial, or complete heads** can be used by rules, i.e., they can predict for a single output, a subset of the available outputs, or all of them. Predicting for multiple outputs simultaneously enables to model local dependencies between them.
- **Fine-grained control over the specificity/generality of rules** is provided via hyperparameters.

### Support for Post-Optimization and Pruning

- **Incremental reduced error pruning** can be used for removing overly specific conditions from rules and preventing overfitting.
- **Post- and pre-pruning (a.k.a. early stopping)** allows to determine the optimal number of rules to be included in an ensemble.
- **Sequential post-optimization** may help improving the predictive performance of a model by reconstructing each rule in the context of the other rules.
- **Native support for numerical, ordinal, and nominal features** eliminates the need for pre-processing techniques such as one-hot encoding.
- **Handling of missing feature values**, i.e., occurrences of NaN in the feature matrix, is implemented by the algorithm.

## Runtime and Memory Optimizations
## :watch: Runtime and Memory Optimizations

In addition to the features mentioned above, several techniques that may speed up training or reduce the memory footprint are currently implemented.

In addition, the following features that may speed up training or reduce the memory footprint are currently implemented:
### Approximation Techniques

- **Unsupervised feature binning** can be used to speed up the evaluation of a rule's potential conditions when dealing with numerical features.
- **Sampling techniques and stratification methods** can be used for learning new rules on a subset of the available training examples, features, or output variables.
- **[Gradient-based label binning (GBLB)](https://arxiv.org/pdf/2106.11690.pdf)** can be used for assigning the labels included in a multi-label classification data set to a limited number of bins. This may speed up training significantly when minimizing a non-decomposable loss function using rules with partial or complete heads.

### Sparse Data Structures

- **Sparse feature matrices** can be used for training and prediction. This may speed up training significantly on some data sets.
- **Sparse ground truth matrices** can be used for training. This may reduce the memory footprint in case of large data sets.
- **Sparse prediction matrices** can be used for storing predicted labels. This may reduce the memory footprint in case of large data sets.
- **Sparse matrices for storing gradients and Hessians** can be used if supported by the loss function. This may speed up training significantly on data sets with many output variables.

### Parallelization

- **Multi-threading** can be used for parallelizing the evaluation of a rule's potential refinements across several features, updating the gradients and Hessians of individual examples in parallel, or obtaining predictions for several examples in parallel.

## Documentation
## :books: Documentation

An extensive user guide, as well as an API documentation for developers, is available at [https://mlrl-boomer.readthedocs.io](https://mlrl-boomer.readthedocs.io/en/latest/). If you are new to the project, you probably want to read about the following topics:

Expand All @@ -70,7 +92,7 @@ A collection of benchmark datasets that are compatible with the algorithm are pr

For an overview of changes and new features that have been included in past releases, please refer to the [changelog](https://mlrl-boomer.readthedocs.io/en/latest/misc/CHANGELOG.html).

## License
## :scroll: License

This project is open source software licensed under the terms of the [MIT license](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html). We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience. A frequently updated list of contributors is available [here](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html).

Expand Down
6 changes: 3 additions & 3 deletions python/subprojects/common/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![PyPI version](https://badge.fury.io/py/mlrl-common.svg)](https://badge.fury.io/py/mlrl-common) [![Documentation Status](https://readthedocs.org/projects/mlrl-boomer/badge/?version=latest)](https://mlrl-boomer.readthedocs.io/en/latest/?badge=latest)

**Important links:** [Documentation](https://mlrl-boomer.readthedocs.io/en/latest/) | [Issue Tracker](https://github.com/mrapp-ke/MLRL-Boomer/issues) | [Changelog](https://mlrl-boomer.readthedocs.io/en/latest/misc/CHANGELOG.html) | [Contributors](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html) | [Code of Conduct](https://mlrl-boomer.readthedocs.io/en/latest/misc/CODE_OF_CONDUCT.html) | [License](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html)
**:link: Important links:** [Documentation](https://mlrl-boomer.readthedocs.io/en/latest/) | [Issue Tracker](https://github.com/mrapp-ke/MLRL-Boomer/issues) | [Changelog](https://mlrl-boomer.readthedocs.io/en/latest/misc/CHANGELOG.html) | [Contributors](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html) | [Code of Conduct](https://mlrl-boomer.readthedocs.io/en/latest/misc/CODE_OF_CONDUCT.html) | [License](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html)

This software package provides common modules to be used by different types of **multi-output rule learning (MLRL)** algorithms that integrate with the popular [scikit-learn](https://scikit-learn.org) machine learning framework.

Expand All @@ -16,7 +16,7 @@ The library serves as the basis for the implementation of the following rule lea
- **BOOMER (Gradient Boosted Multi-Output Rules)**: A state-of-the art algorithm that uses [gradient boosting](https://en.wikipedia.org/wiki/Gradient_boosting) to learn an ensemble of rules that is built with respect to a given multivariate loss function.
- **Multi-label Separate-and-Conquer (SeCo) Rule Learning Algorithm**: A heuristic rule learning algorithm based on traditional rule learning techniques that are particularly well-suited for learning interpretable models.

## Functionalities
## :wrench: Functionalities

This package follows a unified and modular framework for the implementation of different types of rule learning algorithms. In the following, we provide an overview of the individual modules an instantiation of the framework must implement.

Expand Down Expand Up @@ -73,6 +73,6 @@ Post-optimization methods can be employed to further improve the predictive perf

A prediction algorithm is needed to derive predictions from the rules in a previously assembled model. As prediction methods heavily depend on the rule learning algorithm and problem domain at hand, no implementation is provided by this package out-of-the-box. However, it defines interfaces for the prediction of **scores, binary predictions, or probability estimates.**

## License
## :scroll: License

This project is open source software licensed under the terms of the [MIT license](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html). We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience. A frequently updated list of contributors is available [here](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html).
8 changes: 4 additions & 4 deletions python/subprojects/seco/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,13 @@

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![PyPI version](https://badge.fury.io/py/mlrl-seco.svg)](https://badge.fury.io/py/mlrl-seco) [![Documentation Status](https://readthedocs.org/projects/mlrl-boomer/badge/?version=latest)](https://mlrl-boomer.readthedocs.io/en/latest/?badge=latest)

**Important links:** [Documentation](https://mlrl-boomer.readthedocs.io/en/latest/user_guide/seco/index.html) | [Issue Tracker](https://github.com/mrapp-ke/MLRL-Boomer/issues) | [Changelog](https://mlrl-boomer.readthedocs.io/en/latest/misc/CHANGELOG.html) | [Contributors](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html) | [Code of Conduct](https://mlrl-boomer.readthedocs.io/en/latest/misc/CODE_OF_CONDUCT.html) | [License](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html)
**:link: Important links:** [Documentation](https://mlrl-boomer.readthedocs.io/en/latest/user_guide/seco/index.html) | [Issue Tracker](https://github.com/mrapp-ke/MLRL-Boomer/issues) | [Changelog](https://mlrl-boomer.readthedocs.io/en/latest/misc/CHANGELOG.html) | [Contributors](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html) | [Code of Conduct](https://mlrl-boomer.readthedocs.io/en/latest/misc/CODE_OF_CONDUCT.html) | [License](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html)

This software package provides an implementation of a **Multi-label Separate-and-Conquer (SeCo) Rule Learning Algorithm** that integrates with the popular [scikit-learn](https://scikit-learn.org) machine learning framework.

The goal of [multi-label classification](https://en.wikipedia.org/wiki/Multi-label_classification) is the automatic assignment of sets of labels to individual data points, for example, the annotation of text documents with topics. The algorithm that is provided by this package uses the SeCo paradigm for learning interpretable rule lists.

## Functionalities
## :wrench: Functionalities

The algorithm that is provided by this project currently supports the following core functionalities to learn a binary classification rules:

Expand All @@ -22,7 +22,7 @@ The algorithm that is provided by this project currently supports the following
- **Native support for numerical, ordinal, and nominal features** eliminates the need for pre-processing techniques such as one-hot encoding.
- **Handling of missing feature values**, i.e., occurrences of NaN in the feature matrix, is implemented by the algorithm.

## Runtime and Memory Optimizations
## :watch: Runtime and Memory Optimizations

In addition, the following features that may speed up training or reduce the memory footprint are currently implemented:

Expand All @@ -31,6 +31,6 @@ In addition, the following features that may speed up training or reduce the mem
- **Sparse prediction matrices** can be used to store predicted labels. This may reduce the memory footprint in case of large data sets.
- **Multi-threading** can be used to parallelize the evaluation of a rule's potential refinements across several features or to obtain predictions for several examples in parallel.

## License
## :scroll: License

This project is open source software licensed under the terms of the [MIT license](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html). We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience. A frequently updated list of contributors is available [here](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html).
4 changes: 2 additions & 2 deletions python/subprojects/testbed/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@

This software package provides **utilities for training and evaluating machine learning algorithms**, including classification and regression problems.

## Functionalities
## :wrench: Functionalities

Most notably, the package includes a command line API that allows configuring and running machine learning algorithms. For example, the [BOOMER algorithm](https://mlrl-boomer.readthedocs.io/en/stable/user_guide/boosting/index.html) integrates with the command line API out-of-the-box. For [using other algorithms](https://mlrl-boomer.readthedocs.io/en/latest/user_guide/testbed/runnables.html) only a few lines of Python code are necessary.

Expand All @@ -26,6 +26,6 @@ The command line API allows applying machine learning algorithms to different da
- **Models can be saved on disk** in order to be reused by future experiments.
- **Algorithmic parameters can be read from configuration files** instead of providing them via command line arguments. When providing parameters via the command line, corresponding configuration files can automatically be saved on disk.

## License
## :scroll: License

This project is open source software licensed under the terms of the [MIT license](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html). We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience. A frequently updated list of contributors is available [here](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html).

0 comments on commit 7eeec33

Please sign in to comment.