A Python library for Boolean Matrix Factorization. Work under Preferred.ai.
PyBMF is under active development. We welcome the authors of BMF papers and those interested in BMF to play around and contribute. Please contact us if you have any questions or suggestions.
Boolean matrix factorization (BMF) is a well-known problem in pattern mining. Over years of active research, BMF methods have grown from greedy heuristics into a wide range of more advanced techniques. We believe that a fair and adaptable playground is necessary for developing and comparing such algorithms.
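For readers new to the problem, the following is a minimal sketch of the Boolean matrix product that BMF tries to approximate. It uses plain Python lists and does not use PyBMF's API; the names `boolean_product` and `hamming_error` are hypothetical helpers introduced only for illustration.

```python
def boolean_product(U, V):
    """Boolean product: X[i][j] = OR over t of (U[i][t] AND V[t][j])."""
    k = len(V)
    n_cols = len(V[0])
    return [
        [int(any(U[i][t] and V[t][j] for t in range(k))) for j in range(n_cols)]
        for i in range(len(U))
    ]


def hamming_error(X, Y):
    """Number of cells where two binary matrices of equal shape disagree."""
    return sum(x != y for row_x, row_y in zip(X, Y) for x, y in zip(row_x, row_y))


# Two rank-1 patterns: column t of U selects rows, row t of V selects columns.
U = [[1, 0],
     [1, 1],
     [0, 1]]
V = [[1, 1, 0],
     [0, 1, 1]]

X = boolean_product(U, V)
for row in X:
    print(row)
```

Note that the cell `X[1][1]` is covered by both patterns and still equals 1, whereas the ordinary integer product would give 2. A BMF model searches for `U` and `V` such that `hamming_error(X, boolean_product(U, V))` (or a related loss) is small for a given binary matrix `X`.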
PyBMF aims to provide a unified framework with:
- `generators` for various types of synthetic data
- easy ways of importing and sampling real-world `datasets` like `MovieLensData` and `NetflixData`
- data `RatioSplit` and `CrossValidation` utilities
- tools for generating negative samples with `negative_sample()` when needed
- compatibility with `scipy.sparse` matrices wherever possible
- tools to `evaluate()` using binary and continuous metrics
- visualization tools to `show_matrix()` in single or multi-matrix mode
- tools to `save_model` and `show_logs` in HTML or OverLeaf with `logs2html` and `logs2latex`
- the ability to incorporate Boolean matrix simplification and visualization `models`, planned for the future
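To illustrate what ratio splitting and negative sampling mean for a binary matrix, here is a minimal standard-library sketch. It is not PyBMF's implementation or API: the functions `ratio_split` and `sample_negatives` and the coordinate-list representation (similar in spirit to a sparse COO layout) are assumptions made for this example.

```python
import random


def ratio_split(ones, test_ratio, seed=0):
    """Split the observed 1-cells into train and test parts by ratio."""
    rng = random.Random(seed)
    shuffled = sorted(ones)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def sample_negatives(ones, shape, n_samples, seed=0):
    """Draw distinct 0-cells uniformly at random by rejection sampling."""
    rng = random.Random(seed)
    observed = set(ones)
    n_rows, n_cols = shape
    if n_samples > n_rows * n_cols - len(observed):
        raise ValueError("not enough zero cells to sample from")
    negatives = set()
    while len(negatives) < n_samples:
        cell = (rng.randrange(n_rows), rng.randrange(n_cols))
        if cell not in observed:
            negatives.add(cell)
    return sorted(negatives)


# A 4 x 4 binary matrix stored as the (row, col) coordinates of its 1-cells.
shape = (4, 4)
ones = [(0, 0), (0, 1), (1, 1), (2, 2), (3, 0), (3, 3)]

train, test = ratio_split(ones, test_ratio=0.33)
# Pair each held-out positive with one sampled negative for evaluation.
negatives = sample_negatives(ones, shape, n_samples=len(test))

print("train:", sorted(train))
print("test positives:", sorted(test))
print("test negatives:", negatives)
```

Sampling negatives only when they are needed avoids materializing every zero cell, which matters for large and sparse rating matrices.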
Category | Model | Paper | Original Implementation | In PyBMF |
---|---|---|---|---|
Heuristics | Asso | PKDD2006 TKDE2008 | C | ✅ |
Heuristics | Hyper/Hyper+ | SIGKDD2011 | | ✅ |
Heuristics | GreConD | JCSS2010 | MATLAB | ✅ |
Heuristics | Panda | ICDM2010 | | ✅ |
Heuristics | Panda+ | TKDE2013 | | ✅ |
Heuristics | NASSAU | SDM2015 | link | |
Heuristics | GreConD+ | DAM2018 | MATLAB | ✅ |
Heuristics | MEBF | AAAI2020 | R | ✅ |
Continuous | NMFSklearn | | | 🛞 Wrapper of sklearn.nmf |
Continuous | WNMF | | | ✅ Multiplicative update |
Continuous | BinaryMF-Penalty | ICDM2007 | MATLAB | ✅ Multiplicative update |
Continuous | BinaryMF-Thresholding | ICDM2007 | MATLAB | ✅ Line search |
Continuous | FastStep | PAKDD2016 | C++ | ✅ Line search |
Continuous | PRIMP | DMKD2017 | CUDA C++ | ✅ PALM |
Continuous | PNL-PF | SP2021 | | ✅ Multiplicative update |
Continuous | ELBMF | NIPS2022 | Julia Python | ✅ PALM |
Probabilistic | MessagePassing | ICML2016 | Python | 🛞 Wrapper of original implementation |
Probabilistic | OrMachine | ICML2017 | Cython | 🛞 Wrapper of original implementation |
Linear Optimization | ColumnGeneration | AAAI2021 | Python | 🛞 Wrapper of original implementation |
Satisfiability | UndercoverBMF | AAAI2021 | C++ | 🛞 Wrapper of original implementation |
Simplification | IterEss | IS2019 | | |
Simplification | DelegationBMF | AAAI2024 | C++ | |
Visualization | OrderedBMF | SIAM2019 | C++ | |
Visualization | BiclusterVisualization | PKDD2023 | Python | |
See Examples to get started with PyBMF.
See Models to implement your own models.
Currently built and tested on Python 3.9.18.
- Diagnosis of thresholding models
- Fix DataFrame display utils in dataframe_utils.py
- Add mask parameter W to PRIMP and ELBMF
- Make a page dedicated to contributors and references
- Include BMF visualization models
- Include BMF simplification models