PyBMF

A Python library for Boolean Matrix Factorization, developed under Preferred.ai.

PyBMF is under active development. We welcome the authors of BMF papers and those interested in BMF to play around and contribute. Please contact us if you have any questions or suggestions.

Perspectives

Boolean matrix factorization (BMF) is a well-known problem in pattern mining. Over years of active research, its methods have grown from greedy heuristics to a wide range of more advanced techniques. We believe that developing such algorithms requires a fair and adaptable playground in which they can be compared.
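
For readers new to the problem, the sketch below illustrates the Boolean matrix product that BMF is built on: a binary matrix X is approximated by two binary factors U and V, where entry (i, j) of the product is the OR over k of U[i, k] AND V[k, j]. It uses plain NumPy and does not call PyBMF; the helper name `boolean_product` and the toy matrices are assumptions made for illustration only.

```python
import numpy as np

def boolean_product(U, V):
    """Boolean matrix product: result[i, j] = OR_k (U[i, k] AND V[k, j])."""
    # The integer product counts how many factors cover a cell;
    # any positive count means the cell is "true".
    return (U.astype(int) @ V.astype(int)) > 0

# Toy factors: 4 rows, 2 patterns, 5 columns.
U = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 0]], dtype=bool)
V = np.array([[1, 1, 0, 0, 0],
              [0, 1, 1, 1, 0]], dtype=bool)

X = boolean_product(U, V)

# Flip one cell to mimic noise, then count the cells the factors get wrong.
X_noisy = X.copy()
X_noisy[3, 4] = True
error = int(np.sum(X_noisy != boolean_product(U, V)))
print(X.astype(int))
print("reconstruction error:", error)
```

Note that row 1 of X uses both patterns, and the cell they both cover stays 1 rather than becoming 2, which is what separates the Boolean product from the ordinary one.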

PyBMF aims to provide a unified framework with:

  1. generators for various types of synthetic data

  2. easy ways of importing and sampling real-world datasets like MovieLensData and NetflixData

  3. RatioSplit and CrossValidation utilities for splitting data

  4. tools for negative sampling with negative_sample() when needed

  5. compatibility with scipy.sparse matrices where possible

  6. tools to evaluate() using binary and continuous metrics

  7. visualization tools to show_matrix() in single or multi-matrix mode

  8. tools to save_model and to show_logs as HTML or as LaTeX for Overleaf, with logs2html and logs2latex

  9. planned support for Boolean matrix simplification and visualization models
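
To make the data-splitting and negative-sampling items above more concrete, here is a minimal sketch of both ideas on a scipy.sparse matrix. It is not PyBMF's RatioSplit or negative_sample() API: the function names `ratio_split` and `sample_negatives`, their parameters, and the random toy matrix are all hypothetical and only show the general technique.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix

def ratio_split(X, test_ratio, rng):
    """Hypothetical helper: move a random fraction of the nonzeros of X into a test matrix."""
    X = coo_matrix(X)
    n_test = int(round(test_ratio * X.nnz))
    perm = rng.permutation(X.nnz)
    test_idx, train_idx = perm[:n_test], perm[n_test:]

    def take(idx):
        # Rebuild a sparse matrix of the same shape from the selected entries.
        return csr_matrix((X.data[idx], (X.row[idx], X.col[idx])), shape=X.shape)

    return take(train_idx), take(test_idx)

def sample_negatives(X, n_samples, rng):
    """Hypothetical helper: draw distinct zero cells of X uniformly by rejection sampling."""
    X = csr_matrix(X)
    n_rows, n_cols = X.shape
    if n_samples > n_rows * n_cols - X.nnz:
        raise ValueError("not enough zero cells to sample from")
    chosen = set()
    while len(chosen) < n_samples:
        i = int(rng.integers(n_rows))
        j = int(rng.integers(n_cols))
        if X[i, j] == 0:
            chosen.add((i, j))
    return sorted(chosen)

rng = np.random.default_rng(0)
X = csr_matrix((rng.random((6, 8)) < 0.4).astype(np.int8))  # toy binary data
X_train, X_test = ratio_split(X, test_ratio=0.25, rng=rng)
negatives = sample_negatives(X, n_samples=X_test.nnz, rng=rng)
```

Sampling as many negatives as there are held-out positives is one common choice for balanced evaluation; other ratios work the same way.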

Models

| Category | Model | Paper | Original Implementation | In PyBMF |
|---|---|---|---|---|
| Heuristics | Asso | PKDD2006, TKDE2008 | C | |
| Heuristics | Hyper/Hyper+ | SIGKDD2011 | | |
| Heuristics | GreConD | JCSS2010 | MATLAB | |
| Heuristics | Panda | ICDM2010 | | |
| Heuristics | Panda+ | TKDE2013 | | |
| Heuristics | NASSAU | SDM2015 | link | |
| Heuristics | GreConD+ | DAM2018 | MATLAB | |
| Heuristics | MEBF | AAAI2020 | R | |
| Continuous | NMFSklearn | | | 🛞 Wrapper of sklearn.decomposition.NMF |
| Continuous | WNMF | | | ✅ Multiplicative update |
| Continuous | BinaryMF-Penalty | ICDM2007 | MATLAB | ✅ Multiplicative update |
| Continuous | BinaryMF-Thresholding | ICDM2007 | MATLAB | ✅ Line search |
| Continuous | FastStep | PAKDD2016 | C++ | ✅ Line search |
| Continuous | PRIMP | DMKD2017 | CUDA C++ | ✅ PALM |
| Continuous | PNL-PF | SP2021 | | ✅ Multiplicative update |
| Continuous | ELBMF | NeurIPS2022 | Julia, Python | ✅ PALM |
| Probabilistic | MessagePassing | ICML2016 | Python | 🛞 Wrapper of original implementation |
| Probabilistic | OrMachine | ICML2017 | Cython | 🛞 Wrapper of original implementation |
| Linear Optimization | ColumnGeneration | AAAI2021 | Python | 🛞 Wrapper of original implementation |
| Satisfiability | UndercoverBMF | AAAI2021 | C++ | 🛞 Wrapper of original implementation |
| Simplification | IterEss | IS2019 | | |
| Simplification | DelegationBMF | AAAI2024 | C++ | |
| Visualization | OrderedBMF | SIAM2019 | C++ | |
| Visualization | BiclusterVisualization | PKDD2023 | Python | |

How to use PyBMF

See Examples to get started with PyBMF.

See Models, where you can implement your own models.

Compatibility

Currently built and tested on Python 3.9.18.

TO-DO

  • [x] Diagnosis of thresholding models

  • [x] Fix DataFrame display utils in dataframe_utils.py

  • [ ] Add mask parameter W to PRIMP and ELBMF

  • [ ] Make a page dedicated to contributors and references

  • [ ] Include BMF visualization models

  • [ ] Include BMF simplification models
