Mowgli is a novel method for the integration of paired multi-omics data with any type and number of omics, combining integrative Nonnegative Matrix Factorization and Optimal Transport. Read the preprint!
Mowgli is implemented as a Python package seamlessly integrated within the scverse ecosystem, in particular Muon and Scanpy.
On all operating systems, the easiest way to install Mowgli is via PyPI. Installation should typically take a minute and is continuously tested with Python 3.8 on an Ubuntu virtual machine.
```bash
pip install mowgli
```

Alternatively, clone the repository and install from source:

```bash
git clone git@github.com:cantinilab/Mowgli.git
pip install ./Mowgli/
```
Mowgli takes a Muon object as input and populates its `obsm` and `uns` fields with the embeddings and dictionaries, respectively. Visit mowgli.rtfd.io for more documentation and tutorials.
You may download a preprocessed 10X Multiome demo dataset here.
A GPU is not required for small datasets, but is strongly recommended above 1,000 cells. On CPU, the cell lines demo (206 cells) should run in under 5 minutes and the PBMC demo (500 cells) should run in under 10 minutes (tested on an Ubuntu 20.04 machine with an 11th gen i7 processor).
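The rule of thumb above can be wrapped in a small helper. This is only a sketch: `pick_device` and `gpu_threshold` are hypothetical names that are not part of Mowgli, and the sketch assumes that PyTorch is the backend whose CUDA availability matters. How the chosen device is passed to training (for example through a `device` argument) should be checked against the documentation at mowgli.rtfd.io.

```python
def pick_device(n_cells: int, gpu_threshold: int = 1000) -> str:
    """Return "cuda" if a GPU is usable, else "cpu".

    Hypothetical helper, not part of Mowgli. It warns when the dataset
    is larger than the threshold and no GPU is available.
    """
    try:
        import torch  # assumed backend; optional for this sketch
        has_gpu = torch.cuda.is_available()
    except ImportError:
        has_gpu = False

    if n_cells > gpu_threshold and not has_gpu:
        print(f"Warning: {n_cells} cells without a GPU; training may be slow.")
    return "cuda" if has_gpu else "cpu"


# Example: the PBMC demo has 500 cells, so CPU is acceptable.
device = pick_device(n_cells=500)
print(device)
```

The helper never refuses to run on CPU. It only prints a warning, since the 1,000-cell figure is a recommendation, not a hard limit.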
```python
import mowgli
import mudata as md
import scanpy as sc

# Load data into a Muon object.
mdata = md.read_h5mu("my_data.h5mu")

# Initialize and train the model.
model = mowgli.models.MowgliModel(latent_dim=15)
model.train(mdata)

# Visualize the embedding with UMAP.
sc.pp.neighbors(mdata, use_rep="W_OT")
sc.tl.umap(mdata)
sc.pl.umap(mdata)
```
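Since the embedding comes from a nonnegative factorization, one simple way to explore it is to label each cell with its largest latent factor. The sketch below uses a small made-up array as a stand-in for `mdata.obsm["W_OT"]`. `dominant_factor` is a hypothetical helper, not part of Mowgli, and this labelling is an illustrative heuristic, not the procedure used in the preprint.

```python
import numpy as np


def dominant_factor(embedding: np.ndarray) -> np.ndarray:
    """Return, for each cell (row), the index of its largest latent factor."""
    return np.asarray(embedding).argmax(axis=1)


# Stand-in for mdata.obsm["W_OT"]: 5 cells, 3 nonnegative factors.
W = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
    [0.6, 0.4, 0.0],
    [0.1, 0.1, 0.8],
])

labels = dominant_factor(W)
print(labels)  # [0 1 2 0 2]
```

With a trained model, the same labels could be stored with `mdata.obs["dominant_factor"] = dominant_factor(mdata.obsm["W_OT"]).astype(str)` and shown on the UMAP with `sc.pl.umap(mdata, color="dominant_factor")`.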
```bibtex
@article{huizing2023paired,
  title={Paired single-cell multi-omics data integration with Mowgli},
  author={Huizing, Geert-Jan and Deutschmann, Ina Maria and Peyre, Gabriel and Cantini, Laura},
  journal={bioRxiv},
  pages={2023--02},
  year={2023},
  publisher={Cold Spring Harbor Laboratory}
}
```
If you're looking for the repository with code to reproduce the experiments in our preprint, here it is!