Single-Cell Data Analysis Using MMD Variational Autoencoder


The variational autoencoder (VAE) is a generative model that originated in the computer vision community; it learns a latent representation of images and generates new images in an unsupervised way. Recently, Vanilla VAE has been applied to single-cell data analysis, in the hope of harnessing the representational power of the latent space to evade the “curse of dimensionality” of the original dataset. However, Vanilla VAE suffers from a less informative latent space, which raises the question of how reliably its latent space can represent high-dimensional single-cell datasets. I therefore set up a study to examine this issue from multiple perspectives. This paper confirms the issue by comparing Vanilla VAE with MMD-VAE, a VAE variant that is claimed to overcome the issue on image data, across a series of single-cell RNA-seq and mass cytometry datasets. The results indicate that MMD-VAE is superior to Vanilla VAE in retaining information not only in the latent space but also in the reconstruction space, which suggests that MMD-VAE is a better option than Vanilla VAE for single-cell data analysis.
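
For readers unfamiliar with the term, MMD stands for maximum mean discrepancy, a kernel-based distance between two sets of samples. As MMD-VAE is commonly formulated, the KL-divergence term of the Vanilla VAE objective is replaced by an MMD term between the encoded latent codes and samples drawn from the prior. The sketch below is not taken from the manuscript; it is a minimal pure-Python illustration of a biased MMD estimate with a Gaussian kernel. The function names, the bandwidth of 1.0, and the toy two-dimensional samples are all assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


def sample_gaussian(n, dim, shift=0.0, seed=0):
    """Hypothetical helper: n points from an isotropic unit-variance Gaussian."""
    rng = random.Random(seed)
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


# Toy stand-ins: "prior" plays the role of samples from the latent prior,
# the other two play the role of encoded latent codes (illustrative only).
prior = sample_gaussian(100, 2, seed=0)
matched_codes = sample_gaussian(100, 2, seed=1)
shifted_codes = sample_gaussian(100, 2, shift=3.0, seed=2)

print("MMD^2, codes matching the prior:", mmd_squared(prior, matched_codes))
print("MMD^2, codes shifted from the prior:", mmd_squared(prior, shifted_codes))
```

In this toy setting the estimate should be close to zero when the codes follow the prior and clearly larger when they are shifted away from it, which is the behaviour an MMD penalty relies on. In an actual model the same quantity would be computed on mini-batches inside a differentiable framework rather than with Python lists.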

A draft of the manuscript is now available on bioRxiv and is being prepared for publication. Any constructive feedback is welcome.
Chao (Cico) Zhang
Thinker, Doer, Mindfulness Meditator, Mathematician, PhD Candidate

With mindfulness and philosophy, I relish thinking about the meaning of being and doing.