Machine Learning Pipelines for Deconvolution of Cellular and Subcellular Heterogeneity from Cell Imaging

Chuangqi Wang, Worcester Polytechnic Institute


Cell-to-cell variation and intracellular processes such as cytoskeletal organization and organelle dynamics exhibit substantial heterogeneity. Advances in imaging and optics have enabled researchers to access spatiotemporal information in living cells efficiently. Although current imaging technologies allow us to acquire an unprecedented number of cell images, it remains challenging to extract valuable information from these massive and complex datasets to interpret heterogeneous biological processes. Machine learning (ML), a set of computational tools for acquiring knowledge from data, provides promising solutions to this challenge. In this dissertation, we developed ML pipelines for the deconvolution of subcellular protrusion heterogeneity from live cell imaging and for molecular diagnostics from lens-free digital in-line holography (LDIH) imaging. Cell protrusion is driven by spatiotemporally fluctuating actin assembly processes and is morphodynamically heterogeneous at the subcellular level. Elucidating the molecular dynamics underlying subcellular protrusion heterogeneity is crucial to understanding the biology of cellular movement. Traditional ensemble averaging methods, which do not characterize this heterogeneity, can mask important activities. Therefore, we established an auto-correlation function (ACF)-based time series clustering pipeline called HACKS (deconvolution of heterogeneous activities in coordination of cytoskeleton at the subcellular level) to identify distinct subcellular lamellipodial protrusion phenotypes together with their underlying actin regulator dynamics from live cell imaging. Using our method, we discovered “accelerating protrusion”, which is driven by the temporally ordered coordination of Arp2/3 and VASP activities.
Furthermore, leveraging the ability of ML, especially deep learning (DL), to learn features automatically, we advanced our pipeline to learn fine-grained temporal features by integrating the prior ML analysis results with bidirectional long short-term memory (bi-LSTM) autoencoders, in order to dissect protrusion heterogeneity in variable-length time series. By applying it to subcellular protrusion dynamics in pharmacologically and metabolically perturbed epithelial cells, we discovered fine differential responses of protrusion dynamics specific to each perturbation. This provides an analytical framework for a detailed and quantitative understanding of the molecular mechanisms hidden in this heterogeneity. LDIH is a promising microscopic tool that overcomes several drawbacks (e.g., limited field of view) of traditional lens-based microscopy. However, numerical reconstruction of hologram images from large-field-of-view LDIH is extremely time-consuming. To date, there are no effective hand-crafted features for interpreting the lateral and depth information directly from the complex diffraction patterns in hologram images, which limits the utility of LDIH for point-of-care applications. Building on the ability of DL to learn generalized features automatically, we proposed a deep transfer learning (DTL)-based approach to process LDIH images without reconstruction in the context of cellular analysis. Specifically, with raw holograms as input, features extracted from a well-trained network classified cells into categories according to the number of cell-bound microbeads, with performance comparable to that obtained using object images as input. Combined with the developed DTL approach, LDIH could be realized as a low-cost, portable tool for point-of-care diagnostics. In summary, this dissertation demonstrates that ML applied to cell imaging can successfully dissect subcellular heterogeneity and perform cell-based diagnosis.
We expect this work to contribute significantly to data-driven cell biological research.
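
As an illustration of the ACF-based time series clustering idea mentioned above, the following is a minimal sketch and not the HACKS implementation. It assumes that each time series is summarized by its sample autocorrelation at a few lags and that the resulting feature vectors are then grouped. The k-means step, the function names (`acf`, `kmeans`), the lag range, and the synthetic sine-wave series are all assumptions introduced here for illustration; the abstract does not specify the clustering algorithm or its parameters.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation of one time series for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    var = sum(c * c for c in centered)
    if var == 0:
        # A constant series has no defined autocorrelation; return zeros.
        return [0.0] * max_lag
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / var
        for lag in range(1, max_lag + 1)
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(features, k, n_iter=20):
    """Plain k-means on ACF features (a stand-in clustering step)."""
    # Farthest-first initialisation keeps the sketch deterministic.
    centers = [list(features[0])]
    while len(centers) < k:
        farthest = max(
            features, key=lambda f: min(euclidean(f, c) for c in centers)
        )
        centers.append(list(farthest))
    labels = [0] * len(features)
    for _ in range(n_iter):
        # Assign each series to its nearest center.
        labels = [
            min(range(k), key=lambda j: euclidean(f, centers[j]))
            for f in features
        ]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical data: three slowly varying and three rapidly oscillating series.
smooth = [[math.sin(2 * math.pi * t / 40 + p) for t in range(80)]
          for p in (0.0, 0.5, 1.0)]
fast = [[math.sin(2 * math.pi * t / 4 + p) for t in range(80)]
        for p in (0.0, 0.5, 1.0)]

features = [acf(s, max_lag=10) for s in smooth + fast]
labels = kmeans(features, k=2)
print(labels)
```

Comparing series through their autocorrelation rather than their raw values makes the grouping insensitive to phase shifts, so series with the same temporal structure but different starting points land in the same cluster.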