Single-cell technologies have become a staple of modern biological discovery, offering a highly detailed look at the complex dynamics at play in individual cells. These technologies are now routinely used to produce datasets with hundreds of thousands to millions of cells, a data regime ripe for developing and applying machine learning methods to model and predict cellular features. However, while these technologies are powerful tools for understanding cellular and systems biology, their cost and complexity have typically limited studies to small sample cohorts (ones to tens of samples), making them difficult to apply to population-scale human health. This disparity between the number of cells and the number of samples has produced an abundance of ML methods that model individual cells and a paucity of methods that model entire samples. As single-cell technologies become cheaper and more commoditized, there is a growing need for methods that can model and predict sample-level labels for human health.
In his talk, David will discuss some of the unique challenges of single-cell data analysis, such as the sparsity of features, the outsized effects that rare cells can have, and the inherently unordered nature of the data. He will discuss how AI/ML strategies are uniquely positioned to model this data and present a set of ideal features for ML models developed for this space. He will then present some single-cell model architectures and describe how they are being applied at his startup (ImYoo, https://imyoo.health) for unsupervised discovery of patient groupings with common molecular pathways in autoimmune disease, as well as for predicting sample-level phenotypic data.
David began his career in software engineering and machine learning. He developed software for the retail sector for several years and then transitioned into an AI role at NASA’s Jet Propulsion Laboratory, where he developed systems for decision-making and planning for swarms of unmanned surface vehicles. David turned to applying software and machine learning to biology after seeing the scale of data coming online from next-generation sequencing assays and its potential to dramatically improve human health. He then completed his PhD in Computation and Neural Systems at Caltech, where he developed methods for predicting and characterizing the delivery of AAV gene therapies across many cells and tissues. In 2021, David co-founded ImYoo (https://imyoo.health), a startup building longitudinal single-cell immune datasets from small blood samples that people can collect on their own at home, paving the way for population-scale single-cell analysis.