
Seminar Series
April 8, 2020 @ 10:00 am - 11:00 am
Robust and scalable methods for massively sequenced genomes and single-cell transcriptomes
Speaker: Hyun Min Kang, Ph.D.
ABSTRACT: The rapidly accelerating pace of genome and single-cell transcriptome sequencing holds great promise for precision medicine but also poses tremendous computational and statistical challenges at an unprecedented scale. These challenges include accurately calling genetic variants from petabytes of sequenced genomes, performing phenome-wide association analysis across millions of genomes, and identifying cell-type-specific regulatory determinants across billions of single-cell transcriptomes. In this talk, I will first describe the analysis of >150,000 deeply sequenced genomes to characterize genetic signatures associated with heart, lung, blood, and sleep (HLBS) disorders for the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. I will describe several analytic challenges and solutions for scaling up the analysis of deep genome sequencing data to millions of genomes. Drawing on such large-scale genome sequencing, I will demonstrate how our understanding of functional rare variants has expanded over the last several years. Next, I will present computational methods that enable massively multiplexed population-scale single-cell RNA sequencing (scRNA-seq) experiments, at several-fold lower cost and at >10-fold higher throughput than a typical 10x Genomics experiment. Our methods, demuxlet and freemuxlet, enable genetically multiplexed designs of scRNA-seq by harnessing natural genetic variation to determine the sample identity of each cell and filter out droplets containing multiple cells. I will demonstrate that demuxlet and freemuxlet accurately identify the provenance of barcoded droplets across multiple tissues and scRNA-seq technologies. Finally, I will introduce popscle, an integrative software package designed for population-scale scRNA-seq and scATAC-seq studies.