Jingyi Jessica Li

Jingyi Jessica Li (Chinese: 李婧翌) is a statistical scientist whose work bridges statistics and computational biology, with a focus on developing rigorous statistical methods for the analysis of high-throughput biological data. Her research integrates statistical principles with biological data analysis, particularly in genomics and transcriptomics.

Education and career

Li received her undergraduate degree in biological sciences from Tsinghua University in 2007 and earned her Ph.D. in biostatistics from the University of California, Berkeley in 2013, under the supervision of Peter J. Bickel and Haiyan Huang. She joined the University of California, Los Angeles (UCLA) as a faculty member in 2013. From 2022 to 2023, she was a Radcliffe Fellow at the Harvard Radcliffe Institute for Advanced Study and a visiting professor in the Department of Statistics at Harvard University. In July 2025, she joined the Fred Hutchinson Cancer Center as Professor and Program Head of the Biostatistics Program, where she holds the Donald and Janet K. Guthrie Endowed Chair in Statistics and a joint appointment in the Herbold Computational Biology Program. She is also an Affiliate Professor in the Department of Biostatistics at the University of Washington.

Research

Her work relates to transcriptional and translational control of protein expression levels in the central dogma, and to statistical methods for RNA-seq data at the bulk and single-cell levels. Her 2015 Science study, a reanalysis of a 2011 Nature article, suggested that transcription, rather than translation, remains the dominant factor regulating protein abundance, primarily influencing differences in protein expression levels across genes.

Her research group developed a suite of single-cell data simulators: scDesign; scDesign2, which captures gene-gene correlations; scDesign3, for single-cell and spatial multi-omics data; and scReadSim, for single-cell RNA-seq and ATAC-seq read simulation. Her group also developed scImpute, an imputation tool for missing gene expression values. Her contributions extend to general statistical and computational methodology, including Clipper, a p-value-free false discovery rate (FDR) control method; ITCA, a criterion for guiding the combination of ambiguous class labels in multiclass classification; and Neyman-Pearson classification, a framework for prioritizing the control of misclassification errors in critical classes.

Her recent efforts advocate for statistical rigor in genomics data analysis. In a recent study, she and co-authors cautioned against applying popular RNA-seq differential expression (DE) methods without checking their underlying assumptions. For example, in population-scale human RNA-seq samples where the negative binomial assumption does not hold for each gene, popular methods relying on this assumption can produce excessive false discoveries, while non-parametric tests such as the Wilcoxon rank-sum test give more reliable results. She also developed scDEED, a statistical method that uses permutation techniques to evaluate and optimize embeddings produced by t-SNE and UMAP. scDEED detects dubious embeddings that fail to preserve mid-range distances and refines t-SNE and UMAP hyperparameters.
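
To illustrate the differential expression point above, the following is a minimal sketch of a per-gene Wilcoxon rank-sum test followed by Benjamini-Hochberg FDR adjustment. It is not the code or exact procedure of the study: the function names (midranks, rank_sum_pvalue, benjamini_hochberg), the normal approximation to the rank-sum statistic without a tie correction to the variance, and the simulated expression values are all assumptions made for illustration.

```python
import math
import random


def midranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value using the normal approximation.

    Assumption: no tie correction is applied to the variance, so this is
    only a rough approximation for heavily tied or very small samples.
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical data: 20 genes, 30 samples per condition; only the first
# 3 genes have a true shift between the two conditions.
rng = random.Random(0)
pvals = []
for gene in range(20):
    shift = 2.0 if gene < 3 else 0.0
    cond_a = [rng.gauss(5.0, 1.0) for _ in range(30)]
    cond_b = [rng.gauss(5.0 + shift, 1.0) for _ in range(30)]
    pvals.append(rank_sum_pvalue(cond_a, cond_b))

adjusted = benjamini_hochberg(pvals)
called = [gene for gene, q in enumerate(adjusted) if q < 0.05]
print("genes called DE at FDR 0.05:", called)
```

Because the test uses only the ranks of the expression values, it makes no negative binomial or other distributional assumption about each gene, which is the property the passage above attributes to non-parametric tests.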

Source: Wikipedia ↗
