Paving the Way for New Discoveries and New Ideas
We develop high-throughput, interdisciplinary technologies that can measure, perturb, predict, or interpret cellular and genomic features. Traditionally, biology has been grounded in iterations of observation, individual perturbation, and measurement. The limited efficiency and scalability of this cycle constrain our ability to address many fundamental questions. To overcome these limits, we are developing and applying biologically grounded predictive genomics AI technologies alongside high-throughput experimental techniques, aiming to accelerate discoveries in fundamental biology.
For decades, forward and reverse genetic screens have been central to functional studies of genes and other genomic elements. Forward genetic screening starts with a phenotype and aims to determine its genetic basis, while reverse genetic screening starts from known genes (or, more broadly, DNA sequences) and assays the effect of perturbing each one. However, traditional genetic screening typically requires a large-scale setup and is strongly limited by available resources and experimental feasibility.
We recently proposed and developed the in silico screen research framework as a next-generation approach to genetic discovery. The framework relies on advanced machine learning / artificial intelligence models that are grounded in solid biological principles and predict accurately across varied biological contexts. Like an experimental genetic screen, an in silico screen interrogates the effect of perturbations, but it does so through model prediction rather than bench experiments, which allows ultra-high throughput. We recently developed C.Origami (Nature Biotechnology, 2023) and Chromnitron (bioRxiv, 2025), both biologically informed deep neural networks that predict cis- and trans-regulatory mechanisms of gene expression in unseen cell types. Coupling these models with the in silico screen framework, we discovered previously unknown key factors that regulate genome organization and cell fate transitions. We continue to develop biologically grounded predictive genomics AI technologies and high-throughput in silico genetic screen platforms to enable more fundamental discoveries in genome sciences.
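The general logic of an in silico screen can be illustrated with a minimal sketch. This is not the C.Origami or Chromnitron implementation: `toy_predictor` is a hypothetical stand-in for a trained model (here just a moving average), the input is an invented one-dimensional signal, and the impact score (mean absolute change in the prediction) is an assumption chosen for simplicity.

```python
from typing import Callable, List, Sequence, Tuple


def toy_predictor(signal: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a trained model: a 3-bin moving average."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(0, i - 1): min(n, i + 2)]
        out.append(sum(window) / len(window))
    return out


def in_silico_screen(
    signal: Sequence[float],
    predict: Callable[[Sequence[float]], List[float]],
    window: int = 1,
) -> List[Tuple[int, float]]:
    """Perturb each window in turn and rank windows by predicted impact."""
    baseline = predict(signal)
    scores = []
    for start in range(len(signal) - window + 1):
        perturbed = list(signal)
        for j in range(start, start + window):
            perturbed[j] = 0.0  # in silico "deletion" of this window
        prediction = predict(perturbed)
        # Assumed impact score: mean absolute change versus the baseline.
        impact = sum(abs(a - b) for a, b in zip(baseline, prediction)) / len(baseline)
        scores.append((start, impact))
    # Largest predicted effect first, as in a ranked screen hit list.
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Invented input signal with one strong and one weak feature.
signal = [0.0, 0.0, 5.0, 0.0, 1.0, 0.0]
ranked = in_silico_screen(signal, toy_predictor)
for position, impact in ranked[:2]:
    print(f"position {position}: impact {impact:.3f}")
```

Because every perturbation is only a model evaluation, the loop scales to as many candidate perturbations as compute allows, which is the sense in which such a screen is high-throughput.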
Chemical genomics, at the intersection of chemistry and genomics, plays an important role in advancing our understanding of the genome through technological innovation. Chemical approaches for studying dynamic biological interactions, such as protein-DNA and protein-protein interactions, have laid the foundation for our knowledge of genome regulation. Bisulfite treatment of DNA has provided fundamental insights into the epigenetic status of the genome. We embrace any chemistry that can enable genomic discovery and cell fate engineering.
We have pioneered the development of several chemical genomic technologies for DNA epigenetic modifications. DNA methylation (5mC) and its TET-protein-mediated derivatives (5hmC, 5fC, and 5caC) represent one major part of epigenetics. Understanding the function of an epigenetic mark requires profiling its genome-wide distribution. Bisulfite sequencing has long been the gold standard for DNA methylation analysis, but it cannot distinguish these newer epigenetic bases. We have developed several unique and robust chemical methods to map the genome-wide distribution of 5fC and 5hmC. Represented by 'fC-CET' and 'CLEVER-seq', these methods demonstrated the concept of 'bisulfite-free & base-resolution' analysis of DNA epigenetic modifications. We used these methods to analyze the epigenomes of embryonic stem cells and of single cells from the early developing embryo. These technologies will help us understand the molecular basis of epigenetic gene expression regulation and how these chemical modifications, together with their modifier and reader proteins, affect mammalian development.
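Why conventional bisulfite sequencing cannot resolve these bases can be seen in a small sketch. It encodes the commonly described idealized readout of each cytosine form after bisulfite treatment and groups the forms by the base the sequencer reports. The table assumes complete conversion and ignores the partial-conversion and damage artifacts of real experiments; it does not model the chemistry of fC-CET or CLEVER-seq.

```python
from collections import defaultdict
from typing import Dict, List

# Idealized base read after standard bisulfite treatment and PCR
# (assumption: complete conversion, no side reactions).
BISULFITE_READOUT: Dict[str, str] = {
    "C": "T",     # unmodified cytosine is deaminated and read as T
    "5mC": "C",   # protected from conversion, read as C
    "5hmC": "C",  # also read as C, so it looks like 5mC
    "5fC": "T",   # converted, so it looks like unmodified C
    "5caC": "T",  # converted, so it looks like unmodified C
}


def indistinguishable_groups(readout: Dict[str, str]) -> Dict[str, List[str]]:
    """Group cytosine forms that produce the same sequenced base."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for form, base in readout.items():
        groups[base].append(form)
    return {base: sorted(forms) for base, forms in groups.items()}


groups = indistinguishable_groups(BISULFITE_READOUT)
for base, forms in sorted(groups.items()):
    print(f"read as {base}: {', '.join(forms)}")
```

Five cytosine states collapse into two readouts, which is why distinguishing 5hmC or 5fC at base resolution calls for chemistry that selectively labels or converts a single modification.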