Kipoi Seminar - Peter Koo (CSHL)
- Published Nov 4, 2024
- Interpreting rules of gene regulation learned by deep learning
Abstract:
Deep neural networks (DNNs) have become a widely used tool to analyze high-throughput functional genomics data. However, DNNs’ low interpretability makes it challenging to translate their improved predictions into new discoveries in biology and healthcare. While progress has been made to explain DNN predictions through feature attributions and counterfactual explanations, existing interpretability methods sacrifice deeper insights in favor of simpler explanations that can work generically across models and domains. Here, we introduce two domain-specific model interpretability methods for regulatory genomics. First, we leverage domain knowledge based on multiplex assays of variant effects (MAVEs) to characterize biological mechanisms within a genomic locus learned by a DNN. Our approach, called SQUID, approximates a DNN around a custom region of sequence space using in silico MAVEs and biophysics-inspired surrogate models, which provide parameters with direct interpretations as cis-regulatory mechanisms. We show that SQUID leads to more consistent representations of transcription factor binding motifs and yields improved single-nucleotide variant effect predictions compared to existing attribution methods. Second, we leverage CRISPR-inspired perturbations to address targeted biological questions of gene regulation learned by a DNN. Our approach, called CRÈME, employs multiscale in silico perturbations to identify cis-regulatory elements (CREs) and characterize their enhancing or silencing effect on DNN predictions. By interpreting Enformer, a state-of-the-art sequence-based DNN for gene expression prediction, CRÈME reveals that Enformer’s predictions integrate numerous enhancers and silencers in an additive manner, though for some genes, the CREs exhibit cooperativity or redundancy. Together, our work demonstrates that domain knowledge is essential for extracting deeper biological insights from genomic DNNs.
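The in silico MAVE idea described in the abstract can be illustrated with a small sketch. This is not the SQUID package or its API. It assumes a hypothetical toy scoring function (`make_toy_model`) in place of a trained DNN, and it uses an ordinary least-squares additive model in place of the biophysics-inspired surrogate models mentioned in the talk. All function names and parameters here are made up for illustration.

```python
import numpy as np

ALPHABET = "ACGT"

def make_toy_model(length, seed=0):
    """Hypothetical stand-in for a trained DNN: an additive scorer with known weights."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(length, 4))
    def model(x):
        # x is a one-hot array of shape (length, 4)
        return float(np.sum(weights * x))
    return model, weights

def in_silico_mave(seq, model, n_mutants=2000, mut_rate=0.1, seed=1):
    """Randomly mutagenize `seq` and score every mutant with `model`."""
    rng = np.random.default_rng(seed)
    wt = np.array([ALPHABET.index(c) for c in seq])
    mutate = rng.random((n_mutants, len(seq))) < mut_rate
    # A shift of 1..3 (mod 4) guarantees a mutated position changes base.
    shifts = rng.integers(1, 4, size=(n_mutants, len(seq)))
    idx = np.where(mutate, (wt + shifts) % 4, wt)
    X = np.eye(4)[idx]                      # shape (n_mutants, L, 4)
    y = np.array([model(x) for x in X])
    return X, y

def fit_additive_surrogate(X, y):
    """Least-squares additive surrogate: one parameter per position and base."""
    n, L, _ = X.shape
    design = np.hstack([X.reshape(n, -1), np.ones((n, 1))])  # add intercept
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    theta = coef[:-1].reshape(L, 4)
    # One-hot designs are rank-deficient, so fix the gauge by centering
    # each position's four parameters at zero.
    return theta - theta.mean(axis=1, keepdims=True)

seq = "ACGTACGTACGT"
model, weights = make_toy_model(len(seq))
X, y = in_silico_mave(seq, model)
theta = fit_additive_surrogate(X, y)
```

Because the toy scorer is itself additive, the fitted `theta` should match the scorer's weights after the same per-position centering. With a real DNN, the surrogate would only approximate the model's behavior in the sampled neighborhood of `seq`.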