FENYO LAB
Research: Multiomics Integration
Measurements of DNA, RNA, proteins and their modifications, and metabolites provide complementary information about a biological system. Omics data sets are typically noisy and have missing values, making integration challenging. We are approaching this challenge in many different ways.
Tools for proteogenomic integration: We have developed proteogenomic integration methods including: QUILTS for generating tumor-specific protein sequence databases using the observed DNA and RNA sequence variation (Ruggles et al., MCP 2016, GitHub), PGx for mapping peptide sequences onto putative genomic coordinates (Askenazi et al., JPR 2015, GitHub), LlamaMagic for sequencing high-affinity single-chain llama antibodies (Fridy et al., Nature Methods 2014, GitHub), BlackSheep for differential extreme value analysis (Blumenberg et al., JPR 2021, GitHub), a method based on independent component analysis for extracting signatures (Liu et al., MCP 2019), PhosphoDisco for discovery of co-regulated phosphorylation modules in phosphoproteomic data (Schraink et al., MCP 2021), and Panoptes for integrating proteogenomics and histopathology using deep learning (Hong et al., Cell Reports Medicine 2021, Wang et al., Cell Reports Medicine 2023).
Cancer proteogenomics: In collaboration with the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and the International Cancer Proteogenome Consortium (ICPC), we apply multiomics integration methods to define the proteomic landscape across tumor types with the goal of increasing our understanding of tumor biology: endometrial cancer (Dou & Kawaler et al., Cell 2020, Dou & Katsnelson et al., Cancer Cell 2023, Hong et al., Cell Reports Medicine 2021), ovarian cancer (Zhang et al., Cell 2016, Chowdhury et al., Cell 2024), lung cancer (Gillette et al., Cell 2020, Satpathy et al., Cell 2021, PubMed), breast cancer (Mertins et al., Nature 2016, Krug et al., Cell 2020, PubMed), melanoma (Kuras et al., bioRxiv 2023, Betancourt et al., Clin Transl Med. 2021, PubMed), brain cancer (Petralia et al., Cell 2020, Wang et al., Cancer Cell 2021, Liu et al., Cancer Cell 2024), kidney cancer (Clark et al., Cell 2020, Li et al., Cancer Cell 2023), and pancreatic cancer (Cao et al., Cell 2021). We are also exploring the similarities between cancer types through pan-cancer analysis: we have integrated proteogenomics data with histology across cancer types using deep learning (Wang et al., Cell Reports Medicine 2023), explored how aneuploidy is associated with different modes of gene regulation (Cheng et al., eLife 2022), investigated the role of retrotransposons in tumor biology and their relationship to p53 mutation, copy number alteration, and S phase checkpoint (McKerrow et al., PNAS 2021), studied HLA expression at the transcript and protein levels across tumor types (Wang et al., JPR 2023), and explored the immune landscape of tumors (Petralia et al., Cell 2024).
Antibody characterization: We used proteogenomic integration strategies to determine the amino acid sequences of disease-specific antibodies in patients infected with HIV (Scheid et al., Science 2011) and malaria (Muellenbeck et al., J. Exp. Med. 2013), and to produce large repertoires of nanobodies (single-domain antibodies derived from the variable regions of Camelidae atypical immunoglobulins) against two antigens, GFP and mCherry (Fridy et al., Nature Methods 2014, Fridy et al., JBC 2024), and against SARS-CoV-2 (Mast et al., eLife 2021).