Immune checkpoint inhibitors (CPIs) are arguably the most transformative advance in cancer treatment in the last decade. Nevertheless, despite improved outcomes and future promise, the vast majority of cancer patients today fail to benefit from CPIs. Checkmate Pharmaceuticals is conducting a phase Ib multi-center, open-label study of a checkpoint inhibitor (pembrolizumab) combined with a CpG-A Toll-like receptor 9 (TLR9) agonist in anti-PD-1 nonresponders with advanced cutaneous melanoma. Genialis partnered with Checkmate Pharmaceuticals to analyze clinical trial sequencing data and help identify molecular signatures of response using our end-to-end NGS data management and analysis platform, Genialis Expressions.
Drug development in oncology stands to benefit greatly from an increasing number of public datasets like The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia (CCLE). Further, translational research groups and even pharma teams themselves regularly publish novel gene signatures that claim to improve our predictive, prognostic or diagnostic capabilities around a particular disease, mechanism or class of drug. But making use of these data resources and prior analytic work is not trivial. One must take a cautious and systematic approach to QA/QC, and spend the time to understand both the data and the models before hoping to apply them to new therapies and patient cohorts. Genialis is working with several leading biopharma companies, across a range of cancer sites and drug types, to leverage public and proprietary data in building predictive models for the next generation of life-saving treatments.
U-BIOPRED (Unbiased BIOmarkers in PREDiction of respiratory disease outcomes) was a multi-country, multi-year research project that used patient medical information and tissue samples to learn more about different types of asthma, with the aim of better diagnosis and treatment for each person. Genialis partnered with Boehringer Ingelheim to develop a web application for identifying gene signatures associated with the characteristics of patient cohorts. Using patients' microarray data from the U-BIOPRED database, our web app provided access to hundreds of expression profiles and more than a thousand demographic and clinical variables for each patient. We built a data import function from the tranSMART database, and a custom ontology to accommodate the project-specific metadata. The visualization suite included graphical modules for everything from exploring individual gene expression patterns to stratifying patients by metadata parameters using unsupervised clustering.
We developed a web application atop the Genialis platform for the analysis of disease variants profiled by Swift Biosciences Accel Amplicon panels. Researchers at CPMCRI use the software for primary and secondary analysis of targeted gene sequencing data from low-input tissue derived from high mutation load patient samples. We also work with the client to pipe these results into their in-house data integration workflow. The goal is to develop efficacious disease models (PDX, cell culture) and faithful biomarkers across a large spectrum of cancer types prevalent in the affected population.
Genialis supported the development of cancer diagnostic biochips by applying various machine learning methods to disease and control data sets from medium and large probe sets. We surveyed the performance of support vector machines, neural networks, random forests, and gradient boosting, as well as logistic regression. In addition to providing the customer with a ranked prediction set for immediate lab validation, we performed a detailed examination of model performance across different methods and attribute optimization. The final trusted results were reached by model stacking, which produced a super-model with superior prediction accuracy while minimizing the risk of over-fitting.
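As a rough illustration of model stacking, the sketch below combines the four model families named above and lets a logistic regression learn how to weight their cross-validated predictions. It is not the project's actual pipeline: the use of scikit-learn, the synthetic data from `make_classification`, the hyperparameters, and the choice of logistic regression as the final estimator are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for disease/control probe data (assumption, not real data).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Base learners; the SVM and the neural network get scaled inputs.
base_learners = [
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    (
        "mlp",
        make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
        ),
    ),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# The final estimator is trained on out-of-fold predictions of the base
# learners (cv=5), which limits over-fitting of the combined model.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

# Held-out accuracy of each base learner and of the stacked model.
scores = {
    name: est.score(X_test, y_test)
    for name, est in stack.named_estimators_.items()
}
scores["stack"] = stack.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name:>5}: {acc:.3f}")
```

Comparing the stacked score with the individual ones on held-out data is the simplest check that stacking adds value; on a real project this comparison would normally be repeated across several data splits.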
With discovery partners at the Therapeutic Innovation Center at Baylor College of Medicine, Genialis is developing a computational biology framework to identify and exploit cancer vulnerabilities across a range of indications. The principal challenges include identifying essential genes that exhibit disease specificity, modeling the impact of drugging those targets, and synthesizing the evidence from in vivo and in vitro validation studies. This work necessitates an innovative approach to bioinformatics processing and integration of RNA-, ChIP-, CLIP-, and ATAC-seq data, both bulk and single cell. Data integration, as well as target ranking and triage, are supported by novel biostatistical metrics and various machine learning modules. We are also developing a visualization framework to foster collaboration, speed up iteration cycles, and aid in the interpretation of model outputs.
Genialis works with the RNA Regulatory Networks Laboratory at the Francis Crick Institute to develop a comprehensive data repository and analysis software for RNA-protein interaction and associated gene expression data. Protein-RNA interactions represent one of the most crucial yet understudied aspects of the way our cells regulate gene activity, and are implicated in myriad diseases and developmental processes, e.g. cancer, motor neuron disease, Fragile X syndrome and ataxia. This project aims to elucidate the dynamics and evolution of protein-RNA complexes, and to build models to predict targets for novel therapies and disease diagnostics. We have created algorithms and a web-based user interface for managing, analyzing, exploring and sharing CLIP- and RNA-seq data. Further, we developed a custom AI-based quality control module to automatically detect and report on pass-fail characteristics of these data.
Thiopurines are a class of cytotoxic chemotherapy drugs used in the treatment of acute lymphoblastic leukemia (ALL) and inflammatory bowel disease (IBD). Previous observations noted an association between decreased activity of the gene TPMT and increased toxicity. However, the endogenous role of TPMT and its molecular processes were unknown. We performed data fusion to construct a predictive model of this genetic system. The model integrated proprietary patient data from gene expression profiling and genotype-phenotype correlations, as well as public information sources such as gene annotations, Gene Ontology semantic structure, and protein-protein interactions. The model enabled gene prioritization and gene network prediction, which suggested a role for TPMT in oxidoreductive processes and in regulating cell redox capacity. In vitro studies confirmed a difference in oxidative toxicity in HepG2 cells transfected with TPMT versus controls, validating the model's general prediction. Additional predicted gene associations are under investigation as biomarkers of TPMT-mediated thiopurine response.
We identified novel genetic risk factors for two disease indications, ulcerative colitis and rheumatoid arthritis, from a genetically distinct north Indian cohort. Both disease datasets contained microarray genotype data from disease and control patients, along with GWAS results. The goal was to discover novel genetic risk factors that were not revealed by the GWAS. We compared the GWAS ranking of genetic risk factors with the rankings produced by three supervised machine learning methods: random forests, support vector machines and neural networks. Models were trained on the microarray genotype data, with genetic risk factors as explanatory variables and the phenotype (disease or control) as the dependent target variable. For each model, we estimated the importance of each genetic risk factor, a measure of how much that factor contributes to the correct prediction of the target variable. Finally, genetic risk factors were ranked by their importance, revealing insights from the ML models that had been overlooked by GWAS alone.
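One common way to quantify how much a risk factor contributes to correct prediction is permutation importance: shuffle one variable across patients and measure how far the model's accuracy drops. The sketch below shows the idea on simulated genotypes and uses only the Python standard library. It is not the method used in the project: the simulated data, the two "causal" variants, the function names, and the small logistic regression (which stands in for the random forest, support vector machine and neural network models, purely to avoid external dependencies) are all assumptions for illustration.

```python
import math
import random


def make_genotypes(n_patients, n_snps, causal, rng):
    """Simulate 0/1/2 allele counts; disease risk depends only on `causal` SNPs."""
    X, y = [], []
    for _ in range(n_patients):
        row = [rng.choice((0, 1, 2)) for _ in range(n_snps)]
        risk = sum(row[j] - 1 for j in causal)
        p_disease = 1 / (1 + math.exp(-2.0 * risk))
        X.append(row)
        y.append(1 if rng.random() < p_disease else 0)
    return X, y


def linear_score(model, row):
    """Log-odds of disease; allele counts are centred at 1 before weighting."""
    weights, bias = model
    return bias + sum(w * (x - 1) for w, x in zip(weights, row))


def fit_logistic(X, y, epochs=300, lr=0.5):
    """Stand-in classifier: logistic regression fitted by batch gradient descent."""
    n, p = len(X), len(X[0])
    weights, bias = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, target in zip(X, y):
            err = 1 / (1 + math.exp(-linear_score((weights, bias), row))) - target
            grad_b += err
            for j in range(p):
                grad_w[j] += err * (row[j] - 1)
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(model, X, y):
    """Fraction of patients whose predicted class matches the phenotype."""
    hits = sum((linear_score(model, row) > 0) == (t == 1) for row, t in zip(X, y))
    return hits / len(y)


def permutation_importance(model, X, y, rng, repeats=10):
    """Mean drop in accuracy when one SNP column is shuffled across patients."""
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / repeats)
    return importances


rng = random.Random(0)
X, y = make_genotypes(n_patients=800, n_snps=8, causal=(0, 3), rng=rng)
X_train, y_train, X_test, y_test = X[:500], y[:500], X[500:], y[500:]

model = fit_logistic(X_train, y_train)
importances = permutation_importance(model, X_test, y_test, rng)

# SNP indices ordered from most to least important.
ranking = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
print(ranking)
```

Because the importance is computed from a fitted multivariable model rather than from one variant at a time, a ranking like this can differ from a single-marker GWAS ranking, which is the comparison described above.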