RNA-seq data hold significant potential for identifying genetic variants. However, interpreting genomic sites where no variant is detected remains a challenge. These sites may be true negatives (non-events) or may result from insufficient sequencing coverage, making their interpretation uncertain. Resolving this ambiguity is crucial for improving the accuracy and utility of RNA-seq-based variant detection.
Here, we introduce a method for predicting a robust confidence metric, pQUAL, which quantifies the confidence in non-events that are not reported in traditional VCF files. The method combines information from existing VCF files (e.g., coverage, alternative allele depth, and QUAL scores) with coverage files that report per-base coverage for all sites, and uses both to model confidence scores for previously unreported (non-event) sites. This approach enables accurate classification of sites into likely events, likely non-events, or inconclusive cases based on user-defined thresholds.
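The final thresholding step could look roughly like the minimal sketch below. The function name `classify_site`, its parameter names, and the assumption that a higher pQUAL value means stronger support for an event are illustrative only and are not taken from the method itself; the actual scale and direction of pQUAL may differ.

```python
from typing import Literal

SiteCall = Literal["likely event", "likely non-event", "inconclusive"]


def classify_site(pqual: float, non_event_max: float, event_min: float) -> SiteCall:
    """Assign a site to one of three classes using two user-defined thresholds.

    Assumption (hypothetical): higher pQUAL means stronger support for an event.
    """
    if non_event_max >= event_min:
        raise ValueError("non_event_max must be lower than event_min")
    if pqual >= event_min:
        return "likely event"
    if pqual <= non_event_max:
        return "likely non-event"
    # Scores between the two thresholds are left unresolved.
    return "inconclusive"


# Arbitrary example thresholds, chosen only for illustration.
calls = [classify_site(score, non_event_max=10.0, event_min=30.0)
         for score in (2.5, 18.0, 45.0)]
```

Using two thresholds rather than one keeps an explicit inconclusive band, so that sites with intermediate scores are not forced into either class.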
Published for ISMB/ECCB 2025.
Matjaž Žganec¹, Marcel Levstek², Roman Luštrik², Janez Kokošar¹, Anže Lovše²,³, Luka Ausec¹, Kristian Urh²
¹ Genialis, Inc.
² Genialis d.o.o.
³ University of Ljubljana, Faculty of Computer and Information Science