TAP: a targeted clinical genomics pipeline for detecting transcript variants using RNA-seq data

Abstract. Background: RNA-seq is a powerful and cost-effective technology for molecular diagnostics of cancer and other diseases, and it can reach its full potential when coupled with validated clinical-grade informatics tools. Despite recent advances in long-read sequencing, transcriptome assembly of...

Bibliographic Details
Main Authors: Readman Chiu (Author), Ka Ming Nip (Author), Justin Chu (Author), Inanc Birol (Author)
Format: Article
Published: BMC, 2018-09-01.

MARC

LEADER 00000 am a22000003u 4500
001 doaj_04c83c6e11664161bc6bc3eb730190a1
042 |a dc 
100 1 0 |a Readman Chiu  |e author 
700 1 0 |a Ka Ming Nip  |e author 
700 1 0 |a Justin Chu  |e author 
700 1 0 |a Inanc Birol  |e author 
245 0 0 |a TAP: a targeted clinical genomics pipeline for detecting transcript variants using RNA-seq data 
260 |b BMC,   |c 2018-09-01T00:00:00Z. 
500 |a 10.1186/s12920-018-0402-6 
500 |a 1755-8794 
520 |a Abstract. Background: RNA-seq is a powerful and cost-effective technology for molecular diagnostics of cancer and other diseases, and it can reach its full potential when coupled with validated clinical-grade informatics tools. Despite recent advances in long-read sequencing, transcriptome assembly of short reads remains a useful and cost-effective methodology for unveiling transcript-level rearrangements and novel isoforms. One of the major concerns in adopting the proven de novo assembly approach for RNA-seq data in clinical settings has been the analysis turnaround time. To address this concern, we have developed a targeted approach to expedite assembly and analysis of RNA-seq data. Results: Here we present our Targeted Assembly Pipeline (TAP), which consists of four stages: 1) alignment-free gene-level classification of RNA-seq reads using BioBloomTools, 2) de novo assembly of individual targets using Trans-ABySS, 3) alignment of assembled contigs to the reference genome and transcriptome with GMAP and BWA, and 4) structural and splicing variant detection using PAVFinder. We show that PAVFinder is a robust gene fusion detection tool when compared with established methods such as Tophat-Fusion and deFuse on simulated data of 448 events. Using the Leucegene acute myeloid leukemia (AML) RNA-seq data and a set of 580 COSMIC target genes, TAP identified a wide range of hallmark molecular anomalies, including gene fusions, tandem duplications, insertions and deletions, in agreement with the published literature. In the same dataset, TAP also captured AML-specific splicing variants, such as skipped exons and novel splice sites, that have been reported in other studies. The running time of TAP on 100-150 million read pairs and a 580-gene set is 1 to 2 hours on a 48-core machine. Conclusions: We demonstrated that TAP is a fast and robust RNA-seq variant detection pipeline that is potentially amenable to clinical applications. TAP is available at http://www.bcgsc.ca/platform/bioinfo/software/pavfinder 
546 |a EN 
690 |a RNA-seq 
690 |a Transcriptome assembly 
690 |a Clinical genomics 
690 |a Gene fusion 
690 |a Alternative splicing 
690 |a Internal tandem duplication 
690 |a Internal medicine 
690 |a RC31-1245 
690 |a Genetics 
690 |a QH426-470 
655 7 |a article  |2 local 
786 0 |n BMC Medical Genomics, Vol 11, Iss 1, Pp 1-9 (2018) 
787 0 |n http://link.springer.com/article/10.1186/s12920-018-0402-6 
787 0 |n https://doaj.org/toc/1755-8794 
856 4 1 |u https://doaj.org/article/04c83c6e11664161bc6bc3eb730190a1  |z Connect to this object online.
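
Illustrative note on stage 1 of the pipeline described in the abstract (alignment-free gene-level classification of reads). The sketch below is a toy illustration of the general idea only: each read is assigned to the target gene that shares the largest fraction of its k-mers. It is not the BioBloomTools implementation or API. A plain Python set stands in for a Bloom filter, and the names `build_target_index`, `classify_read`, `bin_reads`, the gene labels, the sequences, the k-mer length and the threshold are all hypothetical choices made for this example.

```python
from collections import defaultdict

K = 5  # toy k-mer length (assumption); real tools use much longer k-mers


def kmers(seq, k=K):
    """Return the set of all k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_target_index(targets, k=K):
    """Map each gene name to the k-mer set of its reference sequence.

    A plain set stands in for a Bloom filter here (assumption).
    """
    return {gene: kmers(seq, k) for gene, seq in targets.items()}


def classify_read(read, index, k=K, min_fraction=0.5):
    """Return the best-matching gene, or None if no gene reaches the threshold."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return None
    best_gene, best_score = None, 0.0
    for gene, gene_kmers in index.items():
        # Fraction of the read's k-mers that also occur in this target
        score = len(read_kmers & gene_kmers) / len(read_kmers)
        if score > best_score:
            best_gene, best_score = gene, score
    return best_gene if best_score >= min_fraction else None


def bin_reads(reads, index):
    """Group reads by assigned gene; unclassified reads are dropped."""
    bins = defaultdict(list)
    for read in reads:
        gene = classify_read(read, index)
        if gene is not None:
            bins[gene].append(read)
    return dict(bins)


# Hypothetical toy targets and reads, not real gene sequences
targets = {
    "GENE_A": "ACGTACGGTTCAGGCATTAGC",
    "GENE_B": "TTGACCCGTAAGGCTTAACGA",
}
index = build_target_index(targets)
reads = ["ACGGTTCAGGCA", "CCGTAAGGCTTA", "GGGGGGGGGGGG"]
bins = bin_reads(reads, index)
print(bins)
```

Binning reads per gene before assembly is what would let each target be assembled independently and in parallel, which is consistent with the turnaround-time motivation stated in the abstract.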