NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer

Abstract. Background: Accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially tumor genetic intra-heterogen...

Bibliographic Details
Main Authors: Irantzu Anzar (Author), Angelina Sverchkova (Author), Richard Stratford (Author), Trevor Clancy (Author)
Format: Book
Published: BMC, 2019-05-01T00:00:00Z.

MARC

LEADER 00000 am a22000003u 4500
001 doaj_2ca0de2b9f6c430187e30e04f48eef1b
042 |a dc 
100 1 0 |a Irantzu Anzar  |e author 
700 1 0 |a Angelina Sverchkova  |e author 
700 1 0 |a Richard Stratford  |e author 
700 1 0 |a Trevor Clancy  |e author 
245 0 0 |a NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer 
260 |b BMC,   |c 2019-05-01T00:00:00Z. 
500 |a 10.1186/s12920-019-0508-5 
500 |a 1755-8794 
520 |a Abstract. Background: Accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially tumor genetic intra-heterogeneity coupled with the problem of sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting of different algorithms, have helped to increase specificity, although this comes at the cost of sensitivity. Methods: In light of this, we have developed the NeoMutate framework, which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples. Results: A robust and exhaustive evaluation of NeoMutate's performance, based on 5-fold cross-validation experiments in addition to 3 independent tests, demonstrated substantially improved variant detection accuracy compared with any of its individual component variant callers and with consensus calling of multiple tools. Conclusions: We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer. 
546 |a EN 
690 |a Somatic variant detection 
690 |a Machine learning 
690 |a Cancer genomics 
690 |a Precision medicine 
690 |a Internal medicine 
690 |a RC31-1245 
690 |a Genetics 
690 |a QH426-470 
655 7 |a article  |2 local 
786 0 |n BMC Medical Genomics, Vol 12, Iss 1, Pp 1-14 (2019) 
787 0 |n http://link.springer.com/article/10.1186/s12920-019-0508-5 
787 0 |n https://doaj.org/toc/1755-8794 
856 4 1 |u https://doaj.org/article/2ca0de2b9f6c430187e30e04f48eef1b  |z Connect to this object online.