
Medical Genetics


Elimination of incorrectly mapped reads from the results of Ion AmpliSeq targeted NGS

https://doi.org/10.25557/2073-7998.2018.05.19-22

Abstract

Background. High-throughput parallel sequencing (NGS) is error-prone: not all genetic variants detected by NGS are real, and not all are confirmed by alternative methods. Incorrectly mapped reads contribute to false-positive variants. We believe that for targeted sequencing with Ion AmpliSeq technology, erroneously mapped reads can be excluded algorithmically. Additional information on the genomic coordinates of the target regions and of the primers used in amplification makes it possible to assess the validity of each read's mapping and to exclude from further analysis the reads that do not match the experimental design.

Objective. To develop an algorithm that minimizes the contribution of mapping errors to the spectrum of genetic variants detected by NGS.

Material and methods. Using the genomic coordinates of target regions and primers, we analyzed 30 BAM files obtained by Ion AmpliSeq (targeted multiplex PCR) NGS with the commercial Ion AmpliSeq Comprehensive Cancer Panel (15,992 target regions) and Ion AmpliSeq Inherited Disease Panel (10,309 target regions). The algorithm for excluding incorrectly mapped reads was implemented in Python.

Results. We compared the initial set of reads with the set obtained after excluding incorrectly mapped reads. The comparison revealed three groups of genetic variants: (1) detected in both sets of reads, 6072 variants; (2) detected only in the original set, 127 variants (most of these are present in most samples and can be interpreted as the result of systematic alignment errors); (3) detected only in the set obtained after excluding erroneously mapped reads, 63 variants (true-positive variants that had previously been masked).

Conclusion. Using additional information on the expected start and end of each read in a targeted study makes it possible to (1) reduce the number of false-positive genetic variants caused by mismapped reads, (2) detect variants that were previously missed because of an apparently low allele frequency, and (3) obtain more reliable allele frequencies for the identified variants. Applying our algorithm to exclude incorrectly mapped DNA fragment reads improves the quality of NGS result interpretation, which is especially important for DNA diagnostics.
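The abstract does not give the implementation, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that reads and amplicons (target region plus primers) are already available as plain coordinate intervals, and that a read matches the design when its mapped start and end both fall within a small tolerance of one amplicon's boundaries. The names `Interval`, `matches_design`, `filter_reads`, `compare_variant_sets` and the `tolerance` parameter are hypothetical and introduced here for illustration only.

```python
from typing import Iterable, List, NamedTuple, Set


class Interval(NamedTuple):
    """A mapped read or a designed amplicon (hypothetical representation)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def matches_design(read: Interval, amplicons: Iterable[Interval],
                   tolerance: int = 5) -> bool:
    """Return True if the read starts and ends where some amplicon does.

    Assumption: both boundaries must lie within `tolerance` bases of the
    expected amplicon boundaries; the published criterion may differ.
    """
    for amp in amplicons:
        if (read.chrom == amp.chrom
                and abs(read.start - amp.start) <= tolerance
                and abs(read.end - amp.end) <= tolerance):
            return True
    return False


def filter_reads(reads: Iterable[Interval], amplicons: Iterable[Interval],
                 tolerance: int = 5) -> List[Interval]:
    """Keep only the reads that are consistent with the panel design."""
    amplicon_list = list(amplicons)  # allow repeated scans of a generator
    return [r for r in reads if matches_design(r, amplicon_list, tolerance)]


def compare_variant_sets(before: Set[str], after: Set[str]) -> dict:
    """Split variants into the three groups described in the Results."""
    return {
        "both": before & after,           # detected in both read sets
        "original_only": before - after,  # lost after filtering
        "filtered_only": after - before,  # revealed by filtering
    }


# Toy data, invented for illustration only.
amplicons = [Interval("chr1", 100, 250), Interval("chr2", 500, 640)]
reads = [
    Interval("chr1", 101, 249),  # matches the first amplicon
    Interval("chr1", 160, 300),  # overlaps it, but boundaries do not match
    Interval("chr3", 500, 640),  # no amplicon on this chromosome
]
kept = filter_reads(reads, amplicons)
groups = compare_variant_sets({"var_a", "var_b"}, {"var_b", "var_c"})
```

The linear scan over amplicons keeps the sketch short; for a panel with thousands of target regions, a per-chromosome index or an interval tree would likely be needed. In practice the read coordinates would come from the BAM files rather than from hand-written tuples.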

About the Authors

K. O. Karandasheva
Research Centre for Medical Genetics
Russian Federation


K. I. Anoshkin
Research Centre for Medical Genetics; Pirogov Russian National Research Medical University
Russian Federation


I. V. Volodin
Research Centre for Medical Genetics
Russian Federation


E. B. Kuznetsova
Research Centre for Medical Genetics; I.M. Sechenov First Moscow State Medical University
Russian Federation


D. V. Zaletayev
Research Centre for Medical Genetics; I.M. Sechenov First Moscow State Medical University
Russian Federation


V. V. Strelnikov
Research Centre for Medical Genetics; Pirogov Russian National Research Medical University
Russian Federation


A. S. Tanas
Research Centre for Medical Genetics; Pirogov Russian National Research Medical University
Russian Federation





For citations:


Karandasheva K.O., Anoshkin K.I., Volodin I.V., Kuznetsova E.B., Zaletayev D.V., Strelnikov V.V., Tanas A.S. Elimination of incorrectly mapped reads from the results of Ion AmpliSeq targeted NGS. Medical Genetics. 2018;17(5):19-22. (In Russ.) https://doi.org/10.25557/2073-7998.2018.05.19-22



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2073-7998 (Print)