Conference papers

MTG-Link: filling gaps in draft genome assemblies with linked read data

Abstract : De novo genome assembly is a challenging task, especially for large non-model organism genomes. Low sequence coverage, genomic repeats and heterozygosity often create ambiguities in the assembly and result in undefined sequences between contigs, called "gaps". Hence, filling gaps in draft genomes has become a natural sub-problem of many de novo genome assembly projects. Although several tools exist for closing gaps, to our knowledge none uses the long-range information of linked read data. Linked read technologies have great potential for filling gaps in draft genomes, as they provide long-range information while maintaining the power and accuracy of short-read sequencing. In this work, we present MTG-Link, a novel gap-filling tool dedicated to linked read data. Taking advantage of the barcode information contained in the linked read dataset, a subsample of reads is first selected for each gap. These reads are then locally assembled and the resulting gap-filled sequences are automatically evaluated. We validated our approach on a real 10X Genomics linked read dataset, on a set of simulated gaps, and showed that the read subsampling step of MTG-Link yields better gap assemblies in a time- and memory-efficient manner. We also applied MTG-Link to individual genomes of a mimetic butterfly (Heliconius numata), where it significantly improved the contiguity of a 1.3 Mb locus of biological interest. MTG-Link is freely available at https://github.com/anne-gcd/MTG-Link.
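The barcode-based read subsampling mentioned in the abstract can be illustrated with a minimal sketch. It is not MTG-Link's actual code: the abstract does not say how barcodes are chosen, so the rule used here (keep every read whose barcode is observed in a flanking region of the gap) is an assumption, and the `Read` record, the function names and the toy data are all hypothetical.

```python
from collections import namedtuple

# Hypothetical record: a read with its linked-read barcode and mapping position.
# Unmapped reads carry contig=None and pos=-1.
Read = namedtuple("Read", ["name", "barcode", "contig", "pos"])


def flank_barcodes(reads, contig, start, end):
    """Return the barcodes of reads mapped on `contig` within [start, end)."""
    return {r.barcode for r in reads
            if r.contig == contig and start <= r.pos < end}


def subsample_for_gap(reads, left_flank, right_flank):
    """Select all reads sharing a barcode with a read mapped in either flank.

    Assumption for illustration only: a barcode seen in a flank marks a long
    molecule that may span the gap, so its unmapped reads are candidates for
    the local assembly of that gap.
    """
    barcodes = (flank_barcodes(reads, *left_flank)
                | flank_barcodes(reads, *right_flank))
    return [r for r in reads if r.barcode in barcodes]


# Toy dataset: a gap between the end of ctgA and the start of ctgB.
reads = [
    Read("r1", "BC01", "ctgA", 950),   # in the left flank
    Read("r2", "BC01", None, -1),      # unmapped, same molecule as r1
    Read("r3", "BC02", "ctgB", 10),    # in the right flank
    Read("r4", "BC03", "ctgC", 5000),  # unrelated region and barcode
    Read("r5", "BC02", None, -1),      # unmapped, same molecule as r3
]

selected = subsample_for_gap(reads, ("ctgA", 900, 1000), ("ctgB", 0, 100))
selected_names = [r.name for r in selected]
print(selected_names)
```

In this toy case the unrelated read `r4` is excluded, while the unmapped reads `r2` and `r5` are kept because they share a barcode with a flank read. Restricting the local assembly to such a subsample is the kind of reduction that could explain the time and memory savings reported in the abstract.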
Document type :
Conference papers

https://hal.inria.fr/hal-03441914
Contributor : Claire Lemaitre
Submitted on : Monday, November 22, 2021 - 8:42:14 PM
Last modification on : Friday, November 26, 2021 - 11:08:19 PM

File

JOBIM2021_paper_20.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03441914, version 1

Citation

Anne Guichard, Fabrice Legeai, Denis Tagu, Claire Lemaitre. MTG-Link: filling gaps in draft genome assemblies with linked read data. JOBIM 2021 - Journées Ouvertes Biologie, Informatique et Mathématiques, Jul 2021, Paris, France. ⟨hal-03441914⟩
