Conference paper, Year: 2021

MTG-Link: filling gaps in draft genome assemblies with linked read data

Abstract

De novo genome assembly is a challenging task, especially for large non-model organism genomes. Low sequence coverage, genomic repeats and heterozygosity often create ambiguities in the assembly and result in undefined sequences between contigs, called "gaps". Hence, filling gaps in draft genomes has become a natural sub-problem of many de novo genome assembly projects. Although several gap-closing tools exist, to our knowledge none uses the long-range information of linked read data. Linked read technologies have great potential for filling gaps in draft genomes, as they provide long-range information while maintaining the power and accuracy of short-read sequencing. In this work, we present MTG-Link, a novel gap-filling tool dedicated to linked read data. Taking advantage of the barcode information contained in the linked read dataset, a subsample of reads is first selected for each gap. These reads are then locally assembled, and the resulting gap-filled sequences are automatically evaluated. We validated our approach on a real 10x Genomics linked read dataset with a set of simulated gaps, and showed that the read subsampling step of MTG-Link yields better gap assemblies in a time- and memory-efficient manner. We also applied MTG-Link to individual genomes of a mimetic butterfly (Heliconius numata), where it significantly improved the contiguity of a 1.3 Mb locus of biological interest. MTG-Link is freely available at https://github.com/anne-gcd/MTG-Link.
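
The barcode-based subsampling step described in the abstract can be illustrated with a minimal Python sketch. This is not MTG-Link's implementation: the `Read` record, the function names, the flank coordinates, the `min_count` threshold and the choice to require a barcode on both sides of the gap are all assumptions made for illustration. The idea shown is only that reads sharing a barcode with reads mapped next to a gap are likely to come from the same long molecule, and so are good candidates for assembling that gap locally.

```python
from collections import Counter
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical record for one linked read (not an MTG-Link type)."""
    name: str
    barcode: str
    seq_name: str  # scaffold the read maps to; "" if unmapped
    pos: int       # 0-based mapping position; -1 if unmapped


def flank_barcodes(reads, seq_name, start, end):
    """Count barcodes of reads mapped in the half-open interval [start, end)."""
    return Counter(
        r.barcode for r in reads
        if r.seq_name == seq_name and start <= r.pos < end
    )


def subsample_for_gap(reads, left_flank, right_flank, min_count=2):
    """Return sorted names of reads whose barcode is seen at least
    min_count times in BOTH flanks (an illustrative assumption; a union
    of the two flanks would be an equally plausible rule)."""
    left = flank_barcodes(reads, *left_flank)
    right = flank_barcodes(reads, *right_flank)
    kept = {b for b, n in left.items()
            if n >= min_count and right.get(b, 0) >= min_count}
    return sorted(r.name for r in reads if r.barcode in kept)


# Toy data: a gap on "scaf1" between positions 1000 and 1500.
reads = [
    Read("r1", "BC_A", "scaf1", 900),
    Read("r2", "BC_A", "scaf1", 950),
    Read("r3", "BC_A", "scaf1", 1520),
    Read("r4", "BC_A", "scaf1", 1560),
    Read("r5", "BC_A", "", -1),   # unmapped read, same barcode
    Read("r6", "BC_B", "scaf1", 920),
    Read("r7", "BC_B", "scaf1", 980),
    Read("r8", "BC_B", "", -1),   # barcode seen on the left flank only
    Read("r9", "BC_C", "scaf1", 1510),
]

selected = subsample_for_gap(
    reads,
    left_flank=("scaf1", 800, 1000),
    right_flank=("scaf1", 1500, 1700),
)
print(selected)
```

In this toy case only barcode `BC_A` passes the threshold on both flanks, so the unmapped read `r5` is pulled into the subsample while `r8` is not. Restricting the local assembly to such a subsample, rather than the whole read set, is the kind of reduction the abstract credits for the gains in assembly quality and resource use.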
Main file
JOBIM2021_paper_20.pdf (484.44 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03441914, version 1 (22-11-2021)

Identifiers

  • HAL Id: hal-03441914, version 1

Cite

Anne Guichard, Fabrice Legeai, Denis Tagu, Claire Lemaitre. MTG-Link: filling gaps in draft genome assemblies with linked read data. JOBIM 2021 - Journées Ouvertes Biologie, Informatique et Mathématiques, Jul 2021, Paris, France. pp. 1-8. ⟨hal-03441914⟩
