Improvement of the Threespine Stickleback Genome Using a Hi-C-Based Proximity-Guided Assembly

Peichel, Catherine; Sullivan, Shawn T; Liachko, I; White, Michael A (2017). Improvement of the Threespine Stickleback Genome Using a Hi-C-Based Proximity-Guided Assembly. Journal of Heredity, 108(6), pp. 693-700. Oxford University Press. doi:10.1093/jhered/esx058

Peichel et al 2017 JOH.pdf - Published Version
Restricted to registered users only
Available under License Publisher holds Copyright.

Peichel et al 2017 accepted.pdf - Accepted Version
Available under License Publisher holds Copyright.

Scaffolding genomes into complete chromosome assemblies remains challenging even with the rapidly increasing sequence coverage generated by current next-generation sequencing technologies. Even with scaffolding information, many genome assemblies remain incomplete. The genome of the threespine stickleback (Gasterosteus aculeatus), a fish model system in evolutionary genetics and genomics, is not completely assembled despite scaffolding with high-density linkage maps. Here, we first test the ability of a Hi-C-based proximity-guided assembly (PGA) to perform a de novo genome assembly from relatively short contigs. Using Hi-C-based PGA, we generated complete chromosome assemblies from a distribution of short contigs (20-100 kb). We found that 96.40% of contigs were correctly assigned to linkage groups (LGs), with ordering nearly identical to the previous genome assembly. Using available bacterial artificial chromosome (BAC) end sequences, we provide evidence that some of the few discrepancies between the Hi-C assembly and the existing assembly are due to structural variation between the populations used for the two assemblies or to errors in the existing assembly. This Hi-C assembly also allowed us to improve the existing assembly, assigning over 60% (13.35 Mb) of the previously unassigned (~21.7 Mb) contigs to LGs. Together, our results highlight the potential of the Hi-C-based PGA method to be used in combination with short-read data to perform relatively inexpensive de novo genome assemblies. This approach will be particularly useful in organisms in which it is difficult to perform linkage mapping or to obtain the high molecular weight DNA required for other scaffolding methods.

Item Type: Journal Article (Original Article)
Division/Institute: 08 Faculty of Science > Department of Biology > Institute of Ecology and Evolution (IEE) > Evolutionary Ecology
UniBE Contributor: Peichel, Catherine
Subjects: 500 Science > 570 Life sciences; biology
ISSN: 0022-1503
Publisher: Oxford University Press
Language: English
Submitter: Catherine Peichel
Date Deposited: 06 Dec 2017 12:24
Last Modified: 05 Dec 2022 15:08
Publisher DOI: 10.1093/jhered/esx058
PubMed ID: 28821183
BORIS DOI: 10.7892/boris.106842
URI: https://boris.unibe.ch/id/eprint/106842
