Mining microsatellite markers from public expressed sequence tags databases for the study of threatened plants

Lopez, Lua; Barreiro, Rodolfo; Fischer, Markus; Koch, Marcus A. (2015). Mining microsatellite markers from public expressed sequence tags databases for the study of threatened plants. BMC Genomics, 16(1), pp. 1-14. BioMed Central 10.1186/s12864-015-2031-1

s12864-015-2031-1.pdf - Published Version
Available under License Creative Commons: Attribution (CC-BY).


Simple Sequence Repeats (SSRs) are widely used in population genetic studies, but their classical development is costly and time-consuming. The ever-growing DNA datasets generated by high-throughput techniques offer an inexpensive alternative for SSR discovery. Expressed Sequence Tags (ESTs) have been widely used as a source of SSRs for plants of economic relevance, but their application to non-model species is still modest.

Here, we explored the use of publicly available ESTs (GenBank at the National Center for Biotechnology Information, NCBI) for SSR development in non-model plants, focusing on genera listed by the International Union for Conservation of Nature (IUCN). We also searched two model genera with fully annotated genomes, Arabidopsis and Oryza, for EST-SSRs and used them as controls for genome distribution analyses. Overall, we downloaded 16 031 555 sequences for 258 plant genera, which were mined for SSRs and their primers with the help of QDD1. Genome distribution analyses in Oryza and Arabidopsis were done by blasting the SSR-containing sequences against the Oryza sativa and Arabidopsis thaliana reference genomes using the Basic Local Alignment Search Tool (BLAST) on the NCBI website. Finally, we performed an empirical test to determine the performance of our EST-SSRs in a few individuals from four species of two eudicot genera, Trifolium and Centaurea.
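To illustrate what mining SSRs from EST sequences involves, the following is a minimal Python sketch of a regular-expression scan for perfect di- to hexanucleotide repeats. It is not the QDD pipeline used in the study: the function name `find_ssrs`, the `SSR` record, and the minimum repeat counts in `MIN_REPEATS` are all assumptions chosen for illustration, and primer design is not covered.

```python
import re
from typing import Dict, Iterator, NamedTuple


class SSR(NamedTuple):
    motif: str    # repeated unit, e.g. "AT"
    repeats: int  # number of consecutive copies of the motif
    start: int    # 0-based start of the repeat tract
    end: int      # 0-based end (exclusive) of the repeat tract


# Hypothetical thresholds (motif length -> minimum number of repeats).
# These are illustrative values, not the settings used by QDD.
MIN_REPEATS: Dict[int, int] = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def _is_shorter_repeat(motif: str) -> bool:
    """True if the motif is itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return any(
        n % d == 0 and motif == motif[:d] * (n // d)
        for d in range(1, n)
    )


def find_ssrs(seq: str, min_repeats: Dict[int, int] = MIN_REPEATS) -> Iterator[SSR]:
    """Yield perfect SSR tracts in a nucleotide sequence, shortest motifs first."""
    seq = seq.upper()
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of the given length followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if _is_shorter_repeat(motif):
                continue  # already reported (or excluded) under the shorter motif
            yield SSR(motif, len(match.group(0)) // motif_len,
                      match.start(), match.end())
```

Calling `list(find_ssrs(sequence))` on each downloaded EST returns the repeat tracts found in it; grouping the results by `len(ssr.motif)` gives a tally of dinucleotide, trinucleotide, and longer repeat classes of the kind reported in the results.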

We explored a total of 14 498 726 EST sequences from the dbEST database (NCBI) in 257 plant genera from the IUCN Red List. We identified a very large number (17 102) of ready-to-test EST-SSRs in most plant genera (193) at no cost. Overall, dinucleotide and trinucleotide repeats were the prevalent types, but the abundance of the various repeat types differed between taxonomic groups. Control genomes revealed that trinucleotide repeats were mostly located in coding regions, while dinucleotide repeats were largely associated with untranslated regions. Our results from the empirical test revealed considerable amplification success and transferability between congenerics.

The present work represents the first large-scale study developing SSRs from publicly accessible EST databases in threatened plants. Here we provide a very large number of ready-to-test EST-SSRs (17 102) for 193 genera. The cross-species transferability suggests that the number of possible target species would be large. Since trinucleotide repeats are abundant and mainly linked to exons, they might be useful in evolutionary and conservation studies. Altogether, our study strongly supports the use of EST databases as an extremely affordable and fast alternative for SSR development in threatened plants.

Item Type:

Journal Article (Original Article)


08 Faculty of Science > Department of Biology > Institute of Plant Sciences (IPS) > Plant Ecology
08 Faculty of Science > Department of Biology > Institute of Plant Sciences (IPS)

UniBE Contributor:

Fischer, Markus


500 Science > 580 Plants (Botany)




BioMed Central




Peter Alfred von Ballmoos-Haas

Date Deposited:

19 Oct 2015 17:20

Last Modified:

05 Dec 2022 14:49

Publisher DOI:

10.1186/s12864-015-2031-1
PubMed ID:


Uncontrolled Keywords:

Conservation, Evolution, EST-SSR, Functional markers, Population genetics, Threatened plants



