Gene classification and mining of molecular markers useful in red clover (Trifolium pratense) breeding
Authors | |
---|---|
Year of publication | 2017 |
Type | Article in Periodical |
Journal / Source | Frontiers in Plant Science |
MU Faculty or unit | |
Citation | |
Web | http://journal.frontiersin.org/article/10.3389/fpls.2017.00367/full |
Doi | http://dx.doi.org/10.3389/fpls.2017.00367 |
Field | Genetics and molecular biology |
Keywords | biosynthetic pathways; genetic diversity; sequencing; SNP; specific genes; SSR |
Description | Red clover (Trifolium pratense) is an important forage plant worldwide. This study aimed to broaden current knowledge of red clover’s coding regions and to enhance its practical utilization through a targeted reanalysis of a previously published assembly. A total of 42,996 genes were characterized using Illumina paired-end sequencing after manual revision of the Blast2GO annotation. Genes were classified into metabolic and biosynthetic pathways in response to biological processes, with 7,517 genes being assigned to specific pathways. We identified 6,749 potential microsatellite loci in red clover coding sequences, and we characterized 4,005 potential simple sequence repeat (SSR) markers as generating polymerase chain reaction products preferentially within 100–350 bp. A marker density of one SSR marker per 12.39 kbp was achieved. Aligning reads against predicted coding sequences resulted in the identification of 343,027 single nucleotide polymorphism (SNP) markers, providing a marker density of one SNP marker per 144.6 bp. Altogether, 95 SSRs in coding sequences were analyzed across 50 red clover varieties, and a collection of 22 highly polymorphic SSRs with pooled polymorphism information content >0.9 was generated. A set of 8,623 genome-wide distributed SNPs was developed and used for polymorphism evaluation in individual plants. The polymorphism information content ranged from 0 to 0.375. Temperature switch PCR was successfully used for single-marker SNP genotyping of targeted coding sequences and for confirming heterozygosity or homozygosity at five validated loci. The predicted large sets of SSRs and SNPs throughout the genome are key to rapidly implementing genome-based breeding approaches, to identifying genes underlying key traits, and to genome-wide association studies. |
Related projects: |
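
The polymorphism information content (PIC) range quoted in the Description for SNPs, 0 to 0.375, matches the theoretical bounds for a biallelic marker. The sketch below illustrates this using the standard PIC definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². It is an illustration only and assumes allele frequencies are already known; the function name `pic` and the example frequencies are hypothetical and do not come from the article.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    allele_freqs: iterable of non-negative allele frequencies or counts
    (hypothetical input; values are normalised to sum to 1).
    """
    freqs = list(allele_freqs)
    total = sum(freqs)
    if total <= 0 or any(f < 0 for f in freqs):
        raise ValueError("allele frequencies must be non-negative and not all zero")
    p = [f / total for f in freqs]

    # Sum of squared frequencies (expected homozygosity).
    sum_sq = sum(x * x for x in p)
    # Correction term summed over every unordered pair of distinct alleles.
    pair_term = sum(2 * a * a * b * b for a, b in combinations(p, 2))
    return 1.0 - sum_sq - pair_term


# A monomorphic locus carries no information; a biallelic locus is most
# informative when both alleles are equally frequent.
monomorphic = pic([1.0, 0.0])
balanced_snp = pic([0.5, 0.5])
print(monomorphic, balanced_snp)
```

With two alleles, PIC cannot exceed the value reached at equal frequencies, 0.375, which is why SNP values fall in that range, whereas multi-allelic markers such as SSRs can score higher.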