Journal of Animal and Veterinary Advances
Year: 2010 | Volume: 9 | Issue: 21 | Page No.: 2759-2762
DOI: 10.3923/javaa.2010.2759.2762  
Using Bioinformatics Methods to Develop EST-SSR Markers from Sheep’s ESTs
Wenxiang Zhang, Zunbao Wang, Zongsheng Zhao, Xiancun Zeng, Hongbin Wu and Peng Yu
Abstract: About 29685 ESTs of sheep’s skin from NCBI were mined to develop EST-SSR markers. In total, 1456 SSRs were identified from 1364 ESTs using the SSRHunter software. The frequency of EST-SSRs was 4.9%. The di-nucleotide repeat motif was the most abundant SSR, accounting for 32.14%, followed by 28.23, 7.14, 25.21 and 7.28%, respectively, for tri-, hexa-, penta- and tetra-nucleotide repeats. Among the di-nucleotide repeats, AC/TG (58.33%) was the most abundant type. About 261 primer pairs were designed from the microsatellites, of which 62 pairs gave clear PCR products by electrophoresis and 5 revealed moderate to high Polymorphism Information Content (PIC) in Chinese Merino sheep. The mean PIC value was 0.6971, ranging from 0.5499 to 0.7440.


Simple Sequence Repeats (SSRs) have increasingly become the marker of choice for population genetic analyses. Unfortunately, the development of traditional anonymous SSRs from genomic DNA is costly and time consuming (Ellis and Burke, 2007). These problems are further compounded by a paucity of resources in taxa that lack clear economic importance.

However, the advent of the genomics age has resulted in the production of vast amounts of publicly available DNA sequence data including large collections of Expressed Sequence Tags (ESTs) from a variety of different taxa.

Recent research has revealed that ESTs are a potentially rich source of SSRs that reveal polymorphisms not only within the source taxon but also in related taxa (Cordeiro et al., 2001; Varshney et al., 2002). The purpose of this study is to identify EST-SSR markers and to investigate the type and distribution of repeat motifs in the expressed sequence tags of sheep’s skin. The results will facilitate the use of molecular markers in sheep breeding.

These EST-SSR markers can be used for further study such as in genetic mapping, identification of Quantitative Trait Loci (QTLs) and comparative genomics studies of sheep (Scott et al., 2000; Rohrer et al., 2002; Gupta et al., 2003).


EST mining and EST-SSR identification: A total of 3295 ESTs of sheep’s skin were downloaded directly from NCBI dbEST on December 21, 2008. SSRs were identified in these sheep ESTs using the SSRHunter software and SSRIT. The parameters for data mining were set as: 7 repetitions for di-, 6 repetitions for tri-, 5 repetitions for tetra- and 4 repetitions for penta- and hexa-nucleotides.
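The repeat thresholds above can be illustrated with a short sketch. This is not the algorithm used by SSRHunter or SSRIT, whose internals are not described here; it is a minimal regular-expression search for perfect repeats only, and the function name `find_ssrs` and the example sequence are hypothetical.

```python
import re

# Minimum repeat counts per motif length, taken from the text above.
MIN_REPEATS = {2: 7, 3: 6, 4: 5, 5: 4, 6: 4}

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ACAC").
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)

# Hypothetical sequence: (AC)8 meets the di-nucleotide threshold, (AGC)6 the tri-nucleotide one.
example = "GGG" + "AC" * 8 + "TTT" + "AGC" * 6 + "CC"
print(find_ssrs(example))
```

A run of only six AC units would not be reported, because it falls below the di-nucleotide threshold of 7 repetitions.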

Development of EST-SSR primer pairs: For each microsatellite-containing EST, primers were designed using Primer Premier (version 5.0) and Oligo (version 6.7). The major parameters for primer design were set as follows: primer length 18-23 bp with 20 bp as the optimum; PCR product size 100-450 bp; optimum annealing temperature 50-63°C; GC content 40-60% with 50% as the optimum. The primers were synthesized by Bioasia Biotech, Shanghai, China.
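As a rough illustration of the length, GC content and product size criteria listed above, the following sketch checks a candidate primer against those ranges. It does not reproduce Primer Premier or Oligo, it does not evaluate annealing temperature, and the function names and the primer sequence are hypothetical.

```python
def gc_percent(primer):
    """GC content of a primer as a percentage of its length."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)

def passes_basic_filters(primer, product_size):
    """Check primer length (18-23 bp), GC content (40-60%) and product size (100-450 bp)."""
    return (18 <= len(primer) <= 23
            and 40.0 <= gc_percent(primer) <= 60.0
            and 100 <= product_size <= 450)

# Hypothetical 20 bp primer with 50% GC and a 250 bp expected product.
candidate = "ATGCATGCATGCATGCATGC"
print(gc_percent(candidate), passes_basic_filters(candidate, 250))
```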

DNA extraction, PCR amplification and electrophoresis: The study covered a total of 120 Chinese Merino sheep kept on a farm located in the northwestern part of Xinjiang Province. Genomic DNA was isolated as per the method described by Sambrook with minor modifications (Sambrook and Russell, 2002). After checking the quality and quantity, DNA was diluted to a final concentration of 50 ng μL-1 in water and stored at 4°C. PCR amplification was performed in a thermal cycler. The reaction mixture contained about 50 ng of template DNA, 1 μmol L-1 of each primer, 200 μmol L-1 of each dNTP, 1.5 mmol L-1 of MgCl2 and 1.5 U Taq (Sangon) with 1x PCR buffer in a total volume of 25 μL.

Table 1: Polymorphic EST-SSR markers in this study

The PCR program was: 5 min at 95°C followed by 35 cycles of 94°C for 45 sec, 45 sec at the annealing temperature of each primer pair (Table 1) and 60 sec at 72°C with a final extension at 72°C for 10 min. PCR products were visualized on silver-stained 10% polyacrylamide gels.
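For orientation, the hold times of the program above can be summed in a few lines. This is only a sketch of the arithmetic: it counts programmed hold times and ignores ramping between temperatures, which depends on the thermal cycler; the variable and function names are hypothetical.

```python
# PCR program from the text above; all times in seconds.
INITIAL_DENATURATION = 5 * 60      # 5 min at 95 °C
CYCLES = 35
STEPS_PER_CYCLE = [45, 45, 60]     # 94 °C, annealing temperature, 72 °C
FINAL_EXTENSION = 10 * 60          # 10 min at 72 °C

def total_hold_seconds():
    """Sum of programmed hold times, excluding temperature ramps."""
    return INITIAL_DENATURATION + CYCLES * sum(STEPS_PER_CYCLE) + FINAL_EXTENSION

print(total_hold_seconds() / 60, "min")
```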

General statistical analyses: To analyze the variation of microsatellite loci, the number of alleles at a given locus (A) and the allele size range (R) were determined with the software Quantity One (Bio-Rad). For the 5 microsatellite loci, observed and expected heterozygosity estimates were calculated after Levene and Nei as implemented in the POPGENE software.

The observed and effective numbers of alleles were also calculated using POPGENE software. Allelic frequencies were utilized for the calculation of the Polymorphic Information Content (PIC) values.
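The quantities named above can be computed from allele frequencies as sketched below. The text does not state which formulas were used, so this sketch makes assumptions: PIC follows the commonly used Botstein et al. (1980) definition, expected heterozygosity is the uncorrected form 1 - Σp² (POPGENE's Levene estimate additionally applies a sample-size correction), and the effective number of alleles is 1/Σp². The function names and the example frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Uncorrected expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def effective_alleles(freqs):
    """Effective number of alleles: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)

def pic(freqs):
    """PIC (assumed Botstein et al. form): 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2."""
    n = len(freqs)
    homo = sum(p * p for p in freqs)
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(n) for j in range(i + 1, n))
    return 1.0 - homo - cross

# Hypothetical locus with four equally frequent alleles.
freqs = [0.25, 0.25, 0.25, 0.25]
print(expected_heterozygosity(freqs), effective_alleles(freqs), pic(freqs))
```

With unequal allele frequencies the effective number of alleles falls below the observed number, which is why the observed count is expected to be the larger of the two.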


Searching for ESTs containing microsatellites: A total of 209 SSRs were identified from 196 EST sequences. The frequency of EST-SSRs in the skin of sheep was 5.9%. The di-nucleotide repeat motifs were the most abundant SSRs in the skin of sheep, accounting for 71.8%, followed by 22, 13, 7 and 4% for tri-, hexa-, penta- and tetra-nucleotide repeats. Depending upon the length of the repeat unit itself (2-6 bp), the lengths of SSRs varied from 14 to 58 bp. The frequencies of EST-SSRs are shown in Fig. 1.
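The reported frequency appears consistent with dividing the number of SSR-containing ESTs by the 3295 ESTs downloaded for this study. The sketch below works through that arithmetic; that this is how the figure was derived is an assumption, and the variable names are hypothetical.

```python
total_ests = 3295      # ESTs downloaded from dbEST (from the methods above)
ssr_ests = 196         # ESTs containing at least one SSR
ssr_count = 209        # SSRs identified in total

# Share of ESTs that contain an SSR, as a percentage.
est_with_ssr_pct = 100.0 * ssr_ests / total_ests
# SSRs per 100 ESTs; slightly higher because some ESTs carry more than one SSR.
ssr_per_100_ests = 100.0 * ssr_count / total_ests

print(round(est_with_ssr_pct, 1), round(ssr_per_100_ests, 1))
```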

Frequencies of sheep’s skin SSRs with different repeat motifs: Investigating the distribution of SSR motifs can give insights into the composition of the genome. The observed frequency of the di-nucleotide repeat motifs comprising the SSRs is shown in Fig. 2.

Among the di-nucleotide repeats, AC/TG was the most abundant type, accounting for 67% of all di-nucleotide repeats found in the sheep’s skin ESTs. AG/TC was the second most abundant type, accounting for 19%. The GC/CG motif was the least abundant type, accounting for only 1%.

Fig. 1: Frequency distribution of different repeat types (2-6 motif unit) microsatellite identified in EST of sheep’s skin. The numbers on the bars indicate the percentage of each repeat type microsatellites in total number

Fig. 2: Frequency distribution of 4 di-nucleotide repeat type in EST of sheep’s skin. The numbers on the bars indicate the percentage of the 4 di-nucleotide repeat type in all di-nucleotide repeat types

EST-SSR locus polymorphism: A total of 22 primer pairs were used in PCR amplification; 18 pairs gave clear PCR products by electrophoresis and 5 pairs were polymorphic in Chinese Merino sheep (Table 1 and Fig. 3 and 4).

Genetic data analysis: The microsatellite allele frequency profile in the Chinese Merino sheep population is shown in Table 2. A total of 25 alleles were detected and the amplified fragments ranged from 213 to 286 bp. The number of alleles observed across the microsatellite loci varied from 3 (SER006) to 7 (SER012) with an average allele number of 4.8 per locus. As expected, the observed number of alleles across the loci was greater than the effective number of alleles (2.6422-4.8485). The PIC (Polymorphic Information Content) showed that most of the loci were highly informative, indicating polymorphism across the loci, with an overall mean of 0.7001±0.0835 (Table 3).

Table 2: Allele frequency of population

Table 3: Genetic parameters of 5 microsatellite loci in Chinese Merino sheep

Fig. 3: Electrophoresis result type

Fig. 4: Electrophoresis result type

The average observed heterozygosity was less than the expected (Table 3). The average expected gene diversity within the population ranged from 0.4833 (SER007) to 0.7667 (SER012) with an overall mean of 0.6194.

The rapid and inexpensive development of SSRs from Expressed Sequence Tag (EST) databases has been shown to be a feasible option for obtaining high-quality nuclear markers. Moreover, the National Center for Biotechnology Information (NCBI) EST database contains an ever-increasing number of these single-pass cDNA sequences, meaning that the resources necessary for the efficient development of large numbers of so-called EST-SSRs already exist for a wide variety of taxa. Yin et al. (2005) analysed the repeated sequences in the skin of goat and found that almost all sequences with >10 copies were keratin, and that the zipper structure of amino acids is likely the region most abundant in SSRs. We therefore chose skin ESTs to develop EST-SSR markers. The results showed that the percentage of sequences containing SSRs in the skin EST database was 6.3%, which is relatively abundant.

Di-nucleotide SSRs are the most abundant in the sheep’s skin ESTs registered in GenBank. Similar results have been reported in many animal studies, in contrast to plants (Thiel et al., 2003; Yue et al., 2004; Yan et al., 2008). The AC/TG repeat was the most abundant. A total of 22 primer pairs were designed from the microsatellites, of which 18 pairs gave clear PCR products by electrophoresis, and 5 microsatellite loci were highly polymorphic. These may serve as markers for wool trait loci, and we expect them to contribute to later research such as genetic mapping and the identification of Quantitative Trait Loci (QTLs).

An EST is part of a functional gene, so differences in ESTs may help to explain differences in appearance. In later research we will therefore use the 5 polymorphic microsatellite loci to look for differences between shag sheep (Duolang sheep), half-fuzz sheep (Romney Marsh sheep) and fuzz sheep (Chinese Merino sheep).


The result showed that these EST-SSR markers can be used for further study such as in genetic mapping, identification of Quantitative Trait Loci (QTLs) and comparative genomics studies of sheep.


We thank Mr. Zongsheng Zhao and Mr. Bin Jia for their help and technical assistance. This research was supported by the International Science and Technology Cooperation Projects (Project No. DF2007DFB30420).