Abstract: In masson pine (Pinus massoniana) breeding programs, the lack of co-dominant genetic markers has severely constrained the development of molecular marker-assisted breeding. To characterize the genetic information of P. massoniana germplasm and to develop new molecular markers suited to this species, an EST-SSR marker system was established from transcriptome data obtained by high-throughput sequencing, and the characteristics of the SSRs were analyzed. The distribution patterns of the markers in the transcriptome sequences were examined, and an SSR-PCR system for P. massoniana was established and tested on 27 individuals. A total of 70 896 unigenes from the P. massoniana transcriptome were screened with MISA software, and 3 329 SSR loci were detected in 3 074 unigenes. Of these, 2 750 unigenes contained a single SSR locus (occurrence frequency 3.88%), 223 unigenes contained two or more SSR loci (0.31%), and 101 unigenes contained mixed (compound) SSR loci (0.14%). The overall SSR frequency was 4.69%, with an average of one SSR per 20.94 kb. Among the SSR loci, mononucleotide, trinucleotide and dinucleotide repeats were the major types, accounting for 40.16%, 32.83% and 20.94% of the total, respectively. A/T, AT/AT and AAG/CTT were the most frequent motifs among the mononucleotide, dinucleotide and trinucleotide repeats, accounting for 97.68%, 71.45% and 21.70% of their respective repeat types. In total, 200 primer pairs were randomly selected for amplification, with an amplification success rate of 78.5%. Using the 24 primer pairs that showed the best polymorphism, 137 SSR markers were scored across 27 accessions of P. massoniana, and the polymorphism information content (PIC) was 0.703.
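The occurrence frequencies reported above can be re-derived from the stated counts. The following is a minimal sketch of that arithmetic, not the authors' pipeline; the variable and function names are illustrative, and the implied transcriptome length assumes that the mean distance equals total unigene length divided by the number of SSR loci.

```python
# Counts as reported in the abstract.
TOTAL_UNIGENES = 70_896
SSR_LOCI = 3_329
UNIGENES_WITH_SSR = 3_074
SINGLE_SSR_UNIGENES = 2_750
MULTI_SSR_UNIGENES = 223
COMPOUND_SSR_UNIGENES = 101
MEAN_DISTANCE_KB = 20.94


def pct(count, total=TOTAL_UNIGENES):
    """Occurrence frequency of `count` relative to `total`, in percent."""
    return 100.0 * count / total


# The three unigene classes partition the SSR-containing unigenes.
classes_total = SINGLE_SSR_UNIGENES + MULTI_SSR_UNIGENES + COMPOUND_SSR_UNIGENES

# Assumption: mean distance = total sequence length / number of SSR loci,
# so the implied total length is simply the product below.
implied_total_kb = SSR_LOCI * MEAN_DISTANCE_KB

if __name__ == "__main__":
    print(f"single-SSR unigenes:   {pct(SINGLE_SSR_UNIGENES):.4f}%")
    print(f"multi-SSR unigenes:    {pct(MULTI_SSR_UNIGENES):.4f}%")
    print(f"compound-SSR unigenes: {pct(COMPOUND_SSR_UNIGENES):.4f}%")
    print(f"SSR loci per unigene:  {pct(SSR_LOCI):.4f}%")
    print(f"implied total length:  {implied_total_kb:.2f} kb")
```

Under these assumptions the three class counts sum to the 3 074 SSR-containing unigenes, and the overall frequency evaluates to about 4.696%, which matches the reported 4.69% if that figure was truncated rather than rounded.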
Among the 24 selected primer pairs, PmS33 was a low-polymorphism locus (PIC = 0.166), PmS88 and PmS97 were moderately polymorphic loci (PIC = 0.346 and 0.263, respectively), and the remaining loci were highly polymorphic. The average observed number of alleles (Na), effective number of alleles (Ne), Shannon's information index (I) and Nei's gene diversity (H) across the 27 P. massoniana germplasm accessions were 6, 3.9, 1.338 and 0.66, respectively. The tested germplasms could be completely distinguished using primers Pms164 and Pms184, and a manual cultivar identification diagram (MCID) of the tested germplasms was constructed from the EST-SSR markers. The coefficient among the 27 germplasms ranged from 0.32 to 0.92, with a threshold of 0.63. All germplasms were grouped into 3 clusters by the unweighted pair-group method with arithmetic means (UPGMA): the first group included 3 red-heartwood clones and some fast-growing clones, the second consisted entirely of fast-growing clones, and the third included 5 high-resin-yield clones. Thus, a set of SSR primers with high polymorphic potential was designed from the sequenced transcriptome of P. massoniana. These primers could effectively discriminate the 27 clones, and the cluster analysis was broadly consistent with groupings based on wood characters and germplasm origin, indicating that a limited number of primers can provide comparatively rich polymorphic genetic information. In conclusion, developing EST-SSR markers from transcriptome information is feasible and may provide a theoretical basis for genetic map construction, fingerprinting and target-gene tagging in P. massoniana germplasm resources.
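For readers unfamiliar with the per-locus statistics cited above, the sketch below shows how Na, Ne, H, I and PIC are conventionally computed from allele frequencies, and how the usual PIC thresholds (below 0.25 low, 0.25 to 0.5 moderate, above 0.5 high) reproduce the classification of PmS33, PmS88 and PmS97. It is an illustration under stated assumptions, not the software used in the study: the function names and the example allele frequencies are hypothetical, and H is the simple (uncorrected) form 1 - Σp².

```python
import math
from itertools import combinations


def locus_stats(freqs):
    """Diversity statistics for one locus, given its allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    p = [f for f in freqs if f > 0]          # ignore absent alleles
    hom = sum(f * f for f in p)              # expected homozygosity, sum of p_i^2
    # Correction term of the PIC formula: sum over allele pairs of 2 * p_i^2 * p_j^2
    cross = sum(2 * a * a * b * b for a, b in combinations(p, 2))
    return {
        "Na": len(p),                                # observed number of alleles
        "Ne": 1.0 / hom,                             # effective number of alleles
        "H": 1.0 - hom,                              # Nei's gene diversity (uncorrected)
        "I": -sum(f * math.log(f) for f in p),       # Shannon's information index
        "PIC": 1.0 - hom - cross,                    # polymorphism information content
    }


def classify_pic(pic):
    """Conventional three-level classification of a PIC value."""
    if pic < 0.25:
        return "low"
    if pic <= 0.5:
        return "moderate"
    return "high"


if __name__ == "__main__":
    # Hypothetical three-allele locus, for illustration only.
    stats = locus_stats([0.5, 0.3, 0.2])
    print(stats, classify_pic(stats["PIC"]))
    # PIC values reported in the abstract.
    for name, pic in [("PmS33", 0.166), ("PmS88", 0.346), ("PmS97", 0.263)]:
        print(name, classify_pic(pic))
```

Because H is bounded above by 1 while I is not, a reported value of 1.338 can only be Shannon's index, which is why it is paired with I in the text.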