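Abstract: Codon usage bias is widespread across organisms, but the effects of mutation and natural selection pressure on codon usage bias differ among species. The WRKY gene family is a class of plant-specific transcription factors mainly involved in transcriptional regulation and signal transduction in plants. This study took the WRKY transcription factors of Medicago truncatula (MtWRKY) as its subject, aiming to identify the main factors shaping codon usage bias in MtWRKY genes and to screen for optimal codons. The points representing MtWRKY genes all fell below the effective number of codons (ENC) standard curve, indicating that codon usage was affected by natural selection pressure, mutational pressure, or other factors. Correlation analysis between the mean GC content at codon positions 1 and 2 (GC12) and the GC content at codon position 3 (GC3) showed a significant positive correlation (r=0.34, P<0.01), indicating that mutational pressure led to similar GC content at all 3 codon positions. GC3s values were distributed between 0.2 and 0.5, indicating that codon usage bias was mainly affected by mutational pressure. Parity analysis showed that CT content exceeded AG content at the 3rd codon position of MtWRKY genes. G and C (or A and T) were distributed disproportionately at the 3rd codon position, indicating that codon usage bias was affected by natural selection pressure, although mutational pressure most likely still played the major role. Correlation analysis of the frequency of optimal codons (Fop) against GC content and sequence length showed that Fop was significantly positively correlated with exon GC content (r=0.57, P<0.01) and weakly positively correlated with intron GC content (r=0.09, P>0.05); Fop was positively correlated with exon length (r=0.28, P<0.05) and negatively correlated with intron length (r=-0.01, P>0.05). This indicates that the exon and intron sequences of MtWRKY genes were shaped by different selective pressures: codon usage bias in exons was affected by mutational pressure, whereas introns may have been shaped by natural selection pressure acting on mutational selection. Four optimal codons ending in G or C were identified. These results provide theoretical support for codon optimization in WRKY transgenic research.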
Abstract: Codon usage bias has been documented in a wide diversity of species, but the relative contributions of mutation and the various forms of natural selection to codon usage bias differ among them. Plant-specific WRKY transcription factors play important roles in transcriptional regulation and signal transduction. However, a systematic analysis of codon usage bias in Medicago truncatula WRKY (MtWRKY) genes has not been reported. In the present study, a systematic examination of codon usage in MtWRKY genes was carried out. The results showed that all MtWRKY genes fell below the effective number of codons (ENC) standard curve, suggesting that factors independent of nucleotide composition had affected codon usage bias. Neutrality plots (GC12 vs. GC3) were used to analyze the relationships among the 3 codon positions. There was a significant positive correlation (r=0.34, P<0.01) between GC12 and GC3 in MtWRKY genes, indicating that GC mutational bias led to similar GC content at all codon positions. Moreover, the GC3s values of MtWRKY genes spanned a wide range (0.2 to 0.5), indicating that mutational pressure was the main factor shaping codon usage. According to parity rule 2 analysis, CT content was higher than GA content at the 3rd codon position. G and C (or A and T) were used disproportionately, with C and T used more frequently than G and A at the 3rd codon position of MtWRKY genes. This result indicated that natural selection contributed to MtWRKY codon usage bias, but that mutational bias was the major influence. Correlation analysis between the frequency of optimal codons (Fop) and GC content showed a significant positive correlation between Fop and exon GC content (r=0.57, P<0.01) and a weak, non-significant positive correlation between Fop and intron GC content (r=0.09, P>0.05). There was a significant positive correlation between Fop and exon length (r=0.28, P<0.05), and a non-significant negative correlation between Fop and intron length (r=-0.01, P>0.05). These results indicated that mutation contributed to the codon usage bias of exons, while natural selection was the main factor shaping the intron sequences. This study identified 4 optimal codons, all of which end in G or C at the 3rd position. Altogether, these results provide important information for codon optimization in future transgenic studies of WRKY genes.
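The indices named in the abstract (GC12, GC3, GC3s, the parity rule 2 ratios, and the ENC standard curve) can be illustrated with a minimal Python sketch. The abstract does not say which software was used, so this is an assumption-laden illustration and not the authors' pipeline: the function names `codon_stats` and `expected_enc` are hypothetical, GC3s is taken here as the GC fraction at the 3rd position of all codons except ATG, TGG, and the three stop codons, and the standard curve is assumed to be Wright's expected ENC as a function of GC3s.

```python
import math
from typing import Dict

# Codons with no synonymous alternative (Met, Trp) plus stop codons;
# excluding them when computing GC3s is an assumption of this sketch.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def codon_stats(cds: str) -> Dict[str, float]:
    """Compute GC12, GC3, GC3s, and parity rule 2 ratios for one coding sequence."""
    seq = cds.upper()
    # Split into complete codons and drop any containing ambiguous bases.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    codons = [c for c in codons if set(c) <= set("ACGT")]
    if not codons:
        raise ValueError("no complete unambiguous codons found")

    n = len(codons)
    gc1 = sum(c[0] in "GC" for c in codons) / n
    gc2 = sum(c[1] in "GC" for c in codons) / n
    gc3 = sum(c[2] in "GC" for c in codons) / n

    synonymous = [c for c in codons if c not in NON_SYNONYMOUS]
    gc3s = _ratio(sum(c[2] in "GC" for c in synonymous), len(synonymous))

    third = [c[2] for c in codons]
    a3, t3, g3, c3 = (third.count(b) for b in "ATGC")

    return {
        "GC12": (gc1 + gc2) / 2,          # y-axis of the neutrality plot
        "GC3": gc3,                       # x-axis of the neutrality plot
        "GC3s": gc3s,                     # x-axis of the ENC plot
        "G3/(G3+C3)": _ratio(g3, g3 + c3),  # PR2 x-axis; 0.5 means no G/C skew
        "A3/(A3+T3)": _ratio(a3, a3 + t3),  # PR2 y-axis; 0.5 means no A/T skew
    }


def expected_enc(gc3s: float) -> float:
    """Expected ENC when codon usage is determined by GC3s alone (Wright's curve)."""
    return 2 + gc3s + 29 / (gc3s ** 2 + (1 - gc3s) ** 2)
```

Under these assumptions, a gene whose observed ENC lies below `expected_enc(gc3s)` sits under the standard curve, and PR2 ratios below 0.5 on both axes correspond to C and T being used more often than G and A at the 3rd codon position, which is the pattern the abstract describes.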