1 College of Bioscience and Resources Environment, Beijing University of Agriculture, Beijing 102206, China; 2 Institute of Biotechnology/Beijing Key Laboratory of Agricultural Genetic Resources and Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China; 3 School of Life Sciences, Hebei University, Baoding 071002, China
Abstract: Peach (Prunus persica) is a model species for genetic and genomic research within the Rosaceae family. China is the geographical origin of peach and maintains the richest diversity of peach germplasm worldwide. Founder parents play a critical role in breeding programs, and their genome sequences offer fundamental insights into the genetic mechanisms underlying key horticultural traits. To investigate the basic genomic information of peach founder parents, over 1 000 pedigree records were analyzed, and the Japanese cultivar 'Okubo' and the American cultivar 'Legrand' were identified as key founder parents, with their generational relationships within the breeding pedigrees clarified. By combining reads from third-generation Oxford Nanopore Technologies (ONT) sequencing with second-generation Illumina sequencing, high-quality, chromosome-scale genome assemblies were generated for 'Okubo' (Okubo v1.0; 238.19 Mb) and 'Legrand' (Legrand v1.0; 238.57 Mb), with Contig N50 lengths of 9.95 and 14.97 Mb, mapping-back rates of 99.71% and 99.32%, and GC contents of 37.55% and 37.57%, respectively. In the assembly quality assessment, the benchmarking universal single-copy orthologs (BUSCO) value was 98.8%, the long-terminal-repeat assembly index (LAI) values were 29.13 and 27.46, and the second-generation short-read alignment rates were 99.30% and 99.85% for 'Okubo' and 'Legrand', respectively. Genomic repeat element annotation identified 47.01% and 52.15% repetitive sequences in Okubo v1.0 and Legrand v1.0, respectively, including retroelements (20.20% and 18.95%) and DNA transposons (11.73% and 11.75%).
A total of 23 503 ('Okubo') and 22 062 ('Legrand') protein-coding genes were annotated using an integrated strategy that combined de novo prediction, homology-based methods, and transcriptome evidence; 81.04% and 81.03% of these genes, respectively, were assigned functional annotations. Comparative genomic analysis revealed 91.83% ('Okubo') and 89.36% ('Legrand') collinearity with the reference Lovell v2.0 genome and identified 20.71 and 16.36 Mb of divergent regions in 'Okubo' and 'Legrand', respectively, encompassing 2 319 and 1 493 genes enriched in functions related to binding, catalytic activity, and metabolic processes. This study identified peach founder parents through pedigree analysis and subjected 2 cultivars to in-depth genome sequencing, high-quality genome assembly, gene annotation, and comparative genomic analysis against reference genomes. These efforts provide backbone genomes for peach pan-genomics research, facilitating genetic dissection and improvement of important agronomic traits while advancing comparative genomics within the Rosaceae family.
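The contiguity and base-composition statistics reported in the abstract (Contig N50 and GC content) follow standard definitions, which the minimal Python sketch below illustrates. It is not the pipeline used in this study: the function names `contig_n50` and `gc_content` are hypothetical, the input is a toy list rather than the Okubo v1.0 or Legrand v1.0 assemblies, and excluding ambiguous bases (N) from the GC denominator is an assumption.

```python
from typing import Iterable


def contig_n50(lengths: Iterable[int]) -> int:
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the total assembly."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths


def gc_content(sequences: Iterable[str]) -> float:
    """Return the GC percentage over unambiguous bases (A, C, G, T).

    Assumption: N and other ambiguity codes are left out of the
    denominator; some tools count them instead.
    """
    gc = at = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    # Toy contig lengths (arbitrary units), not real assembly data.
    toy_lengths = [2, 3, 4, 5, 6]
    print("N50:", contig_n50(toy_lengths))          # N50: 5
    print("GC%:", gc_content(["GGCC", "AATT"]))     # GC%: 50.0
```

Sorting contigs from longest to shortest and stopping at the first one that brings the cumulative length to half of the assembly is the usual way N50 is computed, so a higher value (for example 14.97 Mb versus 9.95 Mb) indicates that fewer, longer contigs make up the assembly.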