Comprehensive mapping and modelling of the rice regulome landscape unveils the regulatory architecture underlying complex traits

Zhu, Tao; Xia, Chunjiao; Yu, Ranran; Zhou, Xinkai; Xu, Xingbing; Wang, Lin; Zong, Zhanxiang; Yang, Junjiao; Liu, Yinmeng; Ming, Luchang; You, Yuxin; Chen, Dijun; Xie, Weibo

doi:10.1038/s41467-024-50787-y

Download PDF

Article
Open access
Published: 03 August 2024

Comprehensive mapping and modelling of the rice regulome landscape unveils the regulatory architecture underlying complex traits

Nature Communications volume 15, Article number: 6562 (2024) Cite this article

12k Accesses
16 Citations
1 Altmetric
Metrics details

Subjects

Abstract

Unraveling the regulatory mechanisms that govern complex traits is pivotal for advancing crop improvement. Here we present a comprehensive regulome atlas for rice (Oryza sativa), charting the chromatin accessibility across 23 distinct tissues from three representative varieties. Our study uncovers 117,176 unique open chromatin regions (OCRs), accounting for ~15% of the rice genome, a notably higher proportion compared to previous reports in plants. Integrating RNA-seq data from matched tissues, we confidently predict 59,075 OCR-to-gene links, with enhancers constituting 69.54% of these associations, including many known enhancer-to-gene links. Leveraging this resource, we re-evaluate genome-wide association study results and discover a previously unknown function of OsbZIP06 in seed germination, which we subsequently confirm through experimental validation. We optimize deep learning models to decode regulatory grammar, achieving robust modeling of tissue-specific chromatin accessibility. This approach allows to predict cross-variety regulatory dynamics from genomic sequences, shedding light on the genetic underpinnings of cis-regulatory divergence and morphological disparities between varieties. Overall, our study establishes a foundational resource for rice functional genomics and precision molecular breeding, providing valuable insights into regulatory mechanisms governing complex traits.

A single-cell rice atlas integrates multi-species data to reveal cis-regulatory evolution

Article Open access 17 September 2025

Constraint of accessible chromatins maps regulatory loci involved in maize speciation and domestication

Article Open access 12 March 2025

Genome-wide identification of CRF gene family members in four rice subspecies and expression analysis of OsCRF members in response to cold stress at seedling stage

Article Open access 18 November 2024

Introduction

Rice (Oryza sativa) is not only one of the most important crops in the world but also an outstanding model species for studying plant growth and development. Over the past two decades, tremendous efforts have been made to understand the genetic basis of important agronomic traits in rice¹. Genome-wide association studies (GWAS) have played a pivotal role in this pursuit, helping to link genetic variations to phenotypic diversity. These studies have identified a large number of candidate genes that hold promise for trait improvement^2,3,4,5. However, despite these advances, our understanding of the regulatory mechanisms governing complex traits in rice remains incomplete.

Gene regulatory networks (GRNs) are largely dictated by cis-regulatory DNA sequences, such as promoters and enhancers, which are bound by specific transcription factors (TFs)⁶. Deciphering the regulatory code within these regulatory sequences and linking the regulatory sequences to target genes are crucial for rewiring GRNs for crop improvement and trait optimization⁶. Nonetheless, efforts to profile the regulome, encompassing the entirety of regulatory elements in the genome, remain constrained in rice. These efforts often concentrate on specific tissues, neglecting the comprehensive landscape across developmental stages and tissues^7,8. Similarly, endeavors to establish links between regulatory regions and their target genes in rice are also limited⁸.

Meanwhile, many functional genetic variants associated with agronomic traits in rice reside within noncoding regulatory regions (e.g., qSH1⁹, DROT1¹⁰, and FZP¹¹), which makes their interpretation challenging and underscores the necessity for a systematic dissection of regulatory sequences. Given that diverse traits manifest across distinct developmental stages and tissues, systematic annotation of noncoding regulatory variants in rice is currently hindered by the lack of a comprehensive epigenome map across various tissues and growth stages.

To bridge these gaps, we systematically mapped chromatin accessibility profiles in various tissues across the life cycle of three representative rice cultivars using the UMI-ATAC-seq method¹², a modified ATAC-seq (assay for transposase accessible chromatin-sequencing) protocol developed in our lab. Through analysis of 145 ATAC-seq datasets, we obtained a total of 117,176 unique open chromatin regions (OCRs), accounting for ~15% of the rice genome. By integration of RNA-seq data from matched tissues, we predicted potential target genes for OCRs based on the correlation of gene expression and adjacent chromatin accessibility across tissues. Through TF footprinting analysis, we inferred tissue- or stage-specific regulatory networks and identified cultivar-polymorphic/trait-associated OCRs by comparing the regulatory landscapes between indica and japonica rice subspecies. Notably, our analysis unveiled a preference for GWAS-associated variants within tissue-specific OCRs, enabling the identification of causal associations between 209 complex agronomic traits and noncoding regulatory variants using this OCR landscape. Utilizing optimized deep learning models, we decoded the regulatory grammar through modeling of tissue-specific chromatin accessibility and across-variety predictions from sequences. The modeling approach sheds light on the key genetic alterations contributing to cis-regulatory divergence. Overall, these data not only serve as a cornerstone resource for the plant research community but also provide valuable regulatory variants for precision molecular breeding.

Results

Charting a reference atlas of chromatin accessibility in rice

To generate a comprehensive landscape of accessible chromatin in rice (Oryza sativa), we took advantage of an improved ATAC-seq protocol (UMI-ATAC-seq¹², which incorporates unique molecular identifiers to the regular ATAC-seq technique for accurate quantification and footprinting) to perform chromatin accessibility profiling in 23 tissues/organs spanning the entire life cycle of rice. The representative tissues include callus, radicle, plumule, leaf, leaf sheath, root, apical meristem (AM1/AM2), dormant buds (DBuds), shoot apical meristem (SAM1/SAM2/SAM3), panicle neck node (PNN), stem, young panicle (Panicle1/Panicle2/Panicle3/Panicle4), lemma, palea, pistil, stamen and seed coat (Seed1/Seed2/Seed3). The experiments were conducted in three representative rice varieties, namely Nipponbare (NIP; japonica subspecies), Minghui 63 (MH63; indica subspecies type II), and Zhenshan 97 (ZS97; indica subspecies type I), with each experiment consisting of at least two biological replicates (Fig. 1a and Supplementary Data 1). In total, 145 genome-wide chromatin accessibility datasets with high sequencing depth (~30.7 M reads on average) were generated. We applied the ENCODE standards^13,14 to establish the analysis pipeline (see Methods). Compared to published ATAC-seq datasets in the plants as deposited in the ChIP-Hub database¹⁴, our data exhibited a significantly higher signal-to-noise ratio (Supplementary Fig. 1). Through data analysis using the corresponding reference genomes of the three cultivars^15,16, we identified on average of 40,676 (ranging from 28,991 to 49,737) reproducible OCRs (with an Irreproducible Discovery Rate [IDR]¹⁷ < 0.05) per experiment (Fig. 1b). As expected, the identified OCRs from all experiments predominantly located either in the proximal upstream regions of the transcription start site (TSS) or the distal intergenic regions (Fig. 1b, f, Supplementary Fig. 2 and Supplementary Data 2), resembling promoters or enhancers, respectively¹⁸. Of note, OCRs from intragenic regions accounted for a relatively small proportion (about 15.7%), while most of these OCRs originated from intronic regions (Supplementary Fig. 2b). These observations indicate that the vast majority of OCRs originate from noncoding regions of the rice genome.

**Fig. 1: Characterization of an open chromatin landscape in rice.**

We estimated that approximate 15% of the rice genome could be annotated as OCRs, with a consistent pattern observed across each variety (Fig. 1c), and the estimation appeared to have reached saturation in rice (Fig. 1d). OCRs contain multiple TF binding sites and are responsible for regulating the expression of target genes^6,18. We collected publicly available ChIP-seq data for 56 distinct TFs (Supplementary Data 3) and predicted DNA motifs for 458 TFs in rice from the ChIP-Hub database¹⁴, and showed that OCRs were significantly enriched for TF binding sites (Fig. 1e). Furthermore, we found that OCRs are highly evolutionarily constrained compared to flanking genomic regions (Fig. 1g), supporting previous findings that conserved noncoding sequences (CNSs) are predictive of OCRs in plants^19,20.

We next assessed the overall similarities and differences of chromatin accessibility across varieties and tissues. We quantified all datasets based on the merged OCRs (n = 117,176) called from the same reference genome (i.e., Nipponbare) and visualized their global patterns using t-distributed stochastic neighbor embedding (t-SNE)²¹. While dimension 1 and 2 of t-SNE results generally reflected differences between the indica (MH63 and ZS97) and japonica (NIP) subspecies, dimensions 2 and 3 primarily delineated distinct clusters among tissue types (Fig. 1h). For instance, the chromatin accessibility patterns of vegetative and productive tissues of NIP were separated into distinct clusters, whereas young panicles and callus tissues exhibited similar patterns regardless of their variety origin. We further calculated the tissue specificity of each OCR based on the Jensen-Shannon divergence (JSD) index. Obviously, distal OCRs showed significantly higher specificity scores than proximal OCRs (Fig. 1i and Supplementary Fig. 3a, b), consistent with previous findings^14,18,22.

In short, the comprehensive accessible chromatin landscape in rice represents a value resource for crop functional genomic studies.

Linking open chromatin regions to target genes

To decipher which genes these OCRs may regulate, we generated matched RNA-seq datasets for the investigated tissues in each rice variety (Supplementary Fig. 3c and Supplementary Data 4). We adopted a strategy²³ to predict OCR-to-gene links based on correlation analysis between the OCR accessibility and gene expression across all samples (Fig. 2a; see Methods). Genes can be regulated by multiple OCRs (including promoters and enhancers) through chromatin interactions, which are supposed to occur within topologically associated domains (TADs). Since the size of TADs in the rice genome was estimated to be ranging from 35 kilobase pair (kb) to 45 kb based on Hi-C data^24,25, we restricted our analysis to 40 kb (i.e., from 20 kb upstream to 20 kb downstream of the TSS) to predict target genes of OCRs. Using a cutoff of absolute Pearson correlation coefficient |R|>= 0.4 and P < 0.05, we obtained a total of 59,075 unique links between OCRs (n = 38,437, 32.8% of all OCRs) and genes (n = 18,781, 48.1% of annotated genes; Supplementary Fig. 4a, b and Supplementary Data 5). As expected, the OCR-to-gene links tended to occur more frequently in the proximal OCRs, and consequently the correlation between the gene expression and chromatin accessibility is higher for proximal links (Supplementary Fig. 4c–f).

Genetic variants within OCRs can contribute to changes in gene expression levels through expression quantitative trait loci (eQTL). We colocalized the identified OCR-to-gene links from our study with published eQTL data in rice²⁶, and we found a significant overlap (Chi-squared test, P < 1.55e − 06) between OCR-to-gene links and eQTL-gene pairs (Supplementary Fig. 4g). As expected, the correlation coefficients of colocalized OCR-to-gene links with eQTLs are significantly higher than those without colocalization (Wilcoxon test, P = 4.11e−38; Supplementary Fig. 4h). We identified numerous known regulatory variants that influence the expression of genes associated with agronomic traits. To name a few, a variant within a distal regulatory region ( ~ 12 kb upstream) of qSH1 modulates its expression dynamics, leading to change the seed shattering in rice⁹. Accordingly, there is a positive correlation (R = 0.47, P < 0.013) between the accessibility of this enhancer and the expression of qSH1 in various tissues, particularly in SAM where gene expression increases (Supplementary Fig. 4i, l). Similarly, OsLG1 is tightly linked to upstream regulatory regions that colocalize with a strong QTL associated with the panicle shape trait²⁷ (Supplementary Fig. 4j, l). IPA1 showed significantly positive correlation (R = 0.84, P < 2.95e−8) between its enhancer activity and gene expression, with increased expression in yield-related tissues (Supplementary Fig. 4k, l), confirming an important role of IPA1 to shape rice ideal plant architecture (IPA) and thus to enhance grain yield²⁸.

Taken together, the predicted OCR-to-gene links provide regulatory insights into agronomic trait development in rice and highlight targetable OCRs of important genes for genome editing.

Dissecting tissue-specific and stage-specific regulatory grammar

The comprehensive chromatin accessibility landscape of representative tissues gave us an opportunity to uncover tissue-specific regulatory grammar. We quantified the tissue-specificity of OCRs by utilizing the JSD score, which enables the discrimination of target genes from housekeeping (e.g., GAPDH²⁹ and OsGOGAT1³⁰) to tissue-specific (e.g., OsYABBY5³¹ and OsWRKY47³²) according to the above predicted OCR-to-gene links (Supplementary Fig. 5 and Supplementary Data 6). We have specifically focused on analyzing highly tissue-specific OCRs (n = 6686 with a cutoff of JSD > 0.08, ~ 7% of all OCRs) as they may encode the tissue-specific regulatory grammar. These OCRs were further annotated as promoters (n = 2322) or enhancers (n = 4364) according to the genomic distance to the TSS. By performing joint clustering analysis of chromatin accessibility and target gene expression using OCR-to-gene links, we identified 20 distinct clusters of OCRs (Fig. 2b and Supplementary Data 7). Each cluster had 200~500 OCR-to-gene links that were highly activated in specific tissues, and showed a high degree of consistency with the known biological characteristics of the corresponding tissues (Fig. 2b–d). For instance, the palea- and lemma-specific links in cluster 5 (C5) contained promoter-enhancer interactions at the locus of GW8, which is a known gene controlling grain weight in rice³³ (Fig. 2c). Accordingly, GW8 was highly expressed in pistil, lemma, and palea. Gene ontology (GO) enrichment analysis using genes from C5 revealed that biological processes such as ‘pollen−pistil interaction’ and ‘pollination’ were overrepresented (Fig. 2d). Similarly, we identified a number of OCRs in C19 that were highly and specifically accessible in meristem-like tissues (including young panicle and shoot apical meristem), and the associated target genes showed significant enrichment for functions related to ‘reproductive system development’, ‘flower development’, and ‘shoot system development’ (Fig. 2b, d). Notably, RFL, a crucial regulator for plant architecture and flowering time^34,35, was among these target genes (Fig. 2c). Interestingly, we observed that a higher proportion (28.9%) of tissue-specific OCRs originated from distal intergenic regions compared to constitutive OCRs (12.3%). In contrast, approximately 85% of constitutive OCRs were derived from the proximal-promoter regions. (Fig. 2e).

To delineate the TFs that may bind to these tissue-specific OCRs, we used GimmeMotifs³⁶, a versatile tool can detect tissue-specific TF binding motifs by comparing TF binding activity across multiple experiments. We restricted our analysis to the top 2,500 OCRs in each tissue, as determined by their specificity measurement (SPM) score³⁷. The predicted regulatory motifs showed significant enrichments in a tissue-specific manner in matching tissue types (Supplementary Fig 6 and Supplementary Data 8). We narrowed our focus to the top enriched regulators in each tissue type, and found many of the inferred links correspond to known regulatory relationships (Fig. 3a). For example, OsIDS1, a gene that plays a vital role in shaping inflorescence structure and establishing floral meristems^38,39, exhibited relatively high activity in the panicle. OsbZIP72, enriched in plumule tissue, has been found to regulate plumule length by modulating abscisic acid (ABA) signaling and promote seed germination^40,41. Notably, the tissues of seed and pistil demonstrated a co-enrichment pattern of crucial regulators involved in flower and seed development, including MFO1 and MADS63^42,43,44 from the MADS gene family (Fig. 3a). For each tissue type, we performed a systematic analysis to calculate the relative preference of regulators within TF families. Our analysis revealed distinct tissue-specific TF binding patterns, indicating clear preferences for specific regulators in different tissues (Fig. 3b). For instance, the TCP TF family showed a preference for enrichment in stem, stamen, and panicle neck node (PNN) tissues. This observation aligns with the known biological function of TCP genes, specifically their role in regulating cell proliferation in developing tissues⁴⁵.

**Fig. 3: Tissues-specific and stage-specific regulatory elements.**

Analyzing temporal ATAC-seq data through footprinting could assist in identifying key regulators, such as pioneer factors, that control developmental progression and transition⁴⁶. We generated temporal open chromatin data from the young panicle, which is a crucial organ determining the yield of rice^47,48, across four successive developmental stages (<1 mm, 1–2 mm, 3–5 mm, and 5–10 mm; Fig. 1a). We endeavored to identify regulatory motifs that exhibited either positive or negative correlation with the young panicle developmental stage in terms of enrichment, using dynamically changing OCRs (n = 9244; Fig. 3c, Supplementary Fig. 7a and Supplementary Data 9). The regulators that were most enriched displayed predominantly positive correlations, indicating their function as transcriptional activators. Conversely, a subset of factors exhibited negative correlations, suggesting a repressive role. In this regard, DL (encoding OsYABBY⁴⁹), OsSPL9⁵⁰, and OsSPL14⁵¹ were identified as representative positive regulators, during the development of young panicles in rice (Fig. 3d and Supplementary Fig. 7b). However, further experimental data is necessary to validate the potential involvement of these TFs in young panicle development.

Overall, the above results provide a valuable resource that can help guide studies of candidate key regulators for tissue-specific gene regulation.

Systemic localization of GWAS variants in tissue-specific regulatory DNA

Genome-wide association studies (GWAS) have identified numerous natural variations linked to various agronomic traits in rice³. To systematically colocalize GWAS-associated variants with the above annotated regulatory elements, especially those from noncoding regulatory regions, we compiled a comprehensive rice GWAS catalog from recent genome-wide association meta-analysis studies^2,52,53,54 as well as the NGDC GWAS Atlas database⁵⁵. In total, we collected 4831 significant (P < 1e−5) and representative (only considering lead SNP) associations for 209 distinct quantitative traits which can be classified into seven major categories⁵⁶: morphological characteristics, physiological features, yield components, grain quality, resistance, coloration, and others (Fig. 4a and Supplementary Data 10). In a nutshell, these GWAS SNPs dominantly located in intergenic noncoding regions (Fig. 4b and Supplementary Fig. 8a) and 24.5% of them were either situated within a noncoding OCR (21.1%) or located in linkage disequilibrium (LD) with SNPs in a neighboring OCR (3.4%) (Fig. 4c). Moreover, OCRs revealed significantly higher enrichment of GWAS SNPs than protein-coding sequences (Fig. 4d), highlighting the crucial function of regulatory variants in determining phenotypic characteristics.

**Fig. 4: GWAS-associated variants localize in tissue-specific OCRs.**

Furthermore, our findings demonstrated that OCRs containing GWAS SNPs exhibited greater tissue specificity (Fig. 4e, f and Supplementary Fig. 8b-d). For instance, one of the OCRs containing a GWAS lead variant vg0724671055⁵³ (C/T, GWAS P < 9.27e−8) significantly associated to panicle number. This OCR was found to be highly accessible specifically to young panicle tissues and its accessibility showed a positive OCR-to-gene link with the expression of GW7 (R = 0.59, P < 9.14e−5; Fig. 4g). In another example, the GWAS lead variant vg0431427332 is significantly associated to leaf blade width⁵² (P < 1.58e−8), which was located in a SAM/Panicle-specific OCR to positively regulate the expression of NAL1 (R = 0.72, P < 1.16e−6) (Fig. 4h). The previous studies have shown that NAL1 is not only associated with leaf width but also with yield⁵² and has natural variations in expression levels²⁶. More examples of validated OCR-related associations are presented in Supplementary Fig. 8e.

Tissue-specific regulatory variants explain agronomic trait associations

The variation in DNA sequences within OCRs plays a significant role in driving phenotypic innovation through altering chromatin state and gene expression patterns, which usually occurs in a tissue-specific manner. To investigate the relationship between genetic variations associated with agronomic traits and tissue-specific OCRs, we calculated the enrichment of genetic variations within OCRs in a tissue-specific manner. It turned out that significant GWAS SNPs were frequently enriched in OCRs of trait-relevant tissues (Fig. 4f and Supplementary Fig. 8d). For example, GWAS variants associated with spikelet traits were highly enriched in OCRs specific to the tissues of SAM1, pistil and panicle. Motivated by this observation, we performed an enrichment analysis of GWAS-identified SNPs in OCRs from various tissues, using a SNP enrichment method termed CHEERS⁵⁷ (Supplementary Fig. 9). Of the 209 curated GWAS-related traits, ~78% (163 of 209) phenotypic traits showed GWAS SNP enrichment in at least one tissue (Supplementary Fig. 10 and Supplementary Data 11). The observed enrichment of agronomic trait-associated variants in regulatory elements was highly specific to tissue types, and the association is largely compatible with our current understanding of the tissue function (Fig. 5a). For example, in various GWAS studies, regulatory variants associated with plant height was enriched in stem-related tissues; while genetic associations for grain-related traits (such as grain thickness, grain width, grain length, blighted grains per plant, and filled grains per plant) were highly enriched in OCRs specific to the tissues of seed, lemma, pistil, and stamen (Fig. 5a). Meanwhile, we found that variants associated with root length were predominantly enriched in the root tissue. Specifically, a significant SNP (vg0806201957⁵⁸, P < 3.98e-8) located in a root-specific enhancer of OsHAK12, which has been shown to be involved in K⁺ uptake in roots⁵⁹ (Supplementary Fig. 11a).

**Fig. 5: Association of tissue type with complex traits.**

In the case of seed germination percentage, GWAS SNPs were most significantly enriched in plumule-specific OCRs (Fig. 5a). We noted a lead SNP (vg0131729028⁶⁰, A/G, P < 8.4e−8) localized within an intronic OCR of OsbZIP06, where the intronic OCR and OsbZIP06 formed a positive OCR-to-gene link (R = 0.82, P < 2.55e-7) with high tissue specificity in plumule and radicle (Fig. 5b). The minor allele (G) of vg0131729028 was present in a very small proportion (0.3%) in the XI population, but in 65.80% of the Aus population (Fig. 5c). We mutated the coding region (mainly 1st exon) of OsbZIP06 with CRISPR/Cas9 and found that the germination rate was higher in two frameshift mutations (osbzip06-1 and osbzip06-2) than in the wild type (Fig. 5d-f and Supplementary Data 12). In contrast, overexpression of the OsbZIP06 resulted in a lower germination rate (Fig. 5e, f). Therefore, the integration of publish GWAS data and our chromatin landscape can greatly facilitate the identification of candidate genes and the functional annotation of noncoding variants.

Furthermore, when we divided the OCRs into proximal (<3 kb from the TSS, 60,006 OCRs) and distal OCRs (>3 kb from TSS, 35,691 OCRs) before using CHEERS to do enrichment analysis. We observed that the proximal OCRs are more enriched in GWAS SNPs (Fig. 5g–i and Supplementary Fig. 11b). This implies that the enrichment above is mainly driven by the OCR close to the TSS and this result is consistent with previous studies^57,61.

Deep learning models accurately predict differences in chromatin accessibility between tissues and unveil common regulatory grammar among varieties

We further investigated whether the tissue- and stage-specific regulatory grammar can be modelled. Deep learning has been successfully utilized to learn and identify essential features in genomic sequences, such as the identification of cis-elements^62,63. Our previous study demonstrated that the Basenji deep learning framework⁶⁴ is powerful for modelling epigenomic data in rice, such as the ability to accurately predict chromatin accessibility and to assess the impacts of variants⁷. Therefore, we optimized the Basenji framework to effectively model our ATAC-seq datasets from multiple tissues (Supplementary Fig. 12a,b). Three distinct models were trained for the varieties of NIP, MH63, and ZS97, demonstrating high accuracy with the mean AUROC values of 0.931, 0.921, and 0.928, respectively (Fig. 6a and Supplementary Fig. 12c). We observed that the Pearson’s correlation coefficient between the predicted and observed values of chromatin accessibility at different locations on the genome reached approximately 0.81, with the best prediction at the location of <1 kb upstream regions (Fig. 6b and Supplementary Fig. 12d). This implies that the regulatory syntax patterns within promoter regions could carry more significant information encoded in sequences, which can be effectively captured by deep learning models. Furthermore, the predicted signals from the test sets exhibit the ability to discern between distinct tissues and closely align with the clustering results of the actual values (Fig. 6c). For example, the root-specific expressed gene RCc3, responsible for regulating lateral root growth⁶⁵, exhibits distinct chromatin accessibility patterns specifically in root (Fig. 6d and Supplementary Fig. 13).

**Fig. 6: Using deep learning model to predict chromatin accessibility across tissues and varieties.**

Subsequently, for each variety-specific model, we used test sets from the remaining two varieties to evaluate the model’s capacity for making predictions across different varieties. Our analysis revealed high Pearson correlation coefficients (about 0.8) between the predicted and observed signals (Fig. 6e). Notably, in the GSE9 promoter region, there is divergence between indica and japonica rice, marked by a 9 bp deletion and several SNPs in MH63 when compared to the sequences of NIP and ZS97⁶⁶. The ZS97 model predicted the chromatin accessibility of this region in MH63 with weak signals. Contrarily, the ZS97 model accurately predicted the chromatin accessibility in NIP and ZS97, showing strong signals (Fig. 6f and Supplementary Fig. 14). These results suggest that the deep learning model can effectively make accurate predictions across varieties, implying that shared regulatory grammar across rice varieties.

We next performed comparative analyses on ATAC-seq data of 22 matched tissues/organs in both japonica rice (NIP) and indica rice (MH63 and ZS97), utilizing their respective reference genomes (Fig. 1a, b). We found that roughly 60% (60,764 out of 95,697) of OCRs were shared across all three cultivars (Fig. 6g and Supplementary Data 13). The indica varieties MH63 and ZS97 exhibited a higher proportion of shared OCRs compared to NIP from different subspecies (Fig. 6g). We next sought to compare chromatin accessibility dynamics of the 1:1:1 orthologous OCRs across the three varieties (referred to as triads; see Methods). To investigate the accessible bias of orthologous OCRs, we compared the chromatin accessibility of orthologous OCRs in each individual tissue (Fig. 6h). Orthologous OCRs were assigned into seven categories on the ternary plot based on their relative accessibility, including a balanced category and six dominated or suppressed categories in specific cultivars (Supplementary Fig. 15). The proportion of OCR triads assigned to unbalanced categories varied among different tissues, ranging from 3.2% in plumule to 24.8% in AM1 (Fig. 6h and Supplementary Fig. 16a). While promoters generally display balanced OCRs, indicating consistent accessibility across different cultivars, enhancers frequently exhibit unbalanced OCRs, reflecting cultivar-specific regulation (Supplementary Fig. 16b). Interestingly, unbalanced OCRs harbored more genotypic variations in terms of SNPs (Fig. 6i). This observation led us to suppose whether sequence variation among different varieties caused the differences in chromatin accessibility of these OCR orthologs. Therefore, we used NIP-based deep learning model to predict the chromatin accessibility signals of sequences from orthologous OCRs in NIP, MH63 and ZS97, respectively, and then compare these predictions. The results showed that about 50% of the differences in orthologous OCRs could be successfully resolved in terms of sequence variation (Fig. 6j and Supplementary Fig. 17).

In summary, the above results illustrate that deep learning models could accurately predict chromatin accessibility across tissues and varieties. The high accuracy of the models also indicates the high quality of our data.

Elucidate key genetic changes underlying cis-regulatory divergence by deep learning models

Genetic variants and de novo mutations in regulatory regions may lead to cis-regulatory divergence and thus changes in gene expression and organismal phenotypes⁶⁷. We systematically dissected the cis-regulatory divergence due to genomic sequence changes (e.g., SNPs) in regulatory regions, which could be inferred from ATAC-seq data. To measure the effect of the variant on chromatin accessibility, we extracted variants that differed in the three varieties. The effect of different alleles of each variant on chromatin accessibility was evaluated using the deep learning models. We found that unbalanced OCRs had a higher absolute effect score than the balanced OCRs (Supplementary Fig. 18a) and these large-effect loci were significantly enriched for eQTLs^26,68 (Supplementary Fig. 18b). This observation suggests that these putative large-effect variants are associated with changes in chromatin accessibility and gene expression. Meanwhile, we performed separate OCR-to-gene correlation analysis for each of the three varieties. We then identified conserved OCR-to-gene links and compared the correlation coefficients between them (Fig. 7a). Notably, OCRs with significant differences in correlation coefficients exhibited higher SNP density (Fig. 7b), and the OCR-to-gene links with large differences in correlation coefficients between MH63 and ZS97 were significantly enriched for differential cis-eQTL between MH63 and ZS97 (Fisher’s exact test, odds ratio = 1.81 and P < 1.83e−28)⁶⁹. These suggesting that regulatory sequence variations among different varieties could influence gene expression. For instance, we observed that a SNP (vg0336150781, G/A) located in the GNP1 promoter region control grain number and plant height⁷⁰. Among the OCR-to-gene links we inferred, the allele in NIP (‘G’ at this SNP) correlated with GNP1 (R = 0.59, P < 6.48e−04), whereas the allele (‘A’ at this SNP) did not show OCR-to-gene correlation in MH63 (R = 0.01, P = 0.99) and ZS97 (R = 0.17, P = 0.34) (Fig. 7c). In addition, eGWAS also demonstrated that this SNP affects GNP1 expression (Fig. 7d). When we evaluated the effects of this SNP with the deep learning model, we found that mutation of this SNP from “G” to “A” in Panicle2 significantly reduced chromatin accessibility (Fig. 7e). We also found that this variant overlaps with the footprint of OsSPL10 identified in Panicle2, and its binding site shows the typical “GTAC” motif of the SBP TF family. These results suggest that mutations control gene expression by affecting TF binding to alter chromatin accessibility.

**Fig. 7: Genomic mutations contribute to *cis*-regulatory divergence.**

Besides point mutations, small genomic alterations (including short insertions/deletions, inversions, and duplications) may abolish OCRs and thus confer an important avenue of regulatory divergence. We quantified all OCRs based on the NIP reference genome and investigated whether their regulatory activity dynamics were associated with short genomic alterations, which were determined by whole genome comparison across different cultivars (see Methods). In total, we found that nearly one third (26.6%) of the OCRs harbored small alterations (Fig. 7f). The regulatory activity of these mutation-associated OCRs is positively correlated with their surrounding gene expression patterns in a cultivar-specific manner (Fig. 7g, h), as exemplified at the loci of Oshsp18.0-CII and MAG2 (Fig. 7i, j and Supplementary Fig. 19a). Notably, GO analysis showed that these genes were highly enriched for various ‘response’ related functions (Supplementary Fig. 19b and Supplementary Data 14). Further investigation revealed that the identified mutation-embedded OCRs were significantly overlapped with transposable elements (TEs) (Supplementary Fig. 19c). The above results indicate that TEs may contribute to modification of regulatory sequences, fine-tuning gene expression networks and driving new functions⁷¹.

Discussion

Despite substantial progress, a complete catalog of regulatory sequences within the rice genome remains elusive, limiting the understanding of tissue-specific regulatory dynamics and GRNs. Our study presents a comprehensive exploration of rice genome regulation using the UMI-ATAC-seq technique¹², providing insights into tissue-specific regulatory elements and their influence on complex agronomic traits. Of note, the identified OCRs in rice encompass approximately 15% of the genome, a notably higher proportion compared to previous reports in plants such as Arabidopsis (~4%)⁷² and maize (~4%)⁷³. This expanded coverage underscores the importance of sampling depth in characterizing the regulatory complexity in plants and highlights the need for further comparative analyses to elucidate species-specific regulatory features.

Predicting OCR-to-gene links presents a significant challenge due to the intricate regulatory mechanisms governing gene expression. By integrating RNA-seq data from matched tissues, we predicted 59,075 OCR-to-gene links, including many reported enhancer-to-gene links. This analysis offers a holistic view of how changes in chromatin accessibility directly impact gene expression patterns, underscoring the significance of regulatory elements in shaping the rice transcriptome. The identified associations between enhancers and target genes provide guidelines for dissecting complex regulatory mechanisms and gene editing in non-coding regions. The approach for predicting OCR-to-gene links based on multi-omics data is versatile and transferable to other plant species. Despite our efforts to predict OCR-to-gene links, less than half of the protein-coding genes exhibit a relatively strong correlation (Pearson correlation coefficient |R|≥ 0.4 and P < 0.05) with OCRs. The complexities of dynamic and context-dependent regulation, coupled with long-range and indirect regulatory mechanisms, introduce additional layers of complexity to OCR-to-gene link prediction beyond the capabilities of linear models aimed at directly mapping OCRs to their target genes. These factors likely contribute to the weaker correlations observed for certain genes. Moreover, tissue-specific and housekeeping genes are difficult to correlate through linear models due to the small variation in expression levels between tissues (Supplementary Fig. 20).

Deep learning has emerged as a potent tool for interpreting the genomic and epigenomic data^62,63, but its application in rice is hindered by the scarcity of high-quality epigenomic datasets. Our study addressed this gap and successfully modelled the chromatin accessibility of three rice varieties. The highly accurate models enable the prediction of chromatin accessibility variation across varieties using sequences, providing a reference for scientists to explore the functional effects of rare variants or new variants across different tissues.

Moreover, our comparative analysis across varieties revealed cis-regulatory divergence that could largely be predicted using deep learning models based on sequences, highlighting the genetic diversity of rice varieties and its impact on regulatory architecture. By integrating GWAS data, we localized significant variants within noncoding regulatory regions, demonstrating that these variants are preferentially located in tissue-specific OCRs, thus providing insights into the influence of regulatory variations on phenotypic outcomes. A notable achievement of our study is the identification of OsbZIP06’s role in seed germination, demonstrating the potential of integrating GWAS data with chromatin accessibility to uncover the genetic basis of complex traits.

In summary, our extensive chromatin accessibility atlas and the deep learning models we have constructed not only enhance our understanding of regulatory elements in rice but also serve as a versatile resource for gene editing and breeding strategies targeting non-coding regions. Nevertheless, there are several limitations associated with our study. Firstly, our map solely encompasses data from normal conditions, omitting insights into responses to biotic or abiotic stresses, mutants, and diverse environmental circumstances. Secondly, the inferred associations between OCRs and genes require experimental validation to confirm their regulatory relationships. Thirdly, our study primarily employed the NIP reference genome, thereby excluding sequences that were not available in the NIP genome. Furthermore, the advent of single-cell technologies has opened avenues for studying cis-elements at a single-cell resolution^73,74. In the future, incorporating single-cell data will be crucial for further characterizing the heterogeneity among different cell types. These advancements will collectively contribute to a more comprehensive understanding of the regulatory landscape in rice and beyond.

Methods

Plant materials, ATAC-seq, and RNA-seq experiments

Three rice varieties, Nipponbare, Zhenshan 97 and Minghui 63, were planted in a field in Wuhan, China in the summer of 2020 and were used to obtain most of the tissues or organs used in this study. Details of the sampling are listed in Supplementary Data 1. We followed our previously established method to perform UMI-ATAC-seq experiments¹². RNA was isolated using TRIzol reagent (Invitrogen Life Technologies), and sequencing libraries were prepared using the MGIEasy RNA Library Preparation Kit. The libraries were subsequently sequenced on the MGISEQ-2000.

ATAC-seq data analysis

For the pre-processing of ATAC-seq data, we follow the workflow of ChIP-Hub¹⁴ and cisDynet⁷⁵. The raw reads were first trimmed by Trimmomatic (v.0.36)⁷⁶ to remove sequencing adapters. The trimmed reads were aligned to the Oryza sativa L.ssp.japonica (cv. Nipponbare) reference genome (v.7.0)¹⁶ using Bowtie2⁷⁷ with the following parameters”-q—no-unal—threads 8—sensitive”. All reads mapped to mitochondrial and chloroplast DNA were removed. After sorting mapped reads with SAMtools⁷⁸ (version 0.1.19), we only used properly paired reads with high mapping quality (MAPQ score > 30) for the subsequent analysis. The PCR duplicates were removed using the MarkDuplicates function from Picard tools (version 2.60; http://broadinstitute.github.io/picard/). The “callpeak” function in MACS2⁷⁹ (version 2.1.0) was used to call peaks with the following parameters: “-g 3.0e8 --nomodel --keep-dup 1 -B --SPMR --call-summits”. The “-shift” used in the model was determined by the analysis of cross-correlation scores using the phantompeakqualtools package (https://code.google.com/archive/p/phantompeakqualtools/).

RNA-seq data analysis

RNA-seq reads were aligned to the Nipponbare reference genome¹⁶ using STAR⁸⁰ (version 2.7.1a). The expression of annotated genes was measured by RSEM⁸¹ (version 1.2.22) and normalized with transcripts per million (TPM).

Linking OCRs to target genes

To assign OCRs to genes, we used an approach similar to the previous study^23,82. First, we prepared the ATAC-seq quantification matrix, with each row representing a merged OCR and each column representing a sample. After merging replicates, 66 tissues with both ATAC-seq data and RNA-seq data were taken as independent samples for the analysis. For the gene expression quantification matrix, we removed possible noise by considering only those genes whose TPM of each row added up to > 1.5. For each of the remaining 29,571 genes, we screened the OCRs that might regulate the gene within 20 kb upstream and downstream of the TSS of that gene separately. Then we calculated the Pearson correlation coefficients between the chromatin accessibility of these OCRs and the expression of that gene. Then we randomly generated pseudo-peak sets of the same length and number as these OCRs on other chromosomes, repeated the process 10,000 times, and used Z-test (z.test function from the R package ‘TeachDemos’) to calculate P values. Finally, we considered that absolute Pearson correlation coefficients (|R|) >= 0.4, and P < 0.05 were significant OCR-to-gene links. For the identification of OCR-to-gene links of NIP, MH63, and ZS97, we used the same strategy except that we used ATAC-seq and RNA-seq samples of the corresponding varieties.

Tissue-specific OCRs analysis

We merged the peaks with NIP tissues, counted the number of Tn5 cuts of these peaks in different tissues, normalized them, and then used the Jensen-Shannon Divergence (JSD) from the philentropy R package (https://github.com/drostlab/philentropy) to screen tissue-specific OCRs. Here we considered OCRs with JSD score > 0.08 (except > 0.1 for young panicle) as tissue-specific OCRs. For tissue-specific OCR, we performed Z-score transformation by row for visualization. To identify the top tissue-specific OCRs in each tissue, we employed a scoring metric known as Specificity Measurement (SPM), as detailed in the method provided at https://github.com/apcamargo/tspex. Subsequently, we sorted the OCRs within each tissue based on their SPM scores to select the top 2500 tissue-specific OCRs in each tissue.

Motif enrichment analysis

For motif enrichment analysis of tissue-specific OCRs, we first calculated the Tau index score using SPM metric for each tissue’s OCRs and selected the top 2500 OCRs of each tissue for motif enrichment analysis according to the ranking of Tau index scores. Then we used GimmeMotifs³⁶ with maelstrom function to determine the tissue-specific motifs enrichment. We set the “--filter-cutoff” to 0.4. The input Position weight matrix (PWM) was downloaded from the JASPAR⁸³ database (https://jaspar.genereg.net/). We combined the enrichment results of three methods (Lasso, Bayesian ridge regression, and boosted trees regression) to get the final motif enrichment lists.

ChIP-seq enrichment analysis

The public ChIP-seq data used in this study are provided in Supplementary Data 3. We downloaded the narrow Peak files of these TFs from the ChIP-Hub database (https://biobigdata.nju.edu.cn/ChIPHub/), and then used BEDTools⁸⁴ (version 2.29.1) fisher function to calculate the enrichment level with the OCRs.

TF motif and footprinting analysis

For the TF motif enrichment analysis, we used the SEA program from the MEME suite and used constitutive OCRs as background. We considered motifs with P-value < 1e−5 to be significantly enriched. For genome-wide TF potential binding sites, we used the FIMO program in MEME to identify them and also used P-value < 1e−5 as the cutoff.

TF footprints were calculated by TOBIAS (version 0.13.1)⁸⁵. We first used TOBIAS ATACCorrect function to correct the Tn5 inherent insertion bias. Then we calculated the footprint score in OCRs using FootprintScores function with default parameters. Finally, we used BINDetect function to predict the transcription factor binding footprint for each sample, which were matched to curated list of JASPAR⁸³ motifs (https://jaspar.genereg.net/).

Cross-variety comparisons of OCRs

We first aligned the whole genome sequences of NIP, MH63, and ZS97 to each other. The strategy used for the whole-genome alignment was similar to the previously described method²³. The results were further filtered to obtain more reliable conserved sequences following the default process of “Reciprocal Best” (http://genomewiki.ucsc.edu/index.php/HowTo:_Syntenic_Net_or_Reciprocal_Best). We obtained three superset OCRs by merging OCRs of tissues shared by three varieties s (n = 22). Then we used the bnMapper.py script in bx-python (https://github.com/bxlab/bx-python) to convert the OCRs coordinates of MH63, ZS97 to the corresponding coordinates of the NIP. We then considered the OCRs of MH63, and ZS97 with at least a 50% overlap with the OCRs of NIP to be conserved OCRs for the three varieties. To obtain a quantitative matrix of conserved OCRs, we first quantified all OCRs for each variety and divided the length of the corresponding OCRs by the CPM strategy, and then extracted the conserved OCRs for each variety for subsequent analysis. We then refer to it to classify conservative OCRs into seven categories (NIP dominant, MH63 dominant, ZS97 dominant, NIP suppressed, MH63 suppressed, ZS97 suppressed, and balanced).

Deep learning model analysis

We used the Basenji⁶⁴ deep learning framework with modifications to accommodate the relatively small rice genome for deep learning model training. We first use the bam_cov.py script to convert the bam files into bigwig files. We then used the basenji_data.py script to prepare the input files for the deep learning model according to the following parameters: “-d 1.0 -s 0.1 –local -t 1 -v 4”. The “-c, -l, -w” of these parameters are shown in the Supplementary Data 15. Data from chromosome 1 was used as the test sets and data from chromosome 2 was used as the validation sets. Next, we used the basenji_train.py script to train the model on a NVIDIA GTX 3090. The basenji_test.py script (default parameters) was used to perform the model performance test. We found differences in the training performance for different parameter settings for rice, with “-l 32768 -c 2048 -w 128” being the best, and subsequent analyses were based on models trained with this parameter. To measure the effect of variation in OCRs, we used the basenji_sat_bed.py script to perform base mutations at this locus and calculated the difference in signals between the reference and the mutation as the variation effect value. To predict the chromatin accessibility of orthologous OCRs, we extended the center of the OCR by 16,384 bp left and right to make a total length of 32,768 bp. Sequences exceeding the length of the corresponding chromosome were removed, and then sequences of the corresponding varieties were extracted using BEDTools getfasta function, and then basenji-predict_bed.py was modified to enable it to use fasta as input.

Analysis of structural variants and transposable elements

We downloaded deletions, duplications and inversions for MH63 (CX145), ZS97 (B156) in Rice SNP-Seek Database (https://snp-seek.irri.org/). Since this database provides large structural variants with a minimum length of 10 bp, we also integrated a series of variants with reference to this workflow⁷. Briefly, we selected Leaf ATAC-seq data from NIP, MH63, and ZS97 varieties, then aligned them to the Nipponbare reference genome using BWA-MEM⁷⁸ (version 0.7.12-r1039) and identified INDELs using GATK⁸⁶ (version 3.3-0-g37228af). The annotation files of transposable elements (TEs) were downloaded from Phytozome database. We use the default parameters of the BEDtools⁸⁴ (version 2.29.1) intersect function to identify OCRs that overlap with structural variants and TEs.

GO enrichment analysis

All GO enrichment analysis was done in the Rice Gene Index database (https://riceome.hzau.edu.cn/)⁸⁷ using default parameters. We considered FDR < 0.05 as a significantly enriched pathway.

GWAS data processing

The genotype and phenotype data used in this study were downloaded from four published cohorts. We refer to this reference³ to name them as 529 rice accessions², 1,275 Chinese rice accessions⁵³, 176 Japanese rice accessions⁵², and 3 K rice accessions⁵⁴, respectively. GWAS was performed separately for each cohort by GCTA⁸⁸ (version 7.93.2) with a mixed linear model. To determine the significant SNPs cutoff, we first used Genetic type 1 error calculator (GEC⁸⁹, version 0.2) to evaluate the effective numbers of independent SNPs (N) and approximated by 0.05/N to estimate the cutoff. The threshold for significant SNPs varied by cohorts, we set the thresholds to 1 × 10⁻⁶, 1 × 10⁻⁴, 1 × 10⁻⁵, and 1 × 10⁻⁶ for 3 K rice accessions, 176 Japanese rice accessions, 529 rice accessions, and 1,275 Chinese rice accessions, respectively. Variants with a minor allele frequency (MAF) that was <5% were excluded. For the lead SNP identification, we used PLINK⁹⁰ (version 1.9) and set the parameter “--clump-p1” to the threshold we defined above, “--clump-p2 0.05 --clump-r2 0.6 --clump-kb 1000” for the first round of parameters. Then we set the second round of “--clump-r2” to 0.1, other parameters are unchanged. We used PLINK with the following parameters “--ld-window-kb 1000 --ld-window 99999 --ld-window-r2 0.8” to calculate the SNPs with strong linkage disequilibrium (r² > 0.8) with lead SNPs.

Enrichment analysis of GWAS-associated SNPs of different P-values with OCRs

The enrichment in the OCRs of tissue at a given threshold of different P-values was calculated as the fraction of SNPs with P-values below this threshold that overlap with the OCRs (merged all NIP tissues’ OCRs), divided by the fraction of all noncoding SNPs that overlap with the OCRs in the study. Enrichment was performed at P-value thresholds ranging from 1e-1 to 1e-7. The smallest threshold had at least 50 SNPs in their study to ensure a sufficient sample size.

GWAS SNPs enrichments

We first merged the peaks from all tissues in Nipponbare and used this peak superset to quantify each tissue. To make sure that our analysis was not interfered by low confident peaks, we dropped the peaks in the tenth percentile of the lowest Tn5 cuts coverage, yielding 86,011 ATAC-seq peaks finally. Then we performed the normalization with CHEERS_normalize.py from CHEERS⁵⁷ (Chromatin Element Enrichment Ranking by Specificity) software (https://github.com/TrynkaLab/CHEERS/tree/python3). The normalized quantification matrix was next transformed to tissue-specificity score with range 0–1. To do the enrichment analysis, we used the set of lead SNPs and SNPs with strong linkage disequilibrium (r² > 0.8) with the lead SNPs computed separately for the corresponding cohort from the 209 GWAS above as the input to CHEERS_computeEnrichment.py. The enrichment P-values were transformed by -log10 and normalized by row with Z-score for visualization. For the proximal and distal GWAS SNPs enrichment analysis, we first divided OCRs into proximal and distal according to its summit distance from the nearest TSS. All other steps are the same as described above.

Generation of transgenic rice plants

To obtain overexpression lines of OsbZIP06, the cDNA of OsbZIP06 was cloned using primers OsbZIP06-OE-F and OsbZIP06-OE-R and inserted into the Kpn1-BamH1 site of the pCAMBIA1301 vector and fused with the maize Ubiquitin promoter and three FLAG tags at its C-terminus using the ClonExpress II One Step Cloning Kit (Vazyme). The construct was then transformed into ZhongHua11 (ZH11) by Biogle GeneTech. Primers used to clone OsbZIP06 are listed in Supplementary Data 16.

For the OsbZIP06 mutant strain, T1 generation seeds produced using the CRISPR-Cas9 system were purchased from Biogle GeneTech. The sgRNA sequence OsbZIP06-CR-gRNA is listed in Supplementary Data 16.

Seed germination experiments

Seed germination experiments were performed as previously described⁶⁰. Seeds of Zhonghua 11, OsbZIP06 mutant in the Zhonghua 11 background were used for germination experiments.

Statistics and reproducibility

If not specified, all statistical analyses and data visualization were done in R (version 4.0.0) or Python (version 3.8.9). R packages (e.g. ggplot2 and plotly) and Python packages (e.g. Seaborn) are heavily used for graphics. All the sources data for each figure can be found in the Supplementary Information. Specific tests used to determine statistical analyses are noted in each figure legend.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The sequencing data from ATAC-seq and RNA-seq generated in this study have been deposited in the NCBI BioProject database under accession code PRJNA940508. All public ChIP-seq used in this study are download from ChIP-Hub database (https://biobigdata.nju.edu.cn/ChIPHub/). The accession number are provided in the Supplementary Data 3. Some critical analysis results about this study can be accessed in the CART database (https://biobigdata.nju.edu.cn/cart/).

Code availability

The code related to figures is available at https://github.com/compbioNJU/CART.

References

Wang, C. & Han, B. Twenty years of rice genomics research: From sequencing and functional genomics to quantitative genomics. Mol. Plant. 15, 593–619 (2022).
Article PubMed Google Scholar
Xie, W. et al. Breeding signatures of rice improvement revealed by a genomic variation map from a large germplasm collection. Proc. Natl Acad. Sci. 112, E5411–E5419 (2015).
Article CAS PubMed PubMed Central Google Scholar
Wei, X. et al. A quantitative genomics map of rice provides genetic insights and guides breeding. Nat. Genet. 53, 243–253 (2021).
Article CAS PubMed Google Scholar
Purugganan, M. D. & Jackson, S. A. Advancing crop genomics from lab to field. Nat. Genet. 53, 595–601 (2021).
Article CAS PubMed Google Scholar
Li, Y. et al. Rice functional genomics research: past decade and future. Mol. Plant 11, 359–380 (2018).
Article CAS PubMed Google Scholar
Schmitz, R. J., Grotewold, E. & Stam, M. Cis-regulatory sequences in plants: Their importance, discovery, and future challenges. Plant Cell 34, 718–741 (2022).
Article PubMed Google Scholar
Zhao, H. et al. An inferred functional impact map of genetic variants in rice. Mol. Plant 14, 1584–1599 (2021).
Article CAS PubMed Google Scholar
Zhang, W. et al. High-resolution mapping of open chromatin in the rice genome. Genome Res. 22, 151–162 (2012).
Article CAS PubMed PubMed Central Google Scholar
Konishi, S. et al. An SNP caused loss of seed shattering during rice domestication. Science 312, 1392–1396 (2006).
Article ADS CAS PubMed Google Scholar
Sun, X. et al. Natural variation of DROT1 confers drought adaptation in upland rice. Nat. Commun. 13, 4265 (2022).
Article ADS CAS PubMed PubMed Central Google Scholar
Bai, X. et al. Duplication of an upstream silencer of FZP increases grain yield in rice. Nat. Plants 3, 885–893 (2017).
Article CAS PubMed Google Scholar
Zhu, T., Liao, K., Zhou, R., Xia, C. & Xie, W. ATAC-seq with unique molecular identifiers improves quantification and footprinting. Commun. Biol. 3, 675 (2020).
Article CAS PubMed PubMed Central Google Scholar
Landt, S. G. et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 22, 1813–1831 (2012).
Article CAS PubMed PubMed Central Google Scholar
Fu, L.-Y. et al. ChIP-Hub provides an integrative platform for exploring plant regulome. Nat. Commun. 13, 3413 (2022).
Article ADS CAS PubMed PubMed Central Google Scholar
Zhang, J. et al. Extensive sequence divergence between the reference genomes of two elite indica rice varieties Zhenshan 97 and Minghui 63. Proc. Natl Acad. Sci. 113, E5163–E5171 (2016).
CAS PubMed PubMed Central Google Scholar
Kawahara, Y. et al. Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice 6, 4 (2013).
Article PubMed PubMed Central Google Scholar
Qunhua, L., James, B. B., Haiyan, H. & Peter, J. B. Measuring reproducibility of high-throughput experiments. Ann. Appl. Stat. 5, 1752–1779 (2011).
MathSciNet Google Scholar
Yan, W. et al. Dynamic control of enhancer activity drives stage-specific gene expression during flower morphogenesis. Nat. Commun. 10, 1705 (2019).
Article ADS PubMed PubMed Central Google Scholar
Zhou, X. et al. Systematic annotation of conservation states provides insights into regulatory regions in rice. J. Genet. Genomics 49, 1127–1137 (2022).
Article PubMed Google Scholar
Van de Velde, J., Van Bel, M., Vaneechoutte, D. & Vandepoele, K. A collection of conserved noncoding sequences to study gene regulation in flowering plants. Plant Physiol. 171, 2586–2598 (2016).
Article PubMed PubMed Central Google Scholar
Linderman, G. C., Rachh, M., Hoskins, J. G., Steinerberger, S. & Kluger, Y. Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data. Nat. Methods 16, 243–245 (2019).
Article CAS PubMed PubMed Central Google Scholar
Villar, D. et al. Enhancer evolution across 20 mammalian species. Cell 160, 554–566 (2015).
Article CAS PubMed PubMed Central Google Scholar
Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. Science 362, eaav1898 (2018).
Article ADS PubMed PubMed Central Google Scholar
Liu, C., Cheng, Y.-J., Wang, J.-W. & Weigel, D. Prominent topologically associated domains differentiate global chromatin packing in rice from Arabidopsis. Nat. Plants 3, 742–748 (2017).
Article CAS PubMed Google Scholar
Golicz, A. A., Bhalla, P. L., Edwards, D. & Singh, M. B. Rice 3D chromatin structure correlates with sequence variation and meiotic recombination rate. Commun. Biol. 3, 235 (2020).
Article CAS PubMed PubMed Central Google Scholar
Liu, C. et al. eQTLs play critical roles in regulating gene expression and identifying key regulators in rice. Plant Biotechnol. J. n/a, https://doi.org/10.1111/pbi.13912 (2022).
Ishii, T. et al. OsLG1 regulates a closed panicle trait in domesticated rice. Nat. Genet. 45, 462–465 (2013).
Article CAS PubMed Google Scholar
Zhang, L. et al. A natural tandem array alleviates epigenetic repression of IPA1 and leads to superior yielding rice. Nat. Commun. 8, 14789 (2017).
Article ADS CAS PubMed PubMed Central Google Scholar
Zhang, X.-H., Rao, X.-L., Shi, H.-T., Li, R.-J. & Lu, Y.-T. Overexpression of a cytosolic glyceraldehyde-3-phosphate dehydrogenase gene OsGAPC3 confers salt tolerance in rice. Plant Cell Tissue Organ Cult. (PCTOC) 107, 1–11 (2011).
Article CAS Google Scholar
Lee, S. et al. Concurrent activation of OsAMT1;2 and OsGOGAT1 in rice leads to enhanced nitrogen use efficiency under nitrogen limitation. Plant J. 103, 7–20 (2020).
Article CAS PubMed PubMed Central Google Scholar
Tanaka, W. et al. The YABBY Gene TONGARI-BOUSHI1 is involved in lateral organ development and maintenance of meristem organization in the rice spikelet. Plant Cell 24, 80–95 (2012).
Article CAS PubMed PubMed Central Google Scholar
Raineri, J., Wang, S., Peleg, Z., Blumwald, E. & Chan, R. L. The rice transcription factor OsWRKY47 is a positive regulator of the response to water deficit stress. Plant Mol. Biol. 88, 401–413 (2015).
Article CAS PubMed Google Scholar
Wang, S. et al. The OsSPL16-GW7 regulatory module determines grain shape and simultaneously improves rice yield and grain quality. Nat. Genet. 47, 949–954 (2015).
Article CAS PubMed Google Scholar
Rao, N. N., Prasad, K., Kumar, P. R. & Vijayraghavan, U. Distinct regulatory role for RFL, the rice LFY homolog, in determining flowering time and plant architecture. Proc. Natl Acad. Sci. 105, 3646–3651 (2008).
Article ADS CAS PubMed PubMed Central Google Scholar
Huang, L. et al. The LARGE2-APO1/APO2 regulatory module controls panicle size and grain number in rice. Plant Cell 33, 1212–1228 (2021).
Article PubMed Google Scholar
van Heeringen, S. J. & Veenstra, G. J. C. GimmeMotifs: a de novo motif prediction pipeline for ChIP-sequencing experiments. Bioinformatics 27, 270–271 (2011).
Article PubMed Google Scholar
Xiao, S.-J., Zhang, C., Zou, Q. & Ji, Z.-L. TiSGeD: a database for tissue-specific genes. Bioinformatics 26, 1273–1275 (2010).
Article CAS PubMed PubMed Central Google Scholar
Lee, D.-Y. & An, G. Two AP2 family genes, SUPERNUMERARY BRACT (SNB) and OsINDETERMINATE SPIKELET 1 (OsIDS1), synergistically control inflorescence architecture and floral meristem establishment in rice. Plant J. 69, 445–461 (2012).
Article CAS PubMed Google Scholar
Lee, Y.-S., Lee, D.-Y., Cho, L.-H. & An, G. Rice miR172 induces flowering by suppressing OsIDS1 and SNB, two AP2 genes that negatively regulate expression of Ehd1 and florigens. Rice 7, 31 (2014).
Article PubMed PubMed Central Google Scholar
Baoxiang, W. et al. OsbZIP72 is involved in transcriptional gene-regulation pathway of abscisic acid signal transduction by activating rice high-affinity potassium transporter OsHKT1;1. Rice Sci. 28, 257–267 (2021).
Article Google Scholar
Wang, Y. et al. Abscisic acid promotes jasmonic acid biosynthesis via a ‘SAPK10-bZIP72-AOC’ pathway to synergistically inhibit seed germination in rice (Oryza sativa). N. Phytol. 228, 1336–1353 (2020).
Article CAS Google Scholar
Yu, X. et al. Abnormal flower and grain 1 encodes OsMADS6 and determines palea identity and affects rice grain yield and quality. Sci. China Life Sci. 63, 228–238 (2020).
Article CAS PubMed Google Scholar
Li, H. et al. Rice MADS6 interacts with the floral homeotic genes SUPERWOMAN1, MADS3, MADS58, MADS13, and drooping leaf in specifying floral organ identities and meristem fate. Plant Cell 23, 2536–2552 (2011).
Article CAS PubMed PubMed Central Google Scholar
Li, H. et al. The AGL6-like gene OsMADS6 regulates floral organ and meristem identities in rice. Cell Res. 20, 299–313 (2010).
Article CAS PubMed Google Scholar
Martín-Trillo, M. & Cubas, P. TCP genes: a family snapshot ten years later. Trends Plant Sci. 15, 31–39 (2010).
Article PubMed Google Scholar
Zaret, K. S. & Mango, S. E. Pioneer transcription factors, chromatin dynamics, and cell fate control. Curr. Opin. Genet Dev. 37, 76–81 (2016).
Article CAS PubMed PubMed Central Google Scholar
Xing, Y. & Zhang, Q. Genetic and molecular bases of rice yield. Annu. Rev. Plant Biol. 61, 421–442 (2010).
Article CAS PubMed Google Scholar
Li, G., Zhang, H., Li, J., Zhang, Z. & Li, Z. Genetic control of panicle architecture in rice. Crop J. 9, 590–597 (2021).
Article Google Scholar
Yamaguchi, T. et al. The YABBY Gene DROOPING LEAF regulates carpel specification and midrib development in Oryza sativa [W]. Plant Cell 16, 500–509 (2004).
Article CAS PubMed PubMed Central Google Scholar
Hu, L. et al. OsSPL9 regulates grain number and grain yield in rice. Front Plant Sci. 12, 682018 (2021).
Article PubMed PubMed Central Google Scholar
Song, X. et al. Targeting a gene regulatory element enhances rice grain yield by decoupling panicle number and size. Nat. Biotechnol. 40, 1403–1411 (2022).
Article CAS PubMed Google Scholar
Yano, K. et al. Genome-wide association study using whole-genome sequencing rapidly identifies new genes influencing agronomic traits in rice. Nat. Genet. 48, 927–934 (2016).
Article CAS PubMed Google Scholar
Li, X. et al. Analysis of genetic architecture and favorable allele usage of agronomic traits in a large collection of Chinese rice accessions. Sci. China Life Sci. 63, 1688–1702 (2020).
Article PubMed Google Scholar
Wang, W. et al. Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature 557, 43–49 (2018).
Article ADS CAS PubMed PubMed Central Google Scholar
Liu, X. et al. GWAS Atlas: an updated knowledgebase integrating more curated associations in plants and animals. Nucleic Acids Res. 51, D969–D976 (2023).
Article CAS PubMed Google Scholar
Huang, X. et al. Genome-wide association studies of 14 agronomic traits in rice landraces. Nat. Genet. 42, 961–967 (2010).
Article CAS PubMed Google Scholar
Soskic, B. et al. Chromatin activity at GWAS loci identifies T cell states driving complex immune diseases. Nat. Genet. 51, 1486–1493 (2019).
Article CAS PubMed PubMed Central Google Scholar
Zhao, Y. et al. Genetic basis and network underlying synergistic roots and shoots biomass accumulation revealed by genome-wide association studies in rice. Sci. Rep. 11, 13769 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Bañuelos, M. A., Garciadeblas, B., Cubero, B. & Rodríguez-Navarro, A. Inventory and functional characterization of the HAK potassium transporters of rice. Plant Physiol. 130, 784–795 (2002).
Article PubMed PubMed Central Google Scholar
Magwa, R. A., Zhao, H. & Xing, Y. Genome-wide association mapping revealed a diverse genetic basis of seed dormancy across subpopulations in rice (Oryza sativa L.). BMC Genet. 17, 28 (2016).
Article PubMed PubMed Central Google Scholar
Chen, K. M., Wong, A. K., Troyanskaya, O. G. & Zhou, J. A sequence-based global map of regulatory activity for deciphering human genetics. Nat. Genet. 54, 940–949 (2022).
Article CAS PubMed PubMed Central Google Scholar
Zhao, H. et al. PlantDeepSEA, a deep learning-based web service to predict the regulatory effects of genomic variants in plants. Nucleic Acids Res. 49, W523–W529 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Maslova, A. et al. Deep learning of immune cell differentiation. Proc. Natl Acad. Sci. 117, 25655–25666 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).
Article CAS PubMed PubMed Central Google Scholar
Li, X. et al. Overexpression of RCc3 improves root system architecture and enhances salt tolerance in rice. Plant Physiol. Biochem. 130, 566–576 (2018).
Article CAS PubMed Google Scholar
Chen, R. et al. A de novo evolved gene contributes to rice grain shape difference between indica and japonica. Nat. Commun. 14, 5906 (2023).
Article ADS CAS PubMed PubMed Central Google Scholar
Wittkopp, P. J. & Kalay, G. Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nat. Rev. Genet. 13, 59–69 (2012).
Article CAS Google Scholar
Ming, L. et al. Transcriptome-wide association analyses reveal the impact of regulatory variants on rice panicle architecture and causal gene regulatory networks. Nat. Commun. 14, 7501 (2023).
Article ADS CAS PubMed PubMed Central Google Scholar
Wang, J. et al. An expression quantitative trait loci-guided co-expression analysis for constructing regulatory network using a rice recombinant inbred line population. J. Exp. Bot. 65, 1069–1079 (2014).
Article CAS PubMed PubMed Central Google Scholar
Wu, Y. et al. The QTL GNP1 encodes GA20ox1, which increases grain number and yield by increasing cytokinin activity in rice Panicle Meristems. PLoS Genet. 12, e1006386 (2016).
Article PubMed PubMed Central Google Scholar
Bourque, G. et al. Ten things you should know about transposable elements. Genome Biol. 19, 199 (2018).
Article CAS PubMed PubMed Central Google Scholar
Sullivan, A. M. et al. Mapping and dynamics of regulatory DNA and transcription factor networks in A. thaliana. Cell Rep. 8, 2015–2030 (2014).
Article CAS PubMed Google Scholar
Marand, A. P., Chen, Z., Gallavotti, A. & Schmitz, R. J. A cis-regulatory atlas in maize at single-cell resolution. Cell 184, 3041–3055.e3021 (2021).
Article CAS PubMed Google Scholar
Cuperus, J. T. Single-cell genomics in plants: current state, future directions, and hurdles to overcome. Plant Physiol. 188, 749–755 (2022).
Article CAS PubMed Google Scholar
Zhu, T. et al. cisDynet: An integrated platform for modeling gene-regulatory dynamics and networks. iMeta 2, e152 (2023).
Article PubMed PubMed Central Google Scholar
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Article CAS PubMed PubMed Central Google Scholar
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Article CAS PubMed PubMed Central Google Scholar
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Article CAS PubMed PubMed Central Google Scholar
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Article PubMed PubMed Central Google Scholar
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Article CAS PubMed Google Scholar
Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinforma. 12, 323 (2011).
Article CAS Google Scholar
Sheffield, N. C. et al. Patterns of regulatory activity across diverse human cell types predict tissue identity, transcription factor binding, and long-range interactions. Genome Res 23, 777–788 (2013).
Article CAS PubMed PubMed Central Google Scholar
Castro-Mondragon, J. A. et al. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 50, D165–D173 (2022).
Article CAS PubMed Google Scholar
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Article CAS PubMed PubMed Central Google Scholar
Bentsen, M. et al. ATAC-seq footprinting unravels kinetics of transcription factor binding during zygotic genome activation. Nat. Commun. 11, 4267 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Article CAS PubMed PubMed Central Google Scholar
Yu, Z. et al. Rice Gene Index (RGI): a comprehensive pan-genome database for comparative and functional genomics of Asian rice. Mol. Plant, https://doi.org/10.1016/j.molp.2023.03.012 (2023).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: A tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Article CAS PubMed PubMed Central Google Scholar
Li, M.-X., Yeung, J. M. Y., Cherny, S. S. & Sham, P. C. Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets. Hum. Genet. 131, 747–756 (2012).
Article CAS PubMed Google Scholar
Purcell, S. et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

This work is supported by grants from STI2030-Major Projects (2023ZD04076), the National Natural Science Foundation of China (No. 32070656, U23A20189, 31922065), the Hubei Provincial Natural Science Foundation of China (2023AFA043), the Earmarked Fund for the China Agriculture Research System (CARS-01-01), the Fundamental Research Funds for the Central Universities (2662023PY002), the Foundation of Hubei Hongshan Laboratory (2021hszd005) and HZAU-AGIS Cooperation Fund (SZYJY2023003). The authors acknowledge the Center for Information Technology and the High-Performance Computing Center of Nanjing University and the bioinformatics computing platform of the National Key Laboratory of Crop Genetic Improvement at Huazhong Agricultural University for providing high performance computing (HPC) resources.

Author information

These authors contributed equally: Tao Zhu, Chunjiao Xia.

Authors and Affiliations

State Key Laboratory of Pharmaceutical Biotechnology, Department of Gastroenterology, Nanjing Drum Tower Hospital, National Resource Center for Mutant Mice, School of Life Sciences, Nanjing University, Nanjing, 210023, China
Tao Zhu, Ranran Yu, Xinkai Zhou, Lin Wang, Yuxin You & Dijun Chen
Chemistry and Biomedicine Innovation Center, Nanjing University, Nanjing, 210023, China
Tao Zhu & Dijun Chen
National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China
Chunjiao Xia, Xingbing Xu, Zhanxiang Zong, Junjiao Yang, Yinmeng Liu, Luchang Ming & Weibo Xie
Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
Weibo Xie
Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
Weibo Xie

Authors

Tao Zhu
View author publications
Search author on:PubMed Google Scholar
Chunjiao Xia
View author publications
Search author on:PubMed Google Scholar
Ranran Yu
View author publications
Search author on:PubMed Google Scholar
Xinkai Zhou
View author publications
Search author on:PubMed Google Scholar
Xingbing Xu
View author publications
Search author on:PubMed Google Scholar
Lin Wang
View author publications
Search author on:PubMed Google Scholar
Zhanxiang Zong
View author publications
Search author on:PubMed Google Scholar
Junjiao Yang
View author publications
Search author on:PubMed Google Scholar
Yinmeng Liu
View author publications
Search author on:PubMed Google Scholar
Luchang Ming
View author publications
Search author on:PubMed Google Scholar
Yuxin You
View author publications
Search author on:PubMed Google Scholar
Dijun Chen
View author publications
Search author on:PubMed Google Scholar
Weibo Xie
View author publications
Search author on:PubMed Google Scholar

Contributions

D.C. and W.X. conceived and designed the project. T.Z., C.X., X.X., Y.L., and J.Y. performed experiments. T.Z., D.C., X.Z., R.Y., L.W., Z.Z., L.M. and Y.Y. conducted the bioinformatics analysis. D.C., W.X. and T.Z. wrote the paper. All the authors reviewed and approved the paper.

Corresponding authors

Correspondence to Dijun Chen or Weibo Xie.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Xuehui Huang and the other anonymous reviewer for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of additional supplementary files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Supplementary Data 8

Supplementary Data 9

Supplementary Data 10

Supplementary Data 11

Supplementary Data 12

Supplementary Data 13

Supplementary Data 14

Supplementary Data 15

Supplementary Data 16

Reporting Summary

Peer Review File

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Zhu, T., Xia, C., Yu, R. et al. Comprehensive mapping and modelling of the rice regulome landscape unveils the regulatory architecture underlying complex traits. Nat Commun 15, 6562 (2024). https://doi.org/10.1038/s41467-024-50787-y

Download citation

Received: 08 September 2023
Accepted: 19 July 2024
Published: 03 August 2024
Version of record: 03 August 2024
DOI: https://doi.org/10.1038/s41467-024-50787-y

This article is cited by

One versus many independent assemblies of symbiotic nitrogen fixation in flowering plants
- Jeff J. Doyle
- Jian Ren
- Yingzhi Gao
Nature Communications (2025)
Constraint of accessible chromatins maps regulatory loci involved in maize speciation and domestication
- Yuting Liu
- Xiang Gao
- Junpeng Shi
Nature Communications (2025)
Cis-regulatory elements: systematic identification and horticultural applications
- Tian Li
- Wen Zeng
- Peitao Lü
aBIOTECH (2025)
Enhancer Elements in Plants: Genomic Context and Biological Functions
- Hongwoo Lee
- Dong Hyeon Kang
- Pil Joon Seo
Journal of Plant Biology (2025)