- Research
- Open access
Direction and modality of transcription changes caused by TAD boundary disruption in Slc29a3/Unc5b locus depends on tissue-specific epigenetic context
Epigenetics & Chromatin volume 18, Article number: 55 (2025)
Abstract
Background
Topologically associating domains (TADs) are believed to play a role in the regulation of gene expression by constraining or guiding interactions between regulatory elements. While the impact of TAD perturbations is typically studied in developmental genes with highly cell-type-specific expression patterns, this study examines genes with broad expression profiles separated by a strong insulator boundary. We focused on the mouse Slc29a3/Unc5b locus, which encompasses two distinct TADs containing ubiquitously expressed genes that are essential for viability. We disrupted the CTCF boundary between these TADs and analyzed the resulting changes in gene expression.
Results
Deletion of four CTCF binding sites at the TAD boundary altered local chromatin architecture, abolishing pre-existing loops and creating novel long-range interactions that spanned the original TAD boundary. Using UMI-assisted targeted RNA-seq, we evaluated transcriptional changes of Unc5b, Slc29a3, Psap, Vsir, Cdh23, and Sgpl1 across various organs. We found that TAD boundary disruption led to variable transcriptional responses, in which not only the magnitude but also the direction of gene expression changes was tissue-specific. Current hypotheses on genome architecture function, such as enhancer competition and hijacking, as well as genomic deep learning models, only partially explain these transcriptional changes, highlighting the need for further investigation into the mechanisms underlying TAD function and gene regulation.
Conclusions
Disrupting the insulator element between broadly expressed genes resulted in moderate, tissue-dependent transcriptional alterations, rather than uniformly activating or silencing the target genes. These findings show that TAD boundaries contribute to context‑specific regulation even at housekeeping loci and underscore the need for refined models to predict the effects of non‑coding structural variants.
Introduction
In vertebrates, interphase chromatin is organized into Topologically Associating Domains (TADs) and sub-TAD loops by the interaction of the DNA-looping activity of the Cohesin complex and the DNA-binding insulator factor CTCF [1]. Since this organization influences the spatial interactions of cis-regulatory elements, it is believed that TAD structure plays a crucial role in controlling gene expression. Despite intensive research, the role of TADs in transcription regulation remains controversial. There is a high conservation of TAD structure across species [2,3,4,5], and some experiments have shown that perturbations in TADs can have a significant impact on the expression of adjacent genes [6,7,8], leading to pathological phenotypes. However, other evidence indicates that TAD perturbations often have minor effects on gene expression [9, 10] and do not always lead to phenotypic consequences. Therefore, interpreting TAD boundary mutations and predicting their effects on the magnitude and direction of expression changes in specific tissues remains challenging [11].
Previous studies discovered several mechanisms that explain the consequences of TAD boundary mutations. TADs limit the range of enhancer activity, and alterations in the evolutionarily shaped TAD structure can lead to the miswiring of regulatory elements and ectopic activation of genes, a phenomenon known as enhancer hijacking [12,13,14]. These improper enhancer-promoter interactions often result in a gene’s overexpression or ectopic expression [8, 15]. In cases involving oncogenes or developmental genes, which require tight expression control, TAD boundary mutations can lead to cancer [16, 17] and developmental disorders [6, 15, 18].
Cases in which TAD structure is required to bring promoters and enhancers together, so that TAD disruption disconnects these elements and downregulates gene expression, are less common [19]. However, there are some well-studied examples of this mechanism, particularly those involving the developmental genes C-MYC [20, 21], SHH [22, 23], and Pax3 [24].
Furthermore, the structural integrity of TADs influences gene regulation not only through enhancer and promoter interactions but also by affecting the distribution and maintenance of chromatin marks. TAD boundaries can block the propagation of histone modifications by chromatin remodelers, and losing their insulator function can lead to the establishment of inappropriate chromatin domains [25]. This is presumably true for both active and repressive histone marks in vertebrates [26].
The effects of TAD perturbations are typically studied in developmental genes, which display highly cell-type-specific expression signatures. Although the distinction between developmental and housekeeping genes is not well defined, studies of genes that are silenced in most cell types may be biased, because such genes are unresponsive to TAD boundary perturbations in the majority of epigenetic contexts. Conversely, there are substantially fewer data on the effects of TAD perturbations on genes that are ubiquitously expressed across cell types. It is assumed that the fusion of regulatory landscapes hosting multiple ubiquitously expressed genes could lead to the rewiring of regulatory elements, driven by competition between promoters for regulatory activity [27].
To further investigate this, we focus in this study on genes that have widespread activity profiles but are divided into two TADs by a strong insulator boundary. To assess the regulatory effects of chromatin architecture in this context, we selected the mouse Slc29a3/Unc5b locus, which contains two distinct TADs (Fig. 1A). The centromeric TAD contains the Slc29a3, Cdh23, Vsir, and Psap genes, which are crucial for viability and are involved in a number of developmental processes. The telomeric TAD contains the Unc5b and Sgpl1 genes, located near its borders; Unc5b has super-enhancer signatures within its first intron that are active in several organs, particularly in the cerebellum. Most of these genes show broad expression across tissues, making this locus well suited for detecting subtle regulatory rewiring upon TAD boundary disruption. Although not uniformly expressed, they are more widely expressed than typical developmental genes. By investigating a boundary that partitions broadly expressed genes, this study aims to reveal how chromatin architecture can fine-tune transcription across diverse epigenetic contexts.
Several genes located in the Unc5b/Slc29a3 locus are clinically significant and play an important role in the development and functioning of organisms. Unc5b encodes the netrin-1 membrane receptor [28], which takes part in axon [29] and vessel growth guidance [30, 31] and also regulates proliferation, migration, and apoptosis [32,33,34,35]. Disruption of Unc5b expression is a clinically significant marker in specific cancer subtypes [32, 36, 37]. In knockout mouse models, the absence of Unc5b causes neurodevelopmental anomalies, embryonic growth arrest, and death during embryogenesis [31].
Slc29a3 encodes a nucleoside transporter channel [38, 39] that is important for cell homeostasis maintenance and autophagy regulation [40, 41]. Its mutations are associated with hereditary scleroderma, hyperpigmentation, hypertrichosis, hypertrophy of internal organs, cardiovascular and musculoskeletal deformities [42], and immunodeficiency [43, 44]; Slc29a3 knockout mice exhibit splenic hyperplasia, hematopoietic dysfunction, and early death [40]. The Cdh23 gene is associated with both syndromic and nonsyndromic genetic deafness [45, 46]; absence of Cdh23 in knockout mice leads to congenital deafness, pathologies of the organ of Corti [47,48,49], as well as various sensory and motor disorders [50]. Mutations in Psap can lead to severe conditions such as saposin deficiency syndrome with hepatosplenomegaly and atrophy of brain structures [51, 52], and are also associated with an increased risk of parkinsonism [53]; Psap knockout mice show hypoactivity, demyelination, axonal degeneration, early death [54, 55], and certain sensory-motor disorders comparable to the phenotypes observed in Cdh23 knockouts [56, 57].
This locus has a strong insulating boundary dividing it into two distinct TADs that are evolutionarily conserved among vertebrates (Fig. 1A-C). In mice, this boundary is formed by five CTCF binding sites (CTCF-bs): two centromeric CTCF-bs are in reverse orientation and are located in an intergenic region without any functional marks ([mm10] chr10:60,757,172 − 60,757,204 and chr10:60,760,631 − 60,760,663); the remaining three sites lie within Unc5b introns ([mm10] chr10:60,775,582 − 60,775,773, chr10:60,778,683 − 60,778,844, chr10:60,779,677 − 60,779,992) and are in forward orientation (Fig. 1D).
We hypothesize that disruption of the TAD boundary in the Slc29a3/Unc5b locus may lead to dysregulation of these genes and shed light on the mechanisms that tie together the spatial genome organization and the establishment of transcription states in different tissues.
Results
Generation of genetically-modified mice
To investigate the role of chromatin architecture in the Unc5b/Slc29a3 locus, we generated two mouse lines in which the CTCF-bs forming the TAD boundary were deleted (Fig. 1D).
First, we deleted the two centromeric sites by introducing a ~5 kbp deletion (chr10:60,755,585 − 60,761,088, mm10). Because these sites lie in an intergenic region and do not overlap any known regulatory or coding elements, we applied the conventional CRISPR/Cas9 system to introduce two double-strand breaks, supplemented with a homology-directed repair (HDR) template to stimulate deletion formation (Fig. 1E). This resulted in animals carrying a deletion of the ~5 kb region containing both CTCF-bs. From 19 animals carrying the target modification, we picked a single male mouse for backcrosses and obtained the homozygous line. PCR genotyping and Sanger sequencing confirmed that the deletion occurred between the target Cas9 cut sites and likely represented the HDR product (Fig. 1F).
Fig. 1. Spatial organization of the Unc5b/Slc29a3 locus and its gene editing. A - Hi-C map and gene localisation of the mouse Unc5b/Slc29a3 locus. From [58]. B - Hi-C map and gene localisation of the human UNC5B/SLC29A3 locus. From [59]. C - Hi-C map and gene localisation of the western clawed frog Unc5b/Slc29a3 locus. From [60]. D - CTCF-bs cluster at the mouse Slc29a3/Unc5b locus and CTCF ChIP-Seq tracks from mouse liver tissue (wild-type and the mutant variant inherited from mouse #16). Mutation coordinates are shown by red rectangles, and CTCF-bs orientations are shown by blue triangles. E - Gene editing design employed for deleting the genomic region with two CTCF-bs. F - Sanger sequencing of the obtained deletion breakpoint regions. The obtained sequence is aligned to the expected sequence of the deletion. G - Gene editing design for knockout of the telomeric CTCF-binding sites.
Next, we attempted to disrupt the three telomeric CTCF-bs located within Unc5b introns. We refer to these telomeric binding sites as the Left, Middle, and Right sites, according to their arrangement along the centromere-to-telomere axis (Fig. 1D). Their proximity to Unc5b gene elements precludes simple editing designs, such as removing all three CTCF-bs with a single extended deletion. We therefore constructed three CRISPR/Cas9 sgRNAs that direct cleavage within the predicted CTCF motifs. To minimize the risk of unintended deletions between the cleavage sites [61], we engineered single-stranded oligodeoxynucleotides (ssODNs) to guide the repair process. This design replaces the CTCF core motif with a HindIII restriction site, which serves two purposes: it disrupts CTCF binding and simplifies genotyping of the modified animals. We based our approach on the assumption that, during repair, a few nucleotides at the ends of the double-strand breaks (DSBs) are resected before new synthesis occurs using the ssODN as a homology donor (Fig. 1G).
In this experiment, we obtained 19 animals, which were genotyped by PCR followed by restriction fragment length polymorphism analysis (PCR-RFLP) using HindIII digestion. This analysis showed that the expected mutation was present in only three of the 19 mice (#1, 2, and 9), and only at one site, the Middle CTCF-bs (Supplementary Table 2).
The observed mutation rate was significantly lower than anticipated, based on our previous studies [7]. Consequently, we hypothesized that the introduction of mutations through HDR was insufficiently effective. Therefore, we proceeded to investigate the presence of INDELs that might have resulted from other repair pathways. To achieve this, we analyzed PCR amplicons that encompass the CRISPR/Cas9 target sites using next-generation sequencing.
The analysis, performed for each target site and for each animal individually, did not reveal any HDR outcomes beyond those detected by PCR-RFLP; however, we identified multiple alleles harboring INDELs at the target regions (Fig. 2A-C). Based on the presence of short homology motifs flanking the INDELs, we assumed that many of the observed deletions were generated via the microhomology-mediated end joining (MMEJ) repair pathway. We observed 24 MMEJ mutation occurrences (counting each variant from each animal), 23 NHEJ occurrences, and only 4 HDR occurrences among 19 animals and three target sites. We also found that the mutation outcomes agreed well with the predictions of the inDelphi software [62].
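The idea of inferring the repair pathway from short homology at the deletion junction can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 0-based end-exclusive coordinates, and the 2-bp threshold separating MMEJ-like from NHEJ-like deletions are assumptions made for illustration only.

```python
def microhomology_length(ref: str, start: int, end: int) -> int:
    """Microhomology (bp) at the junction of the deletion ref[start:end].

    Coordinates are 0-based and end-exclusive. A deletion can be shifted
    right or left without changing the repaired sequence whenever the base
    entering the deleted window equals the base leaving it; the number of
    such equivalent shifts is the length of the shared microhomology.
    """
    if not 0 <= start < end <= len(ref):
        raise ValueError("deletion must lie within the reference")
    size = end - start
    right = 0
    while (right < size and end + right < len(ref)
           and ref[start + right] == ref[end + right]):
        right += 1
    left = 0
    while (left + right < size and start - left - 1 >= 0
           and ref[start - left - 1] == ref[end - left - 1]):
        left += 1
    return left + right


def classify_deletion(ref: str, start: int, end: int,
                      min_homology: int = 2) -> str:
    """Label a deletion by junction microhomology (threshold is an assumption)."""
    if microhomology_length(ref, start, end) >= min_homology:
        return "MMEJ-like"
    return "NHEJ-like"


# Toy sequence: the deleted segment starts with ACGG, which also follows it.
print(classify_deletion("TTACGGATCCACGGCC", 2, 10))  # MMEJ-like
print(classify_deletion("AACCGGTT", 2, 4))           # NHEJ-like
```

A label of this kind is only a heuristic: end joining without microhomology can produce the same allele by chance, so the threshold trades sensitivity against false MMEJ calls.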
To predict how the mutations affect CTCF binding, we assessed the CTCF motif score of all generated sequences. We found that the majority of the unexpected NHEJ and MMEJ deletions had low CTCF binding scores, comparable to those predicted for the intended HDR-mediated mutations. To derive a homozygous line, we chose mouse #16 as the most suitable animal: it carried the 5.5-kbp deletion of the intergenic CTCF-bs together with Left and Middle site mutations that have low CTCF binding scores.
Next, we confirmed that the CTCF-bs mutations affect CTCF binding. For this purpose, we focused on liver tissue, where all five CTCF-bs in the Unc5b/Slc29a3 locus are bound by CTCF according to public ENCODE data. ChIP-Seq on livers of homozygous animals confirmed the absence of CTCF binding at the mutated sites (Fig. 1D).
Thus, we obtained a homozygous mouse line carrying the deletions [mm10] chr10:60,755,585 − 60,761,088 (two intergenic CTCF-bs, 5504 bp), chr10:60,775,689 − 60,775,698 (deletion of 9 bp at the Left site of the telomeric cluster), and chr10:60,778,725 − 60,778,738 (deletion of 13 bp with insertion of an -AA- dinucleotide at the Middle site of the telomeric cluster), which disrupt CTCF binding in this region.
CTCF binding site deletions reconfigure local spatial contacts
To investigate the effects of the introduced mutations on spatial interactions within the locus, we prepared capture Hi-C (cHi-C) libraries from cerebellum, kidney, and liver of mice with two or four CTCF-bs deleted and of wild-type controls (Fig. 3A-D).
In the wild-type state, the patterns of spatial contacts were almost identical in all analyzed tissues. The centromeric TAD contains multiple loops. The three upstream loop anchors comprise a distal TAD boundary (located between the end of the Cdh23 gene and the Psap gene) and two anchors within the Cdh23 gene body. These three sites form loops with downstream anchors located in the Cdh23 promoter and within the TAD boundary between the Slc29a3 and Unc5b genes. In the telomeric TAD, loops are formed between the TAD boundary and the Unc5b promoter and between the two TAD boundaries. The distal boundary lies near the cluster of long noncoding RNA genes between the Unc5b and Sgpl1 genes. Notably, almost all of the described loops are formed by convergent CTCF binding sites. The only exception in this locus is the distal boundary of the Slc29a3 TAD, where the CTCF binding sites have the same orientation as those at the anchors they interact with.
The contact maps of animals with two and four CTCF-bs deleted displayed similar, subtle sub-TAD perturbations across all tissues examined. The similarity of the contact patterns indicates that truncating the boundary by deleting the centromeric CTCF-bs is sufficient to drive detectable changes in chromatin contacts, whereas disrupting the additional pair does not impose any further large-scale changes on the three-dimensional organization of the locus (Fig. 3C-E). In both genotypes, we confirmed that loops formed by the deleted CTCF-bs and directed into the Slc29a3 TAD were disrupted in all tissues (Fig. 3C, D, black-filled arrowheads). In wild-type animals, these loops connect the TAD boundary region with CTCF-bs in the Cdh23 gene body. At the same time, loops directed towards the opposite domain were preserved.
The chromosomal segment extending from the boundary to the Cdh23 promoter, which encompasses the whole Slc29a3 gene, acquired new contacts with the entire Unc5b TAD. This alteration becomes progressively more noticeable from liver to cerebellum, coinciding with the overall activity of the locus. The TAD boundary has apparently shifted to the Cdh23 promoter, moving the Slc29a3 gene away from Cdh23 and Vsir and into spatial proximity with the Unc5b gene.
In addition to disruption of some loops, we noted the formation of new, ectopic long-range interactions. These new loops occur between the inner CTCF-binding sites of the Cdh23 gene and the distant border of the Unc5b TAD, an area containing the cluster of non-coding RNA genes, commonly silenced. Notably, these interactions cross the insulator TAD boundary. Furthermore, the pattern of these new loops differs significantly between tissues (Fig. 3C, D, white-filled arrowhead).
Overall, our cHi-C analysis shows that alterations of CTCF binding sites in the Slc29a3/Unc5b locus cause changes in local chromatin architecture, disruption of loops, and formation of novel long-range interactions between gene promoters and regulatory sequences.
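Boundary strength of the kind compared in Fig. 3E is commonly summarized with an insulation score: the average contact frequency in a square window that straddles each genomic bin. The sketch below is a simplified, assumed formulation on a toy matrix; the function name, the window size, and the toy contact values are illustrative and do not reproduce the authors' analysis.

```python
from typing import List, Optional


def insulation_score(matrix: List[List[float]],
                     window: int) -> List[Optional[float]]:
    """Mean contact frequency across each bin of a square contact matrix.

    For bin i, contacts between the `window` bins upstream of i and the
    `window` bins downstream of i are averaged. Low values mean that few
    contacts cross the bin, as expected at a TAD boundary. Bins too close
    to the matrix edge receive None.
    """
    n = len(matrix)
    scores: List[Optional[float]] = []
    for i in range(n):
        if i < window or i + window >= n:
            scores.append(None)
            continue
        total = sum(matrix[a][b]
                    for a in range(i - window, i)
                    for b in range(i + 1, i + window + 1))
        scores.append(total / (window * window))
    return scores


# Toy map: two 4-bin domains, strong contacts within and weak contacts between.
toy = [[10.0 if (i < 4) == (j < 4) else 1.0 for j in range(8)]
       for i in range(8)]
print(insulation_score(toy, window=2))
# [None, None, 5.5, 1.0, 1.0, 5.5, None, None]
```

In this toy map the score dips to 1.0 at the two bins flanking the domain border. If the domains were fully merged, the dip would disappear, which is the kind of change an insulation profile is meant to expose.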
Fig. 3. Alterations in spatial chromatin architecture at the Slc29a3/Unc5b locus. A - Gene locations and their activity status, indicated by a color scale from green (active expression) to red (repressed state). B - Location and orientation of CTCF binding sites. C, D - cHi-C interaction maps displaying spatial chromatin interactions within the Slc29a3/Unc5b locus (chr10:60,103,000–61,356,000, mm10) in liver, cerebellum, and kidney for the obtained model mice and wild-type controls. Maps above the main diagonal represent mice with deletions of two (C) or four (D) CTCF-bs, while maps below the diagonal show data from wild-type mice. Black-filled arrowheads indicate chromatin loops that were disrupted in the mutant genotypes; white-filled arrowheads indicate interactions that differ between wild-type and CTCF-bs deletion conditions. E - Insulation score for wild-type (black) and for two (blue) or four (red) CTCF-bs deleted. F - ChIP-seq profiles of the Unc5b/Slc29a3 locus according to ENCODE data. Each column shows data for one of three organs: liver, cerebellum, and kidney. The red vertical line represents the 5-kb deletion of the centromeric CTCF-bs. G - Bar plots representing expression of genes within the Unc5b/Slc29a3 TADs across tissues. Data are shown in TPM units; whiskers represent standard error. H, I - Statistically significant changes in gene expression relative to wild-type levels for mice with deletions of two (H) or four (I) CTCF binding sites.
Reorganization of local spatial contacts leads to changes in the transcriptional activity of nearby genes
Reorganization of spatial contacts caused by deletions of CTCF binding sites may alter gene expression in this locus. To test this hypothesis, we performed expression analysis in the obtained mice (Fig. 4B, C). Previous studies have suggested that the magnitude of gene expression changes caused by TAD boundary perturbation might be relatively small. To obtain the precision necessary to detect small expression changes, we developed a methodology based on allelic imbalance measurement. Specifically, we assessed gene expression alterations in F1 hybrids produced by crossing genetically modified mice carrying CTCF-bs deletions on a C57BL/6 background with wild-type mice of the CAST strain. These hybrids possess distinct alleles that differ in exonic SNPs, allowing allelic transcripts to be discriminated. Thus, the effects of CTCF-bs deletions can be studied within the same tissue sample by comparing the expression of the CAST and C57BL/6 alleles. This design protects against many biases: because the two alleles share trans-acting factors, differences in their expression can be attributed to the cis-environment.
Transcripts were quantified by targeted RNA-seq incorporating unique molecular identifiers (UMIs). Reverse transcription was primed using gene-specific oligonucleotides bearing UMI sequences, ensuring that each cDNA molecule acquired a unique tag. After PCR amplification and sequencing, a SNP-aware analysis used these UMIs to digitally count individual allelic transcript molecules (Fig. 4A). Using this strategy, we profiled three groups of male F1 hybrids, obtained by crossing CAST females with (1) wild-type C57BL/6 males, (2) C57BL/6 males with the two centromeric CTCF-bs deleted, or (3) C57BL/6 males with four CTCF-bs (two centromeric and two telomeric) deleted at the Slc29a3/Unc5b locus.
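The UMI-based allelic counting described above can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: reads are assumed to be already assigned to an allele by their SNPs, UMI sequencing errors and collisions are ignored, and normalizing the mutant hybrid's C57BL/6-to-CAST ratio by the same ratio in wild-type hybrids is an assumption about how the percentage change is derived. All names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def count_alleles(reads: Iterable[Tuple[str, str]]) -> Counter:
    """Count unique transcript molecules per allele.

    `reads` holds (umi, allele) pairs. PCR duplicates carry the same UMI
    and allele, so collapsing identical pairs leaves one entry per
    original cDNA molecule.
    """
    unique_molecules = set(reads)
    return Counter(allele for _umi, allele in unique_molecules)


def relative_change(mutant: Counter, wildtype: Counter,
                    edited: str = "B6", reference: str = "CAST") -> float:
    """Percent change of the edited allele, normalized to wild-type hybrids."""
    if mutant[reference] == 0 or wildtype[reference] == 0 or wildtype[edited] == 0:
        raise ValueError("allele counts must be non-zero to form ratios")
    mutant_ratio = mutant[edited] / mutant[reference]
    wildtype_ratio = wildtype[edited] / wildtype[reference]
    return 100.0 * (mutant_ratio / wildtype_ratio - 1.0)


# Toy data: each B6 molecule in the mutant hybrid was sequenced three times.
mutant_reads = ([(f"U{i}", "B6") for i in range(6)] * 3
                + [(f"V{i}", "CAST") for i in range(10)])
wildtype_reads = ([(f"U{i}", "B6") for i in range(10)]
                  + [(f"V{i}", "CAST") for i in range(10)])
change = relative_change(count_alleles(mutant_reads),
                         count_alleles(wildtype_reads))
print(round(change, 1))  # -40.0
```

Because the CAST allele serves as an internal reference in every sample, differences in sequencing depth or RNA input between animals cancel out of the ratio.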
Using this approach, we quantified the expression of six genes within the targeted locus (Unc5b, Slc29a3, Psap, Vsir, Cdh23, and Sgpl1) in five organs (bladder, cerebellum, kidney, liver, and olfactory bulb) (Fig. 4B, C). We identified 20 significant gene expression changes caused by CTCF-bs deletions among the 60 analyzed cases (six genes, five organs, and two genotypes; FDR threshold of 0.1). As expected from previous studies, the observed alterations in gene transcription were modest, not exceeding a 50% change. Intriguingly, these changes showed a tissue-specific pattern, varying not just in magnitude but also in direction. For example, Slc29a3 expression decreases in the kidney but increases in the cerebellum. To confirm the observed changes in gene expression, we performed digital PCR on samples with four CTCF-bs deleted (Fig. 4D).
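Testing 60 gene-by-organ-by-genotype combinations requires a multiple-testing correction, which is what the FDR threshold refers to. The text does not name the procedure used; the sketch below assumes the widely used Benjamini-Hochberg adjustment, and the p-values in the example are made up for illustration.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Each p-value is scaled by (number of tests / rank), and a running
    minimum taken from the largest p-value downwards keeps the adjusted
    values monotonic.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values for four comparisons.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [q < 0.1 for q in adjusted]
print(significant)  # [True, True, True, False]
```

With a threshold of 0.1, roughly one in ten of the comparisons called significant is expected to be a false discovery.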
Fig. 4. Transcription changes at the Slc29a3/Unc5b locus. A - Graphical scheme illustrating allelic imbalance quantification in hybrids using UMI-assisted targeted RNA-seq. B, C - Heatmaps of locus gene expression changes observed via UMI-assisted targeted RNA-seq for two (B) or four (C) CTCF-bs deletions versus wild-type alleles. D - Heatmap of locus gene expression changes measured via digital PCR for four CTCF-bs deletion mice versus wild-type mice. E, F - Enformer in silico predictions of the transcriptional effect of deletions of two (E) or four (F) CTCF-bs versus wild-type. G, H - AlphaGenome in silico predictions of the transcriptional effect of deletions of two (G) or four (H) CTCF-bs versus wild-type.
For all heatmaps, expression is shown as the percentage change relative to wild-type; NS indicates non-significant results (FDR > 0.1).
Genomic foundation models fail to predict expression changes caused by CTCF mutations
Next, we compared our experimental data with predictions made by the recent deep learning models Enformer [63] and AlphaGenome [64], which were developed to predict epigenetic characteristics, including gene expression levels, from DNA sequence. Our objective was to evaluate the applicability of such models for studying the regulatory function of CTCF-binding sites. The authors of the Enformer paper demonstrated that enhancer and insulator sequences contribute most significantly to the accuracy of transcription level prediction. We hypothesized that applying the Enformer and AlphaGenome models would enable us to predict gene expression changes resulting from CTCF-binding site (CTCF-bs) mutations across various mouse cell types.
We used both models to simulate the deletion of CTCF‑binding sites and to predict how gene expression would change across the locus. Since Enformer accepts only a 200 kb sequence window, we could predict expression levels for just three genes, which lie closest to the engineered deletions (Cdh23, Slc29a3, and Unc5b). AlphaGenome handles a 1 Mb window, so we generated predictions for all genes within that span.
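Simulating a deletion for a sequence-based model amounts to editing the reference sequence of the input window before it is passed to the model. The sketch below shows only that editing step, under stated assumptions: coordinates are 0-based, end-exclusive, and non-overlapping, and padding the shortened sequence with N to restore the original window length is an illustrative choice, not a documented requirement of Enformer or AlphaGenome. No model API is called.

```python
from typing import List, Tuple


def apply_deletions(sequence: str, deletions: List[Tuple[int, int]],
                    pad: str = "N") -> str:
    """Remove the given intervals and pad back to the original length.

    Intervals are processed from right to left, so removing one interval
    never shifts the coordinates of the intervals still to be processed.
    """
    edited = sequence
    for start, end in sorted(deletions, reverse=True):
        if not 0 <= start < end <= len(sequence):
            raise ValueError(f"interval {(start, end)} is outside the window")
        edited = edited[:start] + edited[end:]
    return edited + pad * (len(sequence) - len(edited))


# Toy window with two deleted intervals.
print(apply_deletions("AAACCCGGGTTT", [(3, 6), (9, 12)]))  # AAAGGGNNNNNN
```

An alternative to padding is to extend the window with additional flanking reference sequence, which keeps every input base informative but shifts the position of downstream genes within the window.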
We predicted changes in gene expression for all genes and organs under two types of mutations: deletion of two CTCF-bs and deletion of four CTCF-bs (Fig. 4E, F for Enformer; Fig. 4G, H for AlphaGenome). The differences in predicted gene expression changes between these two mutation types were moderate for both models. Enformer yields largely similar predictions across cell types, consistently predicting down-regulation of every gene, with only the size of the decrease varying. In contrast, AlphaGenome predicts changes that vary in both magnitude and direction, indicating greater sensitivity to cell-type-specific effects.
To evaluate the predictive accuracy of the models, we performed Pearson’s chi-squared test to assess how well each model predicted the direction of gene expression changes across genes and cell types. The test yielded a p-value of 0.24 for Enformer and 0.33 for AlphaGenome, indicating low concordance between the predicted and experimental data.
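A direction-concordance test of this kind can be sketched as a Pearson chi-squared test on a 2x2 table of predicted versus observed direction. The text does not describe how the authors built their contingency table, so the layout, the function name, and the counts below are hypothetical. No continuity correction is applied, and the p-value uses the closed form for one degree of freedom.

```python
import math
from typing import Sequence, Tuple


def chi2_2x2(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Pearson chi-squared statistic and p-value for a 2x2 table.

    Rows are predicted direction (up, down); columns are observed
    direction (up, down). With one degree of freedom the upper-tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("test is undefined when a row or column total is zero")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: predictions unrelated to the observed direction.
print(chi2_2x2([[5, 5], [5, 5]]))  # (0.0, 1.0)
```

A large p-value here means that the table gives no evidence of association between predicted and observed direction. Note that a model predicting the same direction for every case produces a zero row total, for which this 2x2 formulation is undefined.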
Notably, recent studies [65,66,67] have also reported that Enformer does not always correctly predict the direction and magnitude of gene transcription changes in individuals. These researchers assessed Enformer’s predictive accuracy for specific genotypes using the RosMAP and Geuvadis databases, which include whole genome sequencing and gene transcriptional activity data from a single cell type for individuals in the 1000 Genomes Project. The Enformer model often failed to capture minor variations in gene expression between healthy individuals.
While AlphaGenome is a clear improvement over previous models, it retains some of their limitations. It still cannot fully account for the cell type specificity or the effects of individual genetic variation, as the authors themselves acknowledge [64]. Moreover, the mutations we study affect long-range regulatory elements, making them inherently difficult to model and interpret compared with intragenic variants. Based on our data and published findings, we conclude that the current genomic foundation models are not suitable for predicting moderate changes in gene expression.
Transcriptional effects rarely align with known mechanisms
We next attempted to fit the differences in gene expression observed in the NGS experiment to known models of interactions between promoters and regulatory elements. Since we observed tissue-specific patterns of gene expression changes, we assume that promoter-enhancer interactions in this locus are influenced by the local epigenetic context and are presumably determined by each gene’s transcriptional activity in the particular tissue.
Case I. Expression of genes separated by the TAD boundary becomes more similar after CTCF-bs deletion. The pattern of expression changes of the closely located Slc29a3 and Unc5b genes suggests that their cis-regulatory landscapes may interfere with each other after TAD boundary disruption. In the kidney, we detected an almost 40% decrease in Slc29a3 expression (Figs. 3H, I and 4B, C). In the same tissue, we observed upregulation of Unc5b, ranging from 20% in animals with 2 CTCF-bs deleted to 40% in animals with 4 CTCF-bs deleted. Notably, in this organ Slc29a3 is normally active, while Unc5b is repressed, as suggested by their wild-type expression levels (Fig. 3G). Thus, we assume that when the insulator boundary between these genes is disturbed, each gene becomes influenced by the regulatory elements of the other. As a result, the expression of these genes is “averaged”.
Case II. Enhancer hijacking causes upregulation of lowly expressed genes. We detected an approximately 15% increase in Slc29a3 expression in the cerebellum, where the expression of Unc5b is significantly higher than its average level across mouse tissues. This suggests that Slc29a3 expression in this tissue is influenced by active cis-elements, such as the active super-enhancer signatures within the first intron of Unc5b reported in the ENCODE database (Fig. 3F). By entering into spatial interaction with the Unc5b gene, Slc29a3 also interacts with its chromatin environment, which explains the increase in its activity. In contrast to the expression “averaging” observed in the kidney, there is no detectable decrease in Unc5b expression, which can be attributed to the specific mode of action of its enhancer or to the small scale of these changes.
In addition to the two cases described above, there are many other examples of gene expression changes that do not fit well into known mechanisms of gene regulation.
In the liver, both Unc5b and Slc29a3 are repressed; however, we observe a weak increase in Slc29a3 expression upon boundary disruption. This increase cannot be explained by interaction of Slc29a3 with regulatory elements from the Unc5b spatial environment: available ChIP-seq data reveal no candidate elements, and if such elements did exist in the Unc5b environment, it is unclear why they would not activate Unc5b itself.
A possible explanation for the increased Slc29a3 transcription is that centromeric CTCF-bs may exert a CTCF-mediated repressive effect, given the limited evidence that CTCF itself can act as a repressor. However, in this case, the distant sites are located much further from the Slc29a3 promoter than in documented examples of CTCF-mediated repression, and the observed effects in other tissues do not fully support this mechanism. Therefore, we find insufficient evidence for a CTCF repression model in this context.
In the cerebellum, we observed a significant reduction of the Cdh23 expression. There are two known mechanisms of gene silencing caused by TAD boundary disruption: disconnection of the promoter from the regulatory element and spreading of heterochromatin. We did not find any obvious candidate element with an enhancer signature that changes contact frequency with Cdh23 promoter upon TAD disruption.
As shown by the Hi-C experiments, disruption of the TAD boundary led to the emergence of ectopic long-range interactions between the Cdh23 gene body and the distal boundary of the Unc5b TAD. It is noteworthy that these loops are observed only in tissues where Cdh23 is expressed (cerebellum and kidney, but not liver). We speculate that these novel interactions may be associated with the propagation of polycomb-mediated chromatin states (H3K27me3), which are known to form analogous loops in neural tissue [68]. This mechanism could potentially account for the downregulation of Cdh23 in the cerebellum. However, we did not detect any repressive chromatin marks at the anchors of the newly identified Cdh23 loop (Fig. 3F).
Among the genes we examined, Psap showed the most intriguing behavior. Although it is the farthest from the mutated TAD boundary, Psap is down-regulated in the kidney and up-regulated in the liver of animals with two deleted CTCF-bs. By contrast, its expression is normal on the allele lacking four CTCF-bs. Interestingly, this pattern closely mirrors that of Slc29a3. A similar shift may also occur for Vsir and Sgpl1 in the kidney, liver, and olfactory bulb. These pronounced, non-linear differences between the two-site and four-site deletions are puzzling, yet our cHi-C maps provide no direct mechanistic explanation. Finally, the changes observed for Unc5b and Sgpl1 in the bladder, and for Cdh23 in the olfactory bulb, remain unexplained with the data currently available.
In general, changes in gene expression tend to be opposite to their baseline activity level in a given tissue. Thus, in cases where expression levels change, active genes typically decrease their expression, while repressed genes show an increase.
Discussion
Here we obtained two mouse lines carrying deletions of two or four CTCF-bs between the Slc29a3 and Unc5b genes. We have shown that disruption of the TAD boundary formed by these CTCF-bs led to a reorganization of spatial contacts: the Slc29a3 gene region fused with the Unc5b-containing domain, and new inter-TAD loops were established by mechanisms that remain unknown. This reorganization is tissue-specific and is more pronounced where the locus is actively transcribed.
During line generation, we used multiplex modification of adjacent CTCF-binding sites by CRISPR/Cas9 gene editing. Our strategy utilized ssODNs as HDR templates due to their predictable mutagenesis pattern [69]. However, this strategy proved ineffective because HDR was outcompeted by the MMEJ DNA repair pathway. Nevertheless, we were able to obtain an allele carrying MMEJ-induced mutations that disrupt two CTCF-bs simultaneously. Thus, we do not recommend ssODN-facilitated mutagenesis in murine zygotes for creating CTCF motif-disrupting indels. Instead, the MMEJ pathway produced a well-predictable spectrum of mutations that could be exploited for such purposes, even in a multiplexed manner.
We demonstrated that the deletions we obtained disrupt the TAD boundary and alter the inner loop structure of the Slc29a3 TAD. In the organs analyzed, there was no fusion of domains; however, the boundary shifted to the inner CTCF-binding site in the promoter of the Cdh23 gene. This shift transferred the DNA region containing Slc29a3 into the new cis-environment of the Unc5b TAD. Additionally, we observed the establishment of new long-range inter-TAD spatial contacts, the appearance of which cannot be readily explained by any currently known mechanisms.
We developed a sequencing-based method that allows precise quantification of gene expression alterations caused by TAD reorganization. We showed that this quantification strategy is commutable with digital PCR, itself a highly precise method, and could therefore be applied to assess the consequences of any other cis-acting mutation.
We revealed a wide range of gene expression alterations, although, as we expected, the amplitude of expression modulation was low. These results agree with other experiments, including those investigating developmental genes [8, 10]. Such a low impact on the molecular phenotype is difficult to reconcile with the conservation of CTCF-bs and TAD structure. This discrepancy remains a major obstacle for the theory of TAD function and needs more detailed investigation to be resolved.
The pattern of changes proved to be tissue-specific not only in amplitude but also in direction. Although it seems clear that the changes caused by TAD boundary disruption are conditioned by the epigenetic state of the locus, they remain hard to predict. Neither empirical models, such as the enhancer hijacking model, nor statistical models, such as Enformer and AlphaGenome, can explain the full spectrum of the observed expression alterations. These results highlight the need for new tools for the interpretation of non-coding genetic variants, as required by medical and evolutionary genomics [11, 70].
The hypothesis derived from our data is that, when regions with different chromatin states are merged, actively transcribed genes tend to lose expression, whereas repressed genes often gain new enhancer interactions. We observed this effect for Slc29a3 and Unc5b, whose expression levels tended to average out after the insulation between them was lost, and for Cdh23, which is normally highly expressed in the cerebellum and lost expression upon TAD reorganization. Thus, depending on the epigenetic state and differentiation trajectory of the locus, a TAD boundary lesion can be expected to have a tissue-specific effect on gene expression that varies not only in amplitude but, more intriguingly, in direction.
An intriguing discovery is that the gradual deletions of boundary-forming CTCF-bs do not consistently result in a proportional enhancement of the effect. Specifically, in the cases involving Psap and Vsir, partial disruption of the boundary had a more significant effect than its complete disappearance. Additionally, we observed that the effects on Slc29a3 did not escalate in tandem with the progression of boundary disruption. This indicates that deletions involving two and four CTCF-bs may impact different TAD boundary functions rather than simply exacerbating a single effect. Moreover, this observation leads to the fascinating hypothesis that bidirectional TAD boundaries operate under a different functional logic compared to those with only co-directed CTCF-bs.
The non-proportional responses of Psap, Vsir, and Slc29a3 suggest that CTCF clusters may serve multiple, partially redundant roles at TAD boundaries, a notion supported by prior studies showing that individual CTCF sites often contribute non-equivalently to insulation [71, 72]. Despite this, the cHi-C maps of the two-site allele were indistinguishable from those of the four-site allele at the available 5 kb resolution. This mismatch cannot be easily explained. Possibly, subtle rearrangements at the scale of a few nucleosomes (∼1 kb) or transient promoter hubs can strongly modulate transcription yet remain invisible to standard cHi-C. Such micro-architectural changes would selectively affect genes embedded in the hub (e.g. Psap and Slc29a3) without altering the broader TAD contacts, and detecting them would require more detailed strategies such as DNase Hi-C [73], Micro-C [74, 75], Hi-ChIP [76] or Hi-Track [77] combined with targeted enrichment methods [78]. Although our experiment was carefully controlled and the results were FDR-corrected, no RNA-quantification platform is immune to artefacts. At the modest effect sizes we observed, platform-specific bias in either NGS or dPCR can still turn noise into apparent expression changes, so minor shifts should be treated as provisional until higher-precision methods become available.
In summary, we demonstrated that the impact of TAD boundary disruption on the regulation of gene transcription is highly tissue-specific, with both magnitude and direction varying from tissue to tissue, which makes the consequences of TAD boundary mutations challenging to predict. This indicates a complex interaction between TAD structure and epigenetic regulation during cell differentiation in establishing gene expression patterns. Additionally, many of the observed effects do not align with any known mechanisms, prompting new questions for further investigation.
Materials and methods
Mouse lines and microinjections
We used the C57BL/6 mouse line as a basis for derivation of mutant mice (obtaining zygotes and backcrossing). We used pseudopregnant female CD-1 mice for transplantation of microinjected zygotes. Cytoplasmic microinjection of zygotes was performed using standard techniques that are widely used in transgenesis [61, 79]. Food and water were available for animals ad libitum.
All animal procedures were approved by the Ethics Committee of the Institute of Cytology and Genetics (protocol #65, issued October 9, 2020). Animals were obtained and handled in the SPF Animal Facilities of ICG SB RAS.
CRISPR/Cas9 sgRNA construction
We designed CRISPR sgRNAs for the desired regions using the Benchling web tool (https://benchling.com). We then simulated the desired mutation sequences and designed ssODNs for them with 60 bp homology arms, together with primers for genotyping (Supplementary Table 1). In the ssODN sequences for the Left, Middle and Right loci, we introduced a HindIII recognition site for genotyping convenience. DNA templates for sgRNA synthesis were obtained via PCR using oligonucleotides containing the T7 promoter, the guide sequence and the sgRNA scaffold. PCR products were used for in vitro transcription (MEGAshortscript™ T7 Transcription Kit, Ambion). The resulting RNA was purified on MEGAclear™ Transcription Clean-Up Kit (Ambion) columns and mixed with spCas9 mRNA (GeneArt™ CRISPR Nuclease mRNA, Thermo, USA) at 8.2 pmol of each sgRNA and 16.4 pmol of Cas9 mRNA. ssODNs were purified on MEGAclear™ Transcription Clean-Up Kit (Ambion) columns and mixed at a final concentration of 250 ng/µl. The RNA mix and ssODNs were combined immediately before microinjection.
Genotyping
Animal genotyping was performed on three-week-old animals. DNA from tail tip tissue was purified using the phenol-chloroform method and amplified by a standard PCR protocol using the corresponding primers (Supplementary Table 1).
The deletion chr10:60,755,585–60,761,088 was detected using primers UNC5B-M1-F and UNC5B-R1-R (product length 300 bp). The wild-type allele was detected by PCR with primer pairs UNC5B-M1-F and UNC5B-M1-R (475 bp); UNC5B-R1-F and UNC5B-R1-R (450 bp). Inversions and duplications of the region chr10:60,755,585–60,761,088 were assessed with the UNC5B-M1-F and UNC5B-R1-F; UNC5B-M1-R and UNC5B-R1-R; and UNC5B-M1-R and UNC5B-R1-F primer pairs.
For detection of mutations in Left, Middle and Right loci primers UNC5B-L2-F and UNC5B-L2-R (461 bp); UNC5B-M2-F and UNC5B-M2-R (276 bp); UNC5B‑R2-F and UNC5B‑R2-R (324 bp) were used. PCR products were digested by HindIII restriction enzyme to detect HDR outcomes, or PstI (Left), SacI (Middle), or HaeIII (Right) to detect INDELs. The products of these reactions were analyzed by electrophoresis in 2% agarose gel.
NGS libraries construction for genotyping
We designed primer pairs for the Left, Middle, and Right regions so that each PCR product is 250 bp long. For each region we purchased four forward and five reverse barcoded primers, whose combinations allowed us to multiplex NGS sequencing of 20 mice. PCR products from each region and each F0 mouse DNA sample were mixed and prepared for sequencing using the Kapa Hyper Prep Kit (Roche, #KK8504) and the KAPA Single-Indexed Adapter Set A (Roche, #KK8701) according to the manufacturer's protocol, without post-ligation PCR, to prevent the formation of PCR chimeras. The library was sequenced at BGI, yielding 25.5 million 150 bp paired-end reads.
We used cutadapt to remove adapters. Reads were mapped with BWA MEM using the default configuration. Reads were then demultiplexed on the assumption that bases at specific positions in the mapped reads correspond to mouse-specific primer barcodes. All genomic variants present in the reads, regardless of their frequency in the mapped NGS data, were documented and scored according to the number of reads supporting the variant. We visualized the histogram of variant frequencies and defined an empirical threshold, filtering out all variants supported by fewer than 2000 reads. The resulting set of genomic variants is presented in this work.
Predictions of mutation variant frequency were obtained using the inDelphi online software (https://indelphi.giffordlab.mit.edu). Predictions of the CTCF-binding score of mutation variants were obtained using the CTCFBSDB 2.0 in silico CTCF-bs prediction tool (https://insulatordb.uthsc.edu).
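The barcode combinatorics and the read-support filter described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the barcode sequences, the sample names, and the function names are hypothetical, and only the 4 × 5 primer design and the 2000-read threshold are taken from the text.

```python
from collections import Counter
from itertools import product

# Hypothetical barcode sequences (placeholders, not the real primers).
FORWARD_BARCODES = ["ACGT", "CGTA", "GTAC", "TACG"]
REVERSE_BARCODES = ["AACC", "CCGG", "GGTT", "TTAA", "AGAG"]

# Each forward/reverse pair identifies one mouse: 4 x 5 = 20 combinations.
BARCODE_TO_MOUSE = {
    pair: f"mouse_{idx:02d}"
    for idx, pair in enumerate(product(FORWARD_BARCODES, REVERSE_BARCODES), start=1)
}


def demultiplex(reads):
    """Count variants per mouse from (fwd_barcode, rev_barcode, variant) records.

    Records whose barcode pair is not in the design are dropped.
    """
    counts = {}
    for fwd, rev, variant in reads:
        mouse = BARCODE_TO_MOUSE.get((fwd, rev))
        if mouse is None:
            continue
        counts.setdefault(mouse, Counter())[variant] += 1
    return counts


def filter_variants(variant_counts, min_reads=2000):
    """Keep only variants supported by at least `min_reads` reads."""
    return {v: n for v, n in variant_counts.items() if n >= min_reads}
```

In practice the threshold would be chosen from the histogram of read support, as described in the text; the default of 2000 simply mirrors the value reported here.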
ChIP-Seq
Freshly dissected liver samples (about one gram) were minced with a razor blade and homogenized in a Dounce homogenizer in 1 ml of 3% formaldehyde; the samples were then transferred to a 50 ml tube and the volume of 3% formaldehyde was adjusted to 40 ml. After 10 min of incubation at RT, the fixative was quenched by adding glycine to a final concentration of 125 mM; samples were incubated for 10 min, washed twice with ice-cold PBS, then snap-frozen and stored at -80 °C. Frozen samples were thawed and lysed for 30 min in a lysis buffer (10 mM Tris-HCl, 1 mM EDTA, 1% Triton X-100, 0.1% sodium deoxycholate, protease inhibitors) supplemented with 0.5% SDS. Chromatin was sheared using a Bandelin Sonopulse sonicator (70% power, 11 cycles of 30 s ON/90 s OFF). 20–40 µg of fixed chromatin were used per ChIP. Before incubation with specific antibodies, chromatin was diluted with lysis buffer to 0.1–0.2% SDS and pre-cleared by incubation with Protein A magnetic beads (New England Biolabs, S1425S) for 2 h at 4 °C with slow rotation. During this time, another aliquot of Protein A magnetic beads was washed in PBS, combined with 5 µg of target antibodies (Cell Signaling Technology #2899), and incubated for 2 h at 4 °C with slow rotation. The pre-clearing beads were removed, and the pre-cleared chromatin was immunoprecipitated with the antibody/magnetic bead complexes overnight at 4 °C with slow rotation. The next day, the beads were thoroughly washed in a series of buffers (Buffer 1: 10 mM Tris-HCl, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, protease inhibitors; Buffer 2: 500 mM NaCl, 10 mM Tris-HCl, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, protease inhibitors; Buffer 3: 0.25 M LiCl, 10 mM Tris-HCl, 1 mM EDTA, 0.5% NP-40, 0.5% sodium deoxycholate; Buffer TE/Triton: 10 mM Tris-HCl, 1 mM EDTA, 1% Triton X-100; TE buffer: 10 mM Tris-HCl, 1 mM EDTA).
Cross-links were removed and DNA was eluted in 100 µL elution buffer (10 mM Tris-HCl, 1 mM EDTA, 1% SDS) by incubation at 65 °C for 14 h. After treatment with RNAse A (New England Biolabs, T3018) and Proteinase K (New England Biolabs, P8107S), magnetic beads were removed. DNA was extracted using ChIP DNA Clean & Concentrator columns (Zymo Research, D5205). ChIP-seq libraries were prepared for sequencing using a Kapa Hyper Prep Kit (Roche, #KK8504) and a KAPA Single-Indexed Adapter Set A, (Roche, #KK8701). Libraries were sequenced on the DNBSEQ sequencing platform. ChIP-Seq data were processed by the standard ENCODE pipeline (https://github.com/kundajelab/chipseq_pipeline).
cHi-C
Capture Hi-C library preparation was performed according to the original Hi-C 2.0 protocol [80] with minor adaptations for tissue sample processing:
Tissue samples were manually minced using a sharp blade and then suspended in 2 ml of PBS. The mixture was homogenized using a glass homogenizer for 20 strokes on ice. Formaldehyde was added to a final concentration of 2%, and the mixture was incubated for 10 min at room temperature with rotation. Glycine was then added to a final concentration of 250 mM and incubated for 10 min under the same conditions. The samples were centrifuged at 1000 g for 5 min, followed by two washes with cold PBS, after which the supernatant was discarded. The pellet was quickly frozen in liquid nitrogen or directly resuspended in a cold lysis buffer (10 mM Tris, 10 mM NaCl, 0.2% Igepal CA-630). Homogenization was performed using a syringe, and the lysate was filtered through muslin. The homogenate was incubated for 30 min on ice, centrifuged at 600 g for 5 min, and then washed twice with cold lysis buffer and once with NEBuffer 3.1 containing 0.3% SDS. The pellet was resuspended in NEBuffer 3.1 with 0.1% SDS and incubated for 30 min at 37 °C, followed by the addition of 200 µl of 1.5% Triton X-100 and further incubation under the same conditions. DpnII enzyme (5 µl) was added, and the mixture was digested overnight at 37 °C. After a 20-minute incubation at 65 °C, the mixture was centrifuged to remove the supernatant. The DNA was then prepared for end repair by resuspension in 150 µl of DNA end repair mix (50 µM dGTP, 50 µM dATP, 50 µM dTTP, 50 µM dCTP-15bio, 1X NEBuffer 2.1, 25U PolII Klenow fragment), and incubated for 4 h at 23 °C. Ligation was performed by adding 1 ml of ligation mix (1X T4 Ligase Buffer, 1% Triton X-100, 5% PEG, 100 µg/ml BSA, 1 mM ATP, 4000U T4 Ligase) and incubating overnight at 16 °C. The procedure was then continued as specified in the original protocol. Sequencing libraries were prepared using the KAPA Hyper Prep kit. The hybridization of libraries with RNA probes was performed according to the myBaits Manual v4.01 (Arbor Biosciences).
Enrichment probes were designed over the region chr10:60,103,000–61,356,000, mm10.
dPCR
The sample sizes in each group and organ combination varied due to differences in sample preparation quality and the availability of experimental mice. However, each group in this experiment consistently included at least five samples.
Digital PCR with fluorescent probes was performed using the QIAcuity® Probe PCR Kit (Qiagen). According to the protocol, one 12 µl reaction contained 3 µl of 4x Probe PCR Master Mix, 1 µl of each corresponding primer (10 µM) and probe (10 µM), and 0.5 µl of cDNA sample. The reaction mix was transferred to reaction nanoplates and sealed with the dedicated film (QIAcuity Nanoplate Seal, Qiagen). The plates were placed in the QIAcuity One device (with a 5-channel sensor), and amplification was performed as follows: initiation at 95 °C for 2 min, then 35 cycles of 95 °C for 15 s and 62 °C for 30 s.
Allele-specific NGS-based expression measurement
The sample sizes in each group and organ combination varied due to differences in sample preparation quality and the availability of experimental mice. However, the control group consistently included at least five samples, while both groups of mice carrying two or four CTCF-binding site deletions included at least six samples each.
To construct NGS libraries for measuring comparative allelic expression levels, we first performed reverse transcription with gene-specific primers carrying a UMI and part of the sequencing adapter at the 5' end, using the RNAScribe reverse transcription kit (Biolabmix, Russia) according to the manufacturer's protocol. Next, we amplified the cDNA using a gene-specific second-strand synthesis primer and a primer annealing to the adapter part of the reverse transcription primer. These amplicons were indexed in a second PCR step using custom indexing primers that complete the sequencing adapter sequences, then purified with SPRI beads and sequenced in paired-end mode on the DNBSEQ sequencing platform (BGI service). We obtained about 30,000 read pairs per transcript.
Sequencing data were aligned to the expected transcript sequences using bowtie2 with default parameters. UMI sequences were extracted by umi_tools extract with bc-pattern = NNNNXNNNNXNNNN and then deduplicated by umi_tools dedup. SNP data were collected by bcftools mpileup with the -a AD parameter, and the VCF files were then parsed using a custom Python script. For each animal, counts of the C57BL/6 SNP were divided by counts of the CAST SNP, giving a comparative level of allelic expression. The significance of differences between these distributions was assessed by the Mann-Whitney test, followed by Benjamini-Hochberg FDR correction with a threshold of 0.1.
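The normalization and multiple-testing steps described above can be illustrated with a short sketch. It is not the custom script used in the study: the function names are hypothetical, and the Mann-Whitney p-values are assumed to have been computed beforehand (for example with `scipy.stats.mannwhitneyu`) and passed in as plain numbers.

```python
def allelic_ratio(b6_count, cast_count):
    """C57BL/6 UMI count divided by the CAST UMI count for one animal."""
    if cast_count == 0:
        raise ValueError("CAST count is zero; the ratio is undefined")
    return b6_count / cast_count


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(p_values, threshold=0.1):
    """Flag tests whose adjusted p-value passes the FDR threshold."""
    return [q <= threshold for q in benjamini_hochberg(p_values)]
```

Each gene and tissue combination would contribute one p-value from comparing the per-animal ratios of mutant and control groups; the 0.1 default mirrors the threshold stated in the text.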
Data analysis
Hi-C data were analyzed using Juicer software [81] with slight modifications; in particular, we assessed Hi-C data quality using custom DE and cis/trans metrics, following the quality control steps described in [73]. ChIP-seq data analysis was performed according to ENCODE pipelines, as described in our previous work [82].
To quantify the local segregation of chromatin, we used the TAD directionality score introduced in [83], with slightly adjusted parameters. We first extracted the sub-matrix from the .hic file with hic-straw (mm10 coordinates chr10:60000000:61300000 vs. chr10:60000000:61300000). For every 5-kb bin i, we then summed two sets of contacts that form a check-mark-shaped template offset from the main diagonal by 50 kb: (1) contacts between the 25-kb region 50 kb upstream of i and the 250-kb region 50 kb downstream of i; and (2) the reciprocal contacts between the 250-kb region upstream and the 25-kb region downstream. The raw check-mark isolation score is the sum of these two counts, and the final score is obtained by dividing each value by the mean for the capture region.
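The check-mark score described above can be sketched in plain Python as follows. This is one possible reading of the description rather than the code used in the study: the exact placement of the regions relative to bin i, the clipping of windows at the matrix edges, and the function name are assumptions. The defaults correspond to 5-kb bins (50 kb = 10 bins, 25 kb = 5 bins, 250 kb = 50 bins).

```python
def checkmark_scores(matrix, offset=10, short=5, long=50):
    """Normalized check-mark isolation score for every bin of a square matrix.

    Assumptions: the nearest bin of each flanking region lies `offset` bins
    from bin i, and windows running past the matrix edge are clipped.
    """
    n = len(matrix)

    def block_sum(rows, cols):
        # Sum contacts in the rows x cols block, ignoring out-of-range bins.
        return sum(
            matrix[r][c]
            for r in rows if 0 <= r < n
            for c in cols if 0 <= c < n
        )

    raw = []
    for i in range(n):
        up_short = range(i - offset - short + 1, i - offset + 1)
        up_long = range(i - offset - long + 1, i - offset + 1)
        down_short = range(i + offset, i + offset + short)
        down_long = range(i + offset, i + offset + long)
        # The two arms are summed literally, so their overlap
        # (short upstream x short downstream) is counted twice.
        raw.append(block_sum(up_short, down_long) + block_sum(up_long, down_short))

    mean = sum(raw) / n if n else 0.0
    return [value / mean for value in raw] if mean else raw
```

Low values of this score mark bins that few contacts cross, which is the behaviour expected at an insulating boundary; the input matrix would be the sub-matrix extracted for the capture region.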
Data Availability
Sequencing data generated in this study are accessible via the NCBI BioProject PRJNA842410.
In silico prediction of changes in gene expression using the Enformer model
We used the PyTorch version of the deep learning Enformer model from https://github.com/lucidrains/enformer-pytorch. We predicted gene expression using as input three DNA sequences centered on the TSSs of three genes (Cdh23, Slc29a3, Unc5b). We ran Enformer both on the reference mm10 genome sequences and on DNA sequences with modeled mutations: 2 CTCF-bs (chr10:60,755,585–60,761,088 deletion) and 4 CTCF-bs (chr10:60,755,585–60,761,088 deletion, chr10:60,775,689–60,775,698 deletion and chr10:60,778,725–60,778,738 deletion with insertion of an -AA- dinucleotide). We summed 4 bins around the TSS bin for the predicted wild-type data and for the predicted mutated data, and then calculated the ratio between these two values to assess changes in gene expression for each gene. We used 5 target cell types from the mouse Enformer target data: ‘liver, adult pregnant day01’, ‘urinary bladder, adult’, ‘kidney, neonate N30’, ‘cerebellum, adult’, and ‘olfactory brain, adult’.
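The bin-summing step can be illustrated with a small sketch that operates on already predicted tracks, represented here as plain lists of per-bin values for one target. The function names are hypothetical, and two details are assumptions not stated in the text: which four bins count as "around the TSS bin" (here the default window is offsets -2 to +1 relative to the center bin) and the direction of the ratio (mutant over wild type).

```python
def tss_signal(track, center_bin, offsets=(-2, -1, 0, 1)):
    """Sum predicted signal over the bins at the given offsets from the center."""
    return sum(track[center_bin + k] for k in offsets)


def expression_ratio(ref_track, mut_track, center_bin, offsets=(-2, -1, 0, 1)):
    """Ratio of mutant to reference predicted signal around the TSS."""
    ref = tss_signal(ref_track, center_bin, offsets)
    mut = tss_signal(mut_track, center_bin, offsets)
    if ref == 0:
        raise ValueError("reference signal is zero; the ratio is undefined")
    return mut / ref
```

A ratio above 1 would indicate a predicted increase in expression for the mutant sequence, and a ratio below 1 a predicted decrease.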
In silico predictions using AlphaGenome
We used the score_variant function from AlphaGenome with the GeneMaskLFCScorer to assess the functional impact of genomic variants. For two-site deletions, we replaced the deleted region with a single nucleotide. For four-site deletions, since AlphaGenome’s score_variant does not support multiple variants located in cis, we replaced each deleted region with an equivalent number of N bases and submitted the resulting sequence as the alternative allele of a single variant spanning chr10:60,755,585–60,778,739 (mm10).
Predictions were obtained using RNA-seq output channels from the following tissues: urinary bladder, cerebellum, liver, kidney, and olfactory bulb. We used the poly(A)+ RNA-seq assay and selected the track_strand matching the gene’s transcriptional orientation. To convert LFC scores to percent effect, we applied the transformation:
percent_change = (exp(score) − 1) × 100.
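As a worked illustration of this transformation, the sketch below applies it to a few scores. The function name is hypothetical; the use of the natural exponential follows directly from the formula above.

```python
import math


def lfc_to_percent(score):
    """Convert a log fold change score into a percent change in expression."""
    return (math.exp(score) - 1.0) * 100.0
```

Under this transformation a score of 0 corresponds to no change, a score of ln 2 (about 0.693) to a 100% increase, and a score of ln 0.5 (about -0.693) to a 50% decrease.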
References
Kabirova E, Nurislamov A, Shadskiy A, Smirnov A, Popov A, Salnikov P, et al. Function and evolution of the loop extrusion machinery in animals. Int J Mol Sci. 2023;24(5):5017.
Rao SSP, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, et al. A 3D map of the human genome at Kilobase resolution reveals principles of chromatin looping. Cell. 2015 July;162(3):687–8.
Nuriddinov M, Fishman V. C-InterSecture-a computational tool for interspecies comparison of genome architecture. Bioinforma Oxf Engl. 2019;35(23):4912–21.
Fishman V, Battulin N, Nuriddinov M, Maslova A, Zlotina A, Strunov A, et al. 3D organization of chicken genome demonstrates evolutionary conservation of topologically associated domains and highlights unique architecture of erythrocytes’ chromatin. Nucleic Acids Res. 2019;47(2):648–65.
Krefting J, Andrade-Navarro MA, Ibn-Salem J. Evolutionary stability of topologically associating domains is associated with conserved gene regulation. BMC Biol. 2018;16(1):87.
Lupiáñez DG, Kraft K, Heinrich V, Krawitz P, Brancati F, Klopocki E, et al. Disruptions of topological chromatin domains cause pathogenic rewiring of Gene-Enhancer interactions. Cell. 2015;161(5):1012–25.
Salnikov P, Korablev A, Serova I, Belokopytova P, Yan A, Stepanchuk Y, et al. Structural variants in the Epb41l4a locus: TAD disruption and Nrep gene misregulation as hypothetical drivers of neurodevelopmental outcomes. Sci Rep. 2024;14(1):5288.
Kabirova E, Ryzhkova A, Lukyanchikova V, Khabarova A, Korablev A, Shnaider T, et al. TAD border deletion at the Kit locus causes tissue-specific ectopic activation of a neighboring gene. Nat Commun [Internet]. 2024;15(1). Available from: https://doi.org/10.1038/s41467-024-48523-7
Rao SSP, Huang SC, Glenn St Hilaire B, Engreitz JM, Perez EM, Kieffer-Kwon KR, et al. Cohesin loss eliminates all loop domains. Cell. 2017;171(2):305–320.e24.
Despang A, Schöpflin R, Franke M, Ali S, Jerković I, Paliou C, et al. Functional dissection of the Sox9–Kcnj2 locus identifies nonessential and instructive roles of TAD architecture. Nat Genet. 2019 July;51(8):1263–71.
Fishman VS, Salnikov PA, Battulin NR. Interpreting chromosomal rearrangements in the context of 3-Dimentional genome organization: A practical guide for medical genetics. Biochem Biokhimiia. 2018;83(4):393–401.
Weischenfeldt J, Dubash T, Drainas AP, Mardin BR, Chen Y, Stütz AM, et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat Genet. 2016;49(1):65–74.
Wang X, Xu J, Zhang B, Hou Y, Song F, Lyu H, et al. Genome-wide detection of enhancer-hijacking events from chromatin interaction data in rearranged genomes. Nat Methods. 2021 June;18(6):661–8.
Mortenson KL, Dawes C, Wilson ER, Patchen NE, Johnson HE, Gertz J et al. 3D genomic analysis reveals novel enhancer-hijacking caused by complex structural alterations that drive oncogene overexpression. BioRxiv Prepr Serv Biol. 2024;2024.01.23.576965.
Kragesteen BK, Spielmann M, Paliou C, Heinrich V, Schöpflin R, Esposito A, et al. Dynamic 3D chromatin architecture contributes to enhancer specificity and limb morphogenesis. Nat Genet. 2018 Sept;50(10):1463–73.
Valton AL, Dekker J. TAD disruption as oncogenic driver. Curr Opin Genet Dev. 2016;36:34–40.
Katainen R, Dave K, Pitkänen E, Palin K, Kivioja T, Välimäki N, et al. CTCF/cohesin-binding sites are frequently mutated in cancer. Nat Genet. 2015 July;47(7):818–21.
Franke M, Ibrahim DM, Andrey G, Schwarzer W, Heinrich V, Schöpflin R, et al. Formation of new chromatin domains determines pathogenicity of genomic duplications. Nature. 2016;538(7624):265–9.
Cavalheiro GR, Pollex T, Furlong EE. To loop or not to loop: what is the role of tads in enhancer function and gene regulation? Curr Opin Genet Dev. 2021;67:119–29.
Gombert WM, Krumm A. Targeted Deletion of Multiple CTCF-Binding Elements in the Human C-MYC Gene Reveals a Requirement for CTCF in C-MYC Expression. PLOS ONE. 2009 July 1;4(7):e6109.
Hyle J, Zhang Y, Wright S, Xu B, Shao Y, Easton J, et al. Acute depletion of CTCF directly affects MYC regulation through loss of enhancer–promoter looping. Nucleic Acids Res. 2019;47(13):6699–713.
Ushiki A, Zhang Y, Xiong C, Zhao J, Georgakopoulos-Soares I, Kane L, et al. Deletion of CTCF sites in the SHH locus alters enhancer–promoter interactions and leads to acheiropodia. Nat Commun. 2021;12(1):2282.
Kane L, Williamson I, Flyamer IM, Kumar Y, Hill RE, Lettice LA et al. Cohesin is required for long-range enhancer action at the Shh locus. Nat Struct Mol Biol. 2022 Sept 1;29(9):891–7.
Gerhards N. Effects of CTCF binding site deletions on genome architecture and gene expression in the Epha4 locus [PhD thesis]. [Berlin]: Charité– Berlin University Medicine; 2022.
Fang H, Tronco AR, Bonora G, Nguyen T, Thakur J, Berletch JB et al. CTCF-mediated insulation and chromatin environment modulate Car5b escape from X inactivation. BioRxiv. 2023;2023.05.04.539469.
Willi M, Yoo KH, Reinisch F, Kuhns TM, Lee HK, Wang C et al. Facultative CTCF sites moderate mammary super-enhancer activity and regulate juxtaposed gene in non-mammary cells. Nat Commun 2017 July 17;8(1):16069.
Tsujimura T, Klein FA, Langenfeld K, Glaser J, Huber W, Spitz F. A discrete transition zone organizes the topological and regulatory autonomy of the adjacent Tfap2c and Bmp7 genes. PLoS Genet. 2015;11(1):e1004897.
Tadagavadi RK, Wang W, Ramesh G. Netrin–1 regulates Th1/Th2/Th17 cytokine production and inflammation through UNC5B receptor and protects kidney against ischemia-reperfusion injury. J Immunol Baltim Md 1950 2010 Sept 15;185(6):3750–8.
Kaur S, Abu-Abab MS, Singla S, Yeo SY, Ramchandran R. Expression pattern for unc5b, an axon guidance gene in embryonic zebrafish development. Gene Expr. 2018 July;5(6):321–7.
Pradella D, Deflorian G, Pezzotta A, Di Matteo A, Belloni E, Campolungo D, et al. A ligand-insensitive UNC5B splicing isoform regulates angiogenesis by promoting apoptosis. Nat Commun. 2021;12:4872.
Lu X, Le Noble F, Yuan L, Jiang Q, De Lafarge B, Sugiyama D, et al. The Netrin receptor UNC5B mediates guidance events controlling morphogenesis of the vascular system. Nature. 2004;432(7014):179–86.
Kong C, Zhan B, Piao C, Zhang Z, Zhu Y, Li Q. Overexpression of UNC5B in bladder cancer cells inhibits proliferation and reduces the volume of transplantation tumors in nude mice. BMC Cancer. 2016;16(1):892.
Lee HK, Seo IA, Seo E, Seo SY, Lee HJ, Park HT. Netrin–1 induces proliferation of Schwann cells through Unc5b receptor. Biochem Biophys Res Commun. 2007;362(4):1057–62.
Ahn EH, Kang SS, Qi Q, Liu X, Ye K. Netrin1 deficiency activates MST1 via UNC5B receptor, promoting dopaminergic apoptosis in Parkinson’s disease. Proc Natl Acad Sci U S A. 2020 Sept 29;117(39):24503–13.
Huang L, An X, Zhu Y, Zhang K, Xiao L, Yao X, et al. Netrin–1 induces the anti-apoptotic and pro-survival effects of B-ALL cells through the Unc5b-MAPK axis. Cell Commun Signal CCS. 2022;20(1):122.
Wu S, Guo X, Zhou J, Zhu X, Chen H, Zhang K et al. High expression of UNC5B enhances tumor proliferation, increases metastasis, and worsens prognosis in breast cancer. Aging 2020 Sept 9;12(17):17079–98.
Zeng Z, Yu J, Jiang Z, Zhao N. Oleanolic acid (OA) targeting UNC5B inhibits proliferation and EMT of ovarian cancer cell and increases chemotherapy sensitivity of niraparib. J Oncol. 2022;2022:5887671.
Baldwin SA, Yao SYM, Hyde RJ, Ng AML, Foppolo S, Barnes K, et al. Functional characterization of novel human and mouse equilibrative nucleoside transporters (hENT3 and mENT3) located in intracellular membranes. J Biol Chem. 2005;280(16):15880–7.
Elwi AN, Damaraju VL, Baldwin SA, Young JD, Sawyer MB, Cass CE. Renal nucleoside transporters: physiological and clinical implications. Biochem Cell Biol Biochim Biol Cell. 2006;84(6):844–58.
Hsu CL, Lin W, Seshasayee D, Chen YH, Ding X, Lin Z, et al. Equilibrative nucleoside transporter 3 deficiency perturbs lysosome function and macrophage homeostasis. Science. 2012 Jan 6;335(6064):89–92.
Nair S, Strohecker AM, Persaud AK, Bissa B, Muruganandan S, McElroy C et al. Adult stem cell deficits drive Slc29a3 disorders in mice. Nat Commun. 2019 July 3;10(1):2943.
Kang N, Jun AH, Bhutia YD, Kannan N, Unadkat JD, Govindarajan R. Human equilibrative nucleoside Transporter–3 (hENT3) spectrum disorder mutations impair nucleoside transport, protein localization, and stability. J Biol Chem 2010 Sept 3;285(36):28343–52.
Melki I, Lambot K, Jonard L, Couloigner V, Quartier P, Neven B, et al. Mutation in the SLC29A3 gene: A new cause of a monogenic, autoinflammatory condition. Pediatrics. 2013;131(4):e1308–13.
Cagdas D, Surucu N, Tan Ç, Özgül R, Akkaya-Ulum Z, Aydinoglu A et al. Autoinflammation in addition to combined immunodeficiency: SLC29A3 gene defect. Mol Immunol [Internet]. 2020 [cited 2024 Oct 10];121. Available from: https://avesis.hacettepe.edu.tr/yayin/310a77be–0f22–406b–93b7-c81a6437f718/autoinflammation-in-addition-to-combined-immunodeficiency-slc29a3-gene-defect
Bolz H, von Brederlow B, Ramírez A, Bryda EC, Kutsche K, Nothwang HG, et al. Mutation of CDH23, encoding a new member of the Cadherin gene family, causes Usher syndrome type 1D. Nat Genet. 2001;27(1):108–12.
Bork JM, Peters LM, Riazuddin S, Bernstein SL, Ahmed ZM, Ness SL, et al. Usher syndrome 1D and nonsyndromic autosomal recessive deafness DFNB12 are caused by allelic mutations of the novel cadherin-like gene CDH23. Am J Hum Genet. 2001;68(1):26–37.
Liu S, Li S, Zhu H, Cheng S, Zheng QY. A mutation in the cdh23 gene causes age-related hearing loss in Cdh23(nmf308/nmf308) mice. Gene. 2012;499(2):309–17.
Han F, Yu H, Tian C, Chen HE, Benedict-Alderfer C, Zheng Y, et al. A new mouse mutant of the Cdh23 gene with early-onset hearing loss facilitates evaluation of otoprotection drugs. Pharmacogenomics J. 2012;12(1):30–44.
Wada T, Wakabayashi Y, Takahashi S, Ushiki T, Kikkawa Y, Yonekawa H, et al. A point mutation in a Cadherin gene, Cdh23, causes deafness in a novel mutant, waltzer mouse Niigata. Biochem Biophys Res Commun. 2001;283(1):113–7.
Jones SM, Johnson KR, Yu H, Erway LC, Alagramam KN, Pollak N, et al. A quantitative survey of gravity receptor function in mutant mouse strains. J Assoc Res Otolaryngol JARO. 2005;6(4):297–310.
Hulková H, Cervenková M, Ledvinová J, Tochácková M, Hrebícek M, Poupetová H, et al. A novel mutation in the coding region of the prosaposin gene leads to a complete deficiency of prosaposin and saposins, and is associated with a complex sphingolipidosis dominated by lactosylceramide accumulation. Hum Mol Genet. 2001;10(9):927–40.
O’Brien JS, Kretz KA, Dewji N, Wenger DA, Esch F, Fluharty AL. Coding of two sphingolipid activator proteins (SAP-1 and SAP-2) by same genetic locus. Science. 1988;241(4869):1098–101.
Oji Y, Hatano T, Ueno SI, Funayama M, Ishikawa KI, Okuzumi A, et al. Variants in Saposin D domain of prosaposin gene linked to Parkinson’s disease. Brain. 2020;143(4):1190–205.
Sun Y, Witte DP, Zamzow M, Ran H, Quinn B, Matsuda J, et al. Combined Saposin C and D deficiencies in mice lead to a neuronopathic phenotype, glucosylceramide and alpha-hydroxy ceramide accumulation, and altered prosaposin trafficking. Hum Mol Genet. 2007;16(8):957–71.
Sun Y, Zamzow M, Ran H, Zhang W, Quinn B, Barnes S, et al. Tissue-specific effects of Saposin A and Saposin B on glycosphingolipid degradation in mutant mice. Hum Mol Genet. 2013;22(12):2435–50.
Oya Y, Nakayasu H, Fujita N, Suzuki K, Suzuki K. Pathological study of mice with total deficiency of sphingolipid activator proteins (SAP knockout mice). Acta Neuropathol (Berl). 1998;96(1):29–40.
Akil O, Chang J, Hiel H, Kong JH, Yi E, Glowatzki E, et al. Progressive deafness and altered cochlear innervation in knock-out mice lacking prosaposin. J Neurosci Off J Soc Neurosci. 2006;26(50):13076–88.
Bonev B, Mendelson Cohen N, Szabo Q, Fritsch L, Papadopoulos GL, Lubling Y, et al. Multiscale 3D genome rewiring during mouse neural development. Cell. 2017;171(3):557–572.e24.
Calandrelli R, Wen X, Charles Richard JL, Luo Z, Nguyen TC, Chen CJ, et al. Genome-wide analysis of the interplay between chromatin-associated RNA and 3D genome organization in human cells. Nat Commun. 2023;14(1):6519.
Shi Z, Xu J, Niu L, Shen W, Yan S, Tan Y, et al. Evolutionarily distinct and sperm-specific supersized chromatin loops are marked by Helitron transposons in Xenopus tropicalis. Cell Rep. 2023;42(3):112151.
Korablev A, Lukyanchikova V, Serova I, Battulin N. On-target CRISPR/Cas9 activity can cause undesigned large deletion in mouse zygotes. Int J Mol Sci. 2020;21(10):3604.
Shen MW, Arbab M, Hsu JY, Worstell D, Culbertson SJ, Krabbe O, et al. Predictable and precise template-free CRISPR editing of pathogenic variants. Nature. 2018;563(7733):646–51.
Avsec Ž, Agarwal V, Visentin D, Ledsam JR, Grabska-Barwinska A, Taylor KR, et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat Methods. 2021;18(10):1196–203.
Avsec Ž, Latysheva N, Cheng J, Novati G, Taylor KR, Ward T, et al. AlphaGenome: advancing regulatory variant effect prediction with a unified DNA sequence model [Internet]. bioRxiv; 2025 [cited 2025 July 28]. p. 2025.06.25.661532. Available from: https://www.biorxiv.org/content/10.1101/2025.06.25.661532v2
Sasse A, Ng B, Spiro AE, Tasaki S, Bennett DA, Gaiteri C, et al. Benchmarking of deep neural networks for predicting personal gene expression from DNA sequence highlights shortcomings. Nat Genet. 2023;55(12):2060–4.
Huang C, Shuai RW, Baokar P, Chung R, Rastogi R, Kathail P, et al. Personal transcriptome variation is poorly explained by current genomic deep learning models. Nat Genet. 2023;55(12):2056–9.
Tang Z, Toneyan S, Koo PK. Current approaches to genomic deep learning struggle to fully capture human genetic variation. Nat Genet. 2023;55(12):2021–2.
Kraft K, Yost KE, Murphy SE, Magg A, Long Y, Corces MR, et al. Polycomb-mediated genome architecture enables long-range spreading of H3K27 methylation. Proc Natl Acad Sci U S A. 2022;119(22):e2201883119.
Chen F, Pruett-Miller SM, Davis GD. Gene editing using ssODNs with engineered endonucleases. Methods Mol Biol Clifton NJ. 2015;1239:251–65.
Belokopytova P, Fishman V. Predicting genome architecture: challenges and solutions. Front Genet. 2021;11:617202.
Amândio AR, Beccari L, Lopez-Delisle L, Mascrez B, Zakany J, Gitto S, et al. Sequential in cis mutagenesis in vivo reveals various functions for CTCF sites at the mouse HoxD cluster. Genes Dev. 2021;35(21–22):1490–509.
Anania C, Acemel RD, Jedamzick J, Bolondi A, Cova G, Brieske N, et al. In vivo dissection of a clustered-CTCF domain boundary reveals developmental principles of regulatory insulation. Nat Genet. 2022;54(7):1026–36.
Gridina M, Mozheiko E, Valeev E, Nazarenko LP, Lopatkina ME, Markova ZG, et al. A cookbook for DNase Hi-C. Epigenetics Chromatin. 2021;14(1):15.
Hsieh THS, Weiner A, Lajoie B, Dekker J, Friedman N, Rando OJ. Mapping nucleosome resolution chromosome folding in yeast by Micro-C. Cell. 2015;162(1):108–19.
Krietenstein N, Abraham S, Venev SV, Abdennur N, Gibcus J, Hsieh THS, et al. Ultrastructural details of mammalian chromosome architecture. Mol Cell. 2020;78(3):554–565.e7.
Mumbach MR, Rubin AJ, Flynn RA, Dai C, Khavari PA, Greenleaf WJ, et al. HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat Methods. 2016;13(11):919–22.
Liu S, Cao Y, Cui K, Tang Q, Zhao K. Hi-TrAC reveals division of labor of transcription factors in organizing chromatin loops. Nat Commun. 2022;13(1):6679.
Gridina M, Lagunov T, Belokopytova P, Torgunakov N, Nuriddinov M, Nurislamov A, et al. Combining chromosome conformation capture and exome sequencing for simultaneous detection of structural and single-nucleotide variants. Genome Med. 2025;17(1):47.
Korablev AN, Serova IA, Serov OL. Generation of megabase-scale deletions, inversions and duplications involving the Contactin-6 gene in mice by CRISPR/Cas9 technology. BMC Genet. 2017;18(Suppl 1):112.
Belaghzal H, Dekker J, Gibcus JH. Hi-C 2.0: an optimized Hi-C procedure for high-resolution genome-wide mapping of chromosome conformation. Methods San Diego Calif. 2017;123:56–65.
Durand NC, Shamim MS, Machol I, Rao SSP, Huntley MH, Lander ES, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3(1):95–8.
Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 2012;22(9):1813–31.
Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485(7398):376.
Acknowledgements
We acknowledge the support of RSF grants #17-74-10143 (2017-2019) and #22-14-00247 (2021-2024), which allowed us to build the infrastructure for generating genetically modified mice and for NGS-based transcription analysis. The original manuscript text was composed by the authors; proofreading was conducted with the assistance of ChatGPT o1. The authors corrected and edited the text after the ChatGPT proofreading.
Funding
This work was supported by the Ministry of Education and Science of the Russian Federation, agreement № 075-15-2024-539 (signed 24.04.2024).
Author information
Contributions
P.S. and V.F. conceived the study and designed the experiments. P.S. and P.B. prepared gRNA and ssODN constructs; A.K. and I.S. performed microinjections; P.S. handled transgenic mice and performed genotyping with help from Y.S., A.Y., and S.T.; V.L. and Y.S. performed ChIP-seq experiments; P.S., E.V., and P.B. performed NGS data analysis with help from V.F.; P.S. performed Hi-C experiments and N.T. analyzed the data; P.B. generated and analyzed Enformer predictions; P.S. prepared the draft of the manuscript with help from V.F.; all authors contributed to manuscript revision and read and approved the submitted version.
Ethics declarations
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Salnikov, P., Belokopytova, P., Yan, A. et al. Direction and modality of transcription changes caused by TAD boundary disruption in Slc29a3/Unc5b locus depends on tissue-specific epigenetic context. Epigenetics & Chromatin 18, 55 (2025). https://doi.org/10.1186/s13072-025-00618-1
DOI: https://doi.org/10.1186/s13072-025-00618-1