TRAF3 as a potential diagnostic biomarker for recurrent pregnancy loss: insights from single-cell transcriptomics and machine learning
BMC Pregnancy and Childbirth volume 25, Article number: 637 (2025)
Abstract
Background
Recurrent pregnancy loss (RPL), characterized by multiple miscarriages, remains a condition with unclear etiology, posing significant challenges for affected women and couples. This study aims to explore the underlying mechanisms of RPL, focusing on the role of decidual Natural Killer (dNK) cells and the TNF receptor-associated factor 3 (TRAF3) gene as a potential diagnostic marker and therapeutic target.
Methods
We used single-cell transcriptomic analysis and machine learning techniques to analyze decidual tissues from RPL patients and normal pregnancy (NP) controls. Weighted Gene Co-expression Network Analysis (WGCNA) was employed to identify key gene clusters. Validation studies included RT-PCR, immunohistochemistry, and molecular docking analyses.
Results
We observed an increased proportion of specific dNK cell subtypes (dNK2 and dNK3) in the RPL group compared to NP, implicating their role in RPL pathology. dNK cells in RPL primarily interacted with monocytes via the Macrophage Migration Inhibitory Factor (MIF) signaling pathway. Our diagnostic model, incorporating TRAF3 and nine other genes, demonstrated high diagnostic efficiency. TRAF3 expression was significantly lower in the decidua of RPL patients, and Diethylstilbestrol and Metformin were identified as potential modulators of TRAF3.
Conclusions
This study highlights TRAF3 as a promising diagnostic marker and therapeutic target for RPL. The diagnostic model we developed has potential for early detection and personalized treatment strategies for RPL.
Introduction
Recurrent pregnancy loss (RPL) is a distressing condition for both patients and physicians. It is defined variably, either as the loss of two or more consecutive pregnancies before the age of viability, or as three or more consecutive pregnancy losses, depending on the guidelines followed by different countries or organizations [1,2,3]. The prevalence of recurrent pregnancy loss among women or couples striving for conception is estimated at 1–2%. This percentage is subject to rise to around 5% in circumstances involving two consecutive miscarriages [2, 4]. The experience is often heart-wrenching for the patients and frustrating for the physicians due to the uncertainty surrounding its etiology in many cases [5].
The known causes of recurrent pregnancy loss include abnormal chromosomes, endocrinological disorders, and uterine abnormalities. Despite thorough investigation, the cause remains unknown in about 50% of cases [4, 6, 7]. Immunological factors, such as the presence of antiphospholipid antibodies, have been associated with recurrent pregnancy loss, but the exact role played by various immunological mechanisms is yet to be elucidated. Pregnancy triggers a unique immunological process, with the decidua, the specialized uterine lining, playing a crucial role. Decidual Natural Killer (dNK) cells, a significant part of this process, aid in immune regulation, trophoblast invasion, and tissue remodeling essential for successful pregnancy [6, 8,9,10]. In recurrent pregnancy loss, altered interactions between dNK cells and fetal HLA-E, and potential immunological abnormalities, may disrupt the necessary immune tolerance, leading to pregnancy loss [11]. Understanding the role of dNK cells in recurrent pregnancy loss may provide insights for potential therapeutic interventions.
Machine learning (ML) has significantly progressed in recent years, and when combined with single-cell technologies, it unlocks immense potential in the biomedical field [12]. Single-cell analysis allows for the detailed examination of individual cells, shedding light on cellular heterogeneity and enabling the precise characterization of complex biological systems [13]. ML, with its capability to handle large datasets and identify patterns, is particularly suited for analyzing the massive amount of data generated by single-cell technologies. This synergy is promising for identifying and validating diagnostic and therapeutic markers, which are crucial for personalized medicine approaches [14,15,16].
Our study aims to use machine learning and single-cell transcriptomic analysis to establish a diagnostic model for recurrent pregnancy loss. We screen dNK-related genes for potential diagnostic markers and validate the candidates in clinical samples. This work provides a foundation for unraveling the molecular mechanisms underlying recurrent pregnancy loss and for exploring novel diagnostic and therapeutic approaches. Figure 1 illustrates the procedural steps involved in this study.
Material and methods
Data resource
The single-cell sequencing data for this study were obtained from the Genome Sequence Archive (GSA) (https://ngdc.cncb.ac.cn/gsa/), while the bulk sequencing data were sourced from the Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/). The dataset CRA002181 from the GSA database includes ATAC and single-cell sequencing data from decidual tissues of patients with RPL and of the NP group. The cells within this dataset had undergone CD45+ flow cytometry-based selection [17]. This study used only the single-cell data from this dataset. The GSE165004 dataset from the GEO database comprises RNA expression data annotated by platform GPL16699 and encompasses 24 subjects with recurrent pregnancy loss, 24 with unexplained infertility, and 24 healthy fertile women. The GSE26787 dataset, based on the GPL570 platform, contains five normal samples, five cases of repeated implantation failure (IF), and five samples from individuals with RPL. For the purpose of this study, only the RPL and NP groups were selected for analysis.
ScRNA-seq data processing
The droplet-based raw data were processed using Cell Ranger (version 3.0.0) against the GRCh37 human reference genome with default parameters. The data from each batch were first normalized separately using the NormalizeData function and scaled using the ScaleData function in the Seurat pipeline. The batches were then integrated using the Harmony method implemented in the R package 'harmony' (version 0.1.1). We retained cells with between 500 and 10,000 detected genes and less than 20% mitochondrial unique molecular identifiers (UMIs) (Fig S1A), and excluded genes expressed in fewer than three cells, so that only high-quality cells entered the downstream analysis. Downstream analysis was performed using the Seurat package (version 4.3.0) and involved normalization, batch removal, and dimension reduction (Fig S1B-D). The cells were then clustered on the integrated gene expression matrix using Seurat with the resolution parameter, which determines the granularity of the clusters, set to 1.0; this yielded 18 clusters. To validate the reliability of the integration, we also used the Harmony package (version 0.1.1) to integrate the batches from healthy controls and RPL patients, starting from the same gene expression matrix used in Seurat. For cell annotation, we used the singleR package (version 2.2.0) with 'HumanPrimaryCellAtlasData' as the reference dataset. To validate the accuracy of the cell annotation, we also performed manual annotation according to CellMarker (http://xteam.xbio.top/CellMarker/) [18].
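The cell- and gene-level quality filters described in this section can be sketched as follows. This is a minimal illustration in plain Python, not the Seurat code used in the study; the function names `passes_cell_qc` and `keep_gene` are hypothetical, and treating the 500 and 10,000 gene bounds as inclusive is an assumption.

```python
def passes_cell_qc(n_genes, mito_umis, total_umis,
                   min_genes=500, max_genes=10_000, max_mito_frac=0.20):
    """Return True if a cell meets the thresholds stated in the text.

    Assumption: the gene-count bounds are inclusive; the mitochondrial
    fraction must be strictly below 20%.
    """
    if total_umis <= 0:
        return False
    mito_frac = mito_umis / total_umis
    return min_genes <= n_genes <= max_genes and mito_frac < max_mito_frac


def keep_gene(n_cells_expressing, min_cells=3):
    """Return True if a gene is detected in at least `min_cells` cells."""
    return n_cells_expressing >= min_cells
```

Applying `passes_cell_qc` to every cell and `keep_gene` to every gene reproduces the logic of the filter, although the exact boundary handling in the original pipeline is not stated.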
We further classified dNK cells into four subtypes: dNK1, dNK2, dNK3, and dNKp, each characterized by a specific marker expression pattern. dNK1 cells are identified by the expression of CD39 (ENTPD1), CYP26A1, and B4GALNT1. dNK2 cells are defined by the expression of ANXA1 and ITGB2. dNK3 cells share the marker ITGB2 with dNK2 cells, but they also express CD160, KLRB1, and CD103 (ITGAE). Notably, dNK3 cells do not express the innate lymphocyte cell marker CD127 (IL7R) [19].
Cellchat analysis
To investigate cellular communication and its role in disease, we used the 'CellChat' package (version 1.6.1), which allows a detailed analysis of signaling mechanisms and recognizes different levels of signal change [20]. Using cell communication analysis, we explored the interactions of NK-related subsets with other cell types. Specifically, we compared the interactions of these subsets with those of other NK cells and examined the differences and connections between NK cells and non-NK cells.
High-dimensional Weighted Gene Co-expression Network Analysis (HdWGCNA)
The High-dimensional Weighted Gene Co-expression Network Analysis (hdWGCNA) method was used in this study to analyze high-dimensional datasets such as single-cell RNA sequencing data [21]. This approach enables the construction of cell type-specific co-expression networks, the identification of robust modules of interconnected genes, and the biological contextualization of these modules. We employed hdWGCNA (R package version 0.2.24) to explore the functionality of dNK cells in RPL and to identify characteristic genes within the RPL-associated dNK subgroups.
Enrichment analysis
We conducted Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses using the clusterProfiler R package (version 4.0.2). Gene sets with p-values less than 0.05 were considered significantly enriched and were visualized using bar charts and chord diagrams. The PPI network was constructed using the R package STRINGdb (version 2.12.1) with score_threshold set to 400 and all remaining parameters left at their default values.
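Over-representation analyses of this kind are typically based on a hypergeometric test. The sketch below shows that calculation in plain Python as an illustration only; the function name `hypergeom_enrichment_p` and its argument names are hypothetical, and it does not reproduce clusterProfiler's background definition or any multiple-testing adjustment.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_background: number of genes in the background universe
    n_annotated:  background genes annotated to the term or pathway
    n_selected:   genes in the list being tested (e.g. module genes)
    n_overlap:    selected genes that are annotated to the term
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    favourable = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return favourable / total
```

A term would be called enriched under the criterion above when this p-value falls below 0.05.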
Establishment of diagnostic models by machine learning
Diagnostic models were established using the R package "mlr3" [22] and related packages. The neural network was implemented using the Python library Keras in TensorFlow (version 2.8.0) [23]. The samples from GSE165004 were randomly divided at a 1:1 ratio: one half was used for modelling (training set) and the other half served as the test set. The models established in this discovery cohort were further verified using an independent cohort from GSE26787 (validation set). To limit overfitting, we carefully selected relevant features and used cross-validation. The contribution of each indicator to the model was also evaluated.
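The 1:1 random division into training and test sets can be sketched as below. This is an illustrative stand-in using only the Python standard library, not the mlr3 resampling actually used; the function name `split_half` and the fixed seed are assumptions, and the split is not stratified by group.

```python
import random


def split_half(sample_ids, seed=0):
    """Randomly split sample identifiers into two halves (training, test)."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    mid = len(ids) // 2
    return ids[:mid], ids[mid:]
```

With the 48 RPL and NP samples selected from GSE165004, this yields 24 training and 24 test samples.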
Immune infiltration
Non-negative matrix factorization (NMF) has been applied to detect latent characteristics in gene expression patterns by decomposing the original matrix into two non-negative matrices [24]. Prior to analysis, we preprocessed the data by normalizing it and selecting immunity-related genes. NMF then identified hidden patterns in the data, yielding distinct sample clusters. We assessed immune activity within each cluster using the IOBR R package (version 0.99.9) [25], computing immune scores that quantify immune-related gene expression patterns. The immune scoring was based on the MCPCounter algorithm [26]. Statistical analyses were conducted on the immune scores to identify clusters with significant immune activity and to explore potential correlations with clinical parameters.
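For reference, the standard NMF formulation can be written as follows. The symbols V, W, H, m, n, and k are introduced here for illustration, and the Frobenius-norm objective is one common choice; the source does not state which objective or solver was used.

```latex
V \approx W H, \qquad
V \in \mathbb{R}_{\ge 0}^{m \times n},\;
W \in \mathbb{R}_{\ge 0}^{m \times k},\;
H \in \mathbb{R}_{\ge 0}^{k \times n},
\qquad
\min_{W \ge 0,\, H \ge 0} \; \lVert V - W H \rVert_F^2
```

Here V would hold the expression of m genes across n samples, k is the factorization rank, and each sample is assigned to the cluster whose row of H carries its largest coefficient.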
Hub gene and ROC analysis
We compared the expression levels of the hub genes selected through machine learning between the NP group and the RPL group using the Wilcoxon rank sum test, with p-values less than 0.05 considered significant. We then performed ROC analysis on the expression levels of these genes to evaluate their diagnostic value. After preprocessing and normalization of the expression data from RPL and NP samples, we constructed ROC curves and calculated AUC values to assess the discriminatory power of the hub genes. Statistical tests were conducted to determine the significance of the AUC values, and cross-validation was used for validation.
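The AUC of a single marker is closely related to the rank sum statistic: it equals the probability that a randomly chosen sample from one group scores higher than a randomly chosen sample from the other. The sketch below computes it by direct pairwise comparison; the function name `auc_from_scores` is hypothetical and this is not the ROC software used in the study.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg) over all sample pairs."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))
```

For a marker that is lower in the disease group, passing the disease group as `pos_scores` gives an AUC below 0.5; swapping the two arguments returns the complementary value.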
Sources of clinical samples
Decidual tissues were obtained from eight patients with recurrent pregnancy loss in whom the embryos had ceased development and could not be promptly expelled from the uterus. Eight normal decidual tissue samples were obtained from artificial terminations of pregnancy at 8–10 weeks. The diagnostic criteria for RPL followed those of ACOG and ASRM. Samples were obtained with the patients' consent, and the study was approved by the Ethics Committee of the First Affiliated Hospital of Zhejiang Chinese Medical University.
RNA extraction and real-time quantitative PCR
Total RNA was extracted according to the TRIzol reagent protocol. The collected tissue or cells were thoroughly homogenized and lysed in TRIzol reagent at room temperature for 5 min, after which chloroform was added and mixed thoroughly. The sample was centrifuged to separate an upper aqueous phase from a lower organic phase, and the aqueous layer was carefully transferred to a new RNase-free tube. An equal volume of isopropanol was then added, mixed gently, and left to stand at room temperature. After centrifugation, the supernatant was discarded, and the RNA was washed with 75% ethanol and air-dried for 10 min at room temperature. Real-time quantitative PCR was performed on a QuantStudio 6 Flex System (Life Technologies). Each sample was assayed in triplicate using the EnTurbo™ SYBR Green PCR SuperMix kit (ELK Biotechnology, China). The primers used for PCR amplification are detailed in Table 1.
Immunohistochemistry
After deparaffinization and rehydration, the tissue sections underwent heat-induced epitope retrieval. Hydrogen peroxide was applied to inhibit endogenous peroxidase activity, and serum was used to block non-specific binding. The sections were then incubated overnight at 4 °C with primary antibodies against TNF receptor-associated factor 3 (TRAF3; Proteintech, China). Following thorough washing, the sections were exposed to secondary antibodies conjugated with horseradish peroxidase. Finally, diaminobenzidine was used as the chromogen, and the sections were counterstained with hematoxylin. Positive staining for TRAF3 was observed under a Nikon microscope (Japan).
Molecular docking
CB-Dock 2 (https://cadd.labshare.cn/cb-dock2/php/index.php/) [27] is an online molecular docking tool that automatically identifies the binding site of the target protein, calculates the center position, defines the docking box size, and uses the AutoDock Vina program for docking. We downloaded the three-dimensional crystal structures of the core target proteins from the UniProt database (https://www.uniprot.org/) and saved them in PDB format. The drug structures were obtained from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/). The drug structure and target protein files were uploaded to the CB-Dock 2 server to obtain docking results, and the docking mode with the lowest Vina score was selected as the optimal mode. An affinity score below −4.25 kcal/mol indicates that the component can bind to the target, a score below −5.0 kcal/mol suggests good binding capability, and a score below −7.0 kcal/mol indicates strong binding activity [28]. Visualization was performed using Discovery Studio 2019 software (https://www.3ds.com/).
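The interpretation thresholds above can be expressed as a small lookup. This is a sketch for illustration; the function name `classify_vina_score` and the category labels are hypothetical, and only the three cut-offs come from the text.

```python
def classify_vina_score(score_kcal_per_mol):
    """Map a Vina affinity score to the qualitative categories in the text."""
    if score_kcal_per_mol < -7.0:
        return "strong binding"
    if score_kcal_per_mol < -5.0:
        return "good binding"
    if score_kcal_per_mol < -4.25:
        return "can bind"
    return "below binding threshold"
```

Because the cut-offs are strict inequalities, a score exactly equal to a threshold falls into the weaker category.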
Statistical analysis
Statistical significance was determined using a two-tailed Student’s t-test, with p-values less than 0.05 considered significant. Correlation analysis was conducted using the cor function in R (version 4.1.3).
Results
Analysis of decidua tissue of RPL by single-cell RNA-seq
We downloaded the single-cell data of CRA002181 from the GSA database, comprising 2 RPL and 3 NP samples. After cell quality control and strict batch removal, 16,797 cells were retained for subsequent analysis. Using the FindClusters function from the Seurat package with the resolution set to 1.0, we divided all cells into 18 cellular subgroups (Fig S1E). We then applied the singleR package for cell annotation, using HumanPrimaryCellAtlasData as the reference dataset [29]. The 18 cellular subgroups were annotated into 6 cell types, namely DC, endothelial cells, macrophages, monocytes, NK cells, and tissue stem cells (Fig S1F) (Table S1). The cell clusters were visualized using UMAP dimensionality reduction (Fig. 2A). Comparing cell proportions between the two groups, we found that the proportion of NK cells was significantly higher in the RPL group than in the NP group (P < 0.05), while the proportion of macrophages was significantly lower (P < 0.05) (Fig. 2B). To validate the accuracy of the cell annotation, we compared the singleR results with the annotation criteria provided on the CellMarker website. We annotated THY and FAP positive cells as tissue stem cells, XCR1 and CLEC9A positive cells as DC, VWF and PECAM1 positive cells as endothelial cells, S100A12, FCN1, and S100A9 positive cells as macrophages, and NKG7, GNLY, CD247, and GZMB positive cells as NK cells. The singleR annotation results proved reliable and were used for further analysis (Fig. 2C).
Fig. 2 Single-cell analysis of decidual tissues in recurrent pregnancy loss patients. A Visualization of cell clusters in decidual tissues from both RPL and normal pregnancy (NP) groups, using UMAP dimensionality reduction. Each cluster is color-coded to represent distinct cell types based on gene expression profiles. B Comparison of cell cluster proportions between the RPL and NP groups, highlighting a significant increase in NK cells and a decrease in macrophages in the RPL group. C Markers used for the annotation of various cell types, with specific markers for NK cells, macrophages, and other immune cells within the decidua. D Distribution of NK cell subsets in the RPL and NP groups, showing the relative abundance of NK cells in each group. E Breakdown of NK cell subsets between the RPL and NP groups, illustrating differences in the proportions of subtypes such as dNK2 and dNK3. F Marker expression for the annotation of NK cell subsets, providing a detailed view of the distinguishing characteristics of each subset. G Annotation of NK cells in both RPL and NP groups based on gene expression markers, confirming the identity of dNK1, dNK2, dNK3, and dNKp subsets. H Distribution of annotated NK cell subsets across the RPL and NP groups, showing a higher prevalence of dNK2 and dNK3 cells in the RPL group compared to controls
We used the subset function in 'Seurat' to extract NK cells from the dataset and then performed dimensionality reduction and clustering at a resolution of 1.0, which divided the NK cells into nine subgroups (Fig. 2D). Subgroups 2, 5, 6, and 8 made up a significantly higher proportion of the NK cell population in the RPL group than in the NP group (Fig. 2E). We therefore designated these clusters (2, 5, 6, and 8) as RPL-associated NK cells. We further annotated the NK cells according to their expression of specific markers, assigning them to the four known dNK subgroups described by Vento-Tormo et al. [19]. NK cells positive for ENTPD1, CYP26A1, and B4GALNT1 were annotated as dNK1. NK cells positive for ANXA1 and ITGB2 were annotated as dNK2. NK cells positive for CD160, KLRB1, and ITGAE were annotated as dNK3. Finally, NK cells positive for MKI67 and PCNA, but negative for IL7R, were annotated as dNKp (Fig. 2F, G). The proportions of dNK2 and dNK3 were higher in the RPL group than in the NP group, and NK cell subgroups 2, 5, 6, and 8 belonged to either dNK2 or dNK3 (Fig. 2H).
Cellchat analysis
Based on the single-cell annotation, we defined subgroups 2, 5, 6, and 8 of the NK cells as recurrent pregnancy loss-associated NK cells and the remaining subgroups as other NK cells. We then conducted cellular communication analysis between these NK cells and the other cell types. Macrophages, monocytes, and NK cells exhibited the highest number of interactions and the highest interaction weights (Fig. 3A, B). We next analyzed the ligands involved in cellular communication. Treating NK cells as the receptor source, we divided them into an RPL-related NK cell group and an other NK cell group, and retrieved the 'secreted signaling' interactions from CellChatDB. RPL-related NK cells primarily interacted with monocytes through the Macrophage Migration Inhibitory Factor (MIF) signaling pathway, whereas other NK cells did not interact with monocytes via MIF (Fig. 3C). Within the MIF signaling pathway network, interactions between RPL-related NK cells and monocytes were notably enriched (Fig. 3D). Among all incoming signals, the MIF pathway had the highest proportion, and MIF signaling in RPL-related NK cells significantly surpassed that in other NK cells (Fig. 3E). Across all incoming and outgoing signaling pathways, monocytes had the highest number and intensity of interactions, and the number and intensity for RPL-related NK cells were significantly higher than for other NK cells (Fig. 3F).
Fig. 3 CellChat analysis of decidual tissues in recurrent pregnancy loss patients. A The total number of interactions between various cell subsets, highlighting the frequency of communication within the decidual tissue in RPL patients. B Analysis of the interaction strength between cell subsets, illustrating the varying levels of signaling intensity across different cell populations. C Communication pathways between RPL-related NK cells and other NK cells, detailing specific receptor-ligand interactions that may be altered in RPL. D A detailed representation of the MIF (macrophage migration inhibitory factor) signaling network, showcasing interactions among different cell subsets involved in the signaling pathway. E Heatmaps showing that RPL-related NK cells exhibit the strongest correlation with MIF signaling compared to other NK cell subsets. F Analysis of incoming and outgoing interaction strength for each cell subset, providing insight into the directional nature of cell signaling and intercellular communication within the decidual microenvironment of RPL patients
HdWGCNA
We further applied the hdWGCNA method to screen for key gene clusters within the NK cell subpopulation. The scale-free topology fit index was plotted against various soft-thresholding powers (Fig. 4A). The analysis suggested an optimal power of 0.4, where the scale-free fit index reached 0.4. Eigengenes were computed for the eight modules to summarize the expression profiles within each module (Fig. 4B). Hierarchical clustering resulted in the formation of distinct gene modules, represented by different colors (Fig. 4C). Gene significance was assessed in relation to various traits, with each module displaying a unique distribution of significance levels (Fig. 4D). The genes were color-coded according to their module membership. A heatmap of module-trait relationships was generated to explore the correlation between the identified modules and traits (Fig. 4E) (Table S2). The RPL-related NK cell subpopulations identified in the single-cell analysis (2, 5, 6, 8) were primarily enriched in the black, yellow, and brown modules (Fig. 4F). We therefore extracted the genes from these three modules for further analysis.
Fig. 4 hdWGCNA analysis of genes in NK cells in RPL patients. A Selection of the soft power for running hdWGCNA. Max, median, and mean connectivity are shown. B Gene module distribution map of NK cell-related genes, which were divided into 8 gene modules. C Dendrogram from hdWGCNA representing the clustering of NK cell genes in RPL patients, highlighting the hierarchical relationships between gene modules. D Distribution of module scores in NK cell subsets associated with RPL, illustrating the variation in gene expression across different subsets. E Correlation analysis of all identified gene modules, examining their relationships to various traits. F Dot plot visualizing the enrichment of RPL-associated NK cell clusters in the black, yellow, and brown gene modules, indicating the predominant gene expression profiles in these subsets
GO, KEGG and PPI analysis
We conducted GO, KEGG, and PPI analyses on the genes within the black, yellow, and brown modules obtained from hdWGCNA. GO analysis revealed significant enrichment in molecular functions related to MHC protein complex binding and ATP-dependent chromatin remodeling (Fig. 5A). KEGG pathway enrichment highlighted key pathways involved in cancer, infection, and signaling, suggesting a complex interplay between genetic regulation and disease (Fig. 5B). The protein–protein interaction (PPI) network centered on NFκB1, indicating its potential as a central regulatory hub. Its interactions with proteins such as CCL5 and HSPs suggest a network responsive to cellular stress and signaling (Fig. 5C).
Model selection and performance evaluation
We selected the 25 genes with the highest kME values from each of the black, yellow, and brown modules obtained through hdWGCNA, giving 75 genes in total for univariate logistic regression and LASSO regression analysis. LASSO regularization identified the optimal lambda at the minimal binomial deviance, indicating the most parsimonious model (Fig. 6A). The coefficient paths of the predictors confirmed the effectiveness of feature selection as lambda increased (Fig. 6B). After LASSO selection, a total of 10 genes, namely CNOT6L, TRAF3, SOD1, VIM, DNAJA1, GNG2, SPAG9, ARID4B, PPP1R16B, and CUL3, were included in the diagnostic model. Comparative analysis of the machine learning models revealed that the naive Bayes (NB) model outperformed the others in terms of AUC, suggesting higher predictive accuracy (Fig. 6C). ROC curve analysis showed that the NB model achieved an AUC of 0.98 on the training set and 0.80 on the test set (Fig. 6D and E). The neural network's learning curve demonstrated a consistent decrease in loss across epochs, affirming the model's learning efficiency (Fig. 6F). Lastly, the ROC curves of the finalized model yielded AUCs of 0.98 on the training set and 0.84 on the test set, supporting the model's robustness and generalizability (Fig. 6G and H).
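To make the naive Bayes classifier concrete, the sketch below implements a minimal Gaussian variant in plain Python. It is an illustration under stated assumptions rather than the mlr3 learner used in the study: the Gaussian class-conditional densities, the function names `fit_gaussian_nb` and `predict_gaussian_nb`, and the small variance floor are all assumptions.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate class priors and per-feature mean/variance for each class."""
    model = {}
    n_total = len(y)
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small floor keeps the variance strictly positive (assumption).
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / n_total, means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_logp = None, -math.inf
    for label, (prior, means, variances) in model.items():
        logp = math.log(prior)
        for v, m, var in zip(x, means, variances):
            logp += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label
```

In this setting each row of `X` would hold the expression values of the 10 selected genes for one sample and `y` the corresponding RPL or NP label.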
Fig. 6 Construction of the machine learning diagnostic model for NK cell MIF pathway-related genes. A Feature selection by the LASSO algorithm for NK cell MIF pathway-related genes. B Coefficient changes of the selected features under the LASSO algorithm. C Machine learning algorithms used to build the diagnostic model for NK cell MIF pathway-related genes. Seven machine learning algorithms from the mlr3verse package (version 0.2.7) were used. D ROC values of all seven algorithms in the training set. E ROC values of the naive_bayes model in the test set. F Learning curve of the neural network showing a consistent decrease in loss across epochs, indicating the model's efficient learning. G ROC curve for the training set of the neural network model. H ROC curve for the test set of the neural network model
NMF and immune infiltration analysis
The NMF rank survey indicated optimal factorization ranks for data stratification. Metrics such as cophenetic correlation, dispersion, explained variance, and silhouette scores were plotted against ranks 1 to 10. The highest cophenetic correlation and evar were observed at rank 2, suggesting a strong agreement within the clusters and a significant proportion of variance explained, respectively (Fig. 7A). The consensus matrix from NMF clustering demonstrated three distinct clusters with varying degrees of consensus. Higher values in the matrix indicated stronger agreement within clusters, depicted by the intensity of the blue color. The dendrogram on the left and the color-coded bar above the matrix corresponded to the identified clusters, with the silhouette width confirming the robustness of the clustering (Fig. 7B). Non-negative Matrix Factorization (NMF) clustering revealed distinct patient subtypes based on cell proportions and gene expression profiles. In particular, the analysis of NK cell proportions showed significant variations across the identified subtypes. Subtype 1 demonstrated a notably higher NK cell proportion compared to Subtypes 2 and 3, with statistical significance marked by asterisks (Fig. 7C). Furthermore, the expression of genes associated with NK cell function varied significantly between subtypes, as indicated in the gene expression heatmap (Fig. 7D).
Fig. 7 Non-negative matrix factorization (NMF) and immune infiltration analysis of gene expression according to hub genes. A NMF rank survey of gene expression. B Consensus heatmap showing the division of RPL samples into three consensus clusters by the NMF algorithm. C Immune infiltration scores of the different NMF groups defined by the hub genes. D The hub genes effectively differentiated NK cell infiltration levels. The immune infiltration analysis was performed using the R package IOBR (version 0.99.9)
Differential gene expression and ROC analysis
Using the machine learning methodology, we derived a diagnostic model for RPL comprising 10 genes, namely CNOT6L, TRAF3, SOD1, VIM, DNAJA1, GNG2, SPAG9, ARID4B, PPP1R16B, and CUL3. We conducted gene expression analysis and ROC analysis on these genes within the GSE165004 and GSE26787 datasets. Gene expression analysis in RPL tissues revealed a significant alteration of hub genes compared to normal pregnancy (NP), with some genes showing marked overexpression in the RPL group, as seen in the violin plot of GSE165004 (Fig. 8A). This differential expression was statistically significant across several genes, suggesting a potential molecular underpinning for RPL. ROC analysis of the same dataset supported these genes' discriminative power between RPL and NP, with satisfactory AUC values (Fig. 8B). A similar trend of hub gene overexpression in RPL was observed in an additional dataset, GSE26787, supporting the initial findings and indicating reproducibility across datasets (Fig. 8C, D). RT-PCR analyses of decidual tissues from the RPL and NP groups confirmed the overexpression of these genes, with significant differences lending weight to the microarray data (Fig. 8E). The ROC analysis derived from the RT-PCR data aligned with the previous results, suggesting that the expression profiles could serve as potential biomarkers for RPL (Fig. 8F). In contrast, TRAF3 showed significantly higher expression in the NP group than in the RPL group in both datasets and in the RT-PCR results, and the ROC analysis demonstrated a high diagnostic efficiency for TRAF3. We therefore validated these results further by immunohistochemistry for TRAF3 on clinical samples, which indicated decreased TRAF3 expression in RPL decidual tissues (Fig. 8G, H).
Hub gene expression, ROC analysis and clinical sample validation. A Hub gene expression in RPL and NP groups in GSE165004. *P < 0.05, **P < 0.01, ***P < 0.001. B ROC analysis in RPL and NP groups in GSE165004. C Hub gene expression in RPL and NP groups in GSE26787. **P < 0.01. D ROC analysis in RPL and NP groups in GSE26787. E Comparison of RT-PCR results of decidua tissues between 8 RPL and NP groups. **P < 0.01, ***P < 0.001. F ROC analysis in RPL and NP groups based on RT-PCR results. G TRAF3 expression was decreased in RPL by immunohistochemistry
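As an illustrative aside, the AUC from a single-gene ROC analysis can be read as the probability that a randomly chosen RPL sample is ranked as more RPL-like than a randomly chosen NP sample. The Python sketch below computes this by direct pairwise comparison. The expression values are hypothetical placeholders and not data from this study, and the function name `roc_auc` is our own. Because TRAF3 is lower in RPL, expression is negated so that a higher score points toward RPL.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs in which the case
    scores higher than the control; ties count as half a pair."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical normalized TRAF3 expression values (NOT from the study).
np_expr = [7.9, 8.4, 8.1, 8.8, 7.6]
rpl_expr = [6.9, 7.2, 7.7, 6.5, 8.0]

# Lower TRAF3 suggests RPL, so negate expression to obtain an "RPL score".
auc = roc_auc([-x for x in rpl_expr], [-x for x in np_expr])
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation. With only a handful of samples per group, a single reclassified sample shifts the estimate noticeably.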
Molecular docking
Six compounds related to recurrent pregnancy loss and TRAF3 were selected from the CTD database. Among these, two compounds are therapeutic, while the remaining four compounds may be associated with disease markers or mechanisms. The two drugs studied are Diethylstilbestrol and Metformin. Both drugs were predicted to interact with TRAF3, with Diethylstilbestrol exhibiting an affinity of −6.2 kcal/mol (Fig. 9A) and Metformin exhibiting an affinity of −4.5 kcal/mol (Fig. 9B). The binding energies indicate that Diethylstilbestrol has the stronger predicted affinity for TRAF3, while Metformin can also interact with TRAF3, suggesting that TRAF3 may be a relevant target mediating the therapeutic effects of these drugs in recurrent pregnancy loss.
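To give the docking scores a more intuitive scale, a binding free energy can be converted to an approximate dissociation constant using the standard thermodynamic relation Kd = exp(ΔG / RT). The Python sketch below applies this to the two reported affinities. It assumes a temperature of 298.15 K and treats the docking score as if it were a true binding free energy, which is only a rough approximation because docking scoring functions are empirical estimates. The function name `kd_from_dg` is our own.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol, temp_k=298.15):
    """Approximate dissociation constant (mol/L) from a binding free
    energy in kcal/mol, using Kd = exp(dG / (R * T))."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temp_k))


# Docking affinities reported in the text (kcal/mol).
affinities = {"Diethylstilbestrol": -6.2, "Metformin": -4.5}

for name, dg in affinities.items():
    kd_micromolar = kd_from_dg(dg) * 1e6
    print(f"{name}: dG = {dg} kcal/mol, approximate Kd = {kd_micromolar:.0f} uM")
```

Under these assumptions, the two scores correspond to dissociation constants in the tens and hundreds of micromolar range respectively, a difference of roughly one order of magnitude, which is why experimental validation of the predicted binding remains necessary.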
Discussion
Decidual Natural Killer (dNK) cells, a specialized subset of Natural Killer (NK) cells, play a critical role in pregnancy, particularly in the establishment and maintenance of the maternal–fetal interface. These cells, predominantly found in the decidua during early pregnancy, differ from peripheral-blood NK cells in that they are less cytolytic and instead release cytokines and chemokines which facilitate crucial processes such as trophoblast invasion, tissue remodeling, embryonic development, and placentation [9, 30]. Their function extends to maintaining maternal–fetal immune tolerance, promoting extravillous trophoblast cell invasion, and driving the remodeling of uterine spiral arteries, essential for a successful pregnancy [31]. In the context of RPL, alterations in the functionality or quantity of dNK cells have been implicated. The imbalance in dNK cell regulation can lead to inadequate maternal–fetal tolerance or impaired placental development, contributing to the pathogenesis of RPL. This underscores the critical balance dNK cells maintain in pregnancy, where any disruption in their activity can lead to significant pregnancy complications, including RPL [9, 32].
We performed single-cell sequencing analysis on the decidual tissues of the RPL and NP groups and annotated the dNK cells. We found that the levels of dNK2 and dNK3 were significantly higher in the RPL group compared to the NP group. This finding is consistent with previous studies [17, 19]. The elevated proportions of dNK2 and dNK3 cells in RPL patients suggest that these subtypes play a critical role in the altered immune landscape of pregnancy, potentially contributing to the pathogenesis of RPL. Both dNK2 and dNK3 cells are involved in key immune functions such as cytokine secretion, regulation of trophoblast invasion, and vascular remodeling, all of which are essential for the establishment and maintenance of pregnancy. Their increased abundance in RPL may disrupt these processes, leading to an impaired maternal–fetal immune tolerance, inadequate placental development, and increased immune activation, which are known factors in RPL. Understanding the mechanistic contributions of these dNK cell subtypes could open avenues for targeted therapies aimed at restoring immune balance and improving pregnancy outcomes in women with RPL. Future studies should focus on elucidating the specific molecular pathways through which dNK2 and dNK3 cells contribute to the immune dysregulation observed in RPL. Furthermore, we have observed that RPL-associated NK cells primarily interact with monocytes through the MIF signaling pathway, while other NK cells do not engage in interactions with monocytes via MIF. Our finding suggests that the occurrence of RPL is likely associated with the interaction between NK cells and monocytes, and this interaction is mediated through the MIF signaling pathway. The Macrophage Migration Inhibitory Factor (MIF) pathway plays a critical role in various physiological and pathological processes, particularly in autoimmunity and inflammatory responses. 
MIF is a multifunctional molecule that can regulate glucocorticoid-mediated immunosuppression and is involved in cell survival signaling. It is also associated with disease susceptibility and clinical severity in various autoimmune diseases [33, 34]. MIF's unique structure has led to the development of targeted biologic and small-molecule therapeutics for rheumatic diseases. Regarding the relationship between RPL and the MIF pathway, studies have indicated the involvement of altered immunological mechanisms in RPL. However, the exact pathophysiological mechanisms and the role of specific immunological pathways, including the MIF pathway, remain incompletely understood. Research on the MIF pathway in the context of pregnancy, particularly during the first trimester, has shown that MIF is secreted by the human placenta and plays a role in promoting trophoblast survival and suppressing apoptosis. This function is especially crucial under stress conditions like hypoxia/re-oxygenation, which are implicated in pregnancy complications such as miscarriage and pre-eclampsia [35]. While these studies provide some insight, the direct connection between the MIF pathway and RPL requires further investigation. The current understanding indicates a potential role of MIF in placental development and response to stress, which could be relevant in the context of RPL.
We applied the hdWGCNA method to further perform module clustering on the highly variable genes within dNK cells in the RPL group. The results revealed three modules, namely black, yellow, and brown, which are associated with RPL-related NK cells. The application of hdWGCNA in single-cell sequencing represents a significant advancement in understanding the complex interactions and co-expression networks within biological systems. hdWGCNA, a comprehensive framework for analyzing high-dimensional transcriptomics data, including single-cell and spatial RNA sequencing, facilitates network inference, gene module identification, gene enrichment analysis, statistical testing, and data visualization. Its ability to perform isoform-level network analysis using long-read single-cell data is particularly noteworthy. hdWGCNA has proven to be a valuable tool for identifying disease-relevant co-expression network modules in various conditions [21]. This study applies this analytical technique to the single-cell analysis of RPL, demonstrating its compatibility with widely used transcriptomics analysis packages like Seurat and its scalability for handling large datasets.
In our study, the enrichment analysis of MIF-related genes in dNK cells reveals a pronounced association with MHC protein complex binding, highlighting their potential role in antigen processing and immune surveillance. This is consistent with reports that NK cells adapt their functions in response to MHC class I molecule expression, a process vital for their regulatory roles in immune responses [36]. The identified pathways, such as the Chemokine signaling pathway, suggest a broader influence on NK cell trafficking and function. The PPI network, showing interactions with key regulators like NFKB1 and CCL5, further underscores the complex regulatory mechanisms underpinning NK cell activity and their responses to inflammatory signals. The interaction between the MIF signaling pathway and TRAF3 in the context of RPL represents a crucial area of investigation in understanding immune dysregulation during pregnancy. TRAF3, a key modulator of immune signaling, is involved in regulating NF-kB and type I interferon production, both of which are essential for maintaining immune tolerance at the maternal–fetal interface. Dysregulation of these pathways can lead to impaired maternal immune tolerance and contribute to pregnancy complications such as RPL. In our study, we propose that the MIF signaling pathway, which plays a significant role in immune responses during pregnancy, may be aberrantly activated in RPL patients. TRAF3 could influence this pathway by modulating the immune response, particularly in dNK cells, which are vital for trophoblast invasion and placental development. The abnormal activation of dNK cells through MIF signaling, in concert with TRAF3’s regulation of inflammatory responses, could result in the disruption of the immune balance, impairing placental function and leading to pregnancy loss. Further studies are needed to elucidate the precise molecular mechanisms underlying these interactions and their contribution to the pathogenesis of RPL.
Based on Cox univariate analysis and Lasso regression analysis, we selected 10 genes (CNOT6L, TRAF3, SOD1, VIM, DNAJA1, GNG2, SPAG9, ARID4B, PPP1R16B, CUL3) to establish a diagnostic model. Various machine learning techniques, including neural networks, were employed. The results demonstrated that this diagnostic model exhibits a high diagnostic efficiency. The integration of machine learning in developing diagnostic models holds the promise of improving clinical decision support systems. The development, validation, and implementation of such ML models require careful consideration to align with the intricacies of healthcare data and the complexity of disease phenotypes. Although Naive Bayes is generally less appropriate for high-dimensional transcriptomic data because of its assumption of feature independence, it demonstrated superior performance in our analysis. This unexpected outcome may be attributed to the data's relative sparsity and the algorithm's efficacy in handling small sample sizes. In contrast, more advanced models such as Random Forests and SVMs underperformed, likely due to parameter tuning challenges and the dataset's high dimensionality, which can lead to overfitting. To address these issues and enhance transparency, we plan to conduct additional research with more rigorous feature selection and parameter optimization on these alternative models, aiming to clarify the underlying causes of the observed performance discrepancies. Chen et al. highlight the necessity of a meticulous approach to increase the likelihood of enhancing patient care through these technologies [37]. However, our study is limited by its sample size. We used a training set consisting of 24 RPL patients and 24 NP patients, and a validation set consisting of 5 RPL patients and 5 NP patients. Such a small number of samples limits the robustness and generalizability of the results. In the future, we look forward to validating our findings in larger datasets.
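The effect of the small validation set can be made concrete with a confidence interval for a proportion. The Python sketch below computes a 95% Wilson score interval for classification accuracy on 10 validation samples (5 RPL and 5 NP, as stated above). The count of 9 correct predictions is a hypothetical value chosen for illustration and is not a result reported in this study. The function name `wilson_interval` is our own.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.
    z = 1.96 corresponds to an approximate 95% confidence level."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    )
    return center - half_width, center + half_width


# Validation set size from the text: 5 RPL + 5 NP = 10 samples.
n_validation = 10
n_correct = 9  # hypothetical count, for illustration only

low, high = wilson_interval(n_correct, n_validation)
print(f"accuracy = {n_correct / n_validation:.2f}, "
      f"95% Wilson interval = ({low:.2f}, {high:.2f})")
```

Even with 9 of 10 samples classified correctly, the interval spans roughly 0.60 to 0.98, which illustrates why performance estimates from a validation set of this size should be confirmed in larger cohorts.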
We performed NMF analysis on the expression matrix of RPL patients using these 10 genes, resulting in the identification of 3 subgroups. The MCPcounter scores of NK cells in these 3 subgroups showed significant differences. The expression levels of these genes were found to significantly impact the infiltration of NK cells in RPL patients. However, a limitation of this study is the lack of clinical data in the datasets used, which prevents further analysis.
We validated our findings using multiple datasets and further confirmed them through RT-PCR and immunohistochemistry analysis of clinical samples. Our findings demonstrate a significant alteration in the expression of key genes, notably TRAF3, in RPL tissues compared to NP. This differential expression, consistent across multiple datasets and validated through RT-PCR and immunohistochemistry, suggests these genes, especially TRAF3, could serve as potential biomarkers for RPL. The high diagnostic efficiency of TRAF3 highlights its role in RPL pathophysiology and underscores the potential for targeted therapeutic strategies. TRAF3 encodes a member of the TRAF protein family, which is involved in signal transduction from the TNF receptor superfamily [38]. TRAF3 plays a key role in immune response activation, particularly through its participation in the signal transduction of CD40 and the lymphotoxin-beta receptor signaling complex, which induces NF-kappaB activation and cell death. It is distinct from other TRAF members in that it does not induce activation of the common TRAF-inducible signaling pathways such as NF-kappaB and JNK [29]. Furthermore, TRAF3 has been identified as a negative regulator of the NF-kappaB pathway and a positive regulator of type I IFN production, making it a critical regulator of both innate and adaptive immune responses [39]. In the context of RPL, TRAF3 may play a crucial role in modulating immune responses, which warrants further investigation.
Current research suggests a strong genetic component in RPL. Although we did not identify studies directly linking TRAF3 to RPL, broader genetic research indicates that pathogenic variants in highly penetrant genes can contribute to pregnancy loss. Genetic abnormalities that predispose to pregnancy loss include chromosomal aneuploidy, copy number variants, and single-gene changes. Given TRAF3's significant role in immune response regulation, further investigation into its potential impact on RPL could be a promising area of research. Our study suggests Diethylstilbestrol and Metformin may modulate TRAF3, a protein implicated in immune regulation, potentially offering new therapeutic strategies for recurrent pregnancy loss. Diethylstilbestrol's strong binding affinity (−6.2 kcal/mol) to TRAF3 could influence cellular mechanisms relevant to pregnancy, although its clinical application warrants caution due to historical safety concerns [40]. Metformin, which has a moderate affinity (−4.5 kcal/mol) and is known for its metabolic effects, might also impact TRAF3-related pathways, suggesting a novel application beyond diabetes [41]. These findings highlight the utility of computational tools in drug repurposing and require further experimental validation to explore their clinical implications in recurrent pregnancy loss management.
One aspect that may merit further consideration in our study pertains to the potential influence of confounding factors on dNK cell populations and gene expression within the decidua. Owing to the retrospective nature of this research and the constraints in obtaining detailed clinical data, factors including maternal age, BMI, and underlying medical conditions could not be fully incorporated or adjusted for in our analyses. Looking ahead, we envision future studies that systematically gather comprehensive clinical information and employ sophisticated statistical approaches to better manage these variables, thereby enhancing the precision and robustness of our understanding of the underlying biological mechanisms.
Conclusions
This study provides valuable insights into RPL, focusing on the roles of dNK cells and the MIF signaling pathway. We identified elevated dNK2 and dNK3 subtypes in RPL, suggesting their involvement in immune processes related to pregnancy loss. A diagnostic model incorporating TRAF3 and other gene markers shows promise for early detection of RPL. Additionally, molecular docking identified Diethylstilbestrol and Metformin as potential therapeutic agents targeting TRAF3. Given the small sample size, these findings should be interpreted cautiously, and future research with larger cohorts is needed to validate them and further explore the therapeutic potential of TRAF3.
Data availability
Single-cell sequencing data were accessed from GSA databases (https://ngdc.cncb.ac.cn/gsa/). Bulk sequencing data were accessed from GEO databases (https://www.ncbi.nlm.nih.gov/geo/). The other results have been included in the article and supplementary materials.
Abbreviations
- RPL: Recurrent pregnancy loss
- TRAF3: TNF receptor-associated factor 3
- dNK: Decidual Natural Killer
- MIF: Macrophage Migration Inhibitory Factor
- GO: Gene Ontology
- GSEA: Gene Set Enrichment Analysis
References
Coomarasamy A, et al. Recurrent miscarriage: evidence to accelerate action. Lancet. 2021;397(10285):1675–82.
Hennessy M, et al. A protocol for a systematic review of clinical practice guidelines for recurrent miscarriage. HRB Open Res. 2020;3:12.
Bedaiwy MA, et al. Prevalence, causes, and impact of non-visualized pregnancy losses in a recurrent pregnancy loss population. Hum Reprod. 2023;38(5):830–9.
Garrido-Gimenez C, Alijotas-Reig J. Recurrent miscarriage: causes, evaluation and management. Postgrad Med J. 2015;91(1073):151–62.
Xue D, et al. Quantitative proteomic analysis of sperm in unexplained recurrent pregnancy loss. Reprod Biol Endocrinol. 2019;17(1):52.
Homer HA. Modern management of recurrent miscarriage. Aust N Z J Obstet Gynaecol. 2019;59(1):36–44.
Liu L, et al. TCDD-inducible Poly (ADP-ribose) polymerase promotes adipogenesis of both brown and white preadipocytes. J Transl Int Med. 2022;10(3):246–54.
Zitti B, Bryceson YT. Natural killer cells in inflammation and autoimmunity. Cytokine Growth Factor Rev. 2018;42:37–46.
Zhang X, Wei H. Role of decidual natural killer cells in human pregnancy and related pregnancy complications. Front Immunol. 2021;12:728291.
Bao S, et al. Single-cell profiling reveals mechanisms of uncontrolled inflammation and glycolysis in decidual stromal cell subtypes in recurrent miscarriage. Hum Reprod. 2023;38(1):57–74.
Yamamoto M, et al. Evaluation of NKp46 expression and cytokine production of decidual NK cells in women with recurrent pregnancy loss. Reprod Med Biol. 2022;21(1):e12478.
Greener JG, et al. A guide to machine learning for biologists. Nat Rev Mol Cell Biol. 2022;23(1):40–55.
Stuart T, Satija R. Integrative single-cell analysis. Nat Rev Genet. 2019;20(5):257–72.
Wei P, et al. Identification and validation of a signature based on macrophage cell marker genes to predict recurrent miscarriage by integrated analysis of single-cell and bulk RNA-sequencing. Front Immunol. 2022;13:1053819.
Pan S, et al. Pan-cancer landscape of the RUNX protein family reveals their potential as carcinogenic biomarkers and the mechanisms underlying their action. J Transl Int Med. 2022;10(2):156–74.
He YB, et al. Bulk RNA-sequencing, single-cell RNA-sequencing analysis, and experimental validation reveal iron metabolism-related genes CISD2 and CYP17A1 are potential diagnostic markers for recurrent pregnancy loss. Gene. 2024;901:148168.
Guo C, et al. Single-cell profiling of the human decidual immune microenvironment in patients with recurrent pregnancy loss. Cell Discov. 2021;7(1):1.
Zhang X, et al. CellMarker: a manually curated resource of cell markers in human and mouse. Nucleic Acids Res. 2019;47(D1):D721–8.
Vento-Tormo R, et al. Single-cell reconstruction of the early maternal-fetal interface in humans. Nature. 2018;563(7731):347–53.
Jin S, et al. Inference and analysis of cell-cell communication using CellChat. Nat Commun. 2021;12(1):1088.
Morabito S, et al. hdWGCNA identifies co-expression networks in high-dimensional transcriptomics data. Cell Rep Methods. 2023;3(6):100498.
Lang M, et al. mlr3: A modern object-oriented machine learning framework in R. J Open Source Softw. 2019;4(44):1–3.
Rampasek L, Goldenberg A. Tensorflow: biology’s gateway to deep learning? Cell Syst. 2016;2(1):12–4.
Brunet JP, et al. Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci U S A. 2004;101(12):4164–9.
Zeng D, et al. IOBR: multi-omics immuno-oncology biological research to decode tumor microenvironment and signatures. Front Immunol. 2021;12:687975.
Becht E, et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 2016;17(1):218.
Liu Y, et al. CB-Dock2: improved protein-ligand blind docking by integrating cavity detection, docking and homologous template fitting. Nucleic Acids Res. 2022;50(W1):W159–64.
Hsin KY, Ghosh S, Kitano H. Combining machine learning systems and multiple docking simulation packages to improve docking prediction reliability for network pharmacology. PLoS One. 2013;8(12):e83922.
Aran D, et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat Immunol. 2019;20(2):163–72.
Mahajan D, et al. Role of natural killer cells during pregnancy and related complications. Biomolecules. 2022;12(1):68.
Liu Y, et al. Decidual natural killer cells: a good nanny at the maternal-fetal interface during early pregnancy. Front Immunol. 2021;12:663660.
Jabrane-Ferrat N. Features of human decidual NK cells in healthy pregnancy and during viral infection. Front Immunol. 2019;10:1397.
Kang I, Bucala R. The immunobiology of MIF: function, genetics and prospects for precision medicine. Nat Rev Rheumatol. 2019;15(7):427–37.
Sun Y, et al. Binding domain characterization of growth hormone secretagogue receptor. J Transl Int Med. 2022;10(2):146–55.
Ietta F, et al. Role of the Macrophage Migration Inhibitory Factor (MIF) in the survival of first trimester human placenta under induced stress conditions. Sci Rep. 2018;8(1):12150.
Bessoles S, et al. Adaptations of natural killer cells to self-MHC class I. Front Immunol. 2014;5:349.
Chen PC, Liu Y, Peng L. How to develop machine learning models for healthcare. Nat Mater. 2019;18(5):410–4.
He JQ, et al. TRAF3 and its biological function. Adv Exp Med Biol. 2007;597:48–59.
Bishop GA, Stunz LL, Hostager BS. TRAF3 as a multifaceted regulator of B lymphocyte survival and activation. Front Immunol. 2018;9:2161.
Bamigboye AA, Morris J. Oestrogen supplementation, mainly diethylstilbestrol, for preventing miscarriages and other adverse pregnancy outcomes. Cochrane Database Syst Rev. 2003;2003(3):CD004353.
Al-Biate MA. Effect of metformin on early pregnancy loss in women with polycystic ovary syndrome. Taiwan J Obstet Gynecol. 2015;54(3):266–9.
Acknowledgements
Not applicable.
Funding
This study was funded by Research Projects of Zhejiang Chinese Medical University (No. 2022 JKZKTS26 by YB.H., 2022 JKJNTZ16 by SL.C., 2022 JKJNTZ23 by C.W., ZDFY2023-CD-5 by YL.L.), Zhejiang Provincial Natural Science Foundation of China (No. QN25H270030 by YB.H.), Zhejiang Province Traditional Chinese Medicine Science and Technology Project (No. 2024ZR015 by YB.H., 2023ZL056 by ZZ.Z.), Zhejiang Province Medical and Health Science and Technology Project (No. 2024 KY1201 by YB.H., 2024 KY1225 by C.W., 2024 KY1213 by ZZ.Z.), Hainan Provincial Natural Science Foundation of China (No. 823QN371 by JY.L.), Joint Program on Health Science & Technology Innovation of Hainan Province (No. WSJK2024QN001 by JY.L.), Special Science and Technology Plan Project of Universities and Medical Institutions in Sanya City (No. 2021GXYL32 by JY.L.), Hainan Province Health Industry Scientific Research Project (No. 21 A200333 by YL.L.), Sanya University and Medical Institutions Special Science and Technology Project (No. 2021GXYL29 by YL.L.), Zhejiang Province Young and Middle-aged Clinical Famous Chinese Medicine Practitioners Project (by L.Z.) and Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases.
Author information
Contributions
YB.H.: Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Software, Validation, Visualization and Writing-original draft. JY.L.: Methodology, Software, Validation, Visualization and Writing-original draft. SL.C.: Supervision and Writing-original draft. R.Y.: Data curation. YR.F.: Data curation. SY.T.: Formal analysis, Visualization and Methodology. YX.S.: Formal analysis, Visualization and Methodology. C.W.: Data curation. L.Z.: Validation. J.F.: Methodology. Y.S.: Methodology. ZZ.Z.: Funding acquisition and Supervision. J.C.: Supervision. AZ.Y.: Methodology. J.L.: Methodology. YL.L.: Conceptualization, Data curation, Formal analysis, Funding acquisition and Supervision.
Ethics declarations
Ethics approval and consent to participate
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of The First Affiliated Hospital of Zhejiang Chinese Medical University (KS-071–01, 27 June 2023). The clinical samples were collected and used with the informed consent of the patients themselves.
Consent for publication
All authors give their consent for publication.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
12884_2025_7742_MOESM1_ESM.pdf
Supplementary Material 1: Supplementary Fig 1. Quality Control, Normalization and annotation for Single Cell Analysis (A) Quality Control Metrics: Left to right, plots displaying the number of detected RNA features (nFeature_RNA), total RNA counts (nCount_RNA), and mitochondrial gene percentage (percent.mt) across different cell identities. (B) Elbow Plot: Graph depicting the variance ratio for each number of principal components, aiding in the determination of the optimal number for dimensionality reduction. (C) PCA Scatter Plot: Scatter plot of the first two principal components (PC_1 and PC_2) categorizing different cell identities, each represented by distinct colors. (D) Harmony PCA Scatter Plot: Scatter plot demonstrating the integration of harmony scores with the PCA data, indicating the consistency across samples after batch effect correction. (E) UMAP Clustering: UMAP plots for NP and RPL, displaying data clusters in a two-dimensional space with unique colors for each cluster. (F) Clustering Heatmap: A detailed heatmap with dendrogram showing the hierarchical clustering of various cell types according to their RNA expression profiles, with a color gradient representing cluster scores and labels indicating cell type identities based on ‘singleR’ package.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
He, Yb., Li, Jy., Chen, Sl. et al. TRAF3 as a potential diagnostic biomarker for recurrent pregnancy loss: insights from single-cell transcriptomics and machine learning. BMC Pregnancy Childbirth 25, 637 (2025). https://doi.org/10.1186/s12884-025-07742-6
DOI: https://doi.org/10.1186/s12884-025-07742-6