A two-tier feature selection method for predicting mortality risk in ICU patients with acute kidney injury

Liu, Mengqing; Fan, Zhiping; Gao, Yu; Mubonanyikuzo, Vivens; Wu, Ruiqian; Li, Wenjin; Xu, Naiyue; Liu, Kun; Zhou, Liang

doi:10.1038/s41598-024-63793-3

Download PDF

Article
Open access
Published: 22 July 2024

A two-tier feature selection method for predicting mortality risk in ICU patients with acute kidney injury

Scientific Reports volume 14, Article number: 16794 (2024) Cite this article

2309 Accesses
1 Citations
Metrics details

Subjects

Abstract

Acute kidney injury (AKI) is one of the most important lethal factors for patients admitted to intensive care units (ICUs), and timely high-risk prognostic assessment and intervention are essential to improving patient prognosis. In this study, a stacking model using the MIMIC-III dataset with a two-tier feature selection approach was developed to predict the risk of in-hospital mortality in ICU patients admitted for AKI. External validation was performed using separate MIMIC-IV and eICU-CRD. The area under the curve (AUC) was calculated using the stacking model, and features were selected using the Boruta and XGBoost feature selection methods. This study compares the performance of a stacking model using two-tier feature selection with a model using single-tier feature selection (XGBoost: 85; Boruta: 83; two-tier: 0.91). The predictive effectiveness of the stacking model was further validated by using different datasets (Validation 1: 0.83; Validation 2: 0.85) and comparing it with a simpler model and traditional clinical scores (SOFA: 0.65; APACH IV: 0.61). In addition, this study combined interpretable techniques and causal inference to analyze the causal relationship between features and predicted outcomes.

Construction of a machine learning-based interpretable prediction model for acute kidney injury in hospitalized patients

Article Open access 18 March 2025

Predicting outcomes of acute kidney injury in critically ill patients using machine learning

Article Open access 18 June 2023

Predicting mortality in critically ill patients requiring renal replacement therapy for acute kidney injury in a retrospective single-center study of two cohorts

Article Open access 17 June 2022

Introduction

Acute kidney injury (AKI), a significant factor to inpatient mortality worldwide, affects approximately one-fifth of hospitalized individuals^1,2,3. The International Society of Nephrology's 0 by 25 initiative aims to eradicate preventable AKI-related deaths by the year 2025⁴. Despite considerable recent efforts, identifying effective treatments that substantially enhance renal recovery remains challenging. Early prediction or detection of AKI carries significant clinical implications but poses a substantial hurdle. Early prediction or detection of AKI has significant clinical implications but poses a substantial challenge. To address the limitations of early AKI prediction, researchers have increasingly turned to machine learning methods. However, the success of these models hinges on the selection of relevant features. To this end, diverse feature selection techniques are employed to improve model generalization, stability, and interpretability^5,6,7.

In this context, artificial intelligence (AI) has demonstrated promise for time-sensitive applications in AKI. These applications encompass early identification, warning, and the provision of AKI treatment recommendations^8,9. Machine learning-based models can detect AKI at an early stage, providing clinicians with a chance to intervene earlier and potentially improve patient outcomes^10,11,12. While previous research on AKI prediction has predominantly focused on specific settings such as hospital-acquired AKI¹³, postoperative AKI¹⁴, cancer-related AKI⁸, and critically ill patients in intensive care units (ICUs)^15,16, as well as patients admitted to emergency departments¹⁷, there remains a gap in machine learning models for predicting AKI in general and critically ill patients. The high heterogeneity of patient history data in general hospitals presents a significant challenge for the independent validation of predictive models¹⁸. This is particularly problematic for current machine learning-based mortality risk prediction models, as they often rely on a multitude of patient test results, such as routine blood tests, as input features. The multiplicity of features and the lengthy data collection process pose a significant challenge to model application. Consequently, reducing the number of input features while preserving model accuracy has become a pressing concern. Zhu et al.¹⁹ successfully developed a machine learning model for predicting the risk of death in sepsis patients, achieving a 71% reduction in the number of features. Their findings indicate that even with small samples and low-dimensional data, accurate identification of patients at risk is feasible, enabling early treatment. Additionally, Shen et al.²⁰ and Wu et al.⁷ validated the impact of feature reduction on the stability and accuracy of model prediction performance. Although many models have been proposed to identify patients at risk for AKI, few models can predict the risk of clinically important outcomes (hospital death or dialysis) once a patient develops AKI. The application of models with clinically important predictions may help guide the early treatment of patients with AKI.

In this study, we introduce a machine learning model that employing a two-stage feature selection process to predict in-hospital mortality risk among ICU patients with AKI. Our aim is to identify crucial features for mortality prediction and, as a result, reduce feature dimensionality to enhance model interpretability without sacrificing accuracy.

Results

Study population characteristics

For this study, data from 16,090 initial ICU admissions in the MIMIC III database were collected, with 11,182 patients meeting the inclusion criteria. These data comprised the training set, with 30% allocated for internal validation. Patients were categorized as either dead or surviving. Furthermore, data from MIMIC-IV (validation 1) and eICU-CRD (validation 2) were retrieved for external validation using the same criteria. The mortality and survival data for the training set, internal validation set, and external validation set were statistically analyzed a summarized in Table 1. The training set consisted of 7828 cases, of which 2273 resulted in mortality and 5555 in survival. The internal validation set consisted of 3354 cases with 840 deaths and 2514 survivors (see Supplementary Table 1). External validation set 1 (MIMIC-IV) included 7822 cases with 6705 deaths and 1117 survivors. External validation set 2 (eICU-CRD) consisted of 5928 cases, with 5403 survivors and 525 deaths. Across all three datasets, the proportion of men diagnosed with AKI exceeded that of women, and correspondingly, the mortality rate was also higher among male patients. Additionally, AKI patients aged over 60 exhibited a notably elevated mortality rate compared to patients in other age brackets.

Table 1 The baseline characteristics of AKI patients are analyzed.

Full size table

Feature selection and Model performance

Feature selection involved initial screening with the Boruta algorithm for the first tier (see Supplementary Fig. 1A,B). Subsequently refinement was carried out using XGBoost for the second tier (see Supplementary Fig. 2A). Ultimately, a total of 24 relevant features were identified. Additionally, the features identified using only XGBoost feature selection are depicted in Supplementary Fig. 2B.

The construction of the model is carried out according to the features screened by the Boruta and XGBoost algorithms, respectively, and the prediction effect of the model is evaluated and compared. The evaluation indexes are shown in Fig. 1A–C,A–C′′). As can be seen from the figure, the best-evaluated algorithm is stacking, with an AUC (95% CI) of 0.85 (0.846–0.854) for XGBoost-Stacking (Stacking model construction using XGBoost filtered features) and an AUC (95% CI) of 0.83 (0.828–0.831) for Boruta-Stacking (Stacking model construction using Boruta filtered features). Subsequently, model construction was performed using two-tier feature selection, and different types of algorithms were used for model training. The results showed that the stacking algorithm gave the best prediction with an AUC (95% CI) of 0.91 (0.906–0.915) (Fig. 1E–G). The precision, accuracy, and F1-score of the trained model were evaluated, achieving values of 0.90, 0.89, and 0.90, respectively (Fig. 2A–C). Furthermore, a comparison was conducted between the performance of models developed using single-tier and two-tier feature selection methods (Table 2; see also Supplementary Table 2 for detailed results). This analysis revealed that the two-tier feature selection approach consistently yielded superior prediction performance compared to the single-tier method.

Table 2 Evaluation of models for predicting AKI with two-tier feature selection.

Full size table

To further validate the predictive effectiveness of the developed two-tier feature selection model, we used both internal and external validation sets to evaluate the model's performance. Figure 1E’–G’ demonstrates the validation results on validation set 1, with an AUC (95% CI) of 0.83 (0.830–0.833). Figure 2A’–C’ shows the precision of 0.86, accuracy of 0.81, and F1-score of 0.80 for validation set 1. In Fig. 1E”–G”, we show the validation results of validation set 2 with an AUC (95% CI) of 0.85 (0.845–0.853). Figure 2 A”–C” shows that validation set 2 has a precision of 0.84, an accuracy of 0.80, and an F1-score of 0.80. Comparing the training results with the validation results reveals (Tables 2 and 3) that the AUC values of both the internal and external validation sets are higher than 0.80, indicating that the model predicts well on different datasets. In addition, we compared the constructed model with the traditional clinical scoring systems SOFA and APACHE IV (Fig. 1E, E’, E”). The results showed that on the training set, the AUC was 0.65 for SOFA and 0.61 for APACHE; on the validation set 1, the AUC was 0.71 for SOFA and 0.64 for APACHE; and on the validation set 2, the AUC was 0.62 for SOFA and 0.64 for APACHE. These results indicate that, compared to the traditional clinical scoring system, the constructed model has a better prediction effect. The experimental results of internal validation are presented in Online Supplementary Fig. 3A–C, Fig. 4A–C, and Table 3.

Table 3 Model performance metrics.

Full size table

To achieve a more comprehensive evaluation of the model’s performance, calibration curves and Brier scores were employed. The results are presented in Fig. 3 and Table 4. The Brier scores (with 95% CI) indicated good calibration, with values of 0.103 (0.093–0.113) for the training set, 0.106 (0.096–0.118) for validation set 1, and 0.110 (0.100–0.122) for validation set 2 (Table 4).

Table 4 Model evaluation of predicted AKI using 95% CI for AUC.

Full size table

Model interpretability

The ensemble stacking model with two-tier feature selection utilizes two perspectives for model interpretation: individual and global. From an individual perspective, the interpretation module analyzes feature weights of the base model using the PI technique, as depicted in Fig. 4A–G. The top three important features in the base model are age, BUN, and temperature. During the analysis of the meta-model's feature weights, the predicted outputs of RFs' predicted outputs significantly influenced the final predictions. Table 5 displays the feature weights of the stacking model, which align with the findings of the base model analysis. From a global perspective, a causal diagram based on significant features is presented in Fig. 5. A causal relationship is observed between CL and HB as confounders of BUN and age and BUN as confounders of death, This relationship is represented as CL, HB → BUN → Death. However, determining the specific impact of each detection value is not feasible. Therefore, the influence of specific feature values is analyzed using LIME (Fig. 6). The LIME analysis reveals that the model's predictions vary under different combinations of feature values. For instance, the model is more likely to predict a lower risk of death when BUN ranges from 14.0 to 20.5 and age is ≤ 57 years, while predicting a higher risk of death when INR exceeds 1.5. This personalized analysis helps physicians understand the causal relationships between features and characteristics during model executions, thereby enhancing their comprehension of the models' decision-making processes.

Table 5 Overall model weights.

Full size table

In summary, the stacking ensemble model with two-tier feature selection integrates individual and global perspectives for model interpretation. The analysis of feature weights using the PI technique for both base and stacked models, along with causal diagrams and LIME analyses based on causal inference, enhances understanding of the model's predictive process and provides reliable references for medical decision-making.

Discussion

In the building of predictive models for AKI, logistic regression with backward or forward selection is a common approach for selecting a subset of features for model construction²¹. More recent approaches, methods such as Lasso, Boruta²², and XGBoost²³ have been employed for feature selection in AKI prediction.

However, Lasso methods are typically limited in their adaptability, often relying on linear models or assumptions. In nonlinear scenarios, Lasso methods may fail to accurately capture complex feature relationships, resulting in the selection of insufficient features for effective data interpretation. Logistic regression, which relies on a linear combination of individual features for classification, may not adequately capture feature interactions, potentially impacting prediction accuracy. While Boruta⁶, a feature selection method based on tree models, excels at uncovering complex feature relationships and handling highly correlated features. Nonetheless, it solely focuses on the relationship between features and targets, disregarding the importance of features and models. XGBoost²⁴, a gradient-boosting tree model, excels at capturing complex relationships among features, particularly in nonlinear scenarios. Its feature selection process focuses on the correlation between features and the model.

Several studies have highlighted the importance of feature selection in improving model performance for AKI prediction. Zhou et al.²⁵ demonstrated significant improvements in model predictions by incorporating deep features alongside those extracted using convolutional neural networks (CNNs). Similarly, Zhu et al.¹⁹ observed a substantial enhancement in prediction accuracy following a 71% reduction in feature set size. Based on these findings, we propose that the model prediction performance can be improved by selecting intersecting features and reducing redundant features. Therefore, in our experiments, we first conducted feature selection using the Boruta algorithm to filter out features that correlate with the target value. Subsequently, in the second tier of feature selection, we employed XGBoost to filter out features that correlate with the model. Experimental results demonstrate the superiority of the two-tier feature selection approach over the single-tier approach. The stacking ensemble model exhibited superior predictive performance compared to the baseline model. Notably, the stacking ensemble model with two-tier feature selection achieved the highest predictive performance. Yue et al.²⁶ applied the Boruta algorithm to screen 34 variables and built a random forest model for predicting mortality risk in acute kidney injury patients, achieving an AUROC of 0.82. Yang et al.²⁷ employed the Boruta algorithm to select 36 variables and utilized XGBoost for modeling mortality risk prediction in sepsis-associated acute kidney injury patients, achieving an AUROC of 0.85. In comparison to previous studies, our proposed model achieved an AUROC of 0.91, indicating improved model performance. These results affirm the efficacy of our proposed approach. Furthermore, by employing model interpretable techniques and causal inference, we conducted a causal analysis of factors influencing model predictions. Our findings revealed significant associations between various laboratory tests and the prediction of mortality risk in AKI patients, consistent with previous studies by Son²⁸, Zhang²⁹, and others, further validating the reliability of our model. Taken together with the experimental results, the two-tier feature selection proposed in this study can better predict the risk of death of AKI patients in the ICU, and it can better capture the complexity and diversity of AKI risk by reducing the confounding variables in the model inputs. By combining the predictive power of multiple models, it can provide a more reliable auxiliary diagnosis for clinical decision-making.

However, it is worth noting that a common challenge we faced was that this study was not prospective but retrospective. We chose to use the MIMIC-III dataset, but it is important to note that this dataset does not adequately represent the entire population and the diversity of different clinical practices. This somewhat limits our ability to analyze the problem in depth and make accurate predictions, as well as our ability to generalize the model to real-world applications. It is worth noting that due to the large number of missing urine output indicators in the dataset, we chose to temporarily omit this indicator from our study after referring to the relevant literature^{18,30,31,32,33}. However, recent studies³⁴ suggest that urine output plays a crucial role in the disease progression of AKI. Therefore, we will fully consider urine output as an important indicator in future studies to improve the ability to predict more accurately the risk of death in patients.

Methods

The conceptual framework for our developing two-tier feature selection prediction model is presented in Fig. 7.

Study population

Data for this study were retrieved from three distinct critical care databases: MIMIC-III³⁵, MIMIC-IV³⁶, and eICU-CRD³⁷. The prediction models were developed using the publicly accessible MIMIC-III databases. The data were divided into two sets: 30% of the data were reserved for internal validation, and the remaining 70% were used for model construction. The predictive performance of these models was validated using an entirely independent dataset, the MIMIC-IV and eICU-CRD datasets. MIMIC-III includes critical care data from 46,520 ICU patients admitted to Beth Israel Deaconess Medical Center in Boston between June 1, 2001, and October 31, 2012. This dataset encompasses 26 tables encompassing demographics, admission records, discharge summaries, ICD-9 diagnostic records, vital signs, laboratory measurements, and medication usage. In contrast, MIMIC-IV includes data from over 190,000 patients, 450,000 hospitalizations, and more than 1000 hospital admissions to Beth Israel Deaconess Medical Center (BIDMC) and Massachusetts Institute of Technology (MIT) between 2008 and 2019, totaling 1,000,000 admissions. It offers a broader array of information, covering demographics, laboratory tests, medication usage, vital signs, surgical procedures, and disease diagnoses. Although MIMIC-III and MIMIC-IV may share medical information and data types, their data collection, processing, and dissemination methodologies differ. The MIMIC-IV dataset is broader in scope, spanning more hospitals and patients and covering a longer timeframe. The eICU-CRD Collaborative Research Database (eICU-CRD) is a large public database created by MIT in collaboration with the Laboratory for Computational Physiology (LCP). The database is a completely independent dataset that brings together data from many hospitals within the United States, expanding the scope of the study by providing data from multiple centers. The database covers routine data on more than 200,000 patients admitted to intensive care units in 2014 and 2015 and includes a wealth of high-quality clinical information such as physiological parameters, laboratory results, medication records, and diagnostic information. The data are presented in both structured and unstructured forms and are automatically collected from monitoring equipment, electronic medical records, and other healthcare information systems.

For each patient sample, the following information was collected: (1) Demographic characteristics: including gender, age in years, and survival status; (2) Vital signs: including heart rate (HR, beats/min), respiratory rate (Resp, beats/min), body temperature (Temp, degrees Celsius), and pain (pain, not applicable); (3) Laboratory parameters: including blood urea nitrogen (BUN, mg/dL), creatinine (Creatinine, mg/dL), glucose (GLU, mg/dL), bicarbonate (HCO₃, mmol/L), international normalized ratio (INR), potassium (K), potassium (K, mmol/L), sodium (Na, mmol/L), partial pressure of carbon dioxide (PCO₂, mmHg), prothrombin time (PT, s), white blood cell count (PCR, mmol/L s), white blood cell count (WBC, in 10³/μL), chloride (CL, in mmol/L), Glasgow Coma Scale (GCS), hematocrit (HCT, %), hemoglobin ( HB, g/dL), acid–base balance index (PH,) platelet count (PL, in mmol/L), platelet count (PLT, in 10³/μL), oxygen pressure (PO₂, in mmHg), peripheral oxygen saturation (SpO₂, in %), and fraction of inspired oxygen (FiO₂, in %). Blood samples were taken before and after dialysis, following an 8-h fast for routine biochemical testing.

Determination of outcome variables: mortality and AKI

Mortality, defined as the death rate among patients with AKI during their ICU hospitalization, was determined through specific criteria. Firstly, AKI diagnosis followed the Kidney Disease Improving Global Prognosis (KDIGO)¹ guidelines, considering serum creatinine concentration (Scar) and urine output (UO) levels. According to literature studies^{30,31,32,33,38}, serum creatinine concentration (Scar) was used as the main target of study in this experiment. AKI was defined as: a 1.5-fold increase in serum creatinine concentration within the prior 7 days; a rise of ≥ 0.3 mg/dL within 48 h; or a sustained urine output of < 0.5 mL/kg/h for ≥ 6 h. In cases where baseline serum creatinine was unavailable pre-admission, the first serum creatinine at admission served as the baseline. Patients with AKI in the ICU were identified via departmental codes. Subsequently, ICU duration was computed based on admission and discharge times, and data from 24 h preceding admission were extracted^26,39,40. Data from the initial ICU admission were used for patients with multiple admissions; the average value was calculated for repeated examinations within 24 hours⁴¹.

Inclusion and exclusion criteria

To ensure data safety and emphasize the effectiveness of the model in early prediction, we focused on developing a predictive model using medical data from 24 h prior to a patient's admission to the hospital to screen patients diagnosed with AKI. The final dataset for the experiment was selected from this data (Fig. 8). During the data selection process, we excluded patients who met the following criteria: (1) age < 18 years old; (2) patients who were admitted to the intensive care unit for > 24 h; (3) patients who had already received chronic renal replacement therapy prior to admission; and (4) data with < 20% of missing values or a lack of outcome information. These exclusion criteria were designed to ensure the quality and accuracy of the experimental data for better exploring the relationship between early patient status and AKI.

Data processing

In this study, datasets with missing values exceeding 20% were excluded, and outliers were identified using box-and-whisker plots and subsequently removed. To handle missing values, multiple imputations were performed utilizing the RF algorithm, known for its effectiveness in imputing missing data⁴² RF offers several advantages, including the ability to handle mixed types of missing data, adaptability to interactions and nonlinearities, and scalability to large datasets⁴³, while preserving the distribution of data post-imputation. Additionally, the data underwent Min–Max normalization, transforming it into a specific range of intervals to ensure uniform scaling of each feature. This normalization process ensured uniform scaling for each feature, maintaining a relative weight balance between features. By addressing issues of model bias towards certain features due to differing scales, the normalization process improved the performance and interpretability of the machine learning model, ensuring consistent contribution weights of individual features to the model.

Statistical analyses

Descriptive statistics were utilized to assess the distribution and inherent patterns of numerical characteristics within the dataset. Measures such as mean, median, mode, range, variance, and standard deviation were examined as appropriate. Pearson's correlation coefficient was employed to analyze the degree of linear correlation between variables. Descriptive statistics for continuous variables included either mean ± standard deviation or median (interquartile range), while frequencies were utilized for categorical variables. The normal distribution of each variable was evaluated using the Kolmogorov–Smirnov test. Student's t-test compared continuous variables, and Fisher's exact method was used for correlational analysis between variables. Statistical analysis was performed using R version 4.3.1 for Windows.

Feature selection

This study employs a two-tier feature selection approach to improve both the performance and interpretability of the prediction model. The Boruta algorithm was utilized in the initial tier, and the XGBoost algorithm was employed in the subsequent tier. Boruta^6,44 is a RF-based feature selection method that evaluates feature importance through modelling the distribution of random and original features. In the first tier, Boruta is applied to filter out features with significant predictive power for the target variable from the initial set. XGBoost²³, an efficient gradient boosting tree algorithm, serves as the meta-model in the second tier, known for its excellent predictive performance and automatic feature screening. Feature selection within the XGBoost model further refines the initially selected features, resulting in a final subset with enhanced predictive power and stability.

Model construction

In this study, we have strategically employed the SEM (Stacking Ensemble Method) to build our model, with the goal of further enhancing overall model performance by adeptly integrating outputs from multiple base learners (single classifiers) as inputs to the meta-learner. Extensive prior research has demonstrated the substantial superiority of the SEM in performance compared to independent classifiers⁴⁵. To further optimize model performance, this study has employed the voting ensemble method in the preliminary stage. This method selectively crafts the base model for the SEM based on data characteristics and the principle of model diversity. Ultimately, Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Light Gradient Boosting Machine (LGBM), EXtreme Gradient Boosting (XGBoost), and Random Forest (RF) were identified as the base models for stacking ensemble, with LR being specifically chosen as the metamodel. This decision was made considering that the variables outputted by the base models represent linear data and align with the pursuit of model interpretability. The objective of this selection is to strike a balance between the diversity of the base models and the performance of the overall model, thereby providing a more comprehensive and reliable analytical foundation for this study.

Evaluation metrics

To assess the performance of the used models comprehensively and thoroughly, we utilize a diverse set of performance metrics, encompassing the area under the Receiver Operating Characteristic (AUROC), 95% Confidence Interval (CI), Precision-Recall Curve (AUC-PRC), Precision, Accuracy, Recall, F1 score, Calibration curves, and Brier scores. This comprehensive metric framework is designed to provide a more holistic understanding of the model's performance across various dimensions. Specifically, the evaluation is conducted using the following formulas:

True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).

$$AUC-ROC={\int }_{0}^{1}TPd FP$$

(1)

$$AUC-PRC={\int }_{0}^{1}FPd Recall$$

(2)

$$Recall=\frac{TP}{TP+FN}$$

(3)

$$Precision=\frac{TP}{TP+FP}$$

(4)

$$Accuarcy=\frac{TP+TN}{TP+TN+FP+FN}$$

(5)

$$F1 score=\frac{2\times Precision\times Recall}{Precision+Recall}$$

(6)

$$Brier Score=\frac{1}{N}\sum_{i=1}^{N}{({f}_{i}-{o}_{i})}^{2}$$

(7)

N is the total number of samples, ${f}_{i}$ is the predicted probability of the ith predicted sample, and ${o}_{i}$ is the actual outcome of the ith sample (usually 0 or 1).

Model interpretability

The SEM combines multiple base models to generate predictions. Thus, when interpreting the model, the feature weights of each base model undergo an initial assessment using the Permutation Importance (PI) technique. Subsequently, mathematical computation determines the feature weights of each model, which are then utilized as the feature weights of the stacked models. Features with higher weights are selected based on importance ranking, and a causal diagram is constructed using a causal inference framework⁴⁶. In this framework, confounders are defined as variables directly influencing both the predicted outcome and the predictor. These confounders are pivotal factors contributing to AKI mortality rates⁴⁷. Finally, Local Interpretable Model-Agnostic Explanations (LIME) is employed to analyze how specific values of different characteristics impact the model's predicted outcomes across various categories. This elucidation of clinical parameters leading to high patient mortality facilitates targeted interventions for potentially critical illnesses during clinical practice.

Data availability

The datasets generated during the current study are available in the GitHub (https://github.com/mengqings/Data_aki_all/tree/master and https://github.com/mengqings/eICU_Data_extract/tree/master) repository.

References

Ostermann, M. et al. Controversies in acute kidney injury: Conclusions from a Kidney Disease: Improving Global Outcomes (KDIGO) Conference. Kidney Int. 98, 294–309 (2020).
Article PubMed PubMed Central Google Scholar
Khwaja, A. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin. Pract. 120, c179–c184 (2012).
Article PubMed Google Scholar
Susantitaphong, P. et al. World incidence of AKI: A meta-analysis. Clin. J. Am. Soc. Nephrol. 8, 1482–1493 (2013).
Article PubMed PubMed Central Google Scholar
Mehta, R. L. et al. International Society of Nephrology’s 0by25 initiative for acute kidney injury (zero preventable deaths by 2025): A human rights case for nephrology. Lancet 385, 2616–2643 (2015).
Article PubMed Google Scholar
Bhowal, P., Sen, S. & Sarkar, R. A two-tier feature selection method using Coalition game and Nystrom sampling for screening COVID-19 from chest X-Ray images. J. Ambient Intell. Hum. Comput. 14, 3659–3674 (2023).
Article Google Scholar
Kursa, M. B., Jankowski, A. & Rudnicki, W. R. Boruta—A system for feature selection. Fundamenta Informaticae 101, 271–285 (2010).
Article MathSciNet Google Scholar
Wu, L. et al. Feature ranking in predictive models for hospital-acquired acute kidney injury. Sci. Rep. 8, 17298 (2018).
Article ADS PubMed PubMed Central Google Scholar
Loftus, T. J. et al. Artificial intelligence-enabled decision support in nephrology. Nat. Rev. Nephrol. 18, 452–465 (2022).
Article PubMed PubMed Central Google Scholar
Sabut, S., Patra, P. & Ray, A. Deep learning approach for classifying ischemic stroke using DWI sequences of brain MRIs. IJISTA 20, 524 (2022).
Article Google Scholar
Tomašev, N. et al. Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records. Nat. Protoc. 16, 2765–2787 (2021).
Article PubMed Google Scholar
Liu, K. et al. Development and validation of a personalized model with transfer learning for acute kidney injury risk estimation using electronic health records. JAMA Netw. Open 5, e2219776 (2022).
Article PubMed PubMed Central Google Scholar
Churpek, M. M. et al. Internal and external validation of a machine learning risk score for acute kidney injury. JAMA Netw. Open 3, e2012892 (2020).
Article PubMed PubMed Central Google Scholar
Cronin, R. M. et al. National Veterans Health Administration inpatient risk stratification models for hospital-acquired acute kidney injury. J. Am. Med. Inform. Assoc. 22, 1054–1071 (2015).
Article PubMed PubMed Central Google Scholar
Bihorac, A. et al. MySurgeryRisk: Development and validation of a machine-learning risk algorithm for major complications and death after surgery. Ann. Surg. 269, 652–662 (2019).
Article PubMed Google Scholar
Liu, J. et al. Mortality prediction based on imbalanced high-dimensional ICU big data. Comput. Ind. 98, 218–225 (2018).
Article Google Scholar
Lauritsen, S. M. et al. Explainable artificial intelligence model to predict acute critical illness from electronic health records. Nat. Commun. 11, 1–11 (2020).
Article Google Scholar
Martinez, D. A. et al. Early prediction of acute kidney injury in the emergency department with machine-learning methods applied to electronic health record data. Ann. Emerg. Med. 76, 501–514 (2020).
Article PubMed Google Scholar
Wu, C. et al. Predicting in-hospital outcomes of patients with acute kidney injury. Nat. Commun. 14, 1–9 (2023).
ADS PubMed PubMed Central Google Scholar
Yaqiang, Z. Research and Implementation of Death Risk Prediction Model for Septic Patients Based on Machine Learning (Beijing University of Posts and Telecommunications, 2022).
Shen, J. et al. Features Selection in a Predictive Model for Cardiac Surgery—Associated Acute Kidney Injury. https://www.researchsquare.com/article/rs-3103913/v1 (2023) https://doi.org/10.21203/rs.3.rs-3103913/v1.
Bell, S. et al. Risk of postoperative acute kidney injury in patients undergoing orthopaedic surgery—Development and validation of a risk score and effect of acute kidney injury on survival: Observational cohort study. BMJ (Clin. Res. Ed.) 351, h5639 (2015).
Google Scholar
Maurya, N. S., Kushwah, S., Kushwaha, S., Chawade, A. & Mani, A. Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta. Sci. Rep. 13, 6413 (2023).
Article ADS CAS PubMed PubMed Central Google Scholar
Zhang, B., Zhang, Y. & Jiang, X. Feature selection for global tropospheric ozone prediction based on the BO-XGBoost-RFE algorithm. Sci. Rep. 12, 9244 (2022).
Article ADS CAS PubMed PubMed Central Google Scholar
Manju, N., Harish, B. S. & Prajwal, V. Ensemble feature selection and classification of internet traffic using XGBoost classifier. IJCNIS 11, 37–44 (2019).
Article Google Scholar
Zhou, L., Nandal, A., Ganchev, T. & Dhaka, A. Breast cancer detection by fusion of deep features with CNN extracted features. IJISTA 20, 510 (2022).
Article Google Scholar
Yue, S. et al. Machine learning for the prediction of acute kidney injury in patients with sepsis. J. Transl. Med. 20, 215 (2022).
Article PubMed PubMed Central Google Scholar
Yang, J., Peng, H., Luo, Y., Zhu, T. & Xie, L. Explainable ensemble machine learning model for prediction of 28-day mortality risk in patients with sepsis-associated acute kidney injury. Front. Med. 10, (2023).
Song, X., Liu, X., Liu, F. & Wang, C. Comparison of machine learning and logistic regression models in predicting acute kidney injury: A systematic review and meta-analysis. Int. J. Med. Inform. 151, 104484 (2021).
Article PubMed Google Scholar
Zhang, X. et al. Machine learning for the prediction of acute kidney injury in critical care patients with acute cerebrovascular disease. Ren. Fail. 44, 43–53 (2022).
Article CAS PubMed PubMed Central Google Scholar
Mistry, N. S. & Koyner, J. L. Artificial intelligence in acute kidney injury: From static to dynamic models. Adv. Chronic Kidney Dis. 28, 74–82 (2021).
Article PubMed PubMed Central Google Scholar
Dong, J. et al. Machine learning model for early prediction of acute kidney injury (AKI) in pediatric critical care. Crit. Care. 25, 288 (2021).
Article PubMed PubMed Central Google Scholar
Yang, L. et al. Acute kidney injury in China: A cross-sectional survey. Lancet 386, 1465–1471 (2015).
Article PubMed Google Scholar
Song, X. et al. Cross-site transportability of an explainable artificial intelligence model for acute kidney injury prediction. Nat. Commun. 11, 5668 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Zhang, Z., Ho, K. M. & Hong, Y. Machine learning for the prediction of volume responsiveness in patients with oliguric acute kidney injury in critical care. Crit. Care 23, 112 (2019).
Article PubMed PubMed Central Google Scholar
Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data. 3, 1–9 (2016).
Article Google Scholar
Johnson, A. E. W. et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci. Data. 10, 1–9 (2023).
Article CAS PubMed PubMed Central Google Scholar
Pollard, T. J. et al. The eICU collaborative research database, a freely available multi-center database for critical care research. Sci. Data 5, 180178 (2018).
Article PubMed PubMed Central Google Scholar
Zhang, Z. Machine Learning method for the management of acute kidney injury: More than just treating biomarkers individually. Biomark. Med. 13, 1251–1253 (2019).
Article CAS PubMed Google Scholar
Yao, X. et al. Development of a nomogram model for predicting the risk of in-hospital death in patients with acute kidney injury. RMHP 14, 4457–4468 (2021).
Article Google Scholar
Lee, C.-W. et al. A combination of SOFA score and biomarkers gives a better prediction of septic AKI and in-hospital mortality in critically ill surgical patients: a pilot study. World J. Emerg. Surg. 13, 41 (2018).
Article ADS PubMed PubMed Central Google Scholar
Li, F. et al. Prediction model of in-hospital mortality in intensive care unit patients with heart failure: Machine learning-based, retrospective analysis of the MIMIC-III database. BMJ Open 11, e044779 (2021).
Article PubMed PubMed Central Google Scholar
Yang, D. et al. Development of a predictive nomogram for acute respiratory distress syndrome in patients with acute pancreatitis complicated with acute kidney injury. Ren. Fail. 45, 2251591 (2023).
Article PubMed PubMed Central Google Scholar
Tang, F. & Ishwaran, H. Random forest missing data algorithms. Stat. Anal. Data Min. ASA Data Sci. J. 10, 363–377 (2017).
Article MathSciNet Google Scholar
Kursa, M. B. & Rudnicki, W. R. Feature selection with the Boruta package. J. Stat. Softw. 36, 1–13 (2010).
Article Google Scholar
Wang, Y. et al. Stacking-based ensemble learning of decision trees for interpretable prostate cancer detection. Appl. Soft Comput. 77, 188–204 (2019).
Article Google Scholar
Zhang, Z. et al. Causal inference with marginal structural modeling for longitudinal data in laparoscopic surgery: A technical note. Laparosc. Endosc. Robot. Surg. 5, 146–152 (2022).
Article Google Scholar
Zhang, Z. Distinguishing between mediators and confounders is important for the causal inference in observational studies. AME Med. J. 4, 35 (2019).
Article Google Scholar

Download references

Acknowledgements

The authors greatly appreciate the editors and peer reviewers for their critical reading and insightful comments, which are helpful to improve our manuscript substantially. The authors express their gratitude to the researchers responsible for establishing and overseeing the MIMIC-III, MIMIC-IV, and EICU database. The authors would like to express their gratitude to the various project funds that financed this study.

Funding

Anhui Natural Science Foundation: Fault Diagnosis Research in Uncertain Environment under the Background of Industry 4.0, [Grant Number: KJ2021A0866]. National Natural Science Foundation of China [Grant Number: 82072228]. Natural General Research Project Fund of Shanghai University of Medicine & Health Sciences. National Natural Science Foundation of China [Grant Number: 62376152]. Three-Year Action Plan for Strengthening the Construction of Public Health System in Shanghai (2023–2025) of Construction project [Grant Number: GWVI-6]. Three-Year Action Plan for Strengthening the Construction of Public Health System in Shanghai (2023–2025) of Key discipline construction project (Grant No. GWVI-11.1-49).

Author information

Authors and Affiliations

College of Health Science and Engineering University of Shanghai for Science and Technology, Shanghai, 200093, China
Mengqing Liu, Vivens Mubonanyikuzo, Ruiqian Wu, Wenjin Li, Naiyue Xu & Kun Liu
Shanghai University of Medicine & Health Sciences, Shanghai, 201318, China
Zhiping Fan & Yu Gao
Jiading District Central Hospital Affiliated to Shanghai University of Medicine & Health Sciences, Shanghai, 201899, China
Liang Zhou
Research Center for Medical Intelligent Development, China Hospital Development Institute, Shanghai Jiao Tong University, Shanghai, 200025, China
Liang Zhou

Authors

Mengqing Liu
View author publications
Search author on:PubMed Google Scholar
Zhiping Fan
View author publications
Search author on:PubMed Google Scholar
Yu Gao
View author publications
Search author on:PubMed Google Scholar
Vivens Mubonanyikuzo
View author publications
Search author on:PubMed Google Scholar
Ruiqian Wu
View author publications
Search author on:PubMed Google Scholar
Wenjin Li
View author publications
Search author on:PubMed Google Scholar
Naiyue Xu
View author publications
Search author on:PubMed Google Scholar
Kun Liu
View author publications
Search author on:PubMed Google Scholar
Liang Zhou
View author publications
Search author on:PubMed Google Scholar

Contributions

M.L.: Writing-original draft, software, methodology, data collection and analysis, investigation, formal, conceptualization analysis, writing. Z.F.: Review and editing. Y.G.: Visualization. V.M.: Adjusting the English syntax. R.W.: Review. W.L.: Supervision. N.X.: Validation. K.L.: Validation. Z.L.: Funding and idea. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Liang Zhou.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Liu, M., Fan, Z., Gao, Y. et al. A two-tier feature selection method for predicting mortality risk in ICU patients with acute kidney injury. Sci Rep 14, 16794 (2024). https://doi.org/10.1038/s41598-024-63793-3

Download citation

Received: 11 December 2023
Accepted: 03 June 2024
Published: 22 July 2024
DOI: https://doi.org/10.1038/s41598-024-63793-3

Subjects

Abstract

Similar content being viewed by others

Construction of a machine learning-based interpretable prediction model for acute kidney injury in hospitalized patients

Predicting outcomes of acute kidney injury in critically ill patients using machine learning

Predicting mortality in critically ill patients requiring renal replacement therapy for acute kidney injury in a retrospective single-center study of two cohorts

Introduction

Results

Study population characteristics

Feature selection and Model performance

Model interpretability

Discussion

Methods

Study population

Determination of outcome variables: mortality and AKI

Inclusion and exclusion criteria

Data processing

Statistical analyses

Feature selection

Model construction

Evaluation metrics

Model interpretability

Data availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's note

Supplementary Information

Supplementary Information.

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Quick links