
Convolutional neural network for maxillary sinus segmentation based on the U-Net architecture at different planes in the Chinese population: a semantic segmentation study

Abstract

Background/purpose

The development of artificial intelligence (AI) has revolutionized the field of dentistry. Medical image segmentation is a vital part of AI applications in dentistry and can assist medical practitioners in accurately diagnosing diseases. Detection of the maxillary sinus (MS) is important in surgical procedures such as dental implant placement, tooth extraction, and endoscopic surgery. Accurate segmentation of the MS in radiological images is a prerequisite for diagnosis and treatment planning. This study aims to investigate the feasibility of applying a convolutional neural network (CNN) based on the U-Net architecture to MS segmentation in individuals from the Chinese population.

Materials and methods

A total of 300 CBCT images in the axial, coronal, and sagittal planes were used in this study. These images were divided into a training set and a test set at a ratio of 8:2. The maxillary sinus regions were labelled in the original images for training and testing. The training process was performed for 40 epochs with a learning rate of 0.00001. Computation was performed on an NVIDIA GeForce RTX 3060 GPU. The best model was retained for predicting the MS in the test set and calculating the model metrics.

Results

The trained U-Net model achieved high segmentation accuracy across the three imaging planes. The intersection over union (IoU) values were 0.942, 0.937 and 0.916 in the axial, sagittal and coronal planes, respectively, with F1 scores exceeding 0.95 in all planes. The accuracies of the U-Net model were 0.997, 0.998, and 0.995 in the axial, sagittal and coronal planes, respectively.

Conclusion

The trained U-Net model achieved highly accurate segmentation of MS across three planes on the basis of 2D CBCT images among the Chinese population. The AI model has shown promising application potential for daily clinical practice.

Clinical trial number

Not applicable.


Introduction

The maxillary sinus (MS), also referred to as the “Antrum of Highmore”, was first documented by the anatomist Nathaniel Highmore in 1651 [1]. It is the largest paranasal sinus and the first to develop among the four pairs of paranasal sinuses (maxillary, ethmoid, frontal and sphenoid). The MS is a pyramid-shaped space located in the body of the maxilla [2]. It is bordered by six walls: the superior, inferior, anterior, posterior, lateral and medial walls, and its location within the maxilla is clinically important. The contents of the orbit are separated from the MS by the thin superior wall, which forms the orbital floor and contains the infraorbital artery and nerve, branches of the maxillary artery and trigeminal nerve. The canine eminence is located on the anterior wall and typically presents as a prominent inferolateral focal convexity on CBCT images; the Caldwell–Luc procedure involves creating an opening in the anterior wall to facilitate sinus drainage. The posterior wall is adjacent to the pterygopalatine fossa, whose clinical importance is that it acts as an indicator of tumour and sepsis spread from the MS to adjacent anatomic structures on CT diagnosis. The inferior wall, also referred to as the maxillary sinus floor, presents in various forms and is a component of the alveolar process. The Schneiderian membrane lines the maxillary sinus floor and is a key factor in maxillary sinus floor elevation for dental implants; accurate measurement of its thickness on CBCT images before surgery is associated with the success rate of implants, and most scholars consider a membrane thickness greater than 2 mm pathological [3]. The lateral wall is thin and adjacent to the buccal part of the alveolar ridge, and it contains the posterior superior alveolar artery (PSAA) and other vital structures.
It is necessary to detect the location of the PSAA with a CBCT scan before a lateral maxillary sinus floor elevation procedure is performed [4]. The medial wall forms the lateral wall of the nasal cavity. The MS ostium opens into the middle nasal meatus, and the Schneiderian membrane is continuous with the nasal epithelium through this ostium. This drainage pathway clears inflammatory secretions and maintains humidified ventilation [5]. Mucosal thickening caused by various pathological factors can lead to maxillary sinusitis with ostium obstruction, which can be identified on CBCT images. The MS is thus a vital anatomic structure of the maxillary body because of its particular location, especially in oral implantology and otorhinolaryngology. Correct diagnosis of MS conditions relies on experienced doctors, and less experienced practitioners may find accurate interpretation of CBCT images challenging.

With the advent of CBCT technology, some complex microchanges can be detected and diagnosed through radiological methods. Although this progress has revolutionized the field of dentistry, disease diagnosis and identification using CBCT still rely on the judgement and experience of doctors and radiologists [6]. Medical image segmentation is an imperative part of medical diagnosis and treatment, and precise, accurate segmentation of medical images is a prerequisite for subsequent medical services. Misdiagnosis due to a lack of anatomic knowledge or inaccurate subjective judgement imposes substantial psychological and economic burdens on patients. Currently, deep learning, a subfield of AI, is used in medical image segmentation, demonstrating great economic and social benefit.

Deep learning methods based on artificial neural networks are designed to simulate aspects of human intelligence, explain and classify data, and detect potential relationships. Convolutional neural networks (CNNs), as representative deep learning algorithms, are widely used in medical image segmentation. The U-Net architecture, a classical CNN algorithm, is widely adopted for medical segmentation and has in recent years been applied in dentistry [7]. U-Net, an encoder–decoder algorithm, improves the capture of local and general features using skip connections between downsampling and upsampling operations at each resolution [8]. Compared with other deep learning models, the U-Net model has achieved superior accuracy in the diagnosis of maxillary sinus disease. Nechyporenko et al. used 320 CT images and achieved an accuracy of 93.78% with the U-Net model, and Alekseeva et al. utilized 162 MSCT images and achieved an accuracy of 90.09% with the same model in maxillary sinus pathology detection [9, 10]. By contrast, researchers from Korea reported accuracies between 87.5% and 88.4% after applying the ResNet-18 model to 512 images [11], and Zeng et al. reported accuracies of 90% using the YOLOv5, ResNet-34, GoogLeNet, InceptionV3, and ResNeXt-101 models for detecting maxillary sinus abnormalities [12]. Accurate segmentation of the MS is crucial for oral surgeons, ENT specialists, and maxillofacial radiologists; after maxillary sinus floor elevation with a bone graft, membrane perforation and postsurgical graft volume changes are assessed with CBCT. Busra et al. used a U-Net deep learning algorithm for MS segmentation with 100 axial CBCT images from a European population and achieved high performance, with an IoU of 0.9275 and an F1 score of 0.9784 [13]. Several studies have demonstrated differences in the performance of AI algorithms across races [14].
Seyyed-Kalantari et al. reported significant differences in the accuracy of automated chest X-ray diagnosis across racial and other demographic groups [15]; for example, when applied to patients who were both Black and female, their model demonstrated a higher misdiagnosis rate than for other groups. One proposed explanation is that AI models may be able to identify the race of a patient from a medical image [16]. However, this hypothesis is still in its infancy and has not been explored on a large scale. A cross-sectional study revealed that sinonasal anatomic variants are more prevalent in some races [17]. Whether the U-Net model can detect and predict the boundaries of the MS across different races remains unexplored.

In clinical practice, applying the U-Net model to MS segmentation saves time and is effective because manual labelling of neighbouring anatomic structures is not necessary. Considering that most studies applying U-Net to MS segmentation were performed with data from non-Asian individuals, this study uses the U-Net architecture for MS segmentation in a Chinese population across three different planes and verifies its performance.

Materials and methods

Ethics approval and consent to participate

The study protocol was established according to the ethical guidelines of the Helsinki Declaration and was approved by the Medical Ethics Committee of the Suzhou Wujiang District Hospital of Traditional Chinese Medicine (Ethical Approval ID: 2024-KY-13-01). The Institutional Review Board (IRB) of the Suzhou Wujiang District Hospital of Traditional Chinese Medicine approved the informed consent exemption. The study followed a noninterventional retrospective design, and there were no human experiments or human tissue samples used. All related information (e.g., name, sex, and age) was concealed in the study.

Patients and dataset source

CBCT images from 100 patients who visited the Stomatology Department of Suzhou Wujiang District Hospital of Traditional Chinese Medicine between August 2024 and September 2024 were included in this clinical study. All images were collected using a MEYER OPTOELECTRONIC CBCT scanner (Hefei Meyer Optoelectronic Technology Inc., Hefei City, PR China) operating at 90 kV and 6 mA for males and 89 kV and 5 mA for females. Each patient’s Frankfort plane was positioned parallel to the ground, and the top boundary of the scan was limited to the infraorbital margin. The inclusion criteria were as follows: (1) age ≥ 15 years; (2) maxillary sinus without a history of trauma, cancer, or surgery affecting delineation; (3) Schneiderian membrane thickness ≤ 2 mm; (4) no effusion detected in the maxillary sinus; (5) CBCT images without metal/motion artefacts or noisy points; and (6) both regular and irregular maxillary sinus shapes. The exclusion criteria were as follows: (1) age < 15 years; (2) trauma to the maxillary sinus region or a history of surgery in this area; (3) maxillary sinus borders that could not be identified because of metal/motion artefacts; and (4) systemic disease morphologically affecting the maxillary sinus. Sex and race differences were not considered. Target images (axial, coronal, and sagittal planes) were obtained using Meyer DCTViewer V4 software (Hefei Meyer Optoelectronic Technology Inc.). All target images were resized to 512 × 512 pixels using a Python script (PyCharm). A total of 300 CBCT images were selected, with 100 images per plane (axial: 100, coronal: 100, and sagittal: 100). All the CBCT images were imported into LabelMe software, and the maxillary sinus borders were annotated to create label files. The segmentation task is binary, comprising the maxillary sinus and the background. Images and label files in PNG format were divided into a training set and a test set at a ratio of 8:2.
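The 8:2 split described above can be sketched in a few lines of Python; the filenames below are hypothetical placeholders, not the study's actual data.

```python
import random

def split_dataset(image_names, train_ratio=0.8, seed=42):
    """Shuffle a list of image filenames and split it into training and test sets."""
    names = list(image_names)
    random.Random(seed).shuffle(names)  # seeded shuffle keeps the split reproducible
    cut = int(len(names) * train_ratio)
    return names[:cut], names[cut:]

# 100 images per plane, as in this study (hypothetical filenames)
axial_images = [f"axial_{i:03d}.png" for i in range(100)]
train_set, test_set = split_dataset(axial_images)
print(len(train_set), len(test_set))  # 80 20
```

Seeding the shuffle makes the partition reproducible across runs, which matters when comparing models trained on the same split.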
The design of this study can be found in Fig. 1.

Fig. 1
figure 1

Flowchart of the study design

U-Net architecture

U-Net is a CNN that has demonstrated superior performance in the segmentation of medical images and is frequently used to segment normal organs, tumours, and pathological sections. The architecture resembles the letter U, which is the basis for its name (Fig. 2). U-Net consists of an encoder module and a decoder module. The left side of the architecture is a feature extraction network: image features in each layer are extracted through double convolution operations (kernel size = 3, padding = 1, stride = 1, and ReLU), and the image size is compressed through max pooling. The number of output channels depends on the number of convolution kernels. The basic convolution and pooling calculations are shown in Fig. 3. Four max pooling operations are applied as downsampling steps for dimensionality reduction. The right side of the architecture is the decoder module. Because the encoder extracts features while reducing the image size, bilinear upsampling, a commonly used upsampling method in image processing, is applied in the decoder to restore the image size and resolution. In addition, feature concatenation is implemented using the copy-and-crop technique to better capture local and general features. Finally, a 1 × 1 convolution maps the features to the appropriate number of output channels.
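The convolution and pooling size arithmetic in Fig. 3 can be checked with the standard output-size formulas; this short sketch traces a 512 × 512 input through the four encoder stages described above.

```python
def conv2d_out(size, kernel=3, padding=1, stride=1):
    """Spatial output size of a 2D convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def maxpool_out(size, kernel=2, stride=2):
    """Spatial output size of a max pooling step."""
    return (size - kernel) // stride + 1

size = 512  # input images are resized to 512 x 512
sizes = [size]
for _ in range(4):  # the four downsampling stages of the U-Net encoder
    size = conv2d_out(conv2d_out(size))  # double 3x3 conv with padding 1 keeps the size
    size = maxpool_out(size)             # 2x2 max pooling halves the size
    sizes.append(size)
print(sizes)  # [512, 256, 128, 64, 32]
```

The same-size behaviour of the 3 × 3 convolutions (padding 1, stride 1) is why only the pooling steps shrink the feature maps.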

Fig. 2
figure 2

Schematic of the U-Net architecture

Fig. 3
figure 3

Convolution and max pooling calculations

Data augmentation

To increase the size and diversity of the dataset, random data augmentation was implemented using the PyTorch framework (Meta Platforms Inc., Menlo Park, CA, USA). This technique can address data imbalance, small datasets, and noise interference by randomly rotating, flipping, cropping, and standardizing the images and by randomly adjusting their brightness and contrast. For example, the same image can be rotated and flipped to generate copies that represent different viewing angles. This approach improves the generalizability of the model. In this study, 3200 images were obtained from the training set through random data augmentation.
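The rotate-and-flip idea above can be illustrated on a tiny 2D array in plain Python (a didactic sketch; the study itself used PyTorch's random transforms):

```python
def hflip(img):
    """Horizontal flip of an image given as a list of rows."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

img = [[1, 2],
       [3, 4]]

# four augmented copies of the same image, each a different viewing angle
augmented = [img, hflip(img), rot90(img), hflip(rot90(img))]
print(augmented[1])  # [[2, 1], [4, 3]]
```

Each transform preserves the image content while changing its orientation, which is exactly what lets one labelled image stand in for several.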

Training details for U-Net

2D models are generally simpler to implement and train, and many pretrained models and resources are available. Compared with 3D models, 2D models require significantly less computational power and memory, making them more accessible for institutions with limited resources. Considering the computer configuration and segmentation effectiveness, 2D segmentation was applied to identify the maxillary sinus instead of 3D segmentation. Because the axial, coronal, and sagittal planes together represent the condition of the maxillary sinus, these three planes were selected for detection. In the U-Net architecture, all images were resized to 512 × 512 with 3 channels. The batch size was 1, the learning rate was 0.00001, the number of epochs was 40, and the optimizer was RMSprop (root mean square propagation). The RMSprop optimizer is well suited for training deep neural networks, especially with nonstationary objectives and noisy gradients; it adjusts the learning rate for each parameter individually, which helps accelerate convergence and avoids issues such as vanishing or exploding gradients. A study by Uppal et al. demonstrated that the RMSprop optimizer exhibits superior performance, with a training accuracy of 95.8% and a testing accuracy of 94.9%, compared with the Adamax and Adadelta optimizers [18]. The code was implemented using PyTorch 1.2.0 (Meta Platforms Inc.), and the experiment was performed on an NVIDIA GeForce RTX 3060 (NVIDIA, Santa Clara, CA, USA).
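The per-parameter scaling that RMSprop performs can be written out in plain Python (a didactic sketch of the update rule, not the PyTorch implementation; the toy function and learning rate are illustrative only):

```python
import math

def rmsprop_step(param, grad, sq_avg, lr=0.01, alpha=0.99, eps=1e-8):
    """One RMSprop update: maintain a running average of squared gradients
    and divide the step by its square root, scaling the rate per parameter."""
    sq_avg = alpha * sq_avg + (1 - alpha) * grad ** 2
    param = param - lr * grad / (math.sqrt(sq_avg) + eps)
    return param, sq_avg

# toy example: minimize f(x) = x^2, whose gradient is 2x
x, s = 1.0, 0.0
for _ in range(200):
    x, s = rmsprop_step(x, 2 * x, s)
print(round(x, 3))  # moves toward the minimum at 0
```

Dividing by the running root-mean-square of past gradients keeps the effective step size stable even when raw gradient magnitudes vary widely, which is the property the paragraph above appeals to.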

Statistical analysis

The U-Net model performance was evaluated using a confusion matrix, a tabular summary for visually assessing supervised learning performance, especially for classification algorithms. Evaluation metrics such as precision, accuracy, and recall can be calculated from the confusion matrix, allowing assessment of overall model performance across classes (Fig. 4).

Fig. 4
figure 4

Confusion matrix schematic diagram

The following basic parameters and metrics are used to evaluate the model performance:

  • True positive (TP): maxillary sinus pixels correctly segmented as maxillary sinus.

  • True negative (TN): background pixels correctly identified as background.

  • False positive (FP): background pixels incorrectly segmented as maxillary sinus.

  • False negative (FN): maxillary sinus pixels incorrectly labelled as background.

$$\text{Sensitivity/Recall} = \frac{TP}{TP + FN}$$
$$\text{Precision/Positive predictive value} = \frac{TP}{TP + FP}$$
$$\text{F1 score} = \frac{2TP}{2TP + FP + FN}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Intersection over union (IoU)} = \frac{\left|A \cap B\right|}{\left|A \cup B\right|}$$

The IoU is a metric widely used to assess the performance of medical image segmentation. It measures the overlap between the predicted and ground-truth regions: the intersection is the area shared by the actual and detected objects, and the union is their combined area. The IoU ranges from 0 to 1; the better the model performs, the higher the IoU, and under ideal conditions the IoU approaches 1, meaning the detected region covers the actual object completely. Typically, a prediction is considered correct when the IoU is ≥ 0.5. No single metric can fully characterize model performance in deep learning, so multiple evaluation metrics are commonly used.
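All of the metrics above can be computed directly from confusion-matrix counts; the pixel counts below are hypothetical, chosen only to illustrate the formulas.

```python
def segmentation_metrics(tp, tn, fp, fn):
    """Pixel-level evaluation metrics from binary confusion-matrix counts.
    For binary masks, |A ∩ B| = TP and |A ∪ B| = TP + FP + FN."""
    return {
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "iou": tp / (tp + fp + fn),
    }

# hypothetical counts for one 512 x 512 test image (262,144 pixels)
m = segmentation_metrics(tp=9_400, tn=252_144, fp=300, fn=300)
print(round(m["iou"], 3), round(m["f1"], 3))  # 0.94 0.969
```

Note that accuracy is dominated by the large background (TN) count, which is why the paper also reports IoU and F1: they depend only on the sinus pixels and their errors.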

Results

The CNN based on the U-Net architecture achieved high MS segmentation success rates on axial, sagittal and coronal CBCT images. All test images were correctly segmented from the original CBCT images; more details can be found in Fig. 5. The model segments not only the regular MS but also MS septa. Panels (a), (e), and (i) show original MS images in the axial, sagittal and coronal planes, respectively. The manual MS labels were saved as binary images (panels (b), (f), and (j)). The predictions of the best trained model on the test images are shown in panels (c), (g), and (k). Finally, the predictions were overlaid on the original images, and the performance of the U-Net model was calculated. In panels (d), (h), and (l), the overlap between the predicted boundary and the original MS boundary is distinct; the IoU was therefore computed from the overlapping areas.

Fig. 5
figure 5

MS segmentation of CBCT images using the U-Net architecture

As a result, the IoU values were 0.942, 0.937 and 0.916 in the axial, sagittal and coronal planes, respectively. An IoU value of 0 indicates the worst fit, and a value of 1 indicates the best fit; an IoU above 0.9 indicates that the trained deep learning model can segment the MS accurately. Recall represents how many of the positive cases in the sample were predicted correctly (completeness), precision indicates how many of the predicted positive samples are actually positive (correctness), and accuracy measures the percentage of all data that the model predicts correctly. As shown in Table 1, all the metrics exceed 0.9 and approach 1, reflecting the superior performance of the U-Net model. The F1 score is applicable under conditions of class imbalance: metrics such as precision may be biased when positive and negative samples are imbalanced, and the F1 score integrates precision and recall. In this study, the F1 scores in all three planes exceeded 0.95, indicating excellent MS segmentation performance of the U-Net model.

Table 1 Performance evaluation of the U-Net architecture

Discussion

Manual segmentation is a complex task in clinical practice because it is time-consuming and requires experience. Although semiautomatic segmentation is more effective for labelling anatomic structures, operator intervention is still needed for threshold selection [19]. To overcome these limitations and provide timely and effective diagnostic services, a CNN model based on the U-Net architecture was developed for MS segmentation of CBCT images from the Chinese population. In this study, the U-Net architecture demonstrated superior performance and promising applicability to Chinese MS segmentation. The recall, precision, accuracy, F1 score and IoU all exceeded 90% across the CBCT planes. In addition, the U-Net model accurately segmented MS septa.

Technical findings

The U-Net model was first proposed by Ronneberger et al. in 2015, whose network won the IEEE International Symposium on Biomedical Imaging (ISBI) cell tracking challenge [20]. The model relies on strong data augmentation, performs well even with small samples, and achieves high accuracy in medical image segmentation; for example, segmenting a 512 × 512 image requires less than one second on a routine GPU. Since its introduction, the U-Net architecture has been widely used in biomedical image segmentation well beyond cell tracking. Lee et al. leveraged 304 bitewing radiographs to train a U-Net model and used 50 images for performance evaluation; the model achieved a precision of 63.29%, a recall of 65.02%, and an F1 score of 64.14% [21]. Bayrakdar et al. used a deep CNN model based on the U-Net architecture for the segmentation of apical lesions on panoramic radiographs, reporting a sensitivity of 0.92, a precision of 0.84, an F1 score of 0.88 and an IoU of 70%; the U-Net model has thus shown potential for the diagnosis of periapical pathology [22]. Kim et al. also applied the U-Net algorithm to predict impacted mesiodens in paediatric panoramic radiographs, and the trained model achieved an accuracy of 91–92% and an F1 score of 94–95% [23]. Recently, Busra et al. trained a U-Net model for MS segmentation on axial CBCT images and achieved accurate outcomes, with an F1 score of 0.9784 and an IoU of 0.9275 [13]. These results are similar to those of the present study, although the present U-Net model was trained on a Chinese population: on axial CBCT images, it achieved an F1 score of 0.970 and an IoU of 0.942.
This finding suggests that the U-Net model is applicable for MS segmentation in both the Turkish and Chinese populations and shows promise for daily medical practice in both groups; other populations require further verification. In addition, the lowest accuracy and precision were found in the coronal plane, with values of 0.995 and 0.972, respectively. The coronal plane of the maxillary sinus covers nasal structures such as the concha bullosa and nasal septum, which might influence detection of the maxillary sinus border. Some scholars have shown that nasal structures are accurately identified in the coronal plane [24, 25]. It is believed that the accuracy in the coronal plane was lower than in the other planes because the deep learning model incorrectly recognized the dark cavity between the maxillary sinus and nasal structures.

Clinical implications

There have been few studies on MS segmentation using the U-Net model. Yoo et al. compared 2D, 2.5D, and 3D U-Net networks for segmentation of the maxillary sinus and of lesions inside it. They reported that the 2.5D U-Net achieved the best performance, with a precision of 0.974 and a recall of 0.971 for MS segmentation and a precision of 0.897 and a recall of 0.858 for lesion segmentation [26]. Their MS segmentation results were similar to those of the present study, whereas their model performed worse for lesions inside the maxillary sinus. In the present study, all the included maxillary sinuses were normal and healthy. It is believed that lesions occupy only a small proportion of the MS and provide limited feature information for the network [27]. Whangbo et al. trained four 3D U-Net models for paranasal sinus segmentation based on 40 patients using 5-fold cross-validation [28]. Although accurate segmentation of normal MS data was achieved, limitations were found in the context of mucosal inflammation. A more advanced U-Net algorithm could be designed for the segmentation of lesions inside the maxillary sinus. Çelebi et al. used a state-of-the-art Swin transformer, Res-Swin-UNet, to detect the maxillary sinus on CBCT images; this model achieved an F1 score of 91.72%, an accuracy of 99%, and an IoU of 84.71% [29]. The accuracy reported in that study is slightly greater than that of the present study; the difference likely lies in the decoder consisting of Swin transformer blocks and in the depth of the model. In addition, Choi et al. reported that a deep learning-based model for fully automatic segmentation of the maxillary sinus on 3D images achieved an accuracy of 0.8852 ± 0.1784 in clear maxillary sinus prediction [30]. This result differs from that of the current study, a difference that can likely be attributed to the difference in the sample size of normal maxillary sinuses.

AI bias considerations

The risk of racial bias in medical AI has been a prominent concern in recent years. The MS volume is of interest to surgeons, especially for maxillary sinus floor elevation and endoscopic surgery, and a comparative study reported variations in maxillary sinus volume across ethnic groups [31]. In this study, the U-Net model was applicable to both Chinese and Turkish populations, as the F1 score differed by only 0.008, suggesting that the U-Net model can potentially reduce the risk of racial bias in MS segmentation. However, a recent study suggested that biomarker-based approaches to training AI models do not necessarily eliminate the potential for racial bias in practice, owing to iterative thresholding, binarizing, and/or skeletonizing of images [32]. This conclusion is inconsistent with the results reported here; the difference may lie in the ethnic diversity of the United States, whereas the present study was limited to an Asian population. Further investigations are needed.

Limitations

There are several limitations in this study. First, the included population is from a single ethnic group, and whether the U-Net model is applicable to other populations requires further validation. Second, all analyses were performed on an NVIDIA GeForce RTX 3060; with advances in computational hardware, more powerful GPUs applied to AI diagnosis could improve accuracy and precision. Third, subanalyses of normal and abnormal MS segmentation were not performed because segmentation of inflammation within the MS was unsatisfactory; more advanced algorithms could be developed for this purpose. Fourth, the accessory ostium is an important anatomical variation in the maxillary sinus region. Shetty et al. used a deep learning model to detect the accessory ostium in coronal CBCT images and achieved satisfactory results [33]; however, the distinction between the maxillary sinus and the accessory ostium was not considered here, and further studies are needed to separate the two. Fifth, the accuracies of the training and testing sets were not compared, so the model may exhibit overfitting; a more thorough design should be adopted in the future. Sixth, a comparison of manual segmentation duration with deep learning segmentation duration was not conducted; such a comparison would provide clinical practitioners with evidence of the efficiency of the U-Net model. Seventh, the U-Net model itself has limitations and may have difficulty segmenting small targets; for example, the border between the maxillary sinus and the accessory ostium might not be accurately outlined. Eighth, the U-Net model may produce both false positives and false negatives in maxillary sinus segmentation. False positives can include mistakenly detected structures, such as the posterior superior alveolar artery in the coronal plane, whereas false negatives may miss true structures, particularly in regions near the nasal cavity. Additionally, challenges arise in scenarios involving coregistration or slice-thickness inconsistencies, especially when models trained on 2D slices are applied to the inherently 3D maxillary sinus.

Conclusion

This study introduced a CNN based on the U-Net architecture for MS segmentation in the Chinese population and achieved accurate performance. The U-Net model can thus provide initial suggestions for oral and maxillofacial surgeons, oral radiologists and ENT specialists regarding MS diagnosis and presurgical planning. AI applications in oral radiology have promising prospects and will benefit the field of dentistry.

Data availability

All data generated or analyzed during this study are available in the Zenodo database (https://doi.org/10.5281/zenodo.14441638).

References

  1. Whyte A, Boeddinghaus R. The maxillary sinus: physiology, development and imaging anatomy. Dentomaxillofac Radiol. 2019;48:20190205.


  2. Von Arx T, Lozanoff S. Clinical oral anatomy: a comprehensive review for dental practitioners and researchers. Switzerland: Springer International Publishing. 2017;342–50.

  3. Whyte A, Boeddinghaus R. Imaging of odontogenic sinusitis. Clin Radiol. 2019;74:503–16.


  4. Varela-Centelles P, Loira-Gago M, Seoane-Romero JM, Takkouche B, Monteiro L, Seoane J. Detection of the posterior superior alveolar artery in the lateral sinus wall using computed tomography/cone beam computed tomography: a prevalence meta-analysis study and systematic review. Int J Oral Maxillofac Surg. 2015;44:1405–10.


  5. Timmenga NM, Raghoebar GM, Liem RS, van Weissenbruch R, Manson WL, Vissink A. Effects of maxillary sinus floor elevation surgery on maxillary sinus physiology. Eur J Oral Sci. 2003;111:189–97.


  6. Kaasalainen T, Ekholm M, Siiskonen T, Kortesniemi M. Dental cone beam CT: an updated review. Phys Med. 2021;88:193–217.


  7. Sivari E, Senirkentli GB, Bostanci E, Guzel MS, Acici K, Asuroglu T. Deep learning in diagnosis of dental anomalies and diseases: a systematic review. Diagnostics (Basel). 2023;13:2512.

  8. Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, Cham; 2015. pp. 234–41.

  9. Nechyporenko A, Frohme M, Alekseeva V, Gargin V, Sytnikov D, Hubarenko M. Deep learning based image segmentation for detection of odontogenic maxillary sinusitis. In: 2022 IEEE 41st International Conference on Electronics and Nanotechnology (ELNANO); 2022 Oct 10–14; Kyiv, Ukraine. IEEE; 2022. pp. 339–342.

  10. Alekseeva V, Nechyporenko A, Frohme M, Gargin V, Meniailov I, Chumachenko D. Intelligent decision support system for differential diagnosis of chronic odontogenic rhinosinusitis based on U-Net segmentation. Electronics (Basel). 2023;12:1202.

  11. Kim KS, Kim BK, Chung MJ, Cho HB, Cho BH, Jung YG. Detection of maxillary sinus fungal ball via 3-D CNN-based artificial intelligence: fully automated system and clinical validation. PLoS ONE. 2022;17:e0263125.

  12. Zeng P, Song R, Lin Y, Li H, Chen S, Shi M, et al. Abnormal maxillary sinus diagnosing on CBCT images via object detection and ‘straight-forward’ classification deep learning strategy. J Oral Rehabil. 2023;50:1465–80.

  13. Ozturk B, Taspinar YS, Koklu M, Tassoker M. Automatic segmentation of the maxillary sinus on cone beam computed tomographic images with U-Net deep learning model. Eur Arch Otorhinolaryngol. 2024;281:6111–21.

  14. Gichoya JW, Banerjee I, Bhimireddy AR, et al. AI recognition of patient race in medical imaging: a modelling study. Lancet Digit Health. 2022;4:e406–14.

  15. Seyyed-Kalantari L, Zhang H, McDermott MBA, Chen IY, Ghassemi M. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nat Med. 2021;27:2176–82.

  16. Wawira Gichoya J, McCoy LG, Celi LA, Ghassemi M. Equity in essence: a call for operationalising fairness in machine learning for healthcare. BMJ Health Care Inform. 2021;28:e100289.

  17. Kulich M, Long R, Reyes Orozco F, et al. Racial, ethnic, and gender variations in sinonasal anatomy. Ann Otol Rhinol Laryngol. 2023;132:996–1004.

  18. Uppal M, Gupta D, Juneja S, Gadekallu TR, El Bayoumy I, Hussain J, Lee SW. Enhancing accuracy in brain stroke detection: Multi-layer perceptron with adadelta, RMSProp and AdaMax optimizers. Front Bioeng Biotechnol. 2023;11:1257591.

  19. Issa J, Olszewski R, Dyszkiewicz-Konwińska M. The effectiveness of semi-automated and fully automatic segmentation for inferior alveolar canal localization on CBCT scans: a systematic review. Int J Environ Res Public Health. 2022;19:560.

  20. Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. In: Medical image computing and computer-assisted intervention – MICCAI 2015: 18th international conference, Munich, Germany, October 5–9, 2015, proceedings, part III. Springer International Publishing; 2015. pp. 234–41.

  21. Lee S, Oh SI, Jo J, Kang S, Shin Y, Park JW. Deep learning for early dental caries detection in bitewing radiographs. Sci Rep. 2021;11:16807.

  22. Bayrakdar IS, Orhan K, Çelik Ö, et al. A U-Net approach to apical lesion segmentation on panoramic radiographs. Biomed Res Int. 2022;2022:7035367.

  23. Kim H, Song JS, Shin TJ, et al. Image segmentation of impacted mesiodens using deep learning. J Clin Pediatr Dent. 2024;48:52–8.

  24. Shetty S, Mubarak AS, Al Jouhari RDL, Talaat MO, Al-Rawi W, AlKawas N, Shetty S, Uzun Ozsahin S. The application of Mask Region-Based Convolutional Neural Networks in the detection of nasal septal deviation using cone beam computed tomography images: proof-of-concept study. JMIR Form Res. 2024;8:e57335.

  25. Shetty S, Mubarak AS, David LR, Jouhari MOA, Talaat W, Kawas SA, Al-Rawi N, Shetty S, Shetty M, Ozsahin DU. Detection of Concha Bullosa using deep learning models in cone-beam computed tomography images: a feasibility study. Arch Craniofac Surg. 2025;26:19–28.

  26. Yoo YS, Kim D, Yang S, et al. Comparison of 2D, 2.5D, and 3D segmentation networks for maxillary sinuses and lesions in CBCT images. BMC Oral Health. 2023;23:866.

  27. Vaddi A, Villagran S, Muttanahally KS, Tadinada A. Evaluation of available height, location, and patency of the ostium for sinus augmentation from an implant treatment planning perspective. Imaging Sci Dent. 2021;51:243–50.

  28. Whangbo J, Lee J, Kim YJ, Kim ST, Kim KG. Deep learning-based multi-class segmentation of the paranasal sinuses of sinusitis patients based on computed tomographic images. Sensors (Basel). 2024;24:1933.

  29. Çelebi A, Imak A, Üzen H, Budak Ü, Türkoğlu M, Hanbay D, Şengür A. Maxillary sinus detection on cone beam computed tomography images using ResNet and Swin Transformer-based UNet. Oral Surg Oral Med Oral Pathol Oral Radiol. 2024;138:149–61.

  30. Choi H, Jeon KJ, Kim YH, Ha EG, Lee C, Han SS. Deep learning-based fully automatic segmentation of the maxillary sinus on cone-beam computed tomographic images. Sci Rep. 2022;12:14009.

  31. Fernandes CL. Volumetric analysis of maxillary sinuses of Zulu and European crania by helical, multislice computed tomography. J Laryngol Otol. 2004;118:877–81.

  32. Coyner AS, Singh P, Brown JM, et al. Association of biomarker-based artificial intelligence with risk of racial bias in retinal images. JAMA Ophthalmol. 2023;141:543–52.

  33. Shetty S, Talaat W, Al-Rawi N, Al Kawas S, Sadek M, Elayyan M, Gaballah K, Narasimhan S, Ozsahin I, Ozsahin DU, David LR. Accuracy of deep learning models in the detection of accessory ostium in coronal cone beam computed tomographic images. Sci Rep. 2025;15:8324.


Acknowledgements

None.

Funding

No funding was received.

Author information

Authors and Affiliations

Authors

Contributions

Jiayi Chen made all contributions.

Corresponding author

Correspondence to Jiayi Chen.

Ethics declarations

Human ethics and consent to participate

The study protocol was established according to the ethical guidelines of the Declaration of Helsinki and was approved by the Medical Ethics Committee of the Suzhou Wujiang District Hospital of Traditional Chinese Medicine (Ethical Approval ID: 2024-KY-13-01). The Institutional Review Board (IRB) of the Suzhou Wujiang District Hospital of Traditional Chinese Medicine approved the informed consent exemption. This study had a noninterventional retrospective design, and no human experiments or human tissue samples were involved. All identifying information (e.g., name, sex, age) was concealed in the study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Chen, J. Convolutional neural network for maxillary sinus segmentation based on the U-Net architecture at different planes in the Chinese population: a semantic segmentation study. BMC Oral Health 25, 961 (2025). https://doi.org/10.1186/s12903-025-06408-1
