Introduction

Globally, about 7.5 million people were newly diagnosed with tuberculosis (TB) and notified in the year 2022. Among them, 6.2 million (83%) had pulmonary TB. The proportion of people diagnosed with microbiologically confirmed TB improved between 2018 and 2021, from 55% to 63%; it remained at 63% in 2022 (ref. 1). One of the recommendations of the World Health Organization’s (WHO) Global Task Force on TB Impact Measurement is to conduct national TB prevalence surveys in 22 priority countries as part of the End TB strategy. These survey results are essential for assessing progress towards global, regional and national targets for reductions in TB burden2.

WHO also recommends systematic screening of the general population in regions with a TB prevalence of more than 0.5% and of high-risk populations through active case finding (ACF)3. The cost implications of screening healthy or high-risk populations for TB infection and disease in low- and middle-income countries are a matter of concern. Simple, cheaper screening methods, such as symptom screening and chest X-rays followed by molecular diagnostic tests, are considered cost-effective strategies among the highest-risk groups4. ACF and prevalence surveys usually rely on symptom screening and chest X-ray for screening, and on diagnostic tests for microbiological confirmation. Sputum smear microscopy, WHO-recommended rapid molecular assays (mWRDs), and culture are the most commonly used diagnostic tests. Chada et al. argued that ACF using molecular diagnostic tests would result in increased reporting of false positive cases and suggested evaluating the accuracy of molecular tests as a diagnostic tool in community settings5. Shewade et al. suggested that replacing smear microscopy with mWRDs may increase the positive predictive value (PPV) to more than 90%, assuming the sensitivity of the mWRDs is similar in both ACF and passive case finding (PCF)6. Most of the diagnostic accuracy measures of mWRDs are derived from passive case finding or studies done in healthcare facilities. While the evidence on the diagnostic accuracy of mWRDs in the ACF setting is limited, it could be extrapolated from national prevalence surveys. The best estimates of the sensitivity and specificity of Xpert MTB/RIF in the general population in the national prevalence surveys were 69% and 99%, respectively7.
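The concern about false positives in community screening follows from Bayes' rule: at low prevalence, even a highly specific test can have a modest PPV. The following is a minimal Python sketch of that relationship. It uses the sensitivity (69%) and specificity (99%) quoted above for Xpert MTB/RIF and the 0.5% prevalence threshold as inputs; the function name and the other prevalence values are illustrative assumptions, not part of the survey.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' rule; all arguments are proportions between 0 and 1."""
    true_positive_share = sensitivity * prevalence
    false_positive_share = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_share / (true_positive_share + false_positive_share)


# Sensitivity and specificity: prevalence-survey estimates quoted in the text.
# Prevalence values: illustrative only; 0.005 is the 0.5% threshold.
for prevalence in (0.005, 0.02, 0.10):
    ppv = positive_predictive_value(0.69, 0.99, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}")
```

Under these assumptions, roughly a quarter of positive results at 0.5% prevalence would be true positives, which is why the accuracy of the confirmatory test in community settings matters.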

Furthermore, the TB program would benefit from investigating the performance of the screening tests that are followed by a diagnostic test in the sequential screening strategy. The Myanmar national prevalence survey estimated that screening for any symptoms and signs suggestive of TB had a sensitivity of 59.8% (95% CI: 54.1–65.3) and a specificity of 67.2% (95% CI: 66.7–67.6)8. The Kenya national tuberculosis survey revealed that the sensitivity of CXR interpreted by a medical officer as “suggestive of TB” for bacteriologically confirmed TB was 43.7% (95% credible interval (CrI): 23.8–66.4%) and the specificity was 89.2% (95% CrI: 89.0–89.6%)9. In the context of limited evidence on the performance of screening and diagnostic tests for TB disease detection in the community setting, we report the diagnostic accuracy of the screening and diagnostic tests used in the Tamil Nadu TB prevalence survey (TNTBPS). Our findings from this survey might help TB program managers, especially in high TB burden settings, to decide on the appropriate diagnostic algorithm to be used for ACF and prevalence surveys.

Results

We screened 130,932 individuals who consented to participate in the survey, and all participants were screened by at least one method. Among the screened population, 130,914 (99.9%) underwent symptom screening, and 125,870 (96.1%) completed both symptom screening and CXR examination. Of the 130,932 participants, 20,086 (15.34%) were found to be eligible for sputum collection, and eventually, 18,669 (92.9%) provided sputum for testing. Sputum could not be obtained from all eligible participants because of inability to produce sputum, unwillingness, and loss to follow-up during the survey. The first sputum sample was obtained from 18,654 (92.8%), and the second sputum sample was collected from 18,255 (90.8%) participants. Among the 18,669 participants who provided sputum, 17,184 had valid results for Xpert MTB/RIF, smear microscopy, and MGIT. We excluded 106 participants who were on treatment for tuberculosis and considered 17,078 for the final analysis of diagnostic accuracy. Of the 17,078 samples processed in the survey van for Xpert MTB/RIF, 185 (1.08%) tested positive for MTB (Fig. 1). In the reference laboratory, 89 (0.52%) samples were positive on smear microscopy and 96 (0.56%) were positive in MGIT. Slightly more than half of the participants were female (50.2%), and almost one-tenth (9.1%) gave a history of previous treatment for TB (Table 1).

Fig. 1 Flow diagram showing the number of participants screened, eligible for sputum collection, sputum samples tested and their results.

Table 1 Demographic characteristics of study participants.

Diagnostic accuracy of screening tests

The sensitivities of the individual symptoms in the screening checklist ranged from 3.1 to 41.6%, while the specificities ranged from 72.8 to 98.6%. Among all symptoms, cough of more than two weeks had the highest sensitivity (41.6%, 95% CI: 31.6–52.1) and the lowest specificity (72.8%, 95% CI: 72.1–73.5) (Table 2). Fever for 14 days or more had the highest specificity (98.6%, 95% CI: 98.4–98.7). The presence of any one of the symptoms suggestive of TB had a sensitivity of 55.2% (95% CI: 44.7–65.3) and a specificity of 50.9% (95% CI: 50.1–51.6). Among all the screening tests, the sensitivity of an abnormal chest X-ray was the highest (86.46%, 95% CI: 77.9–92.5), while its specificity (42.12%, 95% CI: 41.3–42.8) was lower than that of symptom screening. When CXR was combined with symptom screening, the sensitivity rose by nearly 11 percentage points (97.9%, 95% CI: 92.6–99.7), but the specificity decreased markedly (4.99%, 95% CI: 4.67–5.33).

Table 2 Diagnostic accuracy of screening tests.

Diagnostic accuracy of diagnostic tests

Among the diagnostic tests, Xpert MTB/RIF in the reference laboratory had the highest sensitivity (96.5%, 95% CI: 88–99.5) (Table 3). Smear microscopy had the highest specificity (99.7%, 95% CI: 99.6–99.8), followed by the Xpert assay in the mobile van (99.3%, 95% CI: 99.1–99.4). The sensitivity of Xpert MTB/RIF in the mobile van was 71.8% (95% CI: 61.7–80.5), and the sensitivity of smear microscopy was the lowest (53.13%, 95% CI: 42.6–63.3) among the diagnostic tests used in the survey. The kappa coefficients for agreement between sputum 1 and 2 were 0.64 (SD 0.077, substantial agreement) for smear microscopy and 0.592 (SD 0.077, moderate agreement) for MGIT.
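For readers unfamiliar with the kappa statistic, the following Python sketch shows how Cohen's kappa is computed from a 2×2 agreement table of paired results. The counts are hypothetical and are not survey data; the function names are illustrative. The agreement labels follow the commonly used Landis and Koch bands, which appear to match the labels used in the text.

```python
def cohen_kappa(both_pos, first_only, second_only, both_neg):
    """Cohen's kappa for two binary ratings of the same participants."""
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n
    first_pos = both_pos + first_only
    second_pos = both_pos + second_only
    # Agreement expected by chance from the marginal totals.
    expected = (first_pos * second_pos
                + (n - first_pos) * (n - second_pos)) / n ** 2
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Landis and Koch interpretation bands."""
    if kappa < 0:
        return "poor"
    bands = ((0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial"))
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical counts for illustration only (not from the survey).
kappa = cohen_kappa(both_pos=40, first_only=10, second_only=5, both_neg=45)
print(f"kappa = {kappa:.2f} ({agreement_label(kappa)} agreement)")
```

The sketch does not handle the degenerate case in which chance agreement equals 1, for example when every result is negative in both samples.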

Table 3 Diagnostic accuracy of diagnostic tests (Xpert MTB/RIF, smear microscopy).
Table 4 Diagnostic accuracy of screening and diagnostic tests.

Diagnostic accuracy of a combination of screening and diagnostic tests

We analyzed the combination of screening and diagnostic tests used in the survey. Xpert MTB/RIF with symptom screening alone yielded a sensitivity of 84.38% (95% CI: 75.5–90.9) and a specificity of 50.6% (95% CI: 49.9–51.4) (Table 4). When Xpert MTB/RIF was combined with CXR, it demonstrated a sensitivity of 93.75% (95% CI: 86.89–97.67) and a specificity of 42.50% (95% CI: 41.76–43.25) in the survey van. In the reference laboratory, the sensitivity was 90.63%, with almost similar specificity (42.60%). When CXR was added to symptom screening along with Xpert MTB/RIF, the sensitivity increased by almost 15 percentage points (98.9%, 95% CI: 94.3–99.9), with a decrease in specificity (4.98%, 95% CI: 4.6–5.3). This pattern was also observed when smear microscopy was added to symptom screening and CXR. However, the incremental yield in sensitivity was slightly lower for the Xpert assay (97.9%, 95% CI: 92.6–99.7) in the reference laboratory compared to the mobile van.
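The trade-off described here is what an "either test positive" combination rule produces: adding tests can only add positives, so sensitivity cannot fall and specificity cannot rise. A minimal Python sketch of that rule follows; the participant records are hypothetical and the function names are illustrative, not taken from the survey's analysis code.

```python
def combine_either(*results):
    """Combined result is positive if any component test is positive."""
    return any(results)


def sensitivity_specificity(index_results, reference_results):
    """Sensitivity and specificity of index results against a reference."""
    tp = fp = fn = tn = 0
    for index_pos, reference_pos in zip(index_results, reference_results):
        if reference_pos:
            tp += index_pos
            fn += not index_pos
        else:
            fp += index_pos
            tn += not index_pos
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical participants: (any_symptom, xpert_positive, mgit_positive).
participants = [
    (True, True, True), (False, True, True),
    (True, False, True), (False, False, True),
    (True, False, False), (False, False, False),
    (False, False, False), (True, False, False),
]
reference = [mgit for _, _, mgit in participants]
xpert_alone = [xpert for _, xpert, _ in participants]
combined = [combine_either(symptom, xpert) for symptom, xpert, _ in participants]

print("Xpert alone:", sensitivity_specificity(xpert_alone, reference))
print("Symptom or Xpert:", sensitivity_specificity(combined, reference))
```

In this toy data set, the combined rule picks up a culture-positive participant that Xpert alone misses, while symptomatic culture-negative participants become false positives.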

Discussion

Cough has been one of the most sensitive symptoms suggestive of tuberculosis in both passive case finding and ACF. WHO’s screening guidelines reported a sensitivity of 42% for cough among HIV-negative individuals, which is consistent with our survey (41.6%)3. However, this was lower than in the prevalence survey from Kenya [52% (95% CI: 41–63)], which was done between 2005 and 2007 (ref. 10). The specificity (72.8%) of cough in our survey was also lower than that of the Kenyan survey [89% (95% CI: 88–90)]10. The Kenyan TB prevalence survey reported a sensitivity of 90% (95% CI: 84–95) and a specificity of 32% (95% CI: 30–34) for the presence of any one symptom suggestive of TB. While we estimated a lower sensitivity (55.2%) than the Kenyan survey, our specificity was higher (50.9%) for any one symptom suggestive of TB. This is likely attributable to the fact that the survey conducted in Kenya included a substantial number of individuals who were HIV positive or had an unknown HIV status. Our sensitivity estimate for the presence of any one symptom is very similar to that of the survey conducted in Myanmar (59.8%), while the specificity (50.9%) is lower than that of the Myanmar survey (67.2%)8. A Cochrane review of 31 studies done among participants who were screened for tuberculosis estimated a sensitivity of 70.6% (95% CI: 61.7–78.2%) and a specificity of 65.1% (95% CI: 53.3–75.4%) for any tuberculosis symptoms. Symptom screening is the simplest screening tool and can be used even in resource-limited settings. Though this tool is considered to have low accuracy, we found that the specificity of symptom screening was higher than that of CXR11.

The sensitivity (87.25%) of CXR in our survey was almost similar to the Kenyan survey. However, the specificity (90.62%) was lower. CXR has been widely used in ACF and prevalence surveys and is known to be a screening tool with high sensitivity10. The aforesaid Cochrane review also included 19 studies and estimated a sensitivity and specificity of 84.8% (95% CI: 76.7–90.4) and 95.6% (95% CI: 92.6–97.4), respectively, for CXR abnormalities suggestive of TB11. Though CXR sensitivity in our survey was higher, the specificity was lower than in this review. When symptom screening was combined with CXR, the sensitivity increased significantly (98.04%) at the cost of a reduction in specificity. This indicates that CXR and symptom screening combined can significantly reduce false negatives in the screened population. Though CXR is a good screening test, it has a few limitations: the necessity for radiation-shielded vehicles; the higher cost of acquiring and maintaining the vehicles and X-ray equipment; the availability of technicians; the crucial requirement for medical officers or radiologists to interpret CXR results; inconsistencies in reporting between different observers and even within the same observer; and the exposure of apparently healthy individuals to radiation. However, these limitations and barriers may be surmounted by recent advancements in portable X-ray machines and in artificial intelligence to facilitate reporting. Given these advancements, our results indicate that it is beneficial to allocate resources toward advanced CXR technology in both ACF and prevalence surveys.

We estimated a sensitivity of 71.88% (95% CI: 61.7–80.5) for Xpert MTB/RIF performed in the survey van. This is higher than the sensitivity of Xpert MTB/RIF in the recent prevalence surveys conducted in Kenya (69%), the Philippines (69%) and Vietnam (68%)7. However, our sensitivity estimate is lower than the 84% (95% CI: 78–84) reported in Bangladesh’s prevalence survey. The specificities of Xpert MTB/RIF from these surveys corroborate our findings. The sensitivity of the molecular test in the reference laboratory [96.55% (95% CI: 88.0–99.5)] was significantly higher than in the survey van. We could postulate several reasons for the lower sensitivity in the van, such as temperature in the field setting and other environmental factors. The pooled sensitivity of Xpert Ultra (78%, 95% CI: 69–84%) was higher than that of Xpert MTB/RIF (73%, 95% CI: 62–82%) in the recent prevalence surveys (South Africa, Myanmar, Lesotho, and Zambia) conducted between 2017 and 2019 (ref. 7). Future surveys should also consider WHO-recommended low-complexity automated NAATs such as Truenat MTB Plus and Xpert Ultra to increase the yield of the survey, as they have shown better performance in the healthcare setting and in passive case finding. It is worthwhile to note that though smear microscopy had the lowest sensitivity (53.13%, 95% CI: 42.6–63.9), it had the highest specificity (99.78%, 95% CI: 99.6–99.8) among all the diagnostic tests. This implies that smear microscopy could still be used to confirm a diagnosis when molecular tests are unavailable in resource-limited settings.

When we combined our screening and diagnostic tests, there was a significant increase in sensitivity with a reduction in specificity. However, our screening and diagnostic approach yielded a negative predictive value of 99.8%. Our participants underwent highly sensitive screening tests followed by a highly specific confirmatory test. CXR yielded a significant proportion of false positive results, which were later eliminated by Xpert MTB/RIF. Possible reasons are that the survey was done during and immediately after the COVID-19 pandemic and that the CXRs were interpreted by both trained medical officers and specialists. The primary objective of the prevalence survey is to identify the true prevalence in the community, including cases that were not routinely detected by passive case finding through the healthcare system. This will enable us to estimate the prevalence-notification gap, which is an initial step of the TB care cascade and an indicator of the efficiency of the TB management system in the state. Conducting an initial screening combining symptoms and chest X-rays may significantly reduce the likelihood of false negative results and enable participants to proceed with the specified diagnostic tests. We found that the true positive rate was significantly higher when all the tests were combined. We also noted similar estimates when the molecular test was replaced with smear microscopy.

Methodology

Tamil Nadu is a southern state in India with 33 administrative districts and a population of more than 60 million. The survey covered 180 clusters across all the districts using a multistage cluster sampling design with a target sample size of 144,000, and screened 130,932 individuals from the 143,005 eligible population12. The number of clusters in each district was allocated in proportion to the district's population size.
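As an illustration of allocating clusters in proportion to population, the sketch below distributes a fixed number of clusters across districts and resolves rounding with the largest-remainder method. The rounding rule, the function name and the district populations are assumptions for illustration; the source does not describe how fractional allocations were handled.

```python
def allocate_clusters(populations, total_clusters):
    """Allocate clusters to districts in proportion to population.

    Uses largest-remainder rounding (an assumed rule, not from the survey).
    """
    total_population = sum(populations.values())
    quotas = {
        district: total_clusters * population / total_population
        for district, population in populations.items()
    }
    # Start from the whole-number part of each district's quota.
    allocation = {district: int(quota) for district, quota in quotas.items()}
    leftover = total_clusters - sum(allocation.values())
    # Give the remaining clusters to the largest fractional remainders.
    by_remainder = sorted(
        quotas, key=lambda d: quotas[d] - allocation[d], reverse=True
    )
    for district in by_remainder[:leftover]:
        allocation[district] += 1
    return allocation


# Hypothetical district populations, for illustration only.
example = {"District A": 450_000, "District B": 350_000, "District C": 200_000}
print(allocate_clusters(example, total_clusters=9))
```

A real design would usually also guarantee a minimum number of clusters per district, which this sketch does not do.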

Participants and screening procedures

We included participants aged 15 years and older in the selected village or urban census enumeration block. We excluded hospitalised residents, institutional populations, and sick or bedridden individuals. All the eligible participants initially underwent screening by a symptom checklist followed by a digital chest X-ray (PA view) in the mobile survey van. Trained survey staff screened the population with a symptom checklist that consisted of fever, cough, weight loss, haemoptysis, and chest pain in the month prior to the survey. Participants with any of the symptoms, an abnormal chest X-ray, or a history of current or past anti-TB treatment were defined as eligible for sputum testing and asked to give sputum for examination.
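The eligibility rule for sputum collection described above can be written as a simple "any of" condition. The Python sketch below is illustrative only: the record structure and field names are assumptions, and a missing chest X-ray is treated here as not abnormal, which the source does not specify.

```python
from dataclasses import dataclass, field

# Symptoms on the screening checklist, as listed in the text.
SYMPTOMS = ("fever", "cough", "weight_loss", "haemoptysis", "chest_pain")


@dataclass
class ScreeningRecord:
    """Hypothetical per-participant screening record."""
    symptoms: set = field(default_factory=set)
    cxr_abnormal: bool = False          # assumed False when CXR is missing
    tb_treatment_history: bool = False  # current or past anti-TB treatment


def eligible_for_sputum(record):
    """Eligible if any symptom, abnormal CXR, or treatment history."""
    has_symptom = any(symptom in SYMPTOMS for symptom in record.symptoms)
    return has_symptom or record.cxr_abnormal or record.tb_treatment_history


print(eligible_for_sputum(ScreeningRecord(symptoms={"cough"})))  # True
print(eligible_for_sputum(ScreeningRecord()))                    # False
```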

Study procedures

The mobile survey van was equipped with an X-ray unit, Xpert MTB/RIF machines, a refrigerator, and a laptop. Eligible participants provided 3 to 5 ml of sputum for testing. The first sample was processed and analyzed in the mobile van used for the survey using Xpert MTB/RIF to detect MTB and rifampicin resistance. Subsequently, a second sputum sample was collected within 24 h and transported within 48 h to a reference laboratory which was identified before the survey. The second sample was subjected to AFB smear microscopy, liquid culture (Mycobacteria Growth Indicator Tube, MGIT) and drug susceptibility testing (DST) in the reference laboratories located in Chennai and Madurai in Tamil Nadu, India. If MTB was detected by Xpert MTB/RIF in the survey van from the first sample, a third sample was collected and transported to the reference laboratory for Xpert MTB/RIF, smear microscopy, liquid culture and DST. Acid-fast bacilli smears were examined with fluorescence microscopy. Xpert MTB/RIF (Cepheid, Inc.) was performed in accordance with the manufacturer’s instructions. Xpert MTB/RIF results are automatically generated (i.e., there is a single threshold), and the user is provided with printable test results. Indeterminate results were reported after repeating the test. Xpert MTB/RIF results from the survey van were not shared with the reference laboratory. The samples were decontaminated at the reference laboratory and inoculated into MGIT960 (Becton Dickinson, Sparks, MD, USA), a liquid culture method for the detection of mycobacteria. Subsequently, speciation was done for all the positive cultures, which were confirmed as M. tuberculosis using smear microscopy and the TBc Identification test (TBcID, Becton Dickinson, Sparks, USA).

Quality assurance

Trained technicians performed all the laboratory tests in the study according to the standard operating procedures (SOPs). Temperature monitors were used to maintain the refrigerator for reagent storage at 2–8 °C and the Xpert cabin at 20–25 °C in the mobile vans. Using a specialized transport system, it was ensured that the samples reached the reference laboratories within 24–28 h. Our quality assurance team, consisting of microbiologists, scientists from ICMR-NIRT, and district TB officers, visited each participating laboratory and ensured the procedures were done as per the SOPs. National TB Elimination Program (NTEP) recommendations were followed for the quality assurance of smear, Xpert MTB/RIF, and culture in the reference laboratories. All the medical officers, X-ray technicians, and teleradiologists were trained in all the SOPs. The medical officer in the field carried out a spot quality check of each X-ray image, and if the image was not acceptable, the chest X-ray was repeated. Two teleradiologists reported all the CXRs, and if there was a discrepancy between the radiologists, a third radiologist provided the final report.

Data entry

Field data was collected electronically using a customized Android application, while data from reference laboratories and the teleradiology panel was entered into a customized online web application designed for the survey. The survey data was monitored periodically, synced and stored in the server at ICMR-NIRT, Chennai. Data cleaning was done to remove duplication, outliers, and misclassification of variables before the analysis.

Data analysis

All statistical analyses were done using Stata 16.0 (Stata Corporation, College Station, TX, USA). Demographic characteristics were analyzed and presented as numbers and percentages. We excluded those who were currently on TB treatment from the analysis of diagnostic accuracy. Sensitivity, specificity, positive predictive values (PPVs), negative predictive values (NPVs), positive likelihood ratio (PLR) and negative likelihood ratio (NLR) were calculated and presented with 95% confidence intervals. We calculated the diagnostic accuracy of the screening tests (symptom screening and CXR) and the diagnostic tests (Xpert MTB/RIF and smear microscopy) against MGIT as the reference standard. We also evaluated the accuracy of combinations of diagnostic tests and symptom screening. For a combination, a positive result for a diagnostic test or the presence of any one symptom was considered positive and compared to the MGIT result. A positive MGIT result was considered as disease positive for our analysis.
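The accuracy measures listed above all derive from a 2×2 table of index test against reference standard. The Python sketch below shows the calculations; it is not the Stata code used in the survey. The counts are back-calculated from the reported totals for Xpert MTB/RIF in the survey van (17,078 analysed, 185 Xpert-positive, 96 MGIT-positive, sensitivity about 71.8%) and are therefore an assumption rather than published table values. The Wilson score interval is used for simplicity; the source does not state which interval method was applied, so the limits may differ slightly from those reported.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, 95% by default)."""
    p = successes / total
    denominator = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denominator
    half_width = z * math.sqrt(
        p * (1 - p) / total + z ** 2 / (4 * total ** 2)
    ) / denominator
    return centre - half_width, centre + half_width


def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table against the reference."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "plr": sensitivity / (1 - specificity),
        "nlr": (1 - sensitivity) / specificity,
    }


# Back-calculated counts (assumption): Xpert in the van versus MGIT.
TP, FP, FN, TN = 69, 116, 27, 16866
metrics = diagnostic_accuracy(TP, FP, FN, TN)
sens_low, sens_high = wilson_interval(TP, TP + FN)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
print(f"sensitivity 95% CI (Wilson): {sens_low:.3f} to {sens_high:.3f}")
```

The likelihood ratios are undefined when specificity is exactly 0 or 1; a production analysis would need to handle those cases explicitly.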

Strengths and limitations

Our survey had several strengths. We strictly adhered to the design and methodology recommended by the WHO. The staff were trained in the screening and diagnostic procedures, and they were closely monitored for adherence to the SOPs of the survey. The survey also had limitations. We had difficulty in obtaining good-quality sputum from the participants. The survey included apparently healthy individuals who could come to the survey site. We could have missed individuals who were sick and bedridden at home or unable to stand for CXR.

Conclusion

Symptom screening and CXR were highly sensitive screening tools that can be used in prevalence surveys. The mWRD (Xpert MTB/RIF) was found to be highly specific even in the prevalence survey setting. We recommend a diagnostic algorithm consisting of symptom screening and CXR followed by mWRDs for future prevalence surveys and active case finding.