Clinical Evaluation of a Line-Probe Assay for Tuberculosis Detection and Drug-Resistance Prediction in Namibia

ABSTRACT Treatment of tuberculosis requires rapid information about Mycobacterium tuberculosis (Mtb) drug susceptibility to ensure effective therapy and optimal outcomes. At the tuberculosis referral hospital in Windhoek, Namibia, a country of high tuberculosis incidence, we evaluated the diagnostic accuracy of a line-probe assay (LPA), GenID, for the molecular diagnosis of Mtb infection and drug resistance in patients with suspected tuberculosis (cohort 1) and confirmed rifampin (RIF)-resistant tuberculosis (cohort 2). GenID test results were compared to Xpert MTB/RIF and/or Mtb culture and antimicrobial susceptibility testing. GenID LPA was applied to 79 and 55 samples from patients in cohort 1 and cohort 2, respectively. The overall sensitivity of GenID LPA for the detection of Mtb DNA in sputum from patients with detectable and undetectable acid-fast bacilli by sputum smear microscopy was 93.3% (56/60; 95% confidence interval = 83.8–98.2) and 22.7% (5/22; 7.8–45.4), respectively. The sensitivity/specificity for the detection of drug resistance was 84.2% (32/38; 68.7–94.0)/100% (19/19; 82.4–100.0) for RIF, 89.7% (26/29; 72.6–97.8)/91.7% (22/24; 73.0–99.0) for isoniazid, and 85.7% (6/7; 42.1–99.6)/94.7% (18/19; 74.0–99.9) for fluoroquinolones; 23.6% of tests for second-line injectable resistance were invalid despite repeat testing. The diagnosis of tuberculosis by detection of Mtb DNA in sputum by GenID LPA depends strongly on the detection of acid-fast bacilli in the sputum specimen. Prediction of drug resistance by GenID did not reach the World Health Organization (WHO) target product profile.

IMPORTANCE Mycobacterium tuberculosis (Mtb) drug-resistance detection is crucial for successful control of tuberculosis. Line-probe assays (LPA) are frequently used to detect resistance to rifampin, isoniazid, fluoroquinolones (FQs), and second-line injectables (SLIs). GenID RIF/isoniazid (INH), FQ, and SLI LPA have not been widely tested and used so far. This study tested the diagnostic performance of the GenID LPA in a real-world setting with a high incidence of TB and HIV in Namibia. The LPA demonstrated only acceptable diagnostic performance for Mtb and drug-resistance detection. The diagnostic sensitivity and specificity fall short of the WHO suggested target product profiles for LPA.
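The shortfall against the WHO target product profile can be checked from the counts given in the abstract. The sketch below is illustrative only: it compares point estimates (ignoring the confidence intervals) with the minimum targets quoted in the authors' response to Comment 15 (sensitivity > 95% for RIF and > 90% for INH and FQ, specificity > 95%). The names `REPORTED`, `TPP_MIN_SENSITIVITY`, `TPP_MIN_SPECIFICITY` and `meets_tpp` are hypothetical and are not part of the manuscript. SLI is omitted because the abstract gives no counts for it.

```python
# Minimum targets as quoted in the response to Comment 15 (values must exceed these).
TPP_MIN_SENSITIVITY = {"RIF": 0.95, "INH": 0.90, "FQ": 0.90}
TPP_MIN_SPECIFICITY = 0.95

# Counts from the abstract: "sens" = (detected resistant, all resistant),
# "spec" = (detected susceptible, all susceptible).
REPORTED = {
    "RIF": {"sens": (32, 38), "spec": (19, 19)},
    "INH": {"sens": (26, 29), "spec": (22, 24)},
    "FQ": {"sens": (6, 7), "spec": (18, 19)},
}


def meets_tpp(drug):
    """Return (sensitivity_ok, specificity_ok) for one drug, using point estimates only."""
    sens_num, sens_den = REPORTED[drug]["sens"]
    spec_num, spec_den = REPORTED[drug]["spec"]
    sensitivity = sens_num / sens_den
    specificity = spec_num / spec_den
    return (
        sensitivity > TPP_MIN_SENSITIVITY[drug],
        specificity > TPP_MIN_SPECIFICITY,
    )


for drug in REPORTED:
    sens_ok, spec_ok = meets_tpp(drug)
    print(f"{drug}: sensitivity target met={sens_ok}, specificity target met={spec_ok}")
```

With these counts, only the RIF specificity exceeds its target, which agrees with the abstract's conclusion.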


Comment 15: Line 126/127 Instead of 'concur with', should this be 'confer'?
Thank you for your comment. We changed the wording of the sentence. It now reads: Mutations at positions rrs 1401, 1402 are associated with resistance to AM/KM and mutations at position rrs 1484 with AM/KM/CM resistance.(6)

Comment 16: Line 173/174 reference is needed
The reference was included.

Comment 18: Line 197 '...maximum of four repeat runs' is unsustainable in a clinical laboratory.
We fully agree with the reviewer that this is not sustainable. During the establishment of the laboratory and the pilot phase of the study, we noted the problems with test performance described in the paper. Table 4 quantifies the need for repeat testing. We opted for this approach in order to evaluate and depict the challenges of test performance in terms of the mean number of test runs needed to obtain a valid result. The highest mean number of runs needed to obtain a valid result was 2.15, for the SLI strip.
Comment 19: Line 201 'technicians' - recommend using a broader term such as analyst. In some countries the term 'technician' is a well-defined human resources category and is differentiated from 'technologist'.
Thank you, this was changed.

Comment 20: Line 218 MSF is 'Medecins Sans Frontieres'
Thank you, this was changed.
Comment 21: Line 280 One cannot consider the GenID LPA as novel since two studies were already published in 2014 and 2015.
We agree and removed the word "novel" in this line and "new" in the title of the paper.
Comment 22: Line 286 'paucibacillary' is not the opposite of AFB smear-positive...
We agree that the word is not appropriate here and changed the sentence: While the test sensitivity was 93% in patients with detectable AFB on sputum smear microscopy, test sensitivity was only 23% in patients without AFB in sputum smear microscopy.

Thank you for giving me the opportunity to review the manuscript by Guenther and colleagues reporting on the performance of the AID suite of TB line probe assays in Namibia. The manuscript is well written and intelligible. I have some major and minor comments that the authors and editors may find helpful.

Major comments:
Comment 1: Title: My understanding is that the authors aimed to evaluate three products (GenID RIF/INH, GenID SLI, GenID FQ) and not one. This should be reflected in the study title.
Thank you for the comment. We evaluated the LPA, which to our understanding is one product, developed to detect a range of Mtb drug-resistance mutations using 3 different strips under the same workflow and protocols. We aimed to use a short title for the study, but we adjusted the title based on your comments to: Clinical evaluation of a line probe assay for tuberculosis detection and drug-resistance prediction in Namibia
Comment 2: Are the assays actually commercially available? I could not find the respective information on the manufacturer's website.
The assay is commercially available, has a CE mark and is in use by a few laboratories for routine diagnostics. See the link to the manufacturer's webpage: https://www.aiddiagnostika.com/kits/molekularbiologischer-assay/infektionsdiagnostik/antibiotika-resistenzen/tbmodul-inh-/-rif
Comment 3: The role of SLI testing has significantly declined since recent changes to WHO therapeutic regimens and a push towards all oral regimens. KAN and CM are no longer recommended therapeutic agents. So the assay design appears somewhat outdated to me for a novel test. Is the test really "novel" (line 116)?
Thank you for the comment. We fully agree with this comment and removed the words novel and new from the manuscript.
Comment 4: The sample numbers (79 and 55) are low. What were the considerations around minimum sample numbers? Was there some sort of power calculation? Why did the authors choose not to run the study for a longer time?
The study was implemented in the context of building laboratory capacity and workflows for molecular diagnostics in a low-resource setting. Capacity for molecular diagnostics of drug resistance did not previously exist at the University of Namibia laboratories. Logistical challenges and funding constraints for staff and consumables allowed only a single-center study with consecutive sampling for one year.
The authors would have liked to increase the sample size, particularly for DR-TB patients, but for the reasons given, mostly funding, this was not possible.
Comment 5: Were leftover samples used? Were they analyzed retrospectively? If so, were they frozen (at what temperature?) between routine testing and performance of the LPA?
No leftover samples were used. Samples were collected prospectively for the study and, in all cases, analyzed with the LPA as fresh sputum samples. No frozen samples were used.
Comment 6: Were operators performing the LPA blinded to the results of the routine diagnostic workup?
Operators were not blinded to the routine workup. However, the LPA laboratory had no direct access to the laboratory information system of NIP, and neither did the operators. In addition, LPA results were interpreted immediately when they were obtained; culture-based resistance results could not influence this interpretation, as they became available only weeks later.

Comment 7: The authors should mention the intended use of the assays as per the package insert. Does the manufacturer actually claim that the test can be used as a screening PCR to identify patients with TB (as opposed to a reflex test following another screening PCR to identify MDR-TB)?
The package insert says that the test can be used to test for the presence of M. tuberculosis and for resistance testing. Quote from the package insert: "This single kit assay by GenID® GmbH, Straßberg, which is based on a multiplex PCR followed by reverse hybridization using sequence-specific oligonucleotide probes (SSOP), enables accurate and rapid modular detection of Mycobacteria of the Mycobacterium tuberculosis complex and its resistances to Isoniazid and Rifampicin." We added a sentence which specifies this: The LPA can be used for the detection of Mtb complex and Mtb drug-resistance mutations against RIF, INH, EMB, FQ and SLI.
The manufacturer of the test does not specify the materials to be used. The package insert only says: "……using DNA isolated from an appropriate specimen."
Comment 9: For any discussion around the sensitivity of the assay, it would be necessary to understand whether a single-copy (e.g. rpoB) or a multi-copy (e.g. IS6110) target is used for detection of MTBC.
Unfortunately, the manufacturer does not specify in its documentation whether a single-copy or a multi-copy target is used.
Comment 10: In relation to my previous comment: Which TB lineages were covered by the sample panel (IS6110 copy numbers are variable between MTBC members)? Can the authors comment on whether the assay will show similar performance characteristics in settings with different regional epidemiology? If not, this should be mentioned as a limitation.
The manufacturer only states in the package insert that the assay detects M. tuberculosis complex. Only two studies of the assay have been performed, both in Europe, with samples from Spain, Switzerland and South Africa. These studies focus on resistance detection. Therefore, nothing is known so far about the performance of the LPA in different epidemiological settings. We added this point to the limitations. It reads: The impact of different Mtb lineages on Mtb detection could not be evaluated.
Comment 11: Lines 112-114: "LPA diagnostic performance is usually compared with culture-based drug susceptibility testing (DST), which remains the reference standard of Mtb resistance testing." This statement does not reflect the best practice. While culture can be considered the reference standard for detection of MTBC nucleic acid, predictions for RIF and INH susceptibility / resistance should not be compared with pDST alone but with a composite reference standard comprising pDST and the rapid molecular test that is validated in the respective routine diagnostic workflow (often Xpert or HAIN GenoType MTBDRplus or both). Discrepant samples should then be resolved using another method, preferably whole genome sequencing. This is important in the light of mutations conferring borderline resistance to rifampicin, which are not necessarily detected by pDST alone.
Thank you for this important statement. We fully agree with your comment. Ideally, we would have compared LPA results with genome sequencing results, but logistics, setting and funding did not allow the use of sequencing methodology. In this study we had to contend with the challenges of a resource-limited setting. Stock-outs of Xpert and culture reagents at the Namibia Institute of Pathology (NIP), which performs Xpert MTB/RIF and culture/DST, had an impact on the study. Therefore, we used a combined reference standard of culture and Xpert MTB/RIF for Mtb detection. Not doing so would have further reduced the already low sample size. A follow-up study using genome sequencing, currently in manuscript preparation, highlights the discrepancies in resistance detection between Xpert MTB/RIF and culture in Namibia, in particular for rifampicin. The change in the critical concentration for rifampicin, now also implemented in the NIP laboratory, should mitigate this problem.
The results of the follow-up study show that Xpert MTB/RIF predicted rifampicin resistance more accurately than phenotypic AST at the time of the GenID LPA study. Based on this recent analysis, we therefore decided in the revised version of the manuscript that the reference standard for rifampicin resistance should be Xpert MTB/RIF, not phenotypic DST. Table 2 and the results section were revised accordingly.
We added a sentence clarifying the desired reference standards for LPA performance evaluation: LPA diagnostic performance for Mtb detection is mostly compared with Mtb culture.(3) Antimicrobial susceptibility testing (AST) performance is compared increasingly to molecular (i.e. next generation genome sequencing) and culture based methods as reference standards.(4)
Comment 12: References are partially outdated. For example, when relating to WHO guidance, the introduction should summarize the rapid molecular WHO-recommended tests as outlined in the WHO consolidated guidelines/operational handbooks on tuberculosis. Module 3: Diagnosis - Rapid diagnostics for tuberculosis detection 2021 update. These documents should be cited as references.
Thank you for the important comment. The updated 2021 guidelines for TB diagnostics have been included in the introduction and references, specifying the current role of LPA in the TB diagnostic pathway. We did not discuss the PZA-LPA, as PZA is not tested in the GenID assay. Currently WHO recommends the use of LPAs in sputum-smear positive patients and from culture positive specimen as initial test for detection of resistance to RIF and INH instead of culture based phenotypic antimicrobial susceptibility testing (AST). In addition, LPAs are recommended in patients with multidrug-resistant (MDR)/ RIF-resistant (RR) TB as initial test to detect resistance to FQ and SLI.(2)
Comment 13: Thank you for the important comment. We discussed the point during manuscript preparation. We combined the reference standards Xpert MTB/RIF and culture for Mtb detection, as the sample size of the study is already low, and using only one reference standard would have reduced it further. Unfortunately, the Namibia Institute of Pathology laboratory, the only culture lab in the country, had stock-outs of culture reagents, which forced us into this suboptimal situation, as we could not use culture consistently as the reference standard.
This is the bitter reality of building research and research capacity in such settings. We aimed for culture as the reference standard in all cases. We recalculated how many patients we would lose by using only culture as the reference standard for Mtb detection, not combined with Xpert: 16/79 in cohort 1 and 6/55 in cohort 2 had no culture result. Based on these calculations and the small sample size, we still agree with the reviewer's comments but feel the combined reference standard is an acceptable alternative, considering the circumstances. In addition, we might argue that recent diagnostic performance studies, like the evaluation of the Xpert MTB/XDR assay (https://doi.org/10.1016/s1473-3099(21)00452-7), define Mtb detection by use of Xpert MTB/RIF or Xpert Ultra.
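The sample-size argument in this response is simple arithmetic and can be reproduced as follows. This is only an illustrative sketch using the counts stated in the response (16 of 79 and 6 of 55 patients without a culture result). The names `COHORTS`, `retained_with_culture_only` and `summary` are hypothetical.

```python
# (enrolled patients, patients without a culture result), as stated in the response.
COHORTS = {"cohort 1": (79, 16), "cohort 2": (55, 6)}


def retained_with_culture_only(enrolled, no_culture):
    """Patients still evaluable if culture alone were the reference standard."""
    return enrolled - no_culture


def summary():
    """Map each cohort to (patients retained, fraction of patients lost)."""
    rows = {}
    for name, (enrolled, no_culture) in COHORTS.items():
        kept = retained_with_culture_only(enrolled, no_culture)
        rows[name] = (kept, no_culture / enrolled)
    return rows


for name, (kept, lost_fraction) in summary().items():
    print(f"{name}: {kept} retained, {lost_fraction:.1%} lost")
```

A culture-only reference standard would therefore leave 63 and 49 evaluable patients in cohorts 1 and 2, respectively.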
Comment 14: For any comparison between the LPA and other rapid molecular tests, the authors should indicate the sample volumes that were used as an input and the range of volumes recommended by the manufacturer. In similar studies, the available leftover sample volumes used for the index test tend to result in input volumes that are rather on the lower end of the recommended range, whereas the volume used for the standard-of-care assay (e.g. Xpert) is typically rather on the higher end of the range. This may introduce systematic bias.
We agree with your comment. The manufacturer of GenID does not recommend a specific sample volume. In this study, a spot sample and an early morning sample were used for Xpert MTB/RIF, smear microscopy and culture at the NIP laboratory. If sufficient sputum remained, DNA extraction was performed for LPA. If not, an additional sample was collected. This might partially explain the need for repeat testing, as depicted in table 4. We clarified this by editing the methods section with the sentence: GenID ® LPA testing was performed according to manufacturer instructions. DNA was extracted from decontaminated sputum samples, after aliquots were used for smear microscopy, Xpert ® MTB/RIF and culture first, using the hot sodium hydroxide and tris (HotSHOT) protocol and heat inactivation at 95°C for 5 min.(15)
Comment 15: Line 358/359: To make it easier for the reader, the authors could briefly explain the criteria used in the WHO TPP.
Thank you for the comment. We added the TPP requirements. The sentence now reads: The assay misses the target product profiles for drug-resistance testing, suggesting a minimal sensitivity > 95% for RIF and > 90% for INH, FQ and SLI and specificity > 95% compared to culture-based AST, recommended by WHO. (19)

Reviewer #2 (Minor Comments for Author (Required)):

Comment 17: Lines 284-289: Redundant, could be shortened
Thank you for the comment. We completely changed the paragraph, based on the change of the reference standard for RIF resistance from culture/DST to Xpert MTB/RIF, avoiding redundant results.

The authors describe the results of an evaluation of the performance of the GenID LPA for the detection of Mtb and the detection of drug resistance. Although the scientific approach and experimental design are appropriate, the description of the results is a bit confusing.
Comment 1: For example, this reviewer found the information in Table 2, Figure 1, and lines 253-255 confusing and contradictory. Table 2 states a sensitivity of 90.5% (19/21) and a specificity of 82.8% (24/29) for detection of RIF resistance whereas the data in Figure 1 suggest a sensitivity of 84% (32 of 38 RIF-resistant MTB+ samples in cohort 2 with LPA results) and a specificity of 100% (21/21 RIF-susceptible MTB+ samples in cohort 1 with LPA results). A 2-by-2 table comparing the GenID result for RIF resistance with the Xpert result for RIF resistance would be useful to the reader.
Thank you for your very important comment. Data in figure 1 were based on RIF resistance detection by Xpert MTB/RIF, while data in table 2 were based on phenotypic DST. This explains the difference. We have now changed the reference standard for RIF resistance detection in table 2 to Xpert MTB/RIF. At the time of the study, the critical concentration for RIF resistance used, based on WHO recommendations, was 1.0 μg/ml. It is known from later work that this missed a number of RIF-resistant cases, which were false negative in phenotypic DST compared to Xpert MTB/RIF. The sensitivity for RIF resistance testing is now only 84.2%, compared to 90.5% if phenotypic DST for RIF is the reference standard. Table S2 gives similar results if cohort 1 and cohort 2 are analysed separately.
We added a sentence in the methods section: Results were compared to Xpert ® MTB/RIF for RIF resistance, to Mtb culture (BACTEC ® MGIT 960; Becton Dickinson, Sparks, USA) and phenotypic AST results for all other drugs. And we also highlighted the different reference standards in the discussion: For GenID RIF/INH we used Xpert MTB/RIF resistance detection as reference standard for RIF, and AST for INH. This decision was taken, as it is known from later, yet unpublished work, that Xpert MTB/RIF predicted RIF resistance more accurately than AST with a critical concentration of 1.0 μg/ml for RIF, used at the time of the study.
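The 2-by-2 comparison requested by the reviewer can be reconstructed from the revised RIF figures in the abstract (sensitivity 32/38, specificity 19/19 against Xpert MTB/RIF). The sketch below is illustrative only and assumes that these denominators contain exactly the samples with valid results from both tests. The function names `two_by_two` and `format_table` are hypothetical.

```python
def two_by_two(sens_counts, spec_counts):
    """Derive 2x2 cell counts from (TP, reference-resistant total) and (TN, reference-susceptible total)."""
    tp, n_resistant = sens_counts
    tn, n_susceptible = spec_counts
    return {"TP": tp, "FN": n_resistant - tp, "FP": n_susceptible - tn, "TN": tn}


def format_table(cells):
    """Render the cells with the reference test (Xpert) in columns and GenID in rows."""
    lines = [
        f"{'':<22}{'Xpert RIF-R':>12}{'Xpert RIF-S':>12}",
        f"{'GenID RIF-resistant':<22}{cells['TP']:>12}{cells['FP']:>12}",
        f"{'GenID RIF-susceptible':<22}{cells['FN']:>12}{cells['TN']:>12}",
    ]
    return "\n".join(lines)


# Counts taken from the abstract: 32/38 resistant detected, 19/19 susceptible detected.
rif_cells = two_by_two((32, 38), (19, 19))
print(format_table(rif_cells))
```

Under that assumption, the table contains 6 Xpert-resistant samples called susceptible by GenID and no samples called resistant by GenID alone.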
Comment 2: Lines 243-249. The authors should clearly justify the rationale for combining the data from cohort 1 and cohort 2 with respect to assessing the ability to detect Mtb. Combining the two cohorts generates a patient population in which 61% of the patients have bacteriologically confirmed TB - a proportion that is unlikely to represent the situation in a clinical setting.
Thank you for this important comment. As highlighted earlier in our responses, the authors discussed this approach at length at the analysis stage. We opted to combine the 2 cohorts to allow for meaningful stratification of results by smear status and HIV status, as this is the first study assessing the GenID assay in a real-world setting with a high incidence of HIV. In a real-world setting, the primary use of the LPA is resistance detection, not MTB detection. As MTB detection and resistance detection are closely linked when using the GenID LPA, and considering the intention to stratify, we opted to combine the cohorts. In cohort 2, TB was confirmed by Xpert MTB/RIF in all but 2 cases. We opted to also use this cohort for the performance assessment of the GenID RIF/INH strip for TB detection, knowing the limitation that a total of 61% of patients had bacteriologically confirmed TB. Table S1 now shows the performance of GenID ® LPA RIF/INH, comparing cohort 1, cohort 2 and both cohorts together.
Comment 3: On the other hand, combining the cohort 1 and cohort 2 data to assess the performance of the assay for detecting drug-resistance in bacteriologically confirmed patients is appropriate.
Thank you for the comment. As the main clinical use of the GenID assay is resistance testing, not Mtb detection, we kept table 2 in the main manuscript and added table S1 in the supplement.
Comment 4: It would be very useful for the reader if the authors described separately the performance of the assay for detecting Mtb in persons with presumed TB and the performance of the assay for detecting drug-resistance in persons with bacteriologically confirmed TB (Xpert or culture positive).
We did so in the revised manuscript and added table S1. We also referred to it in the results section. In the section about Mtb detection, we added the sentence: Similar results were observed in a separate analysis of cohort 1 and 2 (table S 1). In the section about resistance prediction, we added the sentence: Table 2 also shows the diagnostic sensitivities in HIV-positive vs. HIV-negative patients, while table S1 shows diagnostics sensitivities and specificities for RIF and INH testing separate for cohort 1 and 2.
Reviewer #3 (Minor Comments for Author (Required)):
Comment 5: Abstract: include 95% confidence intervals for the performance parameters.
This has been done, thank you.
Comment 6: Line 156-159. In cohort 1, were the patients sufficiently ill to warrant admission to the hospital or were the patients just evaluated in the respiratory ward?
Symptoms led to presentation at the internal medicine admissions department of the hospital. If the assessment was that a workup for TB was required, the workup was performed on the respiratory admissions ward.
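The documents shown here do not state how the 95% confidence intervals were calculated. The sketch below assumes exact binomial (Clopper-Pearson) intervals, which is an assumption on our part and not a statement of the authors' method. It uses only the Python standard library and finds each bound by bisection. The function names `binom_cdf` and `clopper_pearson` are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, iterations=100):
    """Root of a function that decreases monotonically from positive to negative on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for x successes in n trials (assumed method)."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2 (defined as 0 when x == 0).
    lower = 0.0 if x == 0 else _bisect_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p))
    )
    # Upper bound: the p at which P(X <= x) equals alpha / 2 (defined as 1 when x == n).
    upper = 1.0 if x == n else _bisect_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2
    )
    return lower, upper


# Smear-positive Mtb detection (56/60) and RIF specificity (19/19) from the abstract.
for x, n in [(56, 60), (19, 19)]:
    lower, upper = clopper_pearson(x, n)
    print(f"{x}/{n}: {x / n:.1%} ({lower:.1%} to {upper:.1%})")
```

For 19/19 the lower bound has the closed form 0.025^(1/19), roughly 0.824, which agrees with the 82.4–100.0 interval reported in the abstract. That agreement is consistent with the exact method but does not prove it was the one used.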