False-negative results of initial RT-PCR assays for COVID-19: A systematic review

Background A false-negative case of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection is defined as a person with suspected infection and an initial negative result by reverse transcription-polymerase chain reaction (RT-PCR) test, with a positive result on a subsequent test. False-negative cases have important implications for isolation and risk of transmission of infected people and for the management of coronavirus disease 2019 (COVID-19). We aimed to review and critically appraise evidence about the rate of RT-PCR false-negatives at initial testing for COVID-19. Methods We searched MEDLINE, EMBASE, LILACS, as well as COVID-19 repositories, including the EPPI-Centre living systematic map of evidence about COVID-19 and the Coronavirus Open Access Project living evidence database. Two authors independently screened and selected studies according to the eligibility criteria and collected data from the included studies. The risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. We calculated the proportion of false-negative test results using a multilevel mixed-effect logistic regression model. The certainty of the evidence about false-negative cases was rated using the GRADE approach for tests and strategies. All information in this article is current up to July 17, 2020. Results We included 34 studies enrolling 12,057 COVID-19 confirmed cases. All studies were affected by several risks of bias and applicability concerns. The pooled estimate of false-negative proportion was highly affected by unexplained heterogeneity (tau-squared = 1.39; 90% prediction interval from 0.02 to 0.54). The certainty of the evidence was judged as very low due to the risk of bias, indirectness, and inconsistency issues. Conclusions There is substantial and largely unexplained heterogeneity in the proportion of false-negative RT-PCR results. 
The collected evidence has several limitations, including risk of bias issues, high heterogeneity, and concerns about its applicability. Nonetheless, our findings reinforce the need for repeated testing in patients with suspected SARS-CoV-2 infection, given that up to 54% of COVID-19 patients may have an initial false-negative RT-PCR (very low certainty of evidence). Systematic review registration Protocol available on the OSF website: https://tinyurl.com/vvbgqya.

The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

BACKGROUND
On December 31, 2019, the World Health Organization (WHO) was alerted about a cluster of pneumonia patients in the city of Wuhan, in China's Hubei province [1]. A week later, Chinese authorities confirmed the outbreak of a novel coronavirus, currently called severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [2]. This new virus is the underlying cause of coronavirus disease 2019 (COVID-19), which has become a worldwide public health emergency and reached pandemic status [3]. At the time of writing, the virus had spread to 212 countries and territories and had caused over 85,837 deaths worldwide [4].

Patients with COVID-19 exhibit respiratory symptoms such as fever, cough, and shortness of breath as primary manifestations [5,6]. Although most cases present with mild symptoms, some have developed pneumonia, severe respiratory disease, kidney failure, and even death [7][8][9]. SARS-CoV-2 mainly spreads through person-to-person contact via respiratory droplets from coughing and sneezing, and through surfaces that have been contaminated with these droplets [10]. Recent studies have suggested the presence of asymptomatic cases in family clusters, who may transmit the virus before displaying any symptoms [11].

Because the signs of infection mentioned above are non-specific, confirmation of cases is currently based on the detection of a viral sequence by reverse transcription-polymerase chain reaction (RT-PCR). Different RT-PCR schemes have been proposed; all of them include the N gene, which codes for the viral nucleocapsid. Alternative targets are the E gene, for the viral envelope; the S gene, for the spike; and the Hel gene, for the RNA polymerase (RdRp/Helicase) [12,13]. Molecular criteria for in vitro diagnosis of COVID-19 are heterogeneous and usually require the detection of two or more SARS-CoV-2 genes [14].

Repeated RT-PCR testing might be required to confirm a clinical diagnosis, especially in the presence of symptoms closely related to COVID-19 [15]. Cases with a negative RT-PCR result at initial testing that are later found to be positive in a subsequent test are commonly considered cases with an initial false-negative result. Some researchers have suggested that these failures in SARS-CoV-2 detection are related to multiple pre-analytical and analytical factors, such as a lack of standardisation in specimen collection, the time and storage conditions of samples before they reach the laboratory, the use of inadequately validated assays, contamination during the procedure, insufficient viral specimens and load, the incubation period of the disease, and the risk of active recombination and mutation [14,16].

The availability of accurate laboratory tools for COVID-19 is essential for case identification, contact tracing, and optimization of infection control measures, as shown by previous epidemics caused by SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV) [17][18][19]. Because the COVID-19 pandemic is placing an important burden on health systems around the globe, and considering that a missed COVID-19 case might have severe consequences at several levels, we aimed to estimate, through a systematic review of the literature, the proportion of false-negatives related to the detection of SARS-CoV-2 using RT-PCR assays at the initial laboratory test.

METHODS
We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) for diagnostic test accuracy (DTA) to prepare this report [20]. For this systematic review, we used selected rapid-review methods, such as a high involvement of stakeholders in the review process (including the definition of the review question), non-independent verification of data selection and extraction, and parallelisation of tasks (that is, performing selected activities simultaneously instead of consecutively). Other review shortcuts and omission of review tasks were not applied. A protocol of this review was published in the Open Science Framework repository for public consultation (https://osf.io/gp38w/).

Criteria for considering studies for this review
We included observational studies (including accuracy studies, cohorts, and case series) reporting the initial use of RT-PCR for the detection of SARS-CoV-2 RNA in patients under suspicion of infection by clinical or epidemiological criteria. Specifically, we prioritised studies enrolling consecutive patients who received RT-PCR as initial testing with further confirmation of SARS-CoV-2 infection and/or COVID-19 diagnosis (positive/negative). We did not impose limits by age, gender, or study location.

We aimed to include all types of RT-PCR kits, regardless of the brand/manufacturer, the RNA extraction method used, the number of target gene assays assessed, and the cycle threshold value for positivity. Studies comparing the accuracy of two or more tests for COVID-19 diagnosis were also considered if we could abstract the fraction of negative test results as defined by an initial RT-PCR assay.

We excluded studies without clear information about false-negative cases or the number of final confirmed cases, or with unclear verification of negative cases. Case reports, studies based on laboratory samples, and literature reviews were also excluded.

Search methods for identification of studies
We carried out a comprehensive and sensitive search strategy based on the proposal for the living systematic review developed by the University of Bern's Institute of Social and Preventive Medicine (ISPM) in the following databases:
• MEDLINE (Ovid SP, 1946 to April 6th, 2020)
• Embase (Ovid SP, 1982 to April 6th, 2020)
• LILACS (iAH English) (BIREME, 1982 to April 6th, 2020)
We did not apply any language restrictions to electronic searches (S1 Appendix). As additional sources of potential studies, we searched repositories of preprint articles (such as medRxiv),

Data collection and analysis
For the selection of potential studies, one reviewer screened the search results based on the title and abstract, with additional verification by a second reviewer (non-independent verification). We retrieved the full-text copy of each study assessed as potentially eligible, and pairs of reviewers confirmed eligibility according to the selection criteria (non-independent verification). In case of disagreement, we reached consensus by discussion. For data extraction, one reviewer extracted qualitative and quantitative data from eligible studies. An additional reviewer checked all the extracted information for accuracy (non-independent verification of data extraction).

Assessment of methodological quality
We assessed the methodological quality of accuracy studies using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool [21]. Because no tools are available to assess the risk of bias associated with case series, we decided to apply the QUADAS-2 tool to any included reports of this type.

Statistical analysis and data synthesis
For all included studies, we extracted data about the number of cases initially considered as negative (i.e. false-negative cases) as well as the total number of cases confirmed in further investigations. We presented the estimated proportions (with 95% CIs) in a forest plot to assess the between-study variability. We aimed to calculate the false-negative rate with the corresponding 95% CI using a multilevel mixed-effects logistic regression model implemented in the metaprop_one command of Stata 16®. This allowed us to estimate the between-study heterogeneity from the variance of study-specific random intercepts. We assessed the heterogeneity between the results of the primary studies using the tau-squared statistic. A probability value less than 0.1 (p<0.1) was considered to suggest statistically significant heterogeneity and to preclude a pooled result of numerical data.

We planned to investigate the potential sources of heterogeneity using a descriptive approach and performing a random-effects meta-regression analysis. Anticipated sources of heterogeneity included the type of specimen collected, the presence or absence of clinical findings, the number of RNA target genes under assessment, and the time of symptom evolution.
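As a rough illustration of the quantities described above, the following Python sketch computes a per-study false-negative proportion with a 95% confidence interval and a between-study variance on the logit scale. It rests on several assumptions that are not taken from the review: the Wilson score method for the interval (the text does not state which method was used), the DerSimonian-Laird moment estimator for tau-squared (a simpler stand-in, not the mixed-effects logistic regression fitted with metaprop_one), and study counts that are invented for the example.

```python
import math

def wilson_interval(events, total, z=1.959964):
    """Wilson score interval for a binomial proportion (default: 95%)."""
    p = events / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half

def dl_tau_squared(studies):
    """DerSimonian-Laird between-study variance of logit proportions.

    studies: list of (false_negatives, confirmed_cases) pairs.
    A 0.5 continuity correction keeps the logit finite for zero counts.
    """
    if len(studies) < 2:
        raise ValueError("at least two studies are needed")
    logits, weights = [], []
    for events, total in studies:
        a = events + 0.5
        b = total - events + 0.5
        logits.append(math.log(a / b))
        weights.append(1.0 / (1.0 / a + 1.0 / b))  # inverse within-study variance
    sum_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, logits)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, logits))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    return max(0.0, (q - (len(studies) - 1)) / c)

if __name__ == "__main__":
    hypothetical = [(3, 50), (10, 40), (1, 60)]  # made-up counts, not review data
    for fn, n in hypothetical:
        low, high = wilson_interval(fn, n)
        print(f"{fn}/{n}: {fn / n:.3f} ({low:.3f} to {high:.3f})")
    print(f"tau-squared (DL, logit scale): {dl_tau_squared(hypothetical):.2f}")
```

A tau-squared of zero would indicate that the study-level proportions differ no more than sampling error alone would predict; larger values indicate between-study heterogeneity.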

Summary of findings and certainty of the evidence
We rated the certainty of the evidence about false-negative cases following the GRADE approach for tests and strategies [22,23]. We assessed the quality of evidence as high, moderate, low, or very low, depending on several factors including risk of bias, imprecision, inconsistency, indirectness, and publication bias. We illustrated the consequences of the numerical findings in a population of 100 people tested, according to three different prevalence estimates of the disease provided by the stakeholders involved in this review.

Patient and public involvement
We involved several stakeholders in the design, conduct, and reporting of our research, including general and family physicians, infectious disease specialists, and microbiologists currently attending patients under suspicion of COVID-19. The study protocol and preliminary results are publicly available at https://osf.io/gp38w/.

2). The age of participants ranged from 44 to 51 years (information derived from three studies) [24,25,27]. There were 577 men and 213 women included (Table 1). Three studies included patients under suspicion of COVID-19 due to clinical findings and/or epidemiological criteria [24,26,27]. Confirmation of infection was based on detection of SARS-CoV-2 in any real-time RT-PCR assay for 2019-nCoV, including repeated RT-PCR after negative results (two or more). Three studies provided information about the proportion of confirmed cases with positive chest CT findings, ranging from 74 to 98%. One study provided information about the time from symptom onset to CT scan as a proxy for the duration of disease [25], and a second one reported the duration of fever [27].

Regarding RT-PCR testing, the RT-PCR brand/manufacturer was reported by two studies [24,25]. No studies reported criteria for positivity. Most of the studies based their assessment on throat samples, such as pharyngeal, nasal, and oropharyngeal swabs. Four studies provided information about the time from the initial RT-PCR to repeated testing (Table 1).

Quality of included studies
We applied the QUADAS-2 tool to all included studies to reflect critical limitations in the validity of the findings (Figure 3 and S3 Appendix). The reference standard domain was the most affected by potential risk of bias, owing to the lack of independence between the index test and the confirmation of cases (repeated RT-PCR testing). Details about the criteria for positivity were not provided by any included study, and this domain was judged as having unclear risk of bias and unclear applicability concerns. In addition, the applicability of patient selection was judged as raising high concerns because most of the studies selected patients who underwent both RT-PCR and chest CT, excluding patients who could be candidates to receive the index test in current clinical practice.

The pooled estimate of the false-negative proportion was 0.085 (95% CI 0.034 to 0.196), estimated by a mixed-effects logistic regression model. However, the pooled data are affected by considerable between-study heterogeneity (tau-squared = 1.08; 95% CI 0.27 to 8.28; p<0.001). Because we cannot guarantee that the average estimate provided by the meta-analysis is a valid and representative estimate of the true false-negative proportion in current practice, we instead used the range of proportions in the analysis of the certainty of the evidence using the GRADE approach. A full exploration of heterogeneity was not possible given that: a) most of the studies collected upper or lower respiratory specimens; b) all studies included patients with clinical findings suggestive of COVID-19; c) subgroup information by time of symptom evolution was provided by only one study; and d) key information about the characteristics of the index test, such as positivity criteria, was not reported. The high variability of the pooled estimate was not reduced by estimating the false-negative proportion separately by type of study (accuracy versus case series).
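To give a feel for why a tau-squared of 1.08 makes the pooled value of 0.085 hard to interpret, the sketch below converts the reported figures into an approximate range for the true false-negative proportion in a new, comparable study. This is a back-of-the-envelope approximation and not a calculation from the review: it assumes normally distributed random effects on the logit scale, uses a normal quantile, and ignores the uncertainty of the pooled estimate itself. The function names are illustrative.

```python
import math

def logit(p):
    """Log-odds of a proportion."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Inverse of the logit transform."""
    return 1 / (1 + math.exp(-x))

def approx_prediction_range(pooled, tau2, z=1.959964):
    """Approximate range of study-level proportions implied by tau-squared.

    Takes pooled estimate +/- z * tau on the logit scale and maps it back.
    Assumption: the standard error of the pooled estimate is ignored, so
    the range is narrower than a formal prediction interval would be.
    """
    centre = logit(pooled)
    spread = z * math.sqrt(tau2)
    return inv_logit(centre - spread), inv_logit(centre + spread)

if __name__ == "__main__":
    low, high = approx_prediction_range(0.085, 1.08)
    print(f"approximate range: {low:.3f} to {high:.3f}")
```

Under these assumptions the range runs from roughly 0.01 to 0.42, which illustrates how a single averaged proportion can conceal very different false-negative proportions across settings.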

Certainty of the evidence
We used the range of false-negative proportions to develop a summary of findings following the GRADE approach. The quality of the evidence was judged to be very low due to issues related to the risk of bias, indirectness, and inconsistency (Figure 5). We illustrated the consequences of the range of false-negative proportions in a population of 100 people tested, according to three different prevalence values seen in current clinical practice by the participant stakeholders (30%, 50%, and 80%) (Figure 5). Using a prevalence of 50%, we found that 1 to 14 cases would be misdiagnosed; these patients might not receive adequate clinical management, and they could require repeated testing at some point during their hospitalization or even other investigations for competing diagnoses. This numerical approach should be interpreted with caution due to the multiple limitations of the evidence described above (Figure 5).
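The arithmetic behind this illustration can be sketched as follows: the expected number of missed cases is the number tested, times the prevalence, times the false-negative proportion. The example uses the three stakeholder prevalence values and the upper end of the range quoted in this article (up to 29% of patients with an initial negative RT-PCR). The function name is illustrative, and how the authors rounded their figures is not stated, so the outputs are left unrounded.

```python
def expected_false_negatives(tested, prevalence, fn_proportion):
    """Expected number of infected people with an initial negative RT-PCR."""
    infected = tested * prevalence       # people who truly have the infection
    return infected * fn_proportion      # those missed by the first test

if __name__ == "__main__":
    upper_fn = 0.29  # upper end of the false-negative range quoted in the text
    for prevalence in (0.30, 0.50, 0.80):
        missed = expected_false_negatives(100, prevalence, upper_fn)
        print(f"prevalence {prevalence:.0%}: about {missed:.1f} missed per 100 tested")
```

At a prevalence of 50% this gives about 14.5 missed cases per 100 people tested, which is in line with the upper figure of 14 reported above.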
findings for RT-PCR and chest CT) from several provinces of China and collected between January and February 2020. We considered all studies to be affected by several sources of bias, especially related to the independence between the index test and the reference standard and the unclear reporting of key RT-PCR characteristics. A meta-analysis of the proportions using Stata® showed considerable heterogeneity not explained by the collected data, and this variability limits the interpretation of the averaged proportion. As an alternative, we preferred to provide an analysis of the range of false-negative proportions derived from the included studies in a cohort of 100 patients tested, using three different prevalence values of the disease derived from the current clinical practice of our participant stakeholders. Using a prevalence of 80%, we found that 2 to 23 cases would be misdiagnosed and might not receive adequate clinical management. However, we emphasise that this numerical approach should be interpreted with caution due to the multiple limitations of the evidence described above (quality of evidence: very low).

Although we did not impose restrictions on population characteristics such as age, setting, or publication status, our findings are limited because all the studies were performed in one country (China) and reported data only for the beginning of the pandemic (January 2020), in addition to the lack of reporting about the index test mentioned previously. The RT-PCR kits used in the included studies were likely the first kits developed for the detection of SARS-CoV-2, and the tests currently in use may have evolved technologically and may differ in their characteristics from those initial tools.

Despite the scarcity of information to answer the review question, our study carried out a comprehensive literature search to identify all relevant studies, including several sources of unpublished literature such as preprint repositories. Our review also includes a rigorous assessment of potential sources of bias, a formal statistical analysis of results, and a final assessment of the certainty of the evidence under a well-known system (GRADE). We applied selected methods associated with rapid reviews to streamline the review process, such as the involvement of stakeholders in the development of the review, non-independent verification of data selection and extraction, and parallelisation of tasks (that is, conducting selected activities simultaneously instead of consecutively) [29]. We avoided methods that might affect the quality of the review process, such as limiting the search strategies, omitting the quality assessment of the collected evidence, or relying on a narrative synthesis of results [29,30].

Because clinicians managing COVID-19 patients were continuously involved in the development of this review, we were able to define a review question that responds to a clinical inquiry relevant to current clinical practice [31][32][33]. In fact, the number of cases misdiagnosed as not having the target condition is a critical figure due to the severe consequences of not treating missed patients. This estimate can also help in estimating the additional resources needed in current clinical practice to confirm a suspected case.

Implications for practice
Our findings reinforce the need for repeated testing in patients suspected of being infected, for either clinical or epidemiological reasons, given that up to 29% of patients may have an initial negative RT-PCR (certainty of evidence: very low). The collected evidence has several limitations in terms of risk of bias and applicability; in addition, the lack of reporting of several key factors remains a significant constraint for the analysis of collected data. A false-negative result during the recovery phase could have important implications for isolation and risk of transmission, although this risk is reduced by documenting at least two negative samples before discharge. A subsequent positive result could also be erroneously considered a reinfection. An update of this review is warranted when new studies become available.

Implications for research
Due to the multiple difficulties associated with the lack of reporting in the included studies, and the high probability of new studies being published in the short term, we provide some recommendations for future studies that may be candidates for inclusion in an update of this review:
• Inclusion of a series of consecutive patients instead of selected groups, to avoid spectrum bias.