Measuring the impact of screening automation on meta-analyses of diagnostic test accuracy

Main Authors: Norman, Christopher R., Leeflang, Mariska M.G., Porcher, Raphaël, Névéol, Aurélie
Format: Article Journal
Published: 2019
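Subjects: Evidence based medicine, *Machine learning, Natural language processing/*methods, *Systematic review as topic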
Online Access: https://zenodo.org/record/3678082
ctrlnum 3678082
fullrecord <?xml version="1.0"?> <dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><creator>Norman, Christopher R.</creator><creator>Leeflang, Mariska M.G.</creator><creator>Porcher, Rapha&#xEB;l</creator><creator>N&#xE9;v&#xE9;ol, Aur&#xE9;lie</creator><date>2019-10-01</date><description>Background: The large and increasing number of new studies published each year is making literature identification in systematic reviews ever more time-consuming and costly. Technological assistance has been suggested as an alternative to conventional manual study identification to mitigate the cost, but previous literature has mainly evaluated methods in terms of recall (search sensitivity) and workload reduction. There is also a need to evaluate whether screening prioritization methods lead to the same results and conclusions as exhaustive manual screening. In this study, we examined the impact of one screening prioritization method based on active learning on sensitivity and specificity estimates in systematic reviews of diagnostic test accuracy. Methods: We simulated the screening process in 48 Cochrane reviews of diagnostic test accuracy and re-ran 400 meta-analyses based on at least 3 studies. We compared screening prioritization (with technological assistance) and screening in randomized order (standard practice without technological assistance). We examined whether the screening could have been stopped before identifying all relevant studies while still producing reliable summary estimates. For all meta-analyses, we also examined the relationship between the number of relevant studies and the reliability of the final estimates. Results: The main meta-analysis in each systematic review could have been performed after screening an average of 30% of the candidate articles (range 0.07% to 100%). 
No systematic review would have required screening more than 2308 studies, whereas manual screening would have required screening up to 43,363 studies. Despite an average 70% recall, the estimation error would have been 1.3% on average, compared to an average 2% estimation error expected when replicating summary estimate calculations. Conclusion: Screening prioritization coupled with stopping criteria in diagnostic test accuracy reviews can reliably detect when the screening process has identified a sufficient number of studies to perform the main meta-analysis with an accuracy within pre-specified tolerance limits. However, many of the systematic reviews did not identify enough studies for the meta-analyses to be accurate within a 2% limit even with exhaustive manual screening, i.e., using current practice.</description><identifier>https://zenodo.org/record/3678082</identifier><identifier>10.1186/s13643-019-1162-x</identifier><identifier>oai:zenodo.org:3678082</identifier><relation>info:eu-repo/grantAgreement/EC/H2020/676207/</relation><relation>doi:10.5281/zenodo.1303258</relation><relation>url:https://zenodo.org/communities/miror</relation><rights>info:eu-repo/semantics/openAccess</rights><rights>https://creativecommons.org/licenses/by/4.0/legalcode</rights><source>Systematic Reviews 8(243)</source><subject>Evidence based medicine</subject><subject>*Machine learning</subject><subject>Natural language processing/*methods</subject><subject>*Systematic review as topic</subject><title>Measuring the impact of screening automation on meta-analyses of diagnostic test accuracy</title><type>Journal:Article</type><type>Journal:Article</type><recordID>3678082</recordID></dc>
format Journal:Article
Journal
Journal:Journal
author Norman, Christopher R.
Leeflang, Mariska M.G.
Porcher, Raphaël
Névéol, Aurélie
title Measuring the impact of screening automation on meta-analyses of diagnostic test accuracy
publishDate 2019
doi 10.1186/s13643-019-1162-x
topic Evidence based medicine
*Machine learning
Natural language processing/*methods
*Systematic review as topic
url https://zenodo.org/record/3678082
contents Background: The large and increasing number of new studies published each year is making literature identification in systematic reviews ever more time-consuming and costly. Technological assistance has been suggested as an alternative to conventional manual study identification to mitigate the cost, but previous literature has mainly evaluated methods in terms of recall (search sensitivity) and workload reduction. There is also a need to evaluate whether screening prioritization methods lead to the same results and conclusions as exhaustive manual screening. In this study, we examined the impact of one screening prioritization method based on active learning on sensitivity and specificity estimates in systematic reviews of diagnostic test accuracy.
Methods: We simulated the screening process in 48 Cochrane reviews of diagnostic test accuracy and re-ran 400 meta-analyses based on at least 3 studies. We compared screening prioritization (with technological assistance) and screening in randomized order (standard practice without technological assistance). We examined whether the screening could have been stopped before identifying all relevant studies while still producing reliable summary estimates. For all meta-analyses, we also examined the relationship between the number of relevant studies and the reliability of the final estimates.
Results: The main meta-analysis in each systematic review could have been performed after screening an average of 30% of the candidate articles (range 0.07% to 100%). No systematic review would have required screening more than 2308 studies, whereas manual screening would have required screening up to 43,363 studies. Despite an average 70% recall, the estimation error would have been 1.3% on average, compared to an average 2% estimation error expected when replicating summary estimate calculations.
Conclusion: Screening prioritization coupled with stopping criteria in diagnostic test accuracy reviews can reliably detect when the screening process has identified a sufficient number of studies to perform the main meta-analysis with an accuracy within pre-specified tolerance limits. However, many of the systematic reviews did not identify enough studies for the meta-analyses to be accurate within a 2% limit even with exhaustive manual screening, i.e., using current practice.
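The comparison described in the abstract (prioritized screening versus screening in randomized order) can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the authors' method: the function names, the 100-candidate toy data, and the perfect ranking scores are all hypothetical, and no active-learning model or meta-analysis is implemented here. It measures what fraction of the candidate list must be screened before a target recall is reached.

```python
import math
import random


def fraction_screened(order, relevant, target_recall):
    """Fraction of the candidate list screened when target_recall is first reached."""
    needed = math.ceil(target_recall * len(relevant))  # relevant studies required
    found = 0
    for position, doc_id in enumerate(order, start=1):
        if doc_id in relevant:
            found += 1
        if found >= needed:
            return position / len(order)
    return 1.0  # target never reached: everything was screened


def mean_random_fraction(candidates, relevant, target_recall, runs=200, seed=0):
    """Average fraction screened over several randomized screening orders."""
    rng = random.Random(seed)  # fixed seed so the simulation is repeatable
    total = 0.0
    for _ in range(runs):
        order = list(candidates)
        rng.shuffle(order)
        total += fraction_screened(order, relevant, target_recall)
    return total / runs


# Hypothetical toy data: 100 candidate articles, 4 of them relevant.
candidates = list(range(100))
relevant = {3, 10, 50, 97}

# Assumption: an idealized ranker that scores every relevant article highest.
scores = {doc_id: (1.0 if doc_id in relevant else 0.0) for doc_id in candidates}
prioritized = sorted(candidates, key=lambda doc_id: -scores[doc_id])

if __name__ == "__main__":
    print("prioritized:", fraction_screened(prioritized, relevant, 0.5))
    print("random mean:", mean_random_fraction(candidates, relevant, 0.5))
```

With a real ranker the scores would be imperfect, so the gap between the two numbers would be smaller than in this idealized case; the sketch only shows how the two screening orders can be compared on the same candidate set.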
id IOS16997.3678082
institution ZAIN Publications
institution_id 7213
institution_type library:special
library Cognizance Journal of Multidisciplinary Studies
library_id 5267
collection Cognizance Journal of Multidisciplinary Studies
repository_id 16997
subject_area Multidisciplinary
city Stockholm
province INTERNATIONAL
shared_to_ipusnas_str 1
repoId IOS16997
first_indexed 2022-06-06T05:11:05Z
last_indexed 2022-06-06T05:11:05Z
recordtype dc