Evaluation of an Artificial Intelligence System for Retinopathy of Prematurity Screening in Nepal and Mongolia

Objective
The purpose of this study was to evaluate the performance of a deep learning algorithm for retinopathy of prematurity (ROP) screening in Nepal and Mongolia.

Design
This was a retrospective analysis of prospectively collected clinical data.

Subjects
Clinical information and fundus images were obtained from infants in two ROP screening programs in Nepal and Mongolia.

Methods
Fundus images were obtained using the Forus 3nethra neo in Nepal and the RetCam® Portable in Mongolia. The overall severity of ROP was determined from the medical record using the International Classification of ROP (ICROP). The presence of plus disease was independently determined in each image using a reference standard diagnosis. The Imaging and Informatics for ROP (i-ROP) deep learning (DL) algorithm, which was trained on images from the RetCam®, was used to classify plus disease and to assign a vascular severity score (VSS) from 1 to 9.

Main outcome measures
The main outcome measures were the area under the receiver operating characteristic curve (AUC-ROC) and the area under the precision-recall curve (AUC-PR) for the presence of plus disease or type 1 ROP, and the association between VSS and ICROP disease category.
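
For readers less familiar with these two metrics, the following is a minimal sketch of how they can be computed from exam-level labels and algorithm scores. It is an illustration only, not the study's analysis code: the function names `auc_roc` and `average_precision` and the six example exams are hypothetical, and AUC-PR is approximated here by average precision (a step-wise sum), which is one common convention among several.

```python
from itertools import groupby


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen positive exam
    receives a higher score than a randomly chosen negative exam
    (tied scores count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative exam")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: the sum over score thresholds of
    (increase in recall) x (precision at that threshold)."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive exam")
    # Rank exams from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    true_pos = 0
    seen = 0
    ap = 0.0
    # Exams with identical scores share one threshold, so process them together.
    for _, group in groupby(ranked, key=lambda t: t[0]):
        group = list(group)
        new_pos = sum(y for _, y in group)
        seen += len(group)
        true_pos += new_pos
        ap += (new_pos / total_pos) * (true_pos / seen)
    return ap


# Synthetic example: 1 = plus disease present, 0 = absent (not study data).
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
example_auc_roc = auc_roc(example_labels, example_scores)
example_auc_pr = average_precision(example_labels, example_scores)
```

Unlike AUC-ROC, the precision-recall summary depends on disease prevalence, so it is informative to report both when the positive class is rare.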

Results
The prevalence of type 1 ROP was higher in Mongolia (14.0%) than in Nepal (2.2%, p < 0.001) in these datasets. In Mongolia (RetCam images), the AUC-ROC for exam-level plus disease detection was 0.968 and the AUC-PR was 0.823. In Nepal (Forus images), the AUC-ROC for exam-level plus disease detection was 0.999 and the AUC-PR was 0.993. The VSS was associated with ICROP classification in both datasets (p < 0.001). At the population level, the median [interquartile range] VSS was higher in Mongolia (2.7 [1.3–5.4]) than in Nepal (1.9 [1.2–3.4], p < 0.001).

Conclusions
These data provide preliminary evidence of the effectiveness of the i-ROP DL algorithm for ROP screening in neonatal populations in Nepal and Mongolia across multiple camera systems, and they provide useful data for the future clinical implementation of AI-based ROP screening in low- and middle-income countries.