1. Deep learning (DL) for diabetic retinopathy (DR) screening in middle-income countries (MICs), such as Thailand, may yield societal cost savings and health outcomes similar to those of trained human graders (HG).
2. DL has higher sensitivity than HG. This raises treatment costs from the healthcare provider perspective, but it results in fewer cases of bilateral blindness and greater cost savings from the societal perspective.
3. This study provides an economic rationale for decision makers to expand DL-based DR screening in MICs with a similar prevalence of diabetes and low compliance with referrals for treatment. Policy makers should be aware of the budget impact of treating more patients with sight-threatening DR once DL is deployed clinically.
The article titled "Cost-Utility Analysis of Deep Learning and Trained Human Graders for Diabetic Retinopathy Screening in a Nationwide Program" presents a health economic evaluation of deep learning (DL) compared to trained human graders (HG) for diabetic retinopathy (DR) screening in Thailand. The study aims to provide decision-makers with an economic rationale for expanding DL-based DR screening in middle-income countries (MICs) with similar prevalence of diabetes and limited availability of skilled human resources.
The article describes in detail the model used for the cost-utility analysis, including the target population, model structure, screening strategies, input parameters, and costs. The results show that, from a societal perspective, screening with DL was associated with lower costs and similar quality-adjusted life-years (QALYs) compared to HG. From a healthcare provider perspective, however, DL had a higher incremental cost-effectiveness ratio (ICER) because its higher sensitivity led to higher treatment costs.
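To make the incremental cost-effectiveness ratio concrete, the following is a minimal sketch of how two screening strategies might be compared. It is not the study's model: the `Strategy` class, the `compare` function, and every number are hypothetical and chosen only for illustration. The sketch divides the difference in cost by the difference in quality-adjusted life-years, and reports a ratio only when there is a genuine trade-off between cost and effect.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Strategy:
    name: str
    cost: float  # mean cost per person, hypothetical currency units
    qaly: float  # mean quality-adjusted life-years per person


def compare(new: Strategy, ref: Strategy) -> Tuple[Optional[float], str]:
    """Return (ICER, verdict) for `new` relative to `ref`.

    The ICER is only reported when there is a real trade-off, i.e. one
    strategy is both more costly and more effective than the other.
    """
    d_cost = new.cost - ref.cost
    d_qaly = new.qaly - ref.qaly
    if d_cost == 0 and d_qaly == 0:
        return None, "equivalent"
    if d_cost <= 0 and d_qaly >= 0:
        return None, "dominant"   # no more costly and no less effective
    if d_cost >= 0 and d_qaly <= 0:
        return None, "dominated"  # no less costly and no more effective
    return d_cost / d_qaly, "trade-off"


# Hypothetical inputs for illustration only; NOT taken from the study.
hg = Strategy("HG", cost=1000.0, qaly=10.0)
dl_costlier = Strategy("DL (costlier, more QALYs)", cost=1100.0, qaly=10.5)
dl_cheaper = Strategy("DL (cheaper, same QALYs)", cost=900.0, qaly=10.0)

print(compare(dl_costlier, hg))
print(compare(dl_cheaper, hg))
```

Under these assumed inputs, the first comparison is a trade-off, and the resulting ratio would be judged against a willingness-to-pay threshold. The second comparison is cost-saving with equal QALYs, so no ratio is needed.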
The article provides valuable insights into the potential benefits of DL-based DR screening in MICs. However, some limitations and potential biases need to be considered. Firstly, the study conducts the cost-utility analysis from only two perspectives: societal and healthcare provider. Other stakeholders, such as patients or payers, may value DL-based DR screening differently.
Secondly, the study assumes that DL has higher sensitivity than HG without considering other performance measures such as specificity or accuracy. This assumption may not hold in all settings and could bias the results.
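The distinction between sensitivity, specificity, and accuracy can be illustrated with a short sketch. The function name `screening_metrics` and the confusion-matrix counts below are hypothetical and do not come from the study. They are chosen only to show that a grader with higher sensitivity can still have lower specificity and lower overall accuracy.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard screening metrics from confusion-matrix counts."""
    return {
        # Share of patients with disease who are correctly flagged
        "sensitivity": tp / (tp + fn),
        # Share of patients without disease who are correctly cleared
        "specificity": tn / (tn + fp),
        # Share of all patients who are classified correctly
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for 1,000 screened patients, 100 of whom have
# referable disease. These are illustrative, NOT results from the study.
grader_a = screening_metrics(tp=90, fp=100, fn=10, tn=800)
grader_b = screening_metrics(tp=80, fp=50, fn=20, tn=850)

for name, metrics in (("A", grader_a), ("B", grader_b)):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

In this assumed example, grader A misses fewer true cases but produces twice as many false positives as grader B, which is why sensitivity alone does not settle which grader performs better.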
Thirdly, the study does not consider potential risks associated with DL-based DR screening such as privacy concerns or algorithmic bias. These risks should be carefully evaluated before implementing DL-based DR screening programs.
Finally, the study does not explore counterarguments against DL-based DR screening such as concerns about overdiagnosis or overtreatment. These issues should also be considered when making decisions about implementing DL-based DR screening programs.
In conclusion, while this article provides valuable insights into the potential benefits of DL-based DR screening in MICs, decision-makers should carefully evaluate all potential risks and biases before implementing such programs. Further research is needed to address these limitations and provide more robust evidence on the value of DL-based DR screening in different settings.