Publicly available datasets of breast histopathology H&E whole-slide images: A scoping review

Advancements in digital pathology and computing resources have made a significant impact in the field of computational pathology for breast cancer diagnosis and treatment. However, access to high-quality labeled histopathological images of breast cancer is a big challenge that limits the development...

Bibliographic Details
Main Authors:	Masoud Tafavvoghi (Author), Lars Ailo Bongo (Author), Nikita Shvetsov (Author), Lill-Tove Rasmussen Busund (Author), Kajsa Møllersen (Author)
Format:	Book
Published:	Elsevier, 2024-12-01T00:00:00Z.
Subjects:	article
Online Access:	Connect to this object online.

MARC


LEADER	00000 am a22000003u 4500
001	doaj_1b42556ffece42f8b0e1ca0cb0bafd24
042			\|a dc
100	1	0	\|a Masoud Tafavvoghi \|e author
700	1	0	\|a Lars Ailo Bongo \|e author
700	1	0	\|a Nikita Shvetsov \|e author
700	1	0	\|a Lill-Tove Rasmussen Busund \|e author
700	1	0	\|a Kajsa Møllersen \|e author
245	0	0	\|a Publicly available datasets of breast histopathology H&E whole-slide images: A scoping review
260			\|b Elsevier, \|c 2024-12-01T00:00:00Z.
500			\|a 2153-3539
500			\|a 10.1016/j.jpi.2024.100363
520			\|a Advancements in digital pathology and computing resources have made a significant impact in the field of computational pathology for breast cancer diagnosis and treatment. However, access to high-quality labeled histopathological images of breast cancer is a big challenge that limits the development of accurate and robust deep learning models. In this scoping review, we identified the publicly available datasets of breast H&E-stained whole-slide images (WSIs) that can be used to develop deep learning algorithms. We systematically searched 9 scientific literature databases and 9 research data repositories and found 17 publicly available datasets containing 10 385 H&E WSIs of breast cancer. Moreover, we reported image metadata and characteristics for each dataset to assist researchers in selecting proper datasets for specific tasks in breast cancer computational pathology. In addition, we compiled 2 lists of breast H&E patches and private datasets as supplementary resources for researchers. Notably, only 28% of the included articles utilized multiple datasets, and only 14% used an external validation set, suggesting that the performance of other developed models may be susceptible to overestimation. The TCGA-BRCA was used in 52% of the selected studies. This dataset has a considerable selection bias that can impact the robustness and generalizability of the trained algorithms. There is also a lack of consistent metadata reporting of breast WSI datasets that can be an issue in developing accurate deep learning models, indicating the necessity of establishing explicit guidelines for documenting breast WSI dataset characteristics and metadata.
546			\|a EN
690			\|a Breast cancer
690			\|a Computational pathology
690			\|a Deep learning
690			\|a Whole-slide images
690			\|a Publicly available datasets
690			\|a Computer applications to medicine. Medical informatics
690			\|a R858-859.7
690			\|a Pathology
690			\|a RB1-214
655	7		\|a article \|2 local
786	0		\|n Journal of Pathology Informatics, Vol 15, Iss , Pp 100363- (2024)
787	0		\|n http://www.sciencedirect.com/science/article/pii/S2153353924000026
787	0		\|n https://doaj.org/toc/2153-3539
856	4	1	\|u https://doaj.org/article/1b42556ffece42f8b0e1ca0cb0bafd24 \|z Connect to this object online.
