Skin of Color Underrepresented in Datasets Used by AI to Identify Skin Cancer

An analysis of open-access skin image datasets available to train machine learning algorithms to identify skin cancer has revealed that darker skin types are markedly underrepresented in the databases, researchers in the United Kingdom report.

Of 106,950 skin lesion images documented in 21 open-access databases and 17 open-access atlases identified by David Wen, BMBCh, from the University of Oxford, United Kingdom, and colleagues, only 2436 contained information on Fitzpatrick skin type. Of these, “only ten images were from individuals with Fitzpatrick skin type V, and only a single image was from an individual with Fitzpatrick skin type VI,” the researchers said. “The ethnicity of these individuals was either Brazilian or unknown.”
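To put those counts in proportion, the short Python sketch below works out the shares they imply. It uses only the figures reported above; the variable names and the `share` helper are illustrative and do not come from the study.

```python
# Counts as reported in the article; names are illustrative, not from the study.
TOTAL_IMAGES = 106_950      # skin lesion images across all datasets and atlases
WITH_FITZPATRICK = 2_436    # images with Fitzpatrick skin type recorded
TYPE_V = 10                 # images from individuals with skin type V
TYPE_VI = 1                 # images from individuals with skin type VI


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


labelled_pct = share(WITH_FITZPATRICK, TOTAL_IMAGES)
type_v_pct = share(TYPE_V, WITH_FITZPATRICK)
type_vi_pct = share(TYPE_VI, WITH_FITZPATRICK)

print(f"Images with Fitzpatrick skin type recorded: {labelled_pct:.1f}%")
print(f"Type V among images with skin type recorded: {type_v_pct:.2f}%")
print(f"Type VI among images with skin type recorded: {type_vi_pct:.2f}%")
```

On these figures, roughly 2.3% of all images carried skin type information, and types V and VI together made up less than half a percent of that subset.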

In two datasets containing 1585 images with ethnicity data, “no images were from individuals with an African, Afro-Caribbean, or South Asian background,” Wen and colleagues noted. “Coupled with the geographical origins of datasets, there was massive under-representation of skin lesion images from darker skinned populations.”

The results of their systematic review were presented at the National Cancer Research Institute (NCRI) Festival and published on November 9 in The Lancet Digital Health. To the best of their knowledge, they write, this is “the first systematic review of publicly available skin lesion images comprising predominantly dermoscopic and macroscopic images available through open access datasets and atlases.”

Among the 14 datasets with information on country of origin, 11 (79%) were from North America, Europe, or Oceania, the researchers said. Dermoscopic images or macroscopic photographs were the only types of images available in 19 of 21 (91%) datasets. The clinical information available varied: 81,662 images (76.4%) had information on age, 82,848 (77.5%) on gender, and 79,561 (74.4%) on body site.

The researchers explained that these datasets might be of limited use in a real-world setting where the images aren’t representative of the population. Artificial intelligence (AI) programs that train using images of patients with one skin type, for example, can potentially misdiagnose patients of another skin type, they said.

“AI programs hold a lot of potential for diagnosing skin cancer because it can look at pictures and quickly and cost-effectively evaluate any worrying spots on the skin,” Wen said in a press release from the NCRI Festival. “However, it’s important to know about the images and patients used to develop programs, as these influence which groups of people the programs will be most effective for in real-life settings. Research has shown that programs trained on images taken from people with lighter skin types only might not be as accurate for people with darker skin, and vice versa.”

There was also “limited information on who, how and why the images were taken,” Wen said in the release. “This has implications for the programs developed from these images, due to uncertainty around how they may perform in different groups of people, especially in those who aren’t well represented in datasets, such as those with darker skin. This can potentially lead to the exclusion or even harm of these groups from AI technologies.”

While there are no current guidelines for developing skin image datasets, quality standards are needed, according to the researchers.

“Ensuring equitable digital health includes building unbiased, representative datasets to ensure that the algorithms that are created benefit people of all backgrounds and skin types,” they conclude in the study.

Neil Steven, MBBS, MA, PhD, FRCP, an NCRI Skin Group member who was not involved with the research, stated in the press release that the results from the study by Wen and colleagues “raise concerns about the ability of AI to assist in skin cancer diagnosis, especially in a global context.”

“I hope this work will continue and help ensure that the progress we make in using AI in medicine will benefit all patients, recognising that human skin colour is highly diverse,” said Steven, Honorary Consultant in Medical Oncology at University Hospitals Birmingham NHS Foundation Trust, United Kingdom.

Lancet Digit Health. Published online November 9, 2021.

This study was funded by NHSX and the Health Foundation. Three authors reported being paid employees of Databiology at the time of the study. The other authors reported no relevant financial relationships.
