Weighing the benefits and risks of collecting race and ethnicity data in clinical settings for medical artificial intelligence

This Viewpoint weighs the risks of collecting race and ethnicity data in clinical settings against the risks of not collecting those data.

Many countries around the world do not collect race and ethnicity data in clinical settings. Without such identified data, it is difficult to identify biases in the training data or output of a given artificial intelligence (AI) algorithm, and to work towards medical AI tools that do not exclude or further harm marginalised groups. However, the collection of these data also poses specific risks to racially minoritised populations and other marginalised groups. 

AI algorithms can exhibit substantial racial bias that affects health outcomes for marginalised communities. The social and statistical biases in AI are interconnected and often exacerbate health disparities.

Benefits and Risks of Race and Ethnicity Data

  • Benefits:
    • Collecting comprehensive demographic data can improve healthcare access and outcomes for historically marginalised groups.
    • Better data can help identify disparities and track discrimination in healthcare delivery.
    • AI algorithms can detect demographic features, aiding in bias identification and correction.
  • Risks:
    • Race is a socially constructed category, so race data are open to misuse and misinterpretation.
    • Historical misuse of race data in medicine raises concerns about privacy and trust in data collection.
    • Self-reported data can introduce complexity and variability in categorisation, complicating future comparisons.