Researchers Minimize Bias in AI Models Without Sacrificing Accuracy

Janani R | December 14, 2024, 10:50 AM | Technology

Machine learning models can struggle to make accurate predictions for individuals who are underrepresented in the datasets they were trained on.

For example, a model designed to recommend the best treatment for a chronic disease might be trained on a dataset primarily composed of male patients. When applied in a hospital, the model may generate inaccurate predictions for female patients.

Figure 1. Bias in AI Models

To improve performance, engineers sometimes try to balance the training dataset by removing data points until every subgroup is equally represented. This can be difficult in practice, because it often requires discarding a large amount of data, which in turn hurts the model's overall performance. Figure 1 illustrates this kind of bias in AI models.
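For illustration, here is a minimal sketch of that traditional balancing baseline, in which every subgroup is randomly downsampled to the size of the smallest one. The function name and the NumPy-based implementation are illustrative assumptions, not code from the study.

```python
import numpy as np

def balance_by_subsampling(X, y, groups, seed=0):
    """Downsample every subgroup to the size of the smallest one."""
    X, y, groups = np.asarray(X), np.asarray(y), np.asarray(groups)
    rng = np.random.default_rng(seed)
    unique_groups, counts = np.unique(groups, return_counts=True)
    n_keep = counts.min()  # the smallest subgroup sets the budget for every group
    keep_idx = []
    for g in unique_groups:
        idx = np.flatnonzero(groups == g)
        keep_idx.append(rng.choice(idx, size=n_keep, replace=False))
    keep_idx = np.concatenate(keep_idx)
    return X[keep_idx], y[keep_idx]

# Example: with 9,000 male and 1,000 female patient records, balancing
# keeps only 2,000 rows and discards 8,000 majority-group examples.
```

The example comment shows why this baseline is costly: the amount of retained data is capped by the rarest subgroup.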

MIT researchers have developed a novel technique that targets and removes specific data points in a training dataset that most contribute to a model’s failures on minority subgroups. By eliminating far fewer data points than traditional methods, this technique preserves the overall accuracy of the model while enhancing its performance for underrepresented groups.

Moreover, the technique can uncover hidden biases in a training dataset, even when it lacks labels. Since unlabeled data is more common than labeled data in many applications, this approach is especially valuable.

This method could also be integrated with other strategies to enhance the fairness of machine learning models, particularly in high-stakes environments. For instance, it could help prevent misdiagnoses of underrepresented patients caused by biased AI models.

"Many other algorithms that aim to address this issue assume that each data point has the same level of importance. In this paper, we demonstrate that this assumption is false. There are specific data points in our dataset that contribute to the bias, and by identifying and removing these points, we can achieve better performance," says Kimia Hamidieh, an electrical engineering and computer science (EECS) graduate student at MIT and co-lead author of the paper on this technique.

She co-authored the paper with fellow co-lead authors Saachi Jain PhD ’24 and Kristian Georgiev, an EECS graduate student; Andrew Ilyas MEng ’18, PhD ’23, a Stein Fellow at Stanford University; and senior authors Marzyeh Ghassemi, an associate professor in EECS and a member of the Institute of Medical Engineering Sciences and the Laboratory for Information and Decision Systems, and Aleksander Madry, the Cadence Design Systems Professor at MIT. The research will be presented at the Conference on Neural Information Processing Systems.

Eliminating Problematic Examples

Machine learning models are often trained on large datasets collected from various sources across the internet. These datasets are typically too vast to be manually curated, which means they may include problematic examples that negatively affect model performance.

Researchers also recognize that some data points have a greater influence on a model's performance for certain tasks than others.

Building on these insights, MIT researchers developed an approach that identifies and removes these problematic data points. Their goal is to address a challenge known as worst-group error, which occurs when a model performs poorly on minority subgroups within a training dataset.

This new technique is based on previous work in which the researchers introduced TRAK, a method that identifies the most critical training examples for a specific model output.

For their current technique, the researchers analyze the incorrect predictions made by the model for minority subgroups and use TRAK to pinpoint which training examples contributed most to those errors.

"By aggregating this information from poor test predictions in the right way, we can identify the specific parts of the training data that are causing the worst-group accuracy to decrease overall," Ilyas explains.

The team then removes those specific samples and retrains the model using the remaining data.

Since larger datasets typically lead to better overall performance, removing only the samples responsible for worst-group failures allows the model to maintain its overall accuracy while improving its performance on minority subgroups.
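To make the pipeline concrete, here is a hedged sketch of the remove-and-retrain loop described above. The `attribution_fn` callable stands in for a TRAK-style data-attribution method; the function names, the simple score summation, and the fixed removal budget are illustrative assumptions rather than the authors' exact procedure.

```python
import numpy as np

def remove_and_retrain(train_data, train_fn, attribution_fn,
                       worst_group_errors, n_remove=500):
    """Sketch: score training points by their estimated contribution to
    worst-group errors, drop the top offenders, and retrain."""
    model = train_fn(train_data)                    # initial training run
    scores = np.zeros(len(train_data))
    for example in worst_group_errors:              # mispredicted minority-group examples
        scores += attribution_fn(model, train_data, example)
    drop = set(np.argsort(scores)[-n_remove:])      # points driving the failures
    keep = [i for i in range(len(train_data)) if i not in drop]
    return train_fn([train_data[i] for i in keep])  # retrain on the remaining data
```

Because only a small, targeted set of points is removed, the retrained model keeps most of the data that supports its overall accuracy.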

A Simpler Approach

Across three machine-learning datasets, their method outperformed several other techniques. In one case, it improved worst-group accuracy while removing about 20,000 fewer training samples than a traditional data balancing approach. The technique also achieved higher accuracy than methods that require altering a model's internal structure.

Since the MIT method focuses on modifying the dataset rather than the model itself, it is easier for practitioners to implement and can be applied to a wide range of models.

The technique can also be applied when the source of bias is unknown, for example when subgroups in the training dataset are unlabeled. By identifying the training points that most influence a feature the model is learning, practitioners can gain insight into the variables the model relies on for its predictions.
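A minimal sketch of how that might look in practice, assuming the same hypothetical `attribution_fn` as above: when subgroup labels are missing, scores are aggregated over all mispredicted validation examples, and the top-ranked training points are surfaced for manual review rather than removed automatically.

```python
import numpy as np

def flag_suspect_training_points(model, train_data, val_errors,
                                 attribution_fn, top_k=50):
    """Rank training points by their aggregate influence on validation
    errors so a practitioner can inspect them for hidden bias."""
    scores = np.zeros(len(train_data))
    for example in val_errors:
        scores += attribution_fn(model, train_data, example)
    return np.argsort(scores)[::-1][:top_k]  # indices to review by hand
```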

"This is a tool anyone can use when training a machine-learning model. They can examine those data points to see if they align with the objective they want the model to achieve," says Hamidieh.

To detect unknown subgroup bias, practitioners would need some intuition about which groups to focus on. The researchers hope to validate and expand upon this approach through future studies involving human participants.

They also aim to enhance the technique's performance and reliability, ensuring it remains accessible and user-friendly for practitioners who might eventually deploy it in real-world applications.

"When you have tools that help you critically assess the data and identify which data points might lead to bias or other undesirable outcomes, it gives you a crucial first step toward building fairer and more reliable models," says Ilyas.

This work is partially funded by the National Science Foundation and the U.S. Defense Advanced Research Projects Agency.

Source: MIT News

Cite this article:

Janani R (2024), Researchers Minimize Bias in AI Models Without Sacrificing Accuracy, AnaTechmaz, pp. 1052.