DebiasML: A Causal Discovery tool for eliminating social bias in Machine Learning

Albrin Richard

Thesis

Open access

California State University, Sacramento

Master of Science (MS), California State University, Sacramento

08/12/2024

Handle:

https://hdl.handle.net/20.500.12741/rep:12212

Keywords: Unbiased model training; Bias-free dataset; Causal relations; Data visualization; Social bias elimination; Machine Learning

Abstract

Most machine learning algorithms learn patterns from datasets that may contain embedded social biases, which raises concerns about fairness and accountability in decision-making. Using such models in fields such as law enforcement, admissions, health care, recruitment, and finance, where decisions must remain unaffected by social bias, undermines accountability and trust in machine learning algorithms. The aim of this project is to develop a graphical tool for detecting and correcting social biases, with the goal of generating an unbiased dataset for training ML models. Users can identify biases against certain groups, such as women or racial minorities, by detecting unfair causal relations within the causal network. The identified biases must then be removed while minimizing distortion to the original dataset. The tool represents dataset columns as nodes of a Directed Acyclic Graph (DAG) and uses the PC (Peter and Clark) algorithm for causal discovery to better capture the relationships between features and reduce bias gaps. We use linear Structural Equation Models (SEM) to quantify the strength of each edge between nodes, choosing the model by target variable type: linear regression if the target variable is numerical, and multinomial logistic regression if it is categorical. The user can then inject domain knowledge by removing or altering edges that carry social bias, producing a refined causal model without social bias. The tool can also generate a new debiased dataset by debiasing and rescaling the original dataset, and it provides chart-based evaluation metrics for assessing how well bias has been removed. The user selects the sensitive variable, the target label variable, and the ML model used to evaluate bias reduction in the debiased dataset. Bias metrics, ML metrics, and dataset metrics are shown as bar graphs, the data distortion percentage as a gauge, and the distribution of the sensitive and target variables as fourfold charts.
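As a rough illustration of the edge-weighting and edge-removal idea described in the abstract, the following Python sketch fits a linear regression for one numerical node of a toy causal graph and then subtracts the contribution of a sensitive parent. The variable names (gender, education, income), the toy data, the helper ols, and the subtraction-based debiasing step are all assumptions made for illustration; the abstract does not specify the tool's actual implementation, and the categorical (multinomial logistic regression) case is not shown.

```python
def ols(features, target):
    """Least-squares coefficients (intercept first) via the normal equations.

    Hypothetical helper for this sketch; assumes the feature columns are
    not collinear, so the system has a unique solution.
    """
    rows = [[1.0, *map(float, r)] for r in features]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, target))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Toy data (assumed): income depends on education and, unfairly, on gender.
gender = [0, 1, 0, 1, 0, 1, 0, 1]
education = [10, 10, 12, 12, 14, 16, 16, 18]
income = [20 + 3 * e + 5 * g for g, e in zip(gender, education)]

# Edge strengths into the numerical node "income" from its two parents.
parents = list(zip(gender, education))
intercept, w_gender, w_education = ols(parents, income)

# "Remove" the gender -> income edge by subtracting its estimated contribution.
debiased_income = [y - w_gender * g for y, g in zip(income, gender)]

# Refit on the debiased column: the gender edge weight should vanish.
_, w_gender_after, w_education_after = ols(parents, debiased_income)

print("before:", round(w_gender, 6), round(w_education, 6))
print("after: ", round(w_gender_after, 6), round(w_education_after, 6))
```

In this sketch only the direct gender -> income edge is removed; any difference between groups that flows through education is left untouched, which matches the idea of deleting a specific unfair edge rather than equalizing outcomes.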

Files and links (1)

pdf

RichardAlbrin_Spring2024_508CompliantCopy (3.20 MB)

Text; Project; Open Access

Metrics

7 File views/ downloads

103 Record Views

Details

Title: DebiasML: A Causal Discovery tool for eliminating social bias in Machine Learning
Creators: Albrin Richard
Contributors: Anna Baynes (Advisor); Haiquan Chen (Committee Member)
Academic Unit: Computer Science Department
Theses and Dissertations: Master of Science (MS); Computer Science; California State University, Sacramento; 05/01/2024; 2024
Publisher: California State University, Sacramento
Publication Details: 08/12/2024
Identifiers: 99258157062901671; https://hdl.handle.net/20.500.12741/rep:12212
Resource Type: Masters Project
Language: English
Number of pages: 86
Accessibility Statement: This document has been made accessible/508 compliant by Sacramento State University Library. For questions, please contact lib-accessibility@csus.edu.