FairTrace : An automated unsupervised fairness machine learning model - The case of huanglongbing citrus disease

Neha Yelshetty

Thesis

Open access

California State University, Sacramento

Master of Science (MS), California State University, Sacramento

07/09/2024

Handle: https://hdl.handle.net/20.500.12741/rep:12103

Keywords: Huanglongbing; Greening disease; Fairness; Unsupervised machine learning; Automation; Machine Learning

Abstract

As data science becomes more integrated into our daily lives, concerns about the fairness of automated decision-making are gaining attention. When humans supervise machine learning algorithms, for example in time series data analysis, questions arise about whether the decisions made by the models are fair. Despite research on algorithmic fairness across different fields, data scientists continue to encounter challenges in achieving fairness with machine learning models in practical settings. No end-to-end system yet ensures fairness in machine learning algorithms without a human in the loop, which highlights the need for a practical solution to ensure fair decision-making. To ensure fair decision-making on time series data generated by a machine learning model, we have developed a fairness model named FairTrace. It is built on top of FairRover [1], which involves a human in mitigating bias and retraining the machine learning model. FairTrace is an iterative three-step approach aimed at removing bias from machine learning model outcomes by eliminating the need for human supervision. The steps of FairTrace are: (1) Audit: the results from a machine learning model are validated against an agent-based model (ABM); (2) Explanation: transparent documentation is provided about the fairness of the machine learning predictions; (3) Bias Mitigation: the model is automatically retrained if the machine learning output does not satisfy the fairness constraints. These three steps are repeated until the machine learning model generates predictions that satisfy the fairness constraints. To showcase our approach, FairTrace is integrated into the HLB grower web tool, a website designed for citrus farmers. The HLB grower web tool offers detailed insights into the potential spread of huanglongbing (HLB) across a representative California citrus grove, accompanied by crucial information such as profit and yield data over five years.
HLB spread is predicted using LightGBM, a machine learning model trained on data generated from ABM simulations. The fairness outcomes of the LightGBM model are documented on the website, giving users transparent information about the reliability of its predictions. Additionally, if the validation process reveals the need for retraining, the retraining happens in the background using ABM-generated values, which supports continuous improvement and accuracy of the machine learning model while the relevant information is displayed on the website. Using the ABM eliminates the need for human input in validating machine learning predictions, automating the process to increase both efficiency and reliability. FairTrace also presents challenges in balancing multiple fairness constraints simultaneously, which could affect model performance metrics. Thus, achieving a balance between fairness and performance remains a key consideration in deploying FairTrace.
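
The audit, explanation, and bias mitigation loop described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation. Every function name, the one-weight toy model, the reference values standing in for ABM output, and the "maximum gap within a tolerance" check used as a fairness constraint are assumptions made only for illustration; the abstract does not specify the actual constraints or the retraining procedure.

```python
from typing import List, Sequence, Tuple

# All names here are hypothetical stand-ins, not taken from the thesis.


def audit(predictions: Sequence[float], reference: Sequence[float],
          tolerance: float) -> Tuple[float, bool]:
    """Step 1 (Audit): compare predictions with reference (ABM-style) values."""
    gap = max(abs(p - r) for p, r in zip(predictions, reference))
    return gap, gap <= tolerance


def explain(iteration: int, gap: float, passed: bool) -> str:
    """Step 2 (Explanation): record the audit result in readable form."""
    status = "satisfied" if passed else "violated"
    return f"iteration {iteration}: max gap {gap:.3f}, constraint {status}"


def retrain(weight: float, inputs: Sequence[float],
            reference: Sequence[float], step: float = 0.5) -> float:
    """Step 3 (Bias Mitigation): move the toy weight part-way toward
    the least-squares fit of the reference values."""
    target = (sum(x * r for x, r in zip(inputs, reference))
              / sum(x * x for x in inputs))
    return weight + step * (target - weight)


def run_fairness_loop(weight: float, inputs: Sequence[float],
                      reference: Sequence[float], tolerance: float,
                      max_iterations: int = 20) -> Tuple[float, List[str]]:
    """Repeat audit, explanation, and retraining until the assumed
    constraint holds or the iteration budget runs out."""
    report: List[str] = []
    for iteration in range(max_iterations):
        predictions = [weight * x for x in inputs]  # toy model: y = weight * x
        gap, passed = audit(predictions, reference, tolerance)
        report.append(explain(iteration, gap, passed))
        if passed:
            break
        weight = retrain(weight, inputs, reference)
    return weight, report


# Usage: the reference values play the role of ABM-generated data.
inputs = [1.0, 2.0, 3.0]
reference = [2.0 * x for x in inputs]
final_weight, report = run_fairness_loop(0.0, inputs, reference, tolerance=0.5)
for line in report:
    print(line)
```

In this sketch the loop needs no human decision: the reference values decide both whether retraining is triggered and what the model is retrained toward, which mirrors the role the abstract assigns to the ABM. The iteration cap is an added safeguard in case the assumed constraint cannot be met.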

Files and links (1)

pdf

YelshettyNeha_Spring2024_508CompliantCopy11.00 MB

Text, Thesis, Open Access

Metrics

144 File views/downloads

151 Record Views

Details

Title: FairTrace : An automated unsupervised fairness machine learning model - The case of huanglongbing citrus disease
Creators: Neha Yelshetty
Contributors: Ahmed Salem (Advisor) - California State University, Sacramento, Computer Science Department
Anna Baynes (Committee Member) - California State University, Sacramento, Computer Science Department
Bang S Tran (Committee Member) - California State University, Sacramento, Computer Science Department
Jonathan D Kaplan (Committee Member) - California State University, Sacramento, Economics Department
Ajay Singh (Committee Member) - California State University, Sacramento, Environmental Studies Department
Academic Unit: Computer Science Department
Theses and Dissertations: Master of Science (MS); Computer Science; California State University, Sacramento; 05/02/2024; 2024
Publisher: California State University, Sacramento
Publication Details: 07/09/2024
Identifiers: 99258154563701671; https://hdl.handle.net/20.500.12741/rep:12103
Resource Type: Masters Thesis
Language: English
Number of pages: 73
Accessibility Statement: This document has been made accessible/508 compliant by Sacramento State University Library. For questions, please contact lib-accessibility@csus.edu.