Abstract
As data science becomes more integrated into daily life, concerns about the fairness of automated decision-making are gaining attention. Human supervision of machine learning algorithms, such as those used in time series analysis, raises concerns about whether the models' decisions are fair. Despite research on algorithmic fairness across many fields, data scientists still struggle to achieve fairness with machine learning models in practical settings. No end-to-end system currently ensures fairness in machine learning algorithms without a human in the loop, which highlights the need for a practical solution for fair decision-making.
To ensure fairness in the decisions a machine learning model makes on time series data, we have developed a fairness model named FairTrace. It builds on FairRover [1], which involves a human in mitigating bias and retraining the machine learning model. FairTrace is an iterative three-step approach that aims to remove bias from a machine learning model's outcomes while removing the need for human supervision. The steps of FairTrace are: (1) Audit: the results from the machine learning model are validated against an agent-based model (ABM); (2) Explanation: transparent documentation about the fairness of the machine learning predictions is provided; (3) Bias Mitigation: the model is automatically retrained if its output does not satisfy the fairness constraints. These three steps are repeated until the machine learning model generates predictions that satisfy the fairness constraints.
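The iterative audit, explanation, and mitigation cycle described above can be sketched roughly as follows. This is a minimal illustration and not the FairTrace implementation: the names `audit`, `fairness_loop`, `AuditReport`, and `ToyModel` are hypothetical, and the fairness constraint is assumed, purely for illustration, to be a tolerance on the largest gap between model predictions and ABM reference values.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class AuditReport:
    """One documented audit result (the 'Explanation' step); hypothetical structure."""
    iteration: int
    max_gap: float
    passed: bool


def audit(predictions: Sequence[float], reference: Sequence[float],
          tolerance: float) -> Tuple[float, bool]:
    """Audit step: compare predictions with ABM reference values.

    ASSUMPTION: the constraint is 'largest absolute gap <= tolerance'.
    """
    gap = max(abs(p - r) for p, r in zip(predictions, reference))
    return gap, gap <= tolerance


class ToyModel:
    """Stand-in for a trained model: it predicts the reference plus an offset."""

    def __init__(self, offset: float) -> None:
        self.offset = offset

    def predict(self, reference: Sequence[float]) -> List[float]:
        return [r + self.offset for r in reference]

    def retrain(self, reference: Sequence[float]) -> None:
        # Toy 'retraining': each round halves the systematic offset.
        self.offset /= 2


def fairness_loop(model: ToyModel, reference: Sequence[float],
                  tolerance: float = 0.06, max_rounds: int = 10) -> List[AuditReport]:
    """Repeat audit -> explanation -> mitigation until the constraint holds."""
    reports: List[AuditReport] = []
    for iteration in range(1, max_rounds + 1):
        gap, passed = audit(model.predict(reference), reference, tolerance)
        reports.append(AuditReport(iteration, gap, passed))  # explanation
        if passed:
            break
        model.retrain(reference)  # bias mitigation, no human involved
    return reports


# Usage with made-up reference values standing in for ABM output.
abm_reference = [0.1, 0.4, 0.7]
for report in fairness_loop(ToyModel(offset=0.4), abm_reference):
    print(report)
```

The `max_rounds` cap is a design choice of this sketch, added so the loop terminates even if retraining never satisfies the constraint; the source does not state whether FairTrace uses such a cap.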
To showcase our approach, we integrated FairTrace into the HLB grower web tool, a website designed for citrus farmers. The tool offers detailed insights into the potential spread of huanglongbing (HLB) across a representative California citrus grove, accompanied by crucial information such as profit and yield data over five years. HLB spread is predicted using the machine learning model LightGBM, which is trained on data generated from ABM simulations.
The fairness outcomes of the LightGBM model are documented on the website, giving users transparent information about the reliability of the machine learning predictions. Additionally, if the validation process reveals the need for retraining, retraining happens in the background using ABM-generated values, and the relevant information is displayed on the website. This ensures continuous improvement in the accuracy of the machine learning model.
Using the ABM eliminates the need for human input in validating machine learning predictions, automating the process to increase both efficiency and reliability. However, FairTrace also presents challenges in balancing multiple fairness constraints simultaneously, which could affect model performance metrics. Thus, achieving a balance between fairness and performance remains a key consideration in deploying FairTrace.