Exploring advances in multi-label text classification for enhanced healthcare management

Srujay Reddy Vangoor

Thesis

Open access

California State University, Sacramento

Master of Science (MS), California State University, Sacramento

08/11/2025

Handle:

https://hdl.handle.net/20.500.12741/rep:13314

Abstract

Keywords: Transformers; Deep learning; Machine learning

This project proposes a new approach to digital healthcare that improves the interpretation of medical record databases through advances in Multi-Label Text Classification (MLTC). MLTC assigns multiple labels to a given text, a challenging but essential task in healthcare because of the complexity of clinical information. The heterogeneous nature of health data requires effective management strategies to improve patient outcomes and healthcare operations. The project investigates state-of-the-art approaches, including classical machine learning, deep learning, and natural language processing (NLP) models, to improve the accuracy of medical document classification. Central to the project is an MLTC framework that not only categorizes medical texts accurately but also extracts valuable insights, transforming latent data into actionable knowledge. The project uses two distinct datasets: the Toxic Comment Classification Challenge dataset (cjadams et al., 2017), which is widely used for text classification tasks, and MIMIC-IV (Johnson et al., 2020), an extensive, freely available database of de-identified health data. Using these datasets, we focus on developing a model capable of handling overlapping and diverse medical labels, facilitating better retrieval, decision support, and patient care. This project builds upon the baseline work presented in 'Multi-Label Text Classification using Attention-based Graph Neural Network' by Ankit Pal, Muru Selvakumar, and Malaikannan Sankarasubbu (Pal et al., 2020). Their study introduced a graph attention network-based model that captures the attentive dependency structure among labels, using a feature matrix and a correlation matrix to explore dependencies and generate classifiers for the task.
In contrast, this project develops a novel model that combines graph neural networks with transformer-based architectures, specifically BERT, to achieve enhanced classification performance. A user interface hosted on Hugging Face Spaces lets stakeholders explore the model interactively and evaluate its capabilities.
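
To make the multi-label setting concrete, the following is a minimal sketch and not the project's implementation. It assumes a toy, hypothetical label set and hand-written annotations (nothing here comes from MIMIC-IV or the Toxic Comment dataset). It shows two ideas the abstract refers to: a label correlation matrix, estimated here by simple conditional co-occurrence (one common choice, assumed for illustration), and independent per-label sigmoid decisions, which allow any number of labels to be assigned to one text. The function names `cooccurrence_correlation` and `predict_labels` are illustrative only.

```python
import math

# Hypothetical label set and toy multi-hot annotations (illustrative only).
LABELS = ["diabetes", "hypertension", "kidney_disease"]
Y = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
    [1, 0, 1],
]


def cooccurrence_correlation(y):
    """Return A where A[i][j] estimates P(label j | label i) from co-occurrence counts."""
    n = len(y[0])
    counts = [[0] * n for _ in range(n)]
    for row in y:
        for i in range(n):
            if row[i]:
                for j in range(n):
                    if row[j]:
                        counts[i][j] += 1
    corr = []
    for i in range(n):
        total = counts[i][i]  # number of documents carrying label i
        corr.append([counts[i][j] / total if total else 0.0 for j in range(n)])
    return corr


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Decide each label independently, so several labels can be active at once."""
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


corr = cooccurrence_correlation(Y)
# The logits below stand in for the output of some classifier (assumed values).
predicted = predict_labels([2.0, -1.0, 0.3])
```

In a graph-based model, a matrix like `corr` could serve as the adjacency information over label nodes, while the logits would come from a text encoder such as BERT; how the project actually combines the two is described in the thesis itself, not in this sketch.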

Files and links (1)

VangoorSrujayReddy_Fall2024 (pdf, 1.51 MB)

Text; Project; Open Access

Details

Title: Exploring advances in multi-label text classification for enhanced healthcare management
Creators: Srujay Reddy Vangoor
Contributors: Kin Chung Kwan (Advisor)
Anna Baynes (Committee Member)
Haiquan Chen (Committee Member)
Academic Unit: Computer Science Department
Theses and Dissertations: Master of Science (MS); Computer Science; California State University, Sacramento; 12/05/2024; 2024
Publisher: California State University, Sacramento
Publication Details: 08/11/2025
Identifiers: 99258242565901671; https://hdl.handle.net/20.500.12741/rep:13314
Resource Type: Masters Project
Language: English
Number of pages: 101