Generating automatic captions for scientific charts

Harikiran Thallada

Thesis

Open access

California State University, Sacramento

Master of Science (MS), California State University, Sacramento

03/26/2024

Handle:

https://hdl.handle.net/20.500.12741/rep:11901

Abstract

Data visualization

Visual impairment

Accessibility

Machine Learning

With the rapid development of industries and corporate sectors, the amount of data being generated is increasing exponentially, and data collection now reaches almost every technology we use. Businesses use this data to make better decisions that improve profits and customer satisfaction; governments use it to improve public services; scientists need it to conduct research and make new discoveries; even individuals use it to personalize their experiences. Data has become a primary source of progress and innovation in today's world. It comes in many forms (text, numbers, images, video, audio, and more), and not all of it can be understood just by looking at it. Visualization addresses this by turning data into meaningful charts that expose its underlying insights, reducing the effort users need to understand trends and patterns. However, people with full or partial visual disabilities can find these visualizations challenging to understand. This project uses LINECAP, a novel figure captioning dataset containing a total of 3528 line chart images. Each line chart is paired with a human-generated summary (caption) and a number indicating how many lines the chart contains. The dataset has been used to develop machine learning models that predict the line count and generate a summary for a line chart, so that a user does not have to look at the chart to understand it. The project focuses on building machine learning models that process line chart images and produce results that help people with visual impairment comprehend these charts.
The first model, the line count prediction model, uses the DenseNet architecture for image feature extraction. The second model, the line caption generation model, uses the transformer architecture for language modeling. The project also covers the web application that works alongside these models and uses them to generate summaries for line charts.
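As a rough illustration of how the pieces described in the abstract could fit together, the following Python sketch models a LINECAP-style record (image, human-written caption, line count) and a small function that calls a line count model and a caption model on one chart. It is only a sketch under stated assumptions: the names `LineChartExample`, `ChartDescription`, `describe_chart`, and `line_count_accuracy` are hypothetical and do not come from the project, and the two models are passed in as plain callables rather than actual DenseNet or transformer networks.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class LineChartExample:
    """One dataset record: a chart image, its caption, and its line count.

    Field names are hypothetical; the abstract only states that each chart
    has a human-generated caption and a number of lines attached to it.
    """
    image_path: str
    caption: str
    line_count: int


@dataclass(frozen=True)
class ChartDescription:
    """Combined output of the two models for a single chart."""
    line_count: int
    caption: str


def describe_chart(
    image_path: str,
    count_model: Callable[[str], int],
    caption_model: Callable[[str], str],
) -> ChartDescription:
    """Run both models on one image and bundle their results.

    The callables stand in for the trained models (assumed interface).
    """
    return ChartDescription(
        line_count=count_model(image_path),
        caption=caption_model(image_path),
    )


def line_count_accuracy(
    examples: Sequence[LineChartExample],
    count_model: Callable[[str], int],
) -> float:
    """Fraction of examples whose predicted line count matches the label."""
    if not examples:
        raise ValueError("examples must not be empty")
    correct = sum(
        1 for ex in examples if count_model(ex.image_path) == ex.line_count
    )
    return correct / len(examples)
```

A web application along the lines mentioned in the abstract could call something like `describe_chart` once per uploaded image and present the returned count and caption as text for a screen reader; that usage is an assumption, not a description of the actual application.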

Files and links (1)

pdf

ThalladaHarikiran_Fall2023_508CompliantCopy (5.69 MB)

TextProject Open Access

Details

Title: Generating automatic captions for scientific charts
Creators: Harikiran Thallada
Contributors: Anna Baynes (Advisor); Haiquan Chen (Committee Member)
Academic Unit: Computer Science Department
Theses and Dissertations: Master of Science (MS); Computer Science; California State University, Sacramento; 12/01/2023; 2023
Publisher: California State University, Sacramento
Publication Details: 03/26/2024
Identifiers: 99258116062201671; https://hdl.handle.net/20.500.12741/rep:11901
Resource Type: Masters Project
Language: English
Number of pages: 69
Accessibility Statement: This document has been made accessible/508 compliant by Sacramento State University Library. For questions, please contact lib-accessibility@csus.edu.