Semi-supervised fake review detection using pre-trained BERT embeddings

Thomas Hoang

Thesis

Open access

California State University, Sacramento

Master of Science (MS), California State University, Sacramento

08/28/2023

Handle: https://hdl.handle.net/20.500.12741/rep:11208

Keywords: Information science; Deception; FakeGAN; GANBERT; Opinion spam; SpamGAN; Transformers

Abstract

BERT (Bidirectional Encoder Representations from Transformers) is one of the leading pre-trained models among current state-of-the-art tools for NLP tasks. It is pre-trained on a large corpus of unlabeled text drawn from Wikipedia and BooksCorpus, and its contextual embeddings capture context effectively. Because customers rely on reviews when deciding whether to purchase a product or service, there is value in determining whether a review is authentic. Several prior works have addressed this problem. Because most reviews in large datasets are unlabeled, semi-supervised learning with generative adversarial networks (GANs) has been adopted to exploit the features of unlabeled data. In this project, we propose a GAN-BERT-based solution for spatial and non-spatial fake review detection. We evaluated its performance on a dataset collected from TripAdvisor. Our extensive experiments verified the superiority of our approach over state-of-the-art fake review detection approaches.
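
The abstract does not spell out the training objective, so the following is only a minimal sketch of the semi-supervised GAN discriminator loss commonly used in GAN-BERT-style models, not the project's actual implementation. It assumes a discriminator that outputs three logits per example: two real classes (truthful and deceptive review) plus one extra class for generator output. The class indices, the function names, and the hard-coded logits are all hypothetical; in a real system the logits would come from a classifier head on top of BERT embeddings.

```python
import math

# Hypothetical class layout (an assumption, not taken from the project):
# 0 = truthful review, 1 = deceptive review, 2 = sample produced by the generator.
GENERATED = 2


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_loss(logits, label):
    """Cross-entropy over the real classes only, for a labeled example."""
    real_probs = softmax(logits[:GENERATED])
    return -math.log(real_probs[label])


def unsupervised_loss_real(logits):
    """Penalty when a real review (labeled or not) is judged to be generated."""
    probs = softmax(logits)
    return -math.log(sum(probs[:GENERATED]))


def unsupervised_loss_generated(logits):
    """Penalty when a generator sample is not recognised as generated."""
    return -math.log(softmax(logits)[GENERATED])


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def discriminator_loss(labeled, unlabeled, generated):
    """Sum of the supervised term and the two unsupervised terms.

    labeled:   list of (logits, label) pairs
    unlabeled: list of logits for real reviews without labels
    generated: list of logits for generator samples
    """
    real_logits = [logits for logits, _ in labeled] + list(unlabeled)
    sup = mean(supervised_loss(lg, y) for lg, y in labeled)
    unsup_real = mean(unsupervised_loss_real(lg) for lg in real_logits)
    unsup_gen = mean(unsupervised_loss_generated(lg) for lg in generated)
    return sup + unsup_real + unsup_gen


if __name__ == "__main__":
    labeled = [([2.0, -1.0, -3.0], 0), ([-0.5, 1.5, -2.0], 1)]
    unlabeled = [[0.3, 0.1, -1.0], [1.2, 0.8, -2.5]]
    generated = [[-1.0, -0.5, 2.0]]
    print(discriminator_loss(labeled, unlabeled, generated))
```

The point of the extra class is that unlabeled reviews still contribute a training signal: the discriminator must separate them from generator output even though their truthful or deceptive label is unknown.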

Files and links (1)

HoangThomas_Spring2023 (pdf, 659.23 kB); Text; Project; Open Access

Details

Title: Semi-supervised fake review detection using pre-trained BERT embeddings
Creators: Thomas Hoang
Contributors: Haiquan Chen (Advisor) - California State University, Sacramento, Computer Science Department
Ying Jin (Committee Member) - California State University, Sacramento, Computer Science Department
Academic Unit: Computer Science Department
Theses and Dissertations: Master of Science (MS); Computer Science; California State University, Sacramento; 05/06/2023; 2023
Publisher: California State University, Sacramento
Publication Details: 08/28/2023
Identifiers: 99258062061901671; https://hdl.handle.net/20.500.12741/rep:11208
Resource Type: Masters Project
Language: English
Number of pages: 40
Comment: The accessibility of this document has been verified by Sacramento State University Library. For questions, please contact lib-508Accessibility@csus.edu.