Semi-supervised spam detection with geo-entities using pretrained language models

Jason Phillips

Thesis

Open access

California State University, Sacramento

Master of Science (MS), California State University, Sacramento

04/23/2025

Handle: https://hdl.handle.net/20.500.12741/rep:12982

Abstract

Artificial intelligence

Fake review detection

Spatial data

Detecting spam in online reviews is challenging, especially when labeled data are limited. Semi-supervised learning addresses this by leveraging both labeled and unlabeled data to improve classification accuracy. This study explores the use of semi-supervised learning techniques to detect fake reviews by combining SpaBERT embeddings with traditional BERT embeddings. The BERT embeddings capture the semantic context of each review, while the SpaBERT embeddings, produced by an extension of BERT, capture the spatial context of each geo-entity mentioned in the reviews. This research demonstrates that adding this spatial context improves spam detection accuracy. Comparative experiments on a real-world review dataset show that, by combining the semantic and spatial contexts of reviews, our model outperformed the baseline GAN-BERT model in terms of effectiveness metrics, making this approach a promising direction for spam detection where labeled data are hard to obtain.

Files and links (1)

PhillipsJason_Fall2024 (pdf, 480.32 kB)

Text, Project, Open Access

Details

Title: Semi-supervised spam detection with geo-entities using pretrained language models
Creators: Jason Phillips
Contributors: Haiquan Chen (Advisor); Ying Jin (Committee Member)
Academic Unit: Computer Science Department
Theses and Dissertations: Master of Science (MS); Computer Science; California State University, Sacramento; 12/04/2024; 2024
Publisher: California State University, Sacramento
Publication Details: 04/23/2025
Identifiers: 99258206817901671; https://hdl.handle.net/20.500.12741/rep:12982
Resource Type: Masters Project
Language: English
Number of pages: 38
Comment: The accessibility of this document has been verified by Sacramento State University Library. For questions, please contact lib-508Accessibility@csus.edu.