Abstract
BERT (Bidirectional Encoder Representations from Transformers) is one of the leading pre-trained models for NLP tasks. It is pre-trained on a large corpus of unlabeled text drawn from Wikipedia and BooksCorpus, and its contextual embeddings allow it to capture context effectively. Because customers often base their purchasing decisions for a product or service on reviews, there is value in determining whether a review is authentic, and several prior works have addressed this problem. Since most of the available review data is unlabeled, semi-supervised learning with generative adversarial networks (GANs) has been adopted to exploit the features of unlabeled data. In this project, we propose a GAN-BERT-based solution for spatial and non-spatial fake review detection and evaluate it on a dataset collected from TripAdvisor. In our experiments, the approach outperformed state-of-the-art fake review detection approaches.