Abstract
In recent years, online businesses and websites have become the main target of fake online reviews, which are written intentionally to manipulate business ratings positively or negatively. Most existing work on fake review detection focuses on supervised methods that use lexical and syntactic patterns of the reviews. In this paper, we propose TopicGAN, a GAN-based semi-supervised framework for online fake review detection using topic modeling. Specifically, we first extract spatial named entities from the reviews and employ fuzzy string matching to obtain their embeddings. Second, we train an embedded topic model and represent the words and spatial named entities that appear in the reviews by their corresponding topic distributions. TopicGAN builds on two discriminators: one differentiates between real and fake reviews, and the other differentiates between fake reviews from the dataset and fake reviews produced by the generator. The generator thus competes with both discriminators in a minimax game until convergence. This architecture, coupled with topic modeling and other novel features, allows TopicGAN to be competitive with state-of-the-art semi-supervised methods across all performance metrics in detecting both real and fake reviews.
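
The fuzzy string matching step can be illustrated with a minimal sketch. It assumes a small, hypothetical embedding table (`EMBEDDINGS`) and uses Python's standard-library `difflib` as a stand-in matcher; the matcher, the similarity cutoff, and the helper name `lookup_entity_embedding` are illustrative assumptions, not the method used in the paper. The idea is that an extracted spatial entity that is misspelled or formatted differently is mapped to the closest vocabulary entry so that an embedding can still be retrieved.

```python
from difflib import get_close_matches

# Hypothetical toy embedding table (assumption for illustration only).
# In practice the vectors would come from a pretrained embedding model.
EMBEDDINGS = {
    "chicago": [0.9, 0.1, 0.0],
    "new york": [0.1, 0.9, 0.0],
    "san francisco": [0.0, 0.2, 0.8],
}


def lookup_entity_embedding(entity, embeddings=EMBEDDINGS, cutoff=0.8):
    """Return (matched_key, vector) for the closest vocabulary entry.

    Returns None when no entry reaches the similarity cutoff.
    """
    key = entity.strip().lower()

    # An exact match needs no fuzzy search.
    if key in embeddings:
        return key, embeddings[key]

    # difflib ranks candidates by SequenceMatcher similarity ratio
    # and keeps only those at or above the cutoff.
    matches = get_close_matches(key, list(embeddings), n=1, cutoff=cutoff)
    if not matches:
        return None
    return matches[0], embeddings[matches[0]]


if __name__ == "__main__":
    for name in ["Chicgo", "San Fransisco", "Tokyo"]:
        print(name, "->", lookup_entity_embedding(name))
```

A higher cutoff reduces false matches between distinct place names at the cost of leaving more misspelled entities without an embedding; the value 0.8 here is an arbitrary choice for the sketch.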