Abstract
Online opinion spamming has become a widespread threat in the digital era, as most decisions, from purchasing a simple product to consulting a particular doctor, are made based on online user opinions. Taking advantage of this, businesses in various fields either commit online spamming or are affected by it, for reasons such as market competition and profit gains. Despite the significant research carried out on identifying spam reviews, a large gap remains in detecting spamming activity at the level of the business as a whole (we measure this as the honesty of the business). Identifying a single review as spam or benign is not enough to establish that a business is dishonest or trustworthy. With advances in the camouflage strategies that malicious users (spammers) follow when writing fake reviews, it has become difficult to categorize a review as spam or non-spam. One important such strategy is the singleton review technique, in which reviewers create multiple accounts and write only one review under each account. A large number of such Singleton Reviews (SRs) biases the overall review profile of a business. Recent research reveals that singleton reviews are a significant source of spam reviews and largely affect the ratings of online businesses. For example, about 68% of the Amazon review data are singleton reviews. In this research project, we focus on detecting the businesses that are affected by opinion spamming over time. We take advantage of Yelp review data containing reviews of 5,044 businesses by 260,277 reviewers. We leverage recent deep learning techniques such as transfer learning, semantic embeddings, autoencoding, and LSTMs to classify each business as honest or dishonest based on semantic analysis of its reviews over time.
Extensive experiments showed that the proposed models outperformed the baseline models in precision, recall, and F1 score when identifying both honest and dishonest businesses.