Abstract
Review websites such as Yelp, Google Places, and Zagat play a major role in determining a user’s choice of business. The overall rating these sites provide is critical because users often do not have time to read every review before making a decision. Existing rating systems compute the overall rating of a business as the average of the ratings provided by users. This project aims to improve the reliability and accuracy of rating systems. Machine-learning techniques are used to normalize and predict a user’s rating based on the content of that user’s own reviews as well as other users’ reviews. More weight is given to recent reviews, since the quality of a business may change over time. We investigated several approaches to detecting fake reviewers, using average pairwise cosine similarity, rating deviation, and burst review ratio, and to detecting fake reviews, using linear discriminant analysis. The proposed rating system is decoupled into three modules. The data fetch module populates a database with data from the review websites, using a combination of the web services those sites provide and direct scraping of their web pages. The mining module is the core application and comprises various machine-learning techniques for sentiment analysis. We also used various machine-learning algorithms to detect fake reviewers and fake reviews. The web application module provides a graphical representation of the analytics computed by the mining module. By conducting intelligent analysis of reviews in real time, this rating system will help users make an informed decision about a business.
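As a rough illustration of two ideas named above, the following minimal sketch shows one way a recency-weighted rating and an average pairwise cosine similarity score could be computed. It is not the project's implementation: the exponential decay with a `half_life_days` parameter, the whitespace bag-of-words vectors, and the function names `recency_weighted_rating` and `avg_pairwise_cosine` are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def recency_weighted_rating(reviews, half_life_days=180.0):
    """Weighted mean of star ratings; reviews is a list of (stars, age_in_days).

    Assumption: a review's weight halves every `half_life_days` days.
    """
    weights = [0.5 ** (age / half_life_days) for _, age in reviews]
    total = sum(weights)
    if total == 0:
        raise ValueError("no reviews to aggregate")
    return sum(w * stars for w, (stars, _) in zip(weights, reviews)) / total


def _cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def avg_pairwise_cosine(texts):
    """Mean cosine similarity over all pairs of one reviewer's review texts.

    Assumption: plain lowercase whitespace tokens; no stemming or TF-IDF.
    """
    vectors = [Counter(t.lower().split()) for t in texts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(_cosine(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # A fresh 5-star review counts twice as much as a 180-day-old 1-star one.
    print(recency_weighted_rating([(5, 0), (1, 180)]))
    # Near-duplicate texts from one reviewer give a score close to 1.
    print(avg_pairwise_cosine(["great food great staff", "great food great staff"]))
```

Under these assumptions, a reviewer whose reviews are near-duplicates of one another scores close to 1 on the similarity measure. Such a score could serve as one signal among several when flagging suspicious reviewers, alongside rating deviation and burst review ratio.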