Project Overview

Websites that collect reviews of businesses have become an integral part of how business owners plan for future growth. Free-text reviews, numerical star ratings, and upvotes and downvotes have given a new form to the traditional notion of “word of mouth”. These sites help consumers make informed decisions, discover new recommendations, express their opinions, and influence other users. For business owners, such websites help to monitor growth, inspire bold decisions, and make it possible to address consumer concerns quickly.

Motivation

The success of a new business depends on many factors, with location being the primary one. Many other aspects of a business are also captured on review websites. Going through all of these factors manually is tedious. Moreover, decisions based on factors that are only available in an unstructured format can be misleading. This, together with the usefulness of review websites for decision support tasks, serves as our motivation. Currently, the Yelp dataset covers existing businesses and users, and a large body of research has built on it since Yelp released the dataset for the Yelp challenge. The dataset is used for different types of research (Rafay, Suleman, and Alim 2020; Yun, Wu, and Wang 2014).

Project objectives

Build a decision support system with a human-computer interaction (HCI) interface that uses machine learning to aid investors and potential business owners in their new ventures.

We address the following questions, derived from our project objective:

  • What insights can we draw from the given dataset which can help us understand consumer and business behaviour and habits?
  • In what ways can we model user reviews for various classification tasks?
  • Which statistics matter to potential business owners and allow them to make a sound business decision?
  • How can we increase the engagement of a potential investor with selected statistics using HCI techniques?

Dataset details

We use the dataset provided by Yelp, a company that offers business recommendations and local search based on peer ratings, free-text reviews, and business facilities. We focus on the “Restaurants” business category. The original files are in JSON format, and we use the following for our work:

  1. Business.json - contains the list of businesses, each assigned a unique business_ID. It also records the category of each business, its location details (including longitude and latitude), and its opening times.

  2. Review.json - contains the users' reviews of each business: a free-text review accompanied by a star rating, together with the upvotes and downvotes each review received.
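As a rough illustration of how the two files fit together, the sketch below filters business records down to the “Restaurants” category and then keeps only the reviews that belong to those businesses. It is a minimal sketch under stated assumptions, not the project's actual loading code: it assumes each file holds one JSON object per line, that the keys are named `business_id` and `categories`, and that `categories` is a comma-separated string. The helper names `iter_restaurants` and `reviews_for` and the sample records are hypothetical.

```python
import json
from typing import Iterable, Iterator, Set


def iter_restaurants(lines: Iterable[str]) -> Iterator[dict]:
    """Yield business records whose categories include "Restaurants"."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: "categories" is a comma-separated string, or null/missing.
        categories = record.get("categories") or ""
        names = {name.strip() for name in categories.split(",")}
        if "Restaurants" in names:
            yield record


def reviews_for(lines: Iterable[str], business_ids: Set[str]) -> Iterator[dict]:
    """Yield review records that refer to one of the given business ids."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: reviews link to businesses through a "business_id" key.
        if record.get("business_id") in business_ids:
            yield record


# Made-up sample records standing in for lines of Business.json and Review.json.
business_lines = [
    '{"business_id": "b1", "name": "Cafe A", "categories": "Restaurants, Italian"}',
    '{"business_id": "b2", "name": "Shop B", "categories": "Shopping"}',
    '{"business_id": "b3", "name": "No Cat", "categories": null}',
]
review_lines = [
    '{"review_id": "r1", "business_id": "b1", "stars": 4, "text": "Good pasta"}',
    '{"review_id": "r2", "business_id": "b2", "stars": 2, "text": "Too pricey"}',
]

restaurants = list(iter_restaurants(business_lines))
restaurant_ids = {b["business_id"] for b in restaurants}
restaurant_reviews = list(reviews_for(review_lines, restaurant_ids))
print(len(restaurants), len(restaurant_reviews))  # prints: 1 1
```

With the real files, the same functions could be fed an open file handle instead of a list, since iterating over a text file also yields one line at a time; this avoids loading a large review file into memory at once.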

Project repository

All the code for the GUI, the NLP components, and the project notebook can be accessed from the repository here

Tutorial video (Screencast)

Our project workflow and the working of the GUI can be seen here

Team members

  1. Abhinav Srivastava (223683)
  2. Neel Rajkumar Mishra (224143)
  3. Sneha Videkar (221283)
  4. Yash Shah (223740)

Under the guidance of: M.Sc. Uli Niemann