Data Analysis and Rating Prediction on Google Play Store Using Data Mining Techniques

Authors

  • Kayalvily Tabianan, Faculty of Information Technology, INTI International University, 71800 Nilai, Negeri Sembilan, Malaysia
  • Denis Arputharaj, Faculty of Information Technology, INTI International University, 71800 Nilai, Negeri Sembilan, Malaysia
  • Mohd Norshahriel Abd Rani, Faculty of Information Technology, INTI International University, 71800 Nilai, Negeri Sembilan, Malaysia
  • Sarasvathi Nagalingham, Faculty of Information Technology, INTI International University, 71800 Nilai, Negeri Sembilan, Malaysia

Keywords:

Google Play Store, Decision Tree, Analysis, Machine Learning Algorithm

Abstract

Google Play Store, formerly known as Android Market, is the largest Android application
(app) marketplace. It provides a wide variety of details on each app, such as reviews, quality,
number of installs, and explanations of device functionality. This study aims to predict the ratings of
Google Play Store apps using a decision tree, a machine learning algorithm for classification.
The goal of using a decision tree is to create a trained model that can be used to predict the class
or value of the target variable by learning simple decision rules inferred from prior data. This
method classifies a population into branch-like segments that construct an inverted tree with a
root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently
deal with large, complicated datasets without imposing a complicated parametric
structure. This enables us to draw a comprehensive picture of the current state of the
Google Play Store by analyzing download counts and ratings against current
market trends. This will help developers understand customers' desires, attitudes, and
trends in demand. For a more in-depth understanding, the study also examines similarity in
device functionality and constructs clusters of related applications, then analyzes their
characteristics according to features of interest. The dataset used was collected from the Google
Play Store (2019). The expected results are a stronger correlation between price
and number of downloads, and a similarity between price and participation.
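
The decision-tree idea described above (a root node, internal nodes that apply simple threshold rules, and leaf nodes that hold a predicted class) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gini impurity criterion, the function names, the two features (price and installs), and the six toy rows are all assumptions made up for illustration and are not taken from the study's dataset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively grow the tree; leaves are plain class labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf by applying each node's threshold rule."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical toy data: (price in USD, installs in thousands) -> rating class.
rows = [(0.0, 500), (0.0, 1200), (0.99, 300), (4.99, 20), (9.99, 5), (2.99, 50)]
labels = ["high", "high", "high", "low", "low", "low"]

tree = build_tree(rows, labels)
print(predict(tree, (0.0, 800)))
print(predict(tree, (5.99, 10)))
```

On real data, a library implementation with pruning and proper train/test evaluation would normally be preferred over a hand-written tree like this one.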

Published

2022-01-08