Uncovering Relationship between Sleep Disorder and Lifestyle using Predictive Analytics

Authors

  • Sham Fang Ying, Faculty of Information Technology, INTI International University, 71800 Nilai, Malaysia
  • Harprith Kaur, Faculty of Information Technology, INTI International University, 71800 Nilai, Malaysia
  • Deshinta Arrova Dewi, Faculty of Information Technology, INTI International University, 71800 Nilai, Malaysia
  • Chong Fong Kim, Faculty of Information Technology, INTI International University, 71800 Nilai, Malaysia

Keywords:

Sleep Disorder, Data Mining, Predictive Analytics, Machine Learning

Abstract

A sleep disorder is a condition that regularly impairs a person’s ability to sleep well,
whether it is caused by a health problem or by outside influences. Most people occasionally
experience sleeping problems for various reasons. However, when the problem keeps
occurring and interferes with daily life, it may indicate a sleep disorder. In some cases, a
sleep disorder is a symptom of another medical or mental health condition and eventually
goes away once the underlying cause is treated. Treatment normally involves a
combination of medical treatment and lifestyle changes. Previous research has reported that
a person’s lifestyle may affect both sleep length and sleep quality. For example, food choice
affects sleep quality and caffeine consumption affects sleep length. This paper aims to uncover the
relationship between sleep disorder and lifestyle by performing data investigation using
predictive analytics. The study follows the Cross-Industry Standard Process for Data Mining
(CRISP-DM) methodology. It starts with the collection of raw datasets acquired
from SleepFoundation.org, one of the leading sources of evidence-based sleep health
information. From these, 1000 data records with 77 attributes are selected and categorized into
five class labels, namely Personal, Diet, Technology, Disease, and Environment. The 77 attributes,
including depression, anxiety disorder, felt sad, and overall health, are then measured using
Cramér’s V and visualized using mosaic plots. The correlation coefficient and p-value
are used to characterize the relationship between these attributes and sleep disorder.
For the predictive analytics, we apply three data mining methods: Support Vector
Machine (SVM), Conditional Inference Tree (CTree), and Recursive Partitioning (Rpart).
Results show that SVM achieves the highest accuracy, up to 80.288%, outperforming Rpart
(71.428%) and CTree (66.499%).
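
As an illustration of the Cramér association measure named in the abstract, the following minimal sketch computes it for two categorical attributes using only the Python standard library. It is not the authors' code. The function name `cramers_v` and the example labels are hypothetical, and no bias correction is applied. The statistic is the square root of chi-squared divided by n(k - 1), where n is the number of records and k is the smaller of the two category counts.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V for two equal-length sequences of categorical labels.

    Hypothetical helper for illustration; returns a value between 0
    (no association) and 1 (perfect association).
    """
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    n = len(xs)
    row_totals = Counter(xs)        # marginal counts of the first attribute
    col_totals = Counter(ys)        # marginal counts of the second attribute
    joint = Counter(zip(xs, ys))    # observed counts per contingency cell
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0                  # undefined with fewer than two categories
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Made-up labels, not data from the study.
caffeine = ["high", "low", "high", "low"]
disorder = ["yes", "no", "yes", "no"]
print(cramers_v(caffeine, disorder))
```

In this toy input the two attributes coincide exactly, so the measure is at its maximum. Independent attributes give a value at or near zero.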

Published

2022-03-15