Uncovering Relationship between Sleep Disorder and Lifestyle using Predictive Analytics
Keywords:
Sleep Disorder, Data Mining, Predictive Analytics, Machine Learning
Abstract
A sleep disorder is a condition that regularly impairs a person's ability to sleep well,
whether it is caused by a health problem or by outside influences. Most people occasionally
experience sleeping problems for various reasons. However, when the problem keeps
recurring and interferes with daily life, it may indicate a sleep disorder. In some cases, a
sleep disorder is a symptom of another medical or mental health condition and eventually
goes away once the underlying cause is treated. Treatment normally involves a
combination of medical treatment and lifestyle changes. Previous research has reported that
a person's lifestyle may affect sleep duration and quality. For example, food choice affects
sleep quality and caffeine consumption affects sleep duration. This paper aims to uncover the
relationship between sleep disorder and lifestyle by performing data investigation using
predictive analytics. The study follows the Cross Industry Standard Process for Data Mining
(CRISP-DM) methodology. It starts with the collection of raw datasets acquired
from SleepFoundation.org, one of the leading sources of evidence-based sleep health
information. From these, 1000 data records with 77 attributes are selected and categorized into
five class labels, i.e. Personal, Diet, Technology, Disease, and Environment. The 77 attributes,
including depression, anxiety disorder, felt sad, overall health, etc., are then measured using
Cramér's V and visualized using mosaic plots. The correlation coefficient and p-value
are used to assess the relationship between those attributes and sleep disorder.
For the predictive analytics, we apply three data mining methods, i.e. Support Vector
Machine (SVM), Conditional Inference Tree (CTree), and Recursive Partitioning (Rpart).
Results show that SVM achieves the highest accuracy, up to 80.288%, outperforming Rpart (71.428%)
and CTree (66.499%).
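
The association measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the "Cramer's value" refers to the standard Cramér's V statistic computed from a contingency table of two categorical attributes. The function name `cramers_v` and the example counts are hypothetical and are not taken from the paper or its dataset. The sketch also assumes every row and column of the table contains at least one observation.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    Assumes at least two rows, at least two columns, and no empty row
    or column (otherwise an expected count would be zero).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalize by sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical counts: rows = attribute present / absent,
# columns = sleep disorder reported / not reported.
example_table = [[30, 10], [15, 45]]
v = cramers_v(example_table)
```

The statistic ranges from 0 (no association) to 1 (perfect association), which makes attributes with different numbers of categories comparable on one scale.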
Copyright (c) 2022 Journal of Data Science
This work is licensed under a Creative Commons Attribution 4.0 International License.