A Framework for Formulation of Student Dataset Using Existing and Novel Features for Analysis

Authors

  • Esther Samuel Alu Department of Computer Science, Nasarawa State University Keffi, Nigeria
  • Rashida Funke Olanrewaju Department of Computer Science, Ahmadu Bello University Zaria, Nigeria
  • Afolyan A. Obiniyi Department of Computer Science, Nasarawa State University Keffi, Nigeria
  • Muhammad Dahiru Liman Department of Computer Science, Federal University of Lafia, Nigeria

Keywords:

Student, Dataset, Feature Importance, Random Forest

Abstract

One major problem identified in most schools in Nigeria is the lack of structured educational datasets composed of several attributes related to each student, such as term-based grades, courses taken, student-specific details, and absences, which could be easily analysed. This paper formulates a dataset with some novel features for analysing and predicting student performance. In addition to existing features such as age, grade, and number of failures, novel features consisting of environmental factors are proposed. Students’ records were collected from schools, and data on school infrastructure were collected using a questionnaire survey. The data were analysed using NumPy and Pandas in Python. A random forest classifier was used to make predictions and to identify important features. The following features were found to influence the model’s decisions: average, number of failures, students’ scores in all subjects, school type, potable drinking water, availability of electricity, textbook-to-student ratio, and availability of laboratory reagents. Four of the proposed features were among the most important features. In addition, the model performed very well in making predictions. The analysis also shows that there are more males than females in the dataset, which suggests that government, non-governmental organizations, and society need to promote and encourage girl-child education.
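
As an illustration of the kind of analysis the abstract describes, the following is a minimal sketch of training a random forest classifier and ranking features by importance. It assumes scikit-learn as the random forest implementation, which the abstract does not name. The data are synthetic, and the column names, the pass mark of 50, and the model settings are all hypothetical stand-ins, not the paper's dataset or configuration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in data; column names are hypothetical, not the paper's schema.
df = pd.DataFrame({
    "average": rng.uniform(20, 90, n),          # mean score across subjects
    "failures": rng.integers(0, 4, n),          # number of past failures
    "school_type": rng.integers(0, 2, n),       # 0 = public, 1 = private
    "potable_water": rng.integers(0, 2, n),     # 1 = available
    "electricity": rng.integers(0, 2, n),       # 1 = available
    "textbook_ratio": rng.uniform(0.1, 1.0, n)  # textbooks per student
})

# Toy label for illustration only: a student passes if the average is at least 50.
df["passed"] = (df["average"] >= 50).astype(int)

X = df.drop(columns="passed")
y = df["passed"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the forest on the training split and score it on the held-out split.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Impurity-based importances, one per feature, sorted from most to least important.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

print(f"test accuracy: {accuracy:.2f}")
print(importances)
```

Because the toy label depends only on the average score, that column dominates the ranking here. With real school records, the ranking would reflect whatever relationships the data actually contain.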

Published

2023-08-09

Issue

Section

Articles