Comparison of Recursive Feature Elimination and Boruta as Feature Selection in Greenhouse Gas Emission Data Classification

Authors

  • Riko Febrian, Department of Statistics, Faculty of Mathematics and Natural Sciences, Riau University
  • Anne Mudya Yolanda, Department of Statistics, Faculty of Mathematics and Natural Sciences, Riau University

Keywords:

Classification, greenhouse gas emissions, feature selection, recursive feature elimination, Boruta

Abstract

Classification analysis is a supervised learning method that can be used to categorize levels of greenhouse gas emissions. Regular monitoring of these emissions is essential for the relevant agencies to design prevention and mitigation programs that address climate change. In classification analysis, model performance depends on the number of features (variables) used, which makes feature selection a necessary step. This study compares feature selection methods for classifying greenhouse gas emission levels, specifically the wrapper methods recursive feature elimination (RFE) and Boruta. The Support Vector Machine (SVM) algorithm is used to evaluate classification performance on a binary task with the categories "high" and "low". The results indicate that classification performance is higher with recursive feature elimination than without feature selection or with Boruta feature selection. Using three of the thirty-nine features, the model achieved accuracy, sensitivity, and specificity of 98.95%, 99%, and 97%, respectively.
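
As an illustration of the three metrics reported above, the following minimal Python sketch computes accuracy, sensitivity, and specificity for a binary "high"/"low" classification. It is not the authors' code: the function name `binary_metrics`, the toy labels, and the choice of "high" as the positive class are assumptions made for this example only.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "high",  # assumption: "high" is treated as the positive class
) -> Dict[str, float]:
    """Return accuracy, sensitivity, and specificity for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # positive correctly predicted
        elif truth == positive:
            fn += 1  # positive missed
        elif pred == positive:
            fp += 1  # negative wrongly flagged as positive
        else:
            tn += 1  # negative correctly predicted

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # share of all observations classified correctly
        "accuracy": (tp + tn) / total if total else nan,
        # share of actual positives that were detected
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # share of actual negatives that were detected
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy labels, invented for illustration (not data from the study).
y_true = ["high", "high", "high", "low", "low"]
y_pred = ["high", "high", "low", "low", "high"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

If "low" were taken as the positive class instead, sensitivity and specificity would swap roles, so the class convention should be stated alongside any reported values.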

Published

2024-06-04