Title: A study for method-level code smells detection using machine learning algorithms
Publisher
Academic Press
Abstract
Motivation: Code smells reflect poor design decisions that degrade software quality and maintainability. Although several machine learning algorithms have been proposed to detect code smells, the impact of feature selection and cross-validation on certain method-level smells, specifically Long Parameter List and Switch Statement, has not been adequately explored in prior research.

Methodology: This study investigates the detection of four method-level code smells, namely Long Parameter List (LPL), Switch Statement (SS), Feature Envy (FE), and Long Method (LM), using twenty machine learning algorithms. We apply the Information Gain feature selection algorithm and the Equal Width Discretization (EWD) class balancing method. Performance is evaluated using 10-fold cross-validation across multiple metrics: accuracy, precision, recall, F-measure, MCC, ROC-area, and PRC-area.

Key Findings: The proposed framework achieved 99.77% accuracy on the Long Method dataset using the Filtered Classifier with feature selection and class balancing. This study is also the first to demonstrate the effect of feature selection and cross-validation on the LPL and SS datasets, where significant performance improvements are observed.

Contributions: This study provides a comprehensive comparative analysis of twenty machine learning algorithms on four method-level code smell datasets.

© 2025 The Author(s)
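To make the feature-selection step named in the methodology more concrete, the following is a minimal sketch of ranking numeric code metrics by information gain. It is not the study's implementation: the function names, the toy data, and the choice of three bins are all hypothetical, and equal-width binning is used here only to discretize each metric before computing entropy. How the study itself applies EWD is not specified in the abstract.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def information_gain(values, labels, n_bins=3):
    """Information gain of a numeric feature after equal-width binning."""
    bins = equal_width_bins(values, n_bins)
    total = len(labels)
    groups = {}
    for b, y in zip(bins, labels):
        groups.setdefault(b, []).append(y)
    # Entropy of the class labels that remains once the bin is known.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: two metrics for six methods, labelled 1 when the
# method is considered to have a Long Parameter List.
param_count = [1, 2, 2, 7, 8, 9]
lines_of_code = [10, 40, 25, 12, 38, 27]
is_lpl = [0, 0, 0, 1, 1, 1]

# Rank the metrics from most to least informative about the label.
ranking = sorted(
    {
        "param_count": information_gain(param_count, is_lpl),
        "lines_of_code": information_gain(lines_of_code, is_lpl),
    }.items(),
    key=lambda kv: kv[1],
    reverse=True,
)
```

In this toy data the parameter count separates the two classes completely while the method length carries no information about the label, so a selector that keeps only the top-ranked metrics would retain `param_count` and discard `lines_of_code`.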
