Predicting Employee Attrition: A Machine Learning Comparison
Employees are essential resources for a company. When employees decide to leave, the company can face several problems, such as additional costs and reputation loss. Because companies want to avoid these problems, there is growing interest in using machine learning to predict which employees are likely to leave. Whereas existing literature has focused on traditional machine learning methods, this thesis introduces neural networks as a method for predicting employee attrition. It aims to examine whether, and to what extent, artificial neural networks (ANNs) outperform traditional machine learning methods on this task. The traditional methods used are K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest (RF). The thesis uses a dataset published by IBM Watson Analytics, which contains 1470 instances and 35 features. Two experiments are carried out on this dataset. In the first, the performance of the ANN is compared with that of the traditional methods on the original, imbalanced dataset. In the second, a balanced dataset is created with the Synthetic Minority Oversampling Technique (SMOTE), and all four methods are then trained and compared on the balanced dataset. In the first experiment, the ANN outperformed KNN and RF but did not outperform the SVM. In the second experiment, the ANN did not outperform any of the traditional methods. We therefore conclude that, in this research, ANNs do not outperform traditional machine learning methods in predicting employee attrition.
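
To make the balancing step of the second experiment concrete, the following is a minimal sketch of the core idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the implementation used in the thesis. The function name `smote`, its parameters, and the toy data are assumptions made purely for illustration, and the sketch handles numeric features only. In practice a library implementation, such as the `SMOTE` class in imbalanced-learn, would normally be used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Hypothetical helper for illustration; numeric features only."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data (not the IBM dataset): 12 majority and 4 minority samples.
majority = [[float(x), float(y)] for x in range(6) for y in range(2)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.0], [11.5, 11.5]]

# Generate enough synthetic minority samples to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic sample lies between two real minority samples, the oversampled class stays within the region already occupied by the minority class rather than duplicating existing rows exactly.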