Comparative Analysis of Performance and Influence of PCA On Machine Learning Models Leveraging The NSL-KDD Dataset
DOI:
https://doi.org/10.5281/zenodo.8014233
Abstract
Cyber-attacks have become prevalent in the digital sphere, with attacks of many forms causing significant damage to information systems. Intrusion detection systems, which detect attacks on a network, are being developed rapidly. Machine learning algorithms are increasingly used to build such systems, and their performance is evaluated with a range of metrics, alongside techniques that could improve it. This paper presents a comparative analysis of the performance of several machine learning models on the NSL-KDD dataset. The impact of Principal Component Analysis (PCA) on the models is also investigated. Random Forest, Logistic Regression, Support Vector Machine (SVM), Artificial Neural Network (ANN) and K-Nearest Neighbour (KNN) were considered with and without feature selection. Accuracy, F1-score, Precision and Recall were used as the basis for comparing the models. Results show that Random Forest gives the best accuracy of the models considered.
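As a rough illustration of the four metrics named in the abstract, the sketch below computes Accuracy, Precision, Recall and F1-score for a binary attack/normal labelling in plain Python. The function name `binary_metrics` and the toy label lists are hypothetical and chosen only for illustration; they are not taken from the paper or from the NSL-KDD dataset, and the paper's own evaluation code may differ.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    `positive` marks the class treated as an attack (an assumption
    made for this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels (1 = attack, 0 = normal); made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Comparing models with and without PCA then amounts to computing this set of scores once per model for each configuration and placing the results side by side.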