Scalable intrusion detection in IoT networks: Evaluating PySpark pipelines and design trade-offs
Date Issued
2025
Publisher
IEEE
ISBN
9798331543723
9798331543730
ISSN
2325-2944
2325-2936
Citation
Georgiades, M., Hussain, F., Christodoulou, L., Ho, K. H., Hou, Y., & Gregoriades, A. (2025). Scalable intrusion detection in IoT networks: Evaluating PySpark pipelines and design trade-offs. In 2025 21st International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT), Lucca, Italy (pp. 1-8). IEEE.
Type
Conference Paper
Abstract
The rapid growth of Internet of Things (IoT) networks has introduced challenges in securing large-scale, real-time environments against evolving cyber threats. This study evaluates scalable machine learning workflows implemented in PySpark for intrusion detection using the RT-IoT2022 dataset. We compare manual feature engineering with automated pipeline-based approaches across classifiers including Logistic Regression, Naïve Bayes, Decision Tree, and Random Forest. Leveraging PySpark's distributed processing and modular components (such as Pipeline, StringIndexer, VectorAssembler, and MinMaxScaler), we assess how workflow design affects performance metrics (Accuracy, Precision, Recall, and F1 Score), execution time, and model interpretability. Our findings reveal trade-offs between modularity, transparency, and latency, highlighting the need to align workflow architecture with deployment goals. The results provide practical insights for designing explainable, scalable, and resource-aware intrusion detection systems for real-time IoT security.
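For readers unfamiliar with the four performance metrics named in the abstract, the following is a minimal plain-Python sketch of how Accuracy, Precision, Recall, and F1 Score relate to one another. It is not the authors' evaluation code: the function name `binary_metrics`, the sample labels, and the simplification to a binary attack/benign setting are all assumptions made for illustration. The paper itself evaluates classifiers in PySpark on RT-IoT2022.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task.

    Hypothetical helper for illustration only; `positive` marks the
    label treated as the "attack" class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels: 1 = attack traffic, 0 = benign traffic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a multiclass setting such as RT-IoT2022, these quantities are typically computed per class and then averaged. The abstract does not state which averaging scheme the study uses.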