Abstract
The use of IoT devices has grown rapidly in recent years, especially for smart home automation, and they have become part of daily life. However, there are growing concerns about the privacy and security of the data these devices collect. The network traffic flowing across IoT devices may contain anomalies that alter their behavior. A device under attack may deviate from its normal behavior and take actions with potentially serious consequences. Hence, it is necessary to monitor, identify, and block anomalous network traffic to protect the IoT devices connected to a network.
Hardware and software systems have been developed to detect network intrusions. They fall into two broad categories: signature-based and anomaly-based intrusion detection systems. Signature-based systems detect malicious activity by using a database of attack signatures. A major limitation of this approach is that it relies on previously known attack signatures, so the system must be constantly updated to deal with new types of attacks.
To overcome the limitations of signature-based network intrusion detection systems, anomaly-based network intrusion detection systems have been developed. These systems do not rely on previously known attacks but detect anomalies by distinguishing between normal behavior and abnormal behavior.
A subset of anomaly-based network intrusion detection methods uses deep learning models to classify traffic as attack or benign, and such models can be applied to detecting malicious traffic in a network of IoT devices. The traditional approach is to train the models on centrally stored data collected from all the devices in the network. This centralized framework raises concerns about data privacy and security, because an attack on the central server can compromise the data and expose sensitive information.
To address the issues of data privacy and security, we propose using a newer approach called federated learning. This approach helps to preserve data privacy by decentralizing the training of the machine learning model. Data does not need to be sent to a central server; instead, deep learning models are trained on each device using its local dataset. A federated server then improves a global model by aggregating the models from the individual devices connected to the network. The aggregation allows the global model to learn about the network behavior without being trained on the original data. The global model can then be used to detect attacks in the IoT network.
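The aggregation step described above can be illustrated with a minimal sketch of FedAvg-style weighted averaging. This is not the project's implementation: the function name `fed_avg`, the flat-list representation of model parameters, and the use of local sample counts as weights are assumptions made for illustration.

```python
def fed_avg(client_weights, client_sizes):
    """Aggregate client models into a global model (FedAvg-style sketch).

    client_weights: list of parameter vectors, one flat list of floats per client
                    (hypothetical representation, chosen for simplicity).
    client_sizes:   number of local training samples held by each client.
    Returns the sample-weighted average of the client parameter vectors.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all client models must have the same shape")
        # Each client contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            global_weights[i] += (n / total) * w
    return global_weights


# Only model parameters are passed to the server; raw traffic never leaves a device.
device_models = [[1.0, 2.0], [3.0, 4.0]]
device_samples = [1, 3]
global_model = fed_avg(device_models, device_samples)
```

In this sketch the server sees only parameter vectors and sample counts, which is the property the federated framework relies on to keep the local datasets on the devices.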
In the first part of the project, we explore and implement federated learning techniques to detect attack traffic in the IoT network. We use a deep neural network on the labeled dataset and an autoencoder on the unlabeled dataset in a federated framework. We implement model aggregation algorithms such as FedSGD, FedAvg, and FedProx and compare their performance. In the second part of the project, we implement the same models in a centralized framework using the labeled and unlabeled datasets.
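Of the aggregation algorithms named above, FedProx differs from FedAvg mainly in the local update: each client adds a proximal term (mu/2)·||w − w_global||² to its loss, which pulls the local model toward the current global model. The sketch below shows one such local gradient step. The function name `fedprox_local_step` and the values of `lr` and `mu` are illustrative assumptions, not the settings used in this project.

```python
def fedprox_local_step(local_w, global_w, loss_grad, lr=0.1, mu=0.01):
    """One local gradient step with a FedProx-style proximal term (sketch).

    local_w:   current local parameters (flat list of floats).
    global_w:  parameters of the global model received from the server.
    loss_grad: gradient of the local loss at local_w.
    lr, mu:    learning rate and proximal coefficient (assumed values).
    """
    if not (len(local_w) == len(global_w) == len(loss_grad)):
        raise ValueError("parameter and gradient vectors must have the same length")
    # The gradient of (mu / 2) * ||w - w_global||^2 is mu * (w - w_global).
    return [
        w - lr * (g + mu * (w - wg))
        for w, wg, g in zip(local_w, global_w, loss_grad)
    ]


# With mu = 0 the proximal term vanishes and this is a plain gradient step.
plain_step = fedprox_local_step([1.0], [0.0], [2.0], lr=0.5, mu=0.0)
prox_step = fedprox_local_step([1.0], [0.0], [0.0], lr=0.5, mu=1.0)
```

A larger `mu` keeps the local models closer to the global model, which is intended to help when the devices hold differently distributed traffic.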
By comparing the accuracy of the models, we find that deep neural networks and autoencoders trained in a federated framework perform similarly to the same models trained in a centralized framework. We also determine which aggregation algorithm for the global model yields the best performance for detecting attack traffic in the IoT network. Based on our results, we conclude that federated learning is a secure and privacy-preserving approach for detecting attacks in IoT networks.