Article
Hybrid Intrusion Detection System Based on Combination of
Random Forest and Autoencoder
Chao Wang 1,2 , Yunxiao Sun 1,2 , Wenting Wang 3 , Hongri Liu 1,4 and Bailing Wang 1,2, *
1 School of Computer Science and Technology, Harbin Institute of Technology, Weihai 264209, China
2 School of Cyber Science and Technology, Harbin Institute of Technology, Harbin 150001, China
3 State Grid Shandong Electric Power Company, Electric Power Research Institute, Jinan 250003, China
4 Weihai Cyberguard Technologies Co., Ltd., Weihai 264209, China
* Correspondence: [email protected]
Abstract: To cope with the rising threats posed by network attacks, machine learning-based intrusion
detection systems (IDSs) have been intensively researched. However, there are several issues that
need to be addressed. It is difficult to deal with unknown attacks that do not appear in the training
set, and as a result, poor detection rates are produced for these unknown attacks. Furthermore, IDSs
suffer from a high false positive rate. As different models learn data characteristics from different
perspectives, in this work we propose a hybrid IDS which leverages both random forest (RF) and
autoencoder (AE). The hybrid model operates in two steps. In particular, in the first step, we utilize
the probability output of the RF classifier to determine whether a sample is an attack. The
unknown attacks can be identified with the assistance of the probability output. In the second step,
an additional AE is coupled to reduce the false positive rate. To simulate an unknown attack in
experiments, we explicitly remove some samples belonging to one attack class from the training
set. Compared with various baselines, our suggested technique demonstrates a high detection rate.
Furthermore, the additional AE detection module decreases the false positive rate.
Keywords: intrusion detection; random forest; autoencoder; hybrid model; unknown attack
networks [12]. In datasets with both normal and attack samples, a classifier can find a
decision boundary between the normal and attack samples.
As network attack methods are becoming more complicated, however, it is challenging
to obtain samples of all attack types. When encountering unknown attacks during the
detection phase, the supervised classifier may not generalize well [13], which may result in
misclassification for these samples and decrease the detection rate. To tackle this challenge,
researchers have attempted to create an IDS model that trains on the normal data alone,
such as an autoencoder (AE) [14]. AEs employ the reconstruction error as the anomaly
score, where a sample with a higher score can be detected as an attack. However, due
to the lack of supervision by both normal and attack samples, it may not obtain the same
high performance as supervised algorithms, which may learn complex decision boundaries
between the normal and attack samples.
In this paper, we focus on the case where the attack samples available for training are
limited, because it is difficult to collect samples of all attack types. For example, a supervised
model may be trained on only some known attack types. As attack variants or new types of
attacks continue to emerge, the trained detection
model may not detect a novel attack. As a result, developing a robust IDS model with a
higher detection rate and a lower false positive rate becomes critical. With the consideration
that different techniques can learn the characteristics of data from different perspectives,
in this work, we propose a hybrid IDS that combines an RF and an AE. In particular,
the hybrid IDS comprises two steps in the detection phase. The first step is the application
of RF with probabilistic methods to detect attacks. Then, considering how to reduce the
false alarm rate further, we employ the AE module in the second step. The contributions of
this study can be listed as follows:
1. In the first step, we employ RF to identify attacks. Unlike commonly used strategies,
we employ the predicted probability to distinguish the samples. With a predefined
threshold, samples with probabilities higher than the threshold are identified as attacks. In
this manner, the RF can identify some unknown attacks.
2. In the second step, we combine another detector utilizing a different detection principle.
In detail, we apply an AE to recheck samples that have been predicted as
attacks by the RF classifier. Samples with a lower reconstruction error
can be reclassified as normal. This additional step decreases the false positive rate
even further.
3. To demonstrate the effectiveness of the proposed methods, we conduct experiments
on two intrusion detection datasets. In the experiments, we explicitly set some attacks as
unknown. The combined approach provides a greater detection rate and a lower false
positive rate compared with other baseline methods.
The remainder of this paper is organized as follows: we describe the relevant
work concerning the IDS in Section 2. The whole detection framework and corresponding
methodology are presented in Section 3. Section 4 demonstrates the performance of the
suggested approach via comprehensive tests. Finally, we draw relevant conclusions and
highlight avenues for further work in Section 5.
2. Related Work
The purpose of IDSs is the discovery of anomalous operations within the monitoring
environment. There are two ways to classify IDSs: based on the data source utilized or the
detection methods. According to the data source utilized in the detection engine, IDSs can
be divided into two categories: host-based IDSs (HIDS) and network-based IDSs (NIDS) [4].
The former utilizes data generated on one host, while the latter checks network traffic
packets transmitted within the network. In this study, we focus on machine learning-based
NIDS.
The overall process of machine learning can be summarized as two parts: training
and testing [1]. During the training phase, models are trained on the collected dataset and
learn the characteristics of the input features. After training, the model is deployed in the
Symmetry 2023, 15, 568 3 of 16
testing phase to examine anomalous samples. Many classic machine learning
methods have been applied to construct IDSs [15,16]. As an ensemble learning approach,
the RF classifier yields considerable detection performance [17]. It constructs numerous
decision trees (DTs) to obtain a higher detection rate than a single DT.
As issues such as high-dimensional data [18,19] and data imbalance [20] may arise,
researchers have proposed increasingly enhanced classifiers. To reduce the
impact of irrelevant features and enhance the detection rate, the authors of [21] selected
useful features first based on the correlation between the features and classified samples
using a combination of several distinct classifiers.
RF can be employed directly as a feature selection method. The authors of [22] applied
an RF to discover the optimal features for classification based on feature importance first.
After that, the selected features are utilized to train a support vector machine. There are
also some other hybrid models involving two parts. For example, the authors of [23] used
both AE and DNN to classify attacks. However, this method has difficulty with some
unknown attacks that do not appear in the training set.
In some situations, it is difficult to collect or simulate the attack samples [24]. It is
reasonable to employ some one-class classifiers to learn about the characteristics of network
traffic. One-class learning aims to build a profile of normal traffic. For example, the one-
class support vector machine (OCSVM) [25] attempts to distinguish between normal and
anomalous data by learning the hyperplane that has the maximum distance between the
normal samples and the origin [26]. In addition, the isolation forest (IF) algorithm can be
used to detect anomalies [27]. Furthermore, there are various works employing AEs [14].
AEs are generally used for feature extraction [28,29]; however, they can also be utilized for
anomaly detection [14].
In this work, we seek to develop an IDS with the objectives of a high detection rate and
a low false alarm rate. In a real deployment, unknown attacks exist, and
we find that some supervised classifiers may incorrectly classify unknown attacks
as normal. To solve this problem, we employ a probabilistic RF and an AE. One work
on fraud detection [30] is similar to ours in that it also utilizes RF and AE. However,
it employed the AE as a dimensionality reduction approach to extract representative
features, and it applied the RF in a probabilistic manner to overcome
the problem of data imbalance.
3. Proposed Methods
In this section, we describe the suggested model in detail. We introduce the employed
techniques first, i.e., RF and AE. Then, we merge these two methods to introduce the full
detection framework.
3.1. Random Forest
As illustrated in Figure 1, given the training set, there are M distinct DT classifiers.
To obtain final prediction results, majority voting is employed to aggregate the predictions
from each DT. In the following, we present the detailed training procedure for RF. Considering
a labeled dataset with samples {x_1, x_2, ..., x_N} and labels {y_1, y_2, ..., y_N}, where
N is the number of samples and every sample includes j features, we aim to train M distinct
DTs. The general steps can be summarized as follows:
(1) Sampling from the training set with N samples using the bootstrap with replacement.
(2) Construction of a DT classifier using the selected samples.
To construct one DT, we first select k features from the j features. The value of k is set
to sqrt(j). After that, we pick the best split feature from the chosen k features and divide the
node into two child nodes. The Gini impurity is utilized as the split criterion at each
node. These procedures are repeated to grow the tree as deep as possible.
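The construction steps above correspond closely to scikit-learn's RandomForestClassifier hyperparameters; the following is a minimal sketch on synthetic data (not the paper's datasets):

```python
# A minimal sketch on synthetic data; the hyperparameters mirror the
# construction steps described above.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

rf = RandomForestClassifier(
    n_estimators=100,     # M decision trees
    bootstrap=True,       # sample the training set with replacement
    max_features="sqrt",  # examine k = sqrt(j) features at each split
    criterion="gini",     # Gini impurity as the split criterion
    max_depth=None,       # grow each tree as deep as possible
    random_state=0,
).fit(X, y)

print(len(rf.estimators_))  # 100
```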
In this study, we focus on binary classification. To predict an input sample, the RF
utilizes the votes of the trees in the forest weighted by their probability estimates [32]. Let
p denote the probability of being predicted as the attack. Then, the probability of being
predicted as normal is q and q = 1 − p. The predicted class probabilities of an input sample
are calculated as the mean predicted class probabilities of the trees in the forest. The class
probability of a single tree is the fraction of samples of the same class in a leaf node [32].
We utilize the “predict_proba” function in scikit-learn [32] to output the probability
that an object belongs to a certain class. Usually, the samples can be classified into one
class with the highest probability. However, in this approach, some samples belonging to
unknown attacks may be wrongly classified as normal. Instead, we define a threshold
T to guide the decision. If the probability p of belonging to the attack class is
greater than the threshold T, the sample is classified as an attack. In this manner, given a
sample x_i and its corresponding probability p_i, we define a decision function f(·). The results
are indicated by ±1, where +1 denotes an anomalous sample. The calculation is shown below:
f(x_i) = { −1, if p_i ≤ T;  +1, if p_i > T }    (1)
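Assuming the attack class is encoded as label 1, the probability-based decision of Equation (1) can be sketched with scikit-learn's predict_proba as follows (the dataset and threshold value here are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

T = 0.2  # illustrative threshold; the paper selects it from validation data

# predict_proba averages the per-tree class probabilities; column 1 is p,
# the probability of the attack class (assumed to be label 1 here).
p = rf.predict_proba(X)[:, 1]

# Decision function f of Equation (1): +1 = attack, -1 = normal.
f = np.where(p > T, 1, -1)
print(np.unique(f))
```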
3.2. Autoencoder
Deep learning has been shown to be quite effective in a variety of research fields [33]. It
learns data representations with multiple neural network layers. The other part of our
suggested model is a special unsupervised neural network [34], an AE. As it can rebuild the
input, the reconstruction error can serve as the anomaly score for identifying abnormalities.
The framework for anomaly detection using AE is depicted in Figure 2. We introduce the
general process later.
Figure 2. The AE-based anomaly detection framework: an encoder compresses the input X, a decoder reconstructs it as X̂, and samples whose reconstruction error exceeds the threshold are labeled as attacks, while the rest are labeled as normal.
The reconstruction error e_i of sample x_i is the squared error between the input and its reconstruction x̂_i:

e_i = ‖x_i − x̂_i‖²    (2)
The training process aims to minimize the reconstruction loss. After training, the
well-trained AE can identify anomalous samples using the MSE. Similar to the
thresholding of the RF probability with a predefined threshold T, we use a
function f(·) to make the decision, as illustrated below:
f(x_i) = { −1, if e_i ≤ T;  +1, if e_i > T }    (3)
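A minimal sketch of this decision rule, using a small MLPRegressor trained to reproduce its input as a stand-in autoencoder (the paper's exact AE architecture is not assumed here):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(500, 8))  # stand-in for normal traffic features

# Train the "AE" to reproduce its input through a bottleneck (8 -> 4 -> 8).
ae = MLPRegressor(hidden_layer_sizes=(4,), max_iter=2000, random_state=0)
ae.fit(X_normal, X_normal)

def decide(model, X, T):
    """Equation (3): +1 (attack) when the per-sample MSE e_i exceeds T, else -1."""
    e = np.mean((X - model.predict(X)) ** 2, axis=1)
    return np.where(e > T, 1, -1)

# Illustrative threshold: the 95th percentile of the training errors.
errors = np.mean((X_normal - ae.predict(X_normal)) ** 2, axis=1)
T = np.percentile(errors, 95)
print(decide(ae, X_normal, T).shape)  # (500,)
```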
Figure 3. The overview of our proposed method. (a) Training process; (b) testing process.
Because the AE trains on normal data only, normal samples should have a lower
MSE than anomalous ones. From this point of view, we can define a low threshold,
and samples below it can be assigned to the normal class with higher confidence.
Under this assumption, we integrate these two decision processes. After obtaining the
trained model, during the testing phase we apply a two-step detection strategy.
We list the detection procedure in Algorithm 1. Two hyperparameters are
considered for the decision: T1, used for the RF probability, and T2, used for the MSE.
First, a sample x_i is classified by the RF classifier. When its attack probability is larger
than T1, it is classified as an attack. After that, we utilize the AE to examine the attack
samples predicted by the RF again: when the reconstruction error is smaller than T2,
the sample is reclassified as normal. With this two-step approach, more testing samples
can be correctly classified; in particular, normal samples mistakenly flagged as attacks
can be recovered.
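The two-step procedure of Algorithm 1 can be sketched as follows; the function name and the synthetic demo data are ours, and we assume the attack class is encoded as label 1:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPRegressor

def two_step_detect(rf, ae, X, T1, T2):
    """Step 1: flag attacks whose RF attack probability exceeds T1.
    Step 2: let the AE reclassify flagged samples with MSE below T2 as normal."""
    pred = np.full(len(X), -1)              # -1 = normal, +1 = attack
    p = rf.predict_proba(X)[:, 1]           # attack-class probability (label 1)
    pred[p > T1] = 1
    attacks = np.flatnonzero(pred == 1)
    if attacks.size:
        e = np.mean((X[attacks] - ae.predict(X[attacks])) ** 2, axis=1)
        pred[attacks[e < T2]] = -1          # AE recheck reduces false positives
    return pred

# Illustrative demo on synthetic data (not the paper's datasets).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
ae = MLPRegressor(hidden_layer_sizes=(5,), max_iter=1500,
                  random_state=0).fit(X[y == 0], X[y == 0])
print(two_step_detect(rf, ae, X, T1=0.5, T2=0.05).shape)  # (600,)
```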
4. Experimental Results
In this section, we present the experimental results for the proposed approach. First,
we describe the dataset and preprocessing procedures applied in the experiments. The eval-
uation metric and comparison methods are then introduced. After that, the specific experi-
ment settings are listed. The results of the experiments are thoroughly analyzed in the final
part.
4.1. Dataset
To conduct the experiments, we use two intrusion detection datasets [35], namely,
NF-CSE-CIC-IDS2018-v2 and NF-BoT-IoT-v2. Both datasets are created using the NetFlow
v9 features from the original datasets CSE-CIC-IDS2018 [36] and BoT-IoT [37]. In this study,
we refer to them as IDS2018 and BoT-IoT, respectively. Considering that there is a large
number of samples in the dataset, we randomly sample different categories of data, and
the detailed distribution of the different categories is shown in Table 1.
Table 1. The sample distribution of different attack types for both datasets.
IDS2018                                        BoT-IoT
No.  Class         Number of Samples      No.  Class           Number of Samples
1    Normal        120,000                1    Normal          65,150
2    DDoS          68,000                 2    DoS             42,332
3    DoS           48,000                 3    DDoS            21,133
4    Bot           14,000                 4    Reconnaissance  13,000
5    Bruteforce    12,000                 5    Theft           2133
6    Infiltration  11,000                 -    -               -
7    Web           3502                   -    -               -
In the IDS2018 dataset, there are six attacks (not including normal data): DDoS, DoS,
Bot, Bruteforce, Infiltration, and Web attacks. As for the BoT-IoT dataset, there are four
attacks. There are 43 features for every record. To preprocess the dataset, we remove some
irrelevant columns, for example, the source IP. After that, all numeric features are transformed
via the log function to decrease the effect of large values. Furthermore, the categorical
features are encoded with the one-hot encoding method. We use min–max normalization to
scale the features into the range of 0 to 1. After preprocessing, the dimension of the
IDS2018 dataset is about 300, and the dimension of the BoT-IoT dataset is about 200.
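A sketch of this preprocessing on toy flow records (the column names are illustrative, not the actual NetFlow v9 feature names):

```python
import numpy as np
import pandas as pd

# Toy flow records; the column names are illustrative stand-ins.
df = pd.DataFrame({
    "IN_BYTES": [120, 4_000_000, 87],
    "OUT_PKTS": [3, 9_000, 1],
    "PROTOCOL": ["tcp", "udp", "tcp"],
})
numeric, categorical = ["IN_BYTES", "OUT_PKTS"], ["PROTOCOL"]

# log1p squashes large values, then min-max maps each column into [0, 1].
num = df[numeric].apply(np.log1p)
num = (num - num.min()) / (num.max() - num.min())

# Categorical features become binary one-hot columns.
cat = pd.get_dummies(df[categorical]).astype(float)

X = pd.concat([num, cat], axis=1)
print(X.shape)  # (3, 4): two scaled numeric + two one-hot columns
```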
Figure 4. Confusion matrix of detection results.
Based on these four entries of the classification results, we calculate several
performance metrics. We list four metrics commonly used in the field of classification,
including accuracy, precision, recall, and F1. Their calculations are defined as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (4)
Precision = TP / (TP + FP)    (5)

Recall = TP / (TP + FN)    (6)

F1 = (2 × Precision × Recall) / (Precision + Recall)    (7)
The F1 is the harmonic mean of precision and recall. Furthermore, there is an additional
metric named false positive rate (FPR) as below:
FPR = FP / (FP + TN)    (8)
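Equations (4)–(8) can be computed directly from the confusion-matrix counts; the helper below and its example counts are illustrative:

```python
def ids_metrics(tp, tn, fp, fn):
    """Equations (4)-(8) from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return accuracy, precision, recall, f1, fpr

# Illustrative counts (not from the paper's experiments).
acc, prec, rec, f1, fpr = ids_metrics(tp=98, tn=93, fp=7, fn=2)
print(round(acc, 3), round(fpr, 3))  # 0.955 0.07
```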
The comparative AE method uses the same network settings as ours. In our method,
T2 for the AE is used to reduce the FPR. It is set to a low value to ensure that the
samples it reclassifies as normal carry higher confidence. When using the AE alone to
detect attacks, we use another threshold, selected on the validation dataset as the value
that yields the best F1 score.
Figure 5. The probability of being predicted as an attack for testing samples of the IDS2018 dataset
when DoS is the unknown attack.
In the figure, the distribution of the samples' probabilities is plotted, where the x-axis is
the probability and the y-axis is the number of samples. From the figure, we can see that
normal and known attack samples are concentrated at the two extremes, with probabilities
generally near 0 or 1, whereas the unknown attack samples are located in the middle.
Consequently, if we use the default "predict" method, the normal and known attacks are
classified correctly, but the unknown attacks would be misclassified as normal.
In our method, we can use the probability to classify the samples. In this instance, we
can set the threshold to 0.2 and obtain the correct prediction. Next, we display the MSE
distribution, which is plotted in Figure 6. As there is no attack sample in the training set, it
is hard for the AE to rebuild the attack samples well. Whether known or unknown, most
attack samples have a higher MSE than normal samples.
Figure 6. The MSE of testing samples for IDS2018 dataset.
To further compare the combination methods, we plot the confusion matrix of the four
classifiers in Figure 7.
             (a) RF          (b) RF(Pro)     (c) AE          (d) Ours
True label   Normal  Attack  Normal  Attack  Normal  Attack  Normal  Attack
Normal       96.7    3.3     89.8    10.2    92.4    7.6     93.5    6.5
Attack       64.8    35.2    1.8     98.2    2.0     98.0    2.0     98.0
Figure 7. The detection confusion matrix (in %) of different classifiers on IDS2018 dataset when
the unknown attack is DoS. (a) Classification confusion matrix of RF. (b) Classification confusion
matrix of RF(Pro). (c) Classification confusion matrix of AE. (d) Classification confusion matrix of
combination methods.
The plain RF shows the lowest detection rate, as 64.8% of attacks are wrongly
categorized as normal. When it uses the highest-probability class for classification (as can
be seen from Figure 5), it wrongly classifies some unknown attack samples as normal.
From Figure 7b, it can be seen that only 1.8% of attack samples are misclassified as normal
by the RF(Pro) classifier, a large improvement over the plain RF. Furthermore, the AE
detector has a higher true positive rate. Finally, our proposed combination method achieves
the lowest false positive fraction compared with RF(Pro) and AE. As stated before, we apply
the AE to the RF results and relabel attack samples that have a lower MSE as normal. In this
manner, the combination method has a lower FP.
To further demonstrate the performance of our method, we average all of the results
when different attacks are set as unknown. The detailed results are shown in Tables 2 and 3
for both datasets. We report the mean value and standard deviation in the table.
Table 2. The detection performance (in %) of different classifiers for the BoT-IoT dataset. The table
reports the five metrics mentioned above. Both the mean value and standard deviation are reported.
First, based on the accuracy and F1 results for both datasets, our method outperforms
the others. In detail, our method has the highest F1 of 99.72% for the BoT-IoT dataset and
95.90% for the IDS2018 dataset. Although the recall of OCSVM on the IDS2018 dataset is
higher than ours, its other metrics are worse than ours.
As the experimental results lead to similar conclusions for both datasets, we analyze
them together. The first four supervised methods present higher precision and a lower FPR
than the other methods. Because some samples belong to unknown attacks in the experiments,
the supervised methods classify them directly into the normal class.
The detection performance of IF is not satisfactory; it has an F1 of only 65% on the BoT-IoT
dataset. The other two detection methods, OCSVM and AE, present higher performance
than the supervised methods. As these three methods require only normal data during training,
they are capable of dealing with both known and unknown attacks. The F1 of OCSVM and AE
reaches about 93%, which is higher than the four supervised methods.
The single detector "RF(Pro)", which is part of our method, has a performance similar
to that of our full method. The RF classifier is significantly enhanced after using probability
to make the decision: the recall of "RF(Pro)" is higher than that of RF by about 8% on the
BoT-IoT dataset and about 20% on the IDS2018 dataset. As we stated before, we aim to
reduce the false positive rate by combining the AE and RF, and the FPR on both datasets
in the tables is lower than that of either single detector. For example, on the IDS2018 dataset,
our hybrid method has the lowest FPR of 1.81% compared to "RF(Pro)" or AE. In addition,
the F1 of our method is higher than these two basic methods.
After analyzing the average detection performance on different attacks, it is reasonable
to investigate the performance of the various classifiers when dealing with different unknown
attacks. For simplicity, we report the methods related to RF and AE for the IDS2018
dataset. We plot only the F1 and FPR in Figures 8 and 9, considering that F1 is the harmonic
mean of recall and precision and that the FPR demonstrates the improvement of our method.
Figure 8. F1 (in %) of different detection methods against various unknown attacks.
Figure 9. FPR (in %) of different detection methods against various unknown attacks.
In Figure 8, there are six different unknown attacks. The probability-based RF performs
better than the classical RF method in most cases. Furthermore, our method presents the
highest F1. As shown in Figure 9, our combination method reduces the FPR significantly.
In our model, two hyperparameters have significant effects: T1 and T2.
Because there are no unknown attacks in the training or validation set, we need to set them
manually. To examine their effect, we vary these two hyperparameters over representative
values. In detail, T1 is selected from the {80, 85, 90, 95, 99}th percentiles and T2 from the
{65, 70, 75, 80, 85}th percentiles. We plot the comparisons in Figure 10. As before, we report
the F1 and FPR for the IDS2018 dataset only.
(a) F1 (in %):

T1 \ T2     65      70      75      80      85
80th       95.6    95.7    95.8    95.9    95.8
85th       95.9    96.0    96.1    96.1    96.0
90th       95.8    95.9    95.9    95.9    95.7
95th       89.9    89.9    89.9    89.8    89.6
99th       81.3    81.3    81.3    81.2    81.0

(b) FPR (in %):

T1 \ T2     65      70      75      80      85
80th        5.8     5.0     4.3     3.6     2.9
85th        4.1     3.5     3.0     2.5     2.0
90th        2.5     2.2     1.8     1.5     1.2
95th        1.2     1.1     0.9     0.7     0.6
99th        0.3     0.2     0.2     0.2     0.2
Figure 10. Performance (in %) comparison of various threshold values on the IDS2018 dataset.
Combining the five T1 values with the five T2 values yields 25 results. Our selected thresholds
are highlighted with a red circle. (a) F1, higher is better. (b) FPR, lower is better.
In this section, we restate the effects of the two parameters. T1 is the cutoff on the
probability output by the RF that decides whether a sample is an attack, while T2 is the
threshold on the MSE used to determine whether a sample may have been misclassified
as an attack. It is hard to decide the optimal values of these two hyperparameters because
there are no unknown attack samples in the training or validation set.
First, we find that the FPR in Figure 10b decreases as T1 and T2 increase.
However, F1 presents a different pattern as the thresholds change. When T1 reaches its
highest value, i.e., the 99th percentile, F1 reaches its lowest value. This is because, with a
higher T1, more and more attack samples are missed, although the FPR is lower. F1 does
not change much as T2 varies; however, the FPR becomes lower.
In our experiments, we set T1 as the 90th percentile of the probabilities of normal data
in the validation set and T2 as the 75th percentile of the MSEs of normal data in the
validation set; these results are highlighted with a red circle. We can see that the highest
F1 is achieved at T1 = 85th and T2 = 75th percentiles, which is only 0.2% higher than ours.
The results show that the selected values perform satisfactorily.
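This percentile-based threshold selection can be sketched as follows, with randomly generated stand-ins for the validation-set probabilities and MSEs of normal samples:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for validation-set statistics of NORMAL samples only:
val_probs_normal = rng.beta(1, 9, size=1000)      # RF attack probabilities
val_mses_normal = rng.gamma(2.0, 0.5, size=1000)  # AE reconstruction MSEs

# T1: 90th percentile of the probabilities; T2: 75th percentile of the MSEs.
T1 = np.percentile(val_probs_normal, 90)
T2 = np.percentile(val_mses_normal, 75)
print(0.0 <= T1 <= 1.0, T2 > 0.0)  # True True
```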
5. Conclusions
With the increasing risks of network attacks, network environments need more powerful
IDSs to protect them. As more and more attacks appear every day, it is important to
handle the issues presented by unknown attacks. In this study, we develop a hybrid IDS to
boost the detection rate when dealing with unknown attacks.
In detail, the proposed method combines RF and AE. Because the unknown attacks
may be misclassified, we use the probability output of the RF classifier to check the samples
first. Then, the AE is utilized to recheck the attacks predicted by the RF and reduce the
FPR. We conducted experiments on two intrusion detection datasets while setting some
attack samples explicitly as unknown. The experimental results prove that the combination
method boosts the detection rate and reduces the FPR in comparison to the single
detection methods.
Some directions are worth further investigation. Only one type of attack was set as
unknown in the experiments; it would be valuable to set more than one type of attack as
unknown to test the model. In this study, we focused on binary classification. We plan to
expand the method into a multi-class approach to provide more diagnostic information for
security operators in the future.
Author Contributions: Conceptualization, C.W. and Y.S.; data curation, C.W.; formal analysis, W.W.
and H.L.; funding acquisition, B.W.; investigation, C.W. and H.L.; methodology, C.W.; project
administration, B.W.; software, C.W. and Y.S.; supervision, H.L. and B.W.; validation, Y.S. and W.W.;
visualization, C.W.; writing—original draft, C.W.; writing—review and editing, B.W. All authors have
read and agreed to the published version of the manuscript.
Funding: This research is funded by the National Key Research and Development Program of China
(No. 2021YFB2012400).
Data Availability Statement: In this study, we use the intrusion detection dataset reported in [35].
Readers can refer to the corresponding paper for detailed information.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Ahmad, Z.; Shahid Khan, A.; Wai Shiang, C.; Abdullah, J.; Ahmad, F. Network intrusion detection system: A systematic study of
machine learning and deep learning approaches. Trans. Emerg. Telecommun. Technol. 2021, 32, 1–29. [CrossRef]
2. Anderson, J.P. Computer Security Threat Monitoring and Surveillance; Technical Report; James P. Anderson Company: Philadelphia,
PA, USA, 1980.
3. Vanin, P.; Newe, T.; Dhirani, L.L.; O’Connell, E.; O’Shea, D.; Lee, B.; Rao, M. A Study of Network Intrusion Detection Systems
Using Artificial Intelligence/Machine Learning. Appl. Sci. 2022, 12, 11752. [CrossRef]
4. Liu, H.; Lang, B. Machine learning and deep learning methods for intrusion detection systems: A survey. Appl. Sci. 2019, 9, 4396.
[CrossRef]
5. Adnan, A.; Muhammed, A.; Abd Ghani, A.A.; Abdullah, A.; Hakim, F. An Intrusion Detection System for the Internet of Things
Based on Machine Learning: Review and Challenges. Symmetry 2021, 13, 1011. [CrossRef]
6. Aldallal, A.; Alisa, F. Effective Intrusion Detection System to Secure Data in Cloud Using Machine Learning. Symmetry 2021, 13,
2306. [CrossRef]
7. Aldallal, A. Toward Efficient Intrusion Detection System Using Hybrid Deep Learning Approach. Symmetry 2022, 14, 1916.
[CrossRef]
8. Ingre, B.; Yadav, A.; Soni, A.K. Decision Tree Based Intrusion Detection System for NSL-KDD Dataset. In Proceedings of the
Information and Communication Technology for Intelligent Systems (ICTIS 2017); Satapathy, S.C., Joshi, A., Eds.; Springer International
Publishing: Cham, Switzerland, 2018; Volume 2, pp. 207–218.
9. Balyan, A.K.; Ahuja, S.; Lilhore, U.K.; Sharma, S.K.; Manoharan, P.; Algarni, A.D.; Elmannai, H.; Raahemifar, K. A Hybrid
Intrusion Detection Model Using EGA-PSO and Improved Random Forest Method. Sensors 2022, 22, 5986. [CrossRef]
10. Yang, Z.; Wang, B. A Feature Extraction Method for P2P Botnet Detection Using Graphic Symmetry Concept. Symmetry 2019, 11,
326. [CrossRef]
11. Vinayakumar, R.; Alazab, M.; Soman, K.P.; Poornachandran, P.; Al-Nemrat, A.; Venkatraman, S. Deep Learning Approach for
Intelligent Intrusion Detection System. IEEE Access 2019, 7, 41525–41550. [CrossRef]
12. Li, Z.; Qin, Z.; Huang, K.; Yang, X.; Ye, S. Intrusion Detection Using Convolutional Neural Networks for Representation Learning.
In Proceedings of the Neural Information Processing; Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, E.S.M., Eds.; Springer International
Publishing: Cham, Switzerland, 2017; pp. 858–866.
13. Rudd, E.M.; Rozsa, A.; Günther, M.; Boult, T.E. A Survey of Stealth Malware Attacks, Mitigation Measures, and Steps Toward
Autonomous Open World Solutions. IEEE Commun. Surv. Tutor. 2017, 19, 1145–1172. [CrossRef]
14. Song, Y.; Hyun, S.; Cheong, Y.G. Analysis of autoencoders for network intrusion detection. Sensors 2021, 21, 4294. [CrossRef]
[PubMed]
15. Magán-Carrión, R.; Urda, D.; Díaz-Cano, I.; Dorronsoro, B. Towards a reliable comparison and evaluation of network intrusion
detection systems based on machine learning approaches. Appl. Sci. 2020, 10, 1775. [CrossRef]
16. Maseer, Z.K.; Yusof, R.; Bahaman, N.; Mostafa, S.A.; Foozy, C.F.M. Benchmarking of Machine Learning for Anomaly Based
Intrusion Detection Systems in the CICIDS2017 Dataset. IEEE Access 2021, 9, 22351–22370. [CrossRef]
17. Resende, P.A.A.; Drummond, A.C. A survey of random forest based methods for intrusion detection systems. ACM Comput.
Surv. 2018, 51, 1–36. [CrossRef]
18. Di Mauro, M.; Galatro, G.; Fortino, G.; Liotta, A. Supervised feature selection techniques in network intrusion detection: A critical
review. Eng. Appl. Artif. Intell. 2021, 101, 104216. [CrossRef]
19. Abdulhammed, R.; Musafer, H.; Alessa, A.; Faezipour, M.; Abuzneid, A. Features dimensionality reduction approaches for
machine learning based network intrusion detection. Electronics 2019, 8, 322. [CrossRef]
20. Seo, J.H.; Kim, Y.H. Machine-Learning Approach to Optimize SMOTE Ratio in Class Imbalance Dataset for Intrusion Detection.
Comput. Intell. Neurosci. 2018, 2018, 9704672. [CrossRef]
21. Zhou, Y.; Cheng, G.; Jiang, S.; Dai, M. Building an efficient intrusion detection system based on feature selection and ensemble
classifier. Comput. Netw. 2020, 174, 107247. [CrossRef]
22. Chang, Y.; Li, W.; Yang, Z. Network intrusion detection based on random forest and support vector machine. In Proceedings of
the 2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on
Embedded and Ubiquitous Computing (EUC), Guangzhou, China, 21–24 July 2017; Volume 1, pp. 635–638. [CrossRef]
23. Narayana Rao, K.; Venkata Rao, K.; P.V.G.D., P.R. A hybrid Intrusion Detection System based on Sparse autoencoder and Deep
Neural Network. Comput. Commun. 2021, 180, 77–88. [CrossRef]
24. Cao, V.L.; Nicolau, M.; McDermott, J. Learning Neural Representations for Network Anomaly Detection. IEEE Trans. Cybern.
2019, 49, 3074–3087. [CrossRef]
25. Schölkopf, B.; Platt, J.C.; Shawe-Taylor, J.; Smola, A.J.; Williamson, R.C. Estimating the support of a high-dimensional distribution.
Neural Comput. 2001, 13, 1443–1471. [CrossRef]
26. Mahfouz, A.M.; Abuhussein, A.; Venugopal, D.; Shiva, S.G. Network Intrusion Detection Model Using One-Class Support Vector
Machine. In Proceedings of the Advances in Machine Learning and Computational Intelligence; Patnaik, S., Yang, X.S., Sethi, I.K., Eds.;
Springer: Singapore, 2021; pp. 79–86.
27. Javed, M.A.; Khan, M.Z.; Zafar, U.; Siddiqui, M.F.; Badar, R.; Lee, B.M.; Ahmad, F. ODPV: An Efficient Protocol to Mitigate Data
Integrity Attacks in Intelligent Transport Systems. IEEE Access 2020, 8, 114733–114740. [CrossRef]
28. Al-Qatf, M.; Lasheng, Y.; Al-Habib, M.; Al-Sabahi, K. Deep Learning Approach Combining Sparse Autoencoder with SVM for
Network Intrusion Detection. IEEE Access 2018, 6, 52843–52856. [CrossRef]
29. Kunang, Y.N.; Nurmaini, S.; Stiawan, D.; Zarkasi, A.; Jasmir, F. Automatic Features Extraction Using Autoencoder in Intrusion
Detection System. In Proceedings of the 2018 International Conference on Electrical Engineering and Computer Science (ICECOS),
Pangkal, Indonesia, 2–4 October 2018; Volume 17, pp. 219–224. [CrossRef]
30. Lin, T.H.; Jiang, J.R. Credit card fraud detection with autoencoder and probabilistic random forest. Mathematics 2021, 9, 2683.
[CrossRef]
31. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [CrossRef]
32. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.;
et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
33. Lecun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [CrossRef] [PubMed]
34. Géron, A. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2022.
35. Sarhan, M.; Layeghy, S.; Portmann, M. Towards a Standard Feature Set for Network Intrusion Detection System Datasets. Mob.
Netw. Appl. 2021, 27, 357–370. [CrossRef]
36. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic
characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy, Funchal,
Portugal, 22–24 January 2018; pp. 108–116. [CrossRef]
37. Koroniotis, N.; Moustafa, N.; Sitnikova, E.; Turnbull, B. Towards the development of realistic botnet dataset in the Internet of
Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst. 2019, 100, 779–796. [CrossRef]
38. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch:
An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32; Curran
Associates, Inc.: Red Hook, NY, USA, 2019; pp. 8024–8035.
39. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In
Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 11–18 December 2015; pp. 1026–1034.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.