Abstract:
With growing dependence on the Internet for transactions and communication, ensuring
security has become a necessity. An Intrusion Detection System (IDS) is an important
software or hardware component of a security architecture, used to ensure safe communication
between organizations. Its ability to monitor complete packets (promiscuously) makes it a
good complement to other security tools such as firewalls and anti-virus software.
Data stream mining is an active research area that has recently emerged to discover knowledge
from large amounts of continuously generated data. In this dissertation, we focus on data
streams and data mining techniques in order to detect attacks on the fly, learn new attacks
for better prediction accuracy, and generate alarms for them.
The dissertation proposes a hybrid approach to efficiently detect network intrusions in
real time with high accuracy. To improve accuracy, both intrusion detection approaches,
viz. anomaly detection and misuse detection, are used in sequence. To improve the detection
rate, we use a decision tree to create signatures for the attack data and a LERAD
ensemble for anomaly detection in the streaming data. LERAD is a rule-based algorithm for
intrusion detection that falls under the anomaly detection approach. We have implemented
the algorithm using Python and its libraries and evaluated it on the NSL-KDD dataset, a
benchmark in the intrusion detection field, by simulating the dataset as streaming data.
The major challenges of streaming data, viz. infinite length, concept drift and concept
evolution, have been addressed. The report shows incremental improvements in accuracy
and compares the proposed framework with the techniques used.
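
To illustrate the idea of running the two detection approaches in sequence over a stream, a
minimal sketch follows. It is not the implementation evaluated in this dissertation: the
class names SignatureStage and NoveltyStage, the function detect, the sample records and the
single hand-written signature are all hypothetical. The signature matcher stands in for
signatures derived from a decision tree, and the unseen-value check is a much simplified
stand-in for a LERAD ensemble. The ordering of the stages (misuse first, then anomaly) is
also an assumption made for this sketch.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set, Tuple

Record = Dict[str, str]


class SignatureStage:
    """Misuse detection stand-in: match records against known attack signatures.

    Each signature is (label, conditions), where conditions maps attribute
    names to required values. In a real system these could be paths extracted
    from a trained decision tree; here they are written by hand (assumption).
    """

    def __init__(self, signatures: List[Tuple[str, Record]]) -> None:
        self.signatures = signatures

    def classify(self, record: Record) -> Optional[str]:
        for label, conditions in self.signatures:
            if all(record.get(k) == v for k, v in conditions.items()):
                return label
        return None


class NoveltyStage:
    """Anomaly detection stand-in: flag attribute values never seen in normal data.

    This is NOT LERAD; it only mimics the idea of learning allowed values
    from attack-free traffic and raising an alarm on violations.
    """

    def __init__(self) -> None:
        self.allowed: Dict[str, Set[str]] = {}

    def learn(self, record: Record) -> None:
        for key, value in record.items():
            self.allowed.setdefault(key, set()).add(value)

    def is_anomalous(self, record: Record) -> bool:
        return any(v not in self.allowed.get(k, set()) for k, v in record.items())


def detect(stream: Iterable[Record], misuse: SignatureStage,
           anomaly: NoveltyStage) -> Iterator[Tuple[Record, str]]:
    """Process records one at a time: known attacks first, then novelty check."""
    for record in stream:
        label = misuse.classify(record)
        if label is not None:
            yield record, label          # known attack, matched a signature
        elif anomaly.is_anomalous(record):
            yield record, "anomaly"      # unknown behaviour, raise an alarm
        else:
            yield record, "normal"


# Illustrative records using NSL-KDD style attribute names (values are made up).
normal_traffic = [
    {"protocol_type": "tcp", "service": "http", "flag": "SF"},
    {"protocol_type": "udp", "service": "domain_u", "flag": "SF"},
]
misuse_stage = SignatureStage([("neptune", {"protocol_type": "tcp", "flag": "S0"})])
anomaly_stage = NoveltyStage()
for rec in normal_traffic:
    anomaly_stage.learn(rec)

stream = [
    {"protocol_type": "tcp", "service": "http", "flag": "SF"},
    {"protocol_type": "tcp", "service": "http", "flag": "S0"},
    {"protocol_type": "icmp", "service": "eco_i", "flag": "SF"},
]
labels = [label for _, label in detect(stream, misuse_stage, anomaly_stage)]
print(labels)
```

Because detect is a generator, records are handled one at a time and the stream never has to
be held in memory, which is the property that matters for unbounded streams. Handling concept
drift and concept evolution would additionally require updating both stages as new data
arrives, which this sketch does not attempt.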