NETWORK SECURITY THROUGH INTRUSION DETECTION SYSTEMS USING
ARTIFICIAL INTELLIGENCE
ABSTRACT
Intrusion detection systems (IDSs) have traditionally been built by hand. Such systems have difficulty classifying intruders correctly and carry significant computational overhead, which makes it difficult to build a robust real-time IDS. Artificial intelligence (AI) techniques can reduce the human effort required to build these systems and can improve their performance. Learning and induction are used to improve the performance of search problems, while clustering has been used for data analysis and reduction. AI has recently been used in intrusion detection (ID) for anomaly detection, data reduction, and the induction, or discovery, of rules that explain audit data. We survey uses of AI methods in ID and show how feature selection can improve the classification of network connections. The network connection classification problem is related to ID because intruders can create “private” communication services that are undetectable by normal means. We also explore some areas where AI techniques may further improve IDSs.
Introduction to network security:
A network is two or more machines interconnected for communications. When business is conducted, sensitive data is stored and transferred, and sensitive communications occur. Some opportunistic people might attempt to disrupt that business, steal or destroy the data, or exploit the communications.
The goals of security are confidentiality, integrity and availability.
Who is attacking? An attacker can be a hacker, a cracker or a novice.
The security mechanisms of a system are designed to prevent unauthorized access to system resources and data. Completely preventing breaches of security appears unrealistic at present. We can, however, try to detect intrusion attempts so that action may be taken to repair the damage later. This field of research is called intrusion detection.
Introduction to intrusion detection
Intrusion detection is the process of monitoring and evaluating computer events and network traffic for signs of intrusions. An intrusion detection system (IDS) is a hardware device with software that is used to detect unauthorized activity on your network. Such systems are the burglar alarms of computer security. An IDS implementation can log unauthorized activity on your network and alert you to it. IDS software can be implemented on individual hosts, on servers, at the network perimeter or throughout the entire network.
The aim is to defend a system by combining an alarm that sounds whenever the site’s security has been compromised with an entity, most often a site security officer (SSO), that can respond to the alarm and take the appropriate action.
Need for IDS (INTRUSION DETECTION SYSTEMS):
1. In practice, it is not possible to build a completely secure system.
2. The vast installed base of systems worldwide guarantees that any transition to a secure system will be long in coming.
3. Cryptographic methods have their own problems. Passwords can be cracked or lost, and entire crypto-systems can be broken.
4. Even a truly secure system is vulnerable to insiders who abuse their privileges.
5. It has been observed that the relationship between the level of access control and user efficiency is an inverse one: the stricter the mechanisms, the lower the efficiency becomes.
An IDS does not usually take preventive measures when an attack is detected; it is a reactive rather than pro-active agent.
The most popular way to detect intrusions has been to use the audit data generated by the operating system. An audit trail is a record of activities on a system that are logged to a file in chronological order. Audit trails are particularly useful because they can be used to establish the guilt of attackers, and they are often the only way to detect unauthorized but subversive user activity. Analysing audit trails automatically is a good substitute for manual analysis.
Intruders can be classified as internal or external. External intruders are unauthorized users of the machines they attack. Internal intruders include masqueraders who pose as another user, users with legitimate access to sensitive data, and clandestine intruders who have the power to turn off audit control for themselves.
Problems in intrusion detection:
Issues in intrusion detection include data collection, data reduction, behaviour classification, reporting and response. Data reduction reduces processing time, communications overhead and storage requirements. Classification is the process of identifying attackers and intruders. Artificial intelligence techniques have been used in many Intrusion Detection Systems to perform these important tasks in a more efficient manner.
Classification of IDS
Techniques of intrusion detection are of two main types: anomaly detection and misuse detection.
Anomaly detection:
These techniques assume that all intrusive activities are necessarily anomalous. There are two problems here.
1) Anomalous activities that are not intrusive are flagged as intrusive, resulting in false positives.
2) Intrusive activities that are not anomalous result in false negatives.
So, the main issues in an anomaly detection system become the selection of threshold levels, so that neither of the two problems is unreasonably magnified, and the selection of features to monitor. These systems are also computationally expensive because of the overhead of keeping track of and updating several system profile metrics.
Misuse detection:
The concept here is that attacks can be represented in the form of a pattern or signature, so that even variations of the same attack can be detected. Such systems can detect many or all known attack patterns but are of little use against as yet unknown attack methods.
Anomaly detection systems
Statistical approaches:
Behaviour profiles for subjects are generated. As the system runs, the anomaly detector constantly computes the variance of the present profile from the original one. The main advantage of statistical systems is that they adaptively learn the behaviour of users; they are thus potentially more sensitive than human experts. However, there are a few problems. They can be gradually trained by intruders so that eventually intrusive events are considered normal. False positives and false negatives are generated depending on whether the threshold is set too low or too high. Relationships between events are missed because statistical measures are insensitive to the order of events.
An open issue here is the selection of measures to monitor. Both static and dynamic determination of the set of measures should be done. Some problems associated with this technique have been remedied by other methods, including ‘Predictive Pattern Generation’, which takes past events into account while analysing the data.
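The idea of comparing present behaviour with a stored profile can be illustrated with a minimal sketch. It assumes a single hypothetical measure (logins per hour), a profile made of a mean and a standard deviation, and an arbitrary threshold of three standard deviations; none of these choices comes from a particular IDS.

```python
from statistics import mean, stdev

def build_profile(history):
    """Summarise past values of one measure as (mean, standard deviation)."""
    return mean(history), stdev(history)

def deviation(profile, observed):
    """Distance of an observation from the profile mean, in standard deviations."""
    mu, sigma = profile
    if sigma == 0:
        return 0.0 if observed == mu else float("inf")
    return abs(observed - mu) / sigma

def is_anomalous(profile, observed, threshold=3.0):
    """Flag the observation when it deviates more than `threshold` (assumed value)."""
    return deviation(profile, observed) > threshold

# Hypothetical measure: logins per hour recorded for one user.
logins_per_hour = [4, 5, 6, 5, 4, 6, 5, 5]
profile = build_profile(logins_per_hour)

print(is_anomalous(profile, 5))   # typical value
print(is_anomalous(profile, 40))  # far outside the profile
```

Lowering the threshold flags more normal activity (false positives), while raising it lets more intrusive activity pass (false negatives), which is the trade-off described above.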
Predictive pattern generation:
This method tries to predict future events based on the events that have already occurred. Therefore, there can be a rule
E1 - E2 --> (E3 = 80%, E4 = 15%, E5 = 5%)
This would mean that, given that events E1 and E2 have occurred with E2 occurring after E1, there is an 80% probability that event E3 will follow, a 15% probability that event E4 will follow and a 5% probability that event E5 will follow. The problem with this is that intrusion scenarios that are not described by the rules will not be flagged as intrusive. Thus, if an event sequence A - B - C exists that is intrusive but not listed in the rule base, it will be classified as unrecognised. This problem can be partially solved by flagging any unknown events as intrusions (increasing the probability of false positives), or by flagging them as non-intrusive (increasing the probability of false negatives). In the normal case, however, an event is flagged as intrusive if the left hand side of a rule is matched but the right hand side deviates strongly, in statistical terms, from the prediction.
There are several advantages here. Rule based sequential patterns can detect anomalous activities that were difficult to detect with traditional methods. Systems built using this model are highly adaptive to changes. It is easier to detect users who try to train the system during its learning period. Anomalous activities can be detected and reported within seconds of receiving audit events.
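The rule discussed above can be sketched as a small lookup table. This is an illustration only: the rule base holds just the one example rule, and the 10% cut-off below which a follow-up event counts as deviant is an assumed value.

```python
# Hypothetical rule base: (preceding events) -> probability of each next event.
RULES = {
    ("E1", "E2"): {"E3": 0.80, "E4": 0.15, "E5": 0.05},
}

def classify(prev_events, next_event, rules=RULES, min_probability=0.10):
    """Return 'normal', 'intrusive' or 'unrecognised' for one observed event."""
    predicted = rules.get(tuple(prev_events))
    if predicted is None:
        # The left hand side matches no rule, so the sequence is unknown.
        return "unrecognised"
    if predicted.get(next_event, 0.0) < min_probability:
        # The left hand side matched, but the outcome is very unlikely.
        return "intrusive"
    return "normal"

print(classify(["E1", "E2"], "E3"))
print(classify(["E1", "E2"], "E5"))
print(classify(["A", "B"], "C"))
```

Whether an "unrecognised" result is then treated as an intrusion or ignored is the policy choice described above, trading false positives against false negatives.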
Neural networks:
The idea used here is to train the neural network to predict a user’s next action or command given a window of the n previous actions or commands. The network is trained on a set of representative user commands. Some advantages of neural networks are that they cope well with noisy data, that their success does not depend on any statistical assumption about the nature of the underlying data, and that they are easier to modify for new user communities. There are also some problems. A small window results in false positives, while a large window results in irrelevant data as well as an increase in false negatives. The net topology is only determined after considerable trial and error. The intruder can train the net during its learning phase.
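The role of the window size n can be made concrete with a small sketch of how training examples might be prepared. It covers only the data preparation step, not the network itself; the function name and the sample session are hypothetical.

```python
def make_training_pairs(commands, n):
    """Pair every window of n consecutive commands with the command that followed."""
    return [
        (tuple(commands[i:i + n]), commands[i + n])
        for i in range(len(commands) - n)
    ]

# Hypothetical command history for one user session.
session = ["ls", "cd", "ls", "vi", "make"]

for window, target in make_training_pairs(session, 2):
    print(window, "->", target)
```

A larger n yields fewer, more specific examples from the same history, which is one way to see why the choice of window size affects the error rates mentioned above.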
Misuse detection systems:
Expert systems are modelled in such a way as to separate the rule matching phase from the action phase. The matching is done according to audit trail events. The Next-Generation Intrusion Detection Expert System (NIDES) follows a hybrid intrusion detection technique consisting of a misuse detection component and an anomaly detection component, the latter based on a statistical approach. The misuse detection component encodes known intrusion scenarios and attack patterns. One advantage is that the system has a statistical component as well as an expert system component, so the chance that one catches intrusions missed by the other increases. Another advantage is that the problem’s control reasoning is clearly separated from the formulation of the solution.
The main drawback is that an expert system has to be formulated by a security professional, so it is only as strong as the security personnel who program it. There is therefore a chance that expert systems fail to flag intrusions.
The NIDES system runs on a machine different from the machine(s) to be monitored, which could be unreasonable overhead. Additions to and deletions from the rule base must take into account the interdependencies between its rules. There is also no recognition of the sequential ordering of data, because the various conditions that make up a rule are not recognised to be ordered.
Keystroke monitoring:
This technique monitors keystrokes for attack patterns. It has several defects. Shells such as bash, ksh and tcsh support user-definable aliases, which defeat the technique unless alias expansion and semantic analysis of the commands are performed. The method analyses only keystrokes, not the running of a program, so malicious programs cannot be flagged for intrusive activities. Operating systems do not offer much support for keystroke capturing, so the keystroke monitor needs a hook that analyses keystrokes before sending them to the intended receiver. System calls made by application programs should also be monitored so that an analysis of a program’s execution is possible.
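The alias problem can be illustrated with a short sketch. It assumes a hypothetical alias table and watch list, and it expands only the first word of a command line, which is a simplification of what real shells do.

```python
def expand_aliases(command_line, aliases):
    """Expand the first word through the alias table until no new alias applies."""
    seen = set()
    while True:
        words = command_line.split()
        if not words or words[0] not in aliases or words[0] in seen:
            return command_line
        seen.add(words[0])  # never expand the same alias twice
        command_line = " ".join([aliases[words[0]]] + words[1:])

def is_suspicious(command_line, aliases, watch_list):
    """Match the expanded command, not the raw keystrokes, against the watch list."""
    expanded = expand_aliases(command_line, aliases)
    return any(expanded.startswith(pattern) for pattern in watch_list)

# Hypothetical alias defined by the user and hypothetical watched pattern.
aliases = {"tidy": "rm -rf"}
watch_list = ["rm -rf /"]

print(is_suspicious("tidy /", {}, watch_list))       # raw keystrokes look harmless
print(is_suspicious("tidy /", aliases, watch_list))  # expansion reveals the command
```

Without the alias table the monitor sees only the harmless-looking keystrokes, which is why alias expansion is needed before any matching.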
Model based intrusion detection:
This approach states that certain scenarios can be inferred from certain other observable activities. If these activities are monitored, it is possible to find intrusion attempts by looking at activities that imply a certain intrusion scenario.
The model based scheme consists of three important modules. The anticipator uses the active models and the scenario models (a knowledge base of intrusion scenario specifications) to try to predict the next step in the scenario that is expected to occur. The planner then translates this hypothesis into a format that shows the behaviour as it would occur in the audit trail, and uses the predicted information to plan what to search for next. The interpreter then searches for this data in the audit trail. The system proceeds this way, accumulating more and more evidence for an intrusion attempt until a threshold is crossed, at which point it signals an intrusion attempt.
The advantages are as follows. Large amounts of noise present in audit data can be filtered. The system can predict the attacker’s next move based on the intrusion model. These predictions can be used to verify an intrusion hypothesis, to take preventive measures, or to determine what data to look for next.
Some critical issues remain: patterns for intrusion scenarios must be easily recognised, they must always occur in the behaviour being looked for, and they must be distinguishing, that is, not associated with any other normal behaviour.
State transition analysis:
In this technique, the monitored system is represented as a state transition diagram. As data is analysed, the system makes transitions from one state to another. A transition takes place when some Boolean condition becomes true. The approach is to model state transitions from safe to unsafe states based on known attack patterns.
The advantages are that it can detect co-operative attacks, it can detect attacks that span multiple user sessions, and it can foresee impending compromise situations based on the present system state and take pre-emptive measures.
The problems are that attack patterns can specify only a sequence of events rather than more complex forms, and that there are no general purpose methods to prune the search except through assertion primitives. The technique cannot detect denial of service attacks, failed logins, variations from normal usage or passive listening, because these items are either not recorded by the audit trail mechanism or cannot be represented by state transition diagrams. It should therefore be used with an anomaly detector so that the combination detects more intrusion attempts.
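A state transition model of this kind can be sketched as a table of Boolean conditions. The states, the event fields and the two-step attack signature below are hypothetical and serve only to show how audit events drive the system from a safe state to an unsafe one.

```python
# Hypothetical attack signature: state -> (Boolean condition on an event, next state).
TRANSITIONS = {
    "safe": (
        lambda e: e.get("action") == "copy_shell",
        "shell_copied",
    ),
    "shell_copied": (
        lambda e: e.get("action") == "chmod" and e.get("setuid", False),
        "compromised",
    ),
}

def run(events, transitions=TRANSITIONS, start="safe"):
    """Feed audit events through the diagram and return the final state."""
    state = start
    for event in events:
        if state in transitions:
            condition, next_state = transitions[state]
            if condition(event):
                state = next_state
    return state

# Hypothetical audit trail; unrelated events leave the state unchanged.
audit_trail = [
    {"action": "login"},
    {"action": "copy_shell"},
    {"action": "chmod", "setuid": True},
]
print(run(audit_trail))
```

Because the model follows only a fixed sequence of conditions, anything that cannot be written as such a sequence is missed, which matches the limitations listed above.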
Pattern matching:
This model encodes known intrusion signatures as patterns that are then matched against audit data. It makes transitions on certain events, called labels, and Boolean variables, called guards, can be placed at each transition.
Important advantages are declarative specification, portability and excellent real time capabilities. The model detects some attack signatures that the state transition model cannot, and multiple event streams can be used together, with patterns matched against each stream without the need to combine the streams.
The problems are that it can only detect attacks based on known vulnerabilities and is not useful for ill-defined patterns. It cannot detect wire-tapping intrusions, nor can it detect spoofing attacks in which a machine pretends to be another machine by using its IP address.
ARTIFICIAL INTELLIGENCE AND INTRUSION DETECTION:
Artificial intelligence is concerned with improving algorithms by employing problem solving techniques used by human beings.
Data reduction for intrusion detection:
Due to the massive amount of audit data available, classification by hand is impossible. Also, complex relationships exist between the features that are difficult for humans to discover. So, the amount of data to be processed should be reduced. Data that is not considered useful can be filtered. Data can be grouped or clustered to reveal hidden patterns; by storing the characteristics of the clusters instead of data, overhead can be reduced. Feature selection can also be used to eliminate some data sources.
Data filtering:
The purpose is to reduce the amount of data directly handled by Intrusion Detection Systems. This decreases storage requirements and reduces processing time. Data filtering is done using heuristic or ad hoc methods, which can be viewed as expert rules for filtering.
Feature selection:
Some data hinders the classification process. Features may contain false correlations that hinder the detection of intrusions, and some features may be redundant because their information is contained in others. Feature selection improves classification by searching for the subset of features that best classifies the training data. It can be used to find the features most indicative of misuse, or to distinguish between types of misuse.
Data clustering:
Clustering is used to find hidden patterns in data and significant features for use in detection. It can also be used as a reduction technique by storing the characteristics of the clusters instead of the data itself. Because clustering is closely related to learning, it is a natural application for AI techniques.
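Storing cluster characteristics instead of raw data can be illustrated with a minimal single-pass sketch. The algorithm (a simple leader-style clustering), the radius and the two-dimensional sample points are assumptions chosen for brevity, not a method taken from a specific IDS.

```python
import math

def summarise(points, radius):
    """Single-pass clustering that keeps only a centroid and a count per cluster."""
    clusters = []  # each entry: {"centroid": [...], "count": int}
    for point in points:
        # Use the first existing cluster whose centroid lies within `radius`.
        match = next(
            (c for c in clusters if math.dist(point, c["centroid"]) <= radius),
            None,
        )
        if match is None:
            clusters.append({"centroid": list(point), "count": 1})
        else:
            n = match["count"]
            # Update the running mean so the raw point need not be stored.
            match["centroid"] = [
                (c * n + x) / (n + 1) for c, x in zip(match["centroid"], point)
            ]
            match["count"] = n + 1
    return clusters

# Hypothetical audit records reduced to two numeric features each.
records = [(1.0, 1.0), (1.2, 0.9), (10.0, 10.0), (10.1, 9.8), (0.9, 1.1)]
for cluster in summarise(records, radius=1.0):
    print(cluster)
```

Here five records are reduced to two summaries, which is the kind of saving in storage and processing that the text describes.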
Behaviour classification in intrusion detection:
Classification suffers from false positives and false negatives. AI techniques can be used to reduce both.
Expert systems:
In an expert system, a set of rules encoding the knowledge of an expert is used to draw conclusions about the data gathered by the IDS. The expert system determines the most appropriate rule to apply. It can be combined with a neural network that reports anomalies to the expert system, which also employs data not used by the network.
Using feature selection in network based intrusion detection:
Computer systems are increasingly network dependent, so it is imperative to protect both local and regional networks. An intruder can hide network connections by strategically placing the servers that receive the connections on different ports. Because the mapping of ports to services is internal to a single machine, an intruder could also change the port map. The type of a connection must therefore be identified without referring to port numbers.
Feature selection algorithms can improve the classification of network connections by minimizing the classification error rate and by reducing the number of features required to classify connections.
Search algorithms
1. Backward sequential search begins with the full set of features. At each stage of the search, each feature in the remaining set is removed in turn. The best feature to eliminate is determined by comparing the error rates of the classifiers created using the resulting feature sets.
2. Beam search is a type of best-first search that uses a bounded queue, with the best state placed at the front of the queue. The algorithm takes the first state in the queue and extends the search as in backward sequential search. Each new state visited is placed in the queue in order of its goodness.
3. Random generation plus sequential selection performs several sequential selections from different places in the search space. To do so, we generate a random feature set and then perform backward and forward sequential selection from that state. This is the best search algorithm among the three.
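The first of the algorithms above, backward sequential search, can be sketched as follows. The feature names and the toy error function are hypothetical stand-ins for training and evaluating a real classifier on each candidate feature set; error rates are expressed as whole percentages to keep the arithmetic exact.

```python
def backward_sequential_search(features, error_rate, min_features=1):
    """Repeatedly drop the feature whose removal gives the lowest error rate."""
    current = frozenset(features)
    best, best_error = current, error_rate(current)
    while len(current) > min_features:
        # Try removing each remaining feature in turn.
        candidates = [current - {f} for f in sorted(current)]
        current = min(candidates, key=error_rate)
        error = error_rate(current)
        if error <= best_error:
            best, best_error = current, error
    return best, best_error

# Hypothetical connection features: two informative, two that only add noise.
USEFUL = {"duration", "bytes_sent"}

def toy_error_rate(feature_set):
    """Stand-in for classifier error (percent): useful features help, others hurt."""
    useful = len(feature_set & USEFUL)
    noisy = len(feature_set - USEFUL)
    return 50 - 20 * useful + 5 * noisy

all_features = {"duration", "bytes_sent", "noise_a", "noise_b"}
selected, error = backward_sequential_search(all_features, toy_error_rate)
print(sorted(selected), error)
```

The search keeps the best feature set seen so far, so continuing past the optimum does no harm; in practice each call to the error function would involve training a classifier, which is where most of the cost lies.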
Future uses of AI in intrusion detection:
Many IDSs already employ AI methods to improve their performance. Areas where AI techniques may bring further improvement include:
• Feature selection in intrusion detection systems
• Reconfiguration and customization of IDSs
• Clustering in intrusion detection
IDSs make extensive use of AI techniques to improve their ability to detect attacks on computer systems.
CONCLUSION:
Intrusion detection is still a fledgling field of research. However, it is beginning to assume enormous importance in today’s computing environment. The combination of facts such as the unbridled growth of the internet, the vast financial possibilities opening up in electronic trade, and the lack of truly secure systems makes it an important and pertinent field of research. Future research trends seem to be converging towards a model that is a hybrid of the anomaly and misuse detection models; it is slowly being acknowledged that neither model can detect all intrusion attempts on its own.
We have argued for the need for IDSs and discussed their classification. The major classification is into anomaly detection and misuse detection. We have gone briefly into the different techniques used in anomaly detection systems, among them statistical approaches, predictive pattern generation and neural networks. We have also discussed misuse detection techniques such as keystroke monitoring, model based intrusion detection, state transition analysis and pattern matching.
We have provided a brief survey of AI methods used in a variety of IDSs. We dealt with the need for data reduction in intrusion detection and with the methods of data filtering, feature selection and data clustering. We described behaviour classification in intrusion detection using expert systems and rule based induction. We have also shown how one technique, feature selection, can be used to reduce overhead and improve the classification of network connections.