A blog previous known as "Intrusion Detection on the Road": March 2007

Friday, March 30, 2007

Mohit's security blog: IPS algorithms...

Mohit's security blog: IPS algorithms...

See what I wrote in previous blog entry.

Background reading: Polygraph - Automatically Generating Signatures for Polymorphic Worms

The next paper from RAID 2006 I will comment is about manipulating Polygraph. Thus it seemed natural that I looked at the original publication Polygraph: Automatic Signature Generation for Polymorphic Worms (2005).

Polygraph is a program that automatically generates signatures for Polymorphic worms; that are worms that change (obfuscate) their appearance from time to time between attacks. Existing worm blocking solutions (before 2005) assumes that worms have the same content from time to time. Thus it is easy to automatically generate signatures (simple single strings of bytes) that filter out worms. However, this assumption does not apply for polymorphic worms.

Since however, the polymorphic worms are targeting specific vulnerabilities some of the payload must be same between all worms, so Polygraph collects suspicious and innocuous payloads, classified using a simple flow classifier, and then extract content signatures from them. Instead of just extracting one single string of bytes, as in previous algorithms, Polygraph extracts sets of byte sequences.

The extracted byte sequences are used in three different ways for detecting worms :

All byte sequences must be present in payload to indicate an worm
All byte sequences must be present in correct order to indicate an worm
All byte sequences are weighed together using a Naïve Bayes Classifier:

A byte sequence has probability being in a worm or not: P(seq | worm) and P(seq | ~worm)
A score is computed for a payload being a worm were {seq} means all sequences in a payload: score = P({seq} | worm) / P({seq} | ~worm)
Then the score is compared to a threshold and if true, the payload is believed to be an worm: score > tau

To handle the case that there are more than one type of polymorphic worms among the suspicious payloads, Polygraph uses hierarchical clustering of the byte sequences of the payloads. The sequences are merged into clusters by minimizing the false positives tested of the innocuous payloads.

Comment: First of all I think this an interesting paper, since I have a background in machine learning and Bayesian learning. However, the learning algorithms could probably be improved, for instance, by applying a more fully Bayesian approach than the used Naïve Bayes Classifier.

In addition I found an interesting comment at Mohit's security blog: IPS algorithms... that is as follows:

Most signatures in good products are vulnerability based so even if you change the attack it still gets stopped.

Thus, Polygraph might not be needed! Or what should we believe?

Wednesday, March 14, 2007

IPS without signatures or log analysis

ForeScout is a company that claims to have an Intelligent IPS that uses

an entirely unique approach to preventing network attacks from "zero-day" threats such as self-propagating malware and hackers/espionage without using signatures, anomaly detection or any form of pattern matching technology. ForeScout's solution has proven its accuracy by detecting in real-time every self-propagating threat to date and has gained the trust of 100% of our customers who use the appliances in automatic blocking mode.

In summary: Malwares are detected when probing the network for vulnerabilities. Any request to a non-existing IP address is assumed to be a certain indication of a malware, thus it should be stopped. The IPS answers each malware request with some marked information, and when the malware sends a new request with the marked information, it can be stopped before it can make an real intrusion attemp.

Comment: This seems to be a neat solution. Though, if it is true: why is research in this area still needed?

Tuesday, March 13, 2007

"Big Business"

Yesterday the Swedish newspaper Svenska Dagbladet had a set of articles about intrusion into Swedish companies. It really seems to be "Big Business". These articles I hope triggers more Swedish research in intrusion detection.

Sunday, March 11, 2007

Tao Security

A good blog for learning more on intrusion detection and things around it is at blogspot fellow Richard Bejtlich's blog Tao Security. Richard's posts are full of interesting remarks about the current standard of network security and intrusion detection. I wounder if it is possible to automate some stuff of what he calls Network Security Monitoring (NSM) and thus filtering out more irrelevant alarms?

Friday, March 9, 2007

Paper 4: Allergy Attack Against Automatic Signature Generation

This paper practically shows how to do what Can machine learning be secure? describes. In the paper, they show how to attack systems that uses Automatic Signature Generation (ASG). A typical ASG first detects an intrusion or attack, thereafter automatically generates a signature from the attack data and then filter out all future traffic matching the signature.

By using the fact that many ASG system does not use the same method to detect the attack and then create the signature they are able to fool the system into creating signatures for non-malicious traffic. Also, by not using the full context of an attack, such as the steps leading to the attack, ATG systems are easier fooled.

An ATG system seems to be a kind of unsupervised learning system, using anomaly detection to detect suspicious traffic. Then a signature is created from the traffic based on comparison between many suspicious traffic instances. The signature is often computed from the longest common byte sequence.

IDS and child pornography

According to a Swedish newspaper has the volume of child pornography seized by the police at single crimes increased from averaging from 10.000 - 20.000 pictures two years ago till being up to millions of pictures and movies. The cheer volume blocks the police from investigating the crimes (summary in Swedish below).

Barnporrfall blir liggande

- De stora volymerna blockerar våra resurser. Ett stort beslag för två år sedan kunde bestå av 10.000-20.000 bilder. Det tyckte vi var mycket då. I dag kan det finnas enskilda beslag där den misstänkta har lagrat flera miljoner filmer och bilder, säger Stefan Kronqvist, chef för Riskriminalens IT-brottssektion.

Maybe intrusion detection/prevension technology could be used to stop kiddie porn from being sent through a network? Though pedophiles probably encode their communication using some form of cryptography or maybe they use darknets. However, according to the following story, it might be a reasonable approach: Recent child porn busts are one result of stepped-up Internet monitoring.

Thursday, March 8, 2007

OSSEC is gaining momentum

OSSEC HIDS (see software link) is a project I am keeping an eye on. It seems to be gaining in popularity according to this blog:

http://www.appliedwatch.com/blog/?p=6

Tuesday, March 6, 2007

Follow up on: Can Machine Learning Be Secure?

In the previous post I mentioned I did not understand the relative distance metric they used for analyzing the security simple learning problem. However, in the paper they refer to a Master thesis that explains the metric in more detail.

The keys to understanding are the following:

To move the decision boundary as much as possible, each data point equals the previous mean added with the radius: Xt-1 + R
This is done alpha times at each iteration: (Xt-1 + R) * at
The next mean will be Xt = [Xt-1 * (n + a1+ ... +at-1) + (Xt-1 + R) * at ]/(n + a1+ ... +at-1 +at) where n is the number of previous data points
If we simplify the expression we will get: Xt = Xt-1 + R * at /(n + a1+ ... +at)
Assuming that the attacker has complete control of the learning process then n=0 thus: Xt = Xt-1 + R * at /(a1+ ... +at)
Using Mt = (a1+ ... +at) as the effort of the attacker and by noticing that the recursion leads to: Xt = X0 + R * [1 + a2/M2+ ... +at /Mt]
Thus: (Xt - X0)/R = [1 + a2/M2+ ... +at /Mt]
Noticing that ai/Mi = (Mi - Mi-1)/Mi = 1 - Mi-1/Mi
Thus we have the relative displacement (Xt - X0)/R = 1 + a2/M2+ ... +at /Mt = 1 + 1 - M1/M2 + ... +1 - Mt-1/Mt = t - [M1/M2 + ... + Mt-1/Mt]
The relative distance (or displacement) is then for t = T as in the paper:

D({Mi}) = T − [M1/M2 + ... + MT-1/MT]

Monday, March 5, 2007

Background reading: Can Machine Learning Be Secure?

The next paper from the RAID 2006 proceedings cites a paper called Can Machine Learning Be Secure? as it's source of inspiration. This is a theoretical paper while the RAID paper complements it by being experimental. Thus it seemed reasonable to read it before reading the next RAID paper.

Can Machine Learning Be Secure? That seems to be a good question. This paper analyzes how secure a learning system can be.

A learning system adjusts it's model given new data, these are some of the questions asked:

Can it be trained by an attacker to allow malicious calls?
Can it be degenerated such that it becomes useless and must be shut down?
Are there any defenses against these attacks?

The paper tries to create a taxonomy of attacks on a learning system but I don't think it is that successful. The taxonomy has three axes:

Influence: the part of the learning system that is manipulated, causative (alter the training data) or exploratory (trying to discover information about the system)
Specificity: a continuous spectrum, from achieving a specific goal, for instance to manipulate the learning system to accept a specfic malicious call, to acheiving a broader goal, for instance to manipulate the learner to reveal the existence of any possible malicious call.
Security violation: what security goal is violated, integrity (false negative) or availability (many classification errors making the system useless).

Comment: I don't think the paper gives enough reasons for this taxonomy. It is not that clear to me that these axes and scales are completely orthagonal or at least describes the space of attacks in a good way. Although I cannot, at the movement, come up with something better, I think it should be possible to think this through again and come up with something better. Maybe it is the vocabulary that is problematic; maybe by using different words, the taxonomy will be more readable.

Then the paper lists defenses against the different attacks, such as adding prior distributions (robustness) that makes the system less sensitive to altered data, detecting attacks with intrusion detection mechanism that analyzes the training data, confusing the attacker using disinformation that hinders the attacker from learning decision boundaries and, what seems to be a special case of the former, randomization of the decision boundaries.

Comment: Bayesian learning methods seems to a be natural choice since prior distributions are in the essence of the Bayesian concept.

Last in the paper, they analyze a simple learning example for outlier detection on the bounds of the effort an attacker has to use to manipulate the learning system into wrongly classify a malicious call.

Comment: I cannot write much about this analysis since I could not understand the definition of the relative distance they use. I don't understand why they use it and what it means. Thus I do not understand the result. Is there anybody out there that can help me with this?

See follow up post on this issue.

A blog previous known as "Intrusion Detection on the Road"