A customer recently reported some False Positives on a signature and it made for an interesting case to track down.
The report mentioned that 2016950 was triggering on benign HTTP traffic and was actually one of the “Top 5” signatures generating alerts within the environment.
This report prompted a review of the rule.
TLDR; Do you have FPs that are causing you problems? Instead of just disabling that rule, report it and Emerging Threats can tune the rule for you! FPs can be publicly submitted here on the Discourse site or privately via https://feedback.emergingthreats.net.
Seek First to Understand, Then to Make the Tune.
When reviewing a rule, several items are considered, though the primary consideration is often whether the rule is still producing true positive alerts. Other considerations include whether the malware family is still active within the malware ecosystem, the age of the rule, the rate of false positives, etc.
2016950, as indicated by the “created_at” metadata, was created just over a decade ago and is currently on its 7th revision. It was last updated a bit over a year ago, and internal tooling indicates this rule was created by none other than ET’s OG Will Metcalf.
sid:2016950; rev:7; metadata:created_at 2013_05_31, updated_at 2022_06_27;
This context is very helpful to keep in mind as tunes are considered. Here is a 10-year-old rule that has already undergone several revisions. This leads us to believe that the signature has been prone to FPs for some time.
Looking at the actual detection logic, it can be determined that the rule detects outbound HTTP requests to any URI ending in /ip.txt where the HTTP User-Agent does not contain “Mozilla”, along with a couple of other negations.
http.uri; content:"/ip.txt"; nocase; endswith; fast_pattern;
http.header; content:!"%E5%A4%A7%E4%BC%97%E7%82%B9%E8%AF%84";
http.user_agent; content:!"Mozilla";
http.host; dotprefix; content:!".malwaredomainlist.com";
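To make the combined effect of these four matches easier to see, here is a minimal Python sketch that approximates them against an already-parsed request. It is not how Suricata evaluates the rule; the function name `matches_rev7` and the host `example.com` are hypothetical, and the sketch assumes the host buffer is lowercased before matching.

```python
def matches_rev7(uri: str, host: str, user_agent: str, raw_headers: str) -> bool:
    """Approximate the rev 7 content matches on a parsed HTTP request (illustrative only)."""
    # http.uri; content:"/ip.txt"; nocase; endswith;
    if not uri.lower().endswith("/ip.txt"):
        return False
    # http.header; content:!"%E5%A4%A7%E4%BC%97%E7%82%B9%E8%AF%84";
    if "%E5%A4%A7%E4%BC%97%E7%82%B9%E8%AF%84" in raw_headers:
        return False
    # http.user_agent; content:!"Mozilla";
    if "Mozilla" in user_agent:
        return False
    # http.host; dotprefix; content:!".malwaredomainlist.com";
    # dotprefix prepends "." so the bare domain and its subdomains are both excluded.
    if ".malwaredomainlist.com" in "." + host.lower():
        return False
    return True


# Hypothetical request: any non-browser client fetching /ip.txt will match.
headers = "User-Agent: SERVER2_03\r\nHost: example.com\r\n"
print(matches_rev7("/ip.txt", "example.com", "SERVER2_03", headers))
```

The sketch makes the breadth of the rule visible: apart from three exclusions, the only positive requirement is the URI suffix, which is why benign clients requesting /ip.txt can alert.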
Finding True Positives
In order to tune any signature, a good sample of true positive traffic is helpful. This establishes the baseline of what is possible with any tunes. Creating false negatives (FNs) is a very real outcome of tuning rules. Having true positive traffic allows any changes to the rule to be tested to ensure that FNs are not introduced.
In this case, due to the age of the rule and the high number of false positives (even within Emerging Threats’ own dataset), finding true positive traffic proved to be difficult.
Pivot from the reference sample
Within the rule, there is a single MD5 reference. When a rule includes an MD5 reference, a link to VT or Malware Bazaar, etc., it’s generally just the sample which was used to create the rule. To explain that further, the sample would have been executed in a sandbox, the network traffic from that sandbox execution analyzed, and the rule created based on it. It is important to realize that the MD5 is a “reference sample” and not considered the definitive reference. The intent is to cover multiple unique samples with a single rule. Even though there is an MD5 reference in the rule, it’s possible, and even desirable, that other samples will generate true positive traffic and be covered by the same signature.
reference:md5,4d23395fcbab1dabef9afe6af81df558;
In this case, even after attempting to sandbox the sample, 4d23395fcbab1dabef9afe6af81df558 did not produce any network traffic, nor was there any traffic stored within Emerging Threats’ internal dataset for this sample.
VirusTotal
Looking at the sample within VirusTotal, there is a unique artifact which allowed for pivoting. This sample opened, wrote to and then deleted a file called DELME.BAT.
A quick pivot within VT surfaced 79 other samples which produced the same behavior, several of which were first uploaded to VT within the last year.
behaviour_files:"C:\\WINDOWS\\DELME.BAT"
After sandboxing all of the samples, several produced traffic which matched the existing rule!
md5sum | first seen on VT | compile date |
---|---|---|
077b4fa1df1e7daa8a92215461add816 | 2023-09-15 09:27:21 | 2008-08-03 09:11:21 |
a256c7a6c4f0f46219aabfbce1976980 | 2023-09-15 09:17:10 | 2008-08-03 09:11:21 |
ec7735fea5c1fd350dcdc28673084914 | 2023-09-15 09:17:10 | 2008-08-03 09:11:21 |
1aa9dd3138670d24948dd046a1dfe540 | 2023-09-15 09:15:51 | 2008-08-03 09:11:21 |
30bd44741c26ffc4b752112affc8849a | 2019-05-01 05:35:05 | 1992-06-19 22:22:17 |
4a20b3500310b8b442c2387480f4e280 | 2014-04-06 09:25:21 | 1992-06-19 22:22:17 |
16436fac049aac6cf8d176b5add4de76 | 2013-06-12 23:10:37 | 2007-11-11 07:14:33 |
249af293e2820a66977dacbbbeb0b5d2 | 2012-01-21 11:10:52 | 2008-09-12 05:14:23 |
52d45e6e6b3ab80a9912549e5b0fdfd1 | 2012-01-20 15:58:06 | 2008-05-16 15:56:06 |
1bada85493cd78d525886978d2e795c7 | 2008-05-06 15:24:25 | 2008-05-06 13:13:42 |
Several other samples produced traffic which matched a different signature for detecting Hupigon.
The true positive samples, which matched the signature in question, produced traffic that only varied within the HTTP Host header.
Emerging Threats Internal Data
VirusTotal proved useful, though within the mass of false positives in the Emerging Threats data, there are true positives as well. In fact, the second sample reviewed ended up being a true positive of high value. 93f8587d7f977a64142b0979c534e8fd (first submitted to VT way back on 2007-03-30 09:35:13) produced the following network traffic:
This network traffic is different from the previous true positive samples in at least four ways:
- The URI includes a base directory which /ip.txt is in
- The HTTP version is different (HTTP/1.0 vs HTTP/1.1)
- The HTTP User-Agent is different (RAV1.23 vs SERVER2_03)
- The Cache-Control header has been replaced with a Pragma header
As explained by the Mozilla Developer Docs, the difference between the Cache-Control and Pragma headers is an effect of the different HTTP versions: Pragma is the HTTP/1.0 header, while Cache-Control was introduced with HTTP/1.1.
As the current version of the rule alerts on both types of network traffic produced by the Hupigon samples, these differences need to be considered when making any rule modifications.
Finding False Positives
Collecting false positives is an important step to help ensure we are “tightening” the signature in a way that avoids incorrectly alerting on them. Many false positives were located using Emerging Threats internal data. This process was generally as simple as looking at all the HTTP traffic produced and seeing if it was consistent with true positive samples.
One example of a false positive is 09428675e4d527fbff579570522b2e53, which produced the following request for /ip.txt:
While this traffic might be malicious, it certainly isn’t Hupigon. This sample actually triggered another rule, 2805265 - ETPRO ADWARE_PUP W32/Chistudi Checkin.
The Tuning Process
Taking a careful look at the traffic produced by true positive samples and comparing it to the existing rule, there is at least one option to reduce false positives. The ordering and number of the HTTP headers are consistent across all samples:
- User-Agent
- Host
- Either Cache-Control or Pragma with a value of no-cache
Using a combination of the http.header and http.header_names buffers, we can enforce this logic:
http.header; content:"|3a 20|no-cache|0d 0a|"; endswith; http.header_names; content:"|0d 0a|User-Agent|0d 0a|Host|0d 0a|"; content:"|0d 0a 0d 0a|"; within:17; pcre:"/(?:Cache-Control|Pragma)\x0d\x0a\x0d\x0a$/";
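The header-order logic can be hard to read in rule syntax, so the following Python sketch walks through the same checks on a list of (name, value) header pairs. It assumes the http.header_names buffer is laid out as a leading CRLF, each header name followed by CRLF, and one extra trailing CRLF, which is what the content matches above rely on. The helper names, the `ExampleAgent` User-Agent and the `example.com` host are hypothetical, and this is an approximation rather than Suricata's own matching.

```python
import re

CRLF = "\r\n"
TAIL_RE = re.compile(r"(?:Cache-Control|Pragma)\r\n\r\n$")


def header_buffer(headers):
    """Approximate http.header: 'Name: value' lines, each terminated by CRLF."""
    return "".join(f"{name}: {value}{CRLF}" for name, value in headers)


def header_names_buffer(headers):
    """Approximate http.header_names: CRLF, then 'Name' + CRLF per header, then a final CRLF."""
    return CRLF + "".join(name + CRLF for name, _ in headers) + CRLF


def matches_tuned_headers(headers):
    raw = header_buffer(headers)
    names = header_names_buffer(headers)
    # http.header; content:"|3a 20|no-cache|0d 0a|"; endswith;
    if not raw.endswith(": no-cache" + CRLF):
        return False
    # http.header_names; content:"|0d 0a|User-Agent|0d 0a|Host|0d 0a|";
    prefix = CRLF + "User-Agent" + CRLF + "Host" + CRLF
    start = names.find(prefix)
    if start == -1:
        return False
    # content:"|0d 0a 0d 0a|"; within:17;
    # "Cache-Control" is 13 bytes plus 4 bytes of CRLF CRLF, so exactly one more
    # header name of at most that length may follow before the buffer ends.
    after = start + len(prefix)
    if CRLF + CRLF not in names[after:after + 17]:
        return False
    # pcre:"/(?:Cache-Control|Pragma)\x0d\x0a\x0d\x0a$/";
    return TAIL_RE.search(names) is not None


# Hypothetical requests shaped like the header orderings described above.
with_cache_control = [("User-Agent", "ExampleAgent"), ("Host", "example.com"), ("Cache-Control", "no-cache")]
with_pragma = [("User-Agent", "ExampleAgent"), ("Host", "example.com"), ("Pragma", "no-cache")]
extra_header = [("User-Agent", "ExampleAgent"), ("Host", "example.com"), ("Accept", "*/*"), ("Cache-Control", "no-cache")]

for name, hdrs in [("cache-control", with_cache_control), ("pragma", with_pragma), ("extra header", extra_header)]:
    print(name, matches_tuned_headers(hdrs))
```

The `within:17` distance is what rejects requests carrying additional headers between Host and the cache header, while the pcre pins the final header name to one of the two variants seen in the true positive traffic.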
Tuning Results
After testing this modification on true positive samples, the signature continued to fire, while the discovered false positive samples did not alert. This rule will now be under “Revision 8” as it continues providing good true positive alerts on recently uploaded samples.
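One way to run this kind of before-and-after check is to replay each sample's pcap through Suricata with the candidate rule and count alerts for the SID in the resulting EVE JSON log. The sketch below covers only the counting step and assumes EVE's line-delimited JSON with `event_type` and `alert.signature_id` fields; the function name and the sample log lines are hypothetical, and the tooling actually used for this tune is not described here.

```python
import json


def count_alerts(eve_lines, sid):
    """Count alert events for one signature ID in line-delimited EVE JSON."""
    hits = 0
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("event_type") != "alert":
            continue
        if event.get("alert", {}).get("signature_id") == sid:
            hits += 1
    return hits


# Hypothetical log excerpts: one from a true positive replay, one from a false positive replay.
tp_log = [
    '{"event_type": "http", "http": {"url": "/ip.txt"}}',
    '{"event_type": "alert", "alert": {"signature_id": 2016950, "rev": 8}}',
]
fp_log = [
    '{"event_type": "alert", "alert": {"signature_id": 2805265, "rev": 1}}',
]

print("TP still fires:", count_alerts(tp_log, 2016950) > 0)
print("FP is quiet:", count_alerts(fp_log, 2016950) == 0)
```

In practice the lists would be replaced by open file handles on each replay's eve.json, so the same function can confirm that no false negatives were introduced and that the collected false positives no longer alert.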