Hey Folks,
I know you just saw me put down a pretty meaty blog post a few minutes ago, but this is another blog post that is somewhat related. On top of that, I’m making it a personal goal this year to share a blog post on the community pages at least once per month, so I’m aiming for a double header to cover March and April.
Like the last entry I made, this blog will feature some rules I wrote based on the PolarEdge botnet research that sekoia.io wrote up. This time around, I’m going to very briefly cover some of Proofpoint/Emerging Threats’ QA standards for the rules we produce, give an introduction to Suricata (and Snort) rule performance metrics, then talk about writing flexible rules.
Rule Performance: Emerging Threats Rule Criteria
Emerging Threats has certain criteria when it comes to rule writing:
- Is the rule syntactically correct?
- If the rule doesn’t pass basic validation by Snort/Suricata or causes startup errors, obviously it has to be fixed before it can ship
- Can it be universally deployed?
- If possible, we want versions of this rule that support every IDS syntax we ship – Snort 2.9.x, Suricata 5, and Suricata 7.0.3
- Can it be done with a default configuration?
- If making a rule or using certain keywords requires modifying the default config file (snort.conf or suricata.yaml), then we can’t do it, and have to find other ways to devise detection. A lot of the time this requires us to use unmodified content matches (no HTTP buffer modifiers) for Snort rules, because the port or ports the bad guys use for HTTP communication are random, or are not in the default HTTP_PORTS variable. Thankfully, Suricata brought us dynamic protocol detection.
Way back when (and also currently, since Snort 2.9.x is still actively supported), Snort relied on users configuring arrays of port numbers in which they expected specific network traffic. To inspect HTTP traffic and use http modifiers on non-default ports, for example, the target port has to be added to the HTTP_PORTS variable and to the stream5 and HTTP preprocessor configurations. Compare that to Suricata 5 and 7, with dynamic protocol detection.
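For example, here’s a minimal sketch of what that looks like in snort.conf (port 8888 is a made-up example port, and the default config ships with much longer port lists in each of these spots):

# Add the non-standard port to the HTTP_PORTS variable
portvar HTTP_PORTS [80,8080,8888]

# Make sure stream5 reassembles TCP traffic on that port
preprocessor stream5_tcp: policy windows, ports both 80 8080 8888

# Make sure http_inspect normalizes traffic on that port
preprocessor http_inspect_server: server default profile all ports { 80 8080 8888 }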
Rule Performance Metrics
On top of those general criteria, we have performance requirements. This mostly requires us to test new rules against a battery of pcaps that are meant to be a sort-of realistic depiction of real-world traffic in different environments. I say somewhat realistic because some of these pcaps are getting a little bit long in the tooth (if you have an interest in helping us continue to provide a free and relevant ruleset for Snort and Suricata, reach out to us – we could always use new pcaps for baselining).
If the rules we produce have excessive alerts (false positives in most cases) or the performance metrics (more on this in a minute) are just not good, we either modify them, ship them with a metadata tag to let users know the performance cost is high, or, if there is no alternative, ship them in the ruleset disabled and let users make up their own minds on whether or not to enable them.
So, what are rule performance metrics? For Snort and Suricata, rule performance is measured in checks, matches, and ticks. Without getting too deep into it: checks are how often a rule’s fast_pattern matches against traffic during a Snort or Suricata session, matches are how often the rest of the criteria of a rule match once its fast_pattern has matched, and ticks are a measurement of how many CPU cycles a rule has eaten over the course of Snort or Suricata running. Ticks can be measured as total ticks, average ticks, average ticks that don’t result in a match, average ticks that do result in a match, etc.
Be aware that if you’re interested in doing performance measurement at home, Snort and Suricata both have compile options for doing performance analysis. In general, it’s not recommended to compile these options in for sensors in production environments, as monitoring performance introduces a performance hit of its own. That’s why I recommend using Dalton for measuring rule performance, as each instance comes with rule profiling support compiled in, as well as options to produce rule profiling statistics when analyzing pcaps.
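For reference, here’s a sketch of what turning this on yourself looks like for Suricata – build with ./configure --enable-profiling, then enable the rules section under profiling in suricata.yaml (the filename and sort order below are just sensible defaults, not requirements):

profiling:
  rules:
    enabled: yes
    filename: rule_perf.log
    append: yes
    sort: avgticks
    limit: 10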
In conclusion: the lower the checks, matches, and ticks are while the rule still actually detects the threat it’s meant to catch, the better.
Rigid Rules vs. Flexible Rules
With an understanding of rule performance metrics, I’m going to dump some of the rules I made for PolarEdge here, based on the info from the blog:
alert http1 any any -> $HOME_NET any (msg:"ET MALWARE PolarEdge Webshell Installation attempt"; flow:established,to_server; http.uri; content:"/cgi-bin/"; startswith; pcre:"/^(?:config|config_mirror)\x2eexp\x3f/R"; http.header; content:"M|3a 20|H4s"; fast_pattern; content:"CMD|3a 20|"; reference:url,blog.sekoia.io/polaredge-unveiling-an-uncovered-iot-botnet/; classtype:trojan-activity; sid:2060440; rev:1; metadata:affected_product IoT, attack_target Networking_Equipment, tls_state plaintext, created_at 2025_02_28, deployment Perimeter, deployment Internal, malware_family PolarEdge, performance_impact Low, confidence High, signature_severity Major, updated_at 2025_02_28, mitre_tactic_id TA0001, mitre_tactic_name Initial_Access, mitre_technique_id T1190, mitre_technique_name Exploit_Public_Facing_Application; target:dest_ip;)
alert http1 any any -> $HOME_NET any (msg:"ET MALWARE PolarEdge Webshell Activity"; flow:established,to_server; http.uri; content:"/cgi-bin/userLogin.cgi"; http.header; content:"PASSHASH|3a 20|"; fast_pattern; content:"XCMD|3a 20|"; reference:url,blog.sekoia.io/polaredge-unveiling-an-uncovered-iot-botnet/; classtype:trojan-activity; sid:2060441; rev:1; metadata:affected_product IoT, attack_target Networking_Equipment, tls_state plaintext, created_at 2025_02_28, deployment Perimeter, deployment Internal, malware_family PolarEdge, performance_impact Low, confidence High, signature_severity Major, updated_at 2025_02_28, mitre_tactic_id TA0008, mitre_tactic_name Lateral_Movement, mitre_technique_id T1210, mitre_technique_name Exploitation_Of_Remote_Services; target:dest_ip;)
alert http1 any any -> $HOME_NET any (msg:"ET MALWARE PolarEdge TLS Backdoor Installation Attempt"; flow:established,to_server; http.uri; content:"/cgi-bin/"; startswith; pcre:"/^(?:config|config_mirror)\x2eexp\x3f/R"; content:"/tmp|3b|busybox"; distance:0; fast_pattern; content:"ftpget"; distance:0; http.header; content:"Referer|3a 20|https|3a 2f 2f|"; content:".htm|0d 0a|"; distance:0; reference:url,blog.sekoia.io/polaredge-unveiling-an-uncovered-iot-botnet/; classtype:attempted-admin; sid:2060442; rev:1; metadata:affected_product IoT, attack_target Networking_Equipment, tls_state plaintext, created_at 2025_02_28, deployment Perimeter, deployment Internal, malware_family PolarEdge, performance_impact Low, confidence High, signature_severity Major, updated_at 2025_02_28, mitre_tactic_id TA0001, mitre_tactic_name Initial_Access, mitre_technique_id T1190, mitre_technique_name Exploit_Public_Facing_Application; target:dest_ip;)
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"ET MALWARE PolarEdge CnC Checkin"; flow:established,to_server; http.uri; content:"ip|3d|"; content:"version|3d|"; content:"module|3d|"; nocase; content:"cmd|3d|putdata"; fast_pattern; content:"data|3d|BRAND"; content:"PORT|3d|"; content:"PWD|3d|"; content:"MAC|3d|"; reference:url,blog.sekoia.io/polaredge-unveiling-an-uncovered-iot-botnet/; classtype:trojan-activity; sid:2060443; rev:1; metadata:affected_product IoT, attack_target Networking_Equipment, tls_state TLSDecrypt, created_at 2025_02_28, deployment Perimeter, deployment Internal, malware_family PolarEdge, performance_impact Low, confidence High, signature_severity Major, updated_at 2025_02_28, mitre_tactic_id TA0011, mitre_tactic_name Command_And_Control, mitre_technique_id T1573, mitre_technique_name Encrypted_Channel; target:src_ip;)
I know these rules are a lot of text to read, so I’m going to focus on one rule out of the ones above that I think embodies a flexible rule:
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"ET MALWARE PolarEdge CnC Checkin"; flow:established,to_server; http.uri; content:"ip|3d|"; content:"version|3d|"; content:"module|3d|"; nocase; content:"cmd|3d|putdata"; fast_pattern; content:"data|3d|BRAND"; content:"PORT|3d|"; content:"PWD|3d|"; content:"MAC|3d|"; reference:url,blog.sekoia.io/polaredge-unveiling-an-uncovered-iot-botnet/; classtype:trojan-activity; sid:2060443; rev:1; metadata:affected_product IoT, attack_target Networking_Equipment, tls_state TLSDecrypt, created_at 2025_02_28, deployment Perimeter, deployment Internal, malware_family PolarEdge, performance_impact Low, confidence High, signature_severity Major, updated_at 2025_02_28, mitre_tactic_id TA0011, mitre_tactic_name Command_And_Control, mitre_technique_id T1573, mitre_technique_name Encrypted_Channel; target:src_ip;)
To sum up this rule, we’re looking for a bunch of URI parameters in TLS-decrypted HTTP traffic. We know this because right before the stream of content matches in this rule, I dropped http.uri to ensure all of the content-related keywords that follow are matched against the URI of an HTTP request:
http.uri; content:"ip|3d|"; content:"version|3d|"; content:"module|3d|"; nocase; content:"cmd|3d|putdata"; fast_pattern; content:"data|3d|BRAND"; content:"PORT|3d|"; content:"PWD|3d|"; content:"MAC|3d|";
Let’s compare that to the backdoor mentioned in the report, and its URI structure:
hxxps://195.123.212[.]54:58425/cCq65x?ip=[DEVICE PUBLIC IP]&version=1.6&module=CISCO_3&cmd=putdata&data=BRAND=Cisco,MODULE=CISCO_3,PORT=[BINDING PORT],PWD=[BINARY EXECUTION PATH],MAC=[DEVICE MAC ADDRESS]
Flexibility by omission of data
You might notice that in my rule, I didn’t include the /cCq65x? at the start of the URI. It’s a unique content match on its own, and in another life, I might have used it as the fast_pattern for this rule. But there’s no guarantee that the actors aren’t using a different URI endpoint for other operations, or for the other IoT devices the article says they were compromising (ASUS, QNAP, Synology), so I wanted to omit it and look for an alternative fast_pattern match. This increases the chance that other versions of the C2 check-in URI might be caught by this rule.
You might also notice that for some of the content matches, I omitted some data (e.g., content:"data|3d|BRAND" instead of content:"data|3d|BRAND|3d|Cisco"; and content:"module|3d|"; nocase; instead of content:"MODULE|3d|CISCO_3";). Again, this is to make a flexible rule that (provided you’re doing SSL decryption) might allow Snort/Suricata users to catch variants of this communication to the C2 server, or maybe new versions.
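To make the contrast concrete, here’s a hypothetical rigid version of those URI matches, pinned to the exact Cisco sample from the blog, next to what actually shipped:

# Rigid (hypothetical): only matches the Cisco sample documented in the blog
content:"/cCq65x?"; content:"module|3d|CISCO_3"; content:"data|3d|BRAND|3d|Cisco";

# Flexible: what shipped in sid 2060443
content:"module|3d|"; nocase; content:"data|3d|BRAND";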
“If this rule is meant to be flexible, why are the content matches in the same order as the content from the blog?”
Good question, but allow me to answer with a question of my own: did you know that Suricata and Snort’s content matches are NOT position-dependent unless you use positional keywords like startswith, endswith, offset, depth, distance, or within? That means I could place the content matches above in ANY order, and Suricata would still find a match. Don’t believe me? Let’s let Dalton and Flowsynth do some heavy lifting for me.
First, I’m going to make a pcap that is a duplicate of the data from the blog post, using Flowsynth:
Flowsynth is magic by which you can turn strings or byte patterns in research reports, blogs, social media screencaps, sandbox runs, etc. into mock pcaps for testing.
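I won’t reproduce my exact input file, but a minimal Flowsynth sketch of the check-in request looks something like the following (the client IP/port and all of the parameter values are made up, loosely following the URI structure from the blog, and I’ve swapped the real C2 IP for a documentation address):

flow default tcp 192.168.1.10:49152 > 203.0.113.54:58425 (tcp.initialize;);
default > (content:"GET /cCq65x?ip=1.2.3.4&version=1.6&module=CISCO_3&cmd=putdata&data=BRAND=Cisco,MODULE=CISCO_3,PORT=8443,PWD=/tmp/payload,MAC=00:01:02:03:04:05 HTTP/1.1\x0d\x0aHost: 203.0.113.54\x0d\x0a\x0d\x0a";);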
Now, let’s do a Suricata run with the original rule above, and a rule in which I scramble the order of the content matches:
I’m submitting two rules with identical content matches, just in a different order.
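The scrambled copy looks something like this (the msg and sid are mine, purely for the test):

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"PolarEdge CnC Checkin scrambled content match order"; flow:established,to_server; http.uri; content:"MAC|3d|"; content:"PWD|3d|"; content:"PORT|3d|"; content:"data|3d|BRAND"; content:"cmd|3d|putdata"; fast_pattern; content:"module|3d|"; nocase; content:"version|3d|"; content:"ip|3d|"; sid:667; rev:1;)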
We can see that both rules fire. Pretty neat, huh? Buuuut check out the difference in performance:
While both of these rules work, notice how Suricata uses a few more ticks to process the out-of-order content matches in the second rule?
Using distance and within to control where a content match should be
Let’s look at two other alternative rules:
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"PolarEdge CnC Checkin with distance zero content matches"; flow:established,to_server; http.uri; content:"ip|3d|"; content:"version|3d|"; distance:0; content:"module|3d|"; nocase; distance:0; content:"cmd|3d|putdata"; fast_pattern; distance:0; content:"data|3d|BRAND"; distance:0; content:"PORT|3d|"; distance:0; content:"PWD|3d|"; distance:0; content:"MAC|3d|"; distance:0; sid:668; rev:1;)
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"PolarEdge CnC Checkin using within content matches"; flow:established,to_server; http.uri; content:"ip|3d|"; content:"version|3d|"; within:60; content:"module|3d|"; within:20; content:"cmd|3d|putdata"; fast_pattern; within:40; content:"data|3d|BRAND"; within:20; content:"MODULE|3d|"; within:30; content:"PORT|3d|"; within:20; content:"PWD|3d|"; within:20; content:"MAC|3d|"; within:50; sid:669; rev:1;)
Let’s say you want to write a Suricata rule and you want to make sure the content matches are found sequentially, in the order you wrote them in the rule body. You can use the distance content modifier, or you can use within. distance normally tells a rule to start searching in a buffer or payload X number of bytes AFTER the previous content match. What’s special about distance:0; is that it tells Suricata to start looking immediately after the previous content match, with no limit on the number of bytes between the previous content match and the match carrying the distance:0; modifier.
There are cases where you have no idea how many bytes there will be between content matches, but you know that certain content matches are sequential. In those cases, we can use distance:0; to let Suricata know that the modified content match should appear sometime after the prior content match, either in the same sticky buffer or the same overall payload (if no modifiers/sticky buffers are being used).
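As a tiny illustration (the field names here are invented), this pair of matches:

content:"first|3d|"; content:"second|3d|"; distance:0;

would match both /x?first=1&second=2 and /x?first=1&lots=of&junk=here&second=2, because distance:0; only requires that "second=" show up somewhere after "first=", no matter how far apart they are.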
In some cases, this can save you some CPU ticks, because Suricata doesn’t have to repeatedly scan the same buffer/payload for all the content matches in that buffer/payload. In other cases, if the payload/buffer is quite large or varies in size, and the content matches you are using are not particularly unique, this results in false positives in ways you didn’t expect. Suddenly customers are sending you screencaps showing content match number one happening, then content match number two happening 400 bytes later. Repeatedly. Other times, if distance:0; is used excessively, it can introduce performance problems where it costs more CPU ticks overall to analyze. What’s the alternative? within.
If you have any inkling that you can calculate the distance in bytes between an initial content match and the next one, use the within keyword instead. Generally, the performance is much better, and it establishes boundaries on how far Suricata is allowed to look for the content match carrying the within modifier, which can help with false positives. To do this, however, you have to make a lot of educated guesses using the data you have. Let’s take a look at a content match from the rule above, where we use within:
content:"ip|3d|"; content:"version|3d|"; within:60;
Notice how the second content match has within:60; as its modifier? How did I come to that conclusion? The original article states that the "ip=" URI parameter contains the public IP address of the device, reported to the C2 server. Is that IPv4 only, or is IPv6 included? If IPv6 addresses are supported, we need to account for at least 39 bytes, assuming they’re writing the IP address in plaintext/ASCII (16 bytes, 32 characters, plus 7 ":" characters). Then we have to account for the ampersand separator between values, plus the number of bytes in "version=", which is 8 bytes. We add all that up and come to 39 + 1 + 8 = 48. I added another 12 bytes for safety, so within:60;.
Let’s do some performance measuring. Here is a performance comparison of the original rule I put into the ET ruleset vs. the distance:0 rule:
While this is a case where distance:0 reduced the number of ticks by a fair amount, just bear in mind that this is a test pcap with a single HTTP stream. Overuse of distance:0 in real-world traffic can cause worse performance.
Next, let’s compare the original rule to the rule with the within modifier:
The performance improvement here is also fairly nice, but not quite as good as the rule with the distance modifier. However, the chance of false positives is much lower with this rule, since it defines more exact locations where the content matches are expected.
Now, with all of these options available, and both the distance:0; and within rules being better performers, why did I choose to use neither modifier? It’s because, while the natty perf gainz are nice, the original rule was the most flexible option available to try and capture unknown variants, while still performing pretty decently.
I didn’t match on the start of the URI string in case the bad guys change it in a new campaign. Notice content:"module|3d|"; nocase;? There are two "module" strings in the check-in URI – one fully uppercase, the other lowercase – and with nocase I’m choosing to match if either one is present. And finally, I’m not using distance or within modifiers, to catch potential cases where the URI parameters for the backdoor have their locations swapped. That would immediately make both the distance:0; and within rules worthless for catching new variants.
Hope you learned something today.
Happy Hunting,
-Tony