http.dottedquadhost and you

Hey hey folks,

This post should be much shorter than the posts I typically deliver here. I just wanna make folks aware of some changes we made to some pretty old rules in the ET ruleset that tend to be very valuable when hunting malware.


In the ET ruleset, there is a cluster of rules that set a flowbit called http.dottedquadhost:

2021067: Dotted Quad Host M1 (noalert)

2021068: Dotted Quad Host M2 (noalert)

2021069: Dotted Quad Host M3 (noalert)

2021070: Dotted Quad Host M4 (noalert)

2021071: Dotted Quad Host M5 (noalert)

2021072: Dotted Quad Host M6 (noalert)

2021073: Dotted Quad Host M7 (noalert)

2021074: Dotted Quad Host M8 (noalert)

2021075: Dotted Quad Host M9 (noalert) 

The structure of these rules is all very similar. For example, the M1 version of the rule looks like this (Suricata 7.0.3):

alert http1 $HOME_NET any -> $EXTERNAL_NET any (msg:"ET INFO Dotted Quad Host M1 (noalert)"; flow:established,to_server; flowbits:set,http.dottedquadhost; flowbits:noalert; http.header.raw; content:"host|3a 20|1"; nocase; fast_pattern; pcre:"/^\d{0,2}\.(?:\d{1,3}\.){2}\d{1,3}(?:\x3a\d{1,5})?/R"; classtype:bad-unknown; sid:2021067; rev:8; metadata:created_at 2015_05_08, confidence High, signature_severity Informational, updated_at 2025_06_13;)
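For readers less used to Suricata content syntax: the bytes between the pipes in content:"host|3a 20|1" are hex, so |3a 20| is a colon followed by a space. The Python sketch below decodes that notation and lists the nine prefixes the M1 through M9 rules would key on. The decode_content helper is hypothetical and ignores escaped characters, and the mapping of rule Mn to leading digit n is an assumption based on the M1 rule shown above.

```python
def decode_content(pattern: str) -> bytes:
    """Decode a Suricata-style content string with |hex| sections.

    Hypothetical helper for illustration only; it does not handle
    escaped characters that real rules may contain.
    """
    out = bytearray()
    # Splitting on "|" alternates between literal text and hex sections.
    for index, chunk in enumerate(pattern.split("|")):
        if index % 2 == 0:
            out += chunk.encode("ascii")   # literal text
        else:
            out += bytes.fromhex(chunk)    # e.g. "3a 20" -> b": "
    return bytes(out)


# Assumption: rule Mn uses the leading digit n in its content match.
PREFIXES = [decode_content(f"host|3a 20|{digit}") for digit in range(1, 10)]

if __name__ == "__main__":
    for prefix in PREFIXES:
        print(prefix)
```

Each prefix is long enough to be a useful fast_pattern, which is the reason the detection is split across nine rules instead of one.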

Regex Structure

All of these rules look for an HTTP Host header whose value contains an IPv4 address, which indicates the client is attempting to communicate via HTTP with an external web server by IP address. All of the rules, M1 through M9, set the same flowbit, http.dottedquadhost. The reason there are so many of them is that each rule needs a fast_pattern in order to not perform excessively badly. So we use the Host header name, plus the first digit of whatever IP address is encountered in the Host header, as the fast_pattern for the rule. Then, if that digit is 1 or 2, we use an IP address regex like so:

^\d{0,2}\.(?:\d{1,3}\.){2}\d{1,3}

Which could be translated as “give me four numbers separated by three literal periods. Aside from the leading 1 or 2 (which the content match already consumed), we want between zero and two more digits before the first period.” This lets us match a first octet of 1, 10, or 100, or 2, 20, or 200, and any number in between. Then, for the remaining octets of the IP address, there can be from one to three digits between the periods.

IP addresses that start with the digits 3 through 9 get a regex that looks like this:

^\d{0,1}\.(?:\d{1,3}\.){2}\d{1,3}

The big difference is the first portion of the regex, which matches between zero and one digit before that first period, so we get numbers like 3, 30, 4, 40, 5, 50, 6, 60, etc.
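To see how the content match and the relative regex fit together, here is a rough Python sketch. It is not how Suricata evaluates the rule: the matches_dotted_quad helper is hypothetical, and it assumes the header line starts with "Host: ", whereas the real content match can land anywhere in the raw header buffer. The two patterns are copied from the regexes above.

```python
import re

# What the /R (relative) regex sees: everything after "host: <first digit>".
REST_FOR_1_OR_2 = re.compile(r"^\d{0,2}\.(?:\d{1,3}\.){2}\d{1,3}")
REST_FOR_3_TO_9 = re.compile(r"^\d{0,1}\.(?:\d{1,3}\.){2}\d{1,3}")


def matches_dotted_quad(header_line: str) -> bool:
    """Hypothetical helper: True if the line looks like 'Host: <IPv4>'."""
    lowered = header_line.lower()      # the rule's content match is nocase
    prefix = "host: "
    if not lowered.startswith(prefix):
        return False
    value = lowered[len(prefix):]
    if not value or value[0] not in "123456789":
        return False
    rest = value[1:]                   # text following the content match
    pattern = REST_FOR_1_OR_2 if value[0] in "12" else REST_FOR_3_TO_9
    return pattern.match(rest) is not None


if __name__ == "__main__":
    for line in ("Host: 100.100.100.100", "Host: 45.33.32.156",
                 "Host: 10.example.com", "Host: seemsleg.it"):
        print(line, matches_dotted_quad(line))
```

Note that the regexes only count digits. They do not range-check octets, so a value such as 299.1.1.1 would also match.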

Another thing that was added that wasn’t present in the original version of these rules is a port number specifier. According to the MDN (Mozilla Developer Network) entry for the HTTP Host header, it can be specified with or without a port number. If no port number is given, the default port for the requested service is implied: 80 for HTTP and 443 for HTTPS.

That’s the reason for this portion of the regex:

(?:\x3a\d{1,5})?

After the IP address, we want to match a literal colon (:) followed by between 1 and 5 digits. The ? lets us say “match this group zero or one time” so that the rule also works on requests where the port number is not present.
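A quick way to check the optional port group is to run the full M1-style regex against a few strings in Python. This is only a sketch: matched_text is a hypothetical helper, and the inputs are the text that would follow the "host: 1" content match, not a full header.

```python
import re
from typing import Optional

# The M1/M2-style regex from above, including the optional port group.
FULL = re.compile(r"^\d{0,2}\.(?:\d{1,3}\.){2}\d{1,3}(?:\x3a\d{1,5})?")


def matched_text(rest: str) -> Optional[str]:
    """Hypothetical helper: return the portion of rest the regex consumed."""
    match = FULL.match(rest)
    return match.group(0) if match else None


if __name__ == "__main__":
    # "rest" is what follows "host: 1" in a header such as "Host: 100.100.100.100:8080".
    for rest in ("00.100.100.100", "00.100.100.100:8080", "0.example.com"):
        print(repr(rest), "->", matched_text(rest))
```

Because the group is optional, the regex matches with or without a port; when a port is present it is simply included in the match.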

http.host vs. http.header.raw

Another change we made to the Suricata rules was using http.header.raw instead of http.host for content matches and PCRE. Why?

The Suricata documentation lays it out here specifically, check this out:

* When there are duplicate HTTP headers (referring to the header name only, not the value), the normalized buffer (`http_header`) will concatenate the values in the order seen (from top to bottom), with a comma and space (", ") between each of them. If this hinders detection, use the `http_raw_header` buffer instead.

Example request:

GET /test.html HTTP/1.1
Content-Length: 44
Accept: */*
Content-Length: 55

The Content-Length header line becomes this in the `http_header` buffer:

Content-Length: 44, 55

So, let’s say you have network traffic with multiple host headers like this:

GET /ayylmao HTTP/1.1
Host: 100.100.100.100
Host: seemsleg.it
Host: totesleg.it
User-Agent: ayylmaobot

Suricata will just smash all the data from the Host headers together, separated by a comma and a space (\x2c\x20), like this for the http.host buffer:

http.host = 100.100.100.100, seemsleg.it, totesleg.it

While it’s not a common occurrence, malware developers often don’t have the strongest coding practices, so we wanted to account for duplicate Host headers. According to the docs and our testing, using http.header.raw was the best way forward. Note for Snort users: Snort doesn’t seem to have these problems, which is nice. One problem that Snort 2.9.x doesn’t have, for once.
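The effect of duplicate headers can be sketched in Python. This is a simulation, not Suricata's parser: concatenated_values simply joins duplicate values with ", " as the quoted documentation describes, and STRICT_IP_ONLY is a hypothetical whole-value check used for contrast (it is not the previous ET rule). RAW_HOST_LINE folds the M1 content match and regex into a single pattern.

```python
import re

RAW_REQUEST = (
    "GET /ayylmao HTTP/1.1\r\n"
    "Host: 100.100.100.100\r\n"
    "Host: seemsleg.it\r\n"
    "Host: totesleg.it\r\n"
    "User-Agent: ayylmaobot\r\n"
    "\r\n"
)


def raw_header_block(request: str) -> str:
    """Everything after the request line, left untouched."""
    return request.split("\r\n", 1)[1]


def concatenated_values(request: str, name: str) -> str:
    """Join duplicate header values with ', ' in the order seen."""
    values = []
    for line in raw_header_block(request).split("\r\n"):
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == name.lower():
            values.append(value.strip())
    return ", ".join(values)


# Hypothetical check that expects the whole value to be an IP (and port).
STRICT_IP_ONLY = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}(?::\d{1,5})?$")
# M1 content match plus relative regex, collapsed into one pattern.
RAW_HOST_LINE = re.compile(
    r"host: 1\d{0,2}\.(?:\d{1,3}\.){2}\d{1,3}(?:\x3a\d{1,5})?", re.IGNORECASE
)

if __name__ == "__main__":
    joined = concatenated_values(RAW_REQUEST, "Host")
    print("joined value:", joined)
    print("strict check on joined value:", bool(STRICT_IP_ONLY.match(joined)))
    print("raw header match:", bool(RAW_HOST_LINE.search(raw_header_block(RAW_REQUEST))))
```

In this simulation the joined value no longer looks like a bare IP address, while the raw header block still contains an intact "Host: 100.100.100.100" line for the rule to find.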

New and Existing rules that use http.dottedquadhost

There are a plethora of rules in ET INFO as well as ET HUNTING that utilize the http.dottedquadhost flowbit for tracking anomalous activities. Some of the existing rules track requests for various file types:

PDF File Request: 2027265
Zip File Request: 2027262
PS1 (powershell) Request: 2027259
MZ (possible Windows executable) Response

and so on. Some new rules that were written recently include:

Base64-Encoded Powershell Payload: 2063007
Base64-Encoded PHP payload: 2063008
Suspected IoT Botnet Loader Shell Script: 2063020
Werkzeug HTTP Server String Response: 2063017

We’re hoping that by improving the original rules and expanding the set of rules that rely on the http.dottedquadhost flowbit, our users can utilize these rules to discover more unusual activity in their traffic before it becomes a bigger problem.

Call for Action

As always, as we think of more and better ways to take advantage of http.dottedquadhost and the improvements made, we’ll create rules to cover those anomalous responses.

Here’s where you come in. If you have ideas for malicious file extensions, http server strings you see used in attacks/IP address hostnames, file responses from servers, etc., contact us, and we’ll see about getting a rule into the ETOPEN ruleset for everyone to benefit.

Happy Hunting,

-Tony Robinson

Edit 6/19: Big shout-out to James for his help in re-writing these rules.


malicious file extensions, http server strings you see used in attacks/IP address hostnames, file responses from servers

Is this Zeek rule from Sigma something you’d consider adding into the ET ruleset?

Great Sigma rule. I’ll have to see what I can do with it. I know we have a lot of rules for detecting requests to different TLDs, and I know we have a number of rules to catch suspicious extensions. It might be possible to make rules that look for TLDs in the Host header, set a flowbit, and react when a suspicious file extension or response data shows up.

Thanks for the great suggestion
