Investigating and Interpreting TLS SNI and DNS query rules

Hey folks, this post follows up on a recent question in which a user was confused about where to start investigating a malware threat related to a DNS alert that had triggered. I’m also going to talk a little bit about TLS SNI rules, as some of the same investigation questions apply to those rules too.

Why Do DNS/TLS SNI based rules exist?

As a part of our daily duties, it’s a requirement to keep up on the latest threat research. It doesn’t matter to us whether the threat research is generated by a competing cybersecurity company, a government organization, or independent researchers. That being said, we can only create IDS rules based on the data we have available to us.

Sometimes, reports will have partial packet captures, examples of network traffic, a detailed guide on the structure of C2 packets for the malware in question, etc. Other times, we take the file hashes, hunt through public and private data sources for samples, then attempt to run the samples we can find in sandboxes to generate the pcaps ourselves.

However, in spite of our best efforts, we may not be able to acquire packet captures, and sometimes even when we do, there may not be a repeatable pattern to build a rule on. It is then that we have to rely on IDS rules based on DNS and/or TLS SNI (Server Name Indication) IOCs in order to provide detection.

Many will cite Threat Intelligence models to tell you that low-hanging fruit like IP addresses, Domain names (and by extension TLS SNI), etc. are of low value because threat actors will change them frequently.

Sometimes this isn’t the case. Maybe the Domain names they use follow a discernible pattern (e.g. a domain generation algorithm that can be expressed by a rule, a simple string/pattern observed in multiple registered domains, etc.). Maybe they re-use the same domain name more than once, or aren’t particularly sophisticated and have a static domain their malware calls back to.
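To make the “discernible pattern” idea a bit more concrete, here is a minimal Python sketch. The pattern is entirely made up for illustration (a label of 12 to 16 lowercase hex characters under .top or .xyz) and is not taken from any real malware family or ruleset.

```python
import re

# Hypothetical pattern: a single label of 12-16 lowercase hex characters
# directly under .top or .xyz. Invented for illustration only.
DGA_LIKE = re.compile(r"^[0-9a-f]{12,16}\.(?:top|xyz)$")


def looks_generated(domain: str) -> bool:
    """Return True if the domain matches the hypothetical pattern."""
    # Normalize case and drop a trailing root dot before matching.
    return DGA_LIKE.match(domain.lower().rstrip(".")) is not None


if __name__ == "__main__":
    for name in ["a3f91c0de77b42.top", "www.example.com", "DEADBEEF0123.xyz."]:
        print(name, looks_generated(name))
```

A real rule would express the same idea with the IDS engine’s own pattern matching, and a pattern this loose would need testing against benign traffic before anyone trusted it.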

If these rules trigger, even if you think the domain name or website is dead, they require investigation. What triggered the DNS lookup? What triggered a TLS Client Hello with that SNI present? The root cause of the alert still needs to be determined.

Something else to consider is that threat actors can’t take away malware samples that have been forensically retrieved during an investigation. Nor is it possible for them to (easily) take away any DNS/TLS traffic they’ve generated that is logged by full or partial packet capture. Suricata rules can be run against sandbox network traffic, or against retained full packet capture logs, to see if an alert would have triggered at some point during the log retention period.
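As a rough sketch of that replay idea, the Python below builds a Suricata offline-mode command line. It assumes Suricata is installed and on your PATH, and the file names are hypothetical placeholders. The options used are -r (read a pcap file), -S (load only the given rule file), and -l (log directory).

```python
import subprocess
from pathlib import Path


def build_replay_command(pcap: str, rules: str, log_dir: str) -> list[str]:
    """Build a Suricata offline-mode command line.

    -r reads packets from a pcap file, -S loads only the given rule file,
    and -l sets the directory where logs such as eve.json are written.
    """
    return ["suricata", "-r", pcap, "-S", rules, "-l", log_dir]


def replay(pcap: str, rules: str, log_dir: str) -> int:
    """Run the replay and return Suricata's exit code."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    return subprocess.run(build_replay_command(pcap, rules, log_dir)).returncode


if __name__ == "__main__":
    # Hypothetical file names; this only prints the command, it does not run it.
    print(" ".join(build_replay_command("sandbox.pcap", "dns-tls.rules", "replay-out")))
```

Any alerts from the replay would land in the log directory, where they can be reviewed like alerts from a live sensor.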

Finally, if nothing else, the reference metadata in a triggered DNS/TLS SNI rule may point you to other network or host-based indicators that can be used with other data sources to hunt for a given threat actor in your network.

In conclusion, we create DNS and TLS SNI based rules when we don’t have any other choice for detecting a threat. These rules may be created with “low value” indicators of compromise, but they can still sometimes detect active campaigns and, with the right data sources available, can be used to retroactively check whether any of your systems were compromised by the malware/threat actor in question.

Introduction to DNS: Recursion

Investigating DNS-based alerts requires a little bit of understanding of how DNS works, and of where your IDS/IPS sensor is located in your network in relation to the client that made the query and/or the local DNS server(s) on your network.

In my years of experience with supporting NSM implementations, most organizations prefer to deploy IDS/IPS sensors inside of the network perimeter behind the firewall. This has some ramifications when it comes to DNS-based IDS signatures.

Now, I’m vastly oversimplifying this, but in a nutshell, a core part of how DNS works involves the use of recursive DNS queries. A client in your network is configured to talk to a DNS resolver or forwarder on the local network. When that client makes a request to the local DNS resolver/forwarder, and it doesn’t know the answer to the DNS query, that DNS resolver/forwarder then makes a DNS request on behalf of the client to the DNS servers it is configured to talk to, and so on, and so forth, until either an answer is found (e.g. the requested DNS record is returned or an NXDOMAIN response is provided) or the request times out.

Every time a DNS request recurses, the source IP address that made the DNS request changes. That means that if the only place you have an IDS/IPS sensor deployed in a network is at the perimeter, the source IP address for all DNS queries that sensor sees is going to be your DNS server. This makes investigating the source of the DNS query considerably more difficult if you don’t have access to other data sources to determine where the DNS query actually originated.
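Here is a tiny Python model of that source-address change. It is a toy and not real DNS code. The class, the function, and every address and domain in it are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Query:
    src_ip: str  # address the packet appears to come from
    dst_ip: str  # address the packet is sent to
    qname: str   # the domain being asked about


def forward(query: Query, resolver_ip: str, upstream_ip: str) -> Query:
    """Model a resolver re-issuing the client's question upstream.

    The question stays the same, but the source address is now the resolver's.
    """
    return Query(src_ip=resolver_ip, dst_ip=upstream_ip, qname=query.qname)


if __name__ == "__main__":
    # All addresses and the domain are made-up examples.
    client_query = Query("10.0.5.23", "10.0.0.53", "bad.example")
    perimeter_query = forward(client_query, "10.0.0.53", "198.51.100.1")
    print("internal sensor sees:", client_query.src_ip)
    print("perimeter sensor sees:", perimeter_query.src_ip)
```

A sensor between the client and the resolver sees 10.0.5.23, while a sensor at the perimeter only ever sees 10.0.0.53.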

Logging DNS Queries to Investigate DNS Alerts

One of the foremost improvements to consider for this problem is to log the DNS queries handled by your DNS resolvers/forwarders. Many DNS server software packages have options for doing this. Additionally, most SIEM solutions have software that can capture the DNS queries sent to your forwarders/resolvers and store them for later searching. Here are some products/configuration settings to consider:

Another option to consider is to attempt to resolve the malicious domain in question, see what IP address it resolves to, then look for any allowed or attempted connections to that IP address around the same timeframe as the DNS query. You may also consider using passive DNS threat intelligence services (such as Cisco Umbrella DNS service, mnemonic.no’s passivedns database, VirusTotal DNS lookup, etc.) to look up what IP address(es) the domain resolved to historically, and then look for those IP addresses in firewall and/or netflow logs, if available.
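Going back to resolver query logs for a moment: once you have them, finding the real client behind an alert can be as simple as searching for the domain. The Python sketch below assumes dnsmasq-style query log lines. Other resolvers log in different formats and would need a different pattern. The sample lines and addresses are invented.

```python
import re
from collections import defaultdict

# Assumes dnsmasq-style lines produced with query logging enabled, e.g.
# "Mar  3 10:15:02 dnsmasq[812]: query[A] bad.example from 10.0.5.23".
# Other resolvers use different formats and would need a different pattern.
QUERY_LINE = re.compile(
    r"query\[(?P<qtype>\w+)\]\s+(?P<qname>\S+)\s+from\s+(?P<client>\S+)"
)


def clients_for_domain(lines, domain):
    """Map each client IP to the number of times it queried `domain`."""
    wanted = domain.lower().rstrip(".")
    counts = defaultdict(int)
    for line in lines:
        match = QUERY_LINE.search(line)
        if match and match.group("qname").lower().rstrip(".") == wanted:
            counts[match.group("client")] += 1
    return dict(counts)


if __name__ == "__main__":
    # Invented sample log lines.
    sample = [
        "Mar  3 10:15:02 dnsmasq[812]: query[A] bad.example from 10.0.5.23",
        "Mar  3 10:15:02 dnsmasq[812]: forwarded bad.example to 198.51.100.1",
        "Mar  3 10:15:09 dnsmasq[812]: query[AAAA] bad.example from 10.0.5.23",
        "Mar  3 10:16:40 dnsmasq[812]: query[A] www.example.com from 10.0.7.9",
    ]
    print(clients_for_domain(sample, "bad.example"))
```

The client addresses it returns are the hosts to look at next, instead of the resolver address the perimeter sensor reported.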

What about TLS SNI Log Sources and Alerts?

Investigating TLS SNI rules is very similar in nature. The main difference is that you don’t have to worry about DNS recursion hiding the original client. However, if you use an SSL/TLS capable proxy in your environment, and your IDS/IPS deployment only sees HTTP/HTTPS requests after they are forwarded by the proxy, you’ll need to correlate connections back to the original client using the proxy’s logs.

From there, there are a couple of different data sources that may help with investigating TLS connections to known malicious hosts:

Zeek’s ssl.log
Suricata tls-log and/or TLS EVE JSON logs

Having these logs available allows us to look at other aspects of the SSL/TLS certificate, if one was observed over the wire. That information can be used for further threat hunting and security research, to find other servers on the internet that share similar SSL/TLS certificate attributes, using online scan databases like censys.io.
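As a rough example of pulling those certificate details out, the Python sketch below filters Suricata EVE JSON lines for TLS events with a given SNI. The field names (event_type, tls.sni, tls.subject, tls.issuerdn, tls.fingerprint) reflect my understanding of the EVE TLS output, so check them against your own logs, since the available fields depend on version and configuration. The sample event is invented.

```python
import json


def tls_events_for_sni(lines, sni):
    """Yield selected fields from EVE TLS events whose SNI matches `sni`."""
    wanted = sni.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict) or event.get("event_type") != "tls":
            continue
        tls = event.get("tls") or {}
        if str(tls.get("sni", "")).lower() != wanted:
            continue
        yield {
            "timestamp": event.get("timestamp"),
            "src_ip": event.get("src_ip"),
            "dest_ip": event.get("dest_ip"),
            "subject": tls.get("subject"),
            "issuerdn": tls.get("issuerdn"),
            "fingerprint": tls.get("fingerprint"),
        }


if __name__ == "__main__":
    # Invented sample event; real eve.json lines carry more fields.
    sample = [json.dumps({
        "timestamp": "2024-03-03T10:15:04.000000+0000",
        "event_type": "tls",
        "src_ip": "10.0.5.23",
        "dest_ip": "203.0.113.7",
        "tls": {"sni": "bad.example", "subject": "CN=bad.example",
                "issuerdn": "CN=Example CA", "fingerprint": "aa:bb:cc"},
    })]
    for hit in tls_events_for_sni(sample, "bad.example"):
        print(hit)
```

The subject, issuer, and fingerprint values are the pieces you would carry over to a certificate search.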

Bonus Round: flow, netflow, and firewall logs

So with DNS logs on your local network, passive DNS databases online, and SSL/TLS logs, you may be able to confirm whether or not a system on your network connected to a known bad domain. But how do we get an idea of what happened next? That’s where flow logs and/or firewall logs come into play. Firewall logs are pretty straightforward. While formatting varies depending on the vendor, generally speaking these logs will let you know who attempted to connect to a given IP address, at what time, and whether or not the connection was allowed.
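Here is a minimal Python sketch of that firewall pivot. It assumes the firewall logs have already been parsed into dictionaries. The field names (time, src_ip, dst_ip, action), the five minute window, and all the addresses are my own assumptions and don’t reflect any particular vendor’s format.

```python
from datetime import datetime, timedelta


def connections_near_query(fw_records, resolved_ips, query_time,
                           window=timedelta(minutes=5)):
    """Return firewall records to any resolved IP within `window` of the query."""
    ips = set(resolved_ips)
    return [
        rec for rec in fw_records
        if rec["dst_ip"] in ips and abs(rec["time"] - query_time) <= window
    ]


if __name__ == "__main__":
    # Invented records; field names are assumptions, not a vendor format.
    query_time = datetime(2024, 3, 3, 10, 15, 2)
    records = [
        {"time": datetime(2024, 3, 3, 10, 15, 4), "src_ip": "10.0.5.23",
         "dst_ip": "203.0.113.7", "action": "allow"},
        {"time": datetime(2024, 3, 3, 14, 0, 0), "src_ip": "10.0.7.9",
         "dst_ip": "203.0.113.7", "action": "deny"},
    ]
    for rec in connections_near_query(records, ["203.0.113.7"], query_time):
        print(rec["src_ip"], rec["dst_ip"], rec["action"])
```

The window is a judgment call. Widen it if clients cache DNS answers for a long time before connecting.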

Flow logs, on the other hand, are a generic name given to logs that track a variety of connection data: source IP, destination IP, source port, destination port, network protocol used (sometimes, application layer protocol used), number of packets sent, number of packets received, bytes sent, bytes received, etc. Sometimes, this information is referred to as connection metadata, as in, we know information about the connection, but not the contents of the connection itself. There are a wide variety of software products out there that provide flow/netflow-style logs. Here are just a few to think about:

If you have access to these logs, query for the external IP addresses identified by either your DNS or TLS SNI pivoting. If you get any results, they will help to identify the source IP address in your network that attempted to contact the external host. Through the flow logs, investigators can get an idea of how much data was transferred between the local host in your network and the external host, as well as the duration of the connections. This information may not tell you what data was transferred over the connection. Generally speaking, though, connections with a long duration, or connections in which a large number of bytes were transferred, mean that the investigation should likely continue, focusing on host-based forensics and indicators of compromise to determine if the host on your internal network was indeed infected. That is a little bit beyond the scope of this post, but maybe we can cover that another time.
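That triage can be sketched in a few lines of Python. The flow record field names (src_ip, dst_ip, bytes_out, bytes_in, duration_s) and both thresholds are arbitrary assumptions for illustration. Tune them to your own flow source and to what is normal in your network.

```python
def flows_worth_a_look(flows, external_ips, min_bytes=1_000_000, min_seconds=600):
    """Return flows to the given external IPs that are large or long-lived.

    The thresholds are arbitrary starting points, not recommendations.
    """
    targets = set(external_ips)
    hits = []
    for flow in flows:
        if flow["dst_ip"] not in targets:
            continue
        total_bytes = flow["bytes_out"] + flow["bytes_in"]
        if total_bytes >= min_bytes or flow["duration_s"] >= min_seconds:
            hits.append((flow["src_ip"], flow["dst_ip"], total_bytes, flow["duration_s"]))
    return hits


if __name__ == "__main__":
    # Invented flow records; field names are assumptions.
    flows = [
        {"src_ip": "10.0.5.23", "dst_ip": "203.0.113.7",
         "bytes_out": 4_200_000, "bytes_in": 18_000, "duration_s": 35},
        {"src_ip": "10.0.7.9", "dst_ip": "203.0.113.7",
         "bytes_out": 600, "bytes_in": 900, "duration_s": 2},
    ]
    for hit in flows_worth_a_look(flows, ["203.0.113.7"]):
        print(hit)
```

Small, short flows to a known bad host still matter. This only helps decide which internal hosts to look at first.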

I hope this post gives you some ideas on how to investigate DNS-based IDS alerts and/or TLS SNI based alerts as well! Good luck, and happy hunting!


+1 for DNS/TLS SNI alerts. “What triggered the DNS lookup?” is a great question. Domain names are human readable, easy to corroborate with other tools, and their detection is difficult to wave away as a false positive due to packet loss. In my experience, it’s easier to find information on the web about malicious domains than other IoCs, too.

That said, I’d like to be able to hand a rule signature to someone pushing back on one of my detections, and have the signature be self-evident enough of host compromise to end any debate. I want to be enabling DNS and TLS SNI signatures that prove malicious code is running on an endpoint, rather than detecting a request for malicious code during an intermediate stage of attack (e.g. merely visiting a compromised website). Compromised websites get cleaned up, but if the only known use of a domain is for exfiltrating data, I want to know when something resolves it.

My reading of the Rules Severities article is that domain-based signatures indicating malware running on a host would all be marked “Critical”, is that right?

If that holds, then a look at all ET Open signatures matching on the DNS and TLS protocols, of signature_severity “Critical”, shows they’re broken out into a few different classtypes:

bad-unknown
command-and-control
domain-c2
policy-violation
social-engineering
targeted-activity
trojan-activity
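For anyone who wants to reproduce that kind of breakdown, here is a rough Python sketch. It assumes one rule per line, with the protocol right after the `alert` action, a `classtype:` keyword, and a `signature_severity` entry in the rule metadata. It ignores commented-out rules and rules with other actions. The sample rules, domains, and sids are invented.

```python
import re
from collections import Counter

PROTO = re.compile(r"^\s*alert\s+(\w+)\s")          # protocol after the action
CLASSTYPE = re.compile(r"classtype:\s*([\w-]+)\s*;")
SEVERITY = re.compile(r"signature_severity\s+(\w+)")


def classtype_counts(rule_lines, protocols=("dns", "tls"), severity="Critical"):
    """Count classtypes among rules on the given protocols and severity."""
    counts = Counter()
    for line in rule_lines:
        proto = PROTO.match(line)
        ctype = CLASSTYPE.search(line)
        sev = SEVERITY.search(line)
        if not (proto and ctype and sev):
            continue  # commented out, other action, or missing fields
        if proto.group(1).lower() in protocols and sev.group(1) == severity:
            counts[ctype.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Invented rules for illustration; not real ET Open signatures.
    rules = [
        'alert dns $HOME_NET any -> any any (msg:"Example C2 domain"; dns.query; '
        'content:"bad.example"; classtype:domain-c2; sid:1000001; rev:1; '
        'metadata:signature_severity Critical;)',
        'alert tls $HOME_NET any -> $EXTERNAL_NET any (msg:"Example SNI"; tls.sni; '
        'content:"bad.example"; classtype:trojan-activity; sid:1000002; rev:1; '
        'metadata:signature_severity Major;)',
    ]
    print(dict(classtype_counts(rules)))
```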

Of these classtypes, the descriptions of “trojan-activity”, “command-and-control”, and “domain-c2” in classification.config sound most like “malicious code running on an endpoint must have made this DNS request or TLS connection.” Is that right?

Also, in the context of observing a malicious domain name, “domain-c2” and “command-and-control” sound like the same thing. Do you draw a distinction between them when matching on DNS and TLS? Maybe “requests to this domain are part of c2 traffic” versus “this DNS traffic is part of a stealth C2 protocol”?

Thanks!
