SIG: EarthWorm Reverse SOCKS Handshake and Tunnel Sequence Detection

EarthWorm Research Lineage, Protocol Grounding, and Sample PCAP Set

Hi all, I wanted to share some background on how this research thread came together, the protocol grounding behind the detections, and the sample PCAP set used to build and test the current rules.

How this research path started

This work began with reporting on the exploitation of a Palo Alto Networks zero-day in a campaign showing hallmarks of APT-linked activity. One detail that stood out in that reporting was the use of EarthWorm as a tunneling utility during post-compromise operations.

That article led directly into a deeper look at EarthWorm as a detection target.

Why EarthWorm stood out

EarthWorm is interesting not only because of its tunneling function, but also because it has been associated in public reporting with multiple APT-linked intrusion sets.

Groups I found associated with EarthWorm in this research path include:

  • Volt Typhoon
  • APT41

Background on the tool

EarthWorm was previously offered as an open-source tunneling tool and is publicly described as a reverse-SOCKS-style utility.

Useful references:

At this point, the original distribution path appears to have effectively disappeared, likely due in part to the tool’s repeated use by APT groups. I do not have a definitive public statement confirming the exact reason, so I would treat that as a reasonable inference rather than a confirmed attribution.

How that turned into the lab work

Because EarthWorm had clear operational relevance but was no longer conveniently available through its original distribution path, I located a mirror copy of the Linux binary and used it in a controlled local lab to generate repeatable packet captures for detection development.

To be clear, these PCAPs were lab generated by running EarthWorm itself and capturing the resulting traffic across several scenarios. They are not victim environment captures or third party passive collections.

That step made it possible to validate the packet structure and directionality directly on the wire rather than relying only on reporting or screenshots.

Public protocol grounding used for the detections

The current rule logic is grounded in the UK National Cyber Security Centre (NCSC), Malware Analysis Report: Pygmy Goat, especially pages 18-20, which map the EarthWorm handshake and tunnel sequence as:

  • 01 01 = client request
  • 01 02 = server response
  • 01 03 = assign pool number
  • 01 04 = tunnel request
  • 01 05 = tunnel response

Based on the clean lab PCAPs, the observed directionality is:

  • 01 01 = client → server
  • 01 02 / 01 03 = server → client
  • 01 04 = client → server
  • 01 05 / 05 02 00 01 = server → client
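
To make the mapping concrete, here is a minimal Python sketch of how a six-byte control record could be labeled and checked against the observed direction. The names CONTROL_RECORDS and classify_record are hypothetical helpers for illustration only, not part of EarthWorm or Suricata, and the direction labels simply restate the lab observations above.

```python
# Illustrative only: labels follow the NCSC mapping, directions follow the lab PCAPs.
CONTROL_RECORDS = {
    b"\x01\x01": ("client request", "to_server"),
    b"\x01\x02": ("server response", "to_client"),
    b"\x01\x03": ("assign pool number", "to_client"),
    b"\x01\x04": ("tunnel request", "to_server"),
    b"\x01\x05": ("tunnel response", "to_client"),
}


def classify_record(payload: bytes, direction: str):
    """Return the record label if payload is a six-byte control record
    seen in the expected direction, otherwise None."""
    if len(payload) != 6:
        return None
    entry = CONTROL_RECORDS.get(payload[:2])
    if entry is None:
        return None
    label, expected_direction = entry
    return label if direction == expected_direction else None


print(classify_record(bytes.fromhex("010100000000"), "to_server"))
print(classify_record(bytes.fromhex("010500000000"), "to_server"))
```

The second call returns None because a 01 05 record heading toward the server does not fit the observed directionality.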

Note on pool number handling and rule construction

The Pygmy Goat report documents an EarthWorm related implementation in which a pool number is assigned during the handshake sequence. In that protocol mapping, the handshake and tunnel records are still six bytes long, but the later bytes can carry a pool-specific value.

My lab-generated test set did not model that pool-assignment behavior because it was built using a more standard EarthWorm binary rather than the exact implementation described in the Pygmy Goat report.

Even so, the packet structure was still useful for detection development because both the public writeup and the lab PCAPs showed the same six-byte record size for these control messages.

Because of that, I built the rules to focus on the leading bytes that remain consistent across both cases, while also requiring dsize: 6 to anchor the match to the expected control-record format and reduce noise.

In the current rule set, that means matching on the first two bytes of the control record with depth: 2, while requiring dsize: 6 so the match stays tied to the expected six-byte handshake or tunnel record. For the follow-on SOCKS pattern, the rule uses depth: 4 with dsize: 4 on 05 02 00 01.

In practice, that means the detection logic is intentionally looking for the stable handshake and tunnel byte patterns at the start of the record, while using staged state tracking and payload-size constraints to avoid broad or noisy matching.
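
As a rough illustration of what those constraints accept, the Python sketch below approximates the per-packet checks outside of Suricata. The function names are hypothetical, and the pool-bearing tail is an assumed example value rather than bytes taken from my lab captures.

```python
def matches_control_prefix(payload: bytes, prefix: bytes) -> bool:
    """Approximate content:"<prefix>"; depth:2; dsize:6 for one packet payload."""
    return len(payload) == 6 and payload[:2] == prefix


def matches_socks_followon(payload: bytes) -> bool:
    """Approximate content:"|05 02 00 01|"; depth:4; dsize:4."""
    return payload == bytes.fromhex("05020001")


vanilla = bytes.fromhex("010400000000")     # zero tail
pool_style = bytes.fromhex("0104000004d2")  # assumed pool-bearing tail

for record in (vanilla, pool_style):
    print(record.hex(), matches_control_prefix(record, b"\x01\x04"))
```

Both records pass the prefix check, which is the point of matching only the leading two bytes: the trailing four bytes are free to carry a pool value or zeroes.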

Detection design notes

The current Suricata approach intentionally separates the traffic into two logical groups:

  1. Setup-stage control sequence
  2. Post-setup request-stage SOCKS sequence

These are designed as stateful staged detections, not as flat one-packet signatures.

Important implementation note

For anyone reviewing or adapting the rules, the early-stage marker rules are intentionally silent state setting rules. Their purpose is to carry state forward to a later, higher confidence alert condition.

In practical terms:

  • the setup-stage group uses flowbits
  • the request-stage group also uses flowbits
  • the marker stages use flowbits:noalert intentionally
  • that silent behavior is part of the design and should be preserved

The goal is for the analyst-facing alert to fire on the stronger confirmation point, rather than on the earliest staging byte pattern alone.

Setup-stage logic

The setup-stage sequence is tracked with flowbits because it occurs on a single long-lived control connection.

The 01 03 continuation is treated as optional corroboration, not as a required alert condition.

Request-stage logic

The post-setup request-stage sequence is tracked separately to catch the tunnel request / response progression and the follow-on SOCKS negotiation pattern.

This second group is also built with flowbits, using a staged progression from 01 04 to 01 05 before alerting on the SOCKS negotiation bytes.

Using 2- and 3-stage flowbits plus the byte restrictions, I tried to make the detections as quiet as possible, so that only the more compelling sequence-based alerts surface.
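
The staging described above can be sketched as a small per-flow state machine. This is a simplified Python model under my own assumptions, not how Suricata evaluates rules internally: the state names, process_packet, and run_flow are hypothetical, each call to run_flow stands in for one TCP flow, and the optional 01 03 continuation is left out because it never alerts.

```python
def process_packet(flowbits: set, direction: str, payload: bytes):
    """Update one flow's state bits; return an alert name or None."""
    head = payload[:2]
    if len(payload) == 6 and direction == "to_server" and head == b"\x01\x01":
        flowbits.add("setup.stage1")                 # silent marker
    elif len(payload) == 6 and direction == "to_client" and head == b"\x01\x02":
        if "setup.stage1" in flowbits:
            flowbits.add("setup.confirmed")
            return "setup stage control sequence"    # analyst-facing alert
    elif len(payload) == 6 and direction == "to_server" and head == b"\x01\x04":
        flowbits.add("request.stage1")               # silent marker
    elif len(payload) == 6 and direction == "to_client" and head == b"\x01\x05":
        if "request.stage1" in flowbits:
            flowbits.add("request.stage2")           # silent promotion
    elif direction == "to_client" and payload == bytes.fromhex("05020001"):
        if "request.stage2" in flowbits:
            return "post setup request stage SOCKS sequence"
    return None


def run_flow(packets):
    """Feed (direction, payload) pairs through one flow and collect alerts."""
    flowbits, alerts = set(), []
    for direction, payload in packets:
        alert = process_packet(flowbits, direction, payload)
        if alert:
            alerts.append(alert)
    return alerts


setup = [("to_server", bytes.fromhex("010100000000")),
         ("to_client", bytes.fromhex("010200000000"))]
request = [("to_server", bytes.fromhex("010400000000")),
           ("to_client", bytes.fromhex("010500000000")),
           ("to_client", bytes.fromhex("05020001"))]

print(run_flow(setup))
print(run_flow(request))
print(run_flow(request[2:]))  # SOCKS bytes alone, with no prior stages
```

The last call produces no alert, which is the behavior the silent marker stages are meant to enforce: the SOCKS bytes only count once the earlier records have been seen on the same flow.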

Sample PCAP set

The PCAPs used for this work are here:

Current curated samples:

  • ew-test-01 — captures the initial setup sequence and one successful tunnel test
  • ew-test-02 — captures the same setup sequence, then multiple uses of the tunnel
  • ew-test-03 — captures activity after setup is already complete
  • ew-test-04 — captures what a reconnect might look like with just a handshake and no transmission
  • ew-test-05 — captures delayed transmission after connect behavior
  • ew-test-06 — captures whether follow-on streams behave consistently after setup is complete

Why the PCAP set matters

The goal of the sample set was to avoid overfitting to a single capture. Instead, the PCAPs were arranged to test several conditions:

  • clean first-time setup
  • repeated tunnel usage after setup
  • captures that begin after setup is already complete
  • reconnect behavior with minimal traffic
  • delayed follow-on activity
  • consistency across later request streams

That structure made it possible to test whether the staged detection logic held up across more than one traffic shape.

Closing note

I hope this context is useful for anyone reviewing the rules or the writeup. The goal here was to stay close to observed traffic, anchor the detection logic in public protocol documentation, and keep the alerting behavior aligned with the actual communication stages by alerting on confirmed second- or third-step rule matches.

Group 1: Setup Stage Control Sequence

Logic

  • 01 01 sets the first setup stage bit
  • 01 02 checks that bit and alerts
  • 01 03 is optional continuation/corroboration

alert tcp any any -> any any (msg:"EARTHWORM setup stage marker 01 01"; flow:established,to_server; content:"|01 01|"; depth:2; dsize:6; flowbits:set,ew.setup.stage1; flowbits:noalert; classtype:trojan-activity; sid:9906001; rev:4;)

alert tcp any any -> any any (msg:"EARTHWORM Setup Stage Control Sequence"; flow:established,to_client; flowbits:isset,ew.setup.stage1; content:"|01 02|"; depth:2; dsize:6; flowbits:set,ew.setup.confirmed; classtype:trojan-activity; metadata:confidence high, deployment Perimeter, affected_product Any; sid:9906002; rev:4;)

alert tcp any any -> any any (msg:"EARTHWORM setup stage continuation 01 03"; flow:established,to_client; flowbits:isset,ew.setup.confirmed; content:"|01 03|"; depth:2; dsize:6; flowbits:noalert; classtype:trojan-activity; sid:9906003; rev:4;)

Group 2: Post-Setup Request Stage SOCKS Sequence

Logic

  • 01 04 sets the first request stage bit
  • 01 05 checks that bit and promotes the request-stage state
  • 05 02 00 01 checks the promoted state and alerts

alert tcp any any -> any any (msg:"EARTHWORM request stage marker 01 04"; flow:established,to_server; content:"|01 04|"; depth:2; dsize:6; flowbits:set,ew.request.stage1; flowbits:noalert; classtype:trojan-activity; sid:9906004; rev:3;)

alert tcp any any -> any any (msg:"EARTHWORM request stage marker 01 05"; flow:established,to_client; flowbits:isset,ew.request.stage1; content:"|01 05|"; depth:2; dsize:6; flowbits:set,ew.request.stage2; flowbits:noalert; classtype:trojan-activity; sid:9906005; rev:3;)

alert tcp any any -> any any (msg:"EARTHWORM Post Setup Request Stage SOCKS Sequence"; flow:established,to_client; flowbits:isset,ew.request.stage2; content:"|05 02 00 01|"; depth:4; dsize:4; classtype:trojan-activity; metadata:confidence medium, deployment Perimeter, affected_product Any; sid:9906006; rev:3;)



Hey man, this is awesome research. Just the type of thing I love to see! I’ll work on getting these added to the ruleset as soon as I can. Until then, you have my thanks for your very thorough breakdown.


Hey @Pb-22, just wanted to follow up on this. I’ve made some changes to your rules.

A lot of this is just formatting and arbitrary rules we have for adding things to the ET ruleset.

Here are some slight detection logic changes I’ve made specifically for Suricata rules:

  • Changed the protocol from tcp to tcp-pkt. When I tested all six pcaps together over an extended period, the average ticks for tcp-pkt were lower overall than for tcp in some cases, so I opted to make that change.

  • Replaced every use of depth with startswith.

  • Explicitly set fast_pattern in each rule. Suricata will set the fast_pattern automatically when a rule has only a single content match, but I’ve noticed that setting it explicitly does free up some CPU ticks in some cases.

Anyway, this is what I submitted:

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Initial Setup Request"; flow:established,to_server; flowbits:set,ET.earthworm.setup.stage1; flowbits:noalert; dsize:6; content:"|01 01|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:1; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Server Response"; flow:established,to_client; flowbits:isset,ET.earthworm.setup.stage1; flowbits:set,ET.earthworm.setup.confirmed; dsize:6; content:"|01 02|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:2; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Assign Pool Number Request"; flow:established,to_client; flowbits:isset,ET.earthworm.setup.confirmed; dsize:6; content:"|01 03|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:3; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Tunnel Request"; flow:established,to_server; flowbits:set,ET.earthworm.request.stage1; flowbits:noalert; dsize:6; content:"|01 04|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:4; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Proxy Tunnel Response"; flow:established,to_client; flowbits:isset,ET.earthworm.request.stage1; flowbits:set,ET.earthworm.request.stage2; flowbits:noalert; dsize:6; content:"|01 05|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:5; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Proxy Tunnel Post Setup Request"; flow:established,to_client; flowbits:isset,ET.earthworm.request.stage2; dsize:4; content:"|05 02 00 01|"; fast_pattern; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:6; rev:1;)

Additionally, these rules are going to be shipped in the ruleset as disabled, because they didn’t perform very well at all in testing against our QA pcaps. That is kind of understandable, as a two-byte content match isn’t much for a rule’s fast_pattern to work with. It resulted in a lot of “checks”, but not very many false positives. So again, we are shipping the rules, but because of the poor performance they ship disabled, and users will need to re-enable them manually using their rule management tools.
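
For a feel of why such a short pattern generates so many checks, the Python sketch below counts how often a two-byte sequence shows up in arbitrary data compared with a six-byte one. It is only a rough illustration under loud assumptions: the payloads are random bytes rather than real traffic, the payload size and count are arbitrary, and it does not model how Suricata's pattern prefilter treats startswith or dsize.

```python
import random


def count_payloads_containing(payloads, pattern: bytes) -> int:
    """Count payloads in which the byte pattern appears at any offset."""
    return sum(1 for payload in payloads if pattern in payload)


rng = random.Random(0)  # fixed seed so the run is repeatable
payloads = [rng.getrandbits(8 * 1400).to_bytes(1400, "big") for _ in range(2000)]

short_hits = count_payloads_containing(payloads, bytes.fromhex("0101"))
long_hits = count_payloads_containing(payloads, bytes.fromhex("010100000000"))
print("payloads containing 01 01:", short_hits)
print("payloads containing 01 01 00 00 00 00:", long_hits)
```

The two-byte sequence turns up by chance far more often than the six-byte one, so a rule keyed on it has to be looked at more often even when it ultimately does not alert.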

Once again, I would like to thank you for your research, and for the time taken to produce both the rules AND pcaps for the traffic. It is greatly appreciated and we hope to see more of your work in the future.

Thanks,

-Tony


Hi @trobinson667,

Thanks very much, I really appreciate the time you put into the review and submission.

I’m glad to see the signatures made it into the ET ruleset, and your notes were genuinely helpful. The changes to tcp-pkt, startswith, and the explicit fast_pattern all make sense, and I appreciate you walking through the reasoning behind them.

I was definitely hoping the rules might ship enabled, but I understand the performance concern with such short content matches. The short matches were intentional on my side because I was trying to cover both the more regular EarthWorm traffic and the Pygmy Goat style without overfitting too tightly to one variant.

Even though these did not quite meet the bar for enabled deployment, I’m happy they were included in a disabled state so defenders can still test them and enable them where they make sense.

One question: would it be an even bigger performance hit to keep the full 6-byte structure by using a second content match or a byte_test for the Pygmy Goat style? I could also split that logic and make a separate rule with trailing zeroes for the more regular EarthWorm style.

Best,
Pb-22

If you show me the rules you want to make, I can performance test them to see if they pass muster. Sound good?

Hi @trobinson667,

Thanks for the offer to test new rules.

After your last note, I split the work into two paths:

  1. a tighter vanilla EarthWorm set using the full 6 byte records
  2. a separate Pygmy Goat request-stage path for the pool-aware traffic. This ended up requiring Lua scripts to get the job done. I’m not sure if that works for submission purposes, but I thought I might as well share the work anyway.

For the vanilla EarthWorm side, the main change was tightening the rules back to the exact 6-byte control records that were actually showing up in the traffic.

Vanilla EarthWorm set

# EarthWorm candidate ET follow-up set: vanilla-focused
# Based on the ET-reviewed version, tightened by restoring the
# full 6-byte zero-tail records for the exact vanilla EarthWorm path.
#
# Current intent:
# - improve specificity/performance on the plain EarthWorm records
# - preserve the optional 01 03 setup-stage pool assignment indicator
# - keep request-stage detection aligned to the observed six-byte control records

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Initial Setup Request Vanilla"; flow:established,to_server; flowbits:set,ET.earthworm.setup.stage1; flowbits:noalert; dsize:6; content:"|01 01 00 00 00 00|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:9000011; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Server Response Vanilla"; flow:established,to_client; flowbits:isset,ET.earthworm.setup.stage1; flowbits:set,ET.earthworm.setup.confirmed; dsize:6; content:"|01 02 00 00 00 00|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:9000012; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Assign Pool Number Request"; flow:established,to_client; flowbits:isset,ET.earthworm.setup.confirmed; dsize:6; content:"|01 03|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:9000013; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Tunnel Request Vanilla"; flow:established,to_server; flowbits:set,ET.earthworm.request.stage1; flowbits:noalert; dsize:6; content:"|01 04 00 00 00 00|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:9000014; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Proxy Tunnel Response Vanilla"; flow:established,to_client; flowbits:isset,ET.earthworm.request.stage1; flowbits:set,ET.earthworm.request.stage2; flowbits:noalert; dsize:6; content:"|01 05 00 00 00 00|"; fast_pattern; startswith; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:9000015; rev:1;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Proxy Tunnel Post Setup Request Vanilla"; flow:established,to_client; flowbits:isset,ET.earthworm.request.stage2; dsize:4; content:"|05 02 00 01|"; fast_pattern; reference:url,community.emergingthreats.net/t/sig-earthworm-reverse-socks-handshake-and-tunnel-sequence-detection/3314; classtype:trojan-activity; sid:9000016; rev:1;)

That set keeps the setup-stage sequence intact, leaves the optional 01 03 rule matching on its first two bytes, and tightens the other control records back to the exact observed 6-byte zero-tail form.

For the Pygmy Goat side, I ended up building a separate set of request stage PCAPs to test the pool behavior directly.

PCAPs are here:

https://github.com/Pb-22/Network_Detections/tree/main/Pygmy-Goat-Earthworm/PCAPs

Those samples are:

  • ew-test-07-pool-disabled-zero.pcap
    zero-pool baseline, where 01 03, 01 04, and 01 05 all carry 00 00 00 00

  • ew-test-08-pool-enabled-04d2.pcap
    matched pool sample using 0x000004d2 across 01 03, 01 04, and 01 05

  • ew-test-09-pool-enabled-1337.pcap
    matched pool sample using 0x00001337 across 01 03, 01 04, and 01 05

  • ew-test-10-pool-mismatch-04d2-1337.pcap
    negative-control mismatch sample where 01 03 and 01 04 use 0x000004d2, while 01 05 uses 0x00001337

That testing was useful because it made the failure case obvious: if ew-test-10 alerted, the logic was still too broad. ew-test-07 should also not fire on the pool-aware rules, because the zero-pool case is already covered by the vanilla set.

Pygmy Goat testing

For the Pygmy Goat path, the setup stage did not need to change. The difference was in the request stage, where the pool value has to match between 01 04 and 01 05.

I tried to keep this in pure Suricata rule logic first, but I could not get the behavior I wanted on the mismatch sample. The request-stage shape was easy enough to match, but the pool equality piece wasn’t working, so I decided to use Lua helpers.

The pool aware request stage rules I ended up with were:

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Tunnel Request Pool Store"; flow:established,to_server; dsize:6; content:"|01 04|"; startswith; byte_test:4,>,0,2; byte_extract:4,2,ew_req_pool; lua:lua/earthworm_pool_store.lua; xbits:set,ET.earthworm.request.pool.stage1,track ip_pair,expire 300; noalert; classtype:trojan-activity; sid:9000031; rev:3;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Reverse Proxy Tunnel Response Pool Match"; flow:established,to_client; xbits:isset,ET.earthworm.request.pool.stage1,track ip_pair; dsize:6; content:"|01 05|"; startswith; byte_test:4,>,0,2; byte_extract:4,2,ew_resp_pool; lua:lua/earthworm_pool_compare.lua; xbits:set,ET.earthworm.request.pool.stage2,track ip_pair,expire 300; noalert; classtype:trojan-activity; sid:9000032; rev:3;)

alert tcp-pkt any any -> any any (msg:"ET MALWARE EARTHWORM SOCKS Proxy Tunnel Post Setup Request Pool Match"; flow:established,to_client; xbits:isset,ET.earthworm.request.pool.stage2,track ip_pair; dsize:4; content:"|05 02 00 01|"; fast_pattern; classtype:trojan-activity; sid:9000033; rev:1;)

And the Lua helpers were:

earthworm_pool_store.lua

-- Stores the pool value from a 01 04 record in a flowint for later comparison.
local bytevarlib = require("suricata.bytevar")
local flowintlib = require("suricata.flowint")

function init(sig)
    -- Map the rule's byte_extract variable and register the flowint.
    bytevarlib.map(sig, "ew_req_pool")
    flowintlib.register("ew_pool_req04")
    local needs = {}
    needs["payload"] = tostring(true)
    return needs
end

function thread_init()
    ew_req_pool = bytevarlib.get("ew_req_pool")
    ew_pool_req04 = flowintlib.get("ew_pool_req04")
end

function match(args)
    local req_pool = ew_req_pool:value()
    if req_pool ~= nil then
        -- Remember the request pool so the compare script can read it.
        ew_pool_req04:set(req_pool)
        return 1
    end
    return 0
end

earthworm_pool_compare.lua

-- Matches only when the 01 05 pool value equals the pool stored from 01 04.
local bytevarlib = require("suricata.bytevar")
local flowintlib = require("suricata.flowint")

function init(sig)
    -- Map the rule's byte_extract variable and register the shared flowint.
    bytevarlib.map(sig, "ew_resp_pool")
    flowintlib.register("ew_pool_req04")
    local needs = {}
    needs["payload"] = tostring(true)
    return needs
end

function thread_init()
    ew_resp_pool = bytevarlib.get("ew_resp_pool")
    ew_pool_req04 = flowintlib.get("ew_pool_req04")
end

function match(args)
    local resp_pool = ew_resp_pool:value()
    if resp_pool ~= nil then
        local stored = ew_pool_req04:value()
        -- Require a stored request pool and an exact match.
        if stored ~= nil and stored == resp_pool then
            return 1
        end
    end
    return 0
end

That solved the mismatch problem cleanly in testing.

The expected results for that set were:

  • ew-test-07 = no alert
  • ew-test-08 = alert
  • ew-test-09 = alert
  • ew-test-10 = no alert
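
For anyone who wants to reason about the pool logic without running Suricata, here is a minimal Python sketch of the same store-then-compare idea. It is a simplified model under my own assumptions, not a replacement for the rules or the Lua helpers: the function names are hypothetical, the pool is read as a big-endian 32-bit value, and the sequences are synthetic records shaped like the sample descriptions above rather than bytes read from the PCAPs.

```python
import struct

SOCKS_FOLLOW_ON = bytes.fromhex("05020001")


def pool_value(record: bytes) -> int:
    """Read bytes 2..5 as an unsigned big-endian integer, like byte_extract:4,2."""
    return struct.unpack(">I", record[2:6])[0]


def pool_path_alerts(packets) -> bool:
    """Return True if a request-stage sequence reaches the final alert."""
    stored_pool = None  # stands in for the value saved by the store script
    stage2 = False
    for direction, payload in packets:
        is_control = len(payload) == 6
        if is_control and direction == "to_server" and payload[:2] == b"\x01\x04":
            if pool_value(payload) > 0:  # mirrors byte_test:4,>,0,2
                stored_pool = pool_value(payload)
        elif is_control and direction == "to_client" and payload[:2] == b"\x01\x05":
            pool = pool_value(payload)
            if stored_pool is not None and pool > 0 and pool == stored_pool:
                stage2 = True
        elif direction == "to_client" and payload == SOCKS_FOLLOW_ON and stage2:
            return True
    return False


def request_sequence(req_pool: int, resp_pool: int):
    """Build a synthetic 01 04 / 01 05 / SOCKS sequence for testing."""
    return [
        ("to_server", b"\x01\x04" + struct.pack(">I", req_pool)),
        ("to_client", b"\x01\x05" + struct.pack(">I", resp_pool)),
        ("to_client", SOCKS_FOLLOW_ON),
    ]


cases = [("07", 0, 0), ("08", 0x04D2, 0x04D2), ("09", 0x1337, 0x1337), ("10", 0x04D2, 0x1337)]
for name, req_pool, resp_pool in cases:
    print(name, pool_path_alerts(request_sequence(req_pool, resp_pool)))
```

In this model the zero-pool and mismatch cases stay quiet while the two matched-pool cases alert, which lines up with the expected results listed above.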

Best,
Pb-22

Going to let this marinate and move on to the next one