EarthWorm Research Lineage, Protocol Grounding, and Sample PCAP Set
Hi all, I wanted to share some background on how this research thread came together, the protocol grounding behind the detections, and the sample PCAP set used to build and test the current rules.
How this research path started
This work began with reporting on the exploitation of a Palo Alto Networks zero-day in a campaign showing hallmarks of APT-linked activity. One detail that stood out in that reporting was the use of EarthWorm as a tunneling utility during post-compromise operations.
That article led directly into a deeper look at EarthWorm as a detection target.
Why EarthWorm stood out
EarthWorm is interesting not only because of its tunneling function, but also because public reporting has associated it with multiple APT-linked intrusion sets.
Groups I found associated with EarthWorm in this research path include:
- Volt Typhoon
- APT41
Background on the tool
EarthWorm was previously offered as an open-source tunneling tool and is publicly described as a reverse SOCKS-style utility.
Useful references:
- UK NCSC: Malware Analysis Report: Pygmy Goat (PDF)
- EarthWorm project page: rootkiter EarthWorm
At this point, the original distribution path appears to have effectively disappeared, likely due in part to the tool’s repeated use by APT groups. I do not have a definitive public statement confirming the exact reason, so I would treat that as a reasonable inference rather than a confirmed explanation.
How that turned into the lab work
Because EarthWorm had clear operational relevance but was no longer conveniently available through its original distribution path, I located a mirror copy of the Linux binary and used it in a controlled local lab to generate repeatable packet captures for detection development.
To be clear, these PCAPs were lab-generated by running EarthWorm itself and capturing the resulting traffic across several scenarios. They are not victim-environment captures or third-party passive collections.
That step made it possible to validate the packet structure and directionality directly on the wire rather than relying only on reporting or screenshots.
Public protocol grounding used for the detections
The current rule logic is grounded in the UK National Cyber Security Centre (NCSC), Malware Analysis Report: Pygmy Goat, especially pages 18-20, which map the EarthWorm handshake and tunnel sequence as:
- 01 01 = client request
- 01 02 = server response
- 01 03 = assign pool number
- 01 04 = tunnel request
- 01 05 = tunnel response
Based on the clean lab PCAPs, the observed directionality is:
- 01 01 = client → server
- 01 02 / 01 03 = server → client
- 01 04 = client → server
- 01 05 / 05 02 00 01 = server → client
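As a rough illustration, the mapping and directionality above can be written down as a small lookup table. This is only a sketch: the names CONTROL_RECORDS and classify are made up for this post, the direction labels borrow Suricata's to_server / to_client wording, and the trailing four bytes in the usage lines are zero placeholders rather than observed values.

```python
# Leading two bytes of each six-byte control record -> (meaning, expected direction).
# Meanings follow the Pygmy Goat report mapping; directions follow the lab PCAPs.
# CONTROL_RECORDS and classify are hypothetical names used only for illustration.
CONTROL_RECORDS = {
    b"\x01\x01": ("client request", "to_server"),
    b"\x01\x02": ("server response", "to_client"),
    b"\x01\x03": ("assign pool number", "to_client"),
    b"\x01\x04": ("tunnel request", "to_server"),
    b"\x01\x05": ("tunnel response", "to_client"),
}


def classify(payload: bytes, direction: str):
    """Return the record meaning when a payload looks like a control record
    sent in the expected direction, otherwise None."""
    if len(payload) != 6:  # control records are six bytes long
        return None
    entry = CONTROL_RECORDS.get(payload[:2])
    if entry is None or entry[1] != direction:
        return None
    return entry[0]


# Placeholder trailing bytes (zeros) stand in for the real record contents.
print(classify(b"\x01\x01" + b"\x00" * 4, "to_server"))  # client request
print(classify(b"\x01\x02" + b"\x00" * 4, "to_server"))  # None (wrong direction)
```

The follow-on 05 02 00 01 bytes are left out of the table on purpose, since they are a four-byte payload rather than a six-byte control record.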
Note on pool number handling and rule construction
The Pygmy Goat report documents an EarthWorm-related implementation in which a pool number is assigned during the handshake sequence. In that protocol mapping, the handshake and tunnel records are still six bytes long, but the later bytes can carry a pool-specific value.
My lab-generated test set did not model that pool-assignment behavior because it was built using a more standard EarthWorm binary rather than the exact implementation described in the Pygmy Goat report.
Even so, the packet structure was still useful for detection development because both the public writeup and the lab PCAPs showed the same six-byte record size for these control messages.
Because of that, I built the rules to match only the leading bytes that stay consistent across both cases, while also requiring dsize: 6 to anchor the match to the expected control-record format and reduce noise.
Concretely, the control-record rules match the first two bytes with depth: 2 and dsize: 6. For the follow-on SOCKS pattern, the rule matches 05 02 00 01 with depth: 4 and dsize: 4.
The detection logic is therefore intentionally keyed to the stable handshake and tunnel byte patterns at the start of the record, and relies on staged state tracking and payload-size constraints to avoid broad or noisy matching.
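To make the depth and dsize reasoning concrete, here is a minimal Python approximation of how those two constraints combine for a single TCP payload. The function name content_match is hypothetical, the pool-specific trailing bytes are invented for the example, and this is not how Suricata implements matching internally; it only mirrors the intent of the rules.

```python
def content_match(payload: bytes, pattern: bytes, depth: int, dsize: int) -> bool:
    """Approximate content + depth + dsize for one packet payload.

    dsize: the payload must be exactly `dsize` bytes long.
    depth: the pattern must be found within the first `depth` bytes.
    """
    if len(payload) != dsize:
        return False
    return pattern in payload[:depth]


# A six-byte record whose trailing bytes carry an invented pool value still matches,
# because only the two leading bytes are inspected.
print(content_match(b"\x01\x03\x00\x00\x00\x07", b"\x01\x03", depth=2, dsize=6))  # True

# Same leading bytes in a seven-byte payload are rejected by the size constraint.
print(content_match(b"\x01\x03" + b"\x00" * 5, b"\x01\x03", depth=2, dsize=6))  # False

# The follow-on SOCKS pattern is a four-byte payload matched in full.
print(content_match(b"\x05\x02\x00\x01", b"\x05\x02\x00\x01", depth=4, dsize=4))  # True
```

Because depth equals the pattern length in both cases, the check reduces to "payload starts with these bytes and has exactly this length", which is what keeps the match narrow.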
Detection design notes
The current Suricata approach intentionally separates the traffic into two logical groups:
- Setup-stage control sequence
- Post-setup request-stage SOCKS sequence
These are designed as stateful staged detections, not as flat one-packet signatures.
Important implementation note
For anyone reviewing or adapting the rules, the early-stage marker rules are intentionally silent state-setting rules. Their purpose is to carry state forward to a later, higher-confidence alert condition.
In practical terms:
- the setup-stage group uses flowbits
- the request-stage group also uses flowbits
- the marker stages use flowbits:noalert intentionally
- that silent behavior is part of the design and should be preserved
The goal is for the analyst-facing alert to fire on the stronger confirmation point, rather than on the earliest staging byte pattern alone.
Setup-stage logic
The setup-stage sequence is tracked with flowbits because it occurs on a single long-lived control connection.
The 01 03 continuation is treated as optional corroboration, not as a required alert condition.
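The setup-stage behavior can be sketched as a small per-flow tracker. This is an illustrative model only, not Suricata's flowbits implementation: the class name SetupStageTracker and the flow tuple are hypothetical, and the trailing record bytes are zero placeholders. The flowbit names and the alert message are taken from the rules in this post.

```python
from collections import defaultdict


class SetupStageTracker:
    """Toy model of the setup-stage group: state is kept per flow, the 01 01
    marker is silent, and only the 01 02 response raises an alert."""

    def __init__(self):
        # flow key -> set of flowbit names currently set on that flow
        self.flowbits = defaultdict(set)

    def observe(self, flow, direction, payload):
        """Return an alert message for this packet, or None."""
        if len(payload) != 6:  # mirrors dsize:6
            return None
        bits = self.flowbits[flow]
        head = payload[:2]  # mirrors depth:2
        if direction == "to_server" and head == b"\x01\x01":
            bits.add("ew.setup.stage1")  # silent marker (flowbits:noalert)
            return None
        if (direction == "to_client" and head == b"\x01\x02"
                and "ew.setup.stage1" in bits):
            bits.add("ew.setup.confirmed")
            return "EARTHWORM Setup Stage Control Sequence"
        # 01 03 is optional corroboration and never alerts in this sketch.
        return None


tracker = SetupStageTracker()
flow_a = ("10.0.0.5", 40000, "10.0.0.9", 8888)  # hypothetical flow tuple
flow_b = ("10.0.0.6", 40001, "10.0.0.9", 8888)  # hypothetical flow tuple

print(tracker.observe(flow_a, "to_server", b"\x01\x01" + b"\x00" * 4))  # None
print(tracker.observe(flow_a, "to_client", b"\x01\x02" + b"\x00" * 4))  # alert message
print(tracker.observe(flow_b, "to_client", b"\x01\x02" + b"\x00" * 4))  # None
```

The last line shows why per-flow state matters: an 01 02 record on a connection that never carried the 01 01 marker stays quiet.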
Request-stage logic
The post-setup request-stage sequence is tracked separately to catch the tunnel request / response progression and the follow-on SOCKS negotiation pattern.
This second group is also built with flowbits, using a staged progression from 01 04 to 01 05 before alerting on the SOCKS negotiation bytes.
By combining two- and three-stage flowbits with the byte and size restrictions, I tried to keep the detections as quiet as possible, so that only the more compelling sequence-based alerts surface.
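The three-stage progression can be sketched in Python as a walk over the ordered payloads of one TCP flow. The function name request_stage_alerts and the packet tuples are hypothetical, and the trailing bytes of the six-byte records are zero placeholders. It assumes, as the rules do, that all three stages are seen on the same flow.

```python
def request_stage_alerts(packets):
    """packets: ordered (direction, payload) tuples from ONE TCP flow.
    Returns the indexes of packets that would raise the request-stage alert."""
    stage1 = False  # stands in for flowbit ew.request.stage1
    stage2 = False  # stands in for flowbit ew.request.stage2
    alerts = []
    for index, (direction, payload) in enumerate(packets):
        is_control = len(payload) == 6  # mirrors dsize:6
        if direction == "to_server" and is_control and payload[:2] == b"\x01\x04":
            stage1 = True  # silent marker
        elif (direction == "to_client" and stage1 and is_control
                and payload[:2] == b"\x01\x05"):
            stage2 = True  # silent marker, promotes the state
        elif (direction == "to_client" and stage2
                and payload == b"\x05\x02\x00\x01"):  # mirrors depth:4 + dsize:4
            alerts.append(index)
    return alerts


full_sequence = [
    ("to_server", b"\x01\x04" + b"\x00" * 4),
    ("to_client", b"\x01\x05" + b"\x00" * 4),
    ("to_client", b"\x05\x02\x00\x01"),
]
print(request_stage_alerts(full_sequence))      # [2]
print(request_stage_alerts(full_sequence[1:]))  # [] (01 04 marker never seen)
```

The second call illustrates the trade-off of staged matching: if the earlier markers are not observed on the flow, the SOCKS bytes alone do not alert.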
Sample PCAP set
The PCAPs used for this work are in the Pb-22/Network_Detections repository on GitHub, under Earthworm/PCAPs (listed in the References).
Current curated samples:
- ew-test-01 — captures the initial setup sequence and one successful tunnel test
- ew-test-02 — captures the same setup sequence, then multiple uses of the tunnel
- ew-test-03 — captures activity after setup is already complete
- ew-test-04 — captures what a reconnect might look like with just a handshake and no transmission
- ew-test-05 — captures delayed transmission after connect behavior
- ew-test-06 — captures whether follow-on streams behave consistently after setup is complete
Why the PCAP set matters
The goal of the sample set was to avoid overfitting to a single capture. Instead, the PCAPs were arranged to test several conditions:
- clean first-time setup
- repeated tunnel usage after setup
- captures that begin after setup is already complete
- reconnect behavior with minimal traffic
- delayed follow-on activity
- consistency across later request streams
That structure made it possible to test whether the staged detection logic held up across more than one traffic shape.
Closing note
I hope this context is useful for anyone reviewing the rules or the writeup. The goal here was to stay close to observed traffic, anchor the detection logic in public protocol documentation, and keep the alerting behavior aligned with the actual communication stages by alerting only on confirmed second- or third-stage rule matches.
Group 1: Setup Stage Control Sequence
Logic
- 01 01 sets the first setup-stage bit
- 01 02 checks that bit and alerts
- 01 03 is optional continuation/corroboration
alert tcp any any -> any any (msg:"EARTHWORM setup stage marker 01 01"; flow:established,to_server; content:"|01 01|"; depth:2; dsize:6; flowbits:set,ew.setup.stage1; flowbits:noalert; classtype:trojan-activity; sid:9906001; rev:4;)
alert tcp any any -> any any (msg:"EARTHWORM Setup Stage Control Sequence"; flow:established,to_client; flowbits:isset,ew.setup.stage1; content:"|01 02|"; depth:2; dsize:6; flowbits:set,ew.setup.confirmed; classtype:trojan-activity; metadata:confidence high, deployment Perimeter, affected_product Any; sid:9906002; rev:4;)
alert tcp any any -> any any (msg:"EARTHWORM setup stage continuation 01 03"; flow:established,to_client; flowbits:isset,ew.setup.confirmed; content:"|01 03|"; depth:2; dsize:6; flowbits:noalert; classtype:trojan-activity; sid:9906003; rev:4;)
Group 2: Post-Setup Request Stage SOCKS Sequence
Logic
- 01 04 sets the first request-stage bit
- 01 05 checks that bit and promotes the request-stage state
- 05 02 00 01 checks the promoted state and alerts
alert tcp any any -> any any (msg:"EARTHWORM request stage marker 01 04"; flow:established,to_server; content:"|01 04|"; depth:2; dsize:6; flowbits:set,ew.request.stage1; flowbits:noalert; classtype:trojan-activity; sid:9906004; rev:3;)
alert tcp any any -> any any (msg:"EARTHWORM request stage marker 01 05"; flow:established,to_client; flowbits:isset,ew.request.stage1; content:"|01 05|"; depth:2; dsize:6; flowbits:set,ew.request.stage2; flowbits:noalert; classtype:trojan-activity; sid:9906005; rev:3;)
alert tcp any any -> any any (msg:"EARTHWORM Post Setup Request Stage SOCKS Sequence"; flow:established,to_client; flowbits:isset,ew.request.stage2; content:"|05 02 00 01|"; depth:4; dsize:4; classtype:trojan-activity; metadata:confidence medium, deployment Perimeter, affected_product Any; sid:9906006; rev:3;)
References
- SecurityWeek: Palo Alto Zero-Day Exploited in Campaign Bearing Hallmarks of Chinese State Hacking
- UK NCSC: Malware Analysis Report: Pygmy Goat (PDF)
- EarthWorm project page: rootkiter EarthWorm
- Test PCAPs: Pb-22/Network_Detections on GitHub, Earthworm/PCAPs (main branch)