ESET researchers have uncovered another tool in the already extensive arsenal of the SparklingGoblin APT group: a Linux variant of the SideWalk backdoor
ESET researchers have discovered a Linux variant of the SideWalk backdoor, one of the multiple custom implants used by the SparklingGoblin APT group. This variant was deployed against a Hong Kong university in February 2021, the same university that had already been targeted by SparklingGoblin during the student protests in May 2020. We originally named this backdoor StageClient, but now refer to it simply as SideWalk Linux. We also discovered that a previously known Linux backdoor – the Specter RAT, first documented by 360 Netlab – is also actually a SideWalk Linux variant, having multiple commonalities with the samples we identified.
SparklingGoblin is an APT group whose tactics, techniques, and procedures (TTPs) partially overlap with APT41 and BARIUM. It makes use of Motnug and ChaCha20-based loaders, the CROSSWALK and SideWalk backdoors, along with Korplug (aka PlugX) and Cobalt Strike. While the group targets mostly East and Southeast Asia, we have also seen SparklingGoblin targeting a broad range of organizations and verticals around the world, with a particular focus on the academic sector. SparklingGoblin is one of the groups with access to the ShadowPad backdoor.
This blogpost documents SideWalk Linux, its victimology, and its numerous similarities with the originally discovered SideWalk backdoor.
Attribution
The SideWalk backdoor is exclusive to SparklingGoblin. In addition to the multiple code similarities between the Linux variants of SideWalk and various SparklingGoblin tools, one of the SideWalk Linux samples uses a C&C address (66.42.103[.]222) that was previously used by SparklingGoblin.
Considering all of these factors, we attribute with high confidence SideWalk Linux to the SparklingGoblin APT group.
Victimology
Even though there are various SideWalk Linux samples, as we now know them, on VirusTotal, in our telemetry we have found only one victim compromised with this SideWalk variant: a Hong Kong university that, amidst student protests, had previously been targeted by both SparklingGoblin (using the Motnug loader and the CROSSWALK backdoor) and Fishmonger (using the ShadowPad and Spyder backdoors). Note that at that time we put those two different clusters of activity under the broader Winnti Group denomination.
SparklingGoblin first compromised this particular university in May 2020, and we first detected the Linux variant of SideWalk in that university’s network in February 2021. The group continuously targeted this organization over a long period of time, successfully compromising multiple key servers, including a print server, an email server, and a server used to manage student schedules and course registrations.
The road to Sidewalk Linux
SideWalk, which we first described in its Windows form in our blogpost on August 24th, 2021, is a multipurpose backdoor that can load additional modules sent from the C&C server. It makes use of Google Docs as a dead-drop resolver, and Cloudflare workers as its C&C server. It can properly handle communication behind a proxy.
The compromise chain is currently unknown, but we think that the initial attack vector could have been exploitation. This hypothesis is based on the 360 Netlab article describing the Specter botnet targeting IP cameras, and NVR and DVR devices, and the fact that the Hong Kong victim used a vulnerable WordPress server, since there were many attempts to install various webshells.
We first documented the Linux variant of SideWalk as StageClient on July 2nd, 2021, without making the connection at that time to SparklingGoblin and its custom SideWalk backdoor. The original name was used because of the repeated appearances of the string StageClient in the code.
While researching StageClient further, we found a blogpost about the Specter botnet described by 360 Netlab. That blogpost describes a modular Linux backdoor with flexible configuration that uses a ChaCha20 encryption variant – basically a subset of StageClient’s functionality. Further inspection confirmed this hypothesis; we additionally found a huge overlap in functionality, infrastructure, and symbols present in all the binaries.
We compared the StageClient sample E5E6E100876E652189E7D25FFCF06DE959093433 with Specter samples 7DF0BE2774B17F672B96860D013A933E97862E6C and found numerous similarities, some of which we list below.
First, there is an overlap in C&C commands. Next, the samples have the same structure of configuration and encryption method (see Figure 1 and Figure 2).
Additionally, the samples’ modules are managed in almost the same way, and the majority of the interfaces are identical; modules of StageClient only need to implement one additional handler, which is for closing the module. Three out of the five known modules are almost identical.
Lastly, we could see striking overlaps in the network protocols of the compared samples. A variant of ChaCha20 is used twice for encryption with LZ4 compression in the very same way. Both StageClient and Specter create a number of threads (see Figure 3 and Figure 4) to manage sending and receiving asynchronous messages along with heartbeats.
Despite all these striking similarities, there are several changes. The most notable ones are the following:
- The authors switched from the C language to C++. The reason is unknown, but it should be easier to implement such modular architecture in C++ due to its polymorphism support.
- An option to exchange messages over HTTP was added (see Figure 5 and Figure 6).
- Downloadable plugins were replaced with precompiled modules that fulfill the same purpose; a number of new commands and two new modules were added.
- Added the module TaskSchedulerMod, which operates as a built-in cron utility. Its cron table is stored in memory; the jobs are received over the network and executed as shell commands.
- Added the module SysInfoMgr, which provides information about the underlying system such as the list of installed packages and hardware details.
These similarities convince us that Specter and StageClient are from the same malware family. However, considering the numerous code overlaps between the StageClient variant used against the Hong Kong university in February 2021 and SideWalk for Windows, as described in the next section, we now believe that Specter and StageClient are both Linux variants of SideWalk, so we have decided to refer to them as SideWalk Linux.
Similarities with the Windows variant
SideWalk Windows and SideWalk Linux share too many similarities to describe within the confines of this blogpost, so here we only cover the most striking ones.
ChaCha20
An obvious similarity is noticeable in the implementations of ChaCha20 encryption: both variants use a counter with an initial value of 0x0B, which was previously mentioned in our blogpost as a specificity of SideWalk’s ChaCha20 implementation.
Software architecture
One SideWalk particularity is the use of multiple threads to execute one specific task. We noticed that in both variants there are exactly five threads executed simultaneously, each of them having a specific task. The following list describes the function of each; the thread names are from the code:
- StageClient::ThreadNetworkReverse
If a connection to the C&C server is not already established, this thread periodically attempts to retrieve the local proxy configuration and the C&C server location from the dead-drop resolver. If the previous step was successful, it attempts to initiate a connection to the C&C server. - StageClient::ThreadHeartDetect
If the backdoor did not receive a command in the specified amount of time, this thread can terminate the connection with the C&C server or switch to a “nap” mode that introduces minor changes to the behavior. - StageClient::ThreadPollingDriven
If there is no other queued data to send, this thread periodically sends a heartbeat command to the C&C server that can additionally contain the current time. - StageClient::ThreadBizMsgSend
This thread periodically checks whether there is data to be sent in the message queues used by all the other threads and, if so, processes it. - StageClient::ThreadBizMsgHandler
This thread periodically checks whether there are any pending messages received from the C&C server and, if so, handles them.
Configuration
As in SideWalk Windows, the configuration is decrypted using ChaCha20.
Checksum
First, before decrypting, there is a data integrity check. This check is similar in both implementations of SideWalk (see Figure 7 and Figure 8): an MD5 hash is computed on the ChaCha20 nonce concatenated to the encrypted configuration data. This hash is then checked against a predefined value, and if not equal, SideWalk exits.
Layout
Figure 9 presents excerpts of decrypted configurations from the samples that we analyzed.
The SideWalk Linux config contains less information than the SideWalk Windows one. This makes sense because the majority of the configuration artifacts in SideWalk Windows are used as cryptography and network parameters, whereas most of these are internal in SideWalk Linux.
Decryption using ChaCha20
As previously mentioned, SideWalk uses a main global structure to store its configuration. This configuration is first decrypted using the modified implementation of ChaCha20, as seen in Figure 10.
Figure 10. ChaCha20 decryption call in SideWalk Windows (left) and in SideWalk Linux (right)
Note that the ChaCha20 key is exactly the same in both variants, strengthening the connection between the two.
Dead-drop resolver
The dead-drop resolver payload is identical in both samples. As a reminder from our blogpost on SideWalk, Figure 11 depicts the format of the payload that is fetched from the dead-drop resolver.
For the first delimiter, we notice that the PublicKey: part of the string is ignored; the string AE68[…]3EFF is directly searched, as shown in Figure 12.
Figure 12. SideWalk Linux’s first delimiter routine (left), end delimiter and middle delimiter routines (right)
The delimiters are identical, as well as the whole decoding algorithm.
Victim fingerprinting
In order to fingerprint the victim, different artifacts are gathered on the victim’s machine. We noticed that the fetched information is exactly the same, to the extent of it even being fetched in the same order.
As the boot time in either case is a Windows-compliant time format, we can hypothesize that the operators’ controller runs under Windows, and that the controller is the same for both Linux and Windows victims. Another argument supporting this hypothesis is that the ChaCha20 keys used in both implementations of SideWalk are the same.
Communication protocol
Data serialization
The communication protocol between the infected machine and the C&C is HTTP or HTTPS, depending on the configuration, but in both cases, the data is serialized in the same manner. Not only is the implementation very similar, but the identical encryption key is used in both implementations, which, again, accentuates the similarity between the two variants.
POST requests
In the POST requests used by SideWalk to fetch commands and payloads from the C&C server, one noticeable point is the use of the two parameters gtsid and gtuvid, as seen in Figure 13. Identical parameters are used in the Linux variant.
POST /M26RcKtVr5WniDVZ/5CDpKo5zmAYbTmFl HTTP/1.1 Cache-Control: no-cache Connection: close Pragma: no-cache User-Agent: Mozilla/5.0 Chrome/72.0.3626.109 Safari/537.36 gtsid: zn3isN2C6bWsqYvO gtuvid: 7651E459979F931D39EDC12D68384C21249A8DE265F3A925F6E289A2467BC47D Content-Length: 120 Host: update.facebookint.workers[.]dev |
Figure 13. Example of a POST request used by SideWalk Windows
Another interesting point is that the Windows variant runs as fully position-independent shellcode, whereas the Linux variant is a shared library. However, we think the malware’s authors could have just taken an extra step, using a tool such as sRDI to convert a compiled SideWalk PE to shellcode instead of manually writing the shellcode.
Commands
Only four commands are not implemented or implemented differently in the Linux variant, as listed in Table 1. All the other commands are present – even with the same IDs.
Table 1. Commands with different or missing implementation in the Linux version of SideWalk
Command ID (from C&C) | Windows variants | Linux variants |
---|---|---|
0x7C | Load a plugin sent by the C&C server. | Not implemented in SideWalk Linux. |
0x82 | Collect domain information about running processes, and owners (owner SID, account name, process name, domain information). | Do nothing. |
0x8C | Data serialization function. | Commands that are not handled, but fall in the default case, which is broadcasting a message to all the loaded modules. |
0x8E | Write the received data to the file located at %AllUsersProfile%UTXPnat<filename>, where <filename> is a hash of the value returned by VirtualAlloc at each execution of the malware. |
Versioning
In the Linux variant, we observed a specificity that was not found in the Windows variant: a version number is computed (see Figure 14).
The hardcoded date could be the beginning or end of development of this version of SideWalk Linux. The final computation is made out of the year, day, and month, from the value Oct 26 2020. In this case, the result is 1171798691840.
Plugins
In SideWalk Linux variants, modules are built in; they cannot be fetched from the C&C server. That is a notable difference from the Windows variant. Some of those built-in functionalities, like gathering system information (SysInfoMgr, for example) such as network configuration, are done directly by dedicated functions in the Windows variant. In the Windows variant, some plugins can be added through C&C communication.
Defense evasion
The Windows variant of SideWalk goes to great lengths to conceal the objectives of its code. It trimmed out all data and code that was unnecessary for its execution and encrypted the rest. On the other hand, the Linux variants contain symbols and leave some unique authentication keys and other artifacts unencrypted, which makes the detection and analysis significantly easier.
Additionally, the much higher number of inlined functions in the Windows variant suggests that its code was compiled with a higher level of compiler optimizations.
Conclusion
The backdoor that was used to attack a Hong Kong university in February 2021 is the same malware family as the SideWalk backdoor, and actually is a Linux variant of the backdoor. This Linux version exhibits several similarities with its Windows counterpart along with various novelties.
ESET Research now also offers private APT intelligence reports and data feeds. For any inquiries about this service, visit the ESET Threat Intelligence page.
IoCs
A comprehensive list of Indicators of Compromise and samples can be found in our GitHub repository.
SHA-1 | Filename | ESET detection name | Description |
---|---|---|---|
FA6A40D3FC5CD4D975A01E298179A0B36AA02D4E | ssh_tunnel1_0 | Linux/SideWalk.L | SideWalk Linux (StageClient variant) |
7DF0BE2774B17F672B96860D013A933E97862E6C | hw_ex_watchdog.exe | Linux/SideWalk.B | SideWalk Linux (Specter variant) |
Network
Domain | IP | First seen | Notes |
---|---|---|---|
rec.micosoft[.]ga | 172.67.8[.]59 | SideWalk C&C server (StageClient variant) | |
66.42.103[.]222 | SideWalk C&C server (Specter variant from 360 Netlab’s blogpost) |
MITRE ATT&CK techniques
This table was built using version 11 of the MITRE ATT&CK framework.
Tactic | ID | Name | Description |
---|---|---|---|
Resource Development | T1587.001 | Develop Capabilities: Malware | SparklingGoblin uses its own malware arsenal. |
Discovery | T1016 | System Network Configuration Discovery | SideWalk Linux has the ability to find the network configuration of the compromised machine, including the proxy configuration. |
Command and Control | T1071.001 | Application Layer Protocol: Web Protocols | SideWalk Linux communicates via HTTPS with the C&C server. |
T1573.001 | Encrypted Channel: Symmetric Cryptography | SideWalk Linux uses ChaCha20 to encrypt communication data. |