Replicating Cobalt Strike's Port Scanner BOF for Open-Source C2 | Fast OPSEC-Aware Ping & TCP Connect Scanning in C

January 24, 2025

TL;DR

I was annoyed that open-source C2 tools had inferior port scanners compared to Cobalt Strike, so I decided to replicate the functionality. I also wrote a ping scanner to help with host discovery. OPSEC may still require some improvements, as described throughout the blog, but the functionality is there.
The tool is available here: https://github.com/fyxme/portscanbof

On my quest to review random C2s for pure self-enjoyment, I’ve come to realise that very few implement port scanning, for whatever reason.

While I understand that it’s possible to use a SOCKS proxy and run nmap over it with a TCP connect scan (-sT flag), that approach feels less convenient, generates a lot of unwanted network traffic, requires the beacon to communicate constantly (sleep 0) and may have DNS issues. All of this made me want to re-create the Cobalt Strike Port Scan functionality.

It’s already possible to run binaries (ie. exe and dll) via other BOFs like noconsolation, to run C# binaries inline using the .NET CLR, or to use a PowerShell port scanner via the C2’s PowerShell functionality. However, a lot of these generate more IoCs than BOFs, and their output formats don’t integrate as well with C2s since they were not made for them.

Through reviewing a number of BOFs on Github, looking at custom C2 agents and more, I didn’t find many implementing portscanning up to the level of Cobalt Strike’s Port Scanner. Although, I will give the following honourable mentions:

rvrsh3ll’s bofportscan BOF which works well but only supports a single host and a single port
Mythic’s thanatos agent which seemed to be the most complete, supports IP and IP subnets as well as ports and port ranges. While the agent itself is really good and the blog post about it deserves a read, this port scan functionality is only available to users of this specific agent, isn’t portable like a BOF, and still doesn’t support as many arguments or output as Cobalt Strike’s port scanner.

Hence, the goals were set:

write a portscanner BOF (COFF) that replicates Cobalt Strike’s port scanning functionality, including the wide variety of input parameters, and provides the same amount of information in its output
write a pingscanner to complement the portscanner and use as a first pass scan
have some OPSEC considerations or at least describe the OPSEC limitations of the tool.

The initial release will be used to test and improve the BOF in training environments and/or test labs, with future improvements adding functionality and improving OPSEC. We’ll write the BOF in C as it’s the most common language for BOFs.

Enough intro, let’s begin.

Parsing Arguments

Cobalt Strike’s port scanner can be used the following way:

portscan [pid] [arch] [targets] [ports] [arp|icmp|none] [max connections]

The [pid] and [arch] options are used to “inject into the specified process to run a port scan against the specified hosts”. We’ll ignore these completely as our BOF will just execute within the context of the running agent. Nevertheless, this is useful functionality if you want to improve OPSEC, as you may want to choose a process which is likely to make arbitrary requests to other hosts (eg. a web browser for HTTP/HTTPS).
The [targets] option is a comma separated list of hosts to scan. You may also specify IPv4 address ranges (e.g., 192.168.1.128-192.168.2.240, 192.168.1.0/24)
The [ports] option is a comma separated list of ports to scan. You may specify port ranges as well (e.g., 1-65535)
The [arp|icmp|none] target discovery options dictate how the port scanning tool will determine if a host is alive. none assumes that all hosts are alive. Instead of doing this in the port scan BOF itself, I’ve decided to implement the icmp functionality in a different BOF (ie. pingscan) which will be described later on. The arp scan feature will be implemented as a feature update in a future release. For our portscan, we’ll just assume that all hosts are alive (ie. none) and use timeouts instead to reduce the time it takes to perform a portscan.
Lastly, the [max connections] option limits how many connections the port scan tool will attempt at any one time. While this is not hard to code, performing synchronous scans is actually already really fast if you set low enough timeouts (which should be fine considering you’d expect low RTT in most internal environments). Nevertheless, this is a good feature that will be added in a future release.

We’ll split our BOF between portscan (ie. TCP connect scan) and pingscan (ie. ICMP echo/reply) with each being used the following way:

portscan [targets] [ports]
pingscan [targets]

Parsing ports

Parsing ports is relatively easy, as there are only three (3) options:

a single port (eg. 22,445,80, etc)
a port range (eg. 1-100, 5000-6000, etc)
an arbitrary combination of the two above options separated by a comma (eg. 22,5000-9000,445,7)

To improve user experience, we’ll also allow decreasing port ranges (eg. 900-100) for bragging rights on Cobalt Strike’s inferior port scanning argument parser.

The pseudo-code for this is simple:

split input on ","
for each part:
	if it contains a dash ("-") assume it's a range:
		get both parts of the range using sscanf
		convert each part to an int
		check that each part is between 1 and 65535
	else assume it's a port:
		convert str to int 
		check that the int is between 1 and 65535

Assuming the parts are valid and passed our rudimentary checks, we need to store these ports. A design decision was made to use a linked list to store these as it reduces memory usage when large ranges are provided and easily allows arbitrary combinations to be made. To make things easier, we’ll assume that a single port is the same as the range from that port to itself (ie. "80" == "80-80"), which allows us to use the following struct:

Port input struct
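As a rough illustration of the pseudo-code and the "single port is a range to itself" idea, here is a minimal sketch in portable C. It is not the code from the repository: the names port_range, parse_ports, count_ports and free_ports are hypothetical, and it assumes that one invalid token should reject the whole argument.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One node per token; a single port N is stored as the range N-N. */
typedef struct port_range {
    unsigned short start;
    unsigned short end;
    struct port_range *next;
} port_range;

static void free_ports(port_range *head) {
    while (head) {
        port_range *next = head->next;
        free(head);
        head = next;
    }
}

/* Parses e.g. "22,5000-9000,445" into a linked list.
 * Returns NULL on empty input or if any token is invalid. */
static port_range *parse_ports(const char *input) {
    port_range *head = NULL, **tail = &head;
    char *copy = malloc(strlen(input) + 1);
    if (!copy) return NULL;
    strcpy(copy, input);

    for (char *tok = strtok(copy, ","); tok; tok = strtok(NULL, ",")) {
        int a, b, used = 0;
        if (sscanf(tok, "%d-%d%n", &a, &b, &used) == 2 && tok[used] == '\0') {
            /* token is a range, a and b are both set */
        } else if (sscanf(tok, "%d%n", &a, &used) == 1 && tok[used] == '\0') {
            b = a;                                  /* "80" == "80-80" */
        } else {
            goto fail;
        }
        if (a < 1 || a > 65535 || b < 1 || b > 65535) goto fail;
        if (a > b) { int t = a; a = b; b = t; }     /* accept 900-100 */

        port_range *node = malloc(sizeof *node);
        if (!node) goto fail;
        node->start = (unsigned short)a;
        node->end = (unsigned short)b;
        node->next = NULL;
        *tail = node;
        tail = &node->next;
    }
    free(copy);
    return head;

fail:
    free(copy);
    free_ports(head);
    return NULL;
}

/* Total number of ports described by the list. */
static unsigned long count_ports(const port_range *p) {
    unsigned long n = 0;
    for (; p; p = p->next) n += (unsigned long)(p->end - p->start) + 1;
    return n;
}
```

Storing ranges rather than individual ports means 1-65535 costs a single node, which is the memory argument made above.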

Parsing targets

Parsing targets is a lot more involved since there are many available options, however we can split valid input options between the following input types:

IP Range: token contains a dash (-). You need to validate that each side of the dash is a valid IP, otherwise you might end up matching hostnames (eg. asdf-asdf.com).
CIDR: that one is easy, just match on /, parse the IP and the mask and do calculations to find the IP range, which brings us back to type 1
Host: you can use getaddrinfo to do the work for you here as it will find IPs for hostnames for you.
IP: similarly to hosts, we can use getaddrinfo and validate the IP which allows us to treat single IPs, the same way as single Hosts.
A combination of any of the above input types, split by a comma (,)

In the end, you can categorise each target into one of two categories:

IP Range: IP ranges, and CIDR Subnet (ie. an IP range in disguise)
IP/Host: Everything else that was parsed correctly by getaddrinfo

Similarly to port arguments, we’ll use a linked list to store the information as it provides the most flexibility. This results in the following struct definitions:

Host Input Structs
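To make the range handling concrete, here is a minimal sketch of how an IP range or CIDR token could be reduced to a start and end address. It is an illustration, not the tool's structs: ip_range, parse_ipv4 and parse_range are hypothetical names, addresses are kept in host byte order, and anything that is not a range is assumed to be handed to getaddrinfo by the caller.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Inclusive IPv4 range, host byte order. */
typedef struct {
    uint32_t start;
    uint32_t end;
} ip_range;

/* Strict dotted-quad parser; returns 1 on success. */
static int parse_ipv4(const char *s, uint32_t *out) {
    unsigned a, b, c, d;
    int used = 0;
    if (sscanf(s, "%3u.%3u.%3u.%3u%n", &a, &b, &c, &d, &used) != 4 || s[used] != '\0')
        return 0;
    if (a > 255 || b > 255 || c > 255 || d > 255) return 0;
    *out = (a << 24) | (b << 16) | (c << 8) | d;
    return 1;
}

/* Returns 1 and fills r when tok is "A-B" or "A/len".
 * Returns 0 when tok is not a range, so the caller can fall back
 * to getaddrinfo for hostnames and single IPs. */
static int parse_range(const char *tok, ip_range *r) {
    char left[64];
    uint32_t a, b;
    const char *sep = strpbrk(tok, "-/");
    size_t n;

    if (!sep) return 0;
    n = (size_t)(sep - tok);
    if (n == 0 || n >= sizeof left) return 0;
    memcpy(left, tok, n);
    left[n] = '\0';
    if (!parse_ipv4(left, &a)) return 0;   /* "asdf-asdf.com" stops here */

    if (*sep == '-') {
        if (!parse_ipv4(sep + 1, &b)) return 0;
        r->start = a < b ? a : b;          /* reversed ranges are fine */
        r->end = a < b ? b : a;
        return 1;
    }

    /* CIDR: a prefix length turns into a mask, which gives the range. */
    unsigned len;
    int used = 0;
    if (sscanf(sep + 1, "%2u%n", &len, &used) != 1 || sep[1 + used] != '\0' || len > 32)
        return 0;
    uint32_t mask = (len == 0) ? 0 : 0xFFFFFFFFu << (32 - len);
    r->start = a & mask;
    r->end = a | ~mask;
    return 1;
}
```

For example, 192.168.1.0/24 becomes the range 192.168.1.0 to 192.168.1.255, which is why CIDR input can be treated as an IP range in disguise.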

Important

OPSEC: The getaddrinfo function returns results for the NS_DNS namespace. The getaddrinfo function aggregates all responses if more than one namespace provider returns information. For use with the IPv6 and IPv4 protocol, name resolution can be by the Domain Name System (DNS), a local hosts file, or by other naming mechanisms for the NS_DNS namespace. (ref)

ie. getaddrinfo will resolve host names to their IPs. This potentially results in network traffic being generated while parsing input. To limit network traffic, we store a pointer to the addrinfo struct populated by getaddrinfo and reuse it during scanning operations.

Correct parsing results in the following (with reversible IP ranges to assert dominance on Cobalt Strike’s inferior parser):

Parser Testing

ICMP (ping scan)

Windows API provides a very easy to use function to send ICMP echo requests, namely IcmpSendEcho. We can use this function to send ICMP echoes to our targets easily:

// The **IcmpSendEcho** function sends an IPv4 ICMP echo request and returns any echo response replies. The call returns when the time-out has expired or the reply buffer is filled.
IPHLPAPI_DLL_LINKAGE DWORD IcmpSendEcho(
  [in]           HANDLE                 IcmpHandle,
  [in]           IPAddr                 DestinationAddress,
  [in]           LPVOID                 RequestData,
  [in]           WORD                   RequestSize,
  [in, optional] PIP_OPTION_INFORMATION RequestOptions,
  [out]          LPVOID                 ReplyBuffer,
  [in]           DWORD                  ReplySize,
  [in]           DWORD                  Timeout
);

The function takes in a HANDLE IcmpHandle as its first parameter, which is a handle returned by the IcmpCreateFile function (ie. a “function which opens a handle on which IPv4 ICMP echo requests can be issued.”). From testing, it appears we can reuse this handle: we create it once and pass it to every subsequent call to IcmpSendEcho, which improves scanning speed.

OPSEC

I wanted to see the network traffic difference between the IcmpSendEcho default configuration and the traffic generated from the ping.exe Windows utility. As it turns out, there is quite a bit of a difference:

Wireshark Ping Comparison

We notice the following from the screenshot above:

ping.exe: the ICMP requests and replies from the first four packets (those generated by the ping utility) are all 74 bytes long. The packet’s time to live (ie. TTL) is set to 128 (a hop count, not a duration).
IcmpSendEcho: the request packets are 46 bytes long and the reply packets are 60 bytes long. The packet’s TTL defaults to 255.

By comparing the request packets, we notice that the ping utility sends data as part of the request, namely the following string: abcdefghijklmnopqrstuvwabcdefghi.

Wireshark Ping Packet Data

We can match this in our function call by passing the same string in RequestData and setting the packet’s TTL to 128 through RequestOptions. This results in the following traffic being generated, which is now identical between ping.exe and our ping scanner:

Wireshark ICMP Packet Size 74

Note

The ping reply size difference was simply due to the difference in data being sent (since the reply contains the data our packet has sent), hence no further configuration was needed.

With configurations and optimisations via timeout, this results in the following code:

Ping IP C code
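For readers without the screenshot, here is a minimal sketch of what such a ping function could look like. It is not the code from the repository: build_ping_payload and ping_once are hypothetical names, the Windows part is guarded so the payload helper can be compiled anywhere, and inside a BOF the Win32 calls would go through the loader's dynamic function resolution rather than being linked directly.

```c
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
#include <iphlpapi.h>
#include <icmpapi.h>   /* link with iphlpapi */
#endif

#define PING_PAYLOAD_LEN 32

/* Fills buf with the 32-byte pattern ping.exe sends:
 * 'a'..'w' followed by 'a'..'i'. */
static void build_ping_payload(char buf[PING_PAYLOAD_LEN]) {
    for (int i = 0; i < PING_PAYLOAD_LEN; i++)
        buf[i] = (char)('a' + (i % 23));
}

#ifdef _WIN32
/* Sends one echo request on an existing handle from IcmpCreateFile().
 * dest is an IPv4 address in network byte order.
 * Returns 1 if the host replied within timeout_ms, 0 otherwise. */
static int ping_once(HANDLE icmp, IPAddr dest, DWORD timeout_ms) {
    char payload[PING_PAYLOAD_LEN];
    /* Reply buffer: one reply struct, the echoed data, plus 8 bytes
     * of headroom for an ICMP error message. */
    char reply[sizeof(ICMP_ECHO_REPLY) + PING_PAYLOAD_LEN + 8];
    IP_OPTION_INFORMATION opts;
    DWORD replies;

    memset(&opts, 0, sizeof(opts));
    opts.Ttl = 128;                       /* match ping.exe */
    build_ping_payload(payload);

    replies = IcmpSendEcho(icmp, dest, payload, PING_PAYLOAD_LEN,
                           &opts, reply, sizeof(reply), timeout_ms);
    if (replies == 0)
        return 0;                         /* GetLastError() has the reason */
    return ((ICMP_ECHO_REPLY *)reply)->Status == IP_SUCCESS;
}
#endif
```

A caller would create the handle once with IcmpCreateFile(), call ping_once for every target, then release it with IcmpCloseHandle().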

Danger

OPSEC: Other than the improvements described above, the tool does not currently add any delay between ping requests. This can result in extremely fast consecutive requests, which could lead to Denial of Service (DoS) on fragile hosts/environments if you’re not careful. Further customisation options will be added in a future release to help improve OPSEC, as well as the ability to configure a custom timeout. The defaults will be adjusted to ping.exe’s timeouts.

We can combine this with the parser to have a working pingscanner that supports many input types:

icmp.exe Ping cflare subnet

Speed

Surprisingly if you don’t add a delay between the ping requests, it’s actually extremely fast… Who knew…

Without multi-threading or any other optimisations, running it against a /24 subnet (ie. 255 IPs) with low response times, you can scan the whole subnet in about 4 seconds:

ICMP scan 24 subnet 4s

Just watch out… If you ping scan too quickly, you get angry neighbours… From ~20ms to over 10 times that for each ping:

icmp.exe Ping cflare subnet icmp.exe Ping cflare subnet error

From testing, it appears there are two events which can slow down the ping scan:

A ping request timeout which results in error 11010 aka IP_REQ_TIMED_OUT. This is possible to optimise by setting the request timeout in IcmpSendEcho, although there is a trade-off where you may end up with false negatives (ie. the host exists but took longer than timeout time to respond).

Ping Error 11010

Supplying non-existent domains will result in a delay from getaddrinfo. Unfortunately, it doesn’t have a timeout so you’d need to run it in a thread and time it out yourself if you wanted to optimise this. Or implement DNS requests yourself, but I’ll leave that as an exercise for the reader.

nslookup nsaindisguise.com

Furthermore, getaddrinfo is also extremely slow when provided with invalid input… A stunning 2.4s to say that the host doesn’t exist:

icmp.exe digits slow

The simplest fix to the above would be to check whether the input is a valid IP or hostname based on a regex, although this is harder than expected because in an internal environment, pretty much anything could be a valid hostname….

OPSEC wise performing a scan this fast really isn’t great and will start flooding the network with too many packets, potentially generating alerts or breaking stuff. But damn it’s fun to see the packets go by at MACH 2 speed!

I’ll add IP/Host validation as an optimisation in the backlog, move on to another project and never implement it. But at least you know it’s there… So yeah, don’t pass stupid data to it and you should be sweet!

TCP (port scan)

Before attempting to write a port scanner, I looked at the video linked on Cobalt Strike’s website about the Port Scanning functionality. From this, I gathered a few things I wanted to include in this port scanner:

The port scanner shows all open ports (that’s an obvious one)
For each port/service it finds, if the service supplies a banner (ie. SSH), then it will receive the banner and display it
The scanner provides additional information for Windows hosts with SMB open (port 445)

This is the functionality I wanted to replicate with this BOF.

Port Scan Function

Note

This describes a TCP Connect Scan. If you are interested in other types of scans, check out nmap’s book on scan methods.

Port scanning in itself is very simple: you have an IP and a port, and you try to connect to that address. If the socket connects successfully, the port is open. If the socket errors, the port is closed. Translated to code, this looks something like this (assuming WSA is initialised outside of this function):

isPortOpen source code

Not much going on here but this would also make a terrible port scanner because it would be terribly slow when you try to scan a port/service that doesn’t exist (ie. you’d get timed out but it would take a while). The screenshot below demonstrates this. It’s pretty much instant when the service exists (Elapsed time: 0.00 seconds) but takes forever when the services do not respond (Elapsed time: 42.00 seconds):

port scanning slow example

To improve the speed, we can set a timeout when the socket attempts to connect to the remote port/service. On Windows, this means setting the socket to non-blocking mode and using the select function to wait for the socket to either connect successfully or time out. This lets us bound the connection attempt and greatly increases the speed of the program:

updated isPortOpen with timeout
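The following is a minimal sketch of the non-blocking connect plus select technique, not the tool's actual function. The name is_port_open, its parameters and the small portability shim are my own assumptions. The shim only exists so the same logic can be compiled and tested outside Windows. On Windows it assumes WSAStartup has already been called.

```c
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
typedef SOCKET sock_t;
#define SOCK_BAD INVALID_SOCKET
#define sock_close closesocket
#define connect_pending() (WSAGetLastError() == WSAEWOULDBLOCK)
#else
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int sock_t;
#define SOCK_BAD (-1)
#define sock_close close
#define connect_pending() (errno == EINPROGRESS)
#endif

static int set_nonblocking(sock_t s) {
#ifdef _WIN32
    u_long on = 1;
    return ioctlsocket(s, FIONBIO, &on);
#else
    int flags = fcntl(s, F_GETFL, 0);
    return flags < 0 ? -1 : fcntl(s, F_SETFL, flags | O_NONBLOCK);
#endif
}

/* Returns 1 if a TCP connection to ip:port completes within timeout_ms,
 * 0 if it is refused, times out or the input is invalid. */
static int is_port_open(const char *ip, unsigned short port, int timeout_ms) {
    struct sockaddr_in sa;
    sock_t s;
    int open = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1) return 0;

    s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == SOCK_BAD) return 0;
    if (set_nonblocking(s) != 0) { sock_close(s); return 0; }

    if (connect(s, (struct sockaddr *)&sa, sizeof(sa)) == 0) {
        open = 1;                              /* connected immediately */
    } else if (connect_pending()) {
        fd_set wfds, efds;
        struct timeval tv;
        FD_ZERO(&wfds); FD_SET(s, &wfds);      /* writable: connect finished */
        FD_ZERO(&efds); FD_SET(s, &efds);      /* Winsock reports failures here */
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        if (select((int)s + 1, NULL, &wfds, &efds, &tv) > 0 && FD_ISSET(s, &wfds)) {
            int err = 0;
            socklen_t len = sizeof(err);
            /* Writable does not always mean connected: check SO_ERROR. */
            if (getsockopt(s, SOL_SOCKET, SO_ERROR, (char *)&err, &len) == 0 && err == 0)
                open = 1;
        }
    }
    sock_close(s);
    return open;
}
```

With this shape, a filtered port costs at most timeout_ms instead of the operating system's default connect timeout.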

The code above saves roughly 41 seconds compared to the old code:

portscanning with timeout for speed

Lastly, since we want to receive the headers from sockets we successfully connect to, we should set a timeout on the socket receive too. We can do this using the setsockopt function prior to connecting the socket, as such:

setsockopt timeout example

This is pretty much it in terms of basic setup and allows us to identify open ports at a reasonable speed. We can now move on to more fun stuff like receiving service headers and querying Windows host information.

Service Headers and Information Discovery

In the video posted on the Cobalt Strike website, they show banner information retrieved when port scanning:

cobalt strike video screenshot scanner output

Now, the simple headers like SSH are simply sent upon connecting to the service. You just need to receive after connecting to the socket. You don’t even need to send any data:

netcat port 22 ssh banner

This is trivial to add to our scanner using the following snippet:

c socket receive data

Note: In the above snippet, we are setting the socket back to blocking mode. However, if we wanted to keep it asynchronous, we could use a similar approach to what we did for connect: check the WSA error and use select to wait for the recv function to complete or time out.
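As an illustration of the receive step, here is a minimal sketch that sets a receive timeout and reads whatever the service sends first. It is not the snippet from the screenshot: read_banner is a hypothetical helper, it assumes the socket is already connected and in blocking mode, and it keeps only the first line of the banner. Note that SO_RCVTIMEO takes a DWORD in milliseconds on Windows but a struct timeval elsewhere.

```c
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
#else
#include <sys/socket.h>
#include <sys/time.h>
typedef int sock_t;
#endif

/* Reads the first line of a service banner into out (NUL-terminated).
 * Returns the banner length, or 0 if nothing arrived within timeout_ms. */
static int read_banner(sock_t s, char *out, int cap, int timeout_ms) {
#ifdef _WIN32
    DWORD tv = (DWORD)timeout_ms;            /* Winsock: milliseconds */
#else
    struct timeval tv;                        /* POSIX: seconds + microseconds */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
#endif
    int n, len = 0;

    if (cap <= 0) return 0;
    out[0] = '\0';
    if (setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, (const char *)&tv, sizeof(tv)) != 0)
        return 0;

    n = (int)recv(s, out, (size_t)(cap - 1), 0);
    if (n <= 0) return 0;                     /* timeout, error or closed */

    /* Stop at the first CR/LF and mask non-printable bytes so the
     * banner is safe to print in the C2 console. */
    for (int i = 0; i < n; i++) {
        unsigned char c = (unsigned char)out[i];
        if (c == '\r' || c == '\n') break;
        out[len++] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[len] = '\0';
    return len;
}
```

Services that wait for the client to speak first (HTTP, for instance) will simply hit the timeout and return an empty banner.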

I knew you could get computer information from SMB such as hostname and domain but I’ve never had to implement it in code so I wasn’t sure where to get it from. The computer information seemed harder to get from SMB and I thought surely you wouldn’t get that from simply connecting to the socket. Spoiler alert, you don’t…

So I went down the rabbit hole and found some documentation on how to get server information on MS Documentation. I started by using NetServerGetInfo to try and retrieve the same information as in the cobalt strike clip… Turns out it was not that…. I didn’t get much information from it but was on the right track:

server-info.exe NetServerGetInfo

Did more digging and found another more promising function, namely NetWkstaGetInfo. And as it turns out, it looks like this is what they are using under the hood. I was able to get the exact same information as Cobalt Strike’s port scanner from that one function call:

server-info.exe NetWkstaGetInfo

Note

You can request different levels of information from both NetWkstaGetInfo and NetServerGetInfo.

Example for NetWkstaGetInfo:

Anonymous access is always permitted for level 100.
Authenticated users can view additional information at level 101
Members of the Administrators, and the Server, System and Print Operator local groups can view information at level 102.

We’re using level 100 here since we want to do this unauthenticated.

All that remained was adding the following code to our scanner:

C get_server_info function
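Here is a minimal sketch of a level 100 NetWkstaGetInfo call, for readers who cannot see the screenshot. It is not the author's get_server_info: format_wksta and print_wksta_info are hypothetical names, the output layout is only illustrative, and the exact form of the server string (for example whether it is prefixed with backslashes) is left to the caller. The Windows part is guarded so the formatting helper can be compiled anywhere.

```c
#include <stdio.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#include <lm.h>        /* link with netapi32 */
#endif

/* Renders the level 100 fields on one line. The layout is illustrative. */
static int format_wksta(char *out, size_t cap, unsigned platform,
                        unsigned major, unsigned minor,
                        const wchar_t *name, const wchar_t *domain) {
    return snprintf(out, cap,
                    "platform: %u version: %u.%u name: %ls domain: %ls",
                    platform, major, minor, name, domain);
}

#ifdef _WIN32
/* Queries workstation info at level 100 and prints it.
 * Returns 1 on success, 0 if the call failed. */
static int print_wksta_info(wchar_t *server) {
    WKSTA_INFO_100 *info = NULL;
    char line[256];

    if (NetWkstaGetInfo(server, 100, (LPBYTE *)&info) != NERR_Success)
        return 0;

    format_wksta(line, sizeof line,
                 (unsigned)info->wki100_platform_id,
                 (unsigned)info->wki100_ver_major,
                 (unsigned)info->wki100_ver_minor,
                 info->wki100_computername,
                 info->wki100_langroup);
    puts(line);

    NetApiBufferFree(info);   /* the API allocates the buffer for us */
    return 1;
}
#endif
```

The buffer returned by NetWkstaGetInfo is allocated by the system, so it has to be released with NetApiBufferFree once the fields have been copied or printed.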

Final Scanner

After implementing all of the above, we have a scanner that can identify open ports, print Windows host information and display service banners:

Havoc Final Scanner output

However, it’s not a BOF yet…

OPSEC

At the moment the tool scans all selected ports on one (1) IP before moving on to the next, however it might be better to do the opposite and scan one (1) port on all IPs and move on to the next port.

An even better solution might be to randomly choose 1 port and 1 IP and scan that, although if you see random ports and IPs popping up in logs then it also might stand out as weird… More research required.

Interestingly, looking at the output from Cobalt Strike’s video demonstrating their port scanner, I noticed two things:

They are scanning IPs and ports from highest to lowest (probably due to how they are parsing the arguments). You can see the first service to come back is port 5357 on 10.10.10.2222:

Cobalt Strike IP scan start

And after the video cut, you can see the IPs are decreasing and ports too:

Cobalt Strike tool output IP decreasing order

They perform the NetWkstaGetInfo call on all hosts with port 445 open at the end of the scan, which is why you see all of the port 445 hosts pop up at once:

Cobalt Strike tool output NetWkstaGetInfo

No idea if their design is better or not. Need to test it inside a lab environment where monitoring tools are in place to gain a better idea. Good follow up research!

Converting to a BOF

A follow up blog will be released highlighting how the scanners were converted to BOFs and will cover the following topics:

Dynamic Resolution of Win32 APIs and generating function declaration mappings automatically
Compiling larger BOFs with multiple c files
Improved BOF output through batching prints

In the meantime, you can enjoy using the tool by downloading it from github: https://github.com/fyxme/portscanbof

Initial release

After converting the application to a BOF and cleaning up the output, we get the following:

Initial Release Example Screenshot

The code and usage guide has been released publicly and can be found here: https://github.com/fyxme/portscanbof

The tool may have been updated since this blog was written. See the corresponding GitHub repository for up to date information.

Future improvements

~~Add async/multi-threading support for TCP scan~~ - This has been added since this blog was written. Still need to do more testing but good for now
ARP scan
Add the ability to check if the host is alive before running a TCP scan (similar to how Cobalt Strike does it)
Additional arguments including timeout, number of threads, only scan 1 ip per host, etc
Add sleep delay option between ICMP requests to prevent DOS and improve OPSEC (approx 1 second delay from ping.exe)
Fix code and Makefile for compiling exe’s
code refactor, and ensure all failure checks are validated (ie. memory allocation failures, etc)
UDP scan
IPV6 Support
Linux support
