Firewall Misconfig
Abstract: Public IP addresses can expose devices and services to risks such as port scanning and subsequent cyberattacks. Therefore, firewalls are extensively deployed and play a critical role in enforcing security policies and preventing unauthorized access. However, vulnerabilities can allow firewalls to be bypassed, effectively nullifying the protection.

In this paper, we present the first comprehensive study of a previously understudied attack surface: firewall misconfigurations that inadvertently expose protected services to the public Internet. Specifically, we demonstrate flawed firewall rules that allow inbound connections from special source ports to bypass the firewall, and explore the prevalence and security implications thereof. To this end, we scan the IPv4 space for 15 commonly high-risk TCP and UDP services from two special source ports. Our measurement reveals the widespread existence of such misconfigurations and identifies over 2,000,000 otherwise unreachable services spread over 15,837 autonomous systems, expanding the “observable Internet” for various protocols by up to 12.60%. More importantly, the affected services generally exhibit higher security risks than the publicly accessible ones, like outdated software versions and weak configurations. Despite the severity of this vulnerability, our honeypot experiment provides little evidence of active exploitation in the wild.

Our findings offer insights for better security posture and network administration, helping researchers and organizations anticipate and mitigate potential cyber threats emanating from the Internet.

1. Introduction

The public Internet is constantly being scanned [1], [2], [3], [4], [5], [6], often by malicious actors as a preliminary step in cyberattacks [1], [7]. Devices with public IP addresses are subject to such probes, and attackers actively try to compromise reachable hosts by exploiting vulnerabilities like software defects, configuration flaws, and weak passwords [8], [9]. Unsecured hosts may be compromised within minutes [10], [11]. As a result, firewalls are widely deployed to protect devices and their services by blocking unwanted access. Typically, these firewalls block all inbound connections¹ (with a few exceptions if the end device is a server). They may also restrict outbound traffic to a few essential protocols, e.g., HTTP and DNS. Such a firewall is supposed to minimize the attack surfaces of the end devices.

¹ For the sake of convenience, we refer to the sessions of connectionless protocols like UDP also as connections.

However, this belief is not always true, as misconfigured firewalls can fail silently in response to simple yet carefully crafted network traffic, exposing protected hosts and services to external attackers. For example, when configuring a firewall to allow accessing HTTP websites, the administrator may inadvertently allow any inbound traffic from TCP port 80, creating a loophole that permits any inbound TCP connection from port 80. Similarly, allowing DNS traffic might unintentionally open up the firewall to any inbound UDP datagrams from source port 53. Similar misconfigurations have previously occurred in Windows 2000/XP/2003 [12] and Mac OS X Tiger [13], which allowed connections from certain ports to bypass their built-in firewalls. Although this vulnerability is documented in the Nmap manual [14], it has been long forgotten and overlooked, and no comprehensive study has been conducted. The prevalence, security implications, and real-world exploitation of such misconfigurations in today’s Internet remain unknown.

Our Study. We conduct the first comprehensive study of firewall misconfigurations of this type, which mistakenly allow undesired connections to bypass the firewalls. We aim to quantify the prevalence and security implications of this understudied attack surface. To this end, we identify the affected services in the IPv4 address space and analyze their security risks.

Specifically, we scan the IPv4 space targeting 15 commonly high-risk services (e.g., SSH, HTTP, and MongoDB) and identify those that are only accessible from specific unusual source ports. The result confirms the widespread existence of firewall misconfigurations of this kind. By initiating connections from TCP port 80 and UDP port 53, we uncover over two million services that would otherwise remain unreachable, including over 800 thousand management services (SSH, Telnet, RDP, SNMP, and IPMI). The
affected hosts are distributed across 15,837 autonomous systems (ASes) and 221 countries and regions. With this firewall circumvention technique, we are able to reach up to 12.60% more services, with the ratio varying by protocol, expanding the observable Internet to a great extent.

Notably, we find that the affected services generally exhibit higher security risks than public services, such as outdated software versions and weak ciphers. Furthermore, we identified hundreds of unprotected SMB shares and MongoDB databases. This indicates that these misconfigured firewalls provide a false sense of security. We also observe over ten thousand vulnerable home routers that may be compromised from the public Internet and one public cloud that provisions flawed default firewall rules to virtual machines.

To investigate whether such misconfigurations are actively exploited in the wild, we carry out a honeypot experiment over four months. While our study shows no evidence of large-scale exploitation of this attack surface in the wild, its widespread presence should trigger an alarm for network administrators and security researchers. We hope that the insights in this paper can help improve security posture and benefit various sensitive applications.

Disclosure. We perform responsible disclosure to the 15,837 ASes with affected hosts and set up a website to explain this vulnerability and provide fix suggestions. In total, the website has been viewed by recipients from 1,436 ASes, and we have received inquiries, updates, and letters of thanks from 173 ASes. In the follow-up measurements, we see a noticeable decrease in affected hosts in the most affected ASes. We also disclose this issue to involved parties whose products are found to contain misconfigurations in their host-level firewalls.

Contributions. We make the following contributions:
• We perform the first comprehensive study of an understudied attack surface, firewall misconfigurations that inadvertently allow undesired connections, at the Internet scale.
• We confirm the widespread existence of this issue and identify over two million affected services in the IPv4 space, with a low false positive rate ensured by our multi-pass workflow.
• We analyze the security implications of the affected services and find that they generally exhibit higher security risks than public services.
• While we did not observe any large-scale exploitation in the wild in our honeypot experiment, we conduct responsible disclosure to the affected parties out of an abundance of caution and receive positive feedback.

2. Background and Related Work

2.1. Internet Scanning

Internet-wide scanning is crucial for various cybersecurity applications, like vulnerability assessment and threat tracking [4], [15], [16], [17]. It also helps understand the topology and connectivity across different regions [18], [19]. By probing the accessible hosts and services, it maps the Internet and collects large-scale empirical data for analysis.

Due to the importance of Internet-scale measurements, numerous techniques have been developed to facilitate Internet scanning. ZMap [20] democratized fast Internet-wide surveys. Masscan [21] is capable of scanning the IPv4 space in minutes with 10-gigabit Internet connections. While the immense space of IPv6 precludes full scans, there have been works utilizing heuristics and/or machine learning [22], [23], [24] to pinpoint scannable subspaces.

There are also platforms that spare researchers the burden of managing their own scanners. Search engines like Shodan [25], Censys [26], and FOFA [27] periodically probe the Internet, and users can search for hosts and services of interest. RIPE Atlas [28] is a global network measurement platform where users can distribute measurement tasks to tens of thousands of crowdsourced probes around the world.

Unfortunately, besides research purposes, Internet scanning is largely abused by attackers for malicious activities. In general, attackers scan ports where vulnerable services reside [7]. After identifying the hosts running the target services, attackers try to exploit them by different means. They may try to compromise the hosts by exploiting software vulnerabilities or brute-forcing [8], [9] and use these machines to mine cryptocurrencies or expand botnets [29], [30]. Attackers may also exploit configuration flaws to launch attacks, e.g., DNS and NTP reflection amplification attacks [31], [32], without fully compromising the hosts.

2.2. The Observable Internet

The more services we can reach, the more empirical data we can collect and analyze. The same holds for attackers: they need to reach a service before they can exploit it. In this paper, we name the set of reachable Internet services the observable Internet. Only the services in the observable Internet can be interacted with by actors on the Internet, including performing measurements or launching attacks.

However, despite the huge number of connected devices, hosts and services are sparse even for the relatively small IPv4 space. Bano et al. [33] pointed out that only ∼10% of IPv4 addresses respond to ICMP pings. Klick et al. [34] found that only two-thirds of IPv4 addresses are announced and that they can collect 90-99% of the desired responses by scanning at most 75% of the announced IP addresses. The majority of online devices and their services are concealed behind firewalls or NAT gateways.

Researchers have been working on expanding the observable Internet in different ways. Izhikevich et al. [35] investigated services on non-standard ports and found that only 3% of HTTP and 6% of TLS services ran on ports 80 and 443, respectively. They further developed a framework to predict the ports of these misplaced services [36]. Song et al. [37] also proposed a machine learning method to improve the hit rate and intrusiveness of uncovering such services. Wan et al. [38] demonstrated the importance of scanning from different geographic and topological locations.
Figure 1: Example of Bypassing Misconfigured Firewalls. Panels: (a) Normal traffic is allowed. (b) Attacks from high ports are blocked. (c) Attacks from port 80 are overlooked.
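The three panels of Figure 1 can be reproduced with a toy model of a stateless packet filter that permits outbound HTTP and, separately, the matching return traffic from port 80. The sketch below is illustrative only: the `Packet` class, the `stateless_verdict` function, and the way the rules are encoded are assumptions made for this example, not the configuration syntax of any real firewall. The port numbers are the ones labelled in the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Minimal packet model; the field names are hypothetical."""
    from_internal: bool  # True if sent by a host behind the firewall
    proto: str           # "tcp" or "udp"
    src_port: int
    dst_port: int


def stateless_verdict(pkt: Packet) -> str:
    """Apply an assumed three-rule stateless policy to a single packet."""
    # Rule 1: internal hosts may reach port 80 of external hosts.
    if pkt.proto == "tcp" and pkt.from_internal and pkt.dst_port == 80:
        return "permit"
    # Rule 2: traffic from port 80 of external hosts is permitted so that
    # HTTP replies can come back. The rule only inspects the source port,
    # so it cannot tell a reply from a new inbound connection that merely
    # originates from source port 80.
    if pkt.proto == "tcp" and not pkt.from_internal and pkt.src_port == 80:
        return "permit"
    # Rule 3: everything else is discarded.
    return "discard"


if __name__ == "__main__":
    cases = {
        "(a) outbound HTTP request": Packet(True, "tcp", 42000, 80),
        "(a) HTTP reply": Packet(False, "tcp", 80, 42000),
        "(b) attack from a high port": Packet(False, "tcp", 42666, 3389),
        "(c) attack from port 80": Packet(False, "tcp", 80, 3389),
    }
    for label, pkt in cases.items():
        print(f"{label}: {stateless_verdict(pkt)}")
```

In this model, only the packet in case (b) is discarded. Case (c) is indistinguishable from the legitimate reply in case (a) because the verdict depends on each packet in isolation.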
There are also techniques leveraging vulnerabilities to reach normally unreachable hosts and services. Rytilahti et al. [39] found that application-layer middlebox protocols might be used to bypass NAT gateways and reach internal hosts. Feng et al. [40] utilized the shared-IPID side channel to penetrate NAT devices. The SYN cookie implementation in old Linux versions allowed attackers to bypass firewall rules by guessing cookies [41]. Some firewalls dynamically open ports for multichannel protocols like FTP and SIP, and this feature can be exploited for tunneling any ports [42], [43]. Two vintage firewall flaws in Windows [12] and Mac OS X [13] allowed any inbound traffic from certain remote ports, effectively nullifying the system firewalls.

In conclusion, expanding the observable Internet is critical to network research and can help anticipate and mitigate potential cyber threats. Our work contributes a new perspective by demonstrating the prevalence and significant impacts of firewall misconfigurations.

2.3. Firewall

Firewalls inspect inbound and outbound traffic based on predefined rules, establishing a barrier between the trusted network and the untrusted network (usually the Internet). Firewalls come in different forms and implementations. In the context of this paper, the key lies in whether the firewall is able to, or configured to, track network flows, such as TCP connections and UDP sessions. Based on whether they have this capability, we can categorize firewalls into stateless firewalls and stateful firewalls [44].

Stateless Firewalls. Stateless firewalls inspect network traffic on a per-packet basis, unaware of network flows. Stateless firewalls only support stateless rules, which permit or discard packets with specified properties, such as specific source/destination addresses and ports. A stateless firewall matches each packet against the rules and makes the decision solely based on that packet. Consequently, stateless firewalls are generally incapable of distinguishing between inbound and outbound connections. While many advanced models can filter TCP segments by their flags [45], enabling them to block inbound TCP connections, this capability does not extend to UDP and is absent in the access control lists (ACLs) of some switches [46]. Stateless firewalls are widely deployed in environments with heavy network traffic, as they offer superior performance and desirable simplicity.

Stateful Firewalls. Stateful firewalls are capable of tracking network flows and filtering the packets based on their corresponding flows. Stateful firewalls support stateful rules, which permit or forbid network flows with specific properties. The permitted flows are recorded, and orphan packets that neither belong to any known flow nor create a new flow are discarded. The firewall deletes a flow record when termination signals like FIN and RST segments are found. For connectionless protocols like UDP and ICMP, the record expires after a period of inactivity [47]. Most stateful firewalls, like iptables, also support stateless rules to enhance performance and simplify configuration. With only stateless rules, a stateful firewall operates in the same way as stateless firewalls.

Firewall Misconfiguration. As critical guardians of networks, firewalls mainly rely on manual configurations, and a misconfigured one can be a single point of failure. Hence, firewall misconfiguration has been studied from various perspectives, such as automatically modeling and examining firewall rules [48], [49], [50], [51]. Bringhenti et al. [52] proposed an approach to automate firewall configuration. The misconfiguration which unintentionally allows inbound connections from specific source ports has been mentioned in the Nmap manual [53] and some articles available online [13], [54]. However, there has been no comprehensive study on this topic, and the security implications thereof in today’s Internet remain unknown. Our work demonstrates that such misconfigurations are still widespread and have far-reaching impacts.

3. A Firewall’s Achilles Heel

A single flawed rule can compromise the effectiveness of the entire firewall. In this section, we demonstrate the mechanism, threat model, and behavior patterns regarding such flawed rules.

3.1. Mechanism

Configuring firewalls may seem intuitive. However, subtle intricacies can render it prone to errors.

Using stateless rules, it takes two rules to permit access to external HTTP services: one for user packets to go out and another for server packets to come back. In a simplified scenario where only access to HTTP services is allowed, a network administrator may set up the following rules:
1) Permit TCP segments from internal hosts to port 80 of external hosts.
2) Permit TCP segments from port 80 of external hosts to internal hosts.
3) Discard all other IP packets.
These rules usually work well as operating systems typically use random high ports as source ports by default [55], and connections to internal hosts are supposed to be rejected by
Rule 3, as shown in Figure 1b. However, suppose an attacker initiates TCP connections to the protected hosts from port 80. In that case, the initial SYN segment will be permitted by Rule 2, as the firewall is unable to distinguish the connection direction. Subsequently, the malicious traffic consistently adheres to the first two rules, evading the intention of blocking inbound connections, as illustrated in Figure 1c. Similar misconfigurations can happen when allowing UDP protocols like DNS and NTP, causing a loophole that allows any inbound UDP datagrams from source ports 53 or 123.

It is unlikely for stateful firewalls with stateful rules to undergo the same misconfigurations, as they operate on network flows and do not require a complementary rule to permit returning packets. Still, human errors may lead to flawed rules being added. For example, when allowing inbound connections to TCP port 80 of a web server, the network administrator might mistakenly enter the destination port as the source port, allowing any inbound connection from TCP port 80. In such cases, the examples in Figure 1 still hold.

3.2. Threat Model

We assume an attacker who actively scans the Internet to find and exploit vulnerable services. The attacker is remote and does not have special network capabilities.

Leveraging flawed firewall rules, the attacker can bypass the misconfigured firewalls and reach protected services by simply manipulating the source port of their connections. The attacker can specify a common port like TCP 80 or UDP 53 as the source port for all their scanning activities. The services reached in this manner should be a superset of the result of regular scanning, as scanning from a designated port should not obscure the services reachable from random ports. The attacker can also scan from multiple special ports to circumvent even more misconfigured firewalls.

Consequently, the attacker can expand their observable Internet and obtain more potential targets for their attacks,

Assume that the affected service is on port P. Normally, the firewall blocks all inbound connections; however, it erroneously allows inbound traffic from port X. When initiating connections from a random high port R and from port X, we should notice the following differences in responses:
• Port P is unresponsive or responds with errors, e.g., RST segments or ICMP unreachable messages, when the connection is initiated from source port R.
• Port P becomes responsive, establishing the connection and responding at the application layer, when the connection is initiated from source port X.

It is worth noting that theoretically, port X may be a high port itself, as the flawed rule may allow any port depending on the exact configuration. There may also be multiple ports from which the inbound traffic can bypass the firewall, as there may be multiple flawed rules.

4. Measurement Framework

While examining a single service is straightforward, it becomes challenging when scaling the scope to the whole IPv4 space. In this section, we discuss how we address the challenges relating to the Internet-scale measurement.

Goals. We aim to study (i) the prevalence and (ii) the security implications of the aforementioned firewall misconfigurations at the Internet scale. To this end, we need to:
1) Identify the affected services in the IPv4 space.
2) Extract the characteristics of the affected services, e.g., software versions and supported cipher suites.
3) Analyze the security implications of exposure.

Challenges. Networks are dynamic and volatile, and the Internet exemplifies these characteristics. Packet loss and region-based filtering rules are commonplace. A host can run any service, and the flawed rule may allow any source port to bypass the firewall inadvertently. To address these complexities, we need a robust workflow complemented by diverse vantage points, carefully selected scope, and proper tools to ensure both accuracy and efficiency.

4.1. Workflow

The core of the workflow is an iterative method to ensure both broad coverage and a low false positive rate (below 1%). We denote the chosen source port as the designated port and the port of the target service as the target port.

Figure 2: Three-Phase Workflow. Panels: (a) Identification (b) Probe (c) Validation.

Phase 1: Identify the Affected Hosts. In this phase, we unearth the hosts whose target ports are responsive only when we connect from the designated port. The steps are:
1) Scan the IPv4 space targeting the target port from the designated port. The responsive hosts constitute
the initial host list, including both affected hosts and irrelevant hosts.
2) Scan the host list targeting the target port from random high ports. The responsive hosts are the irrelevant hosts because they are reachable from high ports².
3) Remove the irrelevant hosts from the current host list.
4) Repeat Steps 2 & 3 until the response rate is below 1%, ensuring a low false positive rate. The remaining hosts form the candidate list.

² Here we ignore the negligible possibility that the random high port is the very port that can bypass the firewall.

The process is shown in Figure 2a. We adopt the iterative design to circumvent packet loss and other dynamic behaviors. Step 1 incurs most of the cost, which requires a full scan of the IPv4 space. The following steps converge rapidly, typically after three to five rounds of scanning a small number of addresses. We believe that the candidate list is a reliable set of affected hosts.

We note that an open port does not necessarily serve the target service, and hosts may go offline at Step 2, causing false positives. We handle these cases in the next phase.

Phase 2: Probe the Affected Services. In this phase, we send application-layer probes and collect the responses for characteristic extraction. The steps are:
1) Send probes (e.g., HTTP or DNS requests) to the target port of the candidate hosts from the designated port.
2) Record the responses and exclude the responding hosts from further probing.
3) Repeat Steps 1 & 2 until the response rate is below 1%, ensuring high coverage. The aggregated responses constitute the response list.

The process is shown in Figure 2b. The iteration usually finishes in three rounds. The responses come from a subset of the candidates which truly serve the target service and have not gone offline during the measurement.

We recognize that, despite ensuring a low false positive rate in Phase 1, a significant number of affected hosts may either fail to respond to our application-layer probes or respond with an unexpected protocol due to various factors. Consequently, the subset of the hosts responding with valid responses in Phase 2 may be substantially smaller than the candidate list generated in Phase 1, necessitating an additional verification of the false positive rate. We handle this in the next phase.

Phase 3: Validate the False Positive Rate. In this phase, we confirm the desired false positive rate by scanning the affected services from high ports again. The steps are:
1) Scan the services in the response list from random high ports. The responsive hosts are new irrelevant hosts as their services turn out to be reachable from high ports.
2) Remove the responses previously collected from the irrelevant hosts.
3) Repeat Steps 1 & 2 until the response rate is below 1%, ensuring a desired false positive rate. The remaining responses are the validated responses.

The process is shown in Figure 2c. The iteration often finishes in three to five rounds. The validated responses have the desired false positive rate below 1%.

After these phases, we parse the validated responses to extract the characteristics of the services, such as software versions and supported cryptographic algorithms. We further analyze the characteristics to conclude security risks.

4.2. Scope Selection

While the IPv4 address space has a relatively manageable size, there are numerous (i) possible services and (ii) unexpectedly allowed source ports. Given the scale, we have to restrict our scope to a small number of target services and designated ports due to feasibility considerations.

Target Service. To illustrate the prevalence and security implications of firewall misconfigurations of this kind, we choose 15 common services that are often vulnerable. We list the services in Table 1. For services supporting both TCP and UDP, we choose the more commonly used protocol, like UDP for DNS. We divide the services into four categories. Management services are for device management; file sharing and database services store user and application data; general services include other often exploitable services. Compromise of these services may lead to device takeover, information leakage, etc. These services have various attack surfaces, such as weak passwords in SSH, Telnet, RDP, etc., and lack of authentication in FTP, SMB, MongoDB, etc. We introduce the attack surfaces of the services in detail in the corresponding sections in §6.

Category      Service   Port
Management    SSH       TCP 22
              Telnet    TCP 23
              RDP       TCP 3389
              IPMI      UDP 623
              SNMPv1    UDP 161
              SNMPv2c   UDP 161
              SNMPv3    UDP 161
File sharing  FTP       TCP 21
              SMB       TCP 445
Database      MySQL     TCP 3306
              MongoDB   TCP 27017
General       HTTP      TCP 80
              HTTPS     TCP 443
              DNS       UDP 53
              NTP       UDP 123

TABLE 1: Target Services for Measurement

Designated Ports. Technically, a firewall flaw may mistakenly allow traffic from any source port, depending on the specific erroneous rule. However, it is impractical to enumerate and scan from all 65,535 ports, and we hope to maintain a balance between coverage and footprint as discussed in §11. We select port 80 for TCP-based services and port 53 for UDP-based services because their corresponding protocols (HTTP and DNS) are among the most popular services on the Internet. According to recent studies on Internet traffic [56], [57], the HTTP family of protocols (HTTP,
HTTPS, and QUIC) account for the largest share. HTTP is the fundamental member of this family. DNS accounts for the majority of UDP traffic besides QUIC and is necessary for domain name resolution. Therefore, we presume that TCP port 80 and UDP port 53 are the ports most likely to be allowed by the firewalls, and the results should establish a reliable prevalence. This presumption is also confirmed in small-scale preflight measurements with other popular ports (TCP port 22/443 and UDP port 123/443).

4.3. Experiment Setup

Vantage Points. We employ multiple geographical regions to increase the coverage and mitigate packet loss and region-based filtering. This is important because it has been shown that an individual vantage point may miss up to 18.2% of the target service [38]. Specifically, we deploy five vantage points in the United States, Germany, Singapore, India, and Brazil, respectively. Each vantage point executes every step of the workflow in §4.1, and we merge the results, e.g., responsive hosts, into a union set after each step. The vantage points run Debian 12 with kernel 6.1 and are capable of scanning the entire IPv4 space in about two hours.

Control Group from Public Services. We hypothesize that the affected services face higher security risks because administrators may be misled by the false sense of security provided by firewalls, leading to less stringent maintenance practices. To verify this, we set up control groups for the services for which we can safely measure specific weaknesses (e.g., software versions and weak cipher suites); we do not set up control groups for services that only have general weaknesses (e.g., weak passwords) that require aggressive detection due to ethical considerations. The control groups consist of public services (unprotected by firewalls), whose hosts are randomly picked from the irrelevant hosts in Step 2 of Phase 1 in §4.1. Their sizes are comparable to or larger than the sizes of the affected services. We probe these public services following Phase 2 of the workflow, except that we initiate connections from random high ports.³ Finally, we parse the responses and compare the security risks of the affected services and public services.

³ Not all picked hosts respond, possibly due to factors like network volatility. Hence, the final control group sizes vary (and thus, may appear random).

Tools. We leverage two open-source tools for scanning the Internet and sending probes to the target services.
• ZMap [20], [58]: ZMap is a fast network scanner for Internet-wide surveys. It is capable of specifying the source port⁴. We use it to detect the status of TCP ports and send probes to UDP services.
• ZGrab 2.0 [59]: ZGrab is a fast application-layer network scanner. It supports interacting with various protocols, and we enhance it to allow specifying the source port and support additional protocols like RDP. We use it to send probes to TCP services.

⁴ When no source port is provided, ZMap uses random source ports from 32768 to 61000, the default ephemeral port range of Linux.

We also develop our own tools in Python to analyze the output of ZMap and ZGrab. We store the data in MongoDB and ElasticSearch for indexing and querying.

4.4. Threats to Validity

We put in our best effort to get reliable and accurate results in our Internet-wide measurement. We employ the iterative and multi-pass approach in §4.1 to increase coverage and constrain false positives. However, the results may still be affected by the following factors:

Network Volatility. Despite measuring from five vantage points iteratively, we cannot fully eliminate the impact of packet loss. An affected host may be offline for part or all of our measurement period, which is out of our control.

Firewall Strategies. Our probes may suffer from region-based filtering rules despite multiple geographical locations.

Limited Scope. Most services do not run on their default ports [35], and firewalls may be misconfigured to allow packets from ports other than the designated ports we select as discussed in §4.2.

Special Networks. Certain networks may route packets via multiple paths (e.g., ECMP routing), which may affect the observation of middleboxes [60], [61]. Port forwarding may also introduce a bias on the number of observed hosts.

It is hard to quantify the exact impact of these factors as their effects are intertwined. Nevertheless, we try our best to keep the observed false positive rate below 1% as discussed in §4.1, and give a reliable set of affected services.

5. Overview of Measurement Results

We performed our primary Internet-wide measurement in August 2024. In this section, we showcase the prevalence and distributions of the aforementioned firewall misconfigurations. We also highlight the security implications of the affected services.

Service   Port        Count     Public Services   %
SSH       TCP 22      234,984   25,307,484        0.93%
Telnet    TCP 23      50,820    2,504,330         2.03%
RDP       TCP 3389    7,931     3,504,675         0.23%
IPMI      UDP 623     4,242     59,565            7.12%
SNMPv1    UDP 161     42,894    19,360,968        2.51%⁵
SNMPv2c   UDP 161     36,753    "                 "
SNMPv3    UDP 161     465,033   "                 "
FTP       TCP 21      32,172    7,713,688         0.42%
SMB       TCP 445     19,419    1,174,742         1.65%
MySQL     TCP 3306    19,456    3,853,048         0.50%
MongoDB   TCP 27017   338       198,470           0.17%
HTTP      TCP 80      222,539   163,503,752       0.14%
HTTPS     TCP 443     193,630   158,488,284       0.12%
DNS       UDP 53      334,358   5,287,835         6.32%
NTP       UDP 123     824,389   6,545,301         12.60%

TABLE 2: Number of Affected Services

⁵ Calculated based on unique IP addresses as Shodan has no SNMP version filter.
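As a worked check of Table 2, the percentage column can be recomputed as the affected count divided by the number of public services, and the counts can be summed to the 2,488,958 affected services reported under Prevalence. The following is a minimal sketch: the dictionary and function names are assumptions for this illustration, and the numbers are copied from Table 2. SNMP is left out of the ratio because its published ratio is based on unique IP addresses (footnote 5), which the table does not list.

```python
# (affected count, public services) per service, copied from Table 2.
TABLE2 = {
    "SSH": (234_984, 25_307_484),
    "Telnet": (50_820, 2_504_330),
    "RDP": (7_931, 3_504_675),
    "IPMI": (4_242, 59_565),
    "FTP": (32_172, 7_713_688),
    "SMB": (19_419, 1_174_742),
    "MySQL": (19_456, 3_853_048),
    "MongoDB": (338, 198_470),
    "HTTP": (222_539, 163_503_752),
    "HTTPS": (193_630, 158_488_284),
    "DNS": (334_358, 5_287_835),
    "NTP": (824_389, 6_545_301),
}

# SNMP counts per version. They enter the total, but not the ratio,
# since the paper computes the SNMP ratio from unique IP addresses.
SNMP_COUNTS = {"SNMPv1": 42_894, "SNMPv2c": 36_753, "SNMPv3": 465_033}


def expansion_percent(affected: int, public: int) -> float:
    """Affected services as a percentage of publicly reachable services."""
    return 100.0 * affected / public


def total_affected() -> int:
    """Sum of the Count column, including the three SNMP versions."""
    return sum(a for a, _ in TABLE2.values()) + sum(SNMP_COUNTS.values())


if __name__ == "__main__":
    for service, (affected, public) in TABLE2.items():
        print(f"{service:8s} {expansion_percent(affected, public):6.2f}%")
    print("total affected services:", total_affected())
```

The largest ratio in this calculation is the NTP row, which corresponds to the "up to 12.60%" figure quoted in the text.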
Country   Count
Italy     303,446
USA       290,848
China     216,647
Austria   130,331
Hungary   123,294
Poland    108,134
Japan     104,464
Brazil    85,869
Syria     74,144
Israel    69,440
TABLE 3: Top Countries

ASN     Country   Type    Count
1267    Italy     ISP     231,316
8447    Austria   ISP     125,994
5483    Hungary   ISP     121,853
4837    China     ISP     79,811
29256   Syria     ISP     75,124
16509   USA       Cloud   75,050
12741   Poland    ISP     54,586
1680    Israel    ISP     47,127
4134    China     ISP     39,303
15802   UAE       ISP     34,514
TABLE 4: Top ASes

Prevalence. We found a total of 2,488,958 services on 2,147,229 unique IP addresses affected by the previously discussed firewall misconfigurations. We display the detailed numbers relating to each service in Table 2. In addition, we queried Shodan [25] for the numbers of the corresponding publicly accessible services⁶. We list the figures and the ratios of the affected services to the public services. It turns out that by initiating connections from only two designated ports, we can reach up to 12.60% more services.

According to the statistics, the ratios of affected services to public services are higher for UDP services than for TCP services, especially for the more popular protocols like DNS and NTP. The reason for this phenomenon may be due to the prevalence of insecure firewall rules for UDP services on the Internet, which we discuss in §10.

Distributions. We identified the locations and ASes of the affected hosts by their IP WHOIS information. The affected hosts are located in 221 countries and regions and belong to 15,837 different ASes. We also list the ten most affected countries in Table 3 and the ten most affected ASes in Table 4. The results suggest that ISPs are most prone to such firewall misconfigurations. The complete geographic distribution is shown in Appendix A.1.

Besides ISPs and clouds, we also find that many well-known enterprises have thousands of affected hosts in their ASes, such as Apple, Google, Starlink, Alibaba, and Yandex. This further demonstrates the prevalence of firewall misconfigurations of this type. We list these noteworthy ASes in Appendix A.2.

We also notice a significant long tail in the distribution of the affected hosts in the ASes. Among the 15,837 ASes, only 202 (1.3%) have more than 1,000 hosts, accounting for 83.07% of affected hosts; only 960 (6.1%) have more than 100 hosts, accounting for 94.27% of affected hosts.

Security implications. After we extract the characteristics as described in §4.1, such as software versions and supported cryptographic algorithms, we analyze the potential security risks. We find the affected services are susceptible to various kinds of attack vectors. We highlight the key security risks in Table 5. For example, we find that about 30% of the affected SSH services employ weak cryptographic algorithms and that 31 Telnet services present shells without any authentication. We will describe the details relating to each service in §6. Note that we did not perform any aggressive measurements (e.g., guessing weak passwords) due to ethical considerations.

6. We specified the port number and filters like -hash:0 to exclude invalid results.

Service   Security Risks
SSH       70,739 (30.10%) with weak cryptographic algorithms; 2,695 known to be vulnerable to CVE-2024-6387 [62].
Telnet    31 directly present shells without any authentication.
RDP       2,324 (29.31%) run end-of-life Windows versions.
IPMI      1,264 (29.8%) can lead to full control of servers; 2,772 (65.35%) support insecure authentication.
SNMP      47,117 devices may leak configurations, including servers, routers, switches, printers, VoIP devices, etc.; 465,033 devices may be fingerprinted or brute-forced.
FTP       1,833 known to run outdated software; two known to allow anonymous login.
SMB       202 unprotected file shares; 642 (63.44% of Windows hosts) run end-of-life Windows versions; 10,890 Linux hosts are vulnerable routers.
MySQL     10,522 (54.08%) run end-of-life versions.
MongoDB   790 unprotected databases with sizes up to 1 terabyte.
HTTP(S)   20,681 (53.54% of those on mainstream web servers) run end-of-life software; 172,622 (89.15%) HTTPS services serve internal websites; 189,334 (97.78%) HTTPS services use insecure certificates.
DNS       212,886 (63.67%) allow ANY queries, possibly exploited for reflection amplification attacks.
NTP       22,244 (2.70%) allow monlist, possibly exploited for reflection amplification attacks.
TABLE 5: Security Risks of Affected Services

6. Analyzing Security Risks by Service

In this section, we present the detailed results from our primary measurements, focusing on the security implications of firewall misconfigurations on various services.

6.1. SSH Services

SSH provides secure remote access over insecure networks. We uncovered 234,984 SSH services behind misconfigured firewalls and picked 753,769 publicly accessible SSH services as the control group.

Measurement. We record the banners and handshakes of the SSH services. We determine their software versions by parsing banners. We also extract the supported cryptographic algorithms in the handshakes, as weak cryptographic algorithms can compromise the confidentiality and integrity of SSH channels as found in the Terrapin Attack [63].

Analysis. We list the most common vendors of the affected services in Table 6. The result shows that OpenSSH is predominant (94.8%) while some indicate embedded or network devices like dropbear, ROSSSH, Cisco, HUAWEI, and DOPRA. OpenSSH accounts for 76.0% of the control group services.

Given its dominant usage share, we compare the distributions of OpenSSH versions between the affected services and public services as in Table 7. Overall, the affected
services tend to run older versions, suggesting a higher chance of older operating systems and other outdated and vulnerable software.

Software    Count     %
OpenSSH     222,878   94.85%
AWS_SFTP    4,955     2.11%
dropbear    1,545     0.66%
ROSSSH      1,264     0.54%
Cisco       1,095     0.47%
HUAWEI      564       0.24%
DOPRA       232       0.10%
WeOnlyDo    92        0.04%
Others      2,451     1.04%
TABLE 6: SSH Software

Version   Affected   Public
2.x       0.11‰      0.02‰
3.x       0.16%      0.08%
4.x       0.34%      0.52%
5.x       2.81%      3.20%
6.x       3.77%      2.58%
7.x       54.07%     34.84%
8.x       32.25%     45.31%
9.x       6.55%      13.28%
Others    0.03%      0.20%
TABLE 7: OpenSSH Versions

We then check the patch progress of CVE-2024-6387, which is a recently discovered remote code execution (RCE) vulnerability in OpenSSH. We select the services within the affected version range and examine whether they are patched based on the patch level in the version. We list the results in Table 8, where unidentified means that the patch level is missing or there are no release notes. We note that a smaller portion of the affected services are known to be patched, and 47.92% are unidentified. Since OpenSSH in many distros like Debian [64], Ubuntu [65], and FreeBSD [66] provides patch levels and has explicit release notes, unidentified services might be customized versions and less secure.

           Vulnerable        Patched            Unidentified
Affected   4.71% (2,695)     47.36% (27,085)    47.92% (27,405)
Public     16.34% (28,919)   62.23% (110,131)   21.43% (37,918)
TABLE 8: Patching Progress of CVE-2024-6387

We inspect the most vulnerable cryptographic algorithms and list the results in Table 9. Compared with public SSH services, a larger portion of affected services support weak encryption or message authentication code algorithms. The usage share of weak key exchange algorithms is similar.

Algorithm                Affected          Public
Encryption Algorithm
arcfour{,128,256}        6.82% (16,022)    4.15% (31,266)
aes{128,192,256}-cbc     28.25% (66,383)   24.73% (186,392)
3des-cbc                 26.94% (63,304)   21.04% (158,599)
blowfish-cbc             22.16% (52,071)   17.19% (129,541)
cast128-cbc              21.62% (50,801)   16.02% (120,768)
Message Authentication Code Algorithm
hmac-md5                 9.27% (21,780)    7.41% (55,870)
hmac-md5-96              8.28% (19,452)    4.31% (32,509)
Key Exchange Algorithm
dh-group1-sha1           25.20% (59,224)   25.32% (190,818)
dh-group-exchange-sha1   26.51% (62,289)   25.23% (190,162)
TABLE 9: Weak Cryptographic Algorithms in SSH

Besides these attack vectors we have measured, SSH is also vulnerable to weak passwords [9]. Although we did not verify the prevalence of weak passwords due to ethical considerations, these affected services may be brute-forced and taken over if the firewalls are circumvented.

6.2. Telnet Services

Telnet allows users to access remote systems. We uncovered 50,820 Telnet services behind misconfigured firewalls and fetched their banners.

Measurement. We record the banners of Telnet services. Telnet usually prompts login when connected, and insecure systems may directly prompt shells without authentication.

Analysis. Upon connection, 98.06% (50,192) of affected services prompt login. While most of the other affected services return error messages or unidentifiable data, 31 directly present shells without any authentication.

Telnet does not support encryption and transmits all data in plaintext. However, it is still widely used for managing network devices and IoT devices [15], [67]. Consequently, attackers actively exploit Telnet to spread malware by leveraging brute-forcing or software vulnerabilities [68], [69]. While we did not verify these attack vectors due to ethical considerations, the compromise of network devices like routers and switches can constitute severe threats to infrastructures.

6.3. RDP Services

RDP allows users to manage Windows hosts via a graphical interface. We uncovered 7,931 RDP services behind misconfigured firewalls and picked 222,200 publicly accessible RDP services as the control group.

Measurement. We record the negotiation process of RDP services. We identify the OS version by the build number in the NTLMSSP [70] response during negotiation [53]. If NTLMSSP is unavailable, often due to old versions, we match the magic numbers in the responses [71].

Analysis. We list the OS distribution in Table 10. In total, 29.31% (2,324) of affected hosts run end-of-life (EOL) Windows versions, while the percentage for the publicly accessible hosts is only 9.70%.

OS                      Affected         Public            EOL
Windows 2000/XP         0.52% (41)       0.15% (344)       ✓
Windows 2003            9.51% (754)      3.17% (7,048)     ✓
Windows 7/2008          19.28% (1,529)   6.38% (14,174)    ✓
Windows 2012            10.68% (847)     30.13% (66,949)
Windows 10/2016/2019    39.76% (3,153)   36.76% (81,688)
Windows 11/2022         15.77% (1,251)   21.19% (47,083)
Unidentified            4.49% (356)      2.21% (4,914)
TABLE 10: OS Distribution of RDP Services

RDP has had a number of critical vulnerabilities [72], [73], [74], [75], mainly in the old Windows versions. Outdated Windows versions also have various vulnerabilities beyond the RDP service and can be compromised in minutes when connected to the Internet [10].

Furthermore, RDP is also prone to weak passwords, and compromise can lead to takeover, which we did not measure
due to ethical concerns. Still, exposure of RDP services can constitute a threat to these devices.

6.4. IPMI Services

IPMI is the industry standard protocol for out-of-band remote management, mostly for servers. It enables users to control the power, view the screen, and use the keyboard and mouse over the network. We uncovered 4,242 IPMI services behind misconfigured firewalls and picked 43,031 publicly accessible IPMI services as the control group.

Measurement. We probe IPMI services by sending a “Get Channel Authentication Capabilities” command, which reveals the authentication configuration.

Analysis. We examine and list the prevalence of three types of weak configurations [76], [77], [78] in Table 11. A large portion of affected services allow NONE authentication, which enables anyone to gain full control of the servers without any authentication. A relatively small but noticeable percentage of affected services allow the anonymous user, which is a low-privilege account without a password. Null usernames may also lead to anonymous login.

Type                  Affected        Public
NONE Authentication   29.8% (1,264)   2.8% (1,185)
Anonymous Login       1.5% (64)       1.1% (469)
Null Usernames        8.7% (369)      23.5% (10,091)
TABLE 11: IPMI Weak Configuration Prevalence

IPMI is not intended for public access. Many IPMI implementations are flawed or shipped with fixed default passwords [79]. While we did not verify the vulnerabilities due to ethical considerations, the exposure of IPMI services is a serious attack surface per se.

6.5. SNMP Services

SNMP is widely used for telemetry and remote management, mainly for servers, routers, and switches. It exposes the device status and configuration in the form of variables.

SNMP has three main versions: SNMPv1, SNMPv2c, and SNMPv3. The first two are similar and insecure, while SNMPv3 has major improvements.

6.5.1. SNMPv1 and SNMPv2c. We uncovered 42,894 SNMPv1 services and 36,753 SNMPv2c services on 47,117 unique IP addresses behind misconfigured firewalls. We discuss the two services together and regard each IP address as a unique device.

Measurement. SNMPv1 and SNMPv2c have a weak security model, where the “community string” serves as the password. Most vendors use public as the default community string for read access, and private as the default community string for read-write access [80]. We probe SNMPv1 and SNMPv2c services by requesting the system description field with the community string public, which can reveal the device type and model.

Analysis. We classify the affected devices based on their system descriptions and list the results in Table 12. “Network device” includes switches, routers, firewalls, modems, etc.; “Appliance” contains other connected devices including printers, IP cameras, and VoIP gateways; “Empty Field” means that the system description field is empty; “Unidentified” means that we cannot identify the exact type of the device. The result shows that the firewall misconfigurations can affect a wide variety of devices.

Type             Count    %
Server           12,096   25.67%
Network Device   20,044   42.54%
Appliance        2,175    4.62%
Empty Field      11,260   23.90%
Unidentified     1,542    3.27%
TABLE 12: SNMPv1 & SNMPv2 Device Type

It is worth noting that SNMPv1 and SNMPv2 respond only when the community string is accepted, and hence, anyone can use the community string public to read all the variables of these affected devices. The variables may contain sensitive information like the device passwords and logs [81], varying by specific model and implementation. Due to ethical considerations, we did not use community strings other than public.

6.5.2. SNMPv3. We uncovered 465,033 SNMPv3 services behind misconfigured firewalls.

Measurement. SNMPv3 enhances security by supporting cryptographic authentication. We probe SNMPv3 services by sending an empty “get request” without a password. The response includes the device’s engine ID, which indicates the device manufacturer and contains additional data, often the media access control (MAC) address.

Analysis. The engine ID can be used to fingerprint the device [82] and even brute force the password [83]. We examine the responses and identify the device manufacturers by the enterprise ID in the engine ID. We list the results in Table 13. Furthermore, 163,827 devices give away their MAC addresses without any authentication, enabling further fingerprinting and attacks.

Manufacturer    Category         Count     %
Cisco           Network Device   199,304   42.86%
Net-SNMP        Server           99,623    21.42%
Juniper         Network Device   51,346    11.04%
Nokia           Network Device   45,330    9.75%
MikroTik        Network Device   17,218    3.70%
Kyle Fox        Unknown          14,303    3.08%
SNMP Research   Server           10,691    2.30%
Others                           27,218    5.85%
TABLE 13: SNMPv3 Manufacturers

6.6. FTP Services

FTP is an ancient protocol for file transfer. We uncovered 32,172 FTP services behind misconfigured firewalls.
Measurement. We record the banners of FTP services, which may provide the software versions and the capability of anonymous login.

Analysis. The banners of FTP servers are obscured and only 2,588 (8.04%) contain the software version. We list the identifiable servers in Table 14. In total, 70.83% of these servers run software that is end-of-life or outdated⁷, which poses a high security risk. We also find that two affected services explicitly state in their banners that anonymous login is allowed. Furthermore, the affected services may be brute-forced if the firewalls are bypassed.

Software     Supported      EOL or Obsolete
FileZilla    0              13.49% (349)
MikroTik     17.39% (450)   10.36% (268)
ProFTPD      0              12.67% (328)
Serv-U FTP   0              1.31% (34)
vsFTPd       11.79% (305)   33.00% (854)

OS                      Affected       Public           EOL
Windows 2000/XP         1.19% (12)     0.02% (1)        ✓
Windows 2003            4.74% (48)     0.06% (4)        ✓
Windows 7/2008          44.37% (449)   32.08% (1,992)   ✓
Windows 8/8.1           13.14% (133)   12.24% (760)     ✓
Windows 2012            0              0.05% (3)
Windows 10/2016/2019    26.79% (271)   41.78% (2,594)
Windows 11/2022         9.78% (99)     13.77% (855)
TABLE 15: OS Distribution of SMB Services on Windows

figure. After investigation, we find 10,890 services with hostnames starting with LINKSYS, which appear to be Linksys routers. Further analysis reveals that these routers are customized and have an RCE vulnerability which can be exploited from the Internet. We discuss this case in §8.1.
B.1. Summary