As devices are joined to the Internet and other networks, the potential for network vulnerabilities increases. The recent growth of the Internet of Things (IoT) is driving an expansion in both scope and quantity of network-connected devices. These devices are also now seen in increasingly sensitive settings, such as in operating rooms and automobiles. Consumers are also transmitting increasingly sensitive information over the Internet, including financial and personal-health data. The necessity for rigorous security standards is especially important given these elements of the current landscape.
Modern security best practice is to use Public-Key Cryptography to securely transmit data to a remote source – any party can encrypt data with the public portion of the remote source’s key, which can then only be decrypted by the remote source’s private key. The RSA algorithm has long served as one of the most popular encryption techniques for this encryption measure. The security of RSA relies on the inability of another party to determine two randomly-chosen prime numbers from which the RSA public key is derived. If these prime factors are discovered, the RSA private key can be re-derived, and an attacker can impersonate the remote source or decrypt stored communications that rely on the confidentiality of the private key.
In an attack that has already received a significant amount of attention, researchers are able to efficiently compute these prime factors for certain RSA public keys that have been generated with poor randomness, as shown in , , and . Large numbers of RSA public keys can be acquired from many sources, and these can be mined for common factors between the keys. Each key that is found to have a common factor with another key in the dataset is compromised.
We reexamine this attack and its implications in the current landscape of ubiquitous and sensitive communications between users and devices. In 2019, with the large number of devices on the Internet and in other data sets like Certificate Transparency (CT) logs, this attack presents a serious threat if proper precautions are not in place. As the number of keys grows, it is more likely that weakly generated factors in RSA public keys will be discovered. Coupled with the availability of cheap computing resources and sensitivity of communications, the attack is as potent as ever.
As part of an ongoing effort to continuously monitor and improve Internet security, we outline an efficient, highly scalable approach to testing the vulnerability of public RSA keys in use. We use the SSL/TLS certificate discovery capability of the Keyfactor platform to build a database of 60 million RSA keys used on the Internet. We then augment this dataset with 100 million certificates available through Certificate Transparency logs. This data is analyzed on a single virtual machine in the Microsoft Azure cloud, using our scalable GCD algorithm for shared factors adapted from . The analysis reveals that at least 435,000 weak certificates – 1 in 172 of the certificates we found on the Internet – are vulnerable to this attack.
435,000 WEAK CERTIFICATES
1 IN 172 OF THE CERTIFICATES WE FOUND ON THE INTERNET
RSA is used in in the process of encrypting data to send across a network. The server transmits its RSA public key to the client as a part of an SSL or TLS handshake. Part of the RSA public key contains the modulus n = p * q, where p and q are two randomly chosen primes of similar size. Primes p and q are kept secret, as knowing these values allows the private key to be calculated.
Ensuring that p and q are selected with sufficient randomness is a crucial component of keeping the public key secure. Factoring a large modulus n to obtain p and q is not feasible under normal circumstances. However, if keys are generated with poor randomness, then it becomes a concern that two public keys will share a factor once enough keys are generated. If two RSA moduli n1 and n2 share precisely one prime factor p, then computing the Greatest Common Divisor (GCD) of n1 and n2 will reveal the value of p. The GCD computation is significantly easier than straightforward factoring, and can easily be performed in practice. The other factors of n1 and n2 can then trivially be found by the simple calculations n1 ⁄ p and n2 ⁄ p, respectively, compromising both keys. This GCD computation can be scaled to analyze all pairs of keys in sub-quadratic time in the number of keys .
Selecting the prime factors of appropriate size with uniform randomness should prevent two moduli from ever sharing a factor in practice. However, if there is a flaw in the random number generation when choosing primes, a collision is likely in a sufficiently large dataset. Attackers can use this knowledge to collect a large number of RSA public keys and then look for GCDs between their moduli to search for factors shared by any pair.
Previous research teams have performed large scans on the Internet to demonstrate and investigate the implications of this attack. The attack received significant attention in 2012 and researchers were able to break tens of thousands of keys. There was a follow up on the attack in 2016 on a significantly larger data set, with a focus on trends in occurrences of weak keys from various vendors. We focused on three previous publications , , and  in developing our analysis.
The authors of  collected approximately 6.2 million digital certificates across the Internet and found that approximately 4.3% of these certificates fully share their RSA modulus with another. The authors clustered the certificates together by modulus to further examine this duplication. Notably, single clusters of sizes 16489, 8366, 6351, and 5055 were found, and 58913 of the clusters have precisely two certificates. Additionally, they found that 12934 of the distinct RSA moduli were able to be factored. The authors found 1200 factors that are shared by more than one modulus. The most widely used individual factors were shared by up to 4627 certificates.
In , researchers collected 5.8 million unique TLS certificates and 6.2 million unique SSH host keys. Between these two collections, they found 11 million distinct RSA moduli and were able to factor 16,717 distinct public keys. This led to breaking 23,576 (.4%) of their TLS certificates and 1,013 (.02%) of the RSA SSH host keys.
The authors investigated the causes of moduli sharing factors by examining random number generation implementations. A major issue in the flawed random number generation was traced back to a low amount of entropy in the random number pool when the keys were generated. This issue most commonly arises on lightweight devices that do not have much input from which to gain entropy.
This analysis found that only two moduli on publicly trusted certificates could be factored, and both certificates had expired. The large majority of the broken keys came from network devices such as routers, modems, or firewalls.
The 2012 results of  were reexamined in 2016 in  with an emphasis on examining how vendors and end-users responded to the vulnerabilities. The authors examine 81 million distinct RSA keys and were able to factor 313,000 keys (.37%). They also examine trends in the number of weak keys from various companies. Since 2012, a significant number of new devices from companies including Huawei, D-Link, and ADTRAN were found to have vulnerable RSA keys.
Additionally, it was exceedingly rare for end-users to patch their devices to fix vulnerabilities found during the 2012 study. The authors note the “distressing implications” this has for the IoT era, as the number of devices on a network is rapidly increasing.
Rise of the IoT and Cloud Computing
Despite the large number of keys broken by this attack previously, it is unlikely that a key that has been properly generated with a sufficient amount of entropy could be broken with this technique. However, the requirement of sufficient entropy may not always be met when generating keys. In particular, some IoT devices may not have enough entropy available to generate keys without an external source. Major websites can prove vulnerable to this flaw as well, as the public and private keys used with their SSL/TLS certificates are generated by the website owner or administrator, and not by the CA that signs the certificate to make it publicly trusted. This means that the process the site operators used to generate their keys is opaque to the CA, and in general they cannot analyze the submitted public key for poor generation practices.
As growth of the Internet and the IoT has continued, network sizes have grown significantly and Internet-connected devices are appearing in new places. Security is a major concern in this landscape because of the increasing amount of data handled by these devices and the introduction of connected devices in sensitive settings such as operating rooms and vehicles. Compromising any device on these critical networks could results in catastrophic failure if proper precautions are not taken. Similarly, consumers are increasingly relying on publicly available web services to handle sensitive data and perform high-impact operations, such as financial transactions, transmission and storage of valuable intellectual property, and applications for credit or other services. The potential for an attacker to obtain this sensitive information or perform fraudulent operations can significantly impact consumers.
The rise of IoT technologies and cloud-computing resources since the initial publication of this technique makes it more dangerous in today’s landscape for several reasons:
THE IoT IS COMPROMISED MOSTLY OF LIGHTWEIGHT DEVICES.
There is nothing inherently insecure about how such a device would generate an RSA key, but  found that lightweight devices are primarily at risk of this attack due to their low entropy states. Entropy in a device is required to prevent the random number generation from being predictable. Researchers were able to find deterministic “random” output when removing entropy. Lightweight IoT devices are particularly prone to being in low entropy states due to the lack of input data they might receive, as well as the challenge of incorporating hardware-based random number generation economically. Keys generated by lightweight IoT devices are therefore at risk of not being sufficiently random, increasing the chance that two keys share a factor and allow the key to be broken. The authors of  found that most of the keys that broken were from “low-resource” devices. Only two keys on two certificates were publicly trusted, and both of these certificates had expired.
THE ATTACK BECOMES MORE SUCCESSFUL WHEN MORE PRIME FACTORS ARE ANALYZED.
As the number of discovered certificates grows, the number of certificate pairs to analyze for shared factors increases quadratically. The IoT landscape has drastically increased the number of devices on networks and this trend will only continue. Estimates and projections vary considerably, but Gartner predicts 20 billion devices will be deployed on the Internet within the next year .
DEVICES ARE IN NEW AND MORE CRITICAL ENVIRONMENTS.
Compromising an RSA key has much more potential to be catastrophic in 2019. An RSA key being compromised now means more than personal or enterprise data being compromised. Critical real-time environments such as operating rooms, automobiles, industrial control devices, and home security systems now operate using RSA keys. Physical property and lives are therefore now at risk with RSA keys being compromised.
IoT DEVICES ARE MORE DIFFICULT TO PATCH.
Patching many IoT devices across a network can be tedious for a user. There can be many independent devices for a user to consider and manage, often without centralized automation across the devices. In some cases – where original device design did not support patching, or where the device is inaccessible – patching may be impossible. Additionally, due to the long life span of some IoT devices, it is possible that vulnerabilities can be found on live devices that are no longer being actively supported by the manufacturer. This means that a vulnerability to key compromise can be exploitable for much longer, and more difficult to remediate.
COMPUTING AND SCANNING RESOURCES ARE MORE READILY ACCESSIBLE.
Cloud services are readily available for rent to allow an adversary to both acquire and analyze the data efficiently. The only obstacle that an adversary truly faces in this attack is the implementation details. The attack itself has been well studied and understood, and the resources needed to execute it are now readily available at modest cost as well. To illustrate, development and computation resources spent for this study, excluding data acquisition, totaled less than $3000. Furthermore, pre-collected datasets of certificates and keys are available for purchase or even for free download, which can reduce the time and cost for an attacker to perform this computation.
For the above reasons, we feel it is necessary to reevaluate the implications of this attack in a modern landscape. As in previous works, the goal of this study is to provide an awareness of this attack and improve Internet security overall. Given the nature of the research, it is necessary to use real-world data for our analysis. However, as the results produce compromised keys used to secure real-world communications, we do not provide details on broken keys beyond high-level statistical summaries and we do not retain the broken keys past their use for this project.
When analyzing the 45 million moduli from our 2015-2017 Internet scans, along with the first million low primes, we were able to break 192,709 keys. These 192,709 broken keys corresponded to 344,055 distinct certificates in the original dataset, or 0.56%. Twelve certificates failed signature validation when correlating the broken key with the original certificate. These appear to represent minor variations of legitimate certificates that occur in our dataset but were not compromised. The variants all produced an invalid public key, in that it is not a product of two primes, which causes the signature check to fail. These certificates were all CA certificates obtained by multiple ICI scans, so we attribute this to file corruption in the certificate chain on the hosting server.
When incorporating the results of the 2019 ICI scan and the 100 million certificates from CT logs, we were able to compromise 249,553 distinct keys corresponding to 435,694 certificates. Of these results, only five certificates are publicly-trusted certificates found in CT logs, including one domain with two compromised certificates, issued by different CAs. As of this writing, none of these certificates remain in use online. The remainder of the certificates are self-signed, privately-rooted, or device certificates. Excluding the CT logs, these numbers indicate that around 0.58% of certificates online during the scanning period are vulnerable to this attack. Statistical summaries are given in Table 1.
Table 1. Statistics on Broken Keys
It is interesting to note that some of the broken moduli are used by considerably more than one certificate, with the average number of certificates per key being 1.75. The most commonly occurring modulus value was used by 1737 certificates, and 19 moduli were used by more than 300 certificates. The usage statistics for individual factors are even more concentrated, and Table 2 gives the top ten most used factors. These ten factors together, while less than .01% of the factors across all keys, account for more than 6% of the broken certificates. To reduce the odds of causing harm by helping others to re-derive real private keys, we do not provide any association between these factors and keys or certificates that used them. These factors have also been redacted to prevent abuse while giving manufacturers the opportunity to recognize if they’ve generated a common weak factor; the first and last 8 bytes are provided to facilitate this.
Table 2. Most Commonly Found Factors
Of these top factors, eight appear to represent factors of an appropriate length for a 1024-bit RSA key and two appear to represent factors of a 512-bit RSA key. Across all reused factors in a typical range for each common key size, the distribution is given in Table 3. In addition, one 2204-bit factor was shared by two keys, along with 16 factors of 13 bits or less found by including the low primes.
Table 3. Key Length Distribution
As with previous results in this field, we are able to determine some information about the usage of the broken certificate by looking at its subject. Of these certificates, for example, 217,988 – almost exactly 50% of compromised certificates – contain the name of a large network equipment manufacturer previously notified as a result of  and . This includes 66,939 certificates discovered in the 2019 ICI scan, indicating that the manufacturer and/or consumers of their products continue to have issues with security in the field. In fact, many of these devices used the same key. Unsurprisingly for reasons already discussed, at least 12 of the 20 most common certificate subjects in our dataset appear to represent consumer IoT devices. In some cases, we were able to determine the exact model of device represented. Other common subjects include very simple IP addresses like “CN=192.168.1.1” (25,075 certificates) and subjects that appear to have been taken from examples without modification, such as “E=HOST@ANYCOMPANY.COM, CN=NOHOSTNAME.NODOMAINNAME, O=ANY COMPANY, L=ANY LOCALITY, ST=ANY STATE, C=US” (242 certificates). As we have not been able to identify or notify all manufacturers, we do not provide further information about specific subjects.
The fact that a quarter million keys from the past five years are vulnerable to this attack, even for an attacker with limited resources, is concerning. A party with a re-derived private key for an SSL/TLS server certificate can impersonate that entity, and network clients attempting to connect to that endpoint cannot distinguish the attacker from the legitimate holder of the certificate. This leads to a risk that an unsuspecting client will transmit sensitive information, such as login credentials or financial information, to the attacker, who can then use this information to cause a variety of adverse effects. In today’s landscape where these clients may represent automobiles, medical implants, or other critical devices connecting to a backend system, the impersonated service may cause the device to malfunction and cause physical harm. Furthermore, in some cases, the private key can be used to decrypt stored communications to a server long after those communications have taken place. This risk can be eliminated by encrypting a session with a cryptographic standard that employs forward secrecy, a property that ensures prior communications are not disclosed simply by key compromise. Although the value these communications may provide to the attacker tends to decrease with age of the communications, a large amount of information transmitted without forward secrecy may be exploited by a party with the resources to obtain and store the encrypted messages. One Internet security company that monitors the cryptographic features offered by servers finds that only about 50% of servers during the time period of the ICI scans offered forward secrecy, leaving the other half vulnerable to compromised keys . While the situation has improved since – in particular, TLS 1.3 requires forward secrecy – this older traffic is all vulnerable to disclosure. It is worth noting that even RSA keys that were well-generated and not currently compromised will likely be broken in the future with the advent of practical quantum computers, for which integer factoring is an easy problem. Other cryptographic systems such as Eliptic-Curve Cryptography (ECC) are also vulnerable to compromise by quantum computers. It is also worth noting that, while the particular attack described here is specific to RSA keys, ECC does still rely on secure random number generation and other attacks could still potentially exploit a lack of entropy in key generation to compromise an ECC key.
With modest resources, we were able to obtain hundreds of millions of RSA keys used to protect real-world traffic on the Internet. Using a single cloud-hosted virtual machine and a well-studied algorithm, over 1 in 200 certificates using these keys can be compromised in a matter of days. These weak keys expose users to a wide variety of harm, from disclosure of confidential information to failures of safety-critical systems. This vulnerability is only one of many potential threats that can cause a key that appears secure – or that was once secure – to be unexpectedly compromised. These concerning findings highlight the need for device manufacturers, website and network administrators, and the public at large to consider security, and especially secure random number generation, as a paramount requirement of any connected system. Systems should attempt to use security best practices from inception, and in any case must have the ability to securely update both software and cryptography to protect against risks like this.
 N. Heninger, Z. Durumeric, E. Wustrow, and J. A. Halderman, “Mining your Ps and Qs: detection of widespread weak keys in network device,” Proceedings of the 21st USENIX Security Symposium, August 2012.
 A. K. Lenstra, J. P. Hughes, M. Augier, J. W. Bos, T. Kleinjung, and C. Wachter, “Ron was wrong, Whit is
right,” 32nd International Cryptology Conference, CRYPTO '12, August 2012.
 M. Hastings, J. Fried, and N. Heninger, “Weak keys remain widespread in network devices,” Proceedings of the 2016 Internet Measurement Conference (IMC '16), ACM, New York, NY, USA, 49-63.
 Gartner, “Gartner says 8.4 billion connected “Things” will be in use in 2017, up 31 percent from 2016”. https://www.gartner.com/en/newsroom/press-releases/2017-02-07-gartner-says-8-billion-connected-things-will-be-in-use-in-2017-up-31-percent-from-2016.
 Certificate Transparency, https://www.certificate-transparency.org/.
 RSA Laboratories, “RSA Numbers”, http://www.ontko.com/pub/rayo/primes/rsa_fact.html.
 B. Cloostermans, “Quasi-linear GCD computation and factoring RSA moduli”, August 2012. https://pdfs.semanticscholar.org/4124/edde77caeecbbee378ccdd12ffbaa2e9f322.pdf.
 Qualys, “SSL Pulse,” https://www.ssllabs.com/ssl-pulse/.
By 2021, organizations with crypto-agility will suffer 60% fewer cryptographically related security breaches and application failures than those without a plan.
The Need for Crypto-Agility
Every certificate expires, every algorithm evolves, and with recent advances in quantum computing, the risks grow even greater. To prevent the risk of breach or loss of trust, organizations must be able to respond quickly to find and replace affected keys and certificates — often at massive scale.
Cryptographic compromises are often sudden and unpredictable. Even with limited resources, attackers can exploit these weaknesses, threatening to disrupt the security or functionality of business and life-critical devices. Just knowing how many keys and certificates you have can be a challenge in itself, compounded by the vast number of devices and applications that rely on cryptography to authenticate connections, encrypt data, or verify the integrity of code and firmware updates.
Crypto-agility means knowing everywhere cryptography is used across your organization (i.e. certificates, algorithms, protocols, and libraries), and being able to quickly identify and remediate vulnerabilities, without disruption. From enterprises securing their business-critical data and infrastructure, to manufacturers building the next-generation of connected devices, every organization must prepare for crypto-agility, or risk falling behind the crypto-curve.
COMPROMISE OR BREACH OF ROOT
When a Root of Trust (RoT) is compromised, all trust is lost. In the case of a Certificate Authority (CA) issuing certificates, a breach renders all of your public and private keypairs moot, or even dangerous, as they can be issued and used maliciously. The immediate replacement of that RoT is required, along with the updating of all certificates and keys used by devices.
CRYPTO LIBRARY BUG
Discovery of a bug in crypto libraries may result in the need to generate new keys and re-issue certificates according to the technology used in patching or replacing it. A TPM flaw, known as ROCA, left millions of devices vulnerable, requiring end users to install firmware updates and replace of weak encryption keys.
Discovery of weaknesses and flaws in cryptographic algorithms – like SHA-1 and MD5 – routinely challenge security teams’ ability to adapt and respond effectively. Any keys using the affected algorithm are rendered insecure or untrusted. Similar to a compromised RoT, a complete replacement is required.
NIST predicts that large-scale quantum computing will break public key cryptography in use today. This leap forward will disrupt key components of cryptography, as we know it, forcing organizations to pivot to new strategies, cryptographic standards and technologies.
Crypto-Agility for IoT
Building IoT devices that are secure by design is critical not only for today’s threat landscape, but for product and service lifecycle challenges down the road. With many IoT products expected to last more than 10+ years, and warranty periods extending well into that, manufacturers must be able to securely update both the software and cryptography on their devices to maintain trust.
It’s not enough to have the ability to swap out certificates and keys. It is imperative to be non-disruptive to customers, attainable within targeted timeframes, and achievable within ecosystems consisting of hundreds of millions (or more) distributed, and often resource constrained, IoT devices.
SECURE IoT DEVICE DESIGN
Keyfactor Control makes it easy and affordable to embed high-assurance secure identity into every step of the IoT device lifecycle — from device authentication to continuous secure firmware updates.
Empower your development teams to build it right.
Design identity and security into your devices from the start.
Secure identity must be rooted in design. Keyfactor Control enables you to integrate robust cryptography that will ensure your devices do only what you intend them to do throughout their lifecycle.
Manufacture and deploy anywhere with confidence.
Build devices reliably and securely at massive scale.
The options are limitless. Be free to consider the best and most cost-effective ways to secure your devices, knowing you can ensure authenticity and security during the build and deployment of your products.
Keep devices secure through the entire lifecycle.
Assure customers that your products are built to stay secure.
Stay confident that your products are ready to go the distance – safely. Over-the-air firmware updates, identity refreshes, access controls, and other essential tasks ensure everything stays up to date.
From design through deployment, Keyfactor works with industry-leading device manufacturers to develop, realize, and sustain the most secure devices on the market. Whether you’re designing life-critical medical devices or manufacturing a new fleet of connected vehicles, our team of experts works with you to meet your timelines and ensure easy implementation.
Crypto-Agility for PKI
Public key infrastructure (PKI) is critical to securing your enterprise, but it relies on trusted cryptographic algorithms, industry standards, and certificate authorities to work effectively. If the integrity of any of these components is broken, organizations must act quickly to restore trust, or risk disruption to their business, or worse, a security breach.
Manual processes, spreadsheets and monitoring tools may have worked for small certificate counts, but most organizations today have come to recognize that there are far more certificates deployed than they can track or even know about. The result is an increase in efforts and costs to stay on top of them, with the risk of security degradation from error and omission.
CERTIFICATE LIFECYCLE AUTOMATION
Keyfactor Command is a complete and scalable platform designed to empower enterprises to discover, manage and automate the lifecycle of every key and digital certificate across their environment.
Gain visibility across your entire network.
Discover and inventory every certificate, without compromise.
Get end-to-end visibility from virtually any public and private CAs to distributed web and app servers, load balancers, firewalls, containers, multi-cloud infrastructure, mobile and IoT devices.
Secure all of your digital identities with ease.
Continuously monitor status and identify vulnerabilities.
Stay ahead of cryptographic vulnerabilities. Group certificates, monitor status, set notifications, and enforce consistent policies to ensure that all keys and certificates are kept compliant and up to date.
Achieve true business agility through automation.
Automate the lifecycle of keys and certificates at scale.
Eliminate manual, time-intensive tasks and respond quickly to change – whether it’s replacing all SHA-1 SSL certificates in one action, or updating trusted root stores for all network firewalls and load balancers in one shot.