- Internet security relies on public key cryptography often derived from prime factorization
- Researchers set new record computations of integer factorization for 240- and 250-digit semiprime numbers
- These benchmarks indicate how many digits a key must contain to remain secure
Public key cryptography is a mainstay of the internet, allowing people to send secure, sensitive data over what is essentially an insecure network. It uses a pair of keys to encrypt and decrypt data to protect it against unauthorized access or use.
Network users have a pair of keys: one private and one public. If other users want to send encrypted data to someone, they get the intended recipient’s public key from a public directory. This key is used to encrypt the message and to send it to the recipient. When the message arrives, the recipient decrypts it using their private key, to which no one else has access.
For public key cryptography to work, it is essential that deriving the private key from the public key is computationally infeasible. This allows public keys to be freely shared, giving users an easy and convenient method for encrypting content and verifying digital signatures. Private keys, meanwhile, are kept secret, ensuring only their owners can decrypt content meant for them.
Nowadays, many of the main algorithms for public key cryptography are based on the RSA cryptosystem, which leverages the difficulty of factorization into prime numbers. This means working out the two prime numbers (the private key) that can be multiplied together to reach a specific, very large “semi-prime” number (the public key). The other main group of algorithms used for public key cryptography are based on the discrete logarithm problem (DLP), which is also mathematically difficult to solve.
Paul Zimmermann is a computational mathematician who has contributed to a number of the record computations for integer factorization and discrete logarithm. Recently, he and five colleagues tasked themselves with setting new records for each of these problems with numbers of the same size. Zimmermann was looking to disprove the previously held belief that the discrete logarithm problem is harder to solve than integer factorization.
“We wanted to show that these problems are actually about the same in difficulty,” says Zimmermann. “To prove this, we used the same hardware, the same software, around the same amount of CPU hours and the same algorithm – the number field sieve – to set new equally sized records.”
Zimmermann and his colleagues used their own implementation of the number field sieve algorithm in a piece of publicly available software they developed called CADO-NFS. In December 2019, the group announced two new records: RSA-240 and DLP-240. In the case of the RSA number, the 240 indicates that the group were able to calculate the two prime factors of a given 240-digit semi-prime number.
These records were set using a 32 million core-hour allocation at the Juelich Supercomputing Centre in Germany. After having set these records, however, the group had some time remaining within their allocation, and thus were also able to set another new record, RSA250, which was announced in February 2020.
When we set these records, we know that if a government state were to try and break such a cryptography scheme that they would have computing power in the order of ten times more powerful than we have at our disposal. We can therefore easily calculate the length of keys that would be impossible for such hypothetical government state attackers to break. ~ Paul Zimmermann
During the peak of the project, which ran for a year starting from April 1, 2019, the researchers had access to 128 nodes, each with about 100 cores, meaning that they had around 12,000 homogeneous cores working on the problem at the same time.
“The hardware at hand made our work much easier,” says Zimmermann. “Our job was also made easier by the excellent support from the PRACE engineers, although we had very few problems when running CADO-NFS on the PRACE machine in Germany.”
Algorithmic development within the CADO-NFS software was largely responsible for the achievement of these new records, the details of which are outlined in a paper that has been accepted for publication at the Crypto 2020 conference.
“The variants of the algorithm allowed us to speed up our calculations by a factor of between two and three,” explains Zimmermann. “For example, if we compare the discrete logarithm problem record to the previous one, mathematically the new record should be about three times more difficult. But, if you compare the running times, we used about the same number of CPU hours.”
The setting of these records represents an important benchmark for the design of modern cryptography schemes. When creating new schemes, it is important to know that they will be secure not only now but also for the next ten or twenty years. Government agencies use these records to provide recommendations for key sizes to ensure that they are secure.