CHAPTER 9
Asymmetric cryptography relies on keys made of two parts: a private (secret) part and a public (non-secret) part. A common misconception is that asymmetric cryptography is the same thing as asymmetric encryption. In fact, asymmetric cryptography has several distinct applications:
Encryption is for ensuring confidentiality. Signatures are for ensuring authenticity, integrity, and non-repudiation. Some asymmetric schemes offer signature generation only, while others can do both signatures and encryption. .NET 4.5 provides the following asymmetric implementations, all of which are FIPS-compliant:
Table 7: Asymmetric primitives available in .NET
| Asymmetric class | API | Signature | Encryption | Key agreement | Hash |
|---|---|---|---|---|---|
| | CAPI | yes | yes | yes (via encrypt) | enc: SHA-1 |
| | CNG | yes | yes | yes (via encrypt) | enc: SHA-2 |
| | CAPI | yes | no | no | SHA-1 only |
| | CNG | yes | no | no | SHA-1, SHA-2 |
| | CNG | not directly | yes (via key) | yes | SHA-1, SHA-2 |
This quick comparison is by no means comprehensive—there are many other factors and features that could be considered when assessing suitability for a particular scenario. Our pragmatic engineering perspective, however, compels us to focus on a single asymmetric scheme that can be used to address most of the common asymmetric-crypto scenarios, and describe how to use it correctly. One such Swiss Army knife asymmetric scheme is RSA. RSA is easy to use, widely available, and can do signatures, encryption, and key agreement.
The default RSA-CSP constructor creates 1024-bit keys, which no longer offer sufficient security, and are in fact disallowed by NIST SP800-131A. NIST allows ≥2048-bit keys, and there is little security to be gained from going higher. NIST SP800-57 Part1-Rev3 recommends 2048-bit RSA keys for ~112 bits of security, and 3072-bit RSA keys for ~128 bits of security. However, in August of 2015, the NSA made a poorly explained change to their Suite B recommendations and requirements. The NSA not only inexplicably added RSA to Suite B, but also required a minimum 3072-bit key size. Since the cryptographic community is not buying the NSA explanations for the change, it is probably prudent to follow the higher minimum (just in case).
We therefore recommend a 3072-bit RSA key size. Another RSA-CSP constructor explicitly takes the RSA key size as argument, which is the constructor you should use. RSA-CSP constructors do not trigger the expensive RSA key generation, and thus are fast. RSA key generation is triggered only during key-requiring operations (encryption, signing, key export) if no key was provided or imported prior.
An RSA key-pair can be exported and imported with the ExportParameters/ImportParameters methods, which work with the RSAParameters structure. RSAParameters contains eight public byte[] fields, two of which describe the public key (Exponent and Modulus). Microsoft’s RSA-CSP implementation uses a constant public exponent, 65,537, which translates into the Exponent field always being three bytes {01, 00, 01} (big-endian storage). This effectively means that the RSA-CSP public key can be represented by the RSAParameters.Modulus byte array alone, which has the same length as the key length in bytes. For a 3072-bit RSA key-pair, the modulus is 384 bytes long.
Another way to export the key-pair is with ExportCspBlob/ImportCspBlob methods to get the serialized blob as byte[]. ExportCspBlob(includePrivateParameters:false) is a tempting alternative to using the Modulus field directly after ExportParameters(includePrivateParameters:false). The public-key blob produced by ExportCspBlob(false) will be ~20 bytes longer than the Modulus byte array. Another reason for our slight preference toward public-key extraction from Modulus is that it is easier to make the mistake of accidentally blob-exporting the entire key instead of just its public components. Using Modulus directly makes it harder to make the mistake of using the entire key when only the public key is intended to be used.
ExportCspBlob(true) exports the entire key with a close-to-optimal blob size. You are unlikely to see any major savings from a custom serialization of all fields in the RSAParameters structure versus the ExportCspBlob(true) output.
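As a rough illustration of the two public-key export routes discussed above, the following sketch exports the public part of a fresh RSA-CSP key both ways so that the sizes can be compared. The class and method names (RsaPublicKeyExportSketch, ExportPublicKey) are hypothetical and exist only for this example; the RSACryptoServiceProvider calls are the ones named in the text.

```csharp
using System;
using System.Security.Cryptography;

public static class RsaPublicKeyExportSketch
{
    // Hypothetical helper: exports only the public components of a new RSA-CSP key.
    public static void ExportPublicKey(
        int keyBits, out byte[] modulus, out byte[] exponent, out byte[] publicBlob)
    {
        // The constructor is cheap; the key is generated on first use (the export below).
        using (var rsa = new RSACryptoServiceProvider(keyBits))
        {
            // includePrivateParameters: false keeps the private fields out of the result.
            RSAParameters pub = rsa.ExportParameters(includePrivateParameters: false);
            modulus = pub.Modulus;   // key-length-in-bytes long (384 for 3072 bits)
            exponent = pub.Exponent; // {01, 00, 01} for RSA-CSP-generated keys

            // Serialized public-key blob: modulus plus a small header.
            publicBlob = rsa.ExportCspBlob(includePrivateParameters: false);
        }
    }
}
```

Calling ExportPublicKey(3072, ...) from a LINQPad script or a console program and comparing publicBlob.Length with modulus.Length should show the small blob overhead described above.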
The default RSACng constructor creates 2048-bit keys (an improvement over RSA-CSP default). However, we reiterate our recommendation to use a minimum 3072-bit key size instead:
Code Listing 15
    byte[] privKey; CngKey cngPrivKey;
    byte[] pubKey; CngKey cngPubKey;
    byte[] signature;
    byte[] data = new byte[] { 1, 2, 3 }; // some data

    // generate RSA keys (private and public)
    using (var rsa = new RSACng(keySize: 3072))
    {
        privKey = rsa.Key.Export(CngKeyBlobFormat.GenericPrivateBlob);
        pubKey = rsa.Key.Export(CngKeyBlobFormat.GenericPublicBlob);
    }
    cngPrivKey = CngKey.Import(privKey, CngKeyBlobFormat.GenericPrivateBlob);
    cngPubKey = CngKey.Import(pubKey, CngKeyBlobFormat.GenericPublicBlob);

    // generate RSA signature with private key
    using (var rsa = new RSACng(cngPrivKey))
    {
        signature = rsa.SignData(
            data: data,
            hashAlgorithm: HashAlgorithmName.SHA384,
            padding: RSASignaturePadding.Pss);
    }

    // verify RSA signature with public key
    using (var rsa = new RSACng(cngPubKey))
    {
        rsa.VerifyData(
            data: data,
            signature: signature,
            hashAlgorithm: HashAlgorithmName.SHA384,
            padding: RSASignaturePadding.Pss).Dump();
    }
RSA-CSP provides SignData/VerifyData and SignHash/VerifyHash method pairs for signing. The *Data methods will internally hash the data prior to signing. The *Hash methods accept an externally calculated hash and an OID string matching the hash algorithm to be used.
Code Listing 16
    string hashOID = CryptoConfig.MapNameToOID("SHA384");

    using (var rsa = new RSACryptoServiceProvider(3072))
    using (var hashAlg = new SHA384Cng())
    {
        var data = new byte[] { 1, 2, 3, 4 };
        var hash = hashAlg.ComputeHash(data);

        var signature1 = rsa.SignData(data, hashAlg);
        var signature2 = rsa.SignHash(hash, hashOID);

        Enumerable.SequenceEqual(signature1, signature2).Dump(); // True
    }
The resulting RSA signature size is equal to the RSA key bit-length. The *Data methods call the *Hash methods internally, so the *Hash methods might be slightly faster. Another reason to prefer the *Hash methods is when the data hash is used in more than one place, and there is no need to calculate the data hash more than once. The *Data methods make no attempts to dispose or otherwise clean up the IDisposable HashAlgorithm object instance, which makes sense considering that it could be provided externally (as in our previous example). However, the hash instance can also be constructed internally if the passed-in object is not a HashAlgorithm instance but a string like “SHA384”, an OID string, or a type, like typeof(SHA384Cng). Even under these internal hash construction scenarios, no cleanup is done by the *Data methods. Since most fast hash implementations are native and use unmanaged memory, hoping for the garbage collector to eventually kick in and do its magic might not be sufficient, especially when classes like SHA384Cng with an internal BCryptHashAlgorithm nucleus have no finalizers for GC to trigger.
There might be nothing wrong with the *Data method memory cleanup in all scenarios. However, when unmanaged IDisposable objects are not disposed inside crypto code, it makes us very uneasy. We prefer to err on the side of caution and recommend that you do not use the *Data methods with a second parameter being anything other than an externally provided HashAlgorithm instance whose lifetime and cleanup you can control.
All RSA-CSP signature methods use an older PKCS #1 v1.5 padding scheme, which has some weaknesses. A more modern RSA signature-padding scheme is RSA-PSS, which is implemented in newer (CNG) Windows OS crypto modules, but is not available in .NET 4.5. In 2013, we wrote:
Perhaps the next version of .NET would provide something like “RSACng” which would be more cryptographically up-to-date.
In July 2015, Microsoft announced .NET 4.6, which added the RSACng class with RSA-PSS signing padding and OAEP encryption padding using the SHA-2 hash family. We will cover RSACng later, but if you are stuck with RSA-CSP for signatures, we recommend SHA-384 (SHA384Cng) on 64-bit .NET platforms.
RSA secret key exchange/agreement is often called “RSA encryption” because it technically is. RSA encryption can only process a short message of length equal to [RSA-key-size-in-bytes] – N, where N is a number of bytes consumed by and dependent on a particular RSA encryption padding scheme used. There are two RSA encryption padding schemes—PKCS #1 v1.5 and OAEP—both of which are available in RSA-CSP encryption APIs.
OAEP is the more secure padding scheme. OAEP padding is to RSA encryption what RSA-PSS padding is to RSA signatures. You should always use OAEP with RSA encryption.
Under OAEP padding, N is 2 × [hash-size-in-bytes] + 2. Example: Using SHA-384, N = 2 × 48 + 2 = 98 bytes, which means that a to-be-encrypted message under a 384-byte (3072-bit) RSA key with SHA-384 OAEP padding can be at most 384 – 98 = 286 bytes long. We could instead consider using a slower SHA-256 OAEP padding to increase maximum message size to 318 bytes (384 – 2 × 32 – 2). This message length restriction makes classic RSA encryption mostly suitable for key exchange and agreement, since symmetric cryptographic keys are typically short enough.
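The arithmetic above can be written compactly. The symbols k (RSA key size in bytes), h (hash output size in bytes), and m_max (maximum message length in bytes) are notation introduced here for illustration only.

```latex
\begin{align*}
N &= 2h + 2, \qquad m_{\max} = k - N = k - 2h - 2 \\
\text{SHA-384 } (h = 48):\quad m_{\max} &= 384 - 2 \cdot 48 - 2 = 286 \\
\text{SHA-256 } (h = 32):\quad m_{\max} &= 384 - 2 \cdot 32 - 2 = 318
\end{align*}
```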
The RSA-CSP encryption implementation has no hash function agility and always uses SHA-1 (20 bytes) internally, which means that the maximum message length under RSA-CSP encryption with OAEP padding is [RSA-key-size-in-bytes] – 2 × 20 – 2. For example, the RSA-CSP-encrypted message under a 384-byte key can be at most 384 – 42 = 342 bytes long. If you need to RSA-encrypt more than 342 bytes, you can increase the RSA key size up to the maximum RSA-CSP-supported 16,384 bits (which can take minutes to generate). Note that RSA-OAEP encryption is non-deterministic and produces different ciphertext every time. The OAEP padding scheme ensures integrity; altered ciphertext will fail to decrypt. The hash function is only used within the OAEP padding scheme, and is not part of the raw RSA encryption. The use of SHA-1 in the RSA-CSP OAEP implementation has no known weaknesses and is perfectly adequate, even by NIST standards (SHA-1 is not used for signatures here), although hash function agility would have been nice.
Code Listing 17
    /* RSA-CSP */
    int keyBits = 3072; // assumed value; keyBits is not defined in the original listing
    int maxDataBytes = (keyBits / 8) - (20 * 2) - 2; // SHA-1 is 20 bytes
    //maxDataBytes += 1; // go beyond max size
    maxDataBytes.Dump("maxDataBytes");

    using (var rsa = new RSACryptoServiceProvider(keyBits))
    {
        var data = new byte[maxDataBytes];
        new RNGCryptoServiceProvider().GetBytes(data);

        var cipher = rsa.Encrypt(data, fOAEP: true);
        "Encryption works.".Dump();

        var data2 = rsa.Decrypt(cipher, fOAEP: true);
        "Decryption works.".Dump();

        Enumerable.SequenceEqual(data, data2).Dump(); // should be true
    }

    /* RSACng (a separate script from the one above) */
    int keyBits = 3072; // assumed value; keyBits is not defined in the original listing
    int maxDataBytes = (keyBits / 8) - (48 * 2) - 2; // SHA-384 is 48 bytes
    //maxDataBytes += 1; // go beyond max size
    maxDataBytes.Dump("maxDataBytes");

    using (var rsa = new RSACng(keyBits))
    {
        var data = new byte[maxDataBytes];
        new RNGCryptoServiceProvider().GetBytes(data);

        var padding = RSAEncryptionPadding.OaepSHA384;
        var cipher = rsa.Encrypt(data, padding);
        "Encryption works.".Dump();

        var data2 = rsa.Decrypt(cipher, padding);
        "Decryption works.".Dump();

        Enumerable.SequenceEqual(data, data2).Dump(); // should be true
    }
One issue to be aware of with RSA-CSP .Encrypt() is that it will accept a message of length maxDataBytes + 1 and produce a ciphertext without complaints, but decrypting such ciphertext with RSA-CSP .Decrypt() will throw an exception. We think this is just bad bounds-checking on Microsoft’s part, so make sure you calculate your own valid message sizes, or at least verify message round-tripping. RSACng, on the other hand, will throw an unhelpful “The parameter is incorrect” CryptographicException on encrypting data larger than maxDataBytes. While RSACng will at least not produce flawed ciphertexts (unlike RSA-CSP), it does not produce a meaningful error message that explains what is going on. Neither RSA implementation provides a helper to calculate the maximum allowed message size for encryption. For some reason, Microsoft expects developers to be aware of internal details and do their own math. MSDN does not cover these important nuances, either.
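Since neither implementation offers such a helper, a minimal sketch of one might look like the following. The names OaepLimits and MaxOaepMessageBytes are hypothetical (they are not part of .NET), and the caller is assumed to know the OAEP hash size in bytes: 20 for the SHA-1 used by RSA-CSP, 48 for OaepSHA384.

```csharp
using System;

public static class OaepLimits
{
    // Hypothetical helper: maximum plaintext length (in bytes) for RSA-OAEP,
    // computed as [RSA-key-size-in-bytes] - 2 * [hash-size-in-bytes] - 2.
    public static int MaxOaepMessageBytes(int keyBits, int hashBytes)
    {
        if (keyBits <= 0 || keyBits % 8 != 0)
            throw new ArgumentOutOfRangeException(nameof(keyBits), "Key size must be a positive multiple of 8 bits.");
        if (hashBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(hashBytes), "Hash size must be positive.");

        int max = keyBits / 8 - 2 * hashBytes - 2;
        if (max < 0)
            throw new ArgumentException("The key is too small for OAEP with this hash size.");
        return max;
    }
}
```

Checking data.Length against this value before calling Encrypt avoids both the silent RSA-CSP failure and the uninformative RSACng exception described above.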
Classic RSA-OAEP encryption works on short-length messages only, so in many contexts, “RSA encryption” actually means “RSA key exchange,” which we have just covered. A more useful “encryption” context is one that is similar to the “AES encryption” context, where the message length is practically unlimited. While block ciphers such as AES have operation modes (CBC, CTR, etc.) to securely process variable-length plaintext, no such modes exist for asymmetric encryption, which is the main reason to consider a hybrid symmetric/asymmetric approach (asymmetric encryption is often claimed to be much slower than symmetric encryption, but that is not the main reason for hybrid encryption popularity).
Let’s assume that we have four parties, A, B, C, and D, with established RSA keys, and each party is in possession of everyone’s authentic public keys. Consider a scenario of A wishing to send a confidential message to B and C, but not to D (the message for B and C is the same).
Hybrid RSA Encryption, Approach 1:
Pros:
Cons:
Hybrid RSA Encryption, Approach 2:
Pros:
Cons:
Hybrid RSA Encryption, Approach 3:
Pros:
Cons:
The goal of this hybrid approach exercise is not to lead you to the “one correct algorithm” you should implement, but to emphasize that if you ever find yourself or someone on your team playing these build-a-crypto-protocol games or, even worse, actually implementing something like that, you are doing it wrong.
There are at least three existing specifications with corresponding implementations that already leverage hybrid encryption concepts, and chances are they all do a better job than what you or your team is capable of: TLS, CMS, and OpenPGP.
TLS and CMS are based on a centralized trust model (centralized, allegedly trustworthy root CAs and trust chains), while OpenPGP is based on a decentralized trust model (web of trust). While neither model is perfect, the existing implementations of mechanics for container and payload cryptography are likely to be much better than a custom implementation.
Diffie-Hellman (DH) is a key agreement protocol that enables two parties A and B to establish a secret key over an open channel in such a way that both parties end up with the same key without divulging it to anyone listening in on the conversation.
The RSA-encrypt protocol can be used for key agreement as well, where party A picks a key, encrypts it with B’s public RSA key, and sends it to B. One interesting advantage of DH over RSA-encrypt for the purposes of key agreement is that DH key agreement implicitly binds both parties’ public DH keys (sender and receiver), while RSA-encrypt key agreement binds only to the receiver’s public RSA key.
Modern versions of the DH protocol are done over elliptic curves, and are called ECDH. It does not matter what elliptic curves are, as long as you know that ECDH is a modern DH protocol. There are different types of elliptic curves that can be used with ECDH, which roughly correspond to different ECDH strengths (similar to AES 128/192/256 strengths). NIST has standardized three different curves: P256, P384, and P521, all of which are available in .NET, and roughly correspond to 128-bit, 192-bit, and 256-bit security levels. NSA Suite B recommends P384, and so do we, not because 128-bit security of P256 is an issue by itself, but because P384 allows for a healthy security margin in case PRNGs have a bias reducing ECDH key entropy. The shared ECDH key is often used as a master key to seed additional symmetric keys (for AES and MAC, for example), which would need to be at 128-bit security level, and using P384 rather than P256 to generate a master key at a higher (192-bit) security level is a conservative approach.
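A minimal sketch of ECDH key agreement over P384 with the CNG-based .NET class might look like this. The class and method names (EcdhSketch, SharedSecretsMatch) are hypothetical, both parties live in the same process purely for demonstration, and the choice of a SHA-384 hash as the key derivation function is an assumption of this example rather than a recommendation from the text. ECDiffieHellmanCng requires Windows.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class EcdhSketch
{
    // Hypothetical demo: parties A and B each derive key material from their own
    // private key and the other party's public key; both results should match.
    public static bool SharedSecretsMatch()
    {
        using (var a = new ECDiffieHellmanCng(384)) // 384 selects the P384 curve
        using (var b = new ECDiffieHellmanCng(384))
        {
            // Both sides must agree on how the raw shared secret is post-processed.
            a.KeyDerivationFunction = ECDiffieHellmanKeyDerivationFunction.Hash;
            a.HashAlgorithm = CngAlgorithm.Sha384;
            b.KeyDerivationFunction = ECDiffieHellmanKeyDerivationFunction.Hash;
            b.HashAlgorithm = CngAlgorithm.Sha384;

            // Only public keys cross the open channel.
            byte[] keyAtA = a.DeriveKeyMaterial(b.PublicKey);
            byte[] keyAtB = b.DeriveKeyMaterial(a.PublicKey);

            return keyAtA.SequenceEqual(keyAtB);
        }
    }
}
```

In a real exchange each party would hold only its own ECDiffieHellmanCng instance and receive the other side's public key as bytes; the derived material would then typically seed further symmetric keys, as described above.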
While NIST curves have withstood decades of public scrutiny, they no longer reflect the state of the art in elliptic curve design, which is an active cryptographic research area with lots of recent developments, newer elliptic curve designs, and standardization attempts.
The RSA encryption and DH key agreement schemes we’ve discussed can provide security of the symmetric “session key” as long as recipients’ private keys are not compromised. We might be reasonably assured that these private keys are not compromised today, but it is very difficult to extend that assurance into the future. An encrypted communication can be recorded by an adversary that plans on obtaining the private keys in the future—either through advances in technology, or via some other future weakness in private key safeguarding.
An important consequence of such private key compromise is that all past communication sessions that utilized these private keys would also become compromised—not just one specific communication session.
Perfect forward secrecy (PFS) is an additional property of asymmetric cryptography schemes that prevents session keys from being compromised when long-term private keys are compromised. Our previous RSA examples used the RSA private key for two purposes: signature and session key decryption. A compromise of this key would thus enable an adversary to falsify signatures and decrypt communications. Message authenticity and confidentiality are both important, but confidentiality is supposed to last for a very long time (ideally forever), while signatures are typically a secondary concern compared to the confidentiality of the data itself. PFS addresses this problem by using two sets of asymmetric key-pairs: one long-term key-pair is used for signatures, while a different short-term (session-term) key-pair is used for symmetric session key agreement. This additional short-term key-pair used for symmetric key agreement is called an ephemeral key, with its public component signed by the sender’s long-term private key.
New ephemeral keys are generated for each session, and both communication sides try to forget the (private portion of) ephemeral keys after each session. This makes it more difficult to compromise security of an individual session, and also prevents extending a single session compromise to other sessions—thus providing PFS.
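The two-key-pair idea can be sketched as follows: a long-term RSA key signs the public part of a freshly generated ephemeral ECDH key, and the receiver verifies that signature before using the ephemeral key. This is an illustration under stated assumptions, not a protocol: the names EphemeralKeySketch and SignAndVerifyEphemeralKey are hypothetical, signer and verifier share one process and one RSA object for brevity, and the RSA-PSS/SHA-384 choice simply mirrors the earlier RSACng listing. Both CNG classes require Windows.

```csharp
using System;
using System.Security.Cryptography;

public static class EphemeralKeySketch
{
    // Hypothetical demo: sign an ephemeral ECDH public key with a long-term RSA key.
    public static bool SignAndVerifyEphemeralKey()
    {
        using (var longTermRsa = new RSACng(3072))       // long-term signature key-pair
        using (var ephemeral = new ECDiffieHellmanCng(384)) // per-session key-pair
        {
            // Public component of the ephemeral key, as sent to the other party.
            byte[] ephemeralPublic = ephemeral.PublicKey.ToByteArray();

            // Sender: authenticate the ephemeral public key with the long-term private key.
            byte[] signature = longTermRsa.SignData(
                ephemeralPublic, HashAlgorithmName.SHA384, RSASignaturePadding.Pss);

            // Receiver: verify with the sender's long-term public key
            // (the same object is reused here only to keep the sketch short).
            return longTermRsa.VerifyData(
                ephemeralPublic, signature, HashAlgorithmName.SHA384, RSASignaturePadding.Pss);
        }
        // Disposing the ephemeral key at the end of the session is the "forgetting" step.
    }
}
```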
Both RSA and Diffie-Hellman can be used as ephemeral keys. However, RSA keys are expensive to generate, while DH key generation is typically very fast. Most standards, including TLS, use DH keys for PFS. TLS cipher suites that provide PFS have “DHE” or “EDH” in them (Diffie-Hellman Ephemeral). The FIPS-approved TLS cipher suites can be found in RFC 6460.
Since PFS is a nice property to have, a common scenario is to use an ephemeral ECDH key-pair to agree on a new per-session master key, and then use that master key to securely send arbitrary-size messages. This approach is conceptually similar to the non-PFS, hybrid RSA encryption we covered earlier. Such ECDH-based arbitrary-size message encryption has been standardized as ECIES (also ANSI X9.63). ECIES uses the following components:
Sender A follows this sequence of steps to encrypt some plaintext for receiver B:
Receiver B follows this sequence of steps to decrypt [T-public, C] received from A:
Note: ECIES KDF can also take optional parameters that can act as “salt” or “associated data.”
A simple example of ECIES implementation can be found in the Inferno library documentation.
The key separation principle mentioned in the “Key derivation” chapter applies to asymmetric key pairs as well. We have covered how to use the same RSA key pair to do both encryption and signatures. However, it might be prudent to use one RSA key-pair for encryption, and a separate RSA key-pair for signatures. The reason is that encryption and signature keys often have different lifetimes and different escrow requirements. The encryption key is often destined for a long lifetime, and is often required to be escrowed to ensure that vital data can be decrypted. The signature key, on the other hand, should never be shared or escrowed to ensure non-repudiation, and can often have a short lifetime with an established expiration process.
Having separate asymmetric key pairs for encryption and signing thus enables a better key management process. One example is GPG, which generates separate asymmetric subkeys for encryption and signature purposes.
Many cryptographic protocols and standards talk about “key wrap”, so it is important to understand what key wrap is, and more importantly, why you are unlikely to ever need it.
Key wrap (KW) is an encryption mechanism that aims to provide privacy and integrity of the plaintext without the use of nonces (such as IVs or random bits). In other words, KW is a specialized form of authenticated encryption (AE) that does not require counters or a cryptographically secure random number generator.
The lack of nonce or random-bit dependency makes KW schemes a form of deterministic authenticated encryption—the same key KW-encrypting the same plaintext will always produce the same ciphertext.
The following reasons might make KW schemes preferable over other AE schemes:
Most real uses of KW are in legacy applications and protocols, or in low-level/hardware scenarios where CSRNG is unavailable or costly. Since KW schemes are trying to ensure that every single bit of ciphertext depends on every single bit of plaintext, they are doing a lot of additional permutation work, and are thus very slow—many times slower than AES. They also usually get progressively slower (per-bit) with longer plaintexts.
Fortunately, Windows OS and the .NET environment have a good, fast, and cheap CSRNG, which negates any potential benefits a KW scheme could have had over a good nonce-misuse-resistant AE scheme (for example, AEAD in the Inferno library). Some people do not understand KW, and mistakenly believe that KW is somehow superior to AE/AEAD because it is called “key wrap” and thus is somehow better suited for encrypting keys.
Do not fall for the “key wrap” hype. If you have a good AEAD scheme and a fast CSRNG, you can certainly forget about KW. Apple uses KW in iOS and for various hardware-accelerated key protection, but that does not make KW worth considering if you build .NET solutions. AE/AEAD always works, even when a plaintext is a key.