Application Security in .NET Succinctly®
by Stan Drapkin

CHAPTER 1

.NET Security

Cryptographic primitives

The .NET Framework security foundation is made of various cryptographic building blocks, which, in turn, rely on various cryptographic primitives. These primitives are supposed to be well-defined and well-understood cryptographic algorithms and services that follow the single responsibility principle, meaning they should do one thing only, and do it well. Cryptographic primitives are typically combined to form more complex cryptographic building blocks, which can be further combined to form security protocols. Not all primitives provided by Microsoft are well-defined or well-understood, but before we start improving them, we should be clear about basic security rules.

The first rule of sane security architecture is never design your own cryptographic primitives or protocols. The second rule of sane security architecture is never implement cryptographic primitives or protocols—even when you follow an established design, since your implementation will be flawed. Implementation flaws are a much more common cause of practical vulnerabilities than actual cryptographic weaknesses. There are no exceptions to the first rule—it is absolute. There are some exceptions to the second rule, however. One acceptable reason to break the second rule (outside of learning purposes) is to deal with situations when reliable implementations are either not available or are sorely lacking.

You will be breaking the second rule in upcoming chapters, but first you need an understanding of what primitives are available in .NET out of the box, and the strengths and deficiencies of their implementations. So let’s get on with it.

FIPS mode

Federal Information Processing Standards (FIPS) 140 are U.S. government standards for the accreditation of cryptographic modules. The Windows operating system already comes with certain FIPS-validated cryptographic modules. The OS-level FIPS mode itself is advisory only; it does not take any proactive steps to enforce application compliance or to prevent non-FIPS-validated modules from being used. The .NET-level FIPS mode, however, will prevent any non-validated cryptographic implementations that .NET itself is aware of (the ones that are part of the .NET Framework) from running by throwing an InvalidOperationException. Various regulations might require FIPS mode to be enabled in your .NET environment, so designing your solutions to be FIPS-mode-compliant is a good idea.
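As a rough illustration of the behavior described above, the following sketch probes whether a managed (non-validated) hash implementation can be constructed in the current environment. The FipsProbe class and its method name are hypothetical; SHA256Managed and InvalidOperationException are the only framework types it relies on. Treat it as a diagnostic aid under those assumptions, not as a compliance check.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class FipsProbe
{
    // Returns true if a 100 percent managed SHA-256 can be constructed and used.
    // Under .NET-level FIPS enforcement, the constructor is expected to throw
    // InvalidOperationException, in which case this returns false.
    public static bool CanUseManagedSha256()
    {
        try
        {
            using (var sha = new SHA256Managed())
            {
                sha.ComputeHash(new byte[0]);
            }
            return true;
        }
        catch (InvalidOperationException)
        {
            return false;
        }
    }
}
```

A caller could log the result of FipsProbe.CanUseManagedSha256() at start-up to learn early whether the deployment environment rejects managed implementations.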

The .NET Framework comes with several cryptographic API types:

100 percent managed implementations, typically ending with Managed.
APIs that call external, unmanaged implementations of the underlying Windows OS:
    The “old” CryptoAPI providers, typically ending with CryptoServiceProvider.
    The “new” crypto-next-generation (CNG) APIs, typically starting or ending with Cng.

The managed APIs (the first type) are not FIPS-validated, while both the “old” and “new” unmanaged APIs (the second type) are. The main reason to use the managed APIs is start-up speed: they are much faster to construct than their unmanaged equivalents. The downside of managed APIs is raw processing speed: they are slower at processing than the native alternatives. By Microsoft’s own guidance, the CNG APIs are preferred over CryptoServiceProvider APIs. The “what do I use?” decision for modern 64-bit .NET deployments is thus very simple:

Tip: Use CNG APIs.

In some scenarios that only process very few bytes of data (16–20 bytes or fewer), managed APIs might be slightly faster than CNG APIs due to faster start-up. Some older Windows platforms do not support CNG APIs. This should be less of a problem now, however, since all these “legacy” Windows platforms are no longer supported by Microsoft (for example, Windows XP or Windows Server 2003). We will see other reasons to choose CNG over CryptoServiceProvider APIs when we discuss HMAC.

Random number generators

The .NET Framework provides two random number generators: the System.Random class (we’ll call it RND) and the System.Security.Cryptography.RNGCryptoServiceProvider class (we’ll call it CSP), both of which reside in mscorlib.dll.

System.Random

RND’s purpose is generation of pseudo-random int and double types in a defined range. According to MSDN, RND implementation follows Knuth’s subtractive generator algorithm from Chapter 3 of The Art of Computer Programming. Knuth’s algorithm keeps a circular buffer of 56 random integers, which all get initialized with a deterministic permutation based on the provided seed value. Two indices into this buffer are kept 31 positions apart. A new pseudo-random number is calculated as the difference of the values at these two indices, which is also stored in the buffer. Unfortunately, Microsoft’s implementation of Knuth’s algorithm has a bug that keeps the indices 21 positions apart. Microsoft has acknowledged this issue, but refused to fix it, citing “backward compatibility” concerns. RND is a seeded deterministic generator, which means it provides the same pseudo-random sequence for a given seed. Fixing the bug would cause the algorithm to produce a different sequence for any given seed, and Microsoft apparently decided that not breaking applications that relied on specific pseudo-random sequences (contrary to MSDN warning that RND implementation is subject to change) was more important than getting the algorithm right.

This oversight, however, does not pose a practical problem for the vast majority of RND uses for the simple reason that RND has far more serious issues that are likely to cause you grief a lot sooner than exhausting the intended or actual sequence period. Mono (an alternative CLR implementation) got the algorithm right, by the way (which also implies that Mono-generated sequences of Random are not compatible with Microsoft ones).
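To make the subtractive scheme more concrete, here is a minimal sketch of a lagged subtractive generator with two indices kept 31 positions apart. It is not Microsoft's code and not Knuth's exact routine: the class name, the 55-slot buffer (the textbook lags of 55 and 24), the modulus, and especially the placeholder seeding are all assumptions made for illustration.

```csharp
using System;

// Illustrative sketch only; names, modulus, and seeding are assumptions.
public sealed class SubtractiveSketch
{
    private const int Lag = 55;               // live slots in the circular buffer
    private const int Offset = 31;            // distance between the two indices
    private const int Modulus = int.MaxValue;

    private readonly int[] _state = new int[Lag];
    private int _i;                           // slot holding X[n-55]
    private int _j = Offset;                  // slot holding X[n-24]

    public SubtractiveSketch(int seed)
    {
        // Placeholder seeding: a simple linear congruential step fills the
        // buffer deterministically from the seed.
        uint s = unchecked((uint)seed);
        for (int k = 0; k < Lag; k++)
        {
            s = unchecked(s * 1664525u + 1013904223u);
            _state[k] = (int)(s >> 1) % Modulus;   // value in [0, Modulus)
        }
    }

    public int Next()
    {
        // New value = difference of the two lagged values, reduced mod Modulus.
        int value = _state[_i] - _state[_j];
        if (value < 0) value += Modulus;

        _state[_i] = value;                   // overwrite the oldest slot
        _i = (_i + 1) % Lag;
        _j = (_j + 1) % Lag;
        return value;
    }
}
```

Changing Offset from 31 to 21 in this sketch mimics the kind of index-distance deviation described above: the generator still runs and still looks random at a glance, which is why such a bug is easy to miss.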

RND could have been a decent choice for many scenarios requiring randomness, since high-quality randomness is often not a must-have requirement. “Quality” of random number generators refers to their ability to pass a battery of statistical tests designed to detect non-randomness (for example, NIST and diehard tests).

Unfortunately, a lot of these common scenarios are multi-threaded (for example, IIS/ASP.NET threads, or developer-created parallelism to utilize all available CPUs). One way to make RND thread-safe is to derive from it, override all virtual methods, and wrap base calls in a “lock” block (Monitor). This adds thread safety but does not solve the API flaws, which can be illustrated by the following:

int A = new System.Random().Next();
int B = new System.Random().Next();

What do you know about A and B? We know that they are equal. The problem here is that RND must be instantiated with an integer seed, in the absence of which (i.e. the default constructor) RND uses Environment.TickCount, which has millisecond precision. Basic operations, however, take nanoseconds, and both RND objects get seeded with the same TickCount value. MSDN documents this behavior and recommends that you “apply an algorithm to differentiate the seed value in each invocation,” but offers no clues on how to actually do it.
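The underlying cause is easy to demonstrate without relying on timing: two instances constructed with the same explicit seed produce identical sequences. The sketch below checks this; the SeedDemo class and its method are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class SeedDemo
{
    // Returns true if two generators seeded with the same value
    // agree on their first 'count' outputs.
    public static bool SameSeedSameSequence(int seed, int count)
    {
        var a = new Random(seed);
        var b = new Random(seed);
        for (int i = 0; i < count; i++)
        {
            if (a.Next() != b.Next()) return false;
        }
        return true;
    }
}
```

Any two default-constructed instances that happen to pick up the same TickCount value are in exactly this situation.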

One approach is to have a static integer value which is initialized to Environment.TickCount inside a static constructor (which is thread-safe), and then is read for seeding (inside instance constructor) with var seed = System.Threading.Interlocked.Increment(ref _staticSeed). The seed will wrap around when it reaches the max integer value. The downside of this approach is that it requires keeping state, which is often not convenient.
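A minimal sketch of this stateful approach might look as follows. The SeedSource class and its member names are hypothetical; only Environment.TickCount and Interlocked.Increment come from the description above.

```csharp
using System;
using System.Threading;

// Hypothetical helper, for illustration only.
public static class SeedSource
{
    private static int _staticSeed;

    // Static constructors run once and are thread-safe.
    static SeedSource()
    {
        _staticSeed = Environment.TickCount;
    }

    // Each call returns a different seed; the counter wraps around
    // silently after reaching int.MaxValue.
    public static int NextSeed()
    {
        return Interlocked.Increment(ref _staticSeed);
    }

    public static Random CreateRandom()
    {
        return new Random(NextSeed());
    }
}
```

Two instances obtained from SeedSource.CreateRandom() in quick succession receive consecutive, and therefore different, seeds.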

Another approach is to use var seed = Guid.NewGuid().GetHashCode(), which is fast and thread-safe, and produces integers that are reasonably well-distributed in int space. The upside of this approach is that it is stateless. The downside is that different GUIDs can produce the same hashcode, which will cause identical seeds and identical random sequences.

MSDN makes another suggestion: “call the Thread.Sleep method to ensure that you provide each constructor with a different seed value.” This is a bad idea—do not do this. There are very few reasons to ever call Thread.Sleep, and this is not one of them.

Even if you manage to initialize System.Random with a consistently unique seed, you must still deal with the fact that this class is not thread safe. If you accidentally use its methods from multiple threads, the internal state will get silently corrupted and you will start seeing a not-so-random sequence of all zeroes or negative ones.

A third common approach to making System.Random safe is to never create more than one instance of it, and to invoke all System.Random methods on that single instance inside a lock. One downside of this approach is that a single random sequence becomes the provider of randomness for all components that draw upon it, which makes it easier to reach the sequence period. While this might not pose a problem in many scenarios, it is important to be aware of. Another downside is that the lock/Monitor synchronization not only takes about 20 nanoseconds on every call, but also makes the generator inherently non-parallel.
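The single-instance approach can be sketched roughly as below. LockedRandom is a hypothetical name, and only one method is wrapped to keep the example short; a fuller version would wrap every method it exposes in the same way.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class LockedRandom
{
    private static readonly Random _rnd = new Random();
    private static readonly object _sync = new object();

    // All access to the shared instance goes through the same lock,
    // so calls from different threads are serialized.
    public static int Next(int minValue, int maxValue)
    {
        lock (_sync)
        {
            return _rnd.Next(minValue, maxValue);
        }
    }
}
```

The lock keeps the internal state consistent, but every caller now contends for one generator, which is the non-parallel behavior noted above.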

A fourth common approach is to use the [ThreadStatic] attribute or the ThreadLocal<Random> class to create a separate instance of System.Random per thread. This is a very popular approach. It is also dangerous because it gives a false sense of addressing the core issues. It is simply a different take on how to deal with parallel access: you either share a single instance among many threads (which requires locking), or you create a separate instance per thread. The System.Random construction and seeding issues did not go away. ThreadLocal<Random> makes construction explicit, for example new ThreadLocal<Random>(() => new Random(/* what seed do we use here ?? */)), while the [ThreadStatic] attribute moves the construction logic to a method or property getter that you use to retrieve the instance. In either case, none of the seeding problems are resolved.
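For reference, the instance-per-thread pattern usually looks something like the sketch below. PerThreadRandom is a hypothetical name, and the GUID-based seed is just one of the stop-gap seeding choices discussed earlier, so the seeding caveats still apply in full.

```csharp
using System;
using System.Threading;

// Hypothetical helper, for illustration only.
public static class PerThreadRandom
{
    // The factory runs once per thread, on first access from that thread.
    private static readonly ThreadLocal<Random> _local =
        new ThreadLocal<Random>(() => new Random(Guid.NewGuid().GetHashCode()));

    // Returns the instance that belongs to the calling thread.
    public static Random Instance
    {
        get { return _local.Value; }
    }
}
```

Callers should read PerThreadRandom.Instance at the point of use rather than caching the returned object, because the object is only safe on the thread that retrieved it.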

The RND algorithm’s randomness properties are supposed to hold within a specific, reasonably long cycle period. The instance-per-thread approach completely disregards these mathematical properties because instead of consuming the RND sequence itself (as designed), this approach cycles the seed, and consumes only some short number of cycles after the initial seeding. Such misuse degrades the pseudo-randomness quality that RND was mathematically designed to provide, since the seed space has only 2³² values, while the period is supposed to be 2⁵⁶.

Another common pitfall is that the ThreadStatic/ThreadLocal instance is usually exposed as an RND-typed property, which makes it easy to use in all places that expect an RND-typed instance. However, it seems reasonable for a developer to write the following code: static Random safeRnd = SafeRandom.Rnd; and then use safeRnd in parallel code. The problem, of course, is that the property returns an instance that is safe to use only on the currently executing thread, so saving the returned value in a static variable would save a specific instance from a specific thread. When the same static value is accessed from another thread, multiple threads start trampling all over each other because RND is not thread-safe. The core flaw here is that this approach is not foolproof enough for developers who might not be familiar with internal implementation nuances.

Let’s pause and reflect for a moment. You just want to generate some random numbers using the documented APIs—how hard should it be? Why do you have to be aware of all these workarounds and complicated approaches which are all bad at the end of the day?

System.Random is a quagmire of bad choices. It is a pit of failure, regardless of how you try to engineer your way out of it. Our advice is to avoid it entirely. Let’s go over CSP next, which is not ideal either.

Tip: Avoid using System.Random if you can—there are better alternatives. Be aware of the pitfalls if you cannot (for example, if you require deterministic pseudo randomness).

RNGCryptoServiceProvider

CSP’s purpose is the generation of cryptographically-secure random bits. Unlike RND, CSP is not specifically designed for random number generation, and thus RND and CSP have very different APIs.

It’s tempting to assume that random bits can be trivially combined to generate random numbers. This is certainly true in the case of integral types: a random byte is 8 random bits, a random int is 4 random bytes, and a random long is 8 random bytes. Floating-point types, however, are not that simple to randomize. For example, double in .NET is an 8-byte, IEEE-754 floating-point structure, which essentially means that its precision varies with magnitude (precision goes down when magnitude goes up). Simply generating 8 random bytes and casting them into double will generate a non-uniform random distribution over a given min-max range. The proper way to achieve range uniformity with doubles is to randomize the significand separately for some pre-set fixed exponent. However, that would require control over bit-construction of double, which is not trivial in .NET.

RND.NextDouble() returns a double in [0,1) range. Its implementation “cheats” (in order to ensure uniformity of distribution) by dividing an internal 4-byte int state by int.MaxValue+1 using floating-point division (+1 ensures that the upper bound is exclusive). RND’s internal sample size is 4 bytes, and it should be apparent that 64 bits of randomness (which you might expect for a double) cannot be manufactured from 4 bytes of internal sample. This approximation, however, suffices for the vast majority of uses that RND was intended for. We could use the same trick to make doubles out of CSP integers. Let’s compare RND and CSP attributes that matter.

Table 1: RND and CSP attributes of interest

	Thread safety	Performance	Quality	Requires seeding
RND	No	Very fast	Poor	Yes (int)
CSP	Yes	Depends	High quality	No
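The division trick described for NextDouble() carries over to CSP output. The sketch below is one possible way to do it, not RND's or any library's actual code: CspDouble is a hypothetical name, and it uses a full 32-bit unsigned sample divided by 2³², whereas RND itself works from a non-negative int.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class CspDouble
{
    // Returns a double in [0, 1) built from 32 random bits.
    public static double NextDouble(RandomNumberGenerator rng)
    {
        var bytes = new byte[4];
        rng.GetBytes(bytes);
        uint sample = BitConverter.ToUInt32(bytes, 0);

        // uint.MaxValue + 1.0 is 2^32 as a double, so the quotient
        // can never reach 1.0 (the upper bound is exclusive).
        return sample / (uint.MaxValue + 1.0);
    }
}
```

As with RND, only 32 bits of randomness end up in the result, which is a deliberate approximation rather than a fully random 64-bit double.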

CSP certainly looks very appealing given its thread safety, lack of seeding, cryptographically strong quality, and ease of implementing RND APIs on top of CSP. Performance of CSP is probably its most misunderstood aspect. CSP often gets accused of poor performance, especially compared to RND.

The truth is that CSP can be very fast (much faster than RND) when used appropriately. CSP has only one useful method, GetBytes(), which populates a byte array (the other method, GetNonZeroBytes(), works the same way). This method has a substantial invocation cost; that is, it’s a lot more expensive to call it 100 times to populate a 4-byte array than it is to call it once to populate a 400-byte array. If you measure performance beyond the method invocation cost (which is much more expensive for CSP than RND methods), you will realize that the random-byte-generation performance itself is much better for CSP than RND. (We ran our tests against the x64 .NET 4.5 runtime on an Intel Core i7 running Windows Server 2008 R2.) For example, populating a 4-kilobyte array is twice as fast with CSP as it is with RND.

CSP and RND have a performance inflection point (in terms of the size of the passed-in byte array) at which CSP performance becomes equal to RND performance, and then surpasses it for all larger array sizes. That point is a 245-byte array on our machine. For small byte arrays in the 1–200 byte range, CSP performance is significantly worse than RND. For example, a 4-byte array (a random integer) generation could be about 40 times slower with CSP.

Fortunately, we can substantially improve the small-range performance of CSP through buffering. The CryptoRandom class of the Inferno library implements the familiar RND interface, but uses buffered CSP. CryptoRandom is only two to three times slower than RND in the worst case, which should be quite acceptable for most small-range scenarios. .NET 4.5 itself uses the buffered-CSP approach internally (for example, the internal System.Collections.HashHelpers.GetEntropy() method, which uses CSP to populate a 1024-byte array and then consume it in 8-byte chunks).
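The buffering idea can be sketched as follows. This is not the Inferno CryptoRandom implementation or the internal .NET code; BufferedCsp is a hypothetical class, and the 1024-byte buffer size is simply borrowed from the HashHelpers example mentioned above.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public sealed class BufferedCsp
{
    private const int BufferSize = 1024;

    private readonly RNGCryptoServiceProvider _csp = new RNGCryptoServiceProvider();
    private readonly byte[] _buffer = new byte[BufferSize];
    private readonly object _sync = new object();
    private int _position = BufferSize;   // forces a refill on first use

    // Returns a random 32-bit integer, paying the GetBytes() invocation
    // cost only once per BufferSize bytes consumed.
    public int NextInt32()
    {
        lock (_sync)
        {
            if (_position + sizeof(int) > _buffer.Length)
            {
                _csp.GetBytes(_buffer);
                _position = 0;
            }
            int value = BitConverter.ToInt32(_buffer, _position);
            _position += sizeof(int);
            return value;
        }
    }
}
```

The lock keeps the buffer position consistent across threads; a production design might prefer per-thread buffers to avoid contention, which is a separate trade-off.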

While the RND approach is a pit of failure, the buffered-CSP/CryptoRandom approach is a mountain of success because it offers an acceptable-to-excellent performance range, high-quality cryptographic randomness, and frictionless thread safety, which makes it virtually impossible to use CryptoRandom incorrectly. RND can still be useful in certain scenarios that require deterministic reproduction of low-quality random sequences.

CSP implementation itself is a wrapper around the CryptGenRandom Windows API. CryptGenRandom is famous enough to have its own Wikipedia entry, but it’s not up to date. Starting with Windows Server 2008/Vista SP1, CryptGenRandom uses a well-known, approved, secure algorithm by default (CTR_DRBG from NIST SP 800-90A).

CSP can be instantiated in several ways:

Table 2: CSP instantiation options

Instantiation code	Performance
var csp = new RNGCryptoServiceProvider();	1x
var csp = RNGCryptoServiceProvider.Create();	~20 times slower

Using the default CSP constructor for instantiation and not the factory method gives much better performance. It is also worth noting that CSP implements IDisposable, so if you plan on allocating lots of instances (although we cannot think of a good reason), you might want to clean up proactively (for example, with a using block) rather than leaving the cleanup to the garbage collector.
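A short sketch of constructor-based instantiation with proactive cleanup is shown below. The CspBytes class and its method name are hypothetical.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class CspBytes
{
    // Fills a new array with cryptographically strong random bytes.
    public static byte[] GetRandomBytes(int count)
    {
        var bytes = new byte[count];

        // Default constructor rather than the Create() factory method;
        // the using block disposes the instance deterministically.
        using (var csp = new RNGCryptoServiceProvider())
        {
            csp.GetBytes(bytes);
        }
        return bytes;
    }
}
```

Because of the invocation cost discussed earlier, it is usually better to request all the bytes you need in one call than to call this helper repeatedly for tiny arrays.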

Random integers in min-max range

Randomly-generated integers often need to be bound within a specific min-max range. RND has the int Next(int minValue, int maxValue) method, which you might find yourself re-implementing (for an RND-derived class like CryptoRandom, for example). Another common requirement is the long NextLong(long minValue, long maxValue) method, which extends the min-max range to 64-bit integers (18 full decimal digits).

The naïve implementation approach is to generate a full-range number, mod it by the range length, and add it to the minValue. This will produce a skewed random distribution because the full range will not be evenly divisible into range-length chunks. Be aware of this common pitfall (one correct implementation can be found in the CryptoRandom class).
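One common way to avoid the skew is rejection sampling: discard any sample that falls into the incomplete last chunk and draw again. The sketch below shows the idea for 32-bit ranges. It is not the CryptoRandom code; UnbiasedRange is a hypothetical name, and the method follows the Next(minValue, maxValue) convention of an exclusive upper bound.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class UnbiasedRange
{
    // Returns a uniformly distributed int in [minValue, maxValue).
    public static int Next(RandomNumberGenerator rng, int minValue, int maxValue)
    {
        if (minValue >= maxValue) throw new ArgumentOutOfRangeException("minValue");

        uint range = (uint)((long)maxValue - minValue);   // 1 .. 2^32 - 1
        ulong total = 1UL << 32;                          // count of 32-bit samples
        ulong limit = total - (total % range);            // largest multiple of range

        var bytes = new byte[4];
        uint sample;
        do
        {
            rng.GetBytes(bytes);
            sample = BitConverter.ToUInt32(bytes, 0);
        } while (sample >= limit);                        // reject the uneven tail

        return (int)(minValue + (long)(sample % range));
    }
}
```

In the worst case fewer than half of the samples are rejected, so the loop terminates quickly in practice; calling GetBytes() per attempt is slow, though, and a buffered source would be a natural refinement.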

Generating a random long with CSP or RND is easy: you generate eight random bytes and convert them to long (for example, using the BitConverter.ToInt64() method). If for some reason you want to generate a long with the RND.Next() method only, you need to do more work. Next() returns non-negative integers only, which means that it can provide only 31 bits of entropy per call. You require 64 bits for a long, however, so you need to call Next() three times to obtain intermediate int1, int2, and int3 values. You need 64/3, so about 22 random bits from each intermediate int (3 × 22 = 66 bits, of which the two topmost are shifted out of the 64-bit result). There have been reports that the RND algorithm has a bias in the least significant bit, so you should prefer its high-order bits over its low-order bits. The C# implementation could be:

Code Listing 1

public static long NextLong(Random rnd)
{
    long r = rnd.Next() >> 9; // 31 bits - 22 bits = 9
    r <<= 22;
    r |= rnd.Next() >> 9;
    r <<= 22;
    r |= rnd.Next() >> 9;
    return r;
}
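For comparison, the eight-random-bytes route mentioned above is much shorter. The following is a minimal sketch using CSP as the byte source; CspLong is a hypothetical class name.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class CspLong
{
    // Returns a random long covering the full 64-bit range,
    // including negative values.
    public static long NextLong(RandomNumberGenerator rng)
    {
        var bytes = new byte[8];
        rng.GetBytes(bytes);
        return BitConverter.ToInt64(bytes, 0);
    }
}
```

Unlike the Next()-based listing, this version needs no bit shuffling because all 64 bits come directly from the generator.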
