CHAPTER 3
To begin looking at our architecture, we need to take a deep dive into threading. Along the way, we'll discover some surprising things.
There are two basic options for how to handle incoming requests:

1. Let the .NET framework allocate the threads for us, using Task and the async/await mechanism.
2. Allocate and manage the listener threads ourselves.

We'll instrument both approaches and compare the results.
The source code presented in this section is in the folder Examples\Chapter 3\Demo-AsyncAwait in the Bitbucket repository.
Let's instrument the StartConnectionListener function from the previous code so that we can get a sense of processing times and threads. First, we'll add a couple of basic instrumentation functions to the Program class:
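The listing itself isn't reproduced here, but given the trace output we'll see shortly (elapsed milliseconds, a colon, then a message), a minimal sketch of these two functions might look like the following (the DateTime baseline and field name are assumptions):

// Sketch of the instrumentation functions (assumed implementation).
// Output format matches the traces shown later: "ms : message".
static DateTime timeStart;

public static void TimeStampStart()
{
    timeStart = DateTime.Now;
}

public static void TimeStamp(string msg)
{
    long ms = (long)(DateTime.Now - timeStart).TotalMilliseconds;
    Console.WriteLine(ms + " : " + msg);
}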
Code Listing 6
Next, we add the instrumentation to StartConnectionListener, replacing the previous method with one that reports when, and on what thread, the listener starts. I've also replaced the response handling with a common "handler" object (described next).
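That listing isn't reproduced here either; based on the Chapter 2 version shown below and the description above, the instrumented method might look something like this sketch (the exact structure is an assumption):

// Sketch: the instrumented listener, assumed to follow the Chapter 2
// async/await structure, handing the request to the handler object.
static async void StartConnectionListener(HttpListener listener)
{
    TimeStamp("StartConnectionListener Thread ID: " + Thread.CurrentThread.ManagedThreadId);

    // Wait for a connection; the continuation runs on a thread
    // allocated by the .NET framework.
    HttpListenerContext context = await listener.GetContextAsync();

    // Release the semaphore so another listener can be started.
    sem.Release();

    // Hand the request off to the common handler object.
    handler.Process(context);
}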
Code Listing 7
Recall that these listeners are all initialized on a separate thread, but as noted previously, we let the .NET framework allocate a thread on the continuation. Here again is the code from Chapter 2 that initializes the listeners:
Task.Run(() =>
{
    while (true)
    {
        sem.WaitOne();
        StartConnectionListener(listener);
    }
});
Code Listing 8
For this test, I've created a ListenerThreadHandler class:
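The class isn't reproduced here; given how it's used, a minimal sketch might be:

// Sketch of the handler class (assumed implementation). Process logs
// the thread servicing the request, then issues the common response.
public class ListenerThreadHandler
{
    public void Process(HttpListenerContext context)
    {
        Program.TimeStamp("Process Thread ID: " + Thread.CurrentThread.ManagedThreadId);
        CommonResponse(context);
    }

    // CommonResponse is shown in the next listing.
}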
Code Listing 9
CommonResponse (a method of ListenerThreadHandler) artificially injects a one-second delay to simulate some complex process before issuing the response:
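Again as a sketch (the response content itself is an assumption):

// Continuing the ListenerThreadHandler class: simulate a one-second
// workload, then return a trivial response.
public void CommonResponse(HttpListenerContext context)
{
    // Simulate some complex process before responding.
    Thread.Sleep(1000);

    byte[] data = Encoding.UTF8.GetBytes("<html>Hello!</html>");
    context.Response.ContentLength64 = data.Length;
    context.Response.OutputStream.Write(data, 0, data.Length);
    context.Response.OutputStream.Close();
}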
Code Listing 10
The handler object is instantiated in Main:
static void Main(string[] args)
{
    // Supports 20 simultaneous connections.
    sem = new Semaphore(20, 20);
    handler = new ListenerThreadHandler();
Code Listing 11
After initializing the listeners, we’ll add a test to Main to see how the server responds to 10 effectively simultaneous, asynchronous requests:
TimeStampStart();

for (int i = 0; i < 10; i++)
{
    Console.WriteLine("Request #" + i);
    MakeRequest(i);
}
Code Listing 12
and:
/// <summary>
/// Issue GET request to localhost/index.html
/// </summary>
static async void MakeRequest(int i)
{
    TimeStamp("MakeRequest " + i + " start, Thread ID: " + Thread.CurrentThread.ManagedThreadId);
    string ret = await RequestIssuer.HttpGet("http://localhost/index.html");
    TimeStamp("MakeRequest " + i + " end, Thread ID: " + Thread.CurrentThread.ManagedThreadId);
}
Code Listing 13
RequestIssuer exposes an "awaitable" request-and-response function, meaning that it issues a web request and returns to the caller while awaiting the response. The response is handled in the await continuation:
public class RequestIssuer
{
    public static async Task<string> HttpGet(string url)
    {
        string ret;

        try
        {
            HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
            request.Method = "GET";

            using (WebResponse response = await request.GetResponseAsync())
            {
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    ret = await reader.ReadToEndAsync();
                }
            }
        }
        catch (Exception ex)
        {
            ret = ex.Message;
        }

        return ret;
    }
}
Code Listing 14
In the previous code, as soon as the awaited operation begins, the await returns control to the caller and the next MakeRequest is issued. When the asynchronous operation completes, MakeRequest resumes in a continuation.
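As a standalone illustration of this behavior (this demo is not part of the server code), the following console program shows that the code after an await can resume on a different, framework-allocated thread:

// Standalone demo: the continuation after an await typically resumes
// on a thread-pool thread, not the thread that started the call.
using System;
using System.Threading;
using System.Threading.Tasks;

class AwaitDemo
{
    static async Task Work()
    {
        Console.WriteLine("Before await, thread " + Thread.CurrentThread.ManagedThreadId);
        await Task.Delay(1000);   // control returns to Main here
        Console.WriteLine("After await, thread " + Thread.CurrentThread.ManagedThreadId);
    }

    static void Main()
    {
        Task t = Work();
        Console.WriteLine("Back in Main, thread " + Thread.CurrentThread.ManagedThreadId);
        t.Wait();
    }
}

On a typical run, "Before await" and "Back in Main" report the same thread, while "After await" reports a different one.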
What we want to know is:

- On what threads are the requests actually processed?
- Are the requests processed concurrently, or serialized in some way?
In the trace log, we first see that all of the MakeRequest calls occur on the same thread, which is expected, since they're all being issued by the same Task:
Request #0
Code Listing 15
Next, we see the process messages coming in, as well as the MakeRequest "end" calls (I'm omitting the StartConnectionListener and MakeRequest "start" messages for clarity):
78 : Process Thread ID: 11
Code Listing 16
What's revealing here is that:

- The requests are not all processed at the same time; they are handled in batches, each batch roughly one second after the previous one.
- The number of requests processed concurrently appears to match the number of CPU cores on the machine running the test.
Conversely, observe what happens on an 8-core system:
38 : Process Thread ID: 15
38 : Process Thread ID: 13
38 : Process Thread ID: 5
38 : Process Thread ID: 16
39 : Process Thread ID: 17
39 : Process Thread ID: 14
40 : Process Thread ID: 19
41 : Process Thread ID: 18
782 : Process Thread ID: 20
1039 : Process Thread ID: 15
Code Listing 17
Now we see eight requests being processed simultaneously, and the last two occurring later. What's going on?
From the previous trace, we can surmise that the thread being allocated for the continuation is allocated based on the number of CPU cores. This is really not the behavior we want. Many requests will involve file I/O, interacting with the database, contacting social media, and so forth, all of which are processes where the thread will be blocked waiting for a response. We certainly don’t want to delay the processing of other incoming requests simply because the mechanism for allocating the continuation thread thinks it should be based on available cores. Unfortunately, this mechanism seems to be in the bowels of how continuations are handled. It is not controllable through TaskCreationOptions because we’re dealing with how the continuation of the awaited call is being handled. All we can declare here is that this is not the implementation we want.
The source code presented in this section is in the Examples\Chapter 3\Demo-Threading folder in the Bitbucket repository.
What happens when we allocate the threads ourselves? Let's give that a try. First, we change the way the connection listener threads are initialized, replacing Task.Run and semaphores with the creation of 20 listener threads:
for (int i = 0; i < 20; i++)
{
    Thread thread = new Thread(new ParameterizedThreadStart(WaitForConnection));
    thread.IsBackground = true;
    thread.Start(listener);
}
Code Listing 18
Then, instead of using async/await and semaphores, each thread blocks until a connection is received:
/// <summary>
/// Block until a connection is received.
/// </summary>
static void WaitForConnection(object objListener)
{
    HttpListener listener = (HttpListener)objListener;

    while (true)
    {
        TimeStamp("StartConnectionListener Thread ID: " + Thread.CurrentThread.ManagedThreadId);
        HttpListenerContext context = listener.GetContext();
        handler.Process(context);
    }
}
Code Listing 19
Now, when our requests are issued, we see immediately that they are processed by 10 unique threads:
75 : Process Thread ID: 3
Code Listing 20
And we also see that the responses are all in the same "one second later" block of time:
1083 : MakeRequest 4 end, Thread ID: 31
Code Listing 21
This unequivocally shows us that using async/await is not the right implementation choice!
The source code presented in this section is in the Examples\Chapter 3\Demo-ThreadPool folder in the Bitbucket repository.

But is the problem with async/await, or with the system ThreadPool? Using a ThreadPool is not ideal because we're implementing long-running threads, but we'll try it regardless:
for (int i = 0; i < 20; i++)
{
    ThreadPool.QueueUserWorkItem(WaitForConnection, listener);
}
Code Listing 22
Look at what happens to the initialization process:
781 : StartConnectionListener Thread ID: 7
Code Listing 23
We certainly experience what the MSDN documentation says regarding ThreadPool: “As part of its thread-management strategy, the thread pool delays before creating threads. Therefore, when a number of tasks are queued in a short period of time, there can be a significant delay before all the tasks are started.”
Fortunately though, once the threads have been initialized, we see that the processing happens simultaneously:
12121 : Process Thread ID: 4
Code Listing 24
So, while they work, thread pools are also not the correct solution here. As the MSDN documentation indicates, a thread pool is not the right fit because: 1) we're creating a number of threads in a very short time, and 2) these threads will run perpetually for the life of the server. Furthermore, the threads will potentially block for long periods of time waiting for connection requests; they are not short-lived threads.
It is now very clear that we should not use async/await to implement asynchronous connection requests. Async/await limits you to processing requests based on the number of cores, preventing you (and the CPU) from distributing request processing across more threads than you have cores. This will definitely be an issue, as it is common to query a database or third-party social media API in your request handler, and your thread will for the most part be waiting for a response, which should not stop other requests from being handled.
The source code presented in this section is in the folder Examples\Chapter 3\Demo-SingleThreadListener in the Bitbucket repository.
Besides having determined that we need to use threads rather than the Task async/await mechanism, we should also consider whether we want multiple threads listening for requests, or a single thread. With a single-thread listener, one and only one thread listens for incoming requests. As soon as a request is received, it is placed into a queue and the thread immediately returns to waiting for the next request. A separate thread de-queues each request and en-queues it onto a worker thread. We can implement different algorithms for determining which worker thread receives the request, but in the implementation that follows, we use a simple round-robin algorithm.
We’ll begin with a helper class that allows us to create a queue for each thread and a semaphore for signaling the thread:
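The listing isn't reproduced here; a minimal sketch of such a helper might look like this (the class name and members are assumptions):

// Sketch: pairs a queue with a semaphore so a worker thread can block
// until a request has been queued for it.
// Requires System.Threading, System.Collections.Concurrent, System.Net.
public class ThreadSemaphore
{
    protected Semaphore sem;
    protected ConcurrentQueue<HttpListenerContext> requests;

    public ThreadSemaphore()
    {
        sem = new Semaphore(0, Int32.MaxValue);
        requests = new ConcurrentQueue<HttpListenerContext>();
    }

    // Queue a request and signal the worker thread.
    public void Enqueue(HttpListenerContext context)
    {
        requests.Enqueue(context);
        sem.Release();
    }

    // Block until a request is available, then de-queue it.
    public HttpListenerContext WaitForRequest()
    {
        sem.WaitOne();
        HttpListenerContext context;
        requests.TryDequeue(out context);
        return context;
    }
}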
Code Listing 25
Note the use of .NET's concurrent collection class, ConcurrentQueue, in Code Listing 25. These are high-performance collections that handle concurrent reads and writes, sparing us the complexity of writing our own thread-safe collections.
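For example, a ConcurrentQueue can be written and read from multiple threads without any explicit locking:

// ConcurrentQueue usage: Enqueue and TryDequeue are thread-safe.
var queue = new ConcurrentQueue<int>();
queue.Enqueue(42);                 // safe from any thread

int item;
if (queue.TryDequeue(out item))    // returns false if the queue is empty
{
    Console.WriteLine(item);       // 42
}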
Instead of processing the request immediately, our handler queues the request and returns. A separate thread de-queues the request and assigns it, round-robin, to a worker thread.
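The listing isn't reproduced here; a sketch of the approach, using the ThreadSemaphore helper above (the names are assumptions), might be:

// Sketch: the single listener thread queues requests; a monitor thread
// dispatches them, round-robin, to the worker threads.
static ConcurrentQueue<HttpListenerContext> requests = new ConcurrentQueue<HttpListenerContext>();
static Semaphore semQueue = new Semaphore(0, Int32.MaxValue);
static List<ThreadSemaphore> threadPool = new List<ThreadSemaphore>();

// Called by the listener thread; returns immediately so the listener
// can wait for the next request.
static void QueueRequest(HttpListenerContext context)
{
    requests.Enqueue(context);
    semQueue.Release();
}

// Runs on its own thread: de-queues requests and assigns them,
// round-robin, to the worker threads.
static void MonitorQueue()
{
    int next = 0;

    while (true)
    {
        semQueue.WaitOne();
        HttpListenerContext context;

        if (requests.TryDequeue(out context))
        {
            threadPool[next].Enqueue(context);
            next = (next + 1) % threadPool.Count;
        }
    }
}

// Each worker thread blocks until a request is queued for it.
static void Worker(object objTs)
{
    ThreadSemaphore ts = (ThreadSemaphore)objTs;

    while (true)
    {
        HttpListenerContext context = ts.WaitForRequest();
        TimeStamp("Processing on thread " + Thread.CurrentThread.ManagedThreadId);
        handler.Process(context);
    }
}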
Code Listing 26
The result is what we should expect—our 10 requests begin processing simultaneously and complete processing simultaneously.
76 : Processing on thread 4
Code Listing 27
The advantage of the single-threaded connection queuing approach is that it can consume thousands of requests very quickly, and those requests can then be queued onto a finite number of worker threads. The multi-listener approach, by contrast, stops accepting requests once all the worker threads are busy. In either implementation, the client ends up waiting for its request to be serviced, but the major advantage of the queuing approach is that you are not creating potentially thousands of threads to handle high-volume periods. In fact, the single-thread listener could even be extended to allocate more worker threads dynamically as volume increases, or to spin up additional servers. This makes it a much more flexible solution.