Friday, September 28, 2007

Asyncify Your Code

Asyncify your code. Everybody's doing it. (Chicks|Dudes)'ll dig it. It'll make you cool.

Pretty much everything I build these days is asynchronous in nature. In SoapBox products we are often waiting on some sort of IO to complete. We wait for XMPP data to be sent and received, database queries to complete, log files to be written, DNS servers to respond, .NET to negotiate TLS through an SslStream, and much more. Today I'll be talking about a recent walk down Asynchronous Lane: the AsynchronousProcessGate (if you don't like reading just download the package for source code goodness).

I ran into a problem while working on a new web application for Coversant. I needed to execute an extremely CPU- and IO-intensive process: creating and digitally signing a self-extracting compressed file -- AKA The Package Service. This had to happen in an external process, and it had to scale (this application is publicly available on our consumer-facing web site). Here's a basic sequence of the design I came up with:


Do you notice the large holes in the activation times? That's because we're asynchronous! The BeginCreatePackage web service method the page calls exits as soon as the BeginExecute method exits, which is right when the process starts. That means we're not tying up any threads in our .NET thread pools at any layer of our application while a task is executing. That's a Good Thing™.

At this point I'm used to writing highly asynchronous/threaded code. However, I still wouldn't call it easy. Why do it? I'd say there are three main reasons.

  1. To provide a smooth user experience. The last thing a developer wants is for his/her software to appear sluggish. There's nothing worse than opening Windows Explorer and watching your screen turn white (that application is NOT very asynchronous).
  2. To fully and most appropriately utilize the resources of the platform (Runtime/OS/Hardware). To scale vertically, you might call it.
  3. Because it makes you cool. AKA: To bill a lot more on consulting engagements.

Microsoft recommends two asynchronous design patterns for .NET developers exposing asynchronous interfaces. These can be found on various classes throughout the framework. The event-based pattern comes highly recommended from Microsoft and can be found all over new components they build (like the BackgroundWorker). Personally I think the event-based pattern is overrated. The hassle of managing events and not knowing if the completed event will even fire typically steers me away from this one. However, it is certainly easier for those who are new to the asynchronous world. This pattern is also quite useful in many situations in Windows Forms and ASP.NET applications, leaving the responsibility of the thread switching to the asynchronous implementation (the events are supposed to be called in the thread/context that made the async request -- determined by the AsyncOperationManager). If you've ever used the ISynchronizeInvoke interface on a WinForms Control or manually done async ASP.NET pages you can really appreciate the ease of use of this new pattern...

The second recommended pattern, and usually my preference, is called the IAsyncResult pattern. IAsyncResult and I have a very serious love/hate relationship. I've spent many days with my IM status reading "Busy - Asyncifying" due to this one. But, in the end, it produces a simple interface for performing asynchronous operations and a callback when the operation is complete (or maybe timed out or canceled). Typically you'll find IAsyncResult interfaces on the more "hard core" areas of the framework exposing operations such as Sockets, File IO, and streams in general. This is the pattern I used for the Asynchronous Process Gate in the Package Service.
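If you haven't seen the Begin/End shape before, here is a minimal sketch using Stream.BeginRead and Stream.EndRead, which ship with the framework. The class and method names (BeginEndReadSketch, ReadWithCallback) are made up for illustration, and the blocking wait at the end exists only so the example can return a value; real asynchronous code would let the callback carry on with the work instead.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Hypothetical helper for illustration; not part of the Package Service.
static class BeginEndReadSketch
{
    // Reads up to 64 bytes with the Begin/End pair and returns them as text.
    public static string ReadWithCallback(Stream source)
    {
        byte[] buffer = new byte[64];
        string result = null;
        Exception error = null;

        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            // Begin returns immediately; the callback fires on completion.
            // The last argument is the caller's state, handed back as
            // ar.AsyncState inside the callback.
            source.BeginRead(buffer, 0, buffer.Length, ar =>
            {
                try
                {
                    Stream s = (Stream)ar.AsyncState;
                    // End returns the result, or rethrows whatever
                    // went wrong during the operation.
                    int read = s.EndRead(ar);
                    result = Encoding.ASCII.GetString(buffer, 0, read);
                }
                catch (Exception ex)
                {
                    error = ex;
                }
                finally
                {
                    done.Set();
                }
            }, source);

            // Demo only: block until the callback has run.
            done.WaitOne();
        }

        if (error != null) throw error;
        return result;
    }
}
```

Every IAsyncResult API follows the same three beats: Begin with a callback and a state object, get called back, call End to collect the result or the exception.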

The Package Service has a user interface (an AJAXified asynchronous ASP.NET 2.0 page) which calls an asynchronous web service. The web service calls another asynchronous class which wraps a few asynchronous operations through the AsynchronousProcessGate and other async methods (i.e. to register a new user account) and exposes a single IAsyncResult interface to the web service.

Confused yet? Read that last paragraph again and take another look at the sequence. In order to make this whole thing scale it had to be asynchronous or we'd be buying a whole rack of servers to support even a modest load. Also, because of the nature of the asynchronous operation (high CPU/disk IO), it had to be configurably queued/throttled. I went through a few possible designs on paper. But in the end I chose to push it down as far as possible. The AsynchronousProcessGate, quite simply, only allows a set number of processes to execute simultaneously, the number of CPUs reported by System.Environment.ProcessorCount by default. It does this by exposing the IAsyncResult pattern for familiar consumption. The piece of magic used internally is something we came up with after writing a lot of asynchronous code: LazyAsyncResult<T>.
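To make the throttling idea concrete, here is a rough sketch of a gate that lets a fixed number of operations run and queues the rest. This is not the AsynchronousProcessGate source (grab the download for that); SimpleGate, Enqueue, RunningCount, and WaitingCount are names I'm assuming purely for illustration, and the sketch deals in plain delegates rather than processes or IAsyncResult.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not the AsynchronousProcessGate source.
sealed class SimpleGate
{
    private readonly object _sync = new object();
    private readonly Queue<Action<Action>> _waiting = new Queue<Action<Action>>();
    private readonly int _allowed;
    private int _running;

    public SimpleGate(int allowed) { _allowed = allowed; }

    // Default to one slot per CPU, as the post describes.
    public SimpleGate() : this(Environment.ProcessorCount) { }

    public int RunningCount { get { lock (_sync) return _running; } }
    public int WaitingCount { get { lock (_sync) return _waiting.Count; } }

    // 'start' kicks off an asynchronous operation and must invoke the
    // 'completed' action it receives exactly once when the operation ends.
    public void Enqueue(Action<Action> start)
    {
        lock (_sync)
        {
            if (_running >= _allowed)
            {
                _waiting.Enqueue(start); // no free slot: wait in line
                return;
            }
            _running++;
        }
        start(Release); // invoke outside the lock
    }

    private void Release()
    {
        Action<Action> next;
        lock (_sync)
        {
            if (_waiting.Count == 0)
            {
                _running--; // nobody waiting: free the slot
                return;
            }
            // Hand the slot straight to the next waiter, so the
            // running count stays the same.
            next = _waiting.Dequeue();
        }
        next(Release);
    }
}
```

No thread sits blocked while work is queued: a waiting operation is just a delegate in a queue, and it starts on whatever thread reports the previous completion. That is the property that keeps the thread pools free.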

LazyAsyncResult<T> provides a generic implementation of IAsyncResult. It manages your state, your caller's state, and the completion events. It also uses Joe Duffy's LazyInit stuff for better performance (initializing the WaitHandle is relatively expensive and usually not needed).
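For a feel of what such a class has to do, here is a stripped-down sketch. It is not the real LazyAsyncResult<T> (that's in the download) and it doesn't use Joe Duffy's LazyInit; the name SketchAsyncResult<T> and its Complete and End methods are assumptions for illustration, and a plain lock stands in for the lazy initialization.

```csharp
using System;
using System.Threading;

// Hypothetical sketch; the real LazyAsyncResult<T> is not shown here.
sealed class SketchAsyncResult<T> : IAsyncResult
{
    private readonly object _sync = new object();
    private readonly AsyncCallback _callback;
    private readonly object _state;
    private ManualResetEvent _handle; // created only if someone asks for it
    private bool _completed;
    private T _result;
    private Exception _error;

    public SketchAsyncResult(AsyncCallback callback, object state)
    {
        _callback = callback;
        _state = state;
    }

    public object AsyncState { get { return _state; } }
    public bool CompletedSynchronously { get { return false; } }
    public bool IsCompleted { get { lock (_sync) return _completed; } }

    public WaitHandle AsyncWaitHandle
    {
        get
        {
            lock (_sync)
            {
                // Created on first use, already signaled if the
                // operation finished before anyone asked.
                if (_handle == null) _handle = new ManualResetEvent(_completed);
                return _handle;
            }
        }
    }

    // Called by whoever owns the operation when it finishes or fails.
    public void Complete(T result, Exception error)
    {
        lock (_sync)
        {
            if (_completed) throw new InvalidOperationException("Already completed.");
            _result = result;
            _error = error;
            _completed = true;
            if (_handle != null) _handle.Set();
        }
        // Invoke the caller's callback outside the lock.
        if (_callback != null) _callback(this);
    }

    // Called from the End method: waits if needed, then returns the
    // result or rethrows the stored exception.
    public T End()
    {
        if (!IsCompleted) AsyncWaitHandle.WaitOne();
        if (_error != null) throw _error;
        return _result;
    }
}
```

Most callers rely on the callback and never touch AsyncWaitHandle, so in the common case no kernel event is ever allocated. A production version would also dispose the handle and guard against End being called twice.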

Using the asynchronous process gate is straightforward if you're used to the Begin/End IAsyncResult pattern. You create an instance of the class, and call BeginExecuteProcess with your ProcessStartInfo. When the process is complete you will get your AsyncCallback, or you can also wait on the IAsyncResult.AsyncWaitHandle that is returned from BeginExecuteProcess. You then call EndExecuteProcess and the instance of Process that was used is returned. If an exception occurred asynchronously, it will be thrown when you call EndExecuteProcess.

The Begin Code:
static void StartProcesses()
{
    AsynchronousProcessGate g = new AsynchronousProcessGate();
    while (!_shutdown)
    {
        //keep twice as many queued as we have cpu's.
        //for a real, CPU or IO intensive, operation
        //you shouldn't do any throttling before the gate.
        //that's what the gate is for!
        if (g.PendingCount < g.AllowedInstances * 2)
            g.BeginExecuteProcess(
                new ProcessStartInfo("notepad.exe"),
                10000,
                ProcessCompleted,
                g);
        else
            System.Threading.Thread.Sleep(100);
    }
}
The End Code:
static void ProcessCompleted(IAsyncResult ar)
{
    try
    {
        //the gate was passed as the state object in BeginExecuteProcess.
        AsynchronousProcessGate g =
            (AsynchronousProcessGate)ar.AsyncState;

        //EndExecuteProcess rethrows any asynchronous exception.
        using (Process p = g.EndExecuteProcess(ar))
            Console.WriteLine("Exited with code: " +
                p.ExitCode + ". " +
                g.PendingCount + " notepads pending.");
    }
    catch (Exception ex)
    {
        Console.WriteLine("("
            + ex.GetType().ToString()
            + ") - " + ex.Message);
    }
}

Phew! After all that, the end result for SoapBox: a single self-extracting, digitally signed file someone can download. Oh, and a simple library you can use as an Asynchronous Process Gate! Enjoy. Look, another download link so you don't even have to scroll back up. How nice am I?

About the Author

Wow, you made it to the bottom! That means we're destined to be life long friends. Follow Me on Twitter.

I am an entrepreneur and hacker. I'm a Cofounder at RealCrowd. Most recently I was CTO at Hive7, a social gaming startup that sold to Playdom and then Disney. These are my stories.

You can find far too much information about me on LinkedIn: http://linkedin.com/in/jdconley. No, I'm not interested in an amazing Paradox DBA role in the Antarctic with an excellent culture!