Showing posts with label soapbox. Show all posts

Monday, February 4, 2008

A New Kind of Application Server

As you probably know, I'm a cofounder of Coversant which, at its heart, is an XMPP development platform. Most of our larger customers (thousands of simultaneous users) are ISVs that have built on the SoapBox Platform®. We allow you to easily develop XMPP applications using .NET technology.

A really long time ago, I wrote about some possibilities for using the SoapBox Platform, including examples of what our customers were doing at the time. This was before microblogging was popular, or I probably would have used that example too. :)

Over the last couple of weeks there seems to have been quite a bit of buzz around the subject of using XMPP as an application server, and that gets me really excited! A friend/competitor, Matt Tucker of Jive Software, wrote in his company blog about how XMPP is the future for cloud services. A "real" online author (aka not a member of an XMPP company) even picked up Matt's article and ran with it. Yesterday, a little buzz hit Slashdot when another friend/competitor, Mickael Raymond of Process One, wrote about introducing the XMPP application server, which is an exploration of building a Twitter-like microblogging system on top of their XMPP server. (When I wrote this, it seems Process One was experiencing a bit of the Slashdot effect; hopefully by the time you read this it will be gone and you can read his article.) Great stuff, indeed!

This is wonderful news and very validating for me personally! It seems after six years of committing to the infant technology, I wasn't crazy after all, and XMPP is a good platform for presence/messaging systems! And if you're in the market for .NET based XMPP solutions, head on over to the SoapBox Developer site. :)

Thursday, December 14, 2006

Installing Coversant Products on Vista

Due to the enhanced security in Windows Vista, not all Coversant products can be installed out of the box. Luckily, this is really easy to work around and, rest assured, future versions of our installation packages will not suffer from these issues.

The symptoms show up as an error message dialog with code 2869:

And then a series of empty dialog boxes:


This is apparently some sort of permissions issue. To solve it, the MSI needs to be run as an administrator. The easiest way to do this is to create a bat file to run the msi manually. It would be something like this:

msiexec.exe /i "c:\SoapBoxServer2007\files\SoapBox.Server.Enterprise.x64.3.0.213.69.msi"

Then you right click on the bat file and choose "Run As Administrator". Presto, a working installation in Vista.

Saturday, September 16, 2006

SoapBox Platform Possibilities

Our customers do some very interesting things with our platform, including:
  • RPC – The asynchronous nature of XMPP along with its addressing and our platform make it ideal for application to application messaging. One of our customers wrote a .NET Remoting transport on SoapBox, and many use it for other custom RPC needs (some are below).
  • Network gaming infrastructure – sending game data, hosting chat rooms, private chat during games, etc
  • In class test taking – Tests are distributed to Tablet PCs when students log in and results are tallied.
  • Geographical Chat – Whiteboarding combined with maps and group conferencing.
  • Financial Market Data – Real time data from the market flows into applications used by traders.
  • Social Networking – Consumer social networking site using our platform for chat, content, and advertising delivery to the desktop.
  • Emergency Alerting – “The nuclear power plant is melting down. Evacuate!”
  • Remote Surveillance Control – Watch your surveillance cameras at home in any web browser and control them.
  • Automated Manufacturing Alerting – “Line 5 is clogged. Attention required!”.
  • Plain ole’ chat built into an existing application – Give users access to the people they need in the application they are familiar with.
  • Build System Alerts – Our build system sends us messages as our daily builds run, letting us know the status.
  • Web Based Live Support – Communicate with customers live, through a web site.
  • Voice and Chat on Ruggedized Handhelds - Push to talk through a contact list with presence, send messages, pictures, or go into walkie talkie mode.
  • And the list goes on…

Some have been in the news, some are still in "stealth mode", but unfortunately I can't mention any names of companies. We will have some case studies coming out for a few of them.

Some of these applications are built on our 2005 platform, but many are built on our upcoming 2007 release. We've been working hard with our partners to make sure this upcoming release is something special. Here's a little overview of the new SoapBox platform.

SoapBox Framework (now SoapBox Studio)

The SoapBox Framework was our first product offering. We built and productized a framework knowing we’d want to build a server and an advanced communicator client. It started out as one framework, and has grown into quite a bit more. With the 2007 product release we will be distributing our frameworks in one package called the SoapBox Studio. They are all based on the same code base, which allows us to quickly add features to our entire product line. These frameworks include:

  • Desktop Edition – Build desktop and web applications in .Net for Windows, Linux, and Solaris.
  • Mobile Edition – Build mobile applications that run on PocketPC, Smart Phone, Windows Mobile, or Windows CE operating systems.
  • Web Service Edition – This is my favorite. :) A standard SOAP web service, built using our Desktop Edition on the back end, that allows you to integrate with any language that supports web services. This includes Java, ColdFusion, Flash, C++, Perl, PHP, and more (basically every language out there).
  • Server Administration Edition – All of the features that are available in our Management Console are available to you through public APIs. You can quickly perform tasks such as adding users, managing contact lists, and retrieving message archives.
  • Server Edition – This allows you to build plug-ins to the SoapBox Server to manage users from your own custom user store, create custom components to service client requests, filter messages, do custom logging, and manipulate the way the server works in general.

SoapBox Server

The SoapBox Server is our flagship XMPP server product. It is based on our Desktop Edition Framework with additional layers to do everything a server needs to do. You can easily customize the Server to meet your specific needs through the SoapBox Framework Server and Administration SDKs.

SoapBox Communicator

The SoapBox Communicator is our client software built on our SoapBox Framework Desktop Edition (as well as another layer that we will be productizing soon) and serves as an example of how to best utilize the framework for client side development.

Other Key Platform Features

  • Open, well normalized database – We use a strict Enterprise nTier model in the SoapBox Server. Our database is well normalized with stored procedures for all interactions. This makes it extremely easy to integrate users, contact lists, message archiving, and presence with any environment that can read or write to a database.
  • Shared code base - Whether you’re writing an application for a Smartphone, or a server plug-in to handle workflow in a CRM application, your code interacting with SoapBox will look strikingly similar. This better utilizes your developer resources by focusing them on their business problem, not re-learning another API.
  • Unprecedented vertical scalability - SoapBox Server will fully utilize any hardware you can throw at it.

We have been hard at work creating the best documentation and samples out there for any XMPP platform. Here are a couple of (draft) samples.

If you would like advanced access to the SoapBox 2007 Platform, shoot us an email over at [email protected] and we'll get you on the beta list.

Friday, September 1, 2006

Cross Platform Deployment Project Bootstrapper

I think we are one of a very small set of companies out there building consumer grade, shrink wrapped products on .NET. Why? Well, I'm not sure. It might have to do with Microsoft's positioning on the matter. All their documentation talks about enterprise deployments. It could also have to do with the runtime size. Some people think 50MB download for prerequisites is too much. In general, though, .NET, and Visual Studio 2005 especially, provide all kinds of great tools for building shrink-wrapped products. One of these is the Deployment Project.

To create a working Deployment Project you simply set a few properties and point it at your main executable project. Whether it is a Windows Service, Winforms, or Web Project, you will get an installer that will run for most people. By default in Visual Studio 2005 the Deployment Project generates a bootstrapper Setup.exe. This will download and install MDAC 2.8 and the .NET 2.0 redistributable, if necessary. However, there is one caveat. It only works on x86 based operating systems! I have to admit, this caught me off guard. I expected to check a couple boxes in the UI and have everything just work on every platform where .NET would run. Unfortunately this is not the case.

Luckily, the bootstrapper is very customizable. Using a simple MSBuild task and some XML files you can dynamically build bootstrappers for any MSI or executable file. So, I put together some bootstrapper packages for .NET 2.0 on x86, x64, and IA64 platforms, MDAC on x86, Windows Installer 3.1 on x86, and SQL Express 1.0 SP1 on any platform. The result: I can click a few check boxes in a deployment project and not have to worry about changing my bootstrapper packages for each platform.

Here is the bootstrapper set: CrossPlatformBootstrappers.zip

The key to these bootstrappers working on all platforms is that they do not fail if the platform doesn't match. They simply move on to the next bootstrapper. So you can choose all of our .NET 2.0 bootstrappers in your Deployment Project, change the target platform of the deployment, and not have to remember to change your bootstrappers.

In addition to the .NET 2.0 bootstrappers I also created a SQL Express 1.0 SP1 bootstrapper using SQLEXPR.exe, which will run on all platforms. However, instead of doing a default install of SQL Express, it installs a specific instance name. For us, this instance is SOAPBOX. You'll want to modify the bootstrapper package.xml and product.xml files to suit your needs for your SQL instance name (or just use the default). If the instance already exists on the computer, it won't be installed again.

Earlier I mentioned MSBuild. A major key to creating shrink-wrapped products is being able to get builds out quickly and accurately. We use MSBuild for this. We have tons of custom tasks, but one of the MSBuild tasks that ships with the framework is called GenerateBootstrapper. With this task you can point MSBuild at your Bootstrapper directory (the contents of the zip file), an msi, and a list of bootstrappers, and it will create a bootstrapper exe for you.

Deployment Projects and I have a love/hate relationship. There are many things about them I don't like, but the bootstrapper piece is wonderful! So, in the upcoming 2007 builds of SoapBox products, expect to see a seamless installation including download on demand for all the prerequisites your system will need, on any platform.

Tuesday, August 1, 2006

Using the MySql Command Line from C#/.NET

We have added two new data access providers to the upcoming SoapBox Server 2007 release. We now support PostgreSql and MySql as well as Microsoft SQL and Oracle. The code for these databases has been in our server and test libraries for a couple of months and this last week it was time to add them into our post installation configuration wizard. We strive to make our setup process as simple as possible. You'll notice many improvements over the 2005 wizard. We have better auto-configuration and fewer wizard screens.

One of our core philosophies here at Coversant is to make all of our software as easy to use as possible. Aside from testing, we spend more time doing this than any other R&D effort. Why is easy to use software so important? Well, the easier our software is to use the more you will like it and the less support we have to provide. We cut costs, you are happier. It's win/win. :)

Adding the automated setup process for these new data access layers was supposed to be very straightforward. In our configuration utility it's all abstracted so I just implement a few classes, the DB guy gives me the scripts to run and I call the appropriate tool (mysql.exe, psql.exe, osql.exe, etc), which we ship with our server for your convenience. However, there is one big feature missing from the mysql command line tool. There is no way to specify a file to use as input! But, it does take standard input. So, typically, if I were running the tool from the command line I would do something like: 'mysql -uroot -p < "c:\myscript.sql"'. If you're command line savvy you will know this redirects the file c:\myscript.sql as standard input into the mysql command line tool. I thought I could do this in .NET using command line arguments passed to the System.Diagnostics.ProcessStartInfo class. Well, I was wrong. It doesn't work (at least I couldn't figure out how to give it a file as stdin -- the "<" didn't work). That makes sense in hindsight: the "<" redirection is performed by the command shell, and with UseShellExecute set to false there is no shell in the picture to interpret it.

So, what's the answer? Simple. Set the nifty RedirectStandardInput property and write the contents of the file to the Process.StandardInput stream. Code follows:


// Needs: using System.IO; using System.Text;
private void ExecuteSQLScript(string databaseName, string user, string password, string command, string filename, string server)
{
    using (System.Diagnostics.Process p = new System.Diagnostics.Process())
    {
#if LINUX
        p.StartInfo.WorkingDirectory = System.IO.Path.GetFullPath(this.InstallOptions.InstallDirectory);
        p.StartInfo.FileName = "mysql";
#else
        //grab the path from our installation options
        p.StartInfo.WorkingDirectory = System.IO.Path.GetFullPath(System.IO.Path.Combine(this.InstallOptions.InstallDirectory, BaseScriptDirectory));
        p.StartInfo.FileName = System.IO.Path.Combine(p.StartInfo.WorkingDirectory, "mysql.exe");
#endif

        //set all the startup options
        p.StartInfo.CreateNoWindow = true;
        p.StartInfo.WindowStyle = System.Diagnostics.ProcessWindowStyle.Hidden;
        p.StartInfo.UseShellExecute = false; //required for any stream redirection
        p.StartInfo.RedirectStandardOutput = true;
        p.StartInfo.RedirectStandardError = true;

        //build the arguments for mysql
        StringBuilder args = new StringBuilder();

        if (!string.IsNullOrEmpty(password))
            args.AppendFormat("-p{0} ", password);

        if (!string.IsNullOrEmpty(server))
            args.AppendFormat("-h{0} ", server);

        if (!string.IsNullOrEmpty(databaseName))
            args.AppendFormat("-D{0} ", databaseName);

        if (!string.IsNullOrEmpty(user))
            args.AppendFormat("-u{0} ", user);

        //an inline command takes precedence; stdin is only redirected when we have a script file
        if (!string.IsNullOrEmpty(command))
            args.AppendFormat("-e\"{0}\" ", command);
        else if (!string.IsNullOrEmpty(filename))
            p.StartInfo.RedirectStandardInput = true;

        p.StartInfo.Arguments = args.ToString();

        WTrace.TraceInfo("Run Script", this.GetType(), "Executing: '{0}' with args '{1}' in working dir '{2}'", p.StartInfo.FileName, p.StartInfo.Arguments, p.StartInfo.WorkingDirectory);


        //start up the process, and handle the redirected stderr and stdout -- send them to our trace lib
        p.ErrorDataReceived += new System.Diagnostics.DataReceivedEventHandler(p_ErrorDataReceived);
        p.OutputDataReceived += new System.Diagnostics.DataReceivedEventHandler(p_OutputDataReceived);
        try
        {
            p.Start();
            p.BeginErrorReadLine();
            p.BeginOutputReadLine();

            //feed the script file to stdin, but only if stdin was actually redirected above
            //(touching p.StandardInput without redirection throws InvalidOperationException)
            if (p.StartInfo.RedirectStandardInput)
            {
                using (FileStream f = File.OpenRead(filename))
                {
                    using (StreamReader reader = new StreamReader(f))
                    {
                        //we could do this one line at a time to be safe memory wise, but we know our scripts are small
                        p.StandardInput.WriteLine(reader.ReadToEnd());
                    }
                }

                //tell mysql we want to exit, otherwise the process will hang
                p.StandardInput.WriteLine("exit");
            }

            p.WaitForExit();
        }
        finally
        {
            p.ErrorDataReceived -= new System.Diagnostics.DataReceivedEventHandler(p_ErrorDataReceived);
            p.OutputDataReceived -= new System.Diagnostics.DataReceivedEventHandler(p_OutputDataReceived);
        }

        if (p.ExitCode != 0)
            throw new MySqlException("SQL Command failed.", p.StartInfo.Arguments, "", "");
    }
}

//forward anything mysql writes to stderr into our trace lib
void p_ErrorDataReceived(object sender, System.Diagnostics.DataReceivedEventArgs e)
{
    if (!string.IsNullOrEmpty(e.Data))
        WTrace.TraceInfo("Run Script", this.GetType(), "StdErr: {0}", e.Data);
}

//forward anything mysql writes to stdout into our trace lib
void p_OutputDataReceived(object sender, System.Diagnostics.DataReceivedEventArgs e)
{
    if (!string.IsNullOrEmpty(e.Data))
        WTrace.TraceInfo("Run Script", this.GetType(), "StdOut: {0}", e.Data);
}

Wow, that's a lot of code. Ok, so I probably didn't have to paste that much of it, but I wanted you to get an idea of the full extent of the method, and I think being able to run mysql scripts is useful.
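The comment in the method mentions that the script could be fed to mysql one line at a time to be safe memory wise. Here is a minimal sketch of what that might look like. ScriptFeeder and CopyLines are names I made up for illustration; they are not part of SoapBox or of .NET.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the SoapBox code base): streams a script
// into a child's stdin without loading the whole file into memory.
static class ScriptFeeder
{
    // Copies every line from 'script' to 'stdin' and returns the line count.
    public static int CopyLines(TextReader script, TextWriter stdin)
    {
        if (script == null) throw new ArgumentNullException("script");
        if (stdin == null) throw new ArgumentNullException("stdin");

        int count = 0;
        string line;
        while ((line = script.ReadLine()) != null)
        {
            stdin.WriteLine(line);
            count++;
        }

        stdin.Flush();
        return count;
    }
}
```

Assuming the method above, the call would replace the ReadToEnd line: ScriptFeeder.CopyLines(reader, p.StandardInput). For small scripts the difference is negligible, which is why the original just reads the whole file.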

When all is said and done you now have two screens for installing to a MySql database: 1) enter username, password, and hostname 2) choose an existing or new database name. From there the installer figures everything else out automatically.

Wednesday, July 26, 2006

More on Interop

I'm sitting on a plane on my way back to Sacramento (a whopping 1 hour flight) and I thought I'd try to give a little more information about the interop event. After two days of testing it appears as though we're actually very close to having quite a few fully interoperable XMPP implementations. As I mentioned last time, Monday was a cake walk. Unfortunately Tuesday wasn't quite so easy.

Consensus was reached on Monday to start later on Tuesday. We wanted to do 10am, but had reservations at Jive's office to sample beers of the northwest at 5 (yeah they have some good beers) and play some XBox, so we decided to start at 9 instead. The morning session consisted mostly of protocol discussions that were very productive. We decided on the general protocol flow of PEP (Personal Eventing via Pubsub). This protocol addition allows us to create some very innovative and interesting extended presence features (more on this later) in future versions of SoapBox Communicator, and they'll be compatible with features in other clients such as Google Talk, GAIM, etc.

I had some really good Thai food from a street vendor, and I'm still not sick, so that's good. In fact it was some of the best Thai food I've had in a long time, and I eat Thai about once a week.

After lunch we got down to business with more interoperability testing. The goal: mutually authenticated TLS streams between servers per RFC 3920. A certificate authority was created, X.509 Certificates were generated, servers were configured, and then... it didn't work. Unlike the overwhelming success of Monday, Tuesday brought the skeletons out of the closet. We soon realized there were numerous breaking differences between OpenSSL, Java, and .NET based implementations of TLS. When we started, absolutely none of the servers were able to talk to each other over a fully trusted TLS connection. The interesting thing was we could all talk to another instance of our own servers. Hmmm.

After a few hours of hacking and debugging we realized there was significant work that needed to be performed and we didn't have enough time to do it. We were able to get connected with a few of the servers, and vice versa, but there definitely wasn't a Happy Path for all. As a result the server to server TLS specifications in RFC 3920 will be clarified, as we eventually reached consensus on what it all really means and how it should be implemented.

In the very near future the JSF will be facilitating ad-hoc interoperability testing over the internet. It will be managing domains (such as soapbox.xmpp.org, google.xmpp.org, etc) where all participating vendors and open source projects will host servers. These will be semi-private domains without open registration, but open to anyone developing XMPP applications that need to test interoperability.

All in all, this was a very successful couple of days. We probably saved a good two months worth of bickering over e-mail lists to figure out protocol issues, verified that XMPP is in fact interoperable, and set the stage for future interoperability testing. We'll also be exploring fully automated tests, which Coversant will likely contribute to the JSF, to make sure everyone continues to play nice in the future. :)

Monday, July 24, 2006

Interoperability - Yup, we got it covered

Today was the first day of the first ever official XMPP Interop Event. In fact, it was probably the first day of any open instant messaging and presence interop event, ever.

We had showings from Coversant (yeah, that's us), Sun Microsystems, SixApart (LiveJournal), Google, Jive Software, Jabber Inc, and Process-one. Some of those names might sound familiar, and others not, but in the end what we ended up with was seven completely different XMPP server code bases/implementations, both open and closed source, set up on a LAN and federating with each other. We spent more time configuring DNS, IP addresses, and other networking junk than we did fixing bugs that were hampering interoperability.

As far as I know (who knows what everyone was hiding on their laptop screens) there were only a few major bugs that were found and they were on two of the very freshest of server implementations, one of which the vendor considers in "pre-alpha" release status. No, I won't tell you who, but I bet they release patches soon. :)

So what the heck did we do all day? Well, we tested interoperability, obviously! But more specifically:


  • Inter-domain roster manipulation - Add and remove contacts on other servers.
  • Inter-domain messaging - Can we actually hold conversations with each other?
  • Inter-domain presence - Avatars, status messages, show states, etc
  • pizza
  • UTF8 support - Strange unicode characters in other languages in addresses, messages, and presence.
  • Discussed quite a few of the enhancement protocols pending for the XMPP specification and came to consensus on some issues.
  • Mingled with everyone and shared some anecdotes on implementing XMPP that pretty much only the people in that room would understand.
  • beer

What does that mean to you, a loving Coversant customer -- you are a customer, right? Well, it means that you can securely talk through IM in a federated manner to your trading partners, friends, family, and arch enemies even if they aren't one of our customers. Yeah, we support that, and apparently it works. ;)

#1 on the list for tomorrow, trusted TLS based inter-server connections. So we have the joy of setting up a trusted certificate authority and distributing certs. I bet setup will take longer than the interoperability testing itself, again.

Monday, June 26, 2006

How to Build Scalable .NET Server Applications: Memory Management

I'll get this out of the way from the start. This series of blogs will have nothing to do with ASP.NET or web services. However, if you plan on writing your own implementation of IIS in managed code this would probably be a good place to start. :) I also won't be providing very many code examples, as I'd be flogged by our intellectual property lawyers. You will not be able to copy and paste and create your own scalable server. However, I hope to provide enough insight so you can avoid a big list of gotchas we have had to figure out the hard way. This is one piece of a huge puzzle: memory management. Yes, you do have to think about that in .NET, at least if you want to build a large scale application.

For those who don't already know, SoapBox Server is a part of our SoapBox Collaboration Platform that supports the XMPP protocol as well as most of the interesting JEP extensions. At the core of SoapBox Server is a highly efficient Socket server and thread machine capable of scaling into the hundreds of thousands of simultaneous users, and it's built 100% on .NET (C# now, but used to be VB).

SoapBox Server is the first multithreaded Socket based server application I've had the pleasure of working on. During the course of building the SoapBox Server into the extremely scalable and reliable system it is today I've learned a few things (as has the rest of the team, I hope). Thanks to Chris (who already had tons of experience with such things in Win32/C++), a few bloggers out there, some books, customers finding very interesting bugs, Windbg with Son of Strike, oh and Starbucks, I'd say I'm pretty well versed in the land of building scalable server applications. I'm no Jeff Richter, mind you, but I feel I have now learned enough to at least speak intelligently about it.

In that spirit I'd like to share the fruits of our tuning and debugging work, which, if history repeats itself, will continue to evolve as we begin work on our next major revision of the product. First, I'd like to repeat something I said a couple paragraphs ago: SoapBox now scales to hundreds of thousands of simultaneous connections with a single piece of server hardware. Think about that for a second. A user brings up an IM client, connects to SoapBox Server, and then holds that connection open until they Log Out. Repeat hundreds of thousands of times. This is no simple task. The .NET CLR does not provide a magic "Process.Scalable = true" property. We have invested hundreds of hours into tuning (maybe thousands) over the life of the server on classes of hardware varying from single processor laptops to 16-way Itanium2 systems with 64GB RAM. We've been through four distinct processing models as well as quite a few iterative improvements on our Socket interaction layer. Basically we have run the server under a bunch of different profilers under many scenarios, found slow bits of code, and fixed them. But I'm not going to talk about profiling and performance tuning; perhaps another time. I'm going to talk about memory and scalable applications.

Every time your application creates a new Socket, Windows pulls memory from its Nonpaged Kernel memory, which is simply physical memory that is reserved by the kernel and will never be paged out to disk. This block of memory has a finite limit and the kernel picks the limit based on the amount of physical RAM available to it. I don't know the exact algorithm, but with 4GB RAM it's usually somewhere around 150,000 TCP Socket connections, give or take. Want to see this in action? Simply create a loop that instantiates sockets. It will stop working eventually with a SocketException telling you there isn't enough buffer space. On top of this hard kernel level limitation, you also have to worry about how much memory each concurrent connection uses in your own application. In SoapBox we store a lot of information about each connection in memory in order to improve performance and decrease our IO operations. This includes things like the user's contact list, their last presence (available, away, busy, etc), authorization information, culture information, user directory information, etc. If we didn't hold this in memory we'd have to hit a file, database, or some other out of process persistent store for the information every time we needed it. Being IO bound is no fun. Believe me, we started out that way.
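Here is a rough sketch of that socket loop. SocketLimitProbe is a made-up name, and I've added a maxSockets cap (my own assumption, not something the kernel needs) so you can try it without actually starving the machine. Run it uncapped only on a box you don't mind rebooting.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Sockets;

// Hypothetical probe: creates unconnected TCP sockets until the OS refuses
// or the caller's cap is reached, then releases them all.
static class SocketLimitProbe
{
    public static int CountSockets(int maxSockets)
    {
        List<Socket> sockets = new List<Socket>();
        try
        {
            while (sockets.Count < maxSockets)
            {
                sockets.Add(new Socket(AddressFamily.InterNetwork,
                                       SocketType.Stream,
                                       ProtocolType.Tcp));
            }
        }
        catch (SocketException ex)
        {
            // On Windows this is where the "no buffer space" error would surface.
            Console.WriteLine("Stopped after {0} sockets: {1}",
                              sockets.Count, ex.SocketErrorCode);
        }

        int created = sockets.Count;

        // Give the kernel its memory back.
        foreach (Socket s in sockets)
        {
            s.Close();
        }

        return created;
    }
}
```

Calling SocketLimitProbe.CountSockets(int.MaxValue) is the "see it in action" version; the exact number you get depends on the machine, the OS, and whatever else is holding sockets at the time.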

However, because of our extensive caching, SoapBox Server 2005 can only reliably handle about 20,000 simultaneous connections on the beefiest of 32 bit hardware (on 64 bit it's much, much, much higher -- I also have to admit we haven't stress tested the 2007 build on 32 bit hardware, it would probably be much higher now). It doesn't matter if you have 64GB RAM and 16 32bit processors, we can still only handle 20,000 connections. Why, you ask? Well, it's because of the 2GB (well, really 3GB with a boot.ini switch) virtual memory limit per process in 32bit Windows. Without delving into managing your own memory your process is only allowed up to 3GB to play with. Typically, we use that up, or rather, .NET thinks we use it up, somewhere between 20,000 and 30,000 connections. Now why would I say ".NET thinks we use it up?" Story time!

A little over a year ago one of our customers kept running into a very bad situation. As evidenced by the Event Log, SoapBox Server was crashing (insert shock and awe here). It was an irregular occurrence, but it did happen, and we did not take this lightly. This customer was running about 2,500 simultaneous connections on a Dual Xeon with Hyperthreading and 4GB RAM and the /3GB switch set. It was plenty of hardware for the job, and probably overkill. However, the service was still crashing. We set them up with the Debugging Tools For Windows and had them start up the process to wait for a crash (another blog we'll have to write some day). After a few tries we got a dump with some useful information in it. The result? We were out of memory, sort of.

In .NET when you call any socket operation and pass it a buffer, whether it be a send or receive, synchronous or asynchronous, it takes that buffer and pins it before giving it to the Winsock APIs. Pinning, in a nutshell, is taking a .NET data structure and telling the .NET CLR memory manager not to move it until it is explicitly un-pinned. The memory manager in the CLR is smart. As you allocate and deallocate memory it is constantly defragmenting it for you so the overall memory footprint is lower. There are quite a few really good/long/complicated articles on how this works so I won't bore you. However, pinning throws a wrench in this and the memory manager isn't quite smart enough to deal with it well (though it has gotten a lot better in 2.0). Basically, that buffer you want to put on the socket cannot move in memory (physically -- in terms of your virtual memory space) from the time the socket IO operation begins until it ends. If you look at the Winsock2 APIs this is obvious, since the buffer is passed as a pointer. Anybody who's built this type of application in Winsock2 is probably saying "DUH!". I'd consider this a very leaky abstraction. Due to this behavior, it is quite easy to write a socket application in .NET that runs out of memory.
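If you've never pinned anything yourself, here is a small illustration using GCHandle, the public API for explicit pinning. To be clear, this is just a way to see the effect; I'm not claiming it is the exact mechanism the Socket class uses internally. PinningDemo is a made-up name, and the 2KB buffer size simply mirrors the buffers in the story.

```csharp
using System;
using System.Runtime.InteropServices;

// Illustration only: pins a buffer, churns the heap, forces a collection,
// and checks that the pinned buffer's address did not change.
static class PinningDemo
{
    public static bool AddressIsStableWhilePinned()
    {
        byte[] buffer = new byte[2048];
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr before = handle.AddrOfPinnedObject();

            // Create some garbage so the collector has a reason to compact.
            for (int i = 0; i < 10000; i++)
            {
                GC.KeepAlive(new byte[1024]);
            }
            GC.Collect();

            // The collector had to work around the pinned buffer.
            IntPtr after = handle.AddrOfPinnedObject();
            return before == after;
        }
        finally
        {
            // Forgetting to free the handle leaves the buffer pinned forever.
            handle.Free();
        }
    }
}
```

Every long-lived pin is an immovable rock in the heap. A handful of them scattered high in memory is all it takes to get the kind of fragmentation described here.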

Back to the story! Not only were we out of memory, but there was only about 200MB worth of data structures in the heap. For those of you like me that use calc.exe for all your basic math let me figure that out for you, 200MB > 3GB. Uhh, say what? How the heck were we out of memory? Well, we ran into the shortfall of pinning and memory fragmentation. The cause of this was a small number of small pinned buffers, in our case 2KB each, that were high enough in the heap to cause fragmentation spanning over 2.8GB. Where did the other 2.8GB go, you ask? Well, it was there, allocated by our process, but not being used by our code. In Son Of Strike (SoS -- a command line plug-in to the Windbg debugging tool I hope you never have to use) this showed up as free, empty, unused space! It was just sitting there waiting to be used, but we still ran out of memory. I think I mentioned earlier the memory manager in .NET isn't so smart when it comes to fragmented memory and pinning. Well, this is what happens in the worst case.

Good thing for you, the answer to all your memory fragmentation and pinning woes is quite simple. Pre-allocate buffers for use by anything that will be causing pinning, and do it early on before there is a lot of memory thrash (when your application is rapidly allocating and deallocating a lot of memory). We created a simple class called a BufferPool that we use to pre-allocate a certain number of buffers. This pool can grow as need be, but it does so in large chunks and forces a garbage collection each time before the buffers are actually used. This considerably reduces the chances of fragmentation caused by pinned memory. If the pool starts off with 500 buffers, but then the 501st buffer is needed it will grow by a configurable value, typically another 500 buffers, and the induced garbage collection will cause these buffers to shift to the lowest possible point on the heap.
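I can't paste our BufferPool, but here is a minimal sketch of the idea as described above: pre-allocate, grow in chunks, and induce a collection on each growth. Every name in it (Rent, Return, TotalBuffers, FreeBuffers, the constructor parameters) is my own invention for illustration, and a real pool would want more policy around limits and misuse.

```csharp
using System;
using System.Collections.Generic;

// Sketch of a pre-allocating buffer pool. Not the SoapBox implementation.
sealed class BufferPool
{
    private readonly object _sync = new object();
    private readonly Stack<byte[]> _free = new Stack<byte[]>();
    private readonly int _bufferSize;
    private readonly int _growBy;
    private int _total;

    public BufferPool(int bufferSize, int initialCount, int growBy)
    {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException("bufferSize");
        if (initialCount < 0) throw new ArgumentOutOfRangeException("initialCount");
        if (growBy <= 0) throw new ArgumentOutOfRangeException("growBy");

        _bufferSize = bufferSize;
        _growBy = growBy;
        Grow(initialCount); // allocate early, before the heap gets busy
    }

    public int TotalBuffers { get { lock (_sync) { return _total; } } }
    public int FreeBuffers { get { lock (_sync) { return _free.Count; } } }

    // Hands out a buffer, growing the pool by a whole chunk if it is empty.
    public byte[] Rent()
    {
        lock (_sync)
        {
            if (_free.Count == 0) Grow(_growBy);
            return _free.Pop();
        }
    }

    // Gives a buffer back once the socket operation has completed.
    public void Return(byte[] buffer)
    {
        if (buffer == null || buffer.Length != _bufferSize)
            throw new ArgumentException("Buffer does not belong to this pool.", "buffer");
        lock (_sync) { _free.Push(buffer); }
    }

    private void Grow(int count)
    {
        for (int i = 0; i < count; i++)
        {
            _free.Push(new byte[_bufferSize]);
        }
        _total += count;

        // Induce a collection before any new buffer is handed out (and pinned),
        // so the fresh chunk can be compacted while it is still movable.
        GC.Collect();
    }
}
```

Usage would look like new BufferPool(2048, 500, 500), with Rent before each BeginReceive or BeginSend and Return in the completion callback. The forced GC.Collect is normally something to avoid; here it is a deliberate trade, paying for one collection per growth in exchange for buffers that settle before they ever get pinned.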

Interestingly enough, when we found this bug we already knew about the pinning behavior of socket operations, but had only solved half of it. All of our BeginReceive calls were using the BufferPool because we knew the buffers would remain pinned until we received data from a client, but the BeginSend calls were not using the pool. We had not even considered the fact that sending a few KB of data might take long enough to pin memory, fragment the heap, and cause an OutOfMemoryException. But there is one case where it does: timeouts. The Windows TCP subsystem is very forgiving. If a client loses its connection and the server isn't explicitly told about it, the next piece of data you try to send to that client socket will end up being pinned while the TCP subsystem waits for the client to respond. It can take up to 5 minutes with the default configuration of Windows for the TCP subsystem to figure out the client isn't really there. During that entire time your buffer is pinned in memory. *poof* OutOfMemoryException.

Unfortunately, pre-allocating buffers does not completely fix the issue of running out of memory. There are also some other limits to the size of a .NET process's virtual memory space that are very complicated and I won't talk about, but basically you end up with anywhere from 1/2 to 2/3 usable virtual memory without running the risk of OutOfMemoryException. So, if you have 2GB virtual memory available (standard on a 32bit machine), you end up with about 1.3GB you can actually use reliably. Of course, this varies, and some applications will be able to use more, or maybe less. Your mileage may vary.

Don't fret, all of the issues I've talked about in here have been fixed since SoapBox Server 2005 SR1. And with the most common usage patterns people were not actually affected to begin with.

I hope this was at least marginally interesting to someone. :) Next up, I'll probably talk about limitations we discovered in the Windows Socket infrastructure, or maybe async IO, IOCP, and worker threadpools, or maybe how in the world we actually test at this scale. Only time will tell, unless Chris beats me to it.

About the Author

JD Conley is an entrepreneur and hacker, currently working away his golden handcuffs at Playdom, a subsidiary of the Walt Disney Company, since Hive7 was acquired. We make social games. The views and opinions expressed on this post are his and do not necessarily represent or reflect those of The Walt Disney Company.