Starin' at the Wall: 2006

Thursday, December 14, 2006

Installing Coversant Products on Vista

Due to the enhanced security in Windows Vista, not all Coversant products can be installed out of the box. Luckily, this is really easy to work around and, rest assured, future versions of our installation packages will not suffer from these issues.

The symptoms show up as an error message dialog with code 2869:

And then a series of empty dialog boxes:

This is apparently some sort of permissions issue. To solve it, the MSI needs to be run as an administrator. The easiest way to do this is to create a bat file to run the msi manually. It would be something like this:

msiexec.exe /i "c:\SoapBoxServer2007\files\SoapBox.Server.Enterprise.x64.3.0.213.69.msi"

Then you right click on the bat file and choose "Run As Administrator". Presto, a working installation in Vista.

Tuesday, November 21, 2006

XSLT For MSDN Product Keys

Here at Coversant we're Microsoft partners. We have MSDN subscriptions for all our developers/testers, and we share the same set of license keys. Rather than give everyone willy-nilly access to the MSDN download web site (ick, lots of bandwidth suck) we set up an internal file share for MSDN installation files, CD images, etc. We used to have all the product keys in there just saved as HTML from Microsoft's web site. However, that's no fun!

In the spirit of having fun and being developer friendly, Microsoft is nice enough to offer an "Export Key List to XML" button on the web page where we view all our product keys. So, I clicked the button. Out popped a very nicely/simply formatted XML document, like the following:

<?xml version="1.0" standalone="yes"?>
<Your_Product_Keys>
 <Product_Key
  Name="All products requiring a 10-digit product key"
  Key="xxx-xxx-xxxx"
  Key_Type="Retail"/>
 <Product_Key
  Name="Windows Vista Ultimate"
  Key="XXXXX-XXXXX-XXXXX-XXXXX-XXXXX"
  Key_Type="Retail"
  Date_Key_Claimed="2006-11-20 17:36:11.787"/>
 <Product_Key
  Name="Windows Server 2003 R2 Standard Edition"
  Key="XXXXX-XXXXX-XXXXX-XXXXX-XXXXX"
  Key_Type="Retail"
  Date_Key_Claimed="2006-11-20 17:36:08.270"/>
 <Product_Key
  Name="Office 2007 Desktop Programs"
  Key="XXXXX-XXXXX-XXXXX-XXXXX-XXXXX"
  Key_Type="Retail"
  Date_Key_Claimed="2006-11-20 17:35:57.537"/>

  ...

</Your_Product_Keys>

I whipped up a quick XSLT you might find useful to transform this into a really ugly, really simple XHTML page (yeah, without the namespace declaration).

<?xml version="1.0" encoding="ISO-8859-1"?>

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
 <html>
 <body>
   <h2>MSDN Product Keys</h2>
   <table border="1">
   <tr bgcolor="#9acd32">
     <th align="left">Product Name</th>
     <th align="left">Product Key</th>
   </tr>
   <xsl:for-each select="/Your_Product_Keys/Product_Key">
   <tr>
     <td><xsl:value-of select="@Name"/></td>
     <td><xsl:value-of select="@Key"/></td>
   </tr>
   </xsl:for-each>
   </table>
 </body>
 </html>
</xsl:template>

</xsl:stylesheet>

Then I made one quick edit to the xml file so friendly xslt aware web browsers will do the transform for me:

<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="productkeys.xslt"?>
<Your_Product_Keys>
  ...
</Your_Product_Keys>

Presto-chango, we now have an easy to update central listing of our MSDN keys. Now if only MS had a web service that took my Live ID credentials and gave me back that XML...
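If your browser isn't XSLT aware, the same table can be built in a few lines of script. Here's a minimal Python sketch of the equivalent transform using only the standard library. The function name keys_to_html and the trimmed sample document are mine, just for illustration. This isn't a real XSLT processor; it simply walks the same /Your_Product_Keys/Product_Key nodes the stylesheet selects.

```python
import xml.etree.ElementTree as ET
from html import escape

# Trimmed sample in the shape of the exported key list shown above.
SAMPLE_XML = """<?xml version="1.0" standalone="yes"?>
<Your_Product_Keys>
  <Product_Key Name="Windows Vista Ultimate"
               Key="XXXXX-XXXXX-XXXXX-XXXXX-XXXXX"
               Key_Type="Retail"/>
  <Product_Key Name="Office 2007 Desktop Programs"
               Key="XXXXX-XXXXX-XXXXX-XXXXX-XXXXX"
               Key_Type="Retail"/>
</Your_Product_Keys>
"""


def keys_to_html(xml_text):
    """Build the same two-column table the stylesheet produces."""
    root = ET.fromstring(xml_text)
    rows = []
    # Same node set as the xsl:for-each in the stylesheet.
    for product in root.findall("Product_Key"):
        name = escape(product.get("Name", ""))
        key = escape(product.get("Key", ""))
        rows.append("<tr><td>%s</td><td>%s</td></tr>" % (name, key))
    return (
        "<html><body><h2>MSDN Product Keys</h2>"
        '<table border="1">'
        '<tr bgcolor="#9acd32"><th align="left">Product Name</th>'
        '<th align="left">Product Key</th></tr>'
        + "".join(rows)
        + "</table></body></html>"
    )


if __name__ == "__main__":
    print(keys_to_html(SAMPLE_XML))
```

The XSLT route is still nicer for a file share, since nothing has to run when the XML is updated.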

Wednesday, November 8, 2006

Compact Framework WaitHandle.WaitOne Gotcha

I ran into a behavior in the 2.0 Compact Framework today that was most vexing. It wasn't hard to find like a subtle race condition. It wasn't an issue that only duplicated with a certain system configuration, under a full moon, on Wednesday. No, it duplicated every single time the code was run. But, it wasn't documented anywhere I could find.

One of my favorite new features in the Compact Framework is the availability of the WaitHandle.WaitOne(int, bool) overload. That's something we use quite a bit in our test code and here and there in the actual SoapBox Framework. We used to have our own ManualResetEvent implementation for the Compact Framework that P/Invoked out to Windows CE. But Microsoft was nice enough to add this into the 2.0 Framework. Yay! (In case you never had the joy of programming to the 1.0 Compact Framework, the only WaitOne overload that was there was the indefinitely blocking one. No timeouts allowed.)

I was running our unit test suite against the Compact Framework and tests that used to pass until I ripped out our custom P/Invoking ManualResetEvent implementation were failing. Odd... Well, it turned out to be very easy to track down. WaitHandle.WaitOne(int, true) throws an ArgumentException every single time. That's right. If you pass true to that second parameter, the exception is thrown.

Don't get me wrong, I understand the implications of exiting the synchronization domain for the context or not. It turns out the code that was causing the error should have been passing "false" for the parameter anyway. But, why did the call throw an exception? Why not just ignore the argument when it's not relevant as the documentation alludes? And I quote from MSDN2: "The exitContext parameter has no effect unless the WaitOne method is called from inside a nondefault managed context."

Anyway, if you're doing any Compact Framework development, make sure to pass "false" into the exitContext parameter of your WaitHandle.WaitOne calls.

Friday, September 29, 2006

The Catch with GoDaddy Dedicated Hosting

I'm not known to get easily frustrated or publicly rant about something, but here goes. The public web site and soapbox.net IM service for Coversant are hosted with GoDaddy using their dedicated server plans. This makes things real easy for us. We don't have to manage hardware somewhere in a colo. We get an ftp site for backups. And most of all we get the convenience and control of a dedicated system. It's also very reasonably priced. Ah, but there's always a catch.

Some time near the end of last month our daily and hourly SQL Server backups went on a rampage. The hourly transaction log backup we have scheduled was stuck in a loop so it never reset itself. We were uploading the entire transaction log for the month every hour. The end result was a constant stream of uploading to our FTP backup site, which is hosted by GoDaddy as well. Oh, I should mention the FTP backup is a service we pay for monthly in addition to any dedicated server plan. Here comes the fun part. The bandwidth used to connect to our FTP backup site is charged at the normal rate as if you were serving up web pages to customers. If you go over your monthly allotment you get charged the burst bandwidth rate of $1.99/1000MB. Last month we went over 661 times, or roughly 661GB. You can do the multiplication there. I should mention a nice customer support rep discounted it 50% for us. I'd really hate to have his job. . .
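For anyone who doesn't want to do the multiplication, here it is as a small Python sketch. The rate and the 661 overage units come from the numbers above. Applying the 50% discount to the whole overage and rounding to the nearest cent are my assumptions, not figures taken from the actual invoice.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_PER_1000MB = Decimal("1.99")  # burst bandwidth rate quoted above
OVERAGE_UNITS = 661                # times we went over, roughly 661GB
DISCOUNT = Decimal("0.50")         # assumed to apply to the full overage

CENT = Decimal("0.01")


def overage_charge(units, rate=RATE_PER_1000MB):
    """Charge for the given number of 1000MB overage units."""
    return (rate * units).quantize(CENT, rounding=ROUND_HALF_UP)


full_charge = overage_charge(OVERAGE_UNITS)
discounted_charge = (full_charge * (1 - DISCOUNT)).quantize(
    CENT, rounding=ROUND_HALF_UP
)

print(full_charge)        # 1315.39
print(discounted_charge)  # 657.70
```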

While this isn't that much money in the scheme of things, it is really freaking annoying. It would have been nice to get a phone call letting us know the first time we went over, not when we got our monthly statement. And why do we have to pay for bandwidth to a service we pay for that's hosted by our hosting provider and is very likely even located in the same data center? Aaargh!

So, in conclusion, we're researching what needs to be done to move to a local colo or perhaps another, slightly more expensive but not so unreasonable, dedicated hosting provider. Oh, we're still sorting out what caused the SQL backup issue.

Monday, September 25, 2006

Gentoo Linux MAC Based Host Name

We use Gentoo for all our Linux development (yeah, we do Linux -- coming very very soon). In fact we are currently putting together a test lab with quite a few computers using Gentoo, Mono, and our own StressBot software to drive client load to our server.

The lab currently consists of 25 computers, pointed at whatever server hardware/software configuration we happen to be testing. I'll post some pictures some day, I promise. As any IT guy will tell you, managing 25 computers isn't trivial. You have to make sure images stay in sync with the correct software builds, monitor that everything is working, etc. In a test lab this isn't so horrible, since our hardware is identical and we just use Ghost software and push out new images. However, this does bring up some interesting issues, the main one being the hostname. On Windows there are more issues, but we won't get into that.

On your network the hostname uniquely identifies your system. If you're pushing out images, this gets a little hairy. Most organizations use some sort of boot script that contacts a central repository to take care of this. We didn't need that much control for our lab. Instead, we decided just to base the hostname on the MAC address of the primary NIC, a very simple and guaranteed unique mechanism. Since we're using Gentoo we have the /etc/conf.d/hostname file that is used at boot time to set the hostname. Here's what we used to set the hostname to contain the MAC address:

lab0002A51B9F16 tmp # cat /etc/conf.d/hostname
  # /etc/conf.d/hostname
  # Set to lab+MAC (without ":"). IE: lab0002A51B9F16
  HOSTNAME=lab`ifconfig eth0 |awk '/HWaddr/ {print $5}'|sed 's/://g'`

I'm sure there are more efficient commands that could be used here (I'm no scripting geek), but this works. :) I spent a couple hours hunting for something like this and couldn't find it, so I hope it's of use to someone.
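The same parsing step can be sketched in Python, which makes it easy to test away from a lab machine. The function name hostname_from_ifconfig and the sample line are hypothetical. The sample assumes the older net-tools ifconfig layout, where the MAC follows the word HWaddr (that's what the awk pattern above relies on); newer ifconfig versions print "ether" instead and would need a different pattern.

```python
import re

# One line in the older net-tools ifconfig format (assumed layout).
SAMPLE_IFCONFIG = "eth0      Link encap:Ethernet  HWaddr 00:02:A5:1B:9F:16"


def hostname_from_ifconfig(output, prefix="lab"):
    """Return prefix + MAC with the colons stripped, like the awk/sed pipeline."""
    match = re.search(r"HWaddr\s+((?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})", output)
    if match is None:
        raise ValueError("no HWaddr field found in ifconfig output")
    return prefix + match.group(1).replace(":", "")


if __name__ == "__main__":
    print(hostname_from_ifconfig(SAMPLE_IFCONFIG))  # lab0002A51B9F16
```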

Saturday, September 16, 2006

SoapBox Platform Possibilities

Our customers do some very interesting things with our platform, including:

RPC – The asynchronous nature of XMPP along with its addressing and our platform make it ideal for application to application messaging. One of our customers wrote a .NET Remoting transport on SoapBox, and many use it for other custom RPC needs (some are below).
Network gaming infrastructure – sending game data, hosting chat rooms, private chat during games, etc
In class test taking – Tests are distributed to Tablet PCs when students log in and results are tallied.
Geographical Chat – Whiteboarding combined with maps and group conferencing.
Financial Market Data – Real time data from the market flows into applications used by traders.
Social Networking – Consumer social networking site using our platform for chat, content, and advertising delivery to the desktop.
Emergency Alerting – “The nuclear power plant is melting down. Evacuate!”
Remote Surveillance Control – Watch your surveillance cameras at home in any web browser and control them.
Automated Manufacturing Alerting – “Line 5 is clogged. Attention required!”.
Plain ole’ chat built into an existing application – Give users access to the people they need in the application they are familiar with.
Build System Alerts – Our build system sends us messages as our daily builds run, letting us know the status.
Web Based Live Support – Communicate with customers live, through a web site.
Voice and Chat on Ruggedized Handhelds - Push to talk through a contact list with presence, send messages, pictures, or go into walkie talkie mode.
And the list goes on…

Some have been in the news, some are still in "stealth mode", but unfortunately I can't mention any names of companies. We will have some case studies coming out for a few of them.

Some of these applications are built on our 2005 platform, but many are built on our upcoming 2007 release. We've been working hard with our partners to make sure this upcoming release is something special. Here's a little overview of the new SoapBox platform.

SoapBox Framework (now SoapBox Studio)

The SoapBox Framework was our first product offering. We built and productized a framework knowing we’d want to build a server and an advanced communicator client. It started out as one framework, and has grown into quite a bit more. With the 2007 product release we will be distributing our frameworks in one package called the SoapBox Studio. They are all based on the same code base, which allows us to quickly add features to our entire product line. These frameworks include:

Desktop Edition – Build desktop and web applications in .NET for Windows, Linux, and Solaris.
Mobile Edition – Build mobile applications that run on PocketPC, Smart Phone, Windows Mobile, or Windows CE operating systems.
Web Service Edition – This is my favorite. :) A standard SOAP web service, built using our Desktop Edition on the back end, that allows you to integrate with any language that supports web services. This includes Java, ColdFusion, Flash, C++, Perl, PHP, and more (basically every language out there).
Server Administration Edition – All of the features that are available in our Management Console are available to you through public APIs. You can quickly perform tasks such as adding users, managing contact lists, and retrieving message archives.
Server Edition – This allows you to build plug-ins to the SoapBox Server to manage users from your own custom user store, create custom components to service client requests, filter messages, do custom logging, and manipulate the way the server works in general.

SoapBox Server

The SoapBox Server is our flagship XMPP server product. It is based on our Desktop Edition Framework with additional layers to do everything a server needs to do. You can easily customize the Server to meet your specific needs through the SoapBox Framework Server and Administration SDKs.

SoapBox Communicator

The SoapBox Communicator is our client software built on our SoapBox Framework Desktop Edition (as well as another layer that we will be productizing soon) and serves as an example of how to best utilize the framework for client side development.

Other Key Platform Features

Open, well normalized database – We use a strict Enterprise nTier model in the SoapBox Server. Our database is well normalized with stored procedures for all interactions. This makes it extremely easy to integrate users, contact lists, message archiving, and presence with any environment that can read or write to a database.
Shared code base - Whether you’re writing an application for a Smartphone, or a server plug-in to handle workflow in a CRM application, your code interacting with SoapBox will look strikingly similar. This better utilizes your developer resources by focusing them on their business problem, not re-learning another API.
Unprecedented vertical scalability - SoapBox Server will fully utilize any hardware you can throw at it.

We have been hard at work creating the best documentation and samples out there for any XMPP platform. Here are a couple of (draft) samples.

If you would like advanced access to the SoapBox 2007 Platform, shoot us an email over at [email protected] and we'll get you on the beta list.

Friday, September 1, 2006

Cross Platform Deployment Project Bootstrapper

I think we are one of a very small set of companies out there building consumer grade, shrink wrapped products on .NET. Why? Well, I'm not sure. It might have to do with Microsoft's positioning on the matter. All their documentation talks about enterprise deployments. It could also have to do with the runtime size. Some people think 50MB download for prerequisites is too much. In general, though, .NET, and Visual Studio 2005 especially, provide all kinds of great tools for building shrink-wrapped products. One of these is the Deployment Project.

To create a working Deployment Project you simply set a few properties and point it at your main executable project. Whether it is a Windows Service, Winforms, or Web Project, you will get an installer that will run for most people. By default in Visual Studio 2005 the Deployment Project generates a bootstrapper Setup.exe. This will download and install MDAC 2.8 and the .NET 2.0 redistributable, if necessary. However, there is one caveat. It only works on x86 based operating systems! I have to admit, this caught me off guard. I expected to check a couple boxes in the UI and have everything just work on every platform where .NET would run. Unfortunately this is not the case.

Luckily, the bootstrapper is very customizable. Using a simple MSBuild task and some XML files you can dynamically build bootstrappers for any MSI or executable file. So, I put together some bootstrapper packages for .NET 2.0 on x86, x64, and IA64 platforms, MDAC on x86, Windows Installer 3.1 on x86, and SQL Express 1.0 SP1 on any platform. The result: I can click a few check boxes in a deployment project and not have to worry about changing my bootstrapper packages for each platform.

Here is the bootstrapper set: CrossPlatformBootstrappers.zip

The key to these bootstrappers working on all platforms is they do not fail if the platform doesn't match. They simply move on to the next bootstrapper. So you can choose all of our .NET 2.0 bootstrappers in your Deployment Project, change the target platform of the deployment, and not have to remember to change your bootstrappers.

In addition to the .NET 2.0 bootstrappers I also created a SQL Express 1.0 SP1 bootstrapper using SQLEXPR.exe, which will run on all platforms. However, instead of doing a default install of SQL Express, it installs a specific instance name. For us, this instance is SOAPBOX. You'll want to modify the bootstrapper package.xml and product.xml files to suit your needs for your SQL instance name (or just use the default). If the instance already exists on the computer, it won't be installed again.

Earlier I mentioned MSBuild. A major key to creating shrink-wrapped products is being able to get builds out quickly and accurately. We use MSBuild for this. We have tons of custom tasks, but one of the MSBuild tasks that ships with the framework is called GenerateBootstrapper. With this task you can point MSBuild at your Bootstrapper directory (the contents of the zip file), an msi, and a list of bootstrappers, and it will create a bootstrapper exe for you.

Deployment Projects and I have a love/hate relationship. There are many things about them I don't like, but the bootstrapper piece is wonderful! So, in the upcoming 2007 builds of SoapBox products, expect to see a seamless installation including download on demand for all the prerequisites your system will need, on any platform.

Tuesday, August 1, 2006

Using the MySql Command Line from C#/.NET

We have added two new data access providers to the upcoming SoapBox Server 2007 release. We now support PostgreSql and MySql as well as Microsoft SQL and Oracle. The code for these databases has been in our server and test libraries for a couple of months and this last week it was time to add them into our post installation configuration wizard. We strive to make our setup process as simple as possible. You'll notice many improvements over the 2005 wizard. We have better auto-configuration and fewer wizard screens.

One of our core philosophies here at Coversant is to make all of our software as easy to use as possible. Aside from testing, we spend more time doing this than any other R&D effort. Why is easy to use software so important? Well, the easier our software is to use the more you will like it and the less support we have to provide. We cut costs, you are happier. It's win/win. :)

Adding the automated setup process for these new data access layers was supposed to be very straightforward. In our configuration utility it's all abstracted so I just implement a few classes, the DB guy gives me the scripts to run and I call the appropriate tool (mysql.exe, pgsql.exe, osql.exe, etc), which we ship with our server for your convenience. However, there is one big feature missing from the mysql command line tool. There is no way to specify a file to use as input! But, it does take standard input. So, typically, if I were running the tool from the command line I would do something like: 'mysql -uroot -p < "c:\myscript.sql"'. If you're command line savvy you will know this redirects the file c:\myscript.sql as standard input into the mysql command line tool. I thought I could do this in .NET using command line arguments passed to the System.Diagnostics.ProcessStartInfo class. Well, I was wrong. It doesn't work (at least I couldn't figure out how to give it a file as stdin -- the "<" is handled by the shell, and with UseShellExecute set to false there is no shell to interpret it).

So, what's the answer? Simple. Set the nifty RedirectStandardInput property and write the contents of the file to Process.StandardInput. Code follows:

// Requires: using System.IO; using System.Text;
private void ExecuteSQLScript(string databaseName, string user, string password, string command, string filename, string server)
        {
            using (System.Diagnostics.Process p = new System.Diagnostics.Process())
            {
#if LINUX
                p.StartInfo.WorkingDirectory = System.IO.Path.GetFullPath(this.InstallOptions.InstallDirectory);
                p.StartInfo.FileName = "mysql";
#else
                //grab the path from our installation options
                p.StartInfo.WorkingDirectory = System.IO.Path.GetFullPath(System.IO.Path.Combine(this.InstallOptions.InstallDirectory, BaseScriptDirectory));
                p.StartInfo.FileName = System.IO.Path.Combine(p.StartInfo.WorkingDirectory, "mysql.exe");
#endif
                
                //set all the startup options
                p.StartInfo.CreateNoWindow = true;
                p.StartInfo.WindowStyle = System.Diagnostics.ProcessWindowStyle.Hidden;
                p.StartInfo.UseShellExecute = false;
                p.StartInfo.RedirectStandardOutput = true;
                p.StartInfo.RedirectStandardError = true;

                //build the arguments for mysql
                StringBuilder args = new StringBuilder();

                if (!string.IsNullOrEmpty(password))
                    args.AppendFormat("-p{0} ", password);

                if (!string.IsNullOrEmpty(server))
                    args.AppendFormat("-h{0} ", server);

                if (!string.IsNullOrEmpty(databaseName))
                    args.AppendFormat("-D{0} ", databaseName);

                if (!string.IsNullOrEmpty(user))
                    args.AppendFormat("-u{0} ", user);

                if (!string.IsNullOrEmpty(command))
                    args.AppendFormat("-e\"{0}\" ", command);
                else if (!string.IsNullOrEmpty(filename))
                    p.StartInfo.RedirectStandardInput = true;

                p.StartInfo.Arguments = args.ToString();

                WTrace.TraceInfo("Run Script", this.GetType(), "Executing: '{0}' with args '{1}' in working dir '{2}'", p.StartInfo.FileName, p.StartInfo.Arguments, p.StartInfo.WorkingDirectory);


                //start up the process, and handle the redirected stderr and stdout -- send them to our trace lib
                p.ErrorDataReceived += new System.Diagnostics.DataReceivedEventHandler(p_ErrorDataReceived);
                p.OutputDataReceived += new System.Diagnostics.DataReceivedEventHandler(p_OutputDataReceived);
                try
                {
                    p.Start();
                    p.BeginErrorReadLine();
                    p.BeginOutputReadLine();

                    //read in the script file if one was specified and give it to stdin
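                    //only write to stdin if we actually redirected it above (no -e command and a script file given)
                    if (p.StartInfo.RedirectStandardInput)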
                    {
                        using (FileStream f = File.OpenRead(filename))
                        {
                            using (StreamReader reader = new StreamReader(f))
                            {
                                //we could do this one line at a time to be safe memory wise, but we know our scripts are small
                                p.StandardInput.WriteLine(reader.ReadToEnd());
                            }
                        }

                        //tell mysql we want to exit, otherwise the process will hang
                        p.StandardInput.WriteLine("exit");
                    }

                    p.WaitForExit();
                }
                finally
                {
                    p.ErrorDataReceived -= new System.Diagnostics.DataReceivedEventHandler(p_ErrorDataReceived);
                    p.OutputDataReceived -= new System.Diagnostics.DataReceivedEventHandler(p_OutputDataReceived);
                }

                if (p.ExitCode != 0)
                    throw new MySqlException("SQL Command failed.", p.StartInfo.Arguments, "", "");
            }
        }

        void p_ErrorDataReceived(object sender, System.Diagnostics.DataReceivedEventArgs e)
        {
            if (!string.IsNullOrEmpty(e.Data))
                WTrace.TraceInfo("Run Script", this.GetType(), "StdErr: {0}", e.Data);
        }

        void p_OutputDataReceived(object sender, System.Diagnostics.DataReceivedEventArgs e)
        {
            if (!string.IsNullOrEmpty(e.Data))
                WTrace.TraceInfo("Run Script", this.GetType(), "StdOut: {0}", e.Data);
        }

Wow, that's a lot of code. Ok, so I probably didn't have to paste that much of it, but I wanted you to get an idea of the full extent of the method, and I think being able to run mysql scripts is useful.
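For comparison, here's the same stdin trick boiled down to a minimal Python sketch. The function name run_with_script is hypothetical, and the mysql arguments in the usage comment are only an example. Handing the open file to the child process as its stdin is what the shell's "<" does for you on the command line.

```python
import subprocess


def run_with_script(argv, script_path):
    """Run argv with the contents of script_path as its standard input.

    Opening the file and passing it as stdin does what '<' does in a shell:
    the child reads the script and sees end-of-file when it is done, so no
    explicit "exit" line is needed.
    """
    with open(script_path, "rb") as script:
        return subprocess.run(
            argv,
            stdin=script,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=False,  # let the caller inspect returncode
        )


# Example call (assumes a mysql client on the PATH and a script file that exists):
# result = run_with_script(["mysql", "-uroot", "-Dmydb"], "myscript.sql")
# if result.returncode != 0:
#     print(result.stderr.decode())
```

The C# version writes the text through a StreamWriter instead, which is why it has to send "exit" before waiting on the process.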

When all is said and done you now have two screens for installing to a MySql database: 1) enter username, password, and hostname 2) choose an existing or new database name. From there the installer figures everything else out automatically.

Wednesday, July 26, 2006

More on Interop

I'm sitting on a plane on my way back to Sacramento (a whopping 1 hour flight) and I thought I'd try to give a little more information about the interop event. After two days of testing it appears as though we're actually very close to having quite a few fully interoperable XMPP implementations. As I mentioned last time, Monday was a cake walk. Unfortunately Tuesday wasn't quite so easy.

Consensus was reached on Monday to start later on Tuesday. We wanted to do 10am, but had reservations at Jive's office to sample beers of the northwest at 5 (yeah they have some good beers) and play some XBox, so we decided to start at 9 instead. The morning session consisted mostly of protocol discussions that were very productive. We decided on the general protocol flow of PEP (Personal Eventing via Pubsub). This protocol addition allows us to create some very innovative and interesting extended presence features (more on this later) in future versions of SoapBox Communicator, and they'll be compatible with features in other clients such as Google Talk, GAIM, etc.

I had some really good Thai food from a street vendor, and I'm still not sick, so that's good. In fact it was some of the best Thai food I've had in a long time, and I eat Thai about once a week.

After lunch we got down to business with more interoperability testing. The goal: mutually authenticated TLS streams between servers per RFC 3920. A certificate authority was created, X.509 Certificates were generated, servers were configured, and then... it didn't work. Unlike the overwhelming success of Monday, Tuesday brought the skeletons out of the closet. We soon realized there were numerous breaking differences between OpenSSL, Java, and .NET based implementations of TLS. When we started, absolutely none of the servers were able to talk to each other over a fully trusted TLS connection. The interesting thing was we could all talk to another instance of our own servers. Hmmm.

After a few hours of hacking and debugging we realized there was significant work that needed to be performed and we didn't have enough time to do it. We were able to get connected with a few of the servers, and vice versa, but there definitely wasn't a Happy Path for all. As a result the server to server TLS specifications in RFC 3920 will be clarified, as we eventually reached consensus on what it all really means and how it should be implemented.

In the very near future the JSF will be facilitating ad-hoc interoperability testing over the internet. It will be managing domains (such as soapbox.xmpp.org, google.xmpp.org, etc) where all participating vendors and open source projects will host servers. These will be semi-private domains without open registration, but open to anyone developing XMPP applications that need to test interoperability.

All in all, this was a very successful couple of days. We probably saved a good two months worth of bickering over e-mail lists to figure out protocol issues, verified that XMPP is in fact interoperable, and set the stage for future interoperability testing. We'll also be exploring fully automated tests, which Coversant will likely contribute to the JSF, to make sure everyone continues to play nice in the future. :)

Monday, July 24, 2006

Interoperability - Yup, we got it covered

Today was the first day of the first ever official XMPP Interop Event. In fact, it was probably the first day of any open instant messaging and presence interop event, ever.

We had showings from Coversant (yeah, that's us), SixApart (LiveJournal), Google, Jive Software, Jabber Inc, Sun Microsystems, and Process-one. Some of those names might sound familiar, and others not, but in the end what we ended up with was seven completely different XMPP server code bases/implementations both open and closed source, set up on a LAN and federating with each other. We spent more time configuring DNS, IP addresses, and other networking junk than we did fixing bugs that were hampering interoperability.

As far as I know (who knows what everyone was hiding on their laptop screens) there were only a few major bugs that were found and they were on two of the very freshest of server implementations, one of which the vendor considers in "pre-alpha" release status. No, I won't tell you who, but I bet they release patches soon. :)

So what the heck did we do all day? Well, we tested interoperability, obviously! But more specifically:

Inter-domain roster manipulation - Add and remove contacts on other servers.
Inter-domain messaging - Can we actually hold conversations with each other?
Inter-domain presence - Avatars, status messages, show states, etc
pizza
UTF8 support - Strange unicode characters in other languages in addresses, messages, and presence.
Discussed quite a few of the enhancement protocols pending for the XMPP specification and came to consensus on some issues.
Mingled with everyone and shared some anecdotes on implementing XMPP that pretty much only the people in that room would understand.
beer

What does that mean to you, a loving Coversant customer -- you are a customer, right? Well, it means that you can securely talk through IM in a federated manner to your trading partners, friends, family, and arch enemies even if they aren't one of our customers. Yeah, we support that, and apparently it works. ;)

#1 on the list for tomorrow: trusted TLS based inter-server connections. So we have the joy of setting up a trusted certificate authority and distributing certs. I bet setup will take longer than the interoperability testing itself, again.

Monday, June 26, 2006

How to Build Scalable .NET Server Applications: Memory Management

I'll get this out of the way from the start. This series of blogs will have nothing to do with ASP.NET or web services. However, if you plan on writing your own implementation of IIS in managed code this would probably be a good place to start. :) I also won't be providing very many code examples, as I'd be flogged by our intellectual property lawyers. You will not be able to copy and paste and create your own scalable server. However, I hope to provide enough insight so you can avoid a big list of gotchas we have had to figure out the hard way. This is one piece of a huge puzzle, memory management. Yes, you do have to think about that in .NET, at least if you want to build a large scale application.

For those who don't already know, SoapBox Server is a part of our SoapBox Collaboration Platform that supports the XMPP protocol as well as most of the interesting JEP extensions. At the core of SoapBox Server is a highly efficient Socket server and thread machine capable of scaling into the hundreds of thousands of simultaneous users, and it's built 100% on .NET (C# now, but used to be VB).

SoapBox Server is the first multithreaded Socket based server application I've had the pleasure of working on. During the course of building the SoapBox Server into the extremely scalable and reliable system it is today I've learned a few things (as has the rest of the team, I hope). Thanks to Chris (who already had tons of experience with such things in Win32/C++), a few bloggers out there, some books, customers finding very interesting bugs, Windbg with Son of Strike, oh and Starbucks, I'd say I'm pretty well versed in the land of building scalable server applications. I'm no Jeff Richter, mind you, but I feel I have now learned enough to at least speak intelligently about it.

In that spirit I'd like to share the fruits of our tuning and debugging work, which, if history repeats itself, will continue to evolve as we begin work on our next major revision of the product. First, I'd like to repeat something I said a couple paragraphs ago: SoapBox now scales to hundreds of thousands of simultaneous connections with a single piece of server hardware. Think about that for a second. A user brings up an IM client, connects to SoapBox Server, and then holds that connection open until they Log Out. Repeat hundreds of thousands of times. This is no simple task. The .NET CLR does not provide a magic "Process.Scalable = true" property. We have invested hundreds of hours into tuning (maybe thousands) over the life of the server on classes of hardware varying from single processor laptops to 16-way Itanium2 systems with 64GB RAM. We've been through four distinct processing models as well as quite a few iterative improvements on our Socket interaction layer. Basically we have run the server under a bunch of different profilers under many scenarios, found slow bits of code, and fixed them. But I'm not going to talk about profiling and performance tuning; perhaps another time. I'm going to talk about memory and scalable applications.

Every time your application creates a new Socket, Windows pulls memory from its Nonpaged Kernel memory, which is simply physical memory that is reserved by the kernel and will never be paged out to disk. This block of memory has a finite limit and the kernel picks the limit based on the amount of physical RAM available to it. I don't know the exact algorithm, but with 4GB RAM it's usually somewhere around 150,000 TCP Socket connections, give or take. Want to see this in action? Simply create a loop that instantiates sockets. It will stop working eventually with a SocketException telling you there isn't enough buffer space. On top of this hard kernel level limitation, you also have to worry about how much memory each concurrent connection uses in your own application. In SoapBox we store a lot of information about each connection in memory in order to improve performance and decrease our IO operations. This includes things like the user's contact list, their last presence (available, away, busy, etc), authorization information, culture information, user directory information, etc. If we didn't hold this in memory we'd have to hit a file, database, or some other out of process persistent store for the information every time we needed it. Being IO bound is no fun. Believe me, we started out that way.
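For the curious, here is a minimal sketch of that socket loop. It is not code from SoapBox Server; the class name, method name, and the cap parameter are made up for illustration, and the cap is there so you don't accidentally starve a machine you care about. Depending on the OS and configuration, a different limit (such as a handle limit) may be hit before the nonpaged pool runs out.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Sockets;

public static class SocketLimitProbe
{
    // Creates unconnected TCP sockets until the OS refuses or the cap is reached.
    // Returns how many sockets were successfully created.
    public static int CountCreatableSockets(int cap)
    {
        List<Socket> sockets = new List<Socket>();
        try
        {
            while (sockets.Count < cap)
            {
                sockets.Add(new Socket(AddressFamily.InterNetwork,
                                       SocketType.Stream,
                                       ProtocolType.Tcp));
            }
        }
        catch (SocketException ex)
        {
            // On Windows the error is typically the "no buffer space" one.
            Console.WriteLine("Stopped after {0} sockets: {1}",
                              sockets.Count, ex.SocketErrorCode);
        }
        finally
        {
            // Release the kernel resources we grabbed.
            foreach (Socket s in sockets)
            {
                s.Close();
            }
        }
        return sockets.Count;
    }
}
```

Calling SocketLimitProbe.CountCreatableSockets(int.MaxValue) on a scratch machine removes the safety net and lets the loop run until the operating system says no.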

However, because of our extensive caching, SoapBox Server 2005 can only reliably handle about 20,000 simultaneous connections on the beefiest of 32 bit hardware (on 64 bit it's much, much, much higher -- I also have to admit we haven't stress tested the 2007 build on 32 bit hardware, it would probably be much higher now). It doesn't matter if you have 64GB RAM and 16 32bit processors, we can still only handle 20,000 connections. Why, you ask? Well, it's because of the 2GB (well, really 3GB with a boot.ini switch) virtual memory limit per process in 32bit Windows. Without delving into managing your own memory your process is only allowed up to 3GB to play with. Typically, we use that up, or rather, .NET thinks we use it up, somewhere between 20,000 and 30,000 connections. Now why would I say ".NET thinks we use it up?" Story time!

A little over a year ago one of our customers kept running into a very bad situation. As evidenced by the Event Log, SoapBox Server was crashing (insert shock and awe here). It was an irregular occurrence, but it did happen. However, we did not take this lightly. This customer was running about 2,500 simultaneous connections on a Dual Xeon with Hyperthreading and 4GB RAM and the /3GB switch set. It was plenty of hardware for the job, and probably overkill. However, the service was still crashing. We set them up with the Debugging Tools For Windows and had them start up the process to wait for a crash (another blog we'll have to write some day). After a few tries we got a dump with some useful information in it. The result? We were out of memory, sort of.

In .NET when you call any socket operation and pass it a buffer, whether it be a send or receive, synchronous or asynchronous, it takes that buffer and pins it before giving it to the Winsock APIs. Pinning, in a nutshell, is taking a .NET data structure and telling the .NET CLR memory manager not to move it, until it is explicitly un-pinned. The memory manager in the CLR is smart. As you allocate and deallocate memory it is constantly defragmenting it for you so the overall memory footprint is lower. There are quite a few really good/long/complicated articles on how this works so I won't bore you. However, pinning throws a wrench in this and the memory manager isn't quite smart enough to deal with it well (though it has gotten a lot better in 2.0). Basically, that buffer you want to put on the socket cannot move in memory (physically -- in terms of your virtual memory space) from the time the socket IO operation begins until it ends. If you look at the Winsock2 APIs this is obvious, since the buffer is passed as a pointer. Anybody who's built this type of application in Winsock2 is probably saying "DUH!". I'd consider this a very leaky abstraction. Due to this behavior, it is quite easy to write a socket application in .NET that runs out of memory.
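If you've never pinned anything yourself, here is a small sketch of what pinning looks like when done by hand with GCHandle. The socket classes do their pinning internally, so this is only an illustration of the concept, and the class and method names are invented for the example.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningDemo
{
    // Pins a 2KB buffer, forces a garbage collection, and checks that the
    // buffer's address did not change while the pin was held.
    public static bool AddressStableWhilePinned()
    {
        byte[] buffer = new byte[2048];

        // From here until Free(), the GC is not allowed to move the buffer.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr before = handle.AddrOfPinnedObject();
            GC.Collect(); // everything around the buffer may be compacted
            IntPtr after = handle.AddrOfPinnedObject();
            return before == after;
        }
        finally
        {
            // Un-pin, so the GC is free to relocate the buffer again.
            handle.Free();
        }
    }
}
```

The important part is the window between Alloc and Free: the garbage collector has to compact around the buffer for that whole time, and the longer the window stays open, the more likely the heap ends up fragmented.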

Back to the story! Not only were we out of memory, but there was only about 200MB worth of data structures in the heap. For those of you like me that use calc.exe for all your basic math let me figure that out for you, 200MB > 3GB. Uhh, say what? How the heck were we out of memory? Well, we ran into the shortfall of pinning and memory fragmentation. The cause of this was a small number of small pinned buffers, in our case 2KB each, that were high enough in the heap to cause fragmentation spanning over 2.8GB. Where did the other 2.8GB go, you ask? Well, it was there, allocated by our process, but not being used by our code. In Son Of Strike (SoS -- a command line plug-in to the Windbg debugging tool I hope you never have to use) this showed up as free, empty, unused space! It was just sitting there waiting to be used, but we still ran out of memory. I think I mentioned earlier the memory manager in .NET isn't so smart when it comes to fragmented memory and pinning, well, this is what happens in the worst case.

Good thing for you, the answer to all your memory fragmentation and pinning woes is quite simple. Pre-allocate buffers for use by anything that will be causing pinning, and do it early on before there is a lot of memory thrash (when your application is rapidly allocating and deallocating a lot of memory). We created a simple class called a BufferPool that we use to pre-allocate a certain number of buffers. This pool can grow as need be, but it does so in large chunks and forces a garbage collection each time before the buffers are actually used. This considerably reduces the chances of fragmentation caused by pinned memory. If the pool starts off with 500 buffers, but then the 501st buffer is needed it will grow by a configurable value, typically another 500 buffers, and the induced garbage collection will cause these buffers to shift to the lowest possible point on the heap.
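To make the idea concrete, here is a rough sketch of a pool along those lines. It is not the actual SoapBox BufferPool (the lawyers would still flog me); the class name, member names, and the locking strategy are all assumptions made for illustration. It pre-allocates a batch of buffers, grows in batches, and induces a garbage collection right after each batch is allocated.

```csharp
using System;
using System.Collections.Generic;

public sealed class BufferPoolSketch
{
    private readonly object _sync = new object();
    private readonly Stack<byte[]> _free = new Stack<byte[]>();
    private readonly int _bufferSize;
    private readonly int _growBy;
    private int _totalAllocated;

    public BufferPoolSketch(int bufferSize, int initialCount, int growBy)
    {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException("bufferSize");
        if (initialCount < 0) throw new ArgumentOutOfRangeException("initialCount");
        if (growBy <= 0) throw new ArgumentOutOfRangeException("growBy");

        _bufferSize = bufferSize;
        _growBy = growBy;
        Grow(initialCount); // pre-allocate early, before the heap gets busy
    }

    public int TotalAllocated { get { lock (_sync) { return _totalAllocated; } } }
    public int Available { get { lock (_sync) { return _free.Count; } } }

    // Hands out a buffer, growing the pool by a whole batch if it is empty.
    public byte[] Rent()
    {
        lock (_sync)
        {
            if (_free.Count == 0)
            {
                Grow(_growBy);
            }
            return _free.Pop();
        }
    }

    // Gives a buffer back to the pool once the socket operation has completed.
    public void Return(byte[] buffer)
    {
        if (buffer == null || buffer.Length != _bufferSize)
        {
            throw new ArgumentException("Buffer does not belong to this pool.", "buffer");
        }
        lock (_sync)
        {
            _free.Push(buffer);
        }
    }

    // Allocates a batch of buffers, then induces a collection so the new
    // buffers can be compacted together before any of them gets pinned.
    private void Grow(int count)
    {
        for (int i = 0; i < count; i++)
        {
            _free.Push(new byte[_bufferSize]);
        }
        _totalAllocated += count;
        GC.Collect();
    }
}
```

A receive path would call Rent() before starting the socket operation and Return() in the completion callback, so the only byte arrays that ever get pinned are the ones that were allocated together in a batch.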

Interestingly enough when we found this bug we already knew about the pinning behavior of socket operations, but had only solved half of it. All of our BeginReceive calls were using the BufferPool because we knew the buffers would remain pinned until we received data from a client, but the BeginSend calls were not using the pool. We had not even considered the fact that sending a few KB of data might take long enough to pin memory, fragment the heap, and cause an OutOfMemoryException. But there is one case where it does: timeouts. The Windows TCP subsystem is very forgiving. If a client loses its connection and the server isn't explicitly told about it, the next piece of data you try to send to that client socket will end up being pinned while the TCP subsystem waits for the client to respond. It can take up to 5 minutes with the default configuration of Windows for the TCP subsystem to figure out the client isn't really there. During that entire time your buffer is pinned in memory. *poof* OutOfMemoryException.

Unfortunately, pre-allocating buffers does not completely fix the issue of running out of memory. There are also some other limits to the size of a .NET process's virtual memory space that are very complicated and I won't talk about, but basically you end up with anywhere from 1/2 to 2/3 usable virtual memory without running the risk of OutOfMemoryException. So, if you have 2GB virtual memory available (standard on a 32bit machine), you end up with about 1.3GB you can actually use reliably. Of course, this varies, and some applications will be able to use more, or maybe less. Your mileage may vary.

Don't fret, all of the issues I've talked about in here have been fixed since SoapBox Server 2005 SR1. And with the most common usage patterns people were not actually affected to begin with.

I hope this was at least marginally interesting to someone. :) Next up, I'll probably talk about limitations we discovered in the Windows Socket infrastructure, or maybe async IO, IOCP, and worker threadpools, or maybe how in the world we actually test at this scale. Only time will tell, unless Chris beats me to it.

Friday, June 2, 2006

Fun Installing Vista Beta 2 on AMD x64

As a self proclaimed geek and MSDN subscriber I feel as though it's my duty to explore all the new software that Microsoft comes out with. This last week I have been embarking on one such journey. Working with beta software is always a bit trying, but tack on a beta driver model and a "new" hardware platform (x64) and things get really interesting.

Vista Beta 2 was released to MSDN about a week ago. The next day I fired up my DVD burner and started messing around. About four hours later I had a working installation. Why so long? Well, because my workstation is an AMD NForce 4 x64 system and boots from the onboard SATA RAID. Apparently this is not one of Microsoft's test platforms. I had an experience similar to this guy.

I had to run the Vista install from an existing Windows installation. It simply would not work when I attempted to boot from the DVD. I never got an option to load drivers. Nvidia recently released beta Forceware drivers for Vista x64. I assumed these would have the RAID drivers I needed to install Vista, after all they did have the appropriate txtsetup.oem file and seemed to be correct. After a few attempted installations, blue screens, automatic reboots, and hangs, I decided that my assumption was bad. Lesson learned: to install Vista x64 on an Nforce 4 RAID use the XP x64 Nforce4 RAID and SATA drivers. Yup, that's right. Well, almost.

If you're like me and want to use the latest Nvidia XP x64 drivers you'll be greeted with a black screen telling you that your drivers are corrupt after the first setup reboot. Say wha? Lucky for you, they aren't. This is a feature of Vista x64. Hit F8 at the boot screen and choose to disable driver signature verification. Of course, hitting F8 EVERY time you boot your computer is not going to be very fun. Luckily (for now) there is an application called bcdedit (just run "Bcdedit.exe -set nointegritychecks ON") that you can use to disable signature verification after you get into your desktop. Oh, don't forget to right click on the Command Prompt link in your start menu and choose "Run as Administrator" before trying to run this command, or it will tell you "Access is Denied". Yay security! I should definitely mention that Vista prompted me to allow this action (VPMTATA), at least once.

Oh yeah, I almost forgot, after you finally get to your desktop Vista will keep telling you that it has found an unknown device. This is your RAID controller; the one with XP drivers. Point the device wizard thingy to the inf file (VPMTATA) of the Forceware Vista x64 Beta drivers and this annoyance will go away. You'll also need to install drivers (VPMTATA) for the Nforce4 audio chipset.

On the plus side, Vista had drivers for my Geforce 7800GT and Nforce4 gigabit network card. It even figured out I had dual monitors (VPMTATA), picked the max resolution for both (VPMTATA), and presented a neat little dialog that let me choose the desktop layout (VPMTATA). Of course, I wanted to upgrade to the latest ones from Nvidia. This is usually straightforward.

I ran the setup exe (VPMTATA) for the Nvidia Vista Beta 2 Geforce drivers. It extracted stuff (VPMTATA), ran the second installer exe (VPMTATA), and then failed with some cryptic error messages I probably should have written down and submitted as bug reports. Subsequent attempts to run the installation package (VPMTATA) resulted in an error about running 32 bit uninstaller code on a 64 bit platform. I was very confused, didn't want to spend much time on it, and gave up.

A few days later I had an epiphany - "I should just try to update the Microsoft Geforce driver with the inf". Duh. Well, I opened up the Device Manager (VPMTATA) and clicked update drivers (VPMTATA). Voilà! The drivers were upgraded. Though, I have no idea if there are any control panels with these drivers (as there are in XP) since I couldn't run the full setup. Ah well, at least I have better video acceleration.

I'll be back later for my accounts of fun with Vista. After a week of use, I think I could write a book. However, it definitely hasn't all been bad (though VPMTATA) and I will continue to use Vista as my primary OS until it does something very mean or simply won't allow me to get things done.

Friday, May 12, 2006

.net 2.0 web service hair puller

Every couple of weeks I spend four hours doing something that should take five minutes. It just happened, and now I feel compelled to take another few minutes and explain so it doesn't happen to you. Not only did I waste my time, but the time of another one of our developers. What might waste four hours, you say?

We recently migrated our entire web site to .NET 2.0 and a new portal. Since we were building a new web site anyway we thought, "What the heck, let's re-factor the licensing subsystem. The database was hacked together over three years and we don't want no stinkin' .NET 1.1 code running on our shiny new site!" Well, this didn't turn out exactly as planned.

The SoapBox licensing web service is quite simple. There's a single method called "Activate" that takes in a unique hash of some information on the user's computer (so we can track duplicate usages) and the serial number. It returns an XML document containing all the license information, signed with our private key, that the SoapBox Server then validates with the embedded public key. That's it.

We get a call from a customer today saying they can't enter a license key into the MMC. They're getting a wonderfully descriptive "Object reference not set to an instance of an object" error. We immediately kick into debug mode. After all, if a potential customer can't enter in our trial key they're going to get upset and likely just give up on using it altogether.

After a little debugging we found the web service wasn't returning the license xml document like it should have been. This was _very_ weird, considering our licensing page uses the same web service as our MMC and it was working just fine with the exact same parameters. Much head to wall interaction took place and we had another breakthrough, the web method wasn't receiving any data from the parameters. The serial number was null!

Now we bust out the packet sniffer. After careful examination, a new mini test application to call the service, and three Diet Cokes, we found the problem.

Apparently, during the migration and re-factoring process, we missed a character in a copy/paste operation. That's right, the trailing "/" on the namespace declaration for the web service.

This in itself is not such a surprise. However, the surprise is that .NET did not throw a "There is no matching web method you dummy" exception. No, it didn't throw an exception at all. Instead it called the correct web method, but did not pass in any of the parameters. Oiy.
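I can't show the licensing service itself, but the same flavor of silent failure is easy to reproduce with plain XmlSerializer, which skips elements whose namespace it doesn't recognize instead of throwing. To be clear, this is an analogy and not a confirmed explanation of what the web service stack did internally; the class names and the example.com namespace are invented for the sketch.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("Activate")]
public class ActivateRequestSketch
{
    // The element is only recognized in this exact namespace, trailing "/" included.
    [XmlElement("SerialNumber", Namespace = NamespaceDemo.ServiceNamespace)]
    public string SerialNumber { get; set; }
}

public static class NamespaceDemo
{
    // Hypothetical namespace, standing in for the real service namespace.
    public const string ServiceNamespace = "http://example.com/licensing/";

    // Deserializes a request whose SerialNumber element uses the given namespace
    // and returns whatever ended up in the SerialNumber property.
    public static string ReadSerial(string elementNamespace)
    {
        string xml =
            "<Activate><SerialNumber xmlns=\"" + elementNamespace + "\">" +
            "ABC-123</SerialNumber></Activate>";

        XmlSerializer serializer = new XmlSerializer(typeof(ActivateRequestSketch));
        using (StringReader reader = new StringReader(xml))
        {
            ActivateRequestSketch request =
                (ActivateRequestSketch)serializer.Deserialize(reader);
            return request.SerialNumber;
        }
    }
}
```

With the trailing slash, ReadSerial("http://example.com/licensing/") hands back the serial number. Drop the slash and it returns null without a single exception, which is exactly the kind of thing that eats four hours.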

It turns out our licensing page was working because the web reference was created after the new licensing service was put in place. It all makes sense now... Time for a beer.

Tuesday, May 9, 2006

from the trenches of my first software startup

I was born and raised an entrepreneur (I'm pretty sure that's how you speel that). During my lifetime my dad never had a single "real job". He has always been a small business owner. From a carpet cleaning business, to a trucking company, to a coffee shop, he was always working on something he could call his own. Every time we get together we end up talking about businesses we'd like to start. If only I had the time. . . Somehow, after seeing him (and my mom) constantly working at least 80 hour weeks, stressed out, and exhausted, I still decided I wanted to start a company.

At the end of high school, and for a while after, I had a computer repair business with a really cute name and no business plan whatsoever. This lasted about six months. It turns out to be real work, and it's quite difficult to find people that are good enough to do that sort of work for very little pay. I'm really very interested in how Best Buy pulls that one off, though I assume they just have a good process and the people don't need to be all that great (no offense geeks -- I owe you guys many hours of my time that has been saved because I didn't have to help out my mother in law with her digital camera).

Ok, back on topic, sort of. A few years prior to my repair business I had fallen in love with computer programming. I wrote little programs to do all sorts of interesting things. Of course, I didn't work on anything that took more than a few weeks of my time and only built things that were interesting to me. Ah, the good old days! I went into the professional world when I turned 18 and started writing code for a living. Wow, what a change. Who knew coding could be so much fun?! "Wow, I get to program against another database and improve business processes! Woohoo!"

Needless to say, the life of a professional internal software developer was not in my cards. Even though I was making way too much money for my age as a "consultant" (that's what the recruiters like to call you, even though you're really just staff augmentation and under a different category on someone's budget), at about age 20 I started getting very restless. Luckily I met some like minded individuals (Jason and Chris) and Winfessor (Coversant's old company name) was born.

For the last few years, in between consulting engagements (we're boot strapped), we've been building the SoapBox products. Of course, we haven't only been building software, we've been building a business. Lucky for me, my role is still primarily building the technology. It continues to be a wonderful, trying, stressful, exciting, sleepless, humbling experience.

Here are some tidbits, in no particular order, that I've picked up along the way, from learning both the easy way and the hard way. It's free (and probably bad) advice from my limited experience aimed at any programmers out there that want to start a company around a software product. I'll probably write some more specifics at some point.

Have at least one flagship customer to start with, mainly for promotional purposes, but ideally one that will help fund the project.
Work full time on it. Hire others to do your consulting or other revenue generating work so you can focus, or get funding.
Don't work so hard. There is always more work to do, even if you work 140 hours/week.
Start small (ideally around 6mo development) and iterate from there.
Build it like you would an enterprise project - component based, easily maintainable and easily expandable.
Hire people to do things you aren't good at or learn to do them. Web site, graphics, brochures, accounting, sales, etc. Writing code is an itty bitty teeny weeny piece of your business.
Figure out how to start getting revenues, quickly, and set reasonable goals.
Don't look too far out into the future. Have a plan, but don't be scared to change it.
Any software you build needs to be extremely easy to use and install. This alone will win many sales in a head to head battle with your competitors (yes, you will have competitors).
FOCUS! Do one small thing better than anyone else, and then move on to the next.
Don't pretend to be a big company. You have a lot of advantages being the small guy and people willing to work with small companies know it.
Do something unique that your market needs.

That's all I've got for now. We're constantly growing and learning. I imagine my advice will evolve as the Coversant experience continues. In the end, the freedom, excitement, and financial possibilities of a startup outweigh the pains, at least for me. I also just realized I use way too many parentheses when I write (maybe I'll work on that).

Monday, May 8, 2006

FakeOutTheUserToThinkWeDontUseAnyMemory

There comes a time in every project where the developers realize we are building software for the users, rather than for ourselves. A user's perception can be the difference between a good and a bad reference, and we all know how detrimental bad word of mouth can be. This unfortunate reality hit me square in the face recently when I was told by a customer that "your application is bloatware".

Any desktop application with a user interface, written in .NET, that does anything interesting, can easily be mistaken for bloatware. It's quite easy to create a super elegant application with no memory leaks that appears to use 50MB or more of memory. I say appears, because the figure everyone sees in Task Manager is the "Working Set" size. Users (myself included, up until recently) see large working set sizes as a sign of bloatware and poor programming.

This is simply not the case. The working set is more along the lines of the amount of physical memory the OS thinks your application might need or needed at one point, including shared memory and all sorts of other complicated things that .NET developers aren't supposed to have to think about. If your system was in need of physical memory for other processes much of this perceived bloat would either be reclaimed and put to better use, or paged out to disk.

At the end of the day, the reality of the situation doesn't matter. Your users think your application is bloated. What do you do? Well, you FakeOutTheUserToThinkWeDontUseAnyMemory.

using System.Diagnostics;

namespace Coversant.Utility {
  public static class MemoryUtility
  {
      private static volatile bool _enabled = true;

      public static void FakeOutTheUserToThinkWeDontUseAnyMemory()
      {
          if (!_enabled)
              return;

          try
          {
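              // Re-assigning the current limit makes Windows re-apply it,
              // which trims the process working set as a side effect.
              Process curProc = Process.GetCurrentProcess();
              curProc.MaxWorkingSet = curProc.MaxWorkingSet;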
          }
          catch
          {
              //Some users won't have permission to adjust their working set.
              _enabled = false;
          }
      }
  }
}

Yep, that's it. Call that method (.NET 2.0 only -- in 1.x you had to P/Invoke) and watch the magic happen. In our applications we set it up on a timer that runs every 30 seconds and after any events we know will raise the working set, usually after loading new assemblies or after a window is closed. Running this code causes Windows to free up as much of the working set as possible. Usually this sends most of your bloat to the page file where it will remain forever. In our case, the application had a 50MB working set and really only needed about 10MB of physical memory after it was running. There is, however, one big gotcha. An application can only attempt to adjust its working set if it is running with appropriate permissions (typically a local Administrator).
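The timer part might look something like the sketch below. The scheduler class and its Start method are invented for this example (our real wiring is a bit more involved), and it takes the trim routine as a delegate so it doesn't care what actually does the trimming.

```csharp
using System;
using System.Threading;

public static class WorkingSetTrimScheduler
{
    // Invokes the supplied trim routine once per interval on a thread pool thread.
    // The caller must keep a reference to the returned Timer (and Dispose it on
    // shutdown); otherwise it can be garbage collected and silently stop firing.
    public static Timer Start(Action trim, TimeSpan interval)
    {
        if (trim == null) throw new ArgumentNullException("trim");
        if (interval <= TimeSpan.Zero) throw new ArgumentOutOfRangeException("interval");

        return new Timer(delegate(object state) { trim(); }, null, interval, interval);
    }
}
```

Wiring it up to the method above would be a one-liner at startup: Timer trimTimer = WorkingSetTrimScheduler.Start(MemoryUtility.FakeOutTheUserToThinkWeDontUseAnyMemory, TimeSpan.FromSeconds(30)); and then you call the same method directly after the events you know will bump the working set.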

Yes, your users' perceptions are reality. Using this trick/hack helps keep reality in line.