Most Fitting of Names: Subversion

Thursday, November 17th, 2011

How can a project miss the point as completely as Subversion does? I’m truly flabbergasted. I hope I don’t offend people by saying this, but I rather agree with Linus Torvalds:

…my hatred of CVS has meant that I see Subversion as being the most pointless project ever started. The slogan of Subversion for a while was “CVS done right”, or something like that, and if you start with that kind of slogan, there’s nowhere you can go. There is no way to do CVS right.

Here is the unanswered question in my head: how can a project first released in 2000 have problems that were hated in the ’80s and solved in the ’90s? To fully put this in perspective, let me frame the problems by looking at the issues in SVN’s parent project, CVS.

CVS is by far the worst revision control system I’ve ever seen. To again quote Linus:

For the first 10 years of kernel maintenance, we literally used tarballs and patches, which is a much superior source control management system than CVS…

The only revision control system to more blatantly miss the point than SVN was Team Foundation Server (the successor to VSS), which has many of the problems of CVS that SVN set out to fix.

The most common grievances about CVS are the absence of atomic checkins, awful branching support, and terrible performance. Of those three primary grievances, SVN only really addresses one: it does bring atomic checkins to the mix. They also attempted to address branching, but they kinda missed the point and ended up with abysmal branching support. Additionally, SVN’s performance is atrocious.

Before all the SVN fans in the room get their knickers in a bunch, let me explain a few things. If you haven’t worked on a large software project with lots of branches, and you haven’t worked with Git, Perforce, or another SCM tool that has good merging support, you don’t get to talk. The honest fact is that I’ve met more than a few people whose only exposure to SCM is SVN and/or CVS, and those people LOVE the branch support in SVN. That’s because they don’t know better. Let me give you a hypothetical run-through.

[Figure: file history between branches (“This should be easy”)]

The diagram above shows a visual representation of the history of a file in two different branches. The first event of importance is when Foo’s A.txt was branched to create Bar’s A.txt. That operation took revision 3 of Foo’s A.txt and copied it to create revision 1 of Bar’s A.txt. Cool, well that’s real easy in SVN. The second event is where things start getting tricky: merging Foo’s revision 5 into Bar’s revision 2 to create Bar’s revision 3. SVN isn’t terrible at this first merge, but it also isn’t good. SVN merges will attempt to resolve the changes between the two files, and if possible generate a nice merged file. The two big complaints here are that SVN’s merge resolution isn’t as good as the competition’s, and it is unbearably slow. I merged a one-byte file change the other day and it took over a minute. ONE BYTE!

Where things get really sticky for SVN is in the next two events: merging Foo’s revision 7 in to create Bar’s revision 5, and merging Bar’s revision 6 back into Foo. Since Foo’s revision 5 has already been merged in, the only changes that should need to be resolved are changes from Foo’s revisions 6 and 7. Unfortunately SVN isn’t smart enough to always know that. I’m not sure if it is a defect or just poor design, but SVN’s ability to track when files are merged between branches is sub-par. If the mergeinfo is properly maintained, SVN can track whether or not files need to be merged, but it doesn’t do it on a change-by-change basis. SVN’s mergeinfo also doesn’t track history across multiple branches, which makes it useless if you are dealing with a lot of branches. I recently found myself in a situation where I had merged changes to a file from one branch to another multiple times, and I had to keep re-resolving the changes that I had already merged. That is a giant waste of time.

The other big issue is, of course, performance. I don’t know how to phrase this gracefully, so I’ll just come out and say it: SVN’s performance blows. I can rsync a directory of source code over ssh faster than I can perform an SVN checkout. That is ridiculous. When working with CVS I hated how many hours of my day were wasted on long checkout times, and while I will recognize that SVN is vastly better than CVS, it is still piss-poor. To give another contrast, I took an SVN repository and converted it to a git repository.

Then, to see the performance difference, I did an SVN checkout and a git clone and timed the two operations. The SVN checkout took 10 minutes to pull down 100k files. The git clone took only 2 minutes. For those of you who don’t have experience with git, I should also point out that there are some significant differences between SVN’s checkout and git’s clone. SVN’s checkout makes a local copy of the repository and caches some data locally to allow you to perform local diffs and a few other operations. Git’s clone is essentially making a full copy of the repository database, including the incremental diffs and change history for every file in the branch. By all rational thinking the git clone should take WAY longer than the SVN checkout because it does so much more, but SVN is shockingly slow.

Moral of this story: use git.

California Needs Pragmatic Civil Engineers

Tuesday, June 21st, 2011

For anyone who hasn’t read it there is an excellent programming book out there called The Pragmatic Programmer: From Journeyman to Master. One of the many great topics that they push in the book is the importance of prototyping. They talk about a concept they call “Tracer Bullets” which is a prototype based around simulating the code flow without spending time writing the implementation. One of the best things about this idea is that it can be applied outside of software.

One particularly good example of something that could have benefited from a paper prototype is the CA-17/CA-85 junction just outside of San Jose.

[Diagram of the CA-17/CA-85 junction: if only they had a paper prototype (pardon my poor artistry)]

Let’s say, hypothetically, I was driving north from Los Gatos on CA-17 and wanted to take CA-85 north to head up to Mountain View. I would exit the highway on the right. While on that exit ramp, another lane of traffic merges in from the right. That lane of traffic comes from southbound CA-85. Where it gets really nasty is that if the people in that lane want to get onto CA-17 they need to cut across to the left lane. Meanwhile, if I want to get to CA-85 I need to cut across to the right lane. If I fail to cut right I end up back on CA-17, and if they fail to cut across they end up back on CA-85, but going north this time.

In the end what this means is that 99.9% of the traffic in the left lane wants to get to the right lane, and 90% (allowing some margin for people actually wanting to turn around) of the traffic in the right lane wants to get to the left lane. To make this situation worse, if I do manage to get to the right lane, I have to go through the same type of merge crossover again between traffic going to southbound CA-17 and northbound CA-85. If I mess up that one I end up going right back to Los Gatos.

Here’s where tracer bullets come in. A quick paper prototype to simulate the flow of traffic would have revealed the forced crossover pattern, and if I were a moderately sane civil engineer (something slightly less mythical than Bigfoot), I would re-design that road to avoid the crossover, because doing so would increase traffic throughput (and probably reduce collisions). I should point out that paper prototyping is covered in a section called “Prototypes and Post-It Notes” in The Pragmatic Programmer.

One of the really neat things about working in software instead of roads however is that we have tools which can help us find this kind of situation. One of the most powerful tools a software engineer has today is their compiler and static code analysis. Many modern static analysis tools will actually evaluate the logic in your code and can help find situations where your logic just doesn’t work right.
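
As a contrived sketch of what that looks like in practice, consider the snippet below. Clang’s analyzer (the same engine behind Xcode’s analysis, mentioned below) will typically flag the path-dependent null dereference here even though the code compiles without complaint; the exact diagnostics vary by tool and version.

// Compiles cleanly, but on the b == 0 path 'result' is still null by
// the time it is dereferenced; a path-sensitive analyzer walks both
// branches and reports the bad path.
int checked_divide(int a, int b)
{
    int* result = 0;
    if (b != 0)
        result = new int(a / b);
    return *result; // warning: dereference of null pointer when b == 0
}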

Historically there have been several great static analysis tools available. GCC has had basic integrated static analysis for many years, and Gimpel Software’s PC-Lint is an excellent tool. In recent years both Apple and Microsoft have built static analysis tools into their IDEs.

Apple’s Clang-based compiler expands on GCC’s lint-like static analysis, and their new Xcode 4 IDE brings static analysis results right into a friendly user interface. You can read all about Xcode 4 on Apple’s Developer website. I will mention that one of the best things about Xcode is the price point: the full Xcode developer toolset is available from the Mac App Store for $4.99.

Microsoft’s new static analysis tools are built into Visual Studio 2010 Ultimate. Personally I can’t speak much about them because I’ve never worked with them. I have taken advantage of Visual Studio 2010’s new -Wall flag and was pleasantly surprised by it. It points out a lot of new compiler warnings that are useful to have. Unfortunately, even if Microsoft’s static analysis tool were God’s gift to programming I wouldn’t recommend it, because it only comes with Visual Studio 2010 Ultimate, which has an $11,899 price tag. If you’re working on Windows you’re better off getting a cheaper IDE and shelling out the $389 for Gimpel Software’s PC-Lint.

Pointers != Arrays: Why the Stack Hates You

Friday, February 25th, 2011

I once had the pleasure of working for a financial investment firm, trying to optimize a 15-year-old trading application. Because the application was running really slowly they were doing some significant refactoring, and part of that process involved splitting the application into more threads. In one of their new thread-spawning routines I found the following code.

void some_class::foo()
{
    unsigned int thread_addr = 0;
    void* thread_args[] = {this, &(this->some_data)};
    _beginthreadex(0, STACK_SIZE, &thread_func, thread_args, 0, &thread_addr);
    Sleep(5);
}

The lead programmer on the project this code came from was mystified as to why the new thread would sometimes work properly and sometimes wouldn’t receive the correct arguments. He found that it always worked if he set the Sleep to 20. Any ideas what’s going on here? Here’s a hint: thread_args is an array.

The problem is that thread_args is an array created on the stack, and it goes out of scope as soon as foo() returns. If the threads are scheduled such that the new thread copies the two arguments out of thread_args before the Sleep ends, everything works fine; otherwise the data can be overwritten on the stack before being read. The lead programmer found that making the Sleep a 20 millisecond sleep rather than a 5 millisecond sleep masked the problem.

But what if the operating system’s thread scheduler didn’t get the new thread running in those 20 milliseconds, or a critical section was hit somewhere, causing the new thread to stop executing? You’d have the same problem again. The more appropriate way to handle this is to make thread_args dynamically allocated and have the memory deallocated by the new thread when it is done, or, better yet, to find a way to make it so that the new thread only needs to take one object that can be passed as a void*.
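
A minimal sketch of the heap-allocation fix might look like the following. It reuses the names from the snippet above; thread_args_t and some_data_t are hypothetical stand-ins for illustration.

// Hypothetical struct replacing the stack-allocated void* array.
struct thread_args_t
{
    some_class* self;
    some_data_t* data; // some_data_t stands in for the real member type
};

void some_class::foo()
{
    unsigned int thread_addr = 0;
    // Heap-allocate so the arguments outlive this stack frame.
    thread_args_t* args = new thread_args_t;
    args->self = this;
    args->data = &(this->some_data);
    _beginthreadex(0, STACK_SIZE, &thread_func, args, 0, &thread_addr);
    // No Sleep() needed: the new thread owns the allocation now.
}

unsigned __stdcall thread_func(void* p)
{
    thread_args_t* args = static_cast<thread_args_t*>(p);
    // ... use args->self and args->data ...
    delete args; // the spawned thread frees the memory when it is done
    return 0;
}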

In the final implementation of this code the thread didn’t actually need to take any arguments, because we made a thread-safe singleton from which it could access the data it needed without worrying about the singleton vanishing.

Project Management 101

Monday, February 14th, 2011

I want to apologize in advance for this post because it is a bit of a rant. I’m currently struggling with some sub-par project management, and I feel the need to vent.

The Context:

During our last project the engineering team had a system for managing our work that involved creating task descriptions, having them approved by our team leads, then doing the work. Unfortunately the system didn’t provide a good way of tracking dependencies between the engineering team and other teams in the company which led to some of our other teams being significantly behind schedule.

One of the worst-affected teams was our VFX team. To address the problems in the VFX team, a new technical artist was brought in. The new technical artist did more than just add an extra set of hands; he introduced the company to SCRUM. Converting the VFX team to SCRUM not only got them back on schedule, but actually put them ahead of schedule. In the end they were the first team in the company done with the project.

Fast forward a few months. After the project shipped we started working on a new project which got cut, our company significantly downsized, and our management began looking for smarter, more efficient ways to manage our projects. Due to the success of SCRUM with the VFX team, management decided to give it a try company-wide.

The Problem:

After a few weeks of using SCRUM our project manager started complaining about all the work involved in maintaining our SCRUM walls and tracking our tasks. Because of his complaining, his boss (our COO) stepped in and decided to remove our SCRUM walls in favor of tracking our tasks in an Excel spreadsheet stored on Sharepoint. A lot of people complained about this move from day one. I was one of the more vocal, because agile software development processes are a personal interest of mine.

The first problem with this new solution is that the spreadsheet takes information that was once publicly displayed on the walls of our workspace and places it on a website behind several navigation clicks. While this may seem like a minor annoyance to a casual observer, the implications are significant. When we had the SCRUM walls, people would examine them a few times a day to see not only what their assigned tasks were, but also what other people were working on. With the new system almost nobody ever looks at the spreadsheet, because the information in it isn’t formatted well for developer consumption. In simple terms, some of the transparency into the state of the project was removed. Anyone who knows agile development knows that is a big no-no.

The second problem with this solution is that because our project manager is now updating a spreadsheet during our morning standup meetings the meetings are no longer brief. Moving a notecard from one column on a wall to another takes seconds. Status updates take less than a minute. But updating the spreadsheet takes a lot longer. I have the shortest set of standup meetings and they take 20-30 minutes of my day. The people with the longest set of meetings lose almost an hour.

The Explanation:

So how does this happen? The underlying issue here is actually really simple. Our project manager is fixated on how easy or difficult it is for him to manage the project. The reality is that project management isn’t about making it easy on the managers; it is about making it easy on the producers (the people actually doing the work).

Let’s pose a hypothetical situation. You have a team of 30 developers managed by one project manager. Let’s say that the project manager spends 40 hours a week managing the project, and the developers each have 30 hours a week of productivity, where the remaining 10 hours of developer time each week are dedicated to task tracking, meetings, and other non-development tasks. Now let’s say your project manager, being resourceful, figures out that if he spends an extra 10 hours each week (2 hours each day) getting verbal status updates from team members, he can give each developer 30 more minutes of productivity a day. Is that worth it? From the perspective of man-hours it absolutely is: one person puts in 10 hours a week to get back 75 hours of productivity across the team (30 minutes a day, times 5 days, times 30 developers). Furthermore, you might find that by adding an extra project manager you can get even more productivity back from your developers. This mentality of project management is about making it as easy as possible for your developers to do their jobs so that they are more productive, and since their productivity multiplies across the team it is always worth it.

To paraphrase one of my mentors:

A manager’s effectiveness is measured by the productivity of the people he or she manages. That means that a good manager will do anything possible so that the people he or she manages spend most of their time doing what they are paid to do, even if that means getting them coffee.

This is what our project management team currently doesn’t understand.

Out of Order Execution and You

Friday, November 12th, 2010

About a year ago I was talking with some of the guys I worked with at the time about some code that I found while doing an architectural review of their subject/observer implementation. I had been turned onto the subject/observer code by the team’s reaction to it, which seemed to be to stay away from that code at all costs.

The code jumped out at me immediately because it had an edge condition that I’ve encountered before. When I talked to the other programmers on the project about the problem, they told me that they had known for some time that there was a problem with the subject/observer implementation, and at one point in time it was causing their applications to crash multiple times a day. They knew it was “a threading issue,” but no more than that, and I was told that the current implementation crashes about once a month per user, which worked out to 8-10 crash reports a day. They had determined that was acceptable because they didn’t want to take the performance hit of a critical section.

Now that you’ve got the background let’s see the stripped down pseudocode:

class observer
{
public:
	void Update(const T& state);
};

class subject
{
public:
	subject()
	: reference_count(0)
	{}

	void setState(const T& st)
	{
		state = st;
		notify();
	}

	void notify()
	{
		++reference_count;
		while(reference_count > 1)
			Sleep(0);

		for(iterator it = observers.begin(); it != observers.end(); it++)
			it->Update(state);
		--reference_count;
	}

private:
	int reference_count;
	T state;
	List observers;
};

There are really two problems with this code. Both are thread-related: one is caused by passing references, and the other is caused by something the CPU does called out-of-order execution.

Let’s start with the more obvious one.

	Thread #1						Thread #2
1	Enters setState()
2	sets state = st(1)
3	enters notify(), reference_count = 0
4	increments reference_count
5	starts updating observers				Enters setState()
6	still updating observers				sets state = st(2)
7	still updating observers				enters notify(), reference_count = 1
8	still updating observers				enters while loop
9	decrements reference_count				in while loop
10	leaves function						updates observers

When you hit line 7, what state are the observers getting? Since the state isn’t cached locally in the function, the first few observers get st(1) and the rest get st(2), because the thread-safety mechanism kicks in after the state has already been set.
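
A sketch of a fix for this first problem, in terms of the pseudocode above: take a local snapshot of the state before walking the observer list, so every observer in a single notify() pass sees the same value.

	void notify()
	{
		const T local_state = state; // snapshot taken once, up front
		++reference_count;
		while(reference_count > 1)
			Sleep(0);

		for(iterator it = observers.begin(); it != observers.end(); it++)
			it->Update(local_state);
		--reference_count;
	}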

The second problem is harder to describe; Wikipedia has a good description of it. Basically, the compiler will re-arrange the order in which instructions are put into the binary to optimize execution, and the CPU can further re-order operations at runtime. Generally that works out as grouping reads together and delaying writes, which allows the CPU to make better use of the time it takes to read from and write to RAM. So, let’s take a similar situation:

	Thread #1							Thread #2
1	Enters setState()
2	sets state = st(1)					Enters setState()
3	enters notify()						sets state = st(2)
4	increment reference_count				enters notify()
5	start updating observers				increment reference_count

So the question is, at this point what value does Thread #2 see for reference_count?

There is really no way to know. If the CPU executing Thread #1 wrote out the value of reference_count before the CPU executing Thread #2 requested the data, it will probably read 1; otherwise it will probably read 0, which would mean you could have two threads updating the observers simultaneously. There is also another, more subtle problem with the while loop. It is entirely possible for the operating system to return immediately from a Sleep(0) call without incurring a context switch on the thread. If a context switch does not occur, the program will probably not re-read the value stored in reference_count. That means that once a thread enters that loop it may never leave unless the operating system kicks the thread off the CPU, causing a context switch.

So how can we prevent this? Essentially we need to ensure that all writes to reference_count are completed before we read it, and that we re-read it before each use. The most common solution would be a mutex: acquire a lock, increment, then release it. Given that they went through all this work to avoid locking, it is no surprise that the developers I’m working with are adamantly opposed to using a lock; they don’t want the performance hit. To fix this problem without one, we need to use either a memory barrier (also called a fence), an interlocked instruction (which is really just an instruction paired with memory barriers), or some other way to make the variable’s operations atomic.

Any way you do this, the solution is going to be compiler- or operating-system-dependent; different compilers implement memory barriers, atomic types, and read/write optimizations differently. The first step is to make the reference_count variable volatile. In most compilers the volatile keyword tells the compiler not to optimize accesses to the variable and to assume that its value can change at any time, which makes it ideal for this situation where multiple threads may execute the same code concurrently. The second step is to perform the increment and decrement using interlocked increment and decrement functions. Interlocked functions ensure that the operations are atomic, so each thread always sees a correct value.
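
Putting those two steps together might look like the sketch below. It assumes Win32 (the original code already calls Sleep), mirrors the names from the pseudocode, and deliberately leaves the first problem, the shared state reference, unaddressed.

#include <windows.h>

class subject
{
public:
	subject() : reference_count(0) {}

	void notify()
	{
		// Atomic read-modify-write with a full memory barrier, so
		// every thread sees the updated count immediately.
		InterlockedIncrement(&reference_count);
		// volatile forces a fresh read of reference_count on each
		// pass instead of letting the compiler cache it in a register.
		while(reference_count > 1)
			Sleep(0);

		// ... update the observers as before ...

		InterlockedDecrement(&reference_count);
	}

private:
	volatile LONG reference_count;
};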

The Wonderful World of Strings

Thursday, November 11th, 2010

Ask any C++ programmer what the single greatest weakness in the language is (note: the language, not the libraries) and they’ll probably say strings. If they don’t say strings, they’ve probably never tried to do an ANSI-to-Unicode conversion.

That all said, I’ve noticed that some people seem to get confused over the differences between char*, char[], and std::string. For that matter, people seem to not even get strlen and strcpy. So let me show you a few misuses I’ve seen, and let’s talk about what they really do.

1) strlen & strcpy

char drop_last_char(const char* in)
{
    int len = strlen(in);
    char last_char = in[len-1];
    return last_char;
}

void copy(char* out, const char* in)
{
    strcpy(out, in);
}

The best way I’ve heard this problem described was by Keith Engell:

Use of the function strcpy is a clear indication to anyone reading your code that you are willing to walk forever to find nothing (e.g. a NULL).

Basically, if your string isn’t null-terminated you will be stuck walking memory until, by random chance, the program finds a byte with a 0 value. Fun, isn’t it?

Instead you should use strncpy or strnlen. Those functions allow you to specify a maximum number of characters to walk before aborting. Always use the ‘n’ versions of the functions rather than the non-‘n’ versions.
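
For example, a bounded copy might look like the sketch below; MAX_LEN, and the assumption that the destination has room for MAX_LEN + 1 bytes, are mine.

#include <string.h>

#define MAX_LEN 255

// Walks at most MAX_LEN bytes even if 'in' was never null-terminated,
// and always terminates 'out' (strncpy alone does not guarantee that).
void bounded_copy(char* out, const char* in)
{
    size_t len = strnlen(in, MAX_LEN);
    strncpy(out, in, len);
    out[len] = '\0';
}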

2) std::string copying

struct safe_string
{
    safe_string(std::string in = std::string())
    : _str(in.c_str())
    {
    }
    safe_string& operator=(std::string rhs)
    {
        _str = rhs.c_str();
        return *this;
    }
    std::string _str;
};

This code has earned a special place in my heart as a free loss of performance. Apparently there is a misconception that if I were to copy a string using the assignment operator or copy constructor both string objects would point to the same c string underneath. Then if I change one, the other would change too.

I don’t know where this misconception started, but it probably comes from old C programmers who don’t understand overloadable operators. The idea behind this struct is to enforce that each time you create or assign a string it “deep copies” the c string.

::Begin Rant::

First off, I hate the phrases “deep copy” and “shallow copy.” Any programmer knows that how much you copy is based on the way you construct your object, and implicit copying behavior varies compiler to compiler, so make sure you implement the “Big 4” to avoid undefined behavior (Scott Meyers, Effective C++).

Second, people need to look at libraries before they use them. The programmer who wrote this seems to like to copy string objects by copying the c string (using c_str()) rather than just assigning the string. I think this comes down to the misconception about strings referencing the same c string, but I don’t get how passing in a c string would address this problem. The logic behind string assignment not copying strings and c string assignment performing the copy mystifies me.

And third, as a minor style point: never make your “default” constructor an overloaded constructor with a default parameter. It’s just bad code style.

::End Rant::

So let’s talk about what this code actually does. Creating a string object from a c string is ugly because it actually has to walk the string twice: once to figure out the length so it can reserve a buffer, and a second time to copy the string. If you’re lucky, the second walk will be an optimized memcpy.

Alternatively, if you just assign a string to a string, then since string objects cache their length, all the implementation has to do is reserve space and memcpy.

The last issue I want to hit here should be self-explanatory, but I’m directing it at all you old C programmers who are set in your c string ways: don’t mix c string functions with string objects. I’ve seen too many places where people use the c_str() accessor to get a c string and then pass it into a standard library c string function like strcat, strchr, or strrchr. If you have a string object, use the built-in string functions to do that work. They’ll be faster. Here’s an example I’ve seen:

void if_last_char_is_slash_remove_it(const std::string& in, std::string& out)
{
    out = in.c_str();
    const char* pos = strrchr(in.c_str(), '/');
    if( 0 != pos && pos - in.c_str() == strlen(in.c_str()) - 1)
    {
        out.erase(pos - in.c_str());
    }
}

You may think this code is a bit contrived, but the original code was actually worse. The number of string iterations here is crazy: we walk the string twice just to copy it (once for the length, once for the copy), a third time to find the last slash, and a fourth time for the strlen call. This entire code block could be replaced with the following:

void if_last_char_is_slash_remove_it(const std::string& in, std::string& out)
{
    if(!in.empty() && '/' == in[in.length() - 1])
    {
        out = in.substr(0, in.length() - 1);
    }
    else
    {
        out = in;
    }
}

There are three primary advantages to this solution:

  1. Since in knows its length, we can jump straight to the character we care about.
  2. We copy only the substring, rather than copying data we don’t need.
  3. It is obviously cleaner.

And this is why we don’t mix strings and c strings.

Words to Program By: Know Your Libraries

Tuesday, October 26th, 2010

A few months back I found this segment of code:

class input_reader
{
public:
  input_reader()
  : str_buffer(buffer)
  {
     // do nothing
  }

  void receive_data(char* data, int size)
  {
     str_buffer.read(data,size);
     // do stuff with buffer
  }
private:
  std::stringstream str_buffer;
  char buffer[1024*1024];
};

What immediately caught my eye about this class is that we’re allocating an object larger than a megabyte. So I started looking into it, and I found that ‘buffer’ is only used once, as an argument to a stringstream constructor… wow… What’s that do? Consulting our dear friend Stroustrup (21.5.3), I can see that std::basic_stringstream has two constructors (std::stringstream is a typedef of std::basic_stringstream<char>):

explicit basic_stringstream(ios_base::openmode m = out|in);
explicit basic_stringstream(const basic_string& s, ios_base::openmode m = out|in);

Wait… Where’s the constructor for const char*?

By passing in a const char* we’re actually calling the overloaded string constructor, implicitly converting the const char* to a std::string. That means the 1MB character array is passed into the std::string constructor and copied into a temporary heap-allocated 1MB block of memory, and then the basic_stringstream constructor copies the 1MB buffer from the string over to its own 1MB heap-allocated buffer. On top of that, we have a 1MB allocation sitting in the object that is never used again.

Apparently the programmer had read somewhere online that if you passed in a pre-allocated buffer as the first parameter to the stringstream it would operate out of that buffer and thus not resize itself. The programmer swore that it was a significant performance improvement. Obviously given that the “buffer” is being used to construct a temporary string object on the stack which is then passed in, this can’t be the case.

So what’s going on? The stringstream actually uses that first argument to seed its internal buffer. So what was happening is that the internal buffer was initially being set to the ridiculous 1MB size, which meant that the first time he filled the buffer it didn’t need to resize at all. His metrics weren’t thorough enough to notice that although the first fill took less than 0.01 ms, every subsequent fill still took 1 ms or more, and the construction of the object took significantly longer. So when his average fill time looked lower over a few hundred runs, he thought it was a good optimization… Obviously not the case.
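
For what it’s worth, a minimal sketch of a saner version (assuming the goal was simply to accumulate incoming bytes, which is what the surrounding narrative suggests) drops the 1MB member entirely and lets the stream manage its own growth:

#include <sstream>

class input_reader
{
public:
    void receive_data(const char* data, int size)
    {
        str_buffer.write(data, size); // append; the stream grows as needed
        // do stuff with buffer
    }

private:
    std::stringstream str_buffer;
};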

This really all just goes to show that whenever you are working with code that isn’t your own it pays to look closely at it.

Why [Insert Name]’s Code Smells Like 3 Month Old Leftovers in my Fridge

Wednesday, October 20th, 2010

A long time ago (10 years or so, which is an eon in the tech industry) a really smart dude named Martin Fowler wrote a book by the name of Refactoring. The book has since become the de facto manual on the topic. The reason I bring this up is that in his book Fowler identifies common “bad smells” in code and provides formulas (or, for real math geeks, transformations) for cleaning them up. Unfortunately many programmers haven’t read his book, and many who have read it don’t know how to deal with the more complex stenches that are common in code today.

Which brings me to the three-month-old lasagna in the back of my fridge. The other day I found some code with a 1000-line conditional statement comprised of one if statement, one else statement, and about 150 else if blocks.

To the untrained programmer this is just a bit of undesirable or “legacy” code. They don’t understand that the first-impression aroma they are picking up on is a complex medley of Fowler smells which combine into a revolting and equally distinct odor.

On the surface you have a smell somewhat like Limburger cheese (an odd mix between dirty socks and BO) which you may mistake for the hobo at the next desk over. In fact this is the smell Fowler identified as a code bloater: the Long Method. The second smell hides a bit better. It is more akin to the smell of mildew and unwashed gym clothes that you find in a high school locker room. Fowler identifies this as the smell of Switch Statements. In our case it is coming from the conditional expressions themselves, but in essence a switch statement is just an if/else if/else statement with cleaner syntax.

In the code I’m talking about here, the next bombardment of your olfactory receptors is something that smells vaguely familiar on the surface, like the dead-fish smell in your freezer (this is usually the point where you realize what you thought was lasagna from a few months ago is actually leftover pumpkin pie from Thanksgiving 2006). This scent comes from the fact that those same conditional checks have been there so long they’ve propagated throughout the entire application, so you can find them being repeated over and over again (much like how the pumpkin pie’s scent has staked a significant claim over my fridge). In this code the if statements set enumerations that were later checked, switched on, and otherwise manhandled. This scent is the dead giveaway that makes identifying the solution straightforward.

If you find that your code is riddled with flow control statements that set state and you have to re-check your state variables periodically you have just found the perfect use case for behaviors.

The application this code lives in is a tool distributed with a very popular development environment. The tool reads in a text file and executes a series of operations based on the instructions in the file (basically it is a custom script interpreter). When I first saw this application almost three years ago I felt that the code was unmaintainable, more complicated than it needed to be, and in general a hacky mess, but at the same time it was really powerful, open-ended, and could be used for a lot of things other than what it was originally designed for. The script parser supported about 30 different commands and identified them by doing a string compare against the first word on each line in the script file. After identifying the command, the tool used an enum to identify each operation, and wherever it needed to know which operation it was executing it would either compare or switch over the enum.

Fast forward a few years, two jobs and a lot of technical development on my part.

Today I’m at a different company using the same middleware vendor, the application has been expanded to about 150 commands, and the core architecture remains the same. That means there is a 1000-line chain of conditionals to determine the type of a command, a 400-line switch statement to execute commands, and hundreds of conditional checks on the enum’s value.

So how do we fix it?

Up above I mentioned a buzzword: “behaviors.” If you boil this problem down, you essentially have 150 operations that need to execute functionality at different points in a process. The first thing you need to do is identify where those execution points are. In the application I’m working on this is pretty easy: there is a setup step, an execute step, and a cleanup step. The primary reason for there being three steps is the ability to have the execution step run on a separate thread from the main application thread, which manages setup and cleanup.

Cool! With that established, how do I make this work? In my case, eliminating the large switch statement and the conditional checks on the type is trivial. The hardest part is understanding that your interpreter doesn’t really care what command is running as long as each command knows what it is supposed to do. That shift of the burden of knowledge from the interpreter to the command is key. All the interpreter really cares about is that the command does the right thing. So you define a base command type with virtual Setup, Execute, and Cleanup methods, plus any common functionality that each command needs. That will look something like this:


class Command
{
public:
    virtual ~Command() {}
    virtual void Setup(Interpreter&);
    virtual void Execute(Interpreter&);
    virtual void Cleanup(Interpreter&);
};

With this as a base you can have a derived object for each command, and your execution logic becomes three virtual function calls instead of a 150-case switch statement and a bunch of leading and trailing if statements. The execution framework can treat every command exactly the same, which gets rid of the need for the switch statement and a bunch of conditional checks. Cool, huh? No, not really: this is the easy part. Getting rid of the if/else if/else block is where the big win is.
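
As a sketch, the interpreter’s core loop shrinks to something like this (the Interpreter reference and the container of parsed commands are illustrative assumptions):

#include <cstddef>
#include <vector>

void run_script(Interpreter& interpreter, std::vector<Command*>& commands)
{
    for (std::size_t i = 0; i < commands.size(); ++i)
        commands[i]->Setup(interpreter);

    for (std::size_t i = 0; i < commands.size(); ++i)
        commands[i]->Execute(interpreter); // this step can run on the worker thread

    for (std::size_t i = 0; i < commands.size(); ++i)
        commands[i]->Cleanup(interpreter);
}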

What makes getting rid of the if/else if/else block tricky is that you are essentially taking input and, based on that input, creating one of 150-some different objects. It is understandable that someone would think they need to identify each possible input and implement a separate handler for each one (as we see with this massive if/else if/else block). But what if we could go from a string/int/enum/whatever directly to a constructed object of the type we want?

I’m actually going to provide two solutions for this. One is a generic solution that can be applied to pretty much any programming language, and the other is more specific to the Microsoft .NET languages. The reason I’m giving both solutions is that the tool I’m working with is written in C#, but I’ve solved this problem before in C++ and think it is useful to give that solution too. Also be forewarned: implementing the .NET solution requires knowledge of reflection, and implementing the more generic solution requires an understanding of function pointers (yay, right?).

Disclaimer: The .NET implementation really isn't anything fancy. I'm pretty sure it is almost exactly how Microsoft implements MSBuild, and I know it is the same way they do a lot of the events in WPF.

In .NET we can make use of reflection to convert a string to an object for us. So let’s say we have a command “Run”. If we pick a strict naming convention such that, for example, Run’s implementation is in a class derived from Command named “RunCommand”, we can take the command name from each line in the script as we execute it and, using reflection, construct RunCommand objects without the interpreter ever needing to know that the command in the text file is Run.

The code to do this is actually pretty simple, and would look something like this:

class CommandFactory
{
    // Dictionary of available commands
    private Dictionary<string, Type> map = new Dictionary<string, Type>();

    // Use Reflection to populate the factory
    public CommandFactory()
    {
        Type[] commandTypes = Assembly.GetAssembly(typeof(Command)).GetTypes();

        foreach (Type commandType in commandTypes)
        {
            if (!typeof(Command).IsAssignableFrom(commandType) || (commandType == typeof(Command)))
            {
                continue; // not a Command
            }

            // Do some name mangling to strip off the trailing 'Command' suffix
            string commandName = commandType.Name;
            string commandStr = "command";
            commandName = commandName.ToLower();

            // strip off 'Command' suffix if it exists
            if (commandName.EndsWith(commandStr))
            {
                commandName = commandName.Substring(0, commandName.Length - commandStr.Length);
            }
            map.Add(commandName, commandType);
        }
    }

    public Command Create(string name)
    {
        Command toReturn = null;

        try
        {
            toReturn = (Command)Activator.CreateInstance(map[name.ToLower()]);
        }
        catch (KeyNotFoundException)
        {
            throw new Exception(String.Format("Command \"{0}\" not found", name));
        }

        return toReturn;
    }
}

This solution should work pretty well, and can be adapted to allow you to specify arguments for the constructor too. All in all it is pretty elegant. The C++ solution isn’t quite as elegant, and I won’t go too deep into it because it is a lot of code.

In the C# code the command factory gets all the supported commands by reflecting over the assembly. We can’t do that in C++ (unless we are using Microsoft’s fake C++, which doesn’t count). Because of that, we require that all available commands implement a static factory method, and we register the function pointer to that static method, keyed by the command name, in our factory. The factory then keeps a binary tree or hash table mapping command names to factory method pointers. You will find the worst-case binary tree lookup is a hell of a lot better than the worst-case run through the conditional block.
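
A rough sketch of that C++ factory follows. It assumes each command class exposes a static Command* Factory() method; all of the names here are illustrative, not from the original application.

#include <map>
#include <string>

class Command; // the base class from earlier

typedef Command* (*FactoryFn)();

class CommandFactory
{
public:
    static CommandFactory& Instance()
    {
        static CommandFactory instance;
        return instance;
    }

    void Register(const std::string& name, FactoryFn fn)
    {
        factories[name] = fn;
    }

    Command* Create(const std::string& name) const
    {
        std::map<std::string, FactoryFn>::const_iterator it = factories.find(name);
        return (it != factories.end()) ? it->second() : 0;
    }

private:
    std::map<std::string, FactoryFn> factories; // std::map is a balanced binary tree
};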

You can do a few things to lighten the load on the C++ programmers. One that comes to mind is a dummy class that registers a command during its construction. That class could be defined with a macro, which for our Run command could be something like “REGISTER_COMMAND( Run )”. The macro would expand at compile time into a tiny class whose default constructor registers the string “Run” with RunCommand::Factory, plus a static variable declaration. That way, when your application loads, the Run command is registered during static initialization (which happens before main is entered) and your programmer doesn’t need to worry about it.
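
A rough sketch of such a macro, building on the hypothetical CommandFactory above:

// Assumes each command class (e.g. RunCommand) exposes a static
// Command* Factory() method matching FactoryFn.
#define REGISTER_COMMAND( name )                                    \
    namespace {                                                     \
    struct name##_registrar                                         \
    {                                                               \
        name##_registrar()                                          \
        {                                                           \
            CommandFactory::Instance().Register(#name,              \
                &name##Command::Factory);                           \
        }                                                           \
    };                                                              \
    static name##_registrar s_##name##_registrar;                   \
    }

// Usage, for the hypothetical Run command:
REGISTER_COMMAND( Run )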

Once this is done your code is a lot more maintainable. Adding a new command basically becomes implementing the command’s class and, for any non-.NET language, registering it with the factory. Problem solved. Smells resolved.

Hello Blogosphere!

Tuesday, October 19th, 2010

To get the first post here out of the way let me just establish a few things.

1) What this blog is.

Almost nothing. Seriously, look around. There’s this post and an about page… That’s it.

2) What this blog isn’t.

Empty anymore. (See, I’m off to a good start.)

3) What this blog will be.

I’ve been told on many occasions by people around me that some of the random crap that spews out of my mouth is entertaining, interesting, and at times intelligent. This blog is the place for me to publicly throw out those thoughts. They may be technical, they may be philosophical, they may just be random musings, but they will all be found here.

4) Who the hell do I think I am?

First and foremost I am a software engineer. I also have dreams of someday being a software architect, and fantasies about being a writer (but I won’t hold my breath on that one).

5) Who should read this blog?

Admittedly this blog is geared toward people interested in technology, although not every post will necessarily appeal to that wide an audience, or that narrow a one. I intend to write about software development, the software industry, software itself, and just about anything else that strikes my fancy.

6) Why should I read this blog?

Honestly, that’s kinda up to you to figure out. I’m dumping out random crap that may or may not appeal to you in any way. If it does appeal to you, awesome; if not, then I will hunt you down, stalk you, and force you to read it as torture. I’m writing this because I feel I have things to say that someone might care to hear, but if I’m wrong that’s fine too.

7) Meaning of life.

42

Stay tuned next week.
