Lost Opportunities

This week I came across some C++ code like this:

if (foo & 0x0FFFFFFF >= width * height) {
    /* copy width * height items to a buffer */
} else {
    /* handle error */
}

This is buggy code. Worse, it’s probably a security vulnerability. The code parses a particular file format. This if-statement is attempting to make sure the fields are internally consistent. Getting this check wrong, probably means an attacker could craft a file to cause the parser to overrun a buffer, which is almost certainly an exploitable security bug.

Don’t see the problem? The variable foo in this case is a 32-bit integral type. The top bits are used for flags, and the remaining portion is a buffer size. The code is attempting to make sure that the buffer is at least large enough to hold width * height items.

Do you see the problem now? It’s a precedence problem. In C and C++, the & operator has a low precedence — lower than even the inquality operators. So the compiler interprets the condition as:

foo & ( 0x0FFFFFFF >= (width * height))

So first the code will compute the product width * height and then it will check if 0x0FFFFFFF is less than the product. This yields a boolean value: true or false. The bitwise AND is then performed between foo and the boolean. Well, sort of. First the boolean will be implicitly converted to an integral type to match foo‘s type. That is, it becomes a 0 for false or a 1 for true.

The net result is that it checks to see if foo is even or odd.

C++, like C, is full of pitfalls like this. It’s easy to write code that does the wrong thing. Programmers aren’t superhuman. Bugs like this get written, but we have lots of techniques for catching these kinds of mistakes.

The first line of defense is the compiler. The incorrect expression is perfectly legal, so it’ll happily compile the code. But most modern compilers are aware of common language pitfalls and will issue warnings about suspicious code. For example, I spotted this problem because the Micosoft VC++ compiler pointed at the line and said:

warning C4554: ‘&’ : check operator precedence for possible error; use parentheses to clarify precedence

There’s little excuse for ignoring (or worse, silencing) compiler warnings. Some programmers don’t want to be bothered with false positives.  In almost all cases, however, it’s possible to eliminate false positives by making the code more clearly express what you intended. For example, if you did intend to evaluate the bitwise-AND last, you could have added parentheses to make it explicit. This would not only eliminate the noisy warning, but it would also make the code more obvious to the next programmer to come along.

The next line of defense is to always step through new code in the debugger. This should be second nature to software engineers, but, unfortunately, many still don’t do it. It’s such a useful practice that Steve Maguire dedicated an entire chapter to it in Writing Solid Code. Stepping through the code while keeping an eye on the variables could have alerted the programmer to the fact that the condition wasn’t testing the right thing.

Unit tests definitely should have caught this mistake. If it’s important enough to check the condition, then it’s probably important enough that you have unit tests to check both outcomes. Alas, this code had no unit tests.

And you might think that normal testing would stumble over the problem. In this case, it wouldn’t because the condition was checking for a very unusual condition that’s not likely to occur with any normal data. And even if somehow a regular test case stumbled over the problem, it wouldn’t provide very good isolation for tracking the cause back to this line of code. File fuzzing could have caught the problem. Fuzzing should probably be done for all parsing code.

Having a peer code review your change–that is another programmer look them over–might have caught the problem. If one developer doesn’t see the bug, then a second might miss it, too. But that won’t always be the case. A fresh set of eyes with a different perspective might at least ask whether the expression does what is intended.  And if you know your colleague is going to be reading your code, you might be motivated to make it as clear as possible.  For example, you might make it a little more explicit like this:

const uint32_t buffer_length = foo & 0x0FFFFFFF;
if (buffer_length >= width * height) {
    /* copy width * height items to a buffer */
} else {
    /* handle error */
}

Notice how trying to make the code clearer to another human can “accidentally” fix the bug.

It’s hard to see how this bug could have made it into a product with all the usual bug defenses in place. I can only conclude that they weren’t all in place. Obviously, the compiler warnings were ignored or silenced. It’s a shame that programmers feel compelled to take such shortcuts. I love it when the compiler says, “um, this looks wrong.” It’s a golden opportunity to fix my stupid mistakes before anyone else sees them.