Lost Opportunities

This week I came across some C++ code like this:

if (foo & 0x0FFFFFFF >= width * height) {
    /* copy width * height items to a buffer */
} else {
    /* handle error */
}

This is buggy code. Worse, it’s probably a security vulnerability. The code parses a particular file format, and this if-statement is attempting to make sure the fields are internally consistent. Getting this check wrong probably means an attacker could craft a file to cause the parser to overrun a buffer, which is almost certainly an exploitable security bug.

Don’t see the problem? The variable foo in this case is a 32-bit integral type. The top bits are used for flags, and the remaining portion is a buffer size. The code is attempting to make sure that the buffer is at least large enough to hold width * height items.

Do you see the problem now? It’s a precedence problem. In C and C++, the & operator has low precedence, lower even than the relational operators like >=. So the compiler interprets the condition as:

foo & ( 0x0FFFFFFF >= (width * height))

So first the code computes the product width * height, then it checks whether 0x0FFFFFFF is greater than or equal to the product. This yields a boolean value: true or false. The bitwise AND is then performed between foo and the boolean. Well, sort of. First the boolean is implicitly converted to an integral type to match foo’s type. That is, it becomes a 0 for false or a 1 for true.

The net result: when the product fits in 0x0FFFFFFF, the check passes whenever foo is odd; otherwise it always fails. Either way, it tells us nothing about the buffer size.

C++, like C, is full of pitfalls like this. It’s easy to write code that does the wrong thing. Programmers aren’t superhuman. Bugs like this get written, but we have lots of techniques for catching these kinds of mistakes.

The first line of defense is the compiler. The incorrect expression is perfectly legal, so it’ll happily compile the code. But most modern compilers are aware of common language pitfalls and will issue warnings about suspicious code. For example, I spotted this problem because the Microsoft VC++ compiler pointed at the line and said:

warning C4554: ‘&’ : check operator precedence for possible error; use parentheses to clarify precedence

There’s little excuse for ignoring (or worse, silencing) compiler warnings. Some programmers don’t want to be bothered with false positives.  In almost all cases, however, it’s possible to eliminate false positives by making the code more clearly express what you intended. For example, if you did intend to evaluate the bitwise-AND last, you could have added parentheses to make it explicit. This would not only eliminate the noisy warning, but it would also make the code more obvious to the next programmer to come along.

The next line of defense is to always step through new code in the debugger. This should be second nature to software engineers, but, unfortunately, many still don’t do it. It’s such a useful practice that Steve Maguire dedicated an entire chapter to it in Writing Solid Code. Stepping through the code while keeping an eye on the variables could have alerted the programmer to the fact that the condition wasn’t testing the right thing.

Unit tests definitely should have caught this mistake. If it’s important enough to check the condition, then it’s probably important enough that you have unit tests to check both outcomes. Alas, this code had no unit tests.

And you might think that normal testing would stumble over the problem. In this case, it wouldn’t, because the check guards against an unusual condition that’s not likely to occur with any normal data. And even if a regular test case somehow stumbled over the problem, it wouldn’t provide very good isolation for tracking the cause back to this line of code. File fuzzing could have caught the problem; fuzzing should probably be done for all parsing code.

Having a peer review your change (that is, having another programmer look it over) might have caught the problem. If one developer doesn’t see the bug, a second might miss it, too. But that won’t always be the case. A fresh set of eyes with a different perspective might at least ask whether the expression does what was intended. And if you know a colleague is going to be reading your code, you might be motivated to make it as clear as possible. For example, you might make it a little more explicit like this:

const uint32_t buffer_length = foo & 0x0FFFFFFF;
if (buffer_length >= width * height) {
    /* copy width * height items to a buffer */
} else {
    /* handle error */
}

Notice how trying to make the code clearer to another human can “accidentally” fix the bug.

It’s hard to see how this bug could have made it into a product with all the usual bug defenses in place. I can only conclude that they weren’t all in place. Obviously, the compiler warnings were ignored or silenced. It’s a shame that programmers feel compelled to take such shortcuts. I love it when the compiler says, “um, this looks wrong.” It’s a golden opportunity to fix my stupid mistakes before anyone else sees them.

Counting to a Billion

[This blog post is reconstructed from the Internet Archive's copy of my original article posted on April 7th, 2007. The original post was lost.]

Let’s count to a billion.

#include <iostream>
#include <windows.h>

void Count(const unsigned long limit) {
  DWORD ticStart = ::GetTickCount();
  unsigned long counter = 0;
  for (unsigned long i = 0; i < limit; ++i) {
    ++counter;
  }
  DWORD ticStop = ::GetTickCount();

  std::cout << "Counting took "
            << ticStop - ticStart
            << " milliseconds."
            << std::endl;
}

int main() {
  Count(1000 * 1000 * 1000);
  return 0;
}

Computers are pretty fast. With optimization turned off, my computer reports that it took a mere 2547 milliseconds (or about 2.5 seconds) to count to a billion. With optimization turned on, the compiler realizes there are no side effects in the loop, so it replaces the loop with an assignment: counter = limit;, taking virtually no time.

In the ray tracer, I want to keep statistics. How many intersection tests were performed? How many rays per second were traced? How many of those were shadow rays? Essentially, this boils down to counting how many times certain operations are performed. From this experiment, I can see that incrementing a few counters won’t be a significant performance hit.

But the ray tracer is multithreaded. To keep n cores cranking, I’ve got n threads rendering different parts of the picture. So if I want an accurate count of how many times something happens, I’m going to have to put some sort of synchronization on the accesses to the counter.

void CountWithCriticalSection(const unsigned long limit) {
  CRITICAL_SECTION cs;
  ::InitializeCriticalSection(&cs);

  DWORD ticStart = ::GetTickCount();
  unsigned long counter = 0;
  for (unsigned long i = 0; i < limit; ++i) {
    ::EnterCriticalSection(&cs);
    ++counter;
    ::LeaveCriticalSection(&cs);
  }
  DWORD ticStop = ::GetTickCount();
  ::DeleteCriticalSection(&cs);

  std::cout << "Critical section counting took "
            << ticStop - ticStart
            << " milliseconds."
            << std::endl;
}

The example is still single threaded, but the code is now thread safe. The critical section prevents two threads from corrupting the counter by trying to modify it at the same time.

The bad news is that this example, optimized or not, took 73,719 milliseconds. Ouch. Thread safety has a price. And this is without any contention for the critical section. It would be worse with two threads incrementing the counter. Imagine how bad it would have been if we had used a mutex and had to make a billion round trips to kernel mode!

Fortunately, Windows provides a set of interlocked functions which can help us do simple, thread-safe operations like counting without the overhead of a critical section. One more version:

void CountInterlocked(const unsigned long limit) {
  DWORD ticStart = ::GetTickCount();
  volatile LONG counter = 0;
  for (unsigned long i = 0; i < limit; ++i) {
    ::InterlockedIncrement(&counter);
  }
  DWORD ticStop = ::GetTickCount();

  std::cout << "Interlocked counting took "
            << ticStop - ticStart
            << " milliseconds."
            << std::endl;
}

This version scores significantly better at 37,390 milliseconds, just over half the time of the uncontested critical section. But that’s still a lot of overhead to take on while rendering.

And a little back-of-the-envelope calculation delivers even more bad news. Suppose one of the counters is supposed to be the number of intersection tests performed in the rendering job. If we’re computing images at hi-def resolution (1920 by 1080), that’s about 2 million pixels. If, for anti-aliasing, we oversample with say 16 rays per pixel, we’re up to about 32 million primary rays. Each primary might spawn a secondary reflection or refraction ray. Each of those 64 million rays might require, say, 10 shadow rays for shading. With a good acceleration structure, we might get away with as few as 4 intersections per ray. Next thing you know, we’ve exceeded the capacity of a 32-bit unsigned integer. If not for a single frame, then certainly once we start animating at 24 or 30 frames per second. So it looks like we’ll need 64 bits for at least some of our counters.

But, unless you’re on a 64-bit architecture, there aren’t 64-bit versions of the interlocked APIs. That means we’re back to critical sections for making the counters thread-safe. Obviously, we need to rethink this.

The key to good performance with multiple threads is to eliminate shared data as much as possible. Suppose instead of one counter, each thread had its own. Each thread would be able to increment its own counter without any synchronization concerns. Then, when the render is done, the independent counters could be summed to determine the total count. This adds a small amount of complexity and coordination, but not much, and the benefit should be worth it.

And this brings me back to my singleton post. I’m now paying the price for not taking my own advice. The ray tracer’s statistics are based on a set of counters kept in a singleton, so now I have to do some extra refactoring to distribute the counters among thread-specific objects.

105% Complete

One of my UI pet peeves is misleading progress indicators.

This morning, I’m running a disk drive diagnostic program. The first test claimed it would take 200 seconds. After only 10 seconds, the display said “10% done”. Huh? Last time I checked, 10 ÷ 200 was 0.05, or 5%. Near the end of the test, the display said “100% done” about 10 seconds before it actually completed.

To a user, incorrect progress bars are annoying and useless. I don’t know about you, but when something is “100% done”, I think of it as, well, done. Not close to done. Those eternal seconds between “100% done” and actually done are immensely frustrating. People have even done research papers on designing progress bars to improve the user’s perception of a program’s progress. But this kind of research is light years ahead of real life, where we can’t even get a decent linear progress bar to work.

From my programmer’s perspective, misleading progress indicators are especially perplexing. You have to write extra code to get it wrong. Since programmers are lazy, nobody should be writing extra code, and all of our progress bars should just work.

Consider the disk drive diagnostic. You’d expect a plain jane linear progress indicator to be computed something like this:

indicator_value = work_done / total_work

Pretty simple, right? Ten seconds into a 200 second test should give us 0.05.

Now it’s likely that the calculation is done with integer arithmetic, and the final result will be presented as a percentage. So we have to scale up the numbers like so:

indicator_value = 100 * work_done / total_work

Now, 100 * 10 seconds ÷ 200 seconds gives us 5. So why in the world would the disk diagnostic claim 10%? Because somebody decided to do more work than was necessary.

Somebody apparently decided to have the indicator display progress in 10 percentage point increments (10%, 20%, 30%, etc.). Fine, you have to draw the line somewhere. So we revise our code to something like this:

indicator_value = 100 * work_done / total_work / 10 * 10

Mathematicians unfamiliar with integer arithmetic in most programming languages are now scratching their heads. The salient detail is that there’s an implicit floor function on the result of each division operation when working with integers. Conceptually, the above is equivalent to:

indicator_value = floor(floor(100 * work_done / total_work) / 10) * 10

Any programmer who finds him- or herself at this point should stop, take a breath, and check off the progress indicator feature as complete.

But many programmers don’t stop there. Something bothers them. They instinctively worry about the “rounding down” that the integer division does. They want to fix it. They want to round up. Next thing you know, they’ve written an expression like this:

indicator_value = (100 * work_done / total_work + 5) / 10 * 10

Adding 5 (which is half of the 10 percentage point interval) will bias the number up. In many cases, this is the right thing to do. If you’re writing code to credit frequent flier miles to my account, I want you to round up. But we’re talking about code for a progress indicator. Don’t over-promise. Manage the user’s expectations.

It’s true that, 10 seconds into a 200 second operation, we’re closer to 10% than we are to 0%. But we’re not 10% done. We’re not even close. And, at the other end, you’ll show 100% done 10 seconds before we’re actually done. That’s an outright lie. User interfaces shouldn’t lie to users.

Don’t round up on progress indicators. Ever!

“But now users will panic,” some of you are sure to complain, “because it’ll seem like nothing is happening for 20 whole seconds!”

Perhaps so. Maybe it’s time to revisit the decision to show progress in 10 percentage point increments. You could have done even less work.

[Happy Pi Day!]

Test-Driven Development

On a C++ programming forum, somebody asked a relatively simple question: How do you write a function that counts the number of lines in a text file? Most of the answers were friendly and helpful, but flawed.

As programmers, we often think of a problem as so trivial that we don’t take the time to think through the answers or to appreciate the subtlety of corner cases. In one of my favorite books, Programming Pearls, Jon Bentley talks about how hard it is for programmers to get fairly simple and routine problems correct on the first try. He writes:

I’ve assigned this [binary search] problem in courses at Bell Labs and IBM. Professional programmers had a couple hours to [implement a binary search] in the language of their choice; a high-level pseudocode was fine. At the end of the specified time, almost all the programmers reported that they had the correct code for the task. We would then take thirty minutes to examine their code, which the programmers did with test cases. In several classes and with over a hundred programmers, the results varied little: ninety percent of the programmers found bugs in their programs (and I wasn’t always convinced of the correctness of the code in which no bugs were found). [p. 36]

Remember, these are professional programmers, in the language of their choice, with hours of time, messing up a binary search.

Notice the oh-so-common pattern here:

  1. Write code.
  2. Declare that it works.
  3. Test it and discover it’s broken.

It’s enough to make you wonder if there’s a better way. Many people in the extreme programming and agile crowds would say that we’re doing it all out of order. Not only should “declare it works” come after testing, but probably so should writing code. So let’s take this seemingly trivial line-counting problem and apply a little test-driven development in C++.

First, let’s agree on an interface. I propose:

template <typename InputIt>
size_t CountLines(InputIt begin, InputIt end);

That is, we’ll give the function a pair of iterators pointing to the beginning and end of the text we want examined, and the function will return the number of lines. (If you want to count the lines in a file, read the file into a std::string and call the function.) I’ve used size_t for the return type, as that’s typically unsigned (after all, the count will be 0 or more—it can’t be negative), and it’s generally as large an integer type as your platform will comfortably handle, so we shouldn’t have to worry about text consisting of too many lines. If you’re not a fan of the template parameters for the iterator types, feel free to re-write it with const char pointers.

Now that we know the interface, we will write the test function—even before we discuss how we’ll count lines. This function will iterate through a set of test cases, feed the input to CountLines, and check the return.

#include <cstddef>
#include <iostream>
#include <string>

bool Test_CountLines() {
  struct TestCase {
    std::string input;
    size_t output;
  };

  const TestCase tests[] = {
    {"Hello\nWorld\n", 2}
  };

  const size_t cTests = sizeof(tests) / sizeof(tests[0]);

  size_t cPassed = 0;
  size_t cFailed = 0;
  for (size_t i = 0; i < cTests; ++i) {
    const TestCase &test = tests[i];
    size_t cLines = CountLines(test.input.begin(), test.input.end());
    if (cLines == test.output) {
      ++cPassed;
    } else {
      std::cerr << "FAILURE" << std::endl
                << "Input \"" << test.input << '"' << std::endl
                << "Output " << cLines << " instead of " << test.output << std::endl;
      ++cFailed;
    }
  }

  return cPassed == cTests && cFailed == 0;
}

I’ve started us with one test case, and we can add more as we develop the function. Just to make sure our testing framework is solid, let’s stub out CountLines and a little main function to make sure it all compiles and runs.

template <typename InputIt>
size_t CountLines(InputIt /* begin */, InputIt /* end */) {
  size_t cLines = 0;
  return cLines;
}

I commented out the parameter names to avoid the “unused parameter” warnings. These will come into play shortly.

#include <cassert>

int main() {
  assert(Test_CountLines());
  return 0;
}

If you put all this together and run it, you’ll see that the test fails as expected. This is good, as we want to make sure we exercise the failure case of our test function to know that it works.

Some of you are probably saying that you could have already written a correct CountLines function by now. Maybe. Remember all those “helpful” answers on the programming forum? Remember all those professional developers attempting a binary search? Besides, it really didn’t take long to build this test infrastructure. If a bug ever does crop up in the function, we can easily debug it. When we fix a bug, we can immediately verify that we didn’t create a regression by re-checking every test case we’ve ever come up with, and we can add a new test case so that the bug never comes back. If you change something fundamental, like the compiler or OS you’re using, you can know with certainty whether or not it affects the behavior of your function.

Next we need test cases. Off the top of my head, here are some interesting cases we’ll have to design for:

  • An empty string should have zero lines.
  • The count of a string with blank lines or empty lines (e.g., “Hello\n\nWorld!\n”) should include those blank lines.

Doesn’t seem too hard yet, so let’s add those test cases.

const TestCase tests[] = {
  {"", 0},
  {"Hello\nWorld\n", 2},
  {"Hello\n\nWorld\n", 3}
};

It seems pretty “obvious” that the way to count the lines is to count the number of newline (‘\n’) characters in the string:

#include <algorithm>  // for std::count

template <typename InputIt>
size_t CountLines(InputIt begin, InputIt end) {
  return std::count(begin, end, '\n');
}

Ta da! Test-driven development has led us directly to one of the common—but incorrect—answers given on the programming forum.

Consider the string “Hello”. That has zero newline characters, but most people would consider it to be one line long. Too bad we didn’t think of that while coming up with our test cases. Now that we have thought of it, the first thing to do is to add it to our test code.

const TestCase tests[] = {
  …
  {"Hello", 1},
  …
};

We might be tempted to treat this new string as a special case.

// Ugly -- don't do this!
template <typename InputIt>
size_t CountLines(InputIt begin, InputIt end) {
  size_t cLines = std::count(begin, end, '\n');
  if (cLines == 0 && begin != end) {
    cLines = 1;
  }
  return cLines;
}

While this hack will pass our tests, it’s ugly and still wrong. We’ve failed to realize that the “Hello” string is just a special case of not having a newline character on the last line in the string. (This is an extremely common bug in the wild. Many parsers choke if the last line of the input doesn’t end with a newline.) So we add another test case:

const TestCase tests[] = {
  …
  {"Hello\nWorld", 2},
  …
};

When adding a test case, it’s good to think of counter-examples. For instance, if the last line of the string does end with a newline, that shouldn’t trigger an extra line. Fortunately, we’ve already got a test case for that.

Now our original approach is starting to look bad. Simply counting the newline characters, even with some hacky fix-ups for “corner cases”, isn’t going to work.

All is not lost. Yes, we have to completely re-think how we’re going to implement CountLines, but none of that invalidates the test framework we’ve built. When we come up with our new solution, we can immediately check to see if it solves the new test cases as well as the others we’ve already made. Suddenly it looks like the time we’ve put into the test framework is going to pay off.

There are a few ways to solve the line counting problem, but they all boil down to a state machine. Here’s my solution:

template <typename InputIt>
size_t CountLines(InputIt begin, InputIt end) {
  size_t cLines = 0;
  bool bStartOfLine = true;
  for ( ; begin != end; ++begin) {
    if (bStartOfLine) {
      ++cLines;
    }
    bStartOfLine = *begin == '\n';
  }
  return cLines;
}

In this case, there’s not much state—just a boolean. If we’re at the beginning of a line, and there’s anything on it (even a ‘\n’), then we need to count the line. After that, we simply need to figure out if the next character will be at the start of a line or not. This solution passes all five of our test cases.

Some say that you shouldn’t “waste time” on the corner cases, since, well, they’re corner cases. In my experience, though, if you get the corner cases right, then the mainstream cases generally just work. After all, the corner cases should just be degenerate versions of the mainstream case—or vice versa.

Uh oh, another bug. We’ve been considering ‘\n’ to be the newline character. Text files on Multics-derived operating systems use a single ASCII line feed character (LF) to indicate a line break, but that’s far from universal. CP/M, DOS, Windows, and a lot of Internet protocols require a carriage return (CR) followed by LF to mark a line break. C and C++ file I/O routines will translate whichever convention your platform uses to and from ‘\n’, provided the file is opened in text mode. Most compilers give ‘\n’ the value 0x0A, which happens to be the same as an ASCII LF. On Unix systems, the I/O routines have to do no translation. On others, when you write ‘\n’, the output routine actually writes CR-LF, and when the input routine reads CR-LF, it returns ‘\n’. (Read all about newlines on Wikipedia.)

But if we were to read in the data in binary mode (which could be faster), then we’ll see the actual bytes used in the file. So we have to add test cases. First, we’ll add the CR+LF cases. Then we’ll conditionally add ones with explicit LFs in case our compiler doesn’t use LF for ‘\n’.

const TestCase tests[] = {
  …
  {"Hello\x0D\x0AWorld", 2},
  {"Hello\x0D\x0AWorld\x0D\x0A", 2},
  {"Hello\x0D\x0A\x0D\x0AWorld\x0D\x0A", 3},
#if '\n' != '\x0A'
  {"Hello\x0AWorld", 2},
  {"Hello\x0AWorld\x0A", 2},
  {"Hello\x0A\x0AWorld\x0A", 3}
#endif
};

Happily, our latest version of CountLines works with these new examples. Essentially, the CR characters aren’t important to the algorithm, and our compiler does assign LF for ‘\n’.

If you did read the Wikipedia article, you know that there are several more newline conventions that we might want to support. Old Mac and Amiga files use just a CR, old IBM mainframes used NEL from the EBCDIC character set, Unicode introduced LINE SEPARATOR and PARAGRAPH SEPARATOR code points, etc. I’m sure you could imagine refining CountLines to handle these additional cases. Essentially, the state machine grows in complexity.

Conclusion

OK, this has been a long blog post, mostly because I’ve shown the iterative process necessary to solve the problem. What we’ve seen is that even the simplest problems might be more complex than we anticipate. The more thought we put into coming up with test cases, the more robust an implementation we were able to write. And when we had to start the implementation over from scratch, having a unit test framework helped us quickly ensure that the new implementation was at least as correct as the previous one.

Servo Animation

One of the (far too many) projects I’ve been working on is a video game. You might call it a sports simulation (then again, you might not, depending on your idea of what constitutes a sport). Nevertheless, it involves players waving sticks at balls, trying to get the balls to go in the right direction.

Physics

For the core of the game engine, I’ve built a physics simulator. Back in college, I had a tenuous grasp on Newtonian physics, which came in handy as I was earning my degree in physics. As with many things, you have to use it or lose it, and I haven’t exercised my knowledge of physics much since graduating. So, in addition to figuring out how to translate conservation of energy into floating point math, I had to relearn some physics and, especially, some calculus. If you find yourself in the same boat, I recommend this series of articles from Chris Hecker as a warm-up:

Those will help you (re-)learn the physics and math necessary to make a nice 2D simulator (think Crayon Physics). With those as inspiration, these heavy-duty tutorials from David Baraff will help you build a nearly-complete 3D simulator:

My simulator is built almost directly from the latter two references. While these are hard-core articles, there is a lot more to this field. Fortunately for me, this was plenty to do the basic bouncing-ball simulation necessary for my game.

At least, I thought it was. I have the balls bouncing, rolling, and smacking into each other, all under the control of the physical simulation. What I need now is to blend the simulation with canned animations and with input from the user. The question is, how do I handle the interaction of the animation-controlled objects with the simulation-controlled ones?

Consider a canned animation of a 3D character swinging a bat. There’s a ball hurtling toward the player’s avatar. The player presses a button to start the animation. Frame by frame, the bat’s position is updated. If the bat hits the ball, an impulse needs to be imparted to the ball. But since the swing is just a series of positions, we don’t have all the physical values necessary to compute an accurate force to apply to the ball. You could probably guesstimate the change in momentum and send the ball flying off in a plausible way, but what about the bat? If I continue with the animation of the player’s follow-through, you won’t see any recoil on the bat itself. The bat will continue on its trajectory as though the player had missed the ball.

This is a quandary I’ve been pondering off-and-on for a while. But after the MazeBot project, the answer came to me: servo control.

Servo

MazeBot has a compass sensor that gives its orientation in degrees with respect to magnetic north. At any point in time, MazeBot’s software has a desired orientation. The difference between the desired orientation and the measured orientation is the error. The error is used to compute how much power to apply to the left and right motors in order to turn the robot back to the desired heading. This is repeated many times per second in order to keep the robot on track. This is called servo control.

Now here’s the neat part. To execute a turn, the software “animates” the desired heading from its original value to the desired value. For example, if the robot is at heading 18 and it wants to turn 90 degrees, the software changes the desired heading to 19, then to 20, then 21, and so on, until it reaches 108. That’s it. It doesn’t directly issue any commands to the motors.

In the background, the servo controller is always running, trying its darndest to keep the robot aligned to the desired heading—even when the desired heading is a moving target. Thus the robot turns. How closely it follows the animated path depends on how well-tuned the servo controller is and how responsive the mechanism is.

Servo Animation

Suppose we applied the same idea to the animation of the bat. The bat could be just another object that the physics simulator controls. Rather than telling us where the bat is, the animation path tells us where we want the bat to be. We feed that desired position into a servo controller just like the desired heading is given to MazeBot’s servo controller. The servo controller returns the force (and torque) that should be applied to the bat to keep it on the animation path.

Now that the bat is a first-class citizen of the simulation, we have all the physical parameters we need to resolve any ball-bat collisions. The ball experiences a change in momentum and heads away to the outfield. The bat will also experience a recoil impulse causing it to momentarily fall behind the animation path. The servo controller will increase the forces to try to get the bat back on track, resulting in a swing that—while animation-driven—still responds in a realistically different way than if there had been no collision.

I haven’t tried this out yet. Given my project backlog, it might be a while, but I wanted to write up this idea while it was still fresh in my head. What do you think? Will it work?

MazeBot

My brother and I make it a point to get together about four weekends a year to do projects. We started this “tradition” a couple years ago, and it’s become something I look forward to every time.

Last weekend I met him in Southern California. He suggested we do a robotics project; specifically, to make a small robot that can navigate a maze. First we needed a maze, so we headed to Home Depot with the idea of getting some lumber we could cut into various lengths and lay out to make a reconfigurable maze. After settling on some 1×4 pine boards, we discovered some 2×16×6 concrete blocks. They worked beautifully, and were much cheaper than even the cheap wood we had originally selected.

To build the actual robot, which I dubbed MazeBot, we used LEGO (Mindstorms NXT). Working with the NXT, which is the second generation of the LEGO robotics line, was much easier than trying to get the original Mindstorms brick to do much. We used ROBOTC, which is a C compiler from the folks at CMU made especially for robotics projects. We used Bluetooth to transfer the compiled programs from our development PC to the robot. The whole process was a snap, which made it very easy to do lots of iterations.

Adrian with MazeBot

Did I say lots of iterations? We made eight major versions of the software (about four pages of C code) in one weekend. In the end, MazeBot could explore and map the maze. The next step would have been to give it more explicit goals, like finding a route to a particular cell in the maze, but we ran out of weekend. This video shows MazeBot in action.

The robot uses a sonar sensor in front to detect the maze walls. It has a compass sensor on a mast to determine its orientation. (The mast is necessary because the magnets in the drive motors interfered with the accuracy of the compass readings. Late in the weekend, we figured out that our mast had to be even taller to get good readings.) The software assumes everything in the maze is at right angles and that the robot begins aligned with one of the walls. The LEGO motors also have feedback, which we used to determine how far the robot has traveled.

MazeBot 8.0 from Adrian on Vimeo.

I certainly learned a lot about PID servo control. Well, PI control anyway—our robot doesn’t use a derivative term.