Tuesday, May 13, 2008

A Stitch in Time

I have written a bunch of multi-threaded code in the past few years. (I realize that a "bunch" isn't very quantifiable, but bear with me.) It seemed to work fairly well; the server product I worked on was able to make use of multiple CPUs whenever possible. I tested it on multi-CPU machines and saw all CPUs being fully used under load. Also, others used that product on 8-way boxes with good reliability. I have found and fixed threading bugs in my own code and in others' code. I still, however, find myself looking for patterns to help me deal with threading, to increase my confidence in my (and others') multi-threaded code.

I don't know if it is the indeterminacy of it, or if it's that I don't feel I can hold that many states in my head at once. I also know that the few times I've tried to have my threading code reviewed by others, they gave up long before I got any feedback of any kind. I don't know if that was because the code was opaque (though there wasn't very much of it...) or because they weren't able to deal with multi-threaded code any better than I was.

I had a number of concerns about the use of mutexes in that product. My primary concern was the sheer number, both of types and of instances. I felt that I had to make sure that the whole system couldn't deadlock, but given those numbers there was no way that I (or anyone I know) could do that.

The principles I will follow in the future are as follows:

  • Don't be afraid of multiple mutexes.
  • Do as little as possible while holding locks.
  • Never hold a lock indefinitely.
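
To make these principles a little more concrete, here is a minimal sketch in Python of how they might look together. It is not code from the product described above; the `Stats` class, its method names, and the one-second default timeout are all made up for illustration. It also reads the third principle as "never wait on a lock without a bound", which is only one possible interpretation.

```python
import threading


class Stats:
    """Hypothetical example: two independent pieces of state, one lock each."""

    def __init__(self):
        # Principle 1: separate locks for separate data, rather than one big lock.
        self._totals_lock = threading.Lock()
        self._totals = {}
        self._events_lock = threading.Lock()
        self._events = []

    def add(self, key, samples, timeout=1.0):
        # Principle 2: do the expensive work before taking the lock.
        subtotal = sum(samples)

        # Principle 3 (as interpreted here): bounded wait, then give up.
        if not self._totals_lock.acquire(timeout=timeout):
            return False
        try:
            # Only the shared update happens while the lock is held.
            self._totals[key] = self._totals.get(key, 0) + subtotal
        finally:
            self._totals_lock.release()
        return True

    def log_event(self, message, timeout=1.0):
        # Uses its own lock and is never called while the totals lock is held,
        # so the two locks are never nested.
        if not self._events_lock.acquire(timeout=timeout):
            return False
        try:
            self._events.append(message)
        finally:
            self._events_lock.release()
        return True

    def total(self, key, timeout=1.0):
        if not self._totals_lock.acquire(timeout=timeout):
            raise TimeoutError("could not acquire totals lock")
        try:
            return self._totals.get(key, 0)
        finally:
            self._totals_lock.release()
```

Because no method takes one lock while holding the other, the two locks cannot deadlock against each other in this sketch, and a caller that gets `False` back can decide whether to retry or report the failure instead of hanging.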

I don't know if this will make it easier (and more comfortable and reliable) for me to work on multi-threaded code, but I know I can't go about it as I have in the past.
