Sunday, August 26, 2007

The Costumer's Handshake

What is it about code formatting standards? After all, the important thing is that the code compiles, links and passes its tests, right?

No. All psychological and behavioral arguments aside, there is something attractive about well-formatted code. Admittedly, well-formatted code does not guarantee functional correctness, adequate performance and maintainability, but it is clear (at least to me and the folks that I prefer to work with) that it helps readability and therefore maintainability. And it certainly makes a better first impression than a slap-dash bunch of statements thrown together.

What does this have to do with costumers and handshakes?

My wife likes to sew costumes for my daughter, who is deeply involved with the San Francisco Dickens Faire and other such semi-historical pastimes. At one of the Dickens Faires my wife approached a woman and asked her about her costume. This is not an uncommon occurrence at these events, and most people are only too happy to talk about their work. After chatting for a few minutes and establishing a rapport, my wife reached out, looked closer at the garment, and turned the edge of the woman's bodice to examine the stitching and the lining. The woman chuckled and said, "Ah, the costumer's handshake!" and we both immediately knew what she meant.

People who care about the quality of their (and other's) work know to look inside and underneath. Costumers know the handshake. Woodworkers know to open the drawers of a cabinet to check out craftsmanship of the joints inside. They look to see if something is veneer (likely to come off in time), or solid wood, which will last longer. Printers know to hold up a page of a book to the light to check that the imposition of back-to-back pages is in alignment, and to check the quality of the paper.

I don't know if such quick evaluations can help judge the quality of a body of code. There are obvious things to look for such as error handling and formatting consistency, but problems such as duplicated code and bad naming are harder to spot without spending more time getting to know the code.

I haven't worked with many open source projects, but the one I've spent the most time with (Blender - see www.blender.org) certainly doesn't pass a lot of my tests. Does that mean it's not successful? If open-source success is defined as "usage and visibility" then Blender certainly qualifies as a success. A lot of the commercial closed-source code I work with doesn't pass my tests either. Given that success for commercial closed-source is defined as profitability for the company that produces it, I can't comment on that in public.

I do know, however, that (by the craftsmanship standard), much of the code that I work with might not stand up to having its "lining" examined. But that is one of the goals I aim for.

Wednesday, August 22, 2007

Paranoid Programming

I read and enjoy Joel on Software (and used his CityDesk product for my website). I have to respond, however, to some of his comments in this article.

I strongly agree with most of it. For example, I'm glad to see conventions such as "Don’t put closing braces more than one screen away from the matching opening brace." (And I wish I could enforce that standard at the place I work at!) Also, he writes: "I have to admit that I’m a little bit scared of language features that hide things". I have seen very confused code caused by "features" in C++ like the parentheses operator on iterators. I always prefer to see a real method name such as "Next()" (thanks to Nick Pouschine for putting into words the solution to my unease with that syntax). And I never have, and never will, overload the plus operator for any reason.

I don't, however, see any reason to write off exceptions. I depend very heavily on exceptions in all the C++ and Python code I write. Joel's example of:

dosomething();
cleanup();


just seems wrong. The cleanup code should be on the destructor (or some equivalent in non-object-oriented languages), which should always get called at some point. Note that I'm assuming you've adopted the convention of "every object is correctly owned" so that memory is always freed at the appropriate point in the code, regardless of whether errors occurred or not.

[Apologies in advance for the CAPITALS.] I don't EVER need to "investigate the entire call tree of dosomething()" to see if there’s anything problematic in there. Rather I ALWAYS assume the worst, which is that any line of can cause an error (divide by zero, anyone?). I'm not paranoid, they really are out to get me!

I therefore write all my code to handle errors anywhere. That way, if an error happens in my code or some third-party code that I'm using, it is handled. Once I "accepted" exceptions, then I wrote code that I found more readable. I also was more confident in its error handling capabilities.

For example, instead of having all that nesting in the code from Raymond Chen's article that Joel references, I could put the variables into fields in an object, all the cleanup code into a destructor and then "flatten out" the error checking (I'll try to put an example in here later...). I don't claim that there are fewer lines of code (there may be more), but the error handling code doesn't get in the way of my reading and understanding of the intent of the code. This is especially true when errors have to be propagated back up a long call stack, where there may be many intermediate functions that would have to return the error. With exceptions, I only care about where they are thrown and where they need to be caught, not all the intermediate points in between.

To me, having the "other channel" of exception handling removes noise from the code and makes it more readable. I agree with Raymond that exceptions are tricky and you have to know how to do the right thing (e.g., don't throw pointers to stack-based objects and make sure to understand threading issues). Once, however, I had those principles understood and enforced, I was able to keep the error handling code "out of the way" of the rest of the code. This allows me to focus on the intent of the code as I read it, and not have to mentally "filter out" all the error handling code.

I certainly don't think that I'm smarter than Joel or Raymond. I do know, however, that I prefer using exceptions rather than having every function return error codes, both for code readability and robustness. I know they make me a more productive programmer.

Friday, August 10, 2007

Buffer Overflows Are So Twentieth Century

At one of the companies I worked for in the past, the engineering team was able to agree on a set of standards that met the criteria outlined in my previous post. These standards were short, usually 1-2 page documents. As this was at least Year 3 BW (Before Web), the documents were stored in Lotus Notes.

One of the standards that we quickly agreed on was titled "No unsafe string functions." That is, we banned (except for the very occasional exception case) the use of C-language functions such as strcpy, strcat and sprintf that can cause buffer overflow. We wrote a series of small safe functions that would make sure that buffer overflows would not happen. These functions all took a maximum buffer length as well as the target buffer pointer. We didn't want to convert to std::string because there had been a number of cases where some engineers didn't understand their usage and had caused very serious performance problems.

I wrote a Bourne Shell script to search across the appropriate files for all occurrences of the unsafe functions. I ran this once a week or so. Once in a while, I would find an incorrect usage that had crept in, but usually all the engineers understood the need for such a standard and would adhere to it.

A few years later the company was acquired and a new group of engineers was brought on board. We started over again with the standards process, but something strange happened with this particular standard. This new group could not accept it. I never fully understood why. I re-wrote the standard a number of times and had a long email correspondence with the leader of the group (who I have great respect for). However, those of us on the original team were unable to convince the newer members of the need for this standard.

A year or so later, one of the application engineers that used our product came to me and reported a crash. The product that I worked on was a server product, and crashes were taken seriously. I analyzed the crash and found it was caused by the use of one of those unsafe functions. I fixed the problem. Luckily the crash did not happen at a customer site, so we were able to prevent it before the particular application went live.

I then re-wrote the standard (yet again), and sent a pleading email to the leader of the group asking if they could now accept it.

They could, and did.

And now they understand.