Of late there have been numerous posts for and against comments in source code. On the anti side we have, for example, DHH, whose recent post Clarity over Brevity in Method and Variable Names on the 37signals blog calls out comments as a code smell and argues that code should be self-explanatory. Jeff Atwood has made a similar case in a post of his own. Comments are of course not without cost: once written, they have to be updated whenever the code is updated. On the pro side, we have numerous posts saying that comments should actually be more prominent than the code, because they are an invaluable source of documentation. So what's the answer? As usual for any question worth asking, the answer is: it depends.
A fundamental property of good software is that it is easy to change, and code can only be changed safely by someone who understands it. Good programmers therefore write code that is easy to understand. Comments are one very important tool for communicating with the reader, but is there a way to write comments without the overhead of maintaining them?
Kent Beck recently wrote a piece called Naming From the Outside In, in which he discusses a very interesting concept: the various parts of your system change at different speeds. A commenter there linked to an illuminating article about rates of change in buildings and the implications for architecture (http://www.scottraymond.net/2003/5/19/pace-layers/). Here we have a clue as to how to write comments with a vastly reduced maintenance burden.
Every piece of code that you write has three Is associated with it: intent, interface and implementation. These change at different speeds. In an object-oriented language, for example, the intent behind creating a class almost never changes, the public interface changes infrequently or in small increments, and the implementation is frequently in flux due to refactorings and other activities.
The Intent of a class must be commented. While each individual function might be quite self-explanatory, no single function can convey the intent of the class as a whole. Opening the class with a comment that states its intent also greatly reduces the cognitive load on a reader, who would otherwise have to read the whole class to work out what it is for.
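As a rough illustration, an intent comment might look like the sketch below. The `PostArchive` class, the `Post` struct and the method name are all hypothetical, chosen only to have something concrete to comment.

```ruby
require "date"

# Hypothetical value object used only for this illustration.
Post = Struct.new(:title, :published_on)

# PostArchive presents a blog's posts as a chronological archive.
#
# It exists so that views can render an archive page without knowing how
# posts are stored or ordered. It only reads posts and never modifies them.
class PostArchive
  def initialize(posts)
    @posts = posts
  end

  def newest_first
    @posts.sort_by(&:published_on).reverse
  end
end
```

Note that the comment says why the class exists and what it will not do, neither of which is likely to change when the methods inside it are refactored.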
As for the Interface, i.e. the public API, you should document it with comments that feed into YARD, TomDoc or any other documentation generator.
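Here is a minimal sketch of what that could look like with YARD tags. The class, the struct and the `titles_in` method are hypothetical; `@param` and `@return` are standard YARD tags.

```ruby
require "date"

# Hypothetical value object used only for this illustration.
Post = Struct.new(:title, :published_on)

class PostArchive
  # @param posts [Array<Post>] the posts to present
  def initialize(posts)
    @posts = posts
  end

  # Returns the titles of all posts published in the given year.
  #
  # @param year [Integer] a four-digit calendar year, e.g. 2012
  # @return [Array<String>] titles in publication order; empty if none match
  def titles_in(year)
    @posts
      .select { |post| post.published_on.year == year }
      .sort_by(&:published_on)
      .map(&:title)
  end
end
```

The comment describes the contract (argument, return type, the empty case) rather than how the result is computed, so it only needs to change when the public API itself changes.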
Implementation should be more or less self-documenting. This is where we want to avoid the overhead of maintaining comments, because the code is free to change fast. So we push all our complex code down into private methods with descriptive names and don't bother with comments about the implementation. The intended behaviour of the implementation will already be documented in the specs, which have to change whenever the code's behaviour changes anyway.
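A minimal sketch of this approach, assuming a hypothetical `PostArchive` class and `Post` struct. Only the method name `by_year_and_month` is taken from the discussion that follows; the rest is illustrative. The private method is deliberately left without comments.

```ruby
require "date"

# Hypothetical value object used only for this illustration.
Post = Struct.new(:title, :published_on)

# PostArchive presents a blog's posts grouped by year and month,
# ready to be rendered as an archive page.
class PostArchive
  def initialize(posts)
    @posts = posts
  end

  # Returns post titles keyed by year, then by month.
  #
  # @return [Hash{Integer => Hash{Integer => Array<String>}}]
  def titles
    by_year_and_month
  end

  private

  def by_year_and_month
    @posts
      .group_by { |post| post.published_on.year }
      .map { |year, posts| [year, posts.group_by { |post| post.published_on.month }] }
      .map { |year, months| [year, months.map { |month, posts| [month, posts.map(&:title)] }.to_h] }
      .to_h
  end
end
```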
In the above example, we hide away the complicated list comprehension behind a descriptive method name like by_year_and_month. This frees the reader from the burden of having to comprehend the various maps, group_bys and so on that we resort to in order to massage our data into the expected format. This is an example of self-documenting code. As for the complex list comprehension itself, if you want to know exactly how it works, you should be able to find something in the specs that says
describe "by_year_and_month" do…
So to sum up, comments need not be a code smell if we take care to comment only the slow-moving parts of our code, such as the intent of a class and the public API. For everything else, there's self-documenting code. You can push all your complexity down into private methods, which can be dense and uncommented as long as there are specs that express the intended behaviour.