Wednesday, 14 November 2012

NOTE: Comment-Driven Design

I started the NOTE: blog in 2009. It was a successor to the highly popular email newsletter that I sent between 2001 and 2006. At its height, the email newsletter had 4,000+ subscribers. One of the regular features was "SAS With Style". Below, I've included one from 2001 which still holds true and which I still practise today.

The focus of the tip is comments. Despite attempting to write unit specifications that provide sufficient detail of what is to be built and coded, I occasionally find that I have provided insufficient detail in some places (yes, I know, you're shocked!). In some cases it will be appropriate to revisit the documentation to augment it, but in others it may be pragmatic to include the detailed design in the code. Moreover, for those to whom external documentation is anathema, it is crucial that the comments in the code reveal the full rationale and intention of the design.

Jef Raskin wrote a good essay on the subject back in 2005. You can still find it on the Association for Computing Machinery (ACM) web site. It's entitled "Comments Are More Important Than Code". Ed Gibbs added some of his own thoughts to the discussion in 2007 in his "Musings of a Software Development Manager" blog. Ralf Holly provides a neat, alternative summary of the topic in his 2010 blog entry. All three articles are worth a read.

My own tip from 2001 was more basic, but it fits within the general approach discussed by Jef, Ed and Ralf. Here it is:

One of the less-attractive aspects of a programmer's life is maintaining existing code. Most programmers would prefer to be creating something new rather than manipulating old code. But maintaining old code is a necessity, be it your own code, or somebody else's. And in those circumstances, you will be grateful if the code has been written in a neat and clear fashion.

Comments are a critical part of your programs, and they can be used in many different ways. I like to encourage the use of "overview" blocks at the top of large sections (a whole program counts as a "large section"). The individual lines of the overview are then reused as headings for the respective sections of code. The code might look like this:

/******************************************************/
/* 1. Get subset of the demog info                    */
/* 2. Get subset of the lab info                      */
/* 3. Get subset of the meds info                     */
/* 4. Merge demog, lab, and meds and transpose result */
/* 5. Create final transport file                     */
/******************************************************/

/***********************************/
/* 1. Get subset of the demog info */
/***********************************/
code to do the demog subseting

/***********************************/
/* 2. Get subset of the lab info   */
/***********************************/
code to do the lab subseting

/***********************************/
/* 3. Get subset of the meds info  */
/***********************************/
code to do the meds subseting

/******************************************************/
/* 4. Merge demog, lab, and meds and transpose result */
/******************************************************/
code to do the merge and transpose

/***********************************/
/* 5. Create final transport file  */
/***********************************/
code to do the proc cport

The overview block gives any maintenance programmer a great outline of the program and also acts as a kind of index. A general rule of thumb is to have between 6 and 12 sections (yes, I know the example breaks the rule). If the code in any of the sections is large, consider using a secondary-level overview block to break it down further.
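
A secondary-level overview block might look something like the sketch below, which breaks section 4 into sub-steps. It is only an illustration: the data set names (work.demog_sub, work.lab_sub), the variables (subjid, age, labtest, result) and the sample values are all made up so that the code runs on its own, and the meds subset is left out to keep it short.

```sas
/* Hypothetical sample data, included only so the sketch is self-contained */
data work.demog_sub;
  input subjid $ age;
  datalines;
S01 34
S02 51
;
run;

data work.lab_sub;
  input subjid $ labtest $ result;
  datalines;
S01 ALT 22
S01 AST 30
S02 ALT 41
S02 AST 37
;
run;

/******************************************************/
/* 4. Merge demog and lab, then transpose the result  */
/*    4.1 Sort both subsets by subject                */
/*    4.2 Merge the sorted subsets                    */
/*    4.3 Transpose to one row per subject            */
/******************************************************/

/****************************************/
/* 4.1 Sort both subsets by subject     */
/****************************************/
proc sort data=work.demog_sub; by subjid; run;
proc sort data=work.lab_sub;   by subjid; run;

/****************************************/
/* 4.2 Merge the sorted subsets         */
/****************************************/
data work.merged;
  merge work.demog_sub work.lab_sub;
  by subjid;
run;

/****************************************/
/* 4.3 Transpose to one row per subject */
/****************************************/
proc transpose data=work.merged out=work.final(drop=_name_);
  by subjid age;
  id labtest;
  var result;
run;
```

The numbering (4.1, 4.2, 4.3) is just one possible convention; the point is that the sub-headings repeat the lines of the secondary overview word for word, in the same way that the top-level headings repeat the lines of the main overview.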

This style of commenting simply follows the oft-quoted rule of divide and conquer: break down your problem into small, manageable pieces and solve each of them in turn.

Putting aside my misspelling of "subsetting" (albeit I was very consistent!), this eleven-year-old tip is still an approach that I frequently follow today.