Tuesday, 24 May 2011

NOTE: Coupling, Bad

In my recent article on %MEND I said I didn’t like nested macro definitions. Some of my correspondents have suggested it’s a good means of keeping macro code near to where it’s called. I think this suggests a bad approach; namely, the inner workings of the sub-macro should not be relevant to the caller – the important element is the interface and outward behaviour, thus the definition of the sub-macro need not be located near to where it's used. (Plus, if it's used in more than one place it cannot possibly be located near to both.)

The design principle involved here is "loose coupling". Good design practice suggests that objects, components and modules in your applications should make use of little or no knowledge of the internal implementation of the other objects, components and modules with which they interact. A loosely-coupled system makes it easier to change one object without impacting another (provided the interface and outward behaviour are not changed); this, in turn, means your applications become more maintainable and reliable.

For example, I am loosely-coupled with my car, i.e. I have little or no understanding of how it works beyond my knowledge of its steering wheel, gear lever and pedals (its interface). I don't need any knowledge of how the engine or gearbox work in order to drive it. The advantage to me is that I can hop into almost any car and drive it just as effectively; the advantage for the car manufacturer is they can sell their cars to a wide range of people without needing to train them on how to use the specific model of car.

For macros in particular, it is very easy to write something that requires the caller to know things about the inner workings of the macro, e.g. the macro may expect certain global macro variables to be defined, or it may write its output to other global macro variables, or it may read from or write to specific data sets. It is far better to define these things as part of the parameter interface for the macro; it is then much easier to understand what the macro wants as input and what it might provide as output. Consider these two macros:


No parameter interface:

%macro demo;
  data beta;
    set alpha;
    x = 2 * x;
  run;
%mend;

Called thus:

%demo;

Clear parameter interface:

%macro doubler(data=,out=,var=);
  data &out;
    set &data;
    &var = 2 * &var;
  run;
%mend doubler;

Called thus:

%doubler(data=first
        ,out=second
        ,var=profit);

The second, with the clear parameter interface, does not demand that its user knows which data set and variable names are hard-coded inside the macro; its inputs, outputs and function are clear from the call itself. Thus it is easier to enhance the macro without "breaking" any code that uses it.
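
The same principle applies to macros that hand back values through global macro variables. As a minimal sketch (the macro %countrows and its parameters DATA= and OUTVAR= are hypothetical names chosen for illustration, and the sketch assumes the input is a SAS data set rather than a view), the caller can name the macro variable that receives the result instead of having to know a name that is hard-coded inside the macro:

```sas
/* Hypothetical example: return a row count via a caller-named macro variable */
%macro countrows(data=, outvar=);
  /* Create the output variable globally only if the caller has not  */
  /* already defined a macro variable of that name                   */
  %if not %symexist(&outvar) %then %global &outvar;

  data _null_;
    /* NOBS= is populated at compile time, so no rows need to be read */
    if 0 then set &data nobs=n;
    call symputx("&outvar", n);
    stop;
  run;
%mend countrows;

/* Called thus: */
%countrows(data=sashelp.class
          ,outvar=class_rows);
%put NOTE: sashelp.class has &class_rows rows.;
```

Because the caller chooses the name of the output macro variable, separate calls can keep their results apart, and nothing about the macro's internals leaks into the calling code.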

The example is just a simple one, but the principle has greater and greater value as your applications and their components get larger. The topic is much bigger than I can describe here. It's difficult finding references that don't go into (non-SAS) coding examples. If you're brave, you can try Martin Fowler's classic Reducing Coupling from 2002; else take a look at Coupling and Cohesion in the C2 wiki.