Wednesday, 25 May 2011

NOTE: Parameter Validation - %DATATYP

In yesterday's article on coupling, I showed how the use of macro parameters can decouple your macros, making them more maintainable and reliable. Building in some parameter validation is always good practice too. Simple, basic validation is often all that's needed to reveal a problem before it gets too far into your sequence of macro calls and becomes difficult to unpick and debug.

The %DATATYP autocall macro is very useful in this area. When passed a value, it will tell you whether the string is numeric or character. This is especially useful in macro-world where all values are handled as character strings. The following code snippet gives an introduction to its usage:

%macro multiplier(data=,out=,var=,mult=);
  %if %datatyp(&data) ne CHAR %then...
  %if %datatyp(&mult) ne NUMERIC %then...

The macro is smart enough to recognise 1.23E3 as numeric, i.e. 1230.
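As a fuller illustration, here is a minimal sketch of how the checks above might sit inside a complete macro. The error messages, the empty-value check and the body of the DATA step are my assumptions for the sake of the example; only the %DATATYP tests come from the snippet above.

```sas
/* Sketch only: the messages and DATA step body are assumed for illustration */
%macro multiplier(data=,out=,var=,mult=);
  /* Reject an empty MULT= before asking %DATATYP about it */
  %if %length(&mult) eq 0 %then %do;
    %put ERROR: MULTIPLIER: MULT= is required.;
    %return;
  %end;
  /* A data set name should not look like a number */
  %if %datatyp(&data) ne CHAR %then %do;
    %put ERROR: MULTIPLIER: DATA=&data is not a valid data set name.;
    %return;
  %end;
  /* The multiplier must be numeric, e.g. 2, 1.5 or 1.23E3 */
  %if %datatyp(&mult) ne NUMERIC %then %do;
    %put ERROR: MULTIPLIER: MULT=&mult is not numeric.;
    %return;
  %end;

  data &out;
    set &data;
    &var = &mult * &var;
  run;
%mend multiplier;

/* Passes validation and runs the DATA step */
%multiplier(data=sashelp.class,out=work.scaled,var=height,mult=1.5);

/* Fails validation: an error is written to the log and no DATA step runs */
%multiplier(data=sashelp.class,out=work.oops,var=height,mult=abc);
```

Stopping with %RETURN as soon as a check fails keeps the failure close to its cause, which is the whole point of validating early.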

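An alternative DATA step approach is to use the INPUT function with the question mark (?) or double question mark (??) modifier. Both suppress the invalid-data messages that would otherwise be written to the log; the double question mark additionally stops the automatic variable _ERROR_ from being set to 1.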

/* Check that Y (a char var) contains a valid numeric value */
if input(y,??best.) eq . then
  ... <not numeric>
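A self-contained sketch of the same idea follows. The data set name, the STATUS variable and the sample values are made up for illustration.

```sas
/* Sketch only: WORK.CHECKED, STATUS and the sample values are assumed */
data work.checked;
  length y $12 status $11;
  input y $;
  /* ?? suppresses the invalid-data messages and leaves _ERROR_ at 0 */
  if missing(input(y,?? best12.)) then status = 'not numeric';
  else status = 'numeric';
  datalines;
123
1.23E3
abc
12x
;
run;

proc print data=work.checked;
run;
```

One caveat with this style of test: a blank value, or a single period, also converts to missing, so it is reported as "not numeric" too. Whether that is what you want depends on how you treat missing values.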


Whatever approach you take, a little parameter validation is better than none and will undoubtedly repay you at some point in the future.

Tuesday, 24 May 2011

NOTE: Coupling, Bad

In my recent article on %MEND I said I didn’t like nested macro definitions. Some of my correspondents have suggested it’s a good means of keeping macro code near to where it’s called. I think this suggests a bad approach; namely, the inner workings of the sub-macro should not be relevant to the caller – the important element is the interface and outward behaviour, thus the definition of the sub-macro need not be located near to where it's used. (Plus, if it's used in more than one place it cannot possibly be located near to both.)

The design principle involved here is that of "loose coupling". Good design practice suggests that objects, components and modules in your applications should make use of little or no knowledge of the internal implementation of the other objects, components and modules with which they interact. Designing a loosely-coupled system makes it easier to change one object without impacting another (provided the interface and outward behaviour are not changed); this, in turn, means your applications become more maintainable and reliable.

For example, I am loosely-coupled with my car, i.e. I have little or no understanding of how it works beyond my knowledge of its steering wheel, gear lever and pedals (its interface). I don't need any knowledge of how the engine or gearbox work in order to drive it. The advantage to me is that I can hop into almost any car and drive it just as effectively; the advantage for the car manufacturer is they can sell their cars to a wide range of people without needing to train them on how to use the specific model of car.

For macros in particular, it is very easy to write something that requires the caller to know things about the inner workings of the macro, e.g. the macro may expect certain global macro variables to be defined, or it may write its output to other global macro variables, or it may read from or write to specific data sets. It is far better to define these things as part of the macro's parameter interface; it then becomes much easier to understand what the macro wants as input and what it might provide as output. Consider these two macros:


No parameter interface:

%macro demo;
  data beta;
    set alpha;
    x = 2 * x;
  run;
%mend;

Called thus:

%demo;

Clear parameter interface:

%macro doubler(data=,out=,var=);
  data &out;
    set &data;
    &var = 2 * &var;
  run;
%mend doubler;

Called thus:

%doubler(data=first
        ,out=second
        ,var=profit);

The second, with the clear parameter interface, does not demand that its user knows the names of the input and output data sets; the interface and function of the macro are already clear. Thus it is easier to enhance the macro without "breaking" any code that uses it.
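The same thinking applies to macros that hand a value back to the caller. Rather than writing to a hard-coded global macro variable, the macro can accept the name of the variable to populate. The following is a sketch only; the macro name NOBS and its OUTVAR= parameter are invented for this example.

```sas
/* Sketch only: NOBS and OUTVAR= are hypothetical names */
%macro nobs(data=,outvar=);
  %local dsid;
  %let dsid = %sysfunc(open(&data));
  %if &dsid %then %do;
    /* NLOBS = number of logical (non-deleted) observations */
    %let &outvar = %sysfunc(attrn(&dsid,nlobs));
    %let dsid = %sysfunc(close(&dsid));
  %end;
  %else %put ERROR: NOBS: could not open &data..;
%mend nobs;

/* The caller creates the macro variable and chooses its name */
%let classrows = ;
%nobs(data=sashelp.class,outvar=classrows);
%put NOTE: SASHELP.CLASS has &classrows rows.;
```

The caller needs to know nothing about the macro beyond its two parameters, and two callers can use different variable names without treading on each other.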

The example is just a simple one, but the principle has greater and greater value as your applications and their components get larger. The topic is much bigger than I can describe here. It's difficult to find references that don't go into (non-SAS) coding examples. If you're brave, you can try Martin Fowler's classic Reducing Coupling from 2002; else take a look at Coupling and Cohesion in the C2 wiki.

Monday, 23 May 2011

A New Look

If you're a regular visitor to the NOTE: web site you'll instantly notice that it looks different. NOTE:'s 2nd birthday is approaching (in July) so we thought it was time for a make-over. We're pleased with the new, brighter appearance. Please tell us what you think...

Software Practice Advancement (BCS SPA)

I recently started a new contract in London. Having been based outside of London for the last couple of years, I'm being reminded of the benefits of working in the metropolis. One benefit is that I get to use public transport, thereby increasing my reading time; the second is that I am able to regularly attend the BCS SPA meeting on the first Wednesday of each month.

The British Computer Society (BCS) has many specialist sub-groups. One of these is the Software Practice Advancement (SPA) group. BCS SPA's aim is to:
"Share knowledge and experience about best and emerging practices for software development. In particular the group is concerned with good and efficient design, the positioning of new technologies, and the promotion of reflective, inclusive and balanced processes."
The group holds an event on the first Wednesday of every month, each with a guest speaker. These events are free to attend, and you don't need to be a member of either the BCS or SPA. I shall be attending June's event on the subject of estimation. As ever, the event will be preceded by complimentary sandwiches, and followed by beer in a nearby hostelry (usually the Coal Hole). If you are coming too, let me know.

Recent events include:
  • January - Thomas Power, CEO of Ecademy: From CSC to ORS - Recent Business Ideas on Social Marketing
  • February - Ed Seymour of Fujitsu UK: Agility and Quality in Software Development - The APT Approach
  • March - Barry Varley of ACUTEST: Successful delivery when you have no time to test
  • April - Benjamin Mitchell: Beyond Agile? Ideas & Experiences from Industry
  • May - How to design a flexible platform architecture: Lessons learned from the development of the Jazz platform
As you can see, they're a mixed bag, and some are more relevant to the SAS world than others. Nonetheless, I recommend you add yourself to the mailing list if Central London is accessible to you (you don't have to join BCS, just follow the instructions in the last paragraph of "how to join").

Tuesday, 17 May 2011

NOTE: Inadequate Mends


%macro article;
In the last couple of weeks I’ve been confronted with a lot of macro code. I love SAS Macro language (is there a cure for this?) but it drives me nuts when the name of the macro is not appended to the respective %MEND statement.

If the code contains two macro definitions, one after the other, it is very easy to accidentally overlook the end of the first and the beginning of the second, and thereby conclude that the first macro ends with the %MEND statement of the second macro. I know, I just did it!

And if the programmer has embedded macro definitions within macro definitions it can be even easier to get confused over what macro definition you’re looking at. In fact, I don’t support the idea of embedding definitions within definitions.

Every %MEND statement should have the name of the macro appended to it; what reason is there to not do this? Put this in your Coding Standards and enforce it rigidly in your Peer Reviews!

And while I'm on my soapbox, I will add that indentation is meant to show the structure of your code, i.e. blocks of lines that belong to a higher-order element. That being the case, lines of code within a macro definition should be indented from the %macro and %mend statements.
%mend article;
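
To illustrate both points, here is a sketch of two adjacent macro definitions laid out that way. The macro names and their contents are invented for the example; only the layout matters.

```sas
/* Sketch only: macro names and bodies are hypothetical */
%macro sort_by_name(data=,out=);
  proc sort data=&data out=&out;
    by name;
  run;
%mend sort_by_name;

%macro print_it(data=);
  proc print data=&data;
  run;
%mend print_it;
```

With the name on each %MEND and the bodies indented, the boundary between the two definitions is hard to miss.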

Monday, 9 May 2011

NOTE: Know Your Customers (Before They Know You)

I'm a fan of BBC reporter Rory Cellan-Jones's blog. Just before jetting off to SAS Global Forum in April I noticed a most interesting article from Rory entitled World Stores - searching for retail success. The article talks of the success of a UK-based internet business who specialise in niche markets. The article describes how they research the popularity of Google search keywords in order to identify potential new markets. SAS software is very good at customer insight and relationship management, but this UK success story has taken "know your customers" to new extremes.

Talking of Google, I was amused to read another article from Rory, also in March, wherein he described Microsoft's competition claim against Google in the European court. I have no beef against either party, but the irony of "little Microsoft" being bullied by "big Google" is delicious after so many competition complaints against Microsoft. How the (technology) world has changed in so few years.

NOTE: Taking a Risk?

A tweet by Manoj Kulwal (global product manager for SAS Enterprise GRC) a couple of weeks ago drew my attention to a Compliance Week article describing how the US Securities and Exchange Commission (SEC) had (for the first time) brought financial penalties against individuals.

SAS have a number of offerings that touch the Governance, Risk & Compliance (GRC) domain, but SAS Enterprise GRC has the widest scope and allows an enterprise to build a single, consolidated register of all significant GRC elements (risks, policies, audits, etc).

With the SEC applying such focused penalties, we might expect enterprises to show even greater interest in the GRC domain over the coming months and years.

Tuesday, 3 May 2011

NOTE: PROC DELETE and Other Hits From the Past

Thanks for all your good wishes with regard to passing the magic 300 subscribers watermark. Needless to say, the number plummeted the day after I posted! I feel sure it'll recover in time.

Thanks also to those who have responded to other recent posts. @LaurieFleming (Wellington, NZ) has tweeted a few times, including a response to my "NOTE: Undocumented Features - Use Them at Your Peril!" post. Laurie said:
Interesting! I'd apply your caveat to the deprecated features as well like proc delete - use it, sure, but document it!
Well said Laurie. I confess to using PROC DELETE occasionally, because its syntax is neater than PROC DATASETS, but never in production code.

It feels like PROC DELETE has been part of the SAS system since the beginning (I feel sure somebody can verify the accuracy of this assertion), but I don't remember a time when I considered it supported. It has remained a part of the SAS system, but undocumented and unsupported. Its syntax (see below) is non-standard, i.e. there's no DATA parameter.

proc delete MYLIB.MYDATA;
run;

The equivalent (documented and supported) use of PROC DATASETS necessitates more typing:

proc datasets lib=MYLIB nolist;
  delete MYDATA;
  run;
quit;

One clear advantage of PROC DATASETS is its ability to delete multiple data sets whose names share a prefix, e.g. MYLIB.MYDATAKENT, MYLIB.MYDATASURREY, MYLIB.MYDATASUSSEX. A colon on the end of the prefix does the trick:

proc datasets lib=MYLIB nolist;
  delete MYDATA: ;
  run;
quit;

I used this technique in the "tidy" feature of the test harness macro that I described in my SAS Global Forum paper this year.
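
If you want to try the prefix deletion safely, the following sketch creates some throwaway data sets in WORK and then removes them. The TMP_ prefix and the data set names are assumptions made for the example.

```sas
/* Sketch only: the TMP_ prefix and data set names are hypothetical */
data work.tmp_kent work.tmp_surrey work.tmp_sussex;
  set sashelp.class;
run;

proc datasets lib=work nolist;
  delete tmp_: ;
  run;
quit;

/* EXIST returns 1 if the data set exists, 0 otherwise */
%put NOTE: WORK.TMP_KENT exists? %sysfunc(exist(work.tmp_kent));
```

Choose the prefix with care: the colon matches every member whose name begins with it, not just the ones you had in mind.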

Whilst I'm strolling along memory lane, and talking of deprecated PROCs, do you remember PROC SPELL and PROC EDITOR? The copy of SAS 9.2 M3 that I have in front of me on Windows 7 (64 bit) still runs code with these PROCs.

PROC SPELL is described neatly on sasCommunity.org. It allows you to check an input text file for matches with a specified file of words, i.e. a dictionary.

PROC EDITOR has no entry on sasCommunity.org. It allows programmatic editing of a data set.

data class;
  set sashelp.class;
run;

proc editor data=work.class;
  /* Row ptr starts at 1.  */
  /* Change NAME in row 4. */
  /* Length of NAME is $8, */
  /* so no "Katherine".    */
  down 3;
  replace name="Kate";

  /* Now change NAME in row 6. */
  /* No quotes, defaults to all-caps */
  down 2;
  replace name=Wills;

  /* Finally, search NAME */
  string name;
  search 1,last "Philip";
  replace age=90; /* dob=10-jun-1921 */
                  /* One day after mine (different year!) */
run;
The example code shows the most that I can remember of the syntax. It loosely implements Microsoft DOS's Edlin editor. The example code moves the pointer (DOWN) and replaces values of specified variables (REPLACE); the code also searches a specified variable (across a specified range of rows).

An interesting PROC. Its functionality is easily reproduced in DATA step, so its loss is not great.
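
For comparison, here is a sketch of a DATA step that makes the same three changes. It assumes the PROC EDITOR code behaves as its comments describe, including the upper-casing of the unquoted value.

```sas
/* Sketch only: mirrors the PROC EDITOR example, assuming it behaves as commented */
data work.class;
  set sashelp.class;
  if _n_ = 4 then name = 'Kate';     /* row 4 */
  if _n_ = 6 then name = 'WILLS';    /* row 6, upper-cased as noted above */
  if name = 'Philip' then age = 90;  /* the row found by the search */
run;
```

Here the row numbers are absolute rather than relative pointer movements, which I find easier to read back later.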

NOTE: SAS Professionals Convention 2011 - Registration is Open

I mentioned in a recent post that SAS Professionals Convention 2011 would be held from July 12 to 14 in Marlow. The SAS events web site has now been updated with this year's details. You can register for the event, and also respond to the call for papers if you'd like to share your experiences with fellow SAS practitioners.

Windows 7 Problem Steps Recorder

I was recently introduced to the Windows 7 Problem Steps Recorder (PSR). Wow, it's a well-hidden nugget of gold in Windows! Have you ever had the task of recording the steps you took in Windows? Perhaps to describe a problem, or maybe to create some user guide documentation, or perhaps to record your activities for audit purposes. Regulated industries such as pharmaceuticals demand that evidence of test execution be collected.

The PSR is a great tool for quickly documenting a series of steps that you take on your PC. Screenshots are automatically captured by the recorder (showing keystrokes and screen clicks) and comments can optionally be added to provide a more detailed description of what is happening. Once the recording is finished, the screenshots and comments are saved to a zip file.

To start the Problem Steps Recorder, click the Start menu, then type Problem in the search field; you will see the Problem Steps Recorder in the results list. Once it has opened, click the Start Record button. It's pretty simple to use, but if you need an overview you can refer to the PSR video on Microsoft TechNet; for further help, check the How Do I Use PSR article on the Microsoft web site.