Wednesday, 1 June 2011

NOTE: Length Functions (Something Missing?)

How many functions to tell you the length of a value do you need? At least six apparently! SAS provides LENGTH, LENGTHC, LENGTHM, LENGTHN, KLENGTH and %LENGTH. Why?...

As we've all discovered to our cost, the basic LENGTH function accurately tells us the length of a character string (excluding trailing blanks) unless the string is completely blank, in which case LENGTH misleadingly returns the value 1. That's why I always use LENGTHN; it returns the value zero for a blank string.
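
To make the difference concrete, here is a minimal sketch of the behaviour described above. The variable names (s, len_abc and so on) are just my own illustrative choices.

```sas
data _null_;
  length s $10;              /* s is declared with room for 10 characters */
  s = 'abc';
  len_abc    = length(s);    /* 3: trailing blanks are ignored */
  lenn_abc   = lengthn(s);   /* 3: same answer for a non-blank string */
  s = ' ';
  len_blank  = length(s);    /* 1: even though s is entirely blank */
  lenn_blank = lengthn(s);   /* 0: the answer we usually want */
  put len_abc= lenn_abc= len_blank= lenn_blank=;
run;
```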

I rarely use the others but, for the record, LENGTHC returns the length of a string including trailing blanks. Beware, though: it returns the value one when supplied with a null string as input.
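
A small sketch of how LENGTHC counts trailing blanks, set against LENGTHN for comparison. The variable names are hypothetical, and the variable s is assumed to be an ordinary fixed-length DATA step variable.

```sas
data _null_;
  length s $10;
  s = 'abc';
  n = lengthn(s);          /* 3: trailing blanks excluded */
  c = lengthc(s);          /* 10: the seven trailing blanks are counted */
  k = lengthc('abc  ');    /* 5: three letters plus two trailing blanks */
  put n= c= k=;
run;
```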

The LENGTHM function is a slightly different beast because it returns the declared length of the variable rather than the length of its contents, i.e. it returns what was specified on the variable's LENGTH statement (or implied when the variable was created). KLENGTH is another oddity. In essence, it is the DBCS equivalent of LENGTH. And %LENGTH is the macro equivalent of LENGTHN, i.e. it returns zero for a null/blank string.
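
By way of illustration, the following sketch shows LENGTHM reporting the declared length and %LENGTH returning zero for a null macro variable. The names s, empty and word are my own inventions for the example.

```sas
data _null_;
  length s $10;
  s = 'abc';
  m = lengthm(s);   /* 10: the declared length, whatever s contains */
  n = lengthn(s);   /* 3: the length of the contents */
  put m= n=;
run;

%let empty = ;      /* a null macro variable */
%let word  = abc;
%put empty: %length(&empty) word: %length(&word);   /* writes 0 and 3 to the log */
```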

Oh, there's a %KLENGTH too. And SAS/IML has a length function too, but let's not go there!

Why might we be using length functions? One popular use is to test if a variable is missing or null. For these cases, the MISSING or NMISS functions are often the best option - not least because their names make the purpose of their usage far clearer than using a length function.

The MISSING function returns 1 if the value passed to it is missing, and 0 otherwise. The value passed to it can be numeric or character. A character string is deemed to be missing if it is all blank or has zero length. Perfect! This is a far better choice than any of the length functions if you want to test a variable for a missing value.
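
Here is a brief sketch comparing the two styles of test. The variable names are hypothetical, and the LENGTHN comparison is included only to show the less readable alternative.

```sas
data _null_;
  length c $8;
  c = ' ';                           /* blank character value */
  x = .;                             /* missing numeric value */
  y = 0;                             /* zero is not missing */
  c_by_length = (lengthn(c) = 0);    /* 1, but the intent is not obvious */
  c_missing   = missing(c);          /* 1: the same test, clearly named */
  x_missing   = missing(x);          /* 1: works for numerics too */
  y_missing   = missing(y);          /* 0 */
  put c_by_length= c_missing= x_missing= y_missing=;
run;
```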

NMISS returns the number of missing values among the numeric arguments passed to it.
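
For example, a minimal sketch with three made-up numeric variables:

```sas
data _null_;
  a = 1;
  b = .;
  c = .;
  n_miss = nmiss(a, b, c);   /* 2: b and c are missing */
  any_miss = (n_miss > 0);   /* 1: at least one of the three is missing */
  put n_miss= any_miss=;
run;
```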

Finally, for completeness, I should mention CALL MISSING. You can use this routine to set character or numeric values to missing, though very few of us do.
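
A short sketch of the routine in use, with hypothetical variables name and total. One call handles both the character and the numeric variable.

```sas
data _null_;
  length name $8;
  name  = 'abc';
  total = 42;
  call missing(name, total);     /* name becomes blank, total becomes . */
  name_missing  = missing(name); /* 1 */
  total_missing = missing(total);/* 1 */
  put name_missing= total_missing=;
run;
```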