Labeling, renaming and formatting variables

Labeling a variable (Section 2.8)

You can give a variable a label that describes the values in it. A label can be up to 256 characters in length. When you use PROC CONTENTS to list the variables and attributes in your SAS dataset, it includes the label as one of the variable attributes. This is a simple and useful method of documenting your datasets.

To label a variable, use a LABEL statement and list each variable you want to label. After each variable, put an equal sign, "=", followed by the label. Separate the variable=label pairs with a space or a line break. The labels must be enclosed in quotation marks (single or double). After the last label, end the statement with a semicolon.


Example:

Suppose the dataset "data1" has 4 variables in it: ID, Student, Grade1 and Grade2.
DATA data1;
  SET data1;
  LABEL ID      = "Student identification number"
        Student = "Student name (last, first)"
        Grade1  = "Score of first exam (points out of 60)"
        Grade2  = "Score of midterm (points out of 100)"
        ;
RUN;

PROC CONTENTS DATA=data1 POSITION; 
RUN;

The keyword POSITION is optional. It adds a second listing to the PROC CONTENTS output in which the variables appear in the order they are stored in the dataset. Without the POSITION option, the variables are listed only in alphabetical order.

Renaming a variable

Sometimes you need to rename a variable. For example, you may want to combine two datasets that both have a variable with the same name but different data. Renaming the variable in one of the datasets keeps it from being lost when the datasets are combined.

The RENAME statement is similar to the LABEL statement. You start with the RENAME keyword and then list the variables you want to rename. After each variable, put an equal sign, "=", followed by the new variable name. Separate the old=new pairs with a space or a line break and end the statement with a semicolon.


Example:

DATA data2;
  SET data1;
  RENAME Grade1 = Exam1
         Grade2 = Midterm
         ;
RUN;
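
The situation described above, two datasets sharing a variable name, can be sketched as follows. The dataset names (fall, spring, spring2, both), the variable names (Score, SpringScore) and the data values are hypothetical and are used only for illustration. This is a minimal sketch, not the only way to handle the conflict.

```sas
* Hypothetical datasets: both contain a variable named Score;
DATA fall;
  INPUT ID Score;
  DATALINES;
1 52
2 47
;
RUN;

DATA spring;
  INPUT ID Score;
  DATALINES;
1 88
2 91
;
RUN;

* Rename Score in one dataset so the two variables do not collide;
DATA spring2;
  SET spring;
  RENAME Score = SpringScore;
RUN;

* Both datasets are already in ID order so they can be merged by ID;
DATA both;
  MERGE fall spring2;
  BY ID;
RUN;

PROC PRINT DATA=both;
RUN;
```

Without the RENAME step, the merged dataset would have a single Score variable, and the values from one dataset would overwrite those from the other. Note that a RENAME statement takes effect in the output dataset, so any other statements in the same DATA step still refer to the variable by its old name.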

Formats and Informats (Sections 2.6, 4.5-4.6)

Character: $UPCASEw., ...
Date/time: MMDDYYw., DATEw., HOURw.d, TIMEw.d, ...
Numeric: BESTw., w.d, DOLLARw.d, PERCENTw.d, ...
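
One way to see what these formats do is to write a few values to the SAS log with PUT statements. The variable names and values below are made up for illustration, and the widths (8, 10, 9) are arbitrary choices that are assumed to be wide enough for these values.

```sas
* Write sample values to the log using several of the formats listed above;
DATA _NULL_;
  Name   = "smith";
  Amount = 1234.5;
  Rate   = 0.256;
  Day    = '15MAR2021'D;
  PUT Name   $upcase8.;
  PUT Amount dollar10.2;
  PUT Amount 8.1;
  PUT Rate   percent8.1;
  PUT Day    mmddyy10.;
  PUT Day    date9.;
RUN;
```

Each PUT statement writes one line to the log, so the same stored value (for example Amount or Day) can be compared under different formats. The stored values themselves are not changed by a format.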
Notes:

To associate a format with a particular variable, use a FORMAT statement and list the variable followed by the format you want associated with it. If you want several variables to have the same format, list them all and follow the last one with the format name.

When you format a variable in a DATA step, that format is stored with the variable and remains until you remove or change it. When you format a variable in a PROC step, the format is associated with the variable only for that particular PROC step.
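
The difference between the two cases can be checked with PROC CONTENTS. The following is a minimal sketch that assumes a hypothetical dataset named pay with a numeric variable Salary. The names and values are illustrative only.

```sas
* Hypothetical dataset with one unformatted numeric variable;
DATA pay;
  INPUT ID Salary;
  DATALINES;
1 51250
2 48900.5
;
RUN;

* The format applies only to this PROC PRINT step;
PROC PRINT DATA=pay;
  FORMAT Salary dollar10.2;
RUN;

* Salary still has no stored format at this point;
PROC CONTENTS DATA=pay;
RUN;

* Assigning the format in a DATA step stores it with the variable;
DATA pay;
  SET pay;
  FORMAT Salary dollar10.2;
RUN;

* The format now appears among the attributes of Salary;
PROC CONTENTS DATA=pay;
RUN;
```

The first PROC CONTENTS listing should show no format for Salary, while the second should list the format assigned in the DATA step.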

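To remove a format, use a FORMAT statement, list the variable(s) you want to be "format-free" at the end of the statement with no format name after them, and end the statement with a semicolon. You can also change the format associated with a variable by simply assigning a new one. In the example below, Salary, DT_Birth, DT_Death and Weight are given formats, while Height is listed last with no format, so any format previously associated with it is removed.

Example: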

DATA data1;
  SET data1;
  FORMAT Salary dollar8.2  DT_Birth DT_Death mmddyy10. Weight 6.2 Height;
RUN;