Creating and reducing observations with the
OUTPUT statement and PROC TRANSPOSE
(Section 5.12 & 5.13)

The OUTPUT statement can also be used to create several new observations from a single observation in the original dataset, or to generate only one observation from several.


Example 1: Creating several observations from one

data1:
ID	Class	Name	 Test1	Test2  
1	  A	John	  93	 85   
2	  C	Jill 	  86 	 75  
3	  B	Paul	  74	 88  
4	  C	Andrea	  85	 .  



DATA new;
  set data1;
  Test = 1; Score=test1; OUTPUT;
  Test = 2; Score=test2; OUTPUT;
  DROP test1 test2;
RUN;


new:
ID	Class	Name	 Test	Score  
1	  A	John	  1	 93   
1	  A	John	  2	 85   
2	  C	Jill 	  1 	 86  
2	  C	Jill 	  2 	 75  
3	  B	Paul	  1	 74  
3	  B	Paul	  2	 88  
4	  C	Andrea	  1	 85  
4	  C	Andrea	  2	 .  

This example creates 2 observations in the dataset new for every observation in the dataset data1. In data1, all test scores for each student was in 2 variables in the same observation. In new, each observation contains the score of a single test. In addition there is a variable to identify which test the score corresponds to.


Example 2: Creating one observation from several
data1:
ID	Class	Name	 Test	Score  
1	  A	John	  1	 93   
1	  A	John	  2	 85   
2	  C	Jill 	  1 	 86  
2	  C	Jill 	  2 	 75  
3	  B	Paul	  1	 74  
3	  B	Paul	  2	 88  
4	  C	Andrea	  1	 85  


DATA new;
  set data1;
  BY id;
  retain test1 test2;
  ARRAY testvar (*) test1-test2;

  IF FIRST.id THEN testnum = 1;
              ELSE testnum + 1;
  testvar(testnum) = score;
  IF LAST.id  THEN OUTPUT;
  DROP test score;

RUN;

new:
ID	Class	Name	 Test1	Test2  
1	  A	John	  93	 85   
2	  C	Jill 	  86 	 75  
3	  B	Paul	  74	 88  
4	  C	Andrea	  85	 .  


This example does the opposite of the previous example, actually taking the resulting dataset and converting it back to the original dataset. Note how FIRST. is used to identify the first observation for an ID and LAST. is used to identify the last observation.


There is actually a SAS procedure that will do the same thing as these two types of the OUTPUT keyword does. It is the TRANSPOSE procedure.


PROC TRANSPOSE

Syntax

PROC TRANSPOSE DATA=olddataset OUT=newdataset;
  BY byvar-list;
  ID variable;
  VAR var-list;
RUN; 

BY statement: This statement will preserve any groupings you have in your dataset and transpose within each group.

ID statement: The variable in this statement tells SAS the variable whose values are the names of the new variables.

VAR statement: These are the variables you want to transpose. The names of the variables will appear as the values of a new variable called _NAME_.



Example 3: PROC TRANSPOSE
data1:
ID	Class	Name	 Test1	Test2  
1	  A	John	  93	 85   
2	  C	Jill 	  86 	 75  
3	  B	Paul	  74	 88  
4	  C	Andrea	  85	 .  


PROC TRANSPOSE DATA=data1 OUT=new (rename=(_name_=test col1=score);
  BY id class name;
  VAR test1 test2;
RUN;


new:
ID	Class	Name	 Test	Score  
1	  A	John	 Test1	 93   
1	  A	John	 Test2	 85   
2	  C	Jill 	 Test1 	 86  
2	  C	Jill 	 Test2 	 75  
3	  B	Paul	 Test1	 74  
3	  B	Paul	 Test2	 88  
4	  C	Andrea	 Test1	 85  



Example 4: PROC TRANSPOSE
PROC TRANSPOSE DATA=new OUT=data2 (rename=(col1=test1 col2=test2) drop=_name_);
  BY id class name;
  VAR score;
RUN;
PROC TRANSPOSE has a few other statements and options available to it. See the SAS documentation for more information.