Micro-data and Aggregate Data

In almost all of your classes in high school and college, if you are asked to
do research, it means that you go read about some phenomenon.  And, if
the thing you are studying lends itself to quantitative description or analysis,
you generally only have available to you the aggregate data, presented in
tables or graphs.  For example, below is the frequency distribution (essentially,
just a table) showing the racial category of all the participants of the first wave
of the 1996 Survey of Income and Program Participation.  This is the kind of 
aggregated, descriptive data available in Census reports, journal articles, and 
books.  All you can say from this particular table is that 81.1% of the 
respondents were white, 14.2% black, etc.  You do not know any other 
characteristics of these individuals and therefore cannot comment on how 
one characteristic (in this case, race) might influence some other characteristic 
of these individuals (let's say, income).

 

                  Frequency Distribution for RACE

                      PE: Race of this person

                                              Cumulative   Cumulative
     RACE                Frequency   Percent   Frequency      Percent
     ----------------------------------------------------------------
     White                  308804      81.1      308804         81.1
     Black                   53888      14.2      362692         95.3
     American Indian,         4782       1.3      367474         96.5
     Asian or Pacific        13135       3.5      380609        100.0
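
To make the columns of the table concrete, here is a minimal sketch in Python that rebuilds the Percent and Cumulative columns from the Frequency column alone. The counts and category labels are copied from the table above (the labels are truncated there and are left truncated here); the variable and function names are only illustrative.

```python
# Frequencies copied from the table above; labels kept as printed there.
frequencies = [
    ("White", 308804),
    ("Black", 53888),
    ("American Indian,", 4782),
    ("Asian or Pacific", 13135),
]

def frequency_table(counts):
    """Return rows of (label, freq, percent, cum_freq, cum_percent)."""
    total = sum(freq for _, freq in counts)
    rows = []
    running = 0
    for label, freq in counts:
        running += freq
        percent = round(100 * freq / total, 1)
        # Cumulative percent is computed from the running count,
        # not by adding up already-rounded percentages.
        cum_percent = round(100 * running / total, 1)
        rows.append((label, freq, percent, running, cum_percent))
    return rows

table = frequency_table(frequencies)
for label, freq, pct, cum_freq, cum_pct in table:
    print(f"{label:<18}{freq:>9}{pct:>8.1f}{cum_freq:>10}{cum_pct:>8.1f}")
```

Notice that nothing beyond these four counts is needed to reproduce the whole table, which is exactly why aggregate data cannot tell you anything about other characteristics of the respondents.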

In this course, you will have access to the individual records of real people.
So, for any given person you will have not only the 'race' variable but also the labor force
status, home-ownership status, gender, income, etc.  We call this kind of data "micro-data"
because an individual record is available for each survey participant.  Such data will
be available to you in the form of a file, readable in a spreadsheet as seen below:
 


Here, you can see that the first person has a score of "1" for 'race' (the person is
white), the person is 43 years old, and this person has a "2" for tenure (which is a
code for the fact that this person is a renter).  The next person is also white, is age
42, and is a renter.  The fourth person is black, age 45, and a renter.  From such
micro-data we can compute the statistics that we want.  For example, among
middle-aged adults, are there racial differences in home-ownership? We will be able to
group all the people by race and age and see what percentage of each group
are home-owners.  You won't be able to do these calculations this term, but you
can begin to imagine how you will structure your research for next term.
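
As a rough preview of that kind of calculation, here is a minimal sketch in Python. Everything in it beyond the two codes given in the text (race 1 = white, tenure 2 = renter) is an assumption made for illustration: the code 2 for black, the code 1 for home-owner, the 40-to-59 definition of "middle-aged", and the records themselves, which are invented and are not SIPP data.

```python
from collections import defaultdict

# Invented micro-data: one (race, age, tenure) tuple per person.
# From the text: race 1 = white, tenure 2 = renter.
# Assumed for this sketch: race 2 = black, tenure 1 = home-owner.
records = [
    (1, 43, 2),
    (1, 42, 2),
    (1, 51, 1),
    (2, 45, 2),
    (2, 48, 1),
    (1, 29, 1),  # too young for the assumed middle-aged band
    (1, 55, 1),
]

RACE_LABELS = {1: "White", 2: "Black"}
OWNER = 1                    # assumed tenure code for home-owners
MIDDLE_AGE = range(40, 60)   # assumed definition: ages 40 through 59

def ownership_rate_by_race(rows):
    """Percent of middle-aged people in each race group who own a home."""
    owners = defaultdict(int)
    totals = defaultdict(int)
    for race, age, tenure in rows:
        if age in MIDDLE_AGE:
            totals[race] += 1
            if tenure == OWNER:
                owners[race] += 1
    return {race: 100 * owners[race] / totals[race] for race in totals}

rates = ownership_rate_by_race(records)
for race, pct in sorted(rates.items()):
    print(f"{RACE_LABELS[race]}: {pct:.1f}% home-owners")
```

The point of the sketch is the structure, not the numbers: because each record carries race, age, and tenure together, you can filter on one variable and group on another, which an aggregate table does not allow.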