St. 412/512 Homework Assignment 5, Due in Lab Tuesday, 5/15/01 (the week after the midterm exam!)

 

1.      Use the data in the problem “Natal dispersal distances of mammals” (ex11.24) in the “New Data Problems” link on the course web page.  Do the following. (The T.A.s will demonstrate the commands needed for this exercise in lab.)

a.      Make a matrix of scatterplots with bodymass, type, and maxdist.  Are transformations of any variables indicated by this plot and by inspection of the data?  Make the necessary transformations and redraw the matrix of scatterplots.

b.      Make a coded scatterplot, using text as the plotting code (by including the variable “type” as column z and specifying the text for plotting code as column z).

c.       Make a trellis plot with individual fitted lines for log(maxdist) on log(bodymass) for the three levels of the factor type.

d.       Regress log(maxdist) on log(bodymass), type (as a factor), and the interaction of log(bodymass) and type.

e.       Obtain a plot of the case influence statistics as on the lecture notes, p. 84.

f.        Refit the model without the most influential observation. (You may need to insert a new column named “index” with fill expression 1:64.  Then, in regression, for “subset rows” use index != #, where # is replaced by the number of the row you wish to omit; != means “not equal.”)  Do the results change?  Take the appropriate action or inaction.

g.       Test whether the interaction term is significant, with an F-test. (This is most easily accomplished with the ANOVA output, as long as the interaction term was the last one entered in the regression equation.) 

h.       If the interaction term is non-significant, drop it from the model.

i.       Write a summary of statistical findings.
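
For part (g), the F-test from the ANOVA output can be checked by hand from the residual sums of squares of the models with and without the interaction. The Python sketch below is only an illustration and is not the course software; the function names and the residual sums of squares are hypothetical and are NOT computed from the ex11.24 data. It assumes 64 observations (as in the fill expression 1:64), 6 coefficients in the full model (intercept, log(bodymass), 2 type indicators, 2 interaction terms), and 4 in the reduced model. The closed-form p-value used here holds only when the numerator has exactly 2 degrees of freedom, as it does for this interaction.

```python
def extra_ss_f(rss_reduced, df_reduced, rss_full, df_full):
    """Extra-sum-of-squares F statistic for two nested regression models.

    df_reduced and df_full are residual degrees of freedom
    (number of observations minus number of fitted coefficients).
    Returns the F statistic and its numerator degrees of freedom.
    """
    extra_df = df_reduced - df_full
    if extra_df <= 0 or df_full <= 0:
        raise ValueError("the full model must use more coefficients than the reduced model")
    f_stat = ((rss_reduced - rss_full) / extra_df) / (rss_full / df_full)
    return f_stat, extra_df


def p_value_two_numerator_df(f_stat, df_full):
    """Upper-tail p-value of an F statistic with exactly 2 numerator df.

    For F(2, d) the tail area has the closed form (1 + 2F/d) ** (-d/2).
    Do not use this function for any other numerator df.
    """
    return (1.0 + 2.0 * f_stat / df_full) ** (-df_full / 2.0)


# Hypothetical residual sums of squares, for illustration only.
n = 64                      # assumed number of rows (fill expression 1:64)
df_full = n - 6             # model with the interaction terms
df_reduced = n - 4          # model without the interaction terms
f_stat, extra_df = extra_ss_f(rss_reduced=62.0, df_reduced=df_reduced,
                              rss_full=58.0, df_full=df_full)
p_value = p_value_two_numerator_df(f_stat, df_full)
print(f"F = {f_stat:.3f} on {extra_df} and {df_full} df, p = {p_value:.4f}")
```

If an observation is omitted in part (f), n and both residual degrees of freedom drop by one. The same calculation applies to problem 2(b), where the two pollution variables also give 2 numerator degrees of freedom.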

 

2.      Using the data for problem 11.23 in the book, “Air pollution and mortality,” do the following:
a.   Create new columns for the logarithms of NOx and SO2; consider case influence statistics (and the data description).  Do the following after resolving any influence problems:
b.   Find a p-value from the F-test for significance of either of the pollution variables (NOx or SO2) after accounting for precipitation, education, and nonwhite.

c.       Obtain a partial residual plot for log(NOx) when precipitation, education, and nonwhite are in the model (but not log(SO2)).

d.   Obtain a partial residual plot for log(SO2) when precipitation, education, and nonwhite are in the model (but not log(NOx)).

e.   Write a summary of statistical findings.
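
For parts (c) and (d), a partial residual for a variable is the residual from the fitted model plus that variable's estimated coefficient times its value; the partial residual plot shows these values against the variable itself. The Python sketch below is only an illustration and is not the course software. The function names and the toy data are hypothetical (the toy response is an exact linear function, so the answer is easy to verify), and nothing here is taken from the 11.23 data.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    X is a list of rows; each row should start with 1.0 for the intercept.
    """
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def partial_residuals(X, y, j):
    """Partial residuals for column j: model residual + b_j * x_j."""
    b = ols(X, y)
    result = []
    for row, yi in zip(X, y):
        fitted = sum(bk * xk for bk, xk in zip(b, row))
        result.append((yi - fitted) + b[j] * row[j])
    return result


# Toy data (hypothetical): y = 1 + 2*x1 + 3*x2 exactly, so residuals are zero.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
X = [[1.0, a, b] for a, b in zip(x1, x2)]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]

pres_x2 = partial_residuals(X, y, j=2)   # plot these against x2
for xi, pi in zip(x2, pres_x2):
    print(f"x2 = {xi:.1f}   partial residual = {pi:.3f}")
```

In the assignment, the column of interest would be log(NOx) for part (c) and log(SO2) for part (d), with precipitation, education, and nonwhite as the other columns.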

 

 

ANNOUNCEMENT

Schafer’s Thursday office hour must be moved to 1:00-2:00 pm to accommodate a PhD thesis defense.