
Tutorial 4:
Multiple Linear Regression 1

Last updated:
31 Mar 2005

Assumed knowledge

  • Francis Section 3.1 "Relations Between Metric Variables"

Data files

Multiple regression in action

  1. What's your life expectancy? Work it out using this Life Expectancy Calculator
    1. According to this multiple regression equation, what could you do to improve your life expectancy?
    2. Estimate the unstandardised regression coefficients for each of the variables you could change in order to increase your life expectancy.
  2. A psychologist studying perceived "quality of life" in a large number of cities (N = 150) came up with the following equation using mean temperature (Temp), median income in $1000 (Income), per capita expenditure on social services (SocSer), and population density (Popul) as predictors:
    Y (predicted) = 5.37 - 0.01Temp + 0.05Income + 0.003SocSer - 0.01Popul
    1. Interpret the regression equation in terms of its coefficients. In other words, what is the effect of each IV on Y?
    2. Assume a city has a mean temperature of 55 degrees, a median income of $12,000, spends $500 per capita on social services, and has a population density of 200 people per block.  What is the predicted Quality of Life score?
    3. What would we predict in a different city that is identical in every way except that it spends $100 per capita on social services?
      (see Howell, p.550-551 for answers)
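
To check a hand calculation for parts 2 and 3, the equation can be evaluated directly. The following is a minimal Python sketch; the function and variable names are illustrative only, and Income is entered in $1000 units, as the equation specifies.

```python
# Coefficients copied from the regression equation in the exercise.
INTERCEPT = 5.37
COEFFICIENTS = {"Temp": -0.01, "Income": 0.05, "SocSer": 0.003, "Popul": -0.01}


def predict_quality_of_life(city):
    """Return predicted Y for a dict mapping predictor names to values."""
    return INTERCEPT + sum(b * city[name] for name, b in COEFFICIENTS.items())


# Income is in $1000 units, so a median income of $12,000 is entered as 12.
city_a = {"Temp": 55, "Income": 12, "SocSer": 500, "Popul": 200}
# Same city, except social services spending is $100 per capita.
city_b = dict(city_a, SocSer=100)

pred_a = predict_quality_of_life(city_a)
pred_b = predict_quality_of_life(city_b)

print(round(pred_a, 2), round(pred_b, 2))
# The difference depends only on SocSer: 0.003 * (500 - 100).
print(round(pred_a - pred_b, 2))
```

Because the two cities differ on one predictor only, the difference between the two predictions is that predictor's unstandardised coefficient multiplied by the change in its value.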

General advice

The general recommended strategy for tackling Multiple Linear Regression analyses is:

  1. Check assumptions (see below)
  2. Conduct a multiple linear regression (standard, hierarchical, stepwise, forward, or backward)
  3. Interpret the technical and psychological meaning of the results, based on:
    1. R, R², Adjusted R², and the statistical significance of R
    2. Changes in R and the significance of the changes if steps (i.e., more than one model) are used
    3. Standardised and unstandardised regression coefficients for each model
    4. Zero-order and partial correlations for each IV in each model
  4. If useful, interpret Y-intercept and write a regression equation for predicting Y
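
To see where the quantities in step 3 come from, here is a rough Python sketch of a standard (simultaneous-entry) multiple regression fitted by ordinary least squares. It is not how SPSS computes its output. The data are made up purely for illustration, and the names `solve`, `fit_ols`, `toy_x` and `toy_y` are hypothetical.

```python
import math
import statistics


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(xs, y):
    """Regress y on the predictors in xs (one row of X values per case)."""
    n, k = len(y), len(xs[0])
    design = [[1.0] + list(row) for row in xs]  # leading 1 gives the intercept
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k + 1)]
           for i in range(k + 1)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k + 1)]
    b = solve(xtx, xty)  # b[0] is the Y-intercept, b[1:] are the slopes

    fitted = [sum(bi * xi for bi, xi in zip(b, r)) for r in design]
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

    r2 = 1 - ss_resid / ss_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    # F tests the significance of R, with k and n - k - 1 degrees of freedom.
    f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
    # Standardised coefficient = unstandardised slope * SD(X) / SD(Y).
    sd_y = statistics.stdev(y)
    beta = [b[j + 1] * statistics.stdev([row[j] for row in xs]) / sd_y
            for j in range(k)]
    return {"b": b, "beta": beta, "r": math.sqrt(r2), "r2": r2,
            "adj_r2": adj_r2, "f": f_stat, "df": (k, n - k - 1)}


# Made-up data: six cases, two predictors.
toy_x = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
toy_y = [3.1, 3.9, 7.2, 7.8, 11.1, 12.0]

result = fit_ols(toy_x, toy_y)
for name, value in result.items():
    print(name, value)
```

The sketch stops at the F statistic. Obtaining the p value requires the F distribution, which SPSS reports in the ANOVA table.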

Checking Assumptions

  • Check histograms of all variables in an analysis
    (are the variables normally distributed?)
  • Check scatterplots of the relation between each X variable and the Y variable
    (are the relationships linear?  is there homoscedasticity?)
  • Check correlation table for linear relations between Xs and Y, and among the Xs
    (are the X-Y relationships linear?  is there multicollinearity between the Xs?)
  • Check influential outlying cases using Mahalanobis distance & Cook’s D.
    • In the Linear Regression box, click on Save and select Mahalanobis and Cooks. SPSS will create new variables in your data file called mah_1 and coo_1 once you run the analysis.
    • In your output check the Residuals Statistics table for the maximum Mahalanobis distance and Cook’s distance.
    • The maximum Mahalanobis distance should not be greater than the critical chi-squared value with degrees of freedom equal to the number of predictors and alpha = .001.
    • Cook’s D should not be greater than 1. If you detect any outliers on either measure, consider removing the case from your analysis.
  • In your output check the collinearity statistics in the Coefficients table. The Variance Inflation Factor (VIF) should be <3 and tolerance should be >.3.
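
The outlier and collinearity cutoffs above can be applied mechanically once the values are out of SPSS. The sketch below is illustrative only: the function names and the example numbers are made up. The chi-squared critical values at alpha = .001 are taken from a standard table, and the table here covers only one to six predictors.

```python
# Upper-tail chi-squared critical values at alpha = .001,
# keyed by degrees of freedom (= number of predictors).
CHI2_CRIT_001 = {1: 10.828, 2: 13.816, 3: 16.266,
                 4: 18.467, 5: 20.515, 6: 22.458}


def flag_cases(mahalanobis, cooks_d, n_predictors):
    """Return the positions of cases exceeding either cutoff.

    mahalanobis and cooks_d are the saved values (mah_1 and coo_1 in SPSS),
    given as equal-length lists in case order.
    """
    crit = CHI2_CRIT_001[n_predictors]
    return [i for i, (m, c) in enumerate(zip(mahalanobis, cooks_d))
            if m > crit or c > 1.0]


def collinearity_check(tolerance):
    """Return (VIF, passes) for one predictor, using VIF = 1 / tolerance."""
    vif = 1.0 / tolerance
    return vif, (vif < 3 and tolerance > 0.3)


# Made-up values for three cases in a model with four predictors.
flagged = flag_cases([2.1, 19.3, 5.0], [0.02, 0.40, 1.30], n_predictors=4)
print(flagged)  # positions are 0-based

print(collinearity_check(0.5))
print(collinearity_check(0.25))
```

Tolerance for a predictor is 1 minus the R² obtained when that predictor is regressed on the other predictors, and VIF is its reciprocal. The two cutoffs therefore nearly coincide: VIF < 3 corresponds to tolerance above about .33.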

Francis exercises

  • 5.1 (Worked example)

  • Exercises

    • 5.1
    • 5.2
    • 5.3
    • 5.4

Multiple regression in Excel