QUESTION

A sports doctor wants to investigate factors affecting pulse rate. He has conducted a small study of ten young athletes and has collected the following data:

Pulse (Y): number of beats per minute
Weight (X1): weight, in kilograms
BP (X2): diastolic blood pressure
Exercise (X3): number of minutes of exercise per day
Age (X4): age, in years
Gender (X5): 0 for female, 1 for male

The following simple correlation matrix has been obtained using Minitab

Based on the simple correlation matrix, discuss the problem of multicollinearity in a model with Pulse as the dependent variable and containing all of the independent variables. Specify two different pairs of variables that should not be included in the model if one wishes to avoid multicollinearity.

The model would exhibit substantial multicollinearity because several of the independent variables are highly correlated with one another, e.g.,

r(Weight, Gender) = 0.913
r(Weight, Exercise) = 0.812
r(Age, BP) = -0.737

Therefore, Weight and Gender should not appear in the model together.

Also, Age and BP should not appear in the model together.

Other combinations are possible.
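As a rough illustration of the screening step, the sketch below flags the quoted pairs and converts each correlation into the variance inflation factor that would apply if the two variables were the only predictors in the model, VIF = 1/(1 - r^2). The 0.7 cut-off and the names `reported_r`, `THRESHOLD`, and `pairwise_vif` are assumptions made for this example; only the three correlations come from the answer above.

```python
# Pairwise correlations quoted in the answer (taken from the Minitab matrix).
reported_r = {
    ("Weight", "Gender"): 0.913,
    ("Weight", "Exercise"): 0.812,
    ("Age", "BP"): -0.737,
}

# Assumed rule-of-thumb cut-off for "highly correlated"; not from the source.
THRESHOLD = 0.7


def pairwise_vif(r):
    """VIF for a predictor when one correlated variable is the only other predictor."""
    return 1.0 / (1.0 - r ** 2)


# Keep only the pairs whose absolute correlation exceeds the cut-off.
flagged = {
    pair: round(pairwise_vif(r), 2)
    for pair, r in reported_r.items()
    if abs(r) > THRESHOLD
}

for (a, b), vif in flagged.items():
    print(f"{a} / {b}: r = {reported_r[(a, b)]:+.3f}, two-variable VIF = {vif}")
```

The sign of r does not matter here: a strong negative correlation such as Age with BP inflates the standard errors just as a positive one does.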

Consider the following regression models:

Regression Analysis: Pulse versus BP

The regression equation is

Pulse = 28.7 + 0.521 BP

Predictor    Coef      SE Coef   T      P
Constant     28.67     14.76     1.94   0.088
BP           0.5208    0.1766    2.95   0.018

S = 4.28145 R-Sq = 52.1% R-Sq(adj) = 46.1%

Analysis of Variance

Source           DF   SS       MS       F      P
Regression       1    159.35   159.35   8.69   0.018
Residual Error   8    146.65   18.33
Total            9    306.00
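The summary line (S, R-Sq, R-Sq(adj)) and the F statistic can all be recovered from the ANOVA table. The sketch below shows the arithmetic using the sums of squares printed for the Pulse versus BP model; the function name `anova_summary` and its argument names are assumptions made for this example, not Minitab terminology.

```python
import math


def anova_summary(ss_reg, ss_err, df_reg, df_err):
    """Recompute F, S, R-Sq and R-Sq(adj) from an ANOVA table."""
    ss_tot = ss_reg + ss_err
    df_tot = df_reg + df_err
    ms_reg = ss_reg / df_reg          # mean square for regression
    ms_err = ss_err / df_err          # mean square error
    return {
        "F": ms_reg / ms_err,
        "S": math.sqrt(ms_err),       # standard error of the estimate
        "R_sq": ss_reg / ss_tot,
        "R_sq_adj": 1.0 - ms_err / (ss_tot / df_tot),
    }


# Values from the Pulse versus BP output above.
bp_model = anova_summary(ss_reg=159.35, ss_err=146.65, df_reg=1, df_err=8)

for name, value in bp_model.items():
    print(f"{name}: {value:.4f}")
```

Small differences in the last digit relative to the Minitab output are expected, because the printed sums of squares are already rounded.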

Regression Analysis: Pulse versus BP, Weight

The regression equation is

Pulse = 22.9 + 0.723 BP – 0.150 Weight

Predictor    Coef      SE Coef   T       P
Constant     22.87     15.19     1.50    0.176
BP           0.7232    0.2421    2.99    0.020
Weight       -0.1503   0.1264    -1.19   0.273

S = 4.17453 R-Sq = 60.1% R-Sq(adj) = 48.7%

Analysis of Variance

Source           DF   SS       MS       F      P
Regression       2    184.01   92.01    5.28   0.040
Residual Error   7    121.99   17.43
Total            9    306.00

Perform a test of hypothesis at the 10% level of significance to determine if Weight should be added to the model containing ‘BP’.

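H0: the coefficient of Weight is zero; H1: the coefficient of Weight is not zero. The critical value of t is ± t.05;7 = ± 1.895.

Do not add Weight: its t-value is -1.19, which lies inside ± 1.895, and its p-value of 0.273 exceeds 0.10, so Weight is not significant at the 10% level.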
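The same decision can be reached with a partial F-test that compares the two ANOVA tables: the extra regression sum of squares gained by adding Weight, divided by the mean square error of the larger model. With one added variable this F equals the square of the t-value, up to rounding. The sketch below assumes the table value t.05;7 = 1.895 for the two-sided 10% test; the variable names are chosen for this example.

```python
# Sums of squares from the two Minitab outputs above.
ssr_reduced = 159.35   # Regression SS, Pulse versus BP
ssr_full = 184.01      # Regression SS, Pulse versus BP, Weight
mse_full = 17.43       # Residual MS of the larger model (7 df)
extra_terms = 1        # number of variables added (Weight)

# Partial F: extra variation explained per added term, relative to MSE.
partial_f = ((ssr_full - ssr_reduced) / extra_terms) / mse_full

# t-value reported for Weight in the larger model.
t_weight = -1.19

# Two-sided 10% critical value with 7 df, read from a standard t table.
T_CRIT_10PCT_DF7 = 1.895

add_weight = abs(t_weight) > T_CRIT_10PCT_DF7

print(f"partial F = {partial_f:.3f}, t^2 = {t_weight ** 2:.3f}")
print("Add Weight to the model:", add_weight)
```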

The sports doctor now decides to investigate an alternative model by regressing Pulse on Age and Gender.

The model is shown below:

Regression Analysis: Pulse versus Age, Gender

The regression equation is

Pulse = 113 – 2.42 Age + 0.57 Gender

Predictor    Coef     SE Coef   T       P
Constant     112.92   12.51     9.03    0.000
Age          -2.42    0.736     -3.29   0.013
Gender       0.57     2.634     0.22    0.835

S = 4.13897 R-Sq = 60.8% R-Sq(adj) = 49.6%

Analysis of Variance

Source           DF   SS       MS       F      P
Regression       2    186.08   93.04    5.43   0.038
Residual Error   7    119.92   17.13
Total            9    306.00

Based on the above output, and assuming a 5% level of significance, should both variables be retained in the model? Explain your answer.

For Age, the t-value is -2.42/0.736 = -3.288.

For Gender, the t-value is 0.57/2.634 = 0.216.

The critical value of t is ± t.025;7 = ± 2.365. In this model, Age is significant (|-3.288| > 2.365, p = 0.013) and Gender is not significant (|0.216| < 2.365, p = 0.835).

Therefore, retain Age but drop Gender.
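The decision rule above amounts to dividing each coefficient by its standard error and comparing the absolute value with the critical t. A minimal sketch, assuming the table value t.025;7 = 2.365 and using names (`coefficients`, `decisions`) invented for this example:

```python
# (coefficient, standard error) pairs from the Pulse versus Age, Gender output.
coefficients = {
    "Age": (-2.42, 0.736),
    "Gender": (0.57, 2.634),
}

# Two-sided 5% critical value with 10 - 2 - 1 = 7 degrees of freedom.
T_CRIT_5PCT_DF7 = 2.365

# t-value for each predictor: coefficient divided by its standard error.
t_values = {name: b / se for name, (b, se) in coefficients.items()}

# Retain a variable only if its |t| exceeds the critical value.
decisions = {name: abs(t) > T_CRIT_5PCT_DF7 for name, t in t_values.items()}

for name in coefficients:
    verdict = "retain" if decisions[name] else "drop"
    print(f"{name}: t = {t_values[name]:.3f} -> {verdict}")
```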

The complete model with all five independent variables is as follows:

Regression Analysis: Pulse versus Weight, BP, Exercise, Age, Gender

The regression equation is

Pulse = – 109 + 1.27 Weight + 0.703 BP – 0.0954 Exercise + 3.62 Age – 37.5 Gender

Predictor    Coef      SE Coef   T       P
Constant     -109.09   67.67     -1.61   0.182
Weight       1.27      0.47      2.72    0.053
BP           0.70      0.32      2.17    0.096
Exercise     -0.10     0.08      -1.20   0.295
Age          3.62      1.86      1.95    0.123
Gender       -37.52    12.25     -3.06   0.038

S = 2.76381 R-Sq = 90.0% R-Sq(adj) = 77.5%

Interpret the coefficients of Weight and Exercise in the above model.

Weight: holding all other variables fixed, a 1-kilogram increase in weight is associated with an average increase in pulse rate of 1.27 beats per minute.

Exercise: holding all other variables fixed, one extra minute of daily exercise is associated with an average decrease in pulse rate of 0.0954 beats per minute.
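One way to see what "all other variables fixed" means is to evaluate the fitted equation twice, changing a single input by one unit. In the sketch below the coefficients come from the regression equation above, while the athlete's values in `base` are hypothetical and chosen only for illustration; the difference between predictions does not depend on them.

```python
# Coefficients from the fitted equation for the complete model.
COEF = {
    "Constant": -109.09,
    "Weight": 1.27,
    "BP": 0.703,
    "Exercise": -0.0954,
    "Age": 3.62,
    "Gender": -37.52,
}


def predict_pulse(weight, bp, exercise, age, gender):
    """Predicted pulse (beats per minute) from the complete model."""
    return (COEF["Constant"]
            + COEF["Weight"] * weight
            + COEF["BP"] * bp
            + COEF["Exercise"] * exercise
            + COEF["Age"] * age
            + COEF["Gender"] * gender)


# Hypothetical athlete, not taken from the study data.
base = dict(weight=70, bp=80, exercise=60, age=20, gender=1)

# Change one input by one unit while leaving the others untouched.
heavier = dict(base, weight=base["weight"] + 1)
more_exercise = dict(base, exercise=base["exercise"] + 1)

weight_effect = predict_pulse(**heavier) - predict_pulse(**base)
exercise_effect = predict_pulse(**more_exercise) - predict_pulse(**base)

print(f"Effect of +1 kg: {weight_effect:+.4f} beats per minute")
print(f"Effect of +1 minute of exercise: {exercise_effect:+.4f} beats per minute")
```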

Based on this output, write up a brief recommendation advising young athletes regarding what actions they can take to reduce their pulse rate. Be specific in interpreting the model.

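Athletes cannot change their age or gender, but they can control their weight and the amount they exercise.

Therefore, they should lose weight and exercise more. According to the model, with the other variables held fixed, each kilogram lost corresponds to about 1.27 fewer beats per minute, and each extra minute of daily exercise to about 0.0954 fewer beats per minute.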

Since BP is highly correlated with weight (r = 0.703), it may not be possible to directly lower BP, but losing weight will likely result in a lower BP.

Construct a 95% confidence interval for the change in pulse rate (Pulse) as young athletes get older. Interpret the interval.

b4 ± t.025;4 s(b4) = 3.62 ± 2.776(1.86) = 3.62 ± 5.16

-1.54 ≤ β4 ≤ 8.78

Interpretation: Since the interval contains zero, we cannot reject H0: β4 = 0 at the 5% level of significance.

With 95% confidence, and holding the other variables fixed, an extra year of age may decrease pulse rate by as much as 1.54 beats per minute or increase it by up to 8.78 beats per minute.
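The interval can be reproduced with a few lines of arithmetic. The sketch below takes the Age coefficient and standard error from the complete model and assumes the table value t.025;4 = 2.776, where 4 = 10 - 5 - 1 residual degrees of freedom; the variable names are chosen for this example.

```python
# Age coefficient and its standard error from the complete model.
b_age = 3.62
se_age = 1.86

# Two-sided 95% critical value with 10 - 5 - 1 = 4 degrees of freedom.
T_CRIT_95_DF4 = 2.776

# Margin of error and interval end points.
margin = T_CRIT_95_DF4 * se_age
lower = b_age - margin
upper = b_age + margin

# An interval that covers zero means H0: beta4 = 0 cannot be rejected.
contains_zero = lower <= 0.0 <= upper

print(f"95% CI for the Age coefficient: ({lower:.2f}, {upper:.2f})")
print("Interval contains zero:", contains_zero)
```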