统计建模与R软件 (Statistical Modeling and R Software): Solutions to the Chapter 6 Exercises (Regression Analysis)
Ex6.1
(1)
> x <- c(5.1, 3.5, 7.1, 6.2, 8.8, 7.8, 4.5, 5.6, 8.0, 6.4)
> y <- c(1907, 1287, 2700, 2373, 3260, 3000, 1947, 2273, 3113, 2493)
> plot(x, y)
The scatter plot suggests a linear relationship between Y and X.
(2)
> lm.sol <- lm(y ~ 1 + x)
> summary(lm.sol)
Call:
lm(formula = y ~ 1 + x)
Residuals:
     Min       1Q   Median       3Q      Max
-128.591  -70.978   -3.727   49.263  167.228
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   140.95     125.11   1.127    0.293
x             364.18      19.26  18.908 6.33e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 96.42 on 8 degrees of freedom
Multiple R-squared:  0.9781, Adjusted R-squared:  0.9754
F-statistic: 357.5 on 1 and 8 DF,  p-value: 6.33e-08
The fitted regression equation is Y = 140.95 + 364.18X.
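As a cross-check on the fitted equation, the estimates can be computed directly from the closed-form least-squares expressions b1 = Sxy/Sxx and b0 = mean(y) - b1*mean(x). The sketch below uses the data from part (1). The names `sxx`, `sxy`, `b0`, `b1`, `manual`, and `fitted_coef` are introduced here for illustration and are not part of the original solution.

```r
# Data from part (1)
x <- c(5.1, 3.5, 7.1, 6.2, 8.8, 7.8, 4.5, 5.6, 8.0, 6.4)
y <- c(1907, 1287, 2700, 2373, 3260, 3000, 1947, 2273, 3113, 2493)

# Closed-form least-squares estimates (illustrative names)
sxx <- sum((x - mean(x))^2)                 # sum of squares of x about its mean
sxy <- sum((x - mean(x)) * (y - mean(y)))   # cross-product sum
b1 <- sxy / sxx                             # slope
b0 <- mean(y) - b1 * mean(x)                # intercept

# Compare with lm(); the two should agree up to floating-point error
manual <- c(b0, b1)
fitted_coef <- unname(coef(lm(y ~ 1 + x)))
print(round(manual, 2))
print(all.equal(manual, fitted_coef))
```

Agreement between `manual` and `fitted_coef` confirms that `lm` is solving the ordinary least-squares problem for this one-predictor model.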
(3)
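The slope β1 is highly significant, but the intercept β0 is not significant.
The regression equation as a whole is highly significant.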
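The significance statements above come from the two-sided t tests in the coefficient table. As a minimal sketch, the p-values can be recomputed from the estimates and standard errors with `pt`. The 0.05 cutoff and the names `tab`, `t_stat`, `p_manual`, and `significant` are assumptions made for this illustration.

```r
# Data from part (1)
x <- c(5.1, 3.5, 7.1, 6.2, 8.8, 7.8, 4.5, 5.6, 8.0, 6.4)
y <- c(1907, 1287, 2700, 2373, 3260, 3000, 1947, 2273, 3113, 2493)

fit <- lm(y ~ 1 + x)
tab <- coef(summary(fit))   # matrix: Estimate, Std. Error, t value, Pr(>|t|)

# t statistic = estimate / standard error; two-sided p-value from the t distribution
t_stat <- tab[, "Estimate"] / tab[, "Std. Error"]
p_manual <- 2 * pt(abs(t_stat), df = df.residual(fit), lower.tail = FALSE)

# Assumed significance level of 0.05
significant <- p_manual < 0.05
print(cbind(t_stat, p_manual, significant))
```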
(4)
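> new <- data.frame(x = 7)
> lm.pred <- predict(lm.sol, new, interval = "prediction")
> lm.pred
       fit      lwr      upr
1 2690.227 2454.971 2925.484
Hence the predicted value is Y(7) = 2690.227, with prediction interval [2454.971, 2925.484] (at the default 95% level).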
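The prediction interval can also be reproduced by hand, which makes its ingredients explicit: the point prediction plus or minus a t quantile times s*sqrt(1 + 1/n + (x0 - mean(x))^2/Sxx). The sketch below assumes the 95% level that `predict` uses by default. The names `x0`, `half`, `manual`, and `builtin` are illustrative.

```r
# Data from part (1)
x <- c(5.1, 3.5, 7.1, 6.2, 8.8, 7.8, 4.5, 5.6, 8.0, 6.4)
y <- c(1907, 1287, 2700, 2373, 3260, 3000, 1947, 2273, 3113, 2493)

fit <- lm(y ~ 1 + x)
x0 <- 7
n <- length(x)
s <- summary(fit)$sigma                      # residual standard error
y0 <- sum(coef(fit) * c(1, x0))              # point prediction at x0

# Half-width of the 95% prediction interval for a single new observation
half <- qt(0.975, df = n - 2) * s *
  sqrt(1 + 1 / n + (x0 - mean(x))^2 / sum((x - mean(x))^2))
manual <- c(fit = y0, lwr = y0 - half, upr = y0 + half)

# Built-in result for comparison
builtin <- predict(fit, data.frame(x = x0), interval = "prediction", level = 0.95)
print(manual)
print(builtin)
```

The extra `1` under the square root is what distinguishes a prediction interval from a confidence interval for the mean response (`interval = "confidence"`), which would be narrower.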
Ex6.2
(1)
> pho <- data.frame(
+   x1 = c(0.4, 0.4, 3.1, 0.6, 4.7, 1.7, 9.4, 10.1, 11.6, 12.6, 10.9, 23.1, 23.1, 21.6, 23.1, 1.9, 26.8, 29.9),
+   x2 = c(52, 34, 19, 34, 24, 65, 44, 31, 29, 58, 37, 46, 50, 44, 56, 36, 58, 51),
+   x3 = c(158, 163, 37, 157, 59, 123, 46, 117, 173, 112, 111, 114, 134, 73, 168, 143, 202, 124),
+   y  = c(64, 60, 71, 61, 54, 77, 81, 93, 93, 51, 76, 96, 77, 93, 95, 54, 168, 99))
> # Note: columns are named with "=", not "<-"; "<-" inside data.frame() creates
> # global variables and leaves the columns with mangled names.
> lm.sol <- lm(y ~ x1 + x2 + x3, data = pho)
> summary(lm.sol)
Call:
lm(formula = y ~ x1 + x2 + x3, data = pho)
Residuals:
    Min      1Q  Median      3Q     Max
-27.575 -11.160  -2.799  11.574  48.808
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  44.9290    18.3408   2.450  0.02806 *
x1            1.8033     0.5290   3.409  0.00424 **
x2           -0.1337     0.4440  -0.301  0.76771
x3            0.1668     0.1141   1.462  0.16573
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 19.93 on 14 degrees of freedom
Multiple R-squared:  0.551, Adjusted R-squared:  0.4547
F-statistic: 5.726 on 3 and 14 DF,  p-value: 0.009004
The fitted regression equation is y = 44.9290 + 1.8033 x1 - 0.1337 x2 + 0.1668 x3.
(2)
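The regression equation is significant overall (F test, p = 0.009), but the coefficients of x2 and x3 are not significant.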
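The overall F statistic is tied to R-squared by F = (R²/p) / ((1 - R²)/(n - p - 1)). As a small worked check, the sketch below plugs in the rounded values reported in the summary for part (1): R² = 0.551, n = 18 observations, p = 3 predictors. The helper `f_from_r2` is a hypothetical name introduced for this illustration.

```r
# Overall F statistic recovered from R-squared (hypothetical helper)
f_from_r2 <- function(r2, n, p) {
  (r2 / p) / ((1 - r2) / (n - p - 1))
}

# Rounded values taken from the summary output in part (1)
f_full <- f_from_r2(r2 = 0.551, n = 18, p = 3)

# Upper-tail probability of the F distribution with (3, 14) degrees of freedom
p_full <- pf(f_full, df1 = 3, df2 = 14, lower.tail = FALSE)

print(round(f_full, 3))
print(p_full)
```

Because R² is rounded to three digits in the printed summary, the recomputed F agrees with the reported 5.726 only to about two decimal places.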
(3)
> lm.step <- step(lm.sol)
Start:  AIC=111.2
y ~ x1 + x2 + x3
       Df Sum of Sq     RSS   AIC
- x2    1      36.0  5599.4 109.3
- x3    1     849.8  6413.1 111.8
- x1    1    4617.8 10181.2 120.1
Step:  AIC=109.32
y ~ x1 + x3
       Df Sum of Sq     RSS   AIC
- x3    1     833.2  6432.6 109.8
- x1    1    5169.5 10768.9 119.1
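For linear models, `step` ranks candidates using `extractAIC`, which computes n*log(RSS/n) + 2*k, where k is the number of estimated coefficients. The sketch below reproduces the AIC column from the RSS values printed above. The helper `aic_step` is a hypothetical name, and the full-model RSS of 5563.4 is derived here as 5599.4 - 36.0 rather than read from the output.

```r
# AIC as used by step() for lm fits: n * log(RSS / n) + 2 * k
# (aic_step is an illustrative helper, not part of the original solution)
aic_step <- function(rss, n, k) {
  n * log(rss / n) + 2 * k
}

n <- 18  # number of observations in pho

# RSS values copied from the step() trace; k counts the intercept too
aic_full  <- aic_step(5599.4 - 36.0, n, k = 4)  # y ~ x1 + x2 + x3 (RSS derived)
aic_x1_x3 <- aic_step(5599.4,  n, k = 3)        # y ~ x1 + x3
aic_x1    <- aic_step(6432.6,  n, k = 2)        # y ~ x1
aic_x3    <- aic_step(10768.9, n, k = 2)        # y ~ x3

print(round(c(full = aic_full, x1_x3 = aic_x1_x3, x1 = aic_x1, x3 = aic_x3), 2))
```

This shows why the search stops at y ~ x1 + x3: dropping x3 would raise the AIC from about 109.3 to about 109.8, so by the AIC criterion no further deletion is an improvement.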
> summary(lm.step)
Call:
lm(formula = y ~ x1 + x3, data = pho)
Residuals:
    Min      1Q  Median      3Q     Max
-29.713 -11.324  -2.953  11.286  48.679
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  41.4794    13.8834   2.988  0.00920 **
x1            1.7374     0.4669   3.721  0.00205 **
x3            0.1548     0.1036   1.494  0.15592
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 19.32 on 15 degrees of freedom
Multiple R-squared:  0.5481, Adjusted R-squared:  0.4878
F-statistic: 9.095 on 2 and 15 DF,  p-value: 0.002589
The coefficient of x3 is still not significant (p = 0.156).
Use the drop1 function to continue the stepwise procedure.
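> drop1(lm.step)
Single term deletions
Model:
y ~ x1 + x3
       Df Sum of Sq     RSS   AIC
x1      1    5169.5 10768.9 119.1
x3      1     833.2  6432.6 109.8
Dropping x3 increases the RSS and the AIC only slightly, so removing x3 as well is worth considering.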
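One way to quantify the case for removing x3 is a partial F test built from the drop1 table: F = (increase in RSS / 1) / (RSS of the current model / residual df). The sketch below uses the printed values 833.2 and 5599.4, with 15 residual degrees of freedom for y ~ x1 + x3. The variable names are illustrative, and the 0.05 level is an assumed convention.

```r
# Values copied from the output for the model y ~ x1 + x3
ss_drop_x3 <- 833.2    # increase in RSS when x3 is removed
rss_current <- 5599.4  # RSS of y ~ x1 + x3
df_resid <- 15         # residual degrees of freedom of y ~ x1 + x3

# Partial F statistic for a single dropped term, and its p-value
f_x3 <- (ss_drop_x3 / 1) / (rss_current / df_resid)
p_x3 <- pf(f_x3, df1 = 1, df2 = df_resid, lower.tail = FALSE)

# For a single coefficient, sqrt(F) equals |t| from the coefficient table
print(round(c(F = f_x3, sqrt_F = sqrt(f_x3), p = p_x3), 4))
```

The square root of this F statistic matches the t value of 1.494 reported for x3, so the partial F test and the t test lead to the same conclusion: x3 is not significant at the 0.05 level.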
> lm.opt <- lm(y ~ x1, data = pho); summary(lm.opt)
Call:
lm(formula = y ~ x1, data = pho)
Residuals:
    Min      1Q  Median      3Q     Max
-31.486  -8.282  -1.674   5.623  59.337
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  59.2590     7.4200   7.986 5.67e-07 ***
x1            1.8434     0.4789   3.849  0.00142 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.05 on 16 degrees of freedom
Multiple R-squared:  0.4808, Adjusted R-squared:  0.4484
F-statistic: 14.82 on 1 and 16 DF,  p-value: 0.001417
All coefficients are now significant, as is the regression equation.
Ex6.3
> x <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 11, 12, 12, 12)
> y <- c(0.6, 1.6, 0.5, 1.2, 2.0, 1.3, 2.5, 2.2, 2.4, 1.2, 3.5, 4.1, 5.1, 5.7, 3.4, 9.7, 8.6, 4.0, 5.5, 10.5, 17.5, 13.4, 4.5, 30.4, 12.4, 13.4, 26.2, 7.4)
> plot(x, y)
> lm.sol <- lm(y ~ 1 + x)
> summary(lm.sol)
Call:
lm(formula = y ~ 1 + x)
Residuals:
    Min      1Q  Median      3Q     Max
-9.8413 -2.3369 -0.0214  1.0592 17.8320
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -1.4519     1.8353  -0.791    0.436
x             1.5578     0.2807   5.549 7.93e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 5.168 on 26 degrees of freedom
Multiple R-squared:  0.5422, Adjusted R-squared:  0.5246
F-statistic: 30.8 on 1 and 26 DF,  p-value: 7.931e-06
The fitted linear regression equation is y = -1.4519 + 1.5578x, and it passes the F test.
The intercept does not pass the t test.
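> abline(lm.sol)                 # add the fitted line to the scatter plot
> y.yes <- resid(lm.sol)         # ordinary residuals
> y.fit <- predict(lm.sol)       # fitted values
> y.rst <- rstandard(lm.sol)     # standardized residuals
> plot(y.yes ~ y.fit)
> plot(y.rst ~ y.fit)
The residual plots show that the residuals do not have constant variance.
To correct the model, take the square root of the response variable Y.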
> lm.new <- update(lm.sol, sqrt(.) ~ .)
> summary(lm.new)
Call:
lm(formula = sqrt(y) ~ x)
Residuals:
     Min       1Q   Median       3Q      Max
-1.54255 -0.45280 -0.01177  0.34925  2.12486
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.76650    0.25592   2.995  0.00596 **
x            0.29136    0.03914   7.444 6.64e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.7206 on 26 degrees of freedom
Multiple R-squared:  0.6806, Adjusted R-squared:  0.6684
F-statistic: 55.41 on 1 and 26 DF,  p-value: 6.645e-08
Now all coefficients and the regression equation pass their tests.
The standardized residual plot for the new model shows an improvement, although an outlier still remains.
Observations 24 and 28 are problematic.
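The problematic observations can be located numerically rather than read off the plot. The sketch below refits the square-root model and lists the points whose standardized residual exceeds 2 in absolute value. The cutoff of 2 is a common rule of thumb assumed here, not a criterion stated in the original solution, and the names `r_std` and `flagged` are illustrative.

```r
# Data from Ex6.3
x <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 6, 6, 6, 7, 7, 7,
       8, 8, 8, 9, 11, 12, 12, 12)
y <- c(0.6, 1.6, 0.5, 1.2, 2.0, 1.3, 2.5, 2.2, 2.4, 1.2, 3.5, 4.1, 5.1, 5.7,
       3.4, 9.7, 8.6, 4.0, 5.5, 10.5, 17.5, 13.4, 4.5, 30.4, 12.4, 13.4, 26.2, 7.4)

# Square-root transformed model, as in the solution
lm.new <- lm(sqrt(y) ~ x)

# Standardized residuals; flag |r| > 2 (assumed rule of thumb)
r_std <- rstandard(lm.new)
flagged <- which(abs(r_std) > 2)

# Show the flagged observations with their data and standardized residuals
print(data.frame(obs = flagged, x = x[flagged], y = y[flagged],
                 r_std = round(r_std[flagged], 3)))
```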
Ex6.4
> toothpaste <- data.frame(
+   X1 = c(-0.05, 0.25, 0.60, 0, 0.20, 0.15, -0.15, 0.15, 0.10, 0.40, 0.45, 0.35, 0.30, 0.50, 0.50, 0.40, -0.05, -0.05, -0.10, 0.20, 0.10, 0.50, 0.60, -0.05, 0, 0.05, 0.55),
+   X2 = c(5.50, 6.75, 7.25, 5.50, 6.50, 6.75, 5.25, 6.00, 6.25, 7.00, 6.90, 6.80, 6.80, 7.10, 7.00, 6.80, 6.50, 6.25, 6.00, 6.50, 7.00, 6.80, 6.80, 6.50, 5.75, 5.80, 6.80),
+   Y  = c(7.38, 8.51, 9.52, 7.50, 8.28, 8.75, 7.10, 8.00, 8.15, 9.10, 8.86, 8.90, 8.87, 9.26, 9.00, 8.75, 7.95, 7.65, 7.27, 8.00, 8.50, 8.75, 9.21, 8.27, 7.67, 7.93, 9.26))
> lm.sol <- lm(Y ~ X1 + X2, data = toothpaste); summary(lm.sol)
Call:
lm(formula = Y ~ X1 + X2, data = toothpaste)
Residuals:
     Min       1Q   Median       3Q      Max
-0.37130 -0.10114  0.03066  0.10016  0.30162
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.0759     0.6267   6.504 1.00e-06 ***
X1            1.5276     0.2354   6.489 1.04e-06 ***
X2            0.6138     0.1027   5.974 3.63e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1767 on 24 degrees of freedom
Multiple R-squared:  0.9378, Adjusted R-squared:  0.9327
F-statistic: 181 on 2 and 24 DF,  p-value: 3.33e-15
Regression diagnostics:
> influence.measures(lm.sol)
Influence measures of
  lm(formula = Y ~ X1 + X2, data = toothpaste) :
     dfb.1_   dfb.X1   dfb.X2   dffit cov.r   cook.d    hat inf
1   0.00908  0.00260 -0.00847  0.0121 1.366 5.11e-05 0.1681
2   0.06277  0.04467 -0.06785 -0.1244 1.159 5.32e-03 0.0537
3  -0.02809  0.07724  0.02540  0.1858 1.283 1.19e-02 0.1386
4   0.11688  0.05055 -0.11078  0.1404 1.377 6.83e-03 0.1843   *
5   0.01167  0.01887 -0.01766 -0.1037 1.141 3.69e-03 0.0384
6  -0.43010 -0.42881  0.45774  0.6061 0.814 1.11e-01 0.0936
7   0.07840  0.01534 -0.07284  0.1082 1.481 4.07e-03 0.2364   *
8   0.01577  0.00913 -0.01485  0.0208 1.237 1.50e-04 0.0823
9   0.01127 -0.02714 -0.00364  0.1071 1.156 3.95e-03 0.0466
10 -0.07830  0.00171  0.08052  0.1890 1.155 1.22e-02 0.0726
11  0.00301 -0.09652 -0.00365 -0.2281 1.127 1.76e-02 0.0735
12 -0.03114  0.01848  0.03459  0.1542 1.132 8.12e-03 0.0514
13 -0.09236 -0.03801  0.09940  0.2201 1.071 1.62e-02 0.0522
14 -0.02650  0.03434  0.02606  0.1179 1.235 4.81e-03 0.0956
15  0.00968 -0.11445 -0.00857 -0.2545 1.150 2.19e-02 0.0910
16 -0.00285 -0.06185  0.00098 -0.1608 1.146 8.83e-03 0.0594
17  0.07201  0.09744 -0.07796 -0.1099 1.364 4.19e-03 0.1731
18  0.15132  0.30204 -0.17755 -0.3907 1.087 5.04e-02 0.1085
19  0.07489  0.47472 -0.12980 -0.7579 0.731 1.66e-01 0.1092
20  0.05249  0.08484 -0.07940 -0.4660 0.625 6.11e-02 0.0384   *
21  0.07557  0.07284 -0.07861 -0.0880 1.471 2.69e-03 0.2304   *
22 -0.17959 -0.39016  0.18241 -0.5494 0.912 9.41e-02 0.1022
23  0.06026  0.10607 -0.06207  0.1251 1.374 5.42e-03 0.1804
24 -0.54830 -0.74197  0.59358  0.8371 0.914 2.13e-01 0.1731
25  0.08541  0.01624 -0.07775  0.1314 1.249 5.97e-03 0.1069
26  0.32556  0.11734 -0.30200  0.4480 1.018 6.49e-02 0.1033
27  0.17243  0.32754 -0.17676  0.4127 1.148 5.66e-02 0.1369
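To show how a single column of this table can be screened, the sketch below applies a leverage rule of thumb to the `hat` column: flag observation i when h_ii > 2(p + 1)/n. This cutoff is an assumption chosen for illustration. It is not necessarily the rule behind the `inf` stars, since `influence.measures` combines several criteria. The hat values are copied from the table above, and the names `h`, `cutoff`, and `high_leverage` are illustrative.

```r
# Hat values (diagonal of the hat matrix) copied from the influence table
h <- c(0.1681, 0.0537, 0.1386, 0.1843, 0.0384, 0.0936, 0.2364, 0.0823, 0.0466,
       0.0726, 0.0735, 0.0514, 0.0522, 0.0956, 0.0910, 0.0594, 0.1731, 0.1085,
       0.1092, 0.0384, 0.2304, 0.1022, 0.1804, 0.1731, 0.1069, 0.1033, 0.1369)

n <- length(h)  # 27 observations
p <- 2          # two predictors, X1 and X2

# Sanity check: the hat values sum to the number of coefficients, p + 1
print(round(sum(h), 2))

# Assumed rule of thumb: high leverage when h_ii > 2 * (p + 1) / n
cutoff <- 2 * (p + 1) / n
high_leverage <- which(h > cutoff)
print(round(cutoff, 4))
print(high_leverage)
```

With a fitted model at hand, `hatvalues(lm.sol)` returns the same quantities at full precision, so the copied vector is only a convenience for this sketch.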
> source("Reg_Diag.R"); Reg_Diag(lm.sol)   # Reg_Diag.R is a program written by Xue Yi (薛毅)
      residual s1    standard s2     student s3 hat_matrix s4      DFFITS s5
1   0.00443843     0.02753865     0.02695925    0.16811819     0.01211949
2  -0.09114255    -0.53021138    -0.52211469    0.05369239    -0.12436727
3   0.07726887     0.47112863     0.46335666    0.13857353     0.18584310
4   0.04805665     0.30111062     0.29532912    0.18427663     0.14036860
5  -0.09130271    -0.52689847    -0.51881406    0.03838430    -0.10365442
6   0.30162101     1.79287913     1.88596579    0.09362223     0.60613406
7   0.03066005     0.19855842     0.19453763    0.23641540  *  0.10824626
8   0.01199519     0.07085860     0.06937393    0.08226537     0.02077047
9   0.08491891     0.49217591     0.48426323    0.04664158     0.10711246
10  0.11625405     0.68315814     0.67537315    0.07261134     0.18897969
11 -0.13874451    -0.81570765    -0.80983786    0.07348894    -0.22807820
12  0.11540228     0.67051940     0.66263761    0.05137589     0.15420864
13  0.16178406