Document ID: 7945425
Uploaded: 2023-01-27
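Nonparametric & Semiparametric Regression in R

Notes

Compiled by Zhan Peng (詹鹏); for exchange and study only.

Adapted from the lecture slides of Associate Professor Sun Ruibo (孙瑞博), Department of Statistics, Nanjing University of Finance and Economics. Many thanks to Professor Sun for his hard work!

Textbook:

Luke Keele: Semiparametric Regression for the Social Sciences. John Wiley & Sons, Ltd. 2008.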

-------------------------------------------------------------------------

Chapter 1 Introduction: Global versus Local Statistics

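I. Main references and remarks

1. Härdle (1994). Applied Nonparametric Regression. An early classic.

2. Härdle et al. (2004). Nonparametric and Semiparametric Models: An Introduction. Springer. Clearly structured.

3. Li and Racine (2007). Nonparametric Econometrics: Theory and Practice. Princeton. A fairly comprehensive and in-depth treatment; on the difficult side.

4. Pagan and Ullah (1999). Nonparametric Econometrics. A classic.

5. Yatchew (2003). Semiparametric Regression for the Applied Econometrician. Good examples.

6. Gao Tiemei (高铁梅) (2009). Econometric Analysis Methods and Modeling: EViews Applications and Examples (2nd ed.). Tsinghua University Press. (pp. 127/143)

7. Li Xuesong (李雪松) (2008). Advanced Econometrics. China Social Sciences Press. (p. 45, ch. 3)

8. Chen Qiang (陈强) (2010). Advanced Econometrics and Stata Applications. Higher Education Press. (ch. 23/24)

[For the rest, see Chapter 1 of the original slides.]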

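II. Overview of contents

Methods:

- Moving average
- Kernel smoothing
- k-nearest-neighbor smoothing (k-NN)
- Local polynomial regression
- Loess and Lowess
- Smoothing spline
- B-spline
- Friedman supersmoother

Models:

- Nonparametric density estimation
- Nonparametric regression models
- Semiparametric models for time series
- Semiparametric models for panel data
- Quantile regression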

III. Different model forms

1. Linear models

2. Nonlinear in variables

3. Nonlinear in parameters

IV. Data transformation: power transformation (for parametric methods)

In the GLM framework, models are equally prone to misspecification from an incorrect functional form.

It would be prudent to test whether any independent variable in a model has a nonlinear effect. If it does, analysts in the social sciences usually rely on power transformations to address the nonlinearity.

[ADD: For testing methods, see Sanford Weisberg. Applied Linear Regression (Third Edition). John Wiley & Sons, Inc. (the textbook for the undergraduate applied regression course).]
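
A minimal sketch of why a power transformation can repair a wrong functional form. The data are simulated (an assumption made purely for illustration, not taken from the notes): y is linear in log(x), so a model that is linear in x is misspecified, while the log-transformed model is not.

```r
# Simulated data (hypothetical): the true relationship is linear in log(x)
set.seed(1)
x <- runif(200, 1, 50)
y <- 2 + 3 * log(x) + rnorm(200, sd = 0.3)

fit_raw <- lm(y ~ x)        # linear in x: wrong functional form
fit_log <- lm(y ~ log(x))   # log (power) transformation of x

# Compare the share of variance explained by each specification
r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared
c(raw = r2_raw, log = r2_log)
```

In real data the right transformation is unknown, which is one motivation for the nonparametric methods in the following chapters.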

----------------------------------------------------------------------------

Chapter 2 Nonparametric Density Estimation

I. Three methods

1. Histogram

2. Kernel density estimate

3. k-nearest-neighbors estimate

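II. Histogram: a numerical interpretation

Suppose $x_1, \dots, x_N \sim f(x)$, where the density function $f(x)$ is unknown.

One can use the following function to estimate $f(x)$:

$$\hat{f}(x) = \frac{1}{2hN}\,\#\{x_i : |x_i - x| < h\}$$

[The numerator is the number of sample points whose distance from x is less than h.]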
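
The counting idea can be sketched in a few lines of R. The function name `naive_density` and the simulated sample are illustrative assumptions, not part of the original notes.

```r
# Naive (histogram-type) density estimate at a point x0:
# count the sample points within distance h of x0, then divide by N * 2h
naive_density <- function(x0, x, h) {
  sum(abs(x - x0) < h) / (length(x) * 2 * h)
}

set.seed(1)
x <- rnorm(5000)                       # simulated sample from N(0, 1)
est <- naive_density(0, x, h = 0.25)   # estimate of the density at 0
c(estimate = est, truth = dnorm(0))
```

A smaller h reduces bias but leaves fewer points in the window, so the estimate becomes noisier.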

III. Kernel density estimate

Bandwidth: h; window width: 2h.

1. Conditions on the kernel function

The kernel function $K(\cdot)$ is a continuous function, symmetric around zero, that integrates to unity and satisfies additional boundedness conditions:

(1) $K(z)$ is symmetric around 0 and is continuous;

(2) $\int K(z)\,dz = 1$;

(3) Either

(a) $K(z) = 0$ if $|z| \ge z_0$ for some $z_0$,

or

(b) $|z|K(z) \to 0$ as $|z| \to \infty$;

(4) $\int z^2 K(z)\,dz = \kappa$, where $\kappa$ is a constant.
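
As a rough numerical check, the sketch below verifies for the Epanechnikov kernel that it is symmetric, integrates to one, and has a finite second moment. The choice of kernel and the helper name `epan` are assumptions for illustration.

```r
# Epanechnikov kernel, supported on [-1, 1]
epan <- function(z) ifelse(abs(z) <= 1, 0.75 * (1 - z^2), 0)

# Integrates to unity
total <- integrate(epan, -1, 1)$value
# Second moment is a finite constant
second <- integrate(function(z) z^2 * epan(z), -1, 1)$value
# Symmetric around zero (checked at one pair of points)
symmetric <- isTRUE(all.equal(epan(0.3), epan(-0.3)))

c(total = total, second = second)
```

Because this kernel vanishes outside [-1, 1], it satisfies the compact-support alternative of the boundedness condition.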

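2. Main functional forms

[table of kernel functions not preserved]

3. Confidence interval

[formula not preserved]

4. Bandwidth selection

In practice, the bandwidth is computed from s and iqr [formula not preserved], where s is the sample standard deviation and iqr is the sample interquartile range.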
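
The exact constant used in the notes is not recoverable, so the sketch below shows one common rule of this type as an assumption: Silverman's rule of thumb in the form implemented by R's `stats::bw.nrd0()`, which combines the sample standard deviation and the interquartile range.

```r
set.seed(1)
x <- rnorm(500)        # simulated sample (hypothetical data)

s   <- sd(x)           # sample standard deviation
iqr <- IQR(x)          # sample interquartile range
n   <- length(x)

# Silverman's rule of thumb as implemented by bw.nrd0():
# 0.9 * min(s, iqr / 1.34) * n^(-1/5)
h <- 0.9 * min(s, iqr / 1.34) * n^(-1/5)
c(manual = h, bw.nrd0 = bw.nrd0(x))
```

`density()` uses `bw.nrd0` by default, so calling it without `bw` applies this rule automatically.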

IV. k-nearest-neighbors estimate
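
The notes give no formula here. A minimal sketch, assuming the standard one-dimensional k-NN density estimate $\hat{f}(x) = k / (2 N d_k(x))$, where $d_k(x)$ is the distance from x to its k-th nearest sample point; the function name `knn_density` is hypothetical.

```r
# k-NN density estimate at x0: the window half-width adapts to the data,
# it is the distance to the k-th nearest observation
knn_density <- function(x0, x, k) {
  d_k <- sort(abs(x - x0))[k]
  k / (2 * length(x) * d_k)
}

set.seed(1)
x <- rnorm(5000)                    # simulated sample from N(0, 1)
est <- knn_density(0, x, k = 500)
c(estimate = est, truth = dnorm(0))
```

Unlike the fixed-bandwidth estimators, the window here is narrow where data are dense and wide where they are sparse.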

V. R code

da <- read.table("PSID.txt", header = TRUE)
lhwage <- da$lhwage

# *** Same bandwidth, different kernel functions ***
den1 <- density(lhwage, bw = 0.45, kernel = "epanechnikov")
den2 <- density(lhwage, bw = 0.45, kernel = "gaussian")
den3 <- density(lhwage, bw = 0.45, kernel = "biweight")
den4 <- density(lhwage, bw = 0.45, kernel = "rectangular")
plot(den4, lty = 4, main = "", xlab = "Log Hourly Wage", ylab = "Kernel density estimates")
lines(den3, lty = 3, col = "red")
lines(den2, lty = 2, col = "green")
lines(den1, lty = 1, col = "blue")

# *** Different bandwidths and different kernel functions ***
den5 <- density(lhwage, bw = 0.545, kernel = "epanechnikov")
den6 <- density(lhwage, bw = 0.246, kernel = "gaussian")
den7 <- density(lhwage, bw = 0.646, kernel = "biweight")
den8 <- density(lhwage, bw = 0.214, kernel = "rectangular")
plot(den8, lty = 4, main = "", xlab = "Log Hourly Wage", ylab = "Kernel density estimates")
lines(den7, lty = 3, col = "red")
lines(den6, lty = 2, col = "green")
lines(den5, lty = 1, col = "blue")
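
Since PSID.txt is not distributed with these notes, here is a self-contained sketch on simulated data (an assumption, standing in for `lhwage`). It compares two kernels at the same bandwidth; in `density()`, `bw` is the standard deviation of the kernel, so equal `bw` values are comparable across kernels.

```r
set.seed(1)
z <- rnorm(1000)   # hypothetical stand-in for the log hourly wage variable

d_gauss <- density(z, bw = 0.45, kernel = "gaussian")
d_epan  <- density(z, bw = 0.45, kernel = "epanechnikov")

# Both estimates are evaluated on the same default grid,
# so the curves can be compared point by point
max_gap <- max(abs(d_gauss$y - d_epan$y))
max_gap
```

The gap is typically small relative to the height of the density, which suggests that the bandwidth matters much more than the kernel.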

----------------------------------------------------------------------------

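Chapter 3 Smoothing and Local Regression

I. Simple smoothing

1. Local averaging

Sort the sample by x and divide it into several parts (intervals, or "bins"); the mean of the y values in each part is the estimate of f(x) for the x values in that part.

Three different approaches:

(1) Equal-width bins: suitable when x is uniformly distributed.

(2) Bins with equal numbers of observations: k-nearest neighbor.

(3) Moving average.

k-NN: [formula not preserved]

Equal window width: [formula not preserved]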
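
A minimal sketch of two of these local averages on simulated data. The function names `knn_mean` and `bin_mean`, and the data, are assumptions for illustration.

```r
# k-NN running mean: average y over the k observations closest to each x_i
knn_mean <- function(x, y, k) {
  sapply(x, function(x0) mean(y[order(abs(x - x0))[1:k]]))
}

# Equal-width bins: every point receives the mean of y within its bin
bin_mean <- function(x, y, nbins) {
  bins <- cut(x, breaks = nbins)
  ave(y, bins)
}

set.seed(1)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)   # hypothetical nonlinear relationship

fit_knn <- knn_mean(x, y, k = 20)
fit_bin <- bin_mean(x, y, nbins = 10)
```

The bin estimate is a step function, while the k-NN running mean changes at every point because the window moves with x.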

2. Kernel smoothing

[formula not preserved]
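
The formula is missing from the notes. A minimal sketch, assuming the standard Nadaraya-Watson form of kernel smoothing, in which the fitted value at a point is a kernel-weighted average of the y values; the function name `nw_smooth` and the Gaussian kernel are illustrative choices.

```r
# Nadaraya-Watson kernel smoother: weighted mean of y,
# with weights K((x_i - x0) / h) that decay with distance from x0
nw_smooth <- function(x0, x, y, h) {
  sapply(x0, function(pt) {
    w <- dnorm((x - pt) / h)   # Gaussian kernel weights
    sum(w * y) / sum(w)
  })
}

set.seed(1)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)   # hypothetical data
fit_nw <- nw_smooth(x, x, y, h = 0.5)
```

R's built-in `ksmooth()` implements the same estimator but scales its `bandwidth` argument differently, so the two will not agree for the same numeric value.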

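II. Local polynomial regression

1. Main structure

Local polynomial regression extends kernel smoothing and is likewise built on locally weighted means.

- local constant regression
- local linear regression
- lowess (Cleveland, 1979)
- loess (Cleveland, 1988)

[For this part, see:

Takezawa (2006). Introduction to Nonparametric Regression. (p. 185, §3.7 and p. 195, §3.9)

Chambers and Hastie (1993). Statistical Models in S. (p. 312, ch. 8)]

2. Method outline

(1) For each x_i, construct an interval centered on that point with a predetermined width;

(2) Within each such neighborhood, estimate the parameters by weighted least squares (WLS), then use the fitted model to predict y at that point's x value; this is the estimate of y|x_i (only the fitted value at this one point is kept);

(3) Move on to the next point x_j;

(4) Connect the estimates of y|x_i.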
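
The four steps can be sketched as a short function. This is an illustrative implementation under stated assumptions (nearest-neighbor window, tricube weights, local linear fit), not the algorithm inside `loess()`, which additionally uses interpolation by default; the name `local_linear` is hypothetical.

```r
tricube <- function(z) ifelse(abs(z) < 1, (1 - abs(z)^3)^3, 0)

local_linear <- function(x, y, span = 0.5) {
  n <- length(x)
  k <- ceiling(span * n)                 # number of points in each window
  sapply(seq_len(n), function(i) {
    d <- abs(x - x[i])
    h <- sort(d)[k]                      # step 1: window half-width around x_i
    w <- tricube(d / h)                  # weights decay with distance from x_i
    fit <- lm(y ~ x, weights = w)        # step 2: WLS inside the window
    # keep only the fitted value at x_i itself
    unname(predict(fit, data.frame(x = x[i])))
  })                                     # step 3: repeat for every point
}

set.seed(1)
x <- sort(runif(100, 0, 10))
y <- sin(x) + rnorm(100, sd = 0.3)       # hypothetical data
fit_ll <- local_linear(x, y, span = 0.3)
# step 4: lines(x, fit_ll) would connect the fitted values on a scatterplot
```

Fitting one regression per point is slow for large samples, which is why production implementations fit at a subset of points and interpolate.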

[R functions

library(KernSmooth)   # function locpoly()
library(locpol)       # locpol(); locCteSmootherC()
library(locfit)       # locfit()
# Weight function in locfit: kern = "tcub" (default); alternatives are "rect", "trwt", "tria", "epan", "bisq", "gauss"

]

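3. Estimation form for each method

(1) Polynomial degree p = 0, local constant regression (kernel smoothing):

$$\min_{\beta_0} \sum_{i=1}^{N} K\!\left(\frac{x_i - x}{h}\right) (y_i - \beta_0)^2$$

(2) Polynomial degree p = 1, local linear regression:

$$\min_{\beta_0, \beta_1} \sum_{i=1}^{N} K\!\left(\frac{x_i - x}{h}\right) \bigl(y_i - \beta_0 - \beta_1 (x_i - x)\bigr)^2$$

(3) Lowess (locally weighted scatterplot smoothing)

p = 1: the same weighted least-squares criterion as in (2), with tricube weights over a nearest-neighbor window.

[There is also a robustness reweighting step, omitted here; see the textbook or the slides.]

(4) Loess (local regression)

p = 1, 2: the criterion in (2), with a local polynomial of degree p.

[There is also a robustness reweighting step, omitted here; see the textbook or the slides.]

(5) Friedman supersmoother

- symmetric k-NN, using a local linear fit
- varying span, determined by local CV
- not robust to outliers, fast to compute
- supsmu() in R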
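
A short usage sketch of `supsmu()` from base R's stats package, on simulated data (an assumption for illustration). With the default `span = "cv"`, the span is chosen by local cross-validation; a numeric `span` fixes it instead.

```r
set.seed(1)
x <- sort(runif(300, 0, 10))
y <- sin(x) + rnorm(300, sd = 0.3)     # hypothetical data

fit_cv  <- supsmu(x, y)                # span = "cv": chosen by local cross-validation
fit_fix <- supsmu(x, y, span = 0.3)    # fixed span: fraction of points in each window

# Both fits return the x values and the smoothed y values
mse_cv <- mean((fit_cv$y - sin(fit_cv$x))^2)
mse_cv
```

The result can be drawn on a scatterplot with `lines(fit_cv)`.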

III. Model selection

What needs to be chosen:

(1) the span (window width);

(2) the degree of the polynomial for the local regression models;

(3) the weight function.

[Remaining material omitted.]
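
The notes omit the selection procedure. One common approach, shown here as an assumption rather than as the method of the textbook, is to pick the span that minimizes generalized cross-validation (GCV), using the trace of the smoother matrix that `loess()` stores in `trace.hat`. The helper name `gcv_loess` is hypothetical.

```r
# GCV score for a loess fit: n * RSS / (n - tr(S))^2
gcv_loess <- function(span, x, y) {
  fit <- loess(y ~ x, span = span, degree = 1)
  n <- fit$n
  n * sum(residuals(fit)^2) / (n - fit$trace.hat)^2
}

set.seed(1)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)     # hypothetical data

spans <- seq(0.1, 0.9, by = 0.1)
gcv <- sapply(spans, gcv_loess, x = x, y = y)
best_span <- spans[which.min(gcv)]
best_span
```

The same loop could compare `degree = 1` against `degree = 2`; the weight function usually matters least of the three choices.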

IV. R code

library(foreign)
library(SemiPar)
library(mgcv)
jacob <- read.table("jacob.txt", header = TRUE)

###############################################################################
# Part 1: simple smoothing estimators
# 1. Kernel density estimation
# Illustration of kernel concepts
# Defining the window width
attach(jacob)
x0 <- sort(perotvote)[75]               # focal point: the 75th smallest value
diffs <- abs(perotvote - x0)
which.diff <- sort(diffs)[120]          # half-width: distance to the 120th nearest point
x.n <- perotvote[diffs <= which.diff]   # observations inside the window

# Applying the tricube weight
# ... tricube function
tricube <- function(z) {
  ifelse(abs(z) < 1, (1 - (abs(z))^3)^3, 0)
}
# ...
a <- seq(0, 1, by = .1)
tricube(a)

# Figure 2.5
plot(range(perotvote), c(0, 1), xlab = "Perot Vote (%)", ylab = "Tricube Weight", type = 'n', bty = "l")
abline(v = c(x0 - which.diff, x0 + which.diff), lty = 2)
abline(v = x0)
xwts <- seq(x0 - which.diff, x0 + which.diff, len = 250)
lines(xwts, tricube((xwts - x0) / which.diff), lty = 1, lwd = 1)
points(x.n, tricube((x.n - x0) / which.diff), cex = 1)

###########################################################################
# 2. Kernel smoothing
###########################################################################
# Figure 2.6
par(mfrow = c(3, 1))
plot(perotvote, chal.vote, pch = ".", cex = 1.95,
     xlab = "Perot Vote (%)", ylab = "Challenger's Vote Share (%)",
     main = "Bandwidth = 4", bty = "l")
lines(ksmooth(perotvote, chal.vote, bandwidth = 4))   # bandwidth must be numeric
plot(perotvote, chal.vote, pch = ".", cex = .65,
     xlab = "Perot Vote (%)", ylab = "Challenger's Vote Share (%)",
     main = "Bandwidth = 8", bty = "l")
lines(ksmooth(perotvote, chal.vote, kernel = "box", bandwidth = 8), lty = 1)
plot(perotvote, chal.vote, pch = ".", cex = .65,
     xlab = "Perot Vote (%)", ylab = "Challenger's Vote Share (%)",
     main = "Bandwidth = 12", bty = "l")
lines(ksmooth(perotvote, chal.vote, bandwidth = 12), lty = 1)

# ******* Comparing the box and normal kernels in kernel smoothing, equal bandwidth
plot(perotvote, chal.vote, pch = ".", cex = .65, xlab = "Perot Vote (%)", ylab = "Challenger's Vote Share (%)", main = "Bandwidth = 8", bty = "l")
lines(ksmooth(perotvote, chal.vote, kernel = "box", bandwidth = 8), lty = 1)
lines(ksmooth(perotvote, chal.vote, kernel = "normal", bandwidth = 8), lty = 2, col = "red")

##################################################################################
# Part 2: LPR (local polynomial regression) models
# Data prep for local average regression, step by step
cong <- as.data.frame(jacob[, 2:3])
cong <- cong[order(cong$perotvote), 1:2]   # sort by perotvote
y <- as.matrix(cong$chal.vote)
x <- as.matrix(cong$perotvote)
n <- length(y)
# ...
tricube <- function(z) {
  ifelse(abs(z) < 1, (1 - (abs(z))^3)^3, 0)
}
# ...
x0 <- x[75]                        # focal point
diffs <- abs(x - x0)
which.diff <- sort(diffs)[120]     # window half-width
x.n <- x[diffs <= which.diff]      # observations inside the window
y.n <- y[diffs <= which.diff]
weigh <- tricube((x.n - x0) / which.diff)
mod <- lm(y.n ~ x.n, weights = weigh)   # weighted least squares inside the window

# Figure 2.7
plot(x, y, type = "n", cex = .65, xlab = "Perot Vote (%)", ylab = "Challenger's Vote Share (%)", bty = "l")
abline(v = c(x0 - which.diff, x0 + which.diff), lty = 2)
abline(v = x0)
points(x[diffs > which.diff], y[diffs > which.diff], pch = 16, cex = 1, col = gray(.80))
points(x[diffs <= which.diff], y[diffs <= which.diff], cex = .85)
abline(mod, lwd = 2, col = 1)
text(27.5, 50, expression(paste("Fitted Value of y at ", x[0])))   # expression() typesets x[0] as x with subscript 0
arrows(25, 47, 15, 37, code = 2, length = .10)

#################################################################################
# 2. Now putting it together for the local regression demonstration
# OLS fit for comparison
ols <- lm(chal.vote ~ perotvote, data = jacob)

# The loess fit
model.loess <- loess(chal.vote ~ perotvote, data = jacob, span = 0.5)
# *** Defaults: degree = 2, family = "gaussian", tricube weights ***
n <- length(chal.vote)
x.loess <- seq(min(perotvote), max(perotvote), length = n)
y.loess <- predict(model.loess, data.frame(perotvote = x.loess))   # predictions on a grid, for comparison

# The lowess fit
# lowess() takes x and y directly (it has no data argument) and has no predict() method;
# it returns the sorted x values and the corresponding fitted y values.
model.lowess <- lowess(perotvote, chal.vote, f = 0.5)
# *** Defaults: robust (iter = 3), local linear, tricube weights ***
x.lowess <- model.lowess$x
y.lowess <- model.lowess$y

# Figure 2.8
plot(perotvote, chal.vote, pch = ".",
     ylab = "Challengers' Vote Share (%)", xlab = "Vote for Perot (%)", bty = "l")
lines(x.loess, y.loess, lty = 1)
lines(x.lowess, y.lowess, lty = 2)
abline(ols, lty = 3)
legend(15, 20, c("Loess", "Lowess", "OLS"), lty = c(1, 2, 3), bty = "n", cex = .8)

#################################################################################
# 3. Comparing different robustness settings in lowess
m1.lowess <- lowess(perotvote, chal.vote, f = 0.5, iter = 0)
# *** No second-stage robust reweighting ***
m2.lowess <- lowess(perotvote, chal.vote, f = 0.5)
# *** Default iter = 3: three robust reweighting iterations ***
m0.loess <- loess(chal.vote ~ perotvote, data = jacob,
span = 0.5, degree = 1, family = "symm", iter
