7 Principal Component Analysis
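Example 1:
Use principal component analysis to study the distribution of urban household consumption across China's 31 provinces, municipalities and autonomous regions in 2001, to evaluate their consumption levels, and to compare per-capita consumption across regions using the component scores and a composite score.
Data set: d7.2.
x1: per-capita food expenditure
x2: per-capita clothing expenditure
x3: per-capita expenditure on household equipment, goods and services
x4: per-capita health care expenditure
x5: per-capita transport and communication expenditure
x6: per-capita recreation, education and cultural services expenditure
x7: per-capita housing expenditure
x8: per-capita expenditure on miscellaneous goods and services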
(1) Compute the correlation matrix
> X <- read.table("clipboard", header = TRUE)  # read the table copied to the clipboard
> Z <- scale(X)  # standardize the data
> cor(Z)
          X1         X2        X3        X4        X5        X6         X7        X8
X1 1.0000000  0.2594180 0.6552149 0.5661754 0.8866389 0.8308639  0.6020030 0.8493212
X2 0.2594180  1.0000000 0.2138888 0.3359252 0.2385623 0.2752803 -0.1967162 0.3551320
X3 0.6552149  0.2138888 1.0000000 0.7023946 0.6346613 0.8148436  0.5346897 0.6190994
X4 0.5661754  0.3359252 0.7023946 1.0000000 0.5659877 0.7688616  0.2813273 0.7111310
X5 0.8866389  0.2385623 0.6346613 0.5659877 1.0000000 0.7590775  0.7203573 0.8769794
X6 0.8308639  0.2752803 0.8148436 0.7688616 0.7590775 1.0000000  0.5623373 0.8045281
X7 0.6020030 -0.1967162 0.5346897 0.2813273 0.7203573 0.5623373  1.0000000 0.4654599
X8 0.8493212  0.3551320 0.6190994 0.7111310 0.8769794 0.8045281  0.4654599 1.0000000
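Correlation is unchanged by centering and rescaling, and `princomp(..., cor = TRUE)` works from the correlation matrix, so the explicit `scale()` step should not alter the results. The following is a minimal check of that claim on simulated data; `X_demo` is a hypothetical stand-in, since the d7.2 data are not reproduced here.

```r
# Simulated stand-in for the real data (31 rows, 4 variables); hypothetical.
set.seed(1)
X_demo <- matrix(rnorm(31 * 4), nrow = 31, ncol = 4,
                 dimnames = list(NULL, paste0("x", 1:4)))
X_demo[, 2] <- X_demo[, 1] + 0.5 * X_demo[, 2]   # induce some correlation

Z_demo <- scale(X_demo)                           # standardized copy

# The correlation matrix is the same before and after standardizing
same_cor <- isTRUE(all.equal(cor(X_demo), cor(Z_demo)))

# PCA on the correlation matrix gives the same standard deviations either way
sd_raw <- unname(princomp(X_demo, cor = TRUE)$sdev)
sd_std <- unname(princomp(Z_demo, cor = TRUE)$sdev)
same_pca <- isTRUE(all.equal(sd_raw, sd_std))

print(c(same_cor = same_cor, same_pca = same_pca))
```

Standardizing first is therefore harmless but optional when `cor = TRUE` is used.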
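(2) Find the eigenvalues of the correlation matrix and the component loadings
> PCA <- princomp(Z, cor = TRUE)  # principal component analysis
> PCA  # standard deviations (square roots of the eigenvalues)
Call:
princomp(x = Z, cor = TRUE)
Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7
2.2787134 1.1227556 0.8044059 0.6231343 0.4843913 0.3823558 0.2964918
   Comp.8
0.2068370
 8  variables and  31 observations.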
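The standard deviations reported by `princomp` are the square roots of the eigenvalues of the correlation matrix. As a sketch, the matrix printed in step (1) can be typed in and decomposed directly with `eigen()`; the square roots of its eigenvalues should agree, up to rounding of the seven-decimal input, with the standard deviations shown above. The name `R_mat` is introduced here for illustration only.

```r
# Correlation matrix copied from the cor(Z) output of step (1)
R_mat <- matrix(c(
  1.0000000,  0.2594180, 0.6552149, 0.5661754, 0.8866389, 0.8308639,  0.6020030, 0.8493212,
  0.2594180,  1.0000000, 0.2138888, 0.3359252, 0.2385623, 0.2752803, -0.1967162, 0.3551320,
  0.6552149,  0.2138888, 1.0000000, 0.7023946, 0.6346613, 0.8148436,  0.5346897, 0.6190994,
  0.5661754,  0.3359252, 0.7023946, 1.0000000, 0.5659877, 0.7688616,  0.2813273, 0.7111310,
  0.8866389,  0.2385623, 0.6346613, 0.5659877, 1.0000000, 0.7590775,  0.7203573, 0.8769794,
  0.8308639,  0.2752803, 0.8148436, 0.7688616, 0.7590775, 1.0000000,  0.5623373, 0.8045281,
  0.6020030, -0.1967162, 0.5346897, 0.2813273, 0.7203573, 0.5623373,  1.0000000, 0.4654599,
  0.8493212,  0.3551320, 0.6190994, 0.7111310, 0.8769794, 0.8045281,  0.4654599, 1.0000000
), nrow = 8, byrow = TRUE,
dimnames = list(paste0("X", 1:8), paste0("X", 1:8)))

ev <- eigen(R_mat, symmetric = TRUE)   # eigenvalues in decreasing order
sdev_check <- sqrt(ev$values)          # compare with the "Standard deviations" line
print(round(sdev_check, 4))
print(sum(ev$values))                  # equals the trace of R_mat, i.e. 8 variables
```

Because the eigenvalues sum to the number of variables, each eigenvalue divided by 8 is the share of total variance carried by that component.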
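> PCA$loadings  # loadings (eigenvectors of the correlation matrix)
Loadings:
   Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
(Loadings smaller than 0.1 in absolute value are printed as blanks by default. The blank positions are not preserved here, so the values in each row are listed in order and only the first value, Comp.1, is certain to be column-aligned.)
X1 -0.400  0.301  0.133  0.492 -0.215  0.604 -0.274
X2 -0.141  0.752  0.358 -0.488 -0.183 -0.103
X3 -0.363 -0.492 -0.492  0.321  0.526
X4 -0.342  0.262 -0.535  0.328 -0.521 -0.116  0.367
X5 -0.401 -0.135  0.377 -0.181  0.344  0.110  0.714
X6 -0.410 -0.211  0.286 -0.618 -0.463  0.329
X7 -0.288 -0.576  0.140 -0.427 -0.485 -0.222 -0.310
X8 -0.399  0.107  0.215  0.455  0.322 -0.521 -0.447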
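Each column of the loadings matrix is a unit-length eigenvector of the correlation matrix, and its overall sign is arbitrary, which is why every variable can load negatively on Comp.1 without changing the interpretation. A minimal sketch on simulated data (the object names are hypothetical) checks both properties.

```r
# Simulated stand-in data; hypothetical, not the d7.2 data set.
set.seed(2)
X_demo <- matrix(rnorm(31 * 4), nrow = 31,
                 dimnames = list(NULL, paste0("x", 1:4)))
X_demo[, 3] <- X_demo[, 1] - X_demo[, 2] + 0.3 * X_demo[, 3]

pca_demo <- princomp(X_demo, cor = TRUE)
L <- unclass(pca_demo$loadings)        # plain numeric matrix of eigenvectors

# Columns are orthonormal: t(L) %*% L should be the identity matrix
max_dev <- max(abs(crossprod(L) - diag(ncol(L))))

# An eigenvector and its negative both satisfy R v = lambda v
R_demo <- cor(X_demo)
lambda1 <- pca_demo$sdev[1]^2
v1 <- L[, 1]
resid_pos <- max(abs(R_demo %*% v1 - lambda1 * v1))
resid_neg <- max(abs(R_demo %*% (-v1) - lambda1 * (-v1)))

print(c(max_dev = max_dev, resid_pos = resid_pos, resid_neg = resid_neg))
```

All three printed values should be zero up to floating-point error.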
(3) Choose the number of principal components
> summary(PCA)  # proportion of variance and cumulative proportion
Importance of components:
                          Comp.1    Comp.2    Comp.3     Comp.4     Comp.5
Standard deviation     2.2787134 1.1227556 0.8044059 0.62313432 0.48439131
Proportion of Variance 0.6490668 0.1575725 0.0808836 0.04853705 0.02932937
Cumulative Proportion  0.6490668 0.8066394 0.8875230 0.93606002 0.96538939
                           Comp.6     Comp.7      Comp.8
Standard deviation     0.38235576 0.29649185 0.206836981
Proportion of Variance 0.01827449 0.01098843 0.005347692
Cumulative Proportion  0.98366388 0.99465231 1.000000000
> screeplot(PCA, type = "lines")
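The proportions in the summary follow directly from the standard deviations: each component's share is its squared standard deviation divided by the sum of all squared standard deviations. The sketch below recomputes them from the printed values and picks the smallest number of components whose cumulative share reaches a cut-off. The 80% cut-off is an assumption made here for illustration; the source does not state the rule behind its choice of two components.

```r
# Standard deviations copied from the princomp output for Example 1
sdev <- c(2.2787134, 1.1227556, 0.8044059, 0.6231343,
          0.4843913, 0.3823558, 0.2964918, 0.2068370)

eigenvalues <- sdev^2                       # variances of the components
prop <- eigenvalues / sum(eigenvalues)      # proportion of variance
cum_prop <- cumsum(prop)                    # cumulative proportion

threshold <- 0.80                           # hypothetical cut-off, not from the source
m <- which(cum_prop >= threshold)[1]        # first component count reaching the cut-off

print(round(rbind(prop, cum_prop), 4))
print(m)
```

With these values the first two components together carry about 80.7% of the variance, consistent with `m = 2` in the ranking step.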
(4) Principal component scores
> PCA$scores  # principal component scores
                   Comp.1      Comp.2       Comp.3      Comp.4        Comp.5      Comp.6       Comp.7       Comp.8
Beijing        -6.0881641  2.09605700 -0.967784530  0.25776803  0.0005352398  0.37264097 -0.259313644  0.101898159
Tianjin        -2.6531538 -0.89692239 [remaining values illegible in the source]
Hebei           1.1621365  0.30058573 -0.784504691  0.02193894  0.7896821227 -0.09836765  0.650193865  0.212926202
Shanxi          1.6499715  0.43010054 -0.460958474  0.40646178  0.3690874371  0.07020928 -0.204588360  0.050284508
Inner Mongolia  1.6314462  0.57608823  0.441231853  0.06715704  0.2598856038 -0.12973344 -0.510229607  0.174135101
Liaoning        1.2429282  0.75205167 -0.051753055  0.33518942  0.4212833512  0.49894181  0.430659441  0.003292285
Jilin           1.6459349  0.25354850  0.112123455  0.21903096  0.4579875644  0.32578858  0.139150622 -0.007332857
Heilongjiang    1.8162784  0.31636032 -0.254967650  0.25832368  0.5430370763  0.42992066  0.261995255  0.089928521
Shanghai       -5.9388303 -0.16127086  0.413064801  1.23263843 -0.5779564955  0.73833944  0.239540230  0.019023710
Jiangsu        -0.1682782  0.03012142 -0.233423513 -0.26671292 -0.8701181382 -0.65679059  0.021191974  0.137926258
Zhejiang       -4.4178377  0.39587163 -0.969755399 -0.75820971 -0.0758300384 -0.56847898  0.052117591 -0.286139927
Anhui           1.8800460 -0.38729538  0.304359563 -0.03960882 -0.8785949113  0.26598394  0.146905761  0.144447324
Fujian         -0.4665698 -0.90174490  0.729825867 -0.32744781 -0.3734575948 -0.26879668  0.652548503 -0.146259386
Jiangxi         2.5741394 -1.49544912  0.305732170 -0.13992335 -0.4700609392  0.05081704 -0.269775123  0.013447409
Shandong        0.1042408  1.12235349 -0.187443665 -0.81020523 -0.1991289510  0.07042515 -0.353062197  0.196000659
Henan           1.8817327 -0.80438760 -0.167097210 -0.56844225  0.9028319618  0.13039124  0.125025151 -0.408239412
Hubei           1.1608958 -0.21343763  0.346801704 -0.80620043 -0.0452773092  0.72568824 -0.295122773 -0.070358039
Hunan          -0.4165050 -0.44372643 -0.001396215 -0.47340571  0.2432147756  0.40431003 -0.467658737  0.040373645
Guangdong      -4.6096563 -3.09209873  1.517086976  0.33542504  1.0428558256 -0.41620859 -0.203544237  0.394551345
Guangxi         0.2393494 -1.95810917 -0.100413776 -0.46735567 -0.1007335309  0.06618172 -0.285896780 -0.068351777
Hainan          1.7618874 -1.80161082 -0.120225434  1.35763576 -0.6997838002 -0.10853058  0.017022890 -0.187015549
Chongqing      -0.4425809  0.03293607 -0.152221620 -0.64286419 -0.4487145420  0.58529200  0.171194246  0.123427881
Sichuan         0.5004198 -0.41276027 -0.203583270 -0.19934662 -0.3867866891  0.11601698  0.065305904 -0.156010593
Guizhou         1.9329234  0.06752372 -0.003314839  0.10134689 -0.5829378816 -0.21588180  0.190903255  0.396356661
Yunnan          0.1084411  0.11915042  0.467213205  0.85613520  0.4402763200 -0.02640783  0.005888688 -0.326523841
Tibet          -0.2021215  2.59081438  3.214746408 -0.43987122 -0.1134174018 -0.36790757  0.166188437 -0.133106158
Shaanxi         0.7689799 -0.20239904 -1.151012313  0.01361699 -0.0680352499 -0.37792016 -0.111388971  0.102067293
Gansu           1.2875337  0.80112651 [remaining values illegible in the source]
Qinghai         0.6706571  0.97432899 -0.168785155  1.32837316 -0.0563299105 -0.45636528 -0.471031209 -0.401834404
Ningxia         0.7537904  1.01381166 -0.718171966  0.34970205  0.3157298525 -0.70287387  0.172563888  0.326958031
Xinjiang        0.6299651  0.89838206  0.261583420 -0.69734565  0.1596741511 -0.09971596 -0.256603285 -0.023500314
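The scores are the standardized observations projected onto the loadings. One detail worth knowing is that `princomp` standardizes with the divisor n rather than n - 1, using the `center` and `scale` components stored in the fitted object. A minimal sketch on simulated data (hypothetical names, not the d7.2 data) rebuilds the scores by hand.

```r
# Simulated stand-in data; hypothetical.
set.seed(3)
X_demo <- matrix(rnorm(31 * 3), nrow = 31,
                 dimnames = list(NULL, c("x1", "x2", "x3")))
X_demo[, 2] <- X_demo[, 1] + 0.4 * X_demo[, 2]

pca_demo <- princomp(X_demo, cor = TRUE)

# Standardize with the centre and scale stored by princomp (divisor n)
Z_n <- scale(X_demo, center = pca_demo$center, scale = pca_demo$scale)

# Project onto the eigenvectors to obtain the scores
scores_manual <- Z_n %*% unclass(pca_demo$loadings)
max_diff <- max(abs(scores_manual - pca_demo$scores))

# With divisor n, the variance of each score column equals its eigenvalue
score_var <- colMeans(pca_demo$scores^2)

print(max_diff)
print(rbind(score_var, eigenvalue = pca_demo$sdev^2))
```

The manual scores should match `pca_demo$scores` to floating-point precision.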
(5) Compute the composite scores and ranks, using the self-written package mvstats
> library(mvstats)
> princomp.rank(PCA, m = 2, plot = TRUE)
                   Comp.1      Comp.2         PC rank
Beijing        -6.0881641  2.09605700 -4.4894219    2
Tianjin        -2.6531538 -0.89692239 -2.3100837    5
Hebei           1.1621365  0.30058573  0.9938373   20
Shanxi          1.6499715  0.43010054  1.4116764   26
Inner Mongolia  1.6314462  0.57608823  1.4252879   27
Liaoning        1.2429282  0.75205167  1.1470382   22
Jilin           1.6459349  0.25354850  1.3739399   25
Heilongjiang    1.8162784  0.31636032  1.5232777   29
Shanghai       -5.9388303 -0.16127086 -4.8102161    1
Jiangsu        -0.1682782  0.03012142 -0.1295219   10
Zhejiang       -4.4178377  0.39587163 -3.4775063    4
Anhui           1.8800460 -0.38729538  1.4371334   28
Fujian         -0.4665698 -0.90174490 -0.5515789    6
Jiangxi         2.5741394 -1.49544912  1.7791678   31
Shandong        0.1042408  1.12235349  0.3031235   12
Henan           1.8817327 -0.80438760  1.3570140   24
Hubei           1.1608958 -0.21343763  0.8924274   19
Hunan          -0.4165050 -0.44372643 -0.4218225    7
Guangdong      -4.6096563 -3.09209873 -4.3132099    3
Guangxi         0.2393494 -1.95810917 -0.1899120    9
Hainan          1.7618874 -1.80161082  1.0657778   21
Chongqing      -0.4425809  0.03293607 -0.3496913    8
Sichuan         0.5004198 -0.41276027  0.3220351   13
Guizhou         1.9329234  0.06752372  1.5685279   30
Yunnan          0.1084411  0.11915042  0.1105331   11
Tibet          -0.2021215  2.59081438  0.3434630   14
Shaanxi         0.7689799 -0.20239904  0.5792264   15
Gansu           1.2875337  0.80112651  1.1925167   23
Qinghai         0.6706571  0.97432899  0.7299777   17
Ningxia         0.7537904  1.01381166  0.8045841   18
Xinjiang        0.6299651  0.89838206  0.6823989   16
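The PC column appears to be the average of the retained component scores, weighted by each component's proportion of variance. This is inferred from the printed numbers rather than taken from the mvstats source, so treat the sketch below as an assumption that happens to reproduce the table. It uses base R only and three rows of the output.

```r
# Proportions of variance of Comp.1 and Comp.2, from summary(PCA)
w <- c(0.6490668, 0.1575725)

# Scores on the first two components for three regions, from the table above
scores <- rbind(
  Beijing  = c(-6.0881641,  2.09605700),
  Shanghai = c(-5.9388303, -0.16127086),
  Jiangxi  = c( 2.5741394, -1.49544912)
)

# Assumed composite: variance-weighted average of the retained scores
pc <- as.vector(scores %*% w) / sum(w)
names(pc) <- rownames(scores)

# Ascending rank: the most negative composite gets rank 1, as in the table
pc_rank <- rank(pc)

print(round(pc, 4))
print(pc_rank)
```

The ranks printed here are relative to these three rows only; the full table ranks all 31 regions the same way.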
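Plot the provinces, municipalities and autonomous regions with the first principal component on the horizontal axis and the second principal component on the vertical axis.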
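A plot of this kind can also be drawn with base graphics. The sketch below is one possible way to do it, not the code behind `plot = TRUE` in `princomp.rank`; it labels five regions taken from the scores table, and the data frame `scores12` is introduced here for illustration.

```r
# Scores on the first two components for a few regions, from the scores table
scores12 <- data.frame(
  region = c("Beijing", "Shanghai", "Guangdong", "Tibet", "Jiangxi"),
  comp1  = c(-6.0881641, -5.9388303, -4.6096563, -0.2021215,  2.5741394),
  comp2  = c( 2.09605700, -0.16127086, -3.09209873, 2.59081438, -1.49544912)
)

# Empty frame first, then region names placed at their (Comp.1, Comp.2) positions
plot(scores12$comp1, scores12$comp2, type = "n",
     xlab = "Comp.1", ylab = "Comp.2")
text(scores12$comp1, scores12$comp2, labels = scores12$region)
abline(h = 0, v = 0, lty = 2)   # reference lines through the origin
```

With the full score matrix, `PCA$scores[, 1:2]` and `rownames(PCA$scores)` would take the place of the hand-typed columns.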
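All eight variables load negatively on the first component, so a more negative C1 score means higher consumption. On C1, the daily-necessities component, the five regions with the largest scores in absolute value are, in order, Beijing, Shanghai, Guangdong, Zhejiang and Tianjin. Beijing and Shanghai (about -6) stand well apart, followed by Guangdong and Zhejiang (about -4.5).
In other words, for daily necessities such as food, transport and communication, consumption in Beijing, Shanghai and Guangdong is far above that of most other regions, while Jiangxi and Guizhou, with the largest positive scores, spend comparatively little.
Tibet, Beijing and Shandong score highest on C2, which suggests substantial spending on clothing and housing in these regions. Tibet ranks first nationally, mainly because it has an advantage in per-capita terms.
On the clothing factor, Tibet and Beijing score highest, and Guangdong, Guangxi and Hainan score lowest.
This suggests that the clothing factor is influenced most by climate: people in northern and north-western regions spend more on clothing to keep warm.
Local dress habits are the second influence. Tianjin and Guangdong, for example, are both economically developed yet rank low on this factor.
According to the source material, Tianjin is a municipality like Beijing and borders it, but Beijing residents pay close attention to dress while Tianjin residents pay less, so Tianjin's clothing score is lower.
By the same reasoning, comparing two developed regions, Shanghai residents are more particular about dress than Guangdong residents, who usually dress casually, so it is unsurprising that per-capita clothing spending in Guangdong is relatively low.
By composite score, Shanghai, Beijing, Guangdong, Zhejiang and Tianjin rank first to fifth, so their overall per-capita consumption is among the highest in the country, while Heilongjiang, Guizhou and Jiangxi rank 29th to 31st, so theirs is the lowest.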
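Example 2:
Use principal component analysis to evaluate the 2006 urban household consumption levels of China's 31 provinces, municipalities and autonomous regions.
Data set: e8.3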
> X <- read.table("clipboard", header = TRUE)
> Z <- scale(X)
> cor(Z)
          X1        X2        X3        X4        X5        X6        X7        X8
X1 1.0000000 0.6423319 0.8842360 0.9281409 0.8755226 0.8364431 0.7513519 0.9296232
X2 0.6423319 1.0000000 0.8055496 0.7920670 0.8629616 0.8252951 0.8825457 0.6857909
X3 0.8842360 0.8055496 1.0000000 0.9339795 0.8875355 0.8331069 0.8308087 0.8359878
X4 0.9281409 0.7920670 0.9339795 1.0000000 0.8873376 0.8941652 0.8404040 0.8651084
X5 0.8755226 0.8629616 0.8875355 0.8873376 1.0000000 0.9311442 0.9125810 0.8913567
X6 0.8364431 0.8252951 0.8331069 0.8941652 0.9311442 1.0000000 0.9197204 0.8529065
X7 0.7513519 0.8825457 0.8308087 0.8404040 0.9125810 0.9197204 1.0000000 0.7714116
X8 0.9296232 0.6857909 0.8359878 0.8651084 0.8913567 0.8529065 0.7714116 1.0000000
> PCA <- princomp(Z, cor = TRUE)
> PCA
Call:
princomp(x = Z, cor = TRUE)
Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7
2.6371051 0.7171103 0.4601558 0.3562669 0.2822771 0.2286716 0.2120411
   Comp.8
0.1258082
 8  variables and  31 observations.
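> PCA$loadings
Loadings:
   Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
(Loadings smaller than 0.1 in absolute value are printed as blanks by default. The blank positions are not preserved here, so rows with fewer than eight values are listed in order and only their first value, Comp.1, is certain to be column-aligned. Rows X4 and X6 are complete.)
X1 -0.349 -0.514  0.190  0.489  0.577
X2 -0.330  0.607  0.250  0.457  0.388  0.150  0.273
X3 -0.357  0.628 -0.426 -0.527
X4 -0.363 -0.164  0.348 -0.375  0.447 -0.171  0.214 -0.552
X5 -0.369 -0.217  0.264 -0.238  0.683  0.132 -0.449
X6 -0.361  0.120 -0.428 -0.429  0.348  0.201 -0.518  0.241
X7 -0.351  0.393 -0.238 -0.339 -0.538 -0.421  0.290
X8 -0.348 -0.396 -0.366  0.527 -0.496 -0.212 -0.136
               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
Proportion Var  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
Cumulative Var  0.125  0.250  0.375  0.500  0.625  0.750  0.875  1.000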
> summary(PCA)
Importance of components:
                          Comp.1     Comp.2     Comp.3
Standard deviation     2.6371051 0.71711027 0.46015579
Proportion of Variance 0.8692904 0.06428089 0.02646792
Cumulative Proportion  0.8692904 0.93357129 0.96003921
                           Comp.4      Comp.5      Comp.6
Standard deviation     0.35626689 0.282277145 0.228671557
Proportion of Variance 0.01586576 0.009960048 0.006536335
Cumulative Proportion  0.97590497 0.985865023 0.992401358
                            Comp.7      Comp.8
Standard deviation     0.212041117 0.125808190
Proportion of Variance 0.005620179 0.001978463
Cumulative Proportion  0.998021537 1.000000000
> screeplot(PCA, type = "lines")
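In this example the first component dominates. One common rule of thumb for a correlation-matrix PCA, which the source does not state and is used here only as an illustration, is to keep components whose eigenvalue exceeds 1, that is, components that carry more variance than a single standardized variable. The sketch applies it to the standard deviations printed above.

```r
# Standard deviations copied from the princomp output for Example 2
sdev2 <- c(2.6371051, 0.7171103, 0.4601558, 0.3562669,
           0.2822771, 0.2286716, 0.2120411, 0.1258082)

eig2 <- sdev2^2                    # eigenvalues of the correlation matrix
n_keep <- sum(eig2 > 1)            # eigenvalue-greater-than-one rule (assumed here)
prop1 <- eig2[1] / sum(eig2)       # share of variance carried by Comp.1

print(round(eig2, 4))
print(n_keep)
print(round(prop1, 4))
```

Under this rule a single component is retained, in line with the first component's share of roughly 87% in the summary.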
> PCA$scores
            Comp.1      Comp.2      Comp.3       Comp.4
Beijing -6.0830445 -1.82216912 -0.67496049 -0.328269412
Tianjin -0.7626158 -0.92180915  0.35896706  0.409760359
Hebei    0.8400787 -