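Processing Laboratory Test Results: A Solution Paper for a Mathematical Modeling Contest
2007 Beijing University of Technology Mathematical Modeling Contest, Preliminary Round, Problem B:
Processing of Laboratory Test Results: Solution
Abstract:
This paper applies two methods, distance discrimination and Fisher discrimination, to the problem. It concludes that whether an individual has nephritis can be determined fairly accurately from the levels of elements in the body.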
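1. Problem statement
When people visit a hospital, a number of indicators are usually tested to assist the doctor's diagnosis.
To diagnose whether a person has nephritis, the levels of various elements in the body are usually tested.
Table B.1 gives the test results of confirmed cases: cases 1-30 are confirmed nephritis patients and cases 31-60 are confirmed healthy people.
Table B.2 gives the test results of people seeking a diagnosis.
The questions are:
1) Based on the data in Table B.1, propose one or more simple methods for deciding whether a person is a patient or healthy, and test the correctness of the proposed methods.
2) Using the methods from 1), classify the 30 people in Table B.2 as nephritis patients or healthy people.
3) Can the characteristics of the data in Table B.1 be used to determine which indicators are the key or main factors in nephritis, so that fewer indicators need to be tested?
4) Repeat the work of 2) using the results of 3).
5) Analyze the results of 2) and 4) further.
(For the tables, see the appendix.)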
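2. Problem analysis
1) Table B.1 gives 30 records each for confirmed nephritis patients and for healthy people;
2) each record has seven numbers, the levels of Zn, Cu, Fe, Ca, Mg, K and Na in that person's body;
3) Question 1 asks for a method of deciding whether a person is a patient or healthy. This requires analyzing the 60 records to find how the element levels differ between healthy people and nephritis patients; the sizes of these differences are also the main input to Question 3;
4) to look for differences in the data, we use the conventional approach of computing means and variances, tabulating them in Excel and plotting histograms in MATLAB;
5) the most reliable approach to Question 2 is discriminant analysis, which requires some programming and processing in R;
6) Question 4 builds on Question 3: once Question 3 has established which factors are key to nephritis, classification only needs those main factors, which saves some tedious steps;
7) once the questions above are solved, we use the same method as in step 5, again in R, to re-analyze the reduced data efficiently, and we compare the results of Questions 2 and 4 and examine the differences in detail;
8) to check further that this approach is reasonable, we also write a C program that compares the data in Table B.2 with the element means found in step 4, as a direct, intuitive check.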
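The mean and variance comparison described in step 4 can be sketched as follows. The paper did this step in Excel and MATLAB; R is used here only to match the paper's listings. The toy values are made up for illustration and are not taken from Table B.1, and the standardized mean difference at the end is an assumed screening measure, not a criterion stated by the authors.

```r
# Hypothetical toy data: rows are people, columns are element levels.
# These numbers are invented for illustration only.
patients <- matrix(c(150, 12,  600,
                     170, 14,  700,
                     160, 10,  650),
                   ncol = 3, byrow = TRUE,
                   dimnames = list(NULL, c("Zn", "Cu", "Ca")))
healthy <- matrix(c(180, 15, 2000,
                    200, 17, 2400,
                    190, 13, 2200),
                  ncol = 3, byrow = TRUE,
                  dimnames = list(NULL, c("Zn", "Cu", "Ca")))

# Per-element mean and variance in each group
summary_table <- rbind(
  mean_patient = colMeans(patients),
  mean_healthy = colMeans(healthy),
  var_patient  = apply(patients, 2, var),
  var_healthy  = apply(healthy, 2, var)
)

# Standardized mean difference per element (assumed screening measure):
# a larger value suggests the element separates the two groups better.
pooled_sd <- sqrt((summary_table["var_patient", ] +
                   summary_table["var_healthy", ]) / 2)
separation <- abs(summary_table["mean_patient", ] -
                  summary_table["mean_healthy", ]) / pooled_sd

print(round(summary_table, 2))
print(round(separation, 2))
```

Elements with a large separation are natural candidates for the reduced indicator set asked for in Question 3, although a single-element measure like this ignores correlations between elements.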
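3. Notation
Suffix 1: the level of an element in a patient's body (for example, Zn1 denotes the Zn level in patients);
Suffix 2: the level of an element in a healthy person's body (for example, Zn2 denotes the Zn level in healthy people);
1: patient; 2: healthy person.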
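4. Model assumptions
1) The information and data given in the problem are true and reliable;
2) apart from the elements listed in the tables, other elements have very little influence on whether a person develops nephritis;
3) the effect of external conditions on nephritis patients is ignored;
4) every individual without the disease is healthy.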
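5. Model building
The crux of the problem is deciding whether a person is ill or healthy. This is a discrimination (classification) problem, so it can be handled with discriminant analysis from statistics.
There are only two classes, ill and healthy, so a two-population discrimination method applies.
We first consider a simple and intuitive method: Mahalanobis distance discrimination.
From the two population samples we compute the mean vectors and covariance matrices, compute the Mahalanobis distance from the test sample x to each of the two populations, and take the difference of the two distances to decide which population x is closer to.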
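Let x and y be samples drawn from a population A with mean $\mu$ and covariance matrix $\Sigma$. The Mahalanobis distance between two points x and y of A is defined as

$$d(x,y)=\sqrt{(x-y)^{T}\Sigma^{-1}(x-y)},$$

and the Mahalanobis distance from a sample x to A is defined as

$$d(x,A)=\sqrt{(x-\mu)^{T}\Sigma^{-1}(x-\mu)}.$$

The population mean vector and covariance matrix are replaced by their sample estimates:
Let $x_1^{(1)}, x_2^{(1)}, \dots, x_{n_1}^{(1)}$ be $n_1$ samples from population A and $x_1^{(2)}, x_2^{(2)}, \dots, x_{n_2}^{(2)}$ be $n_2$ samples from population B. The sample means and covariance are

$$\hat\mu_i=\bar x^{(i)}=\frac{1}{n_i}\sum_{j=1}^{n_i}x_j^{(i)},\quad i=1,2,$$

$$\hat\Sigma=\frac{1}{n_1+n_2-2}\,(S_1+S_2),$$

$$S_i=\sum_{j=1}^{n_i}\bigl(x_j^{(i)}-\bar x^{(i)}\bigr)\bigl(x_j^{(i)}-\bar x^{(i)}\bigr)^{T},\quad i=1,2.$$

For a test sample x, if the two populations have the same covariance, the difference $d^2(x,B)-d^2(x,A)$, which equals $2\hat\omega(x)$, gives the discriminant function

$$\hat\omega(x)=(x-\bar x)^{T}\hat\Sigma^{-1}\bigl(\bar x^{(1)}-\bar x^{(2)}\bigr),\qquad \bar x=\frac{\bar x^{(1)}+\bar x^{(2)}}{2}.$$

The decision rule is

$$x\in A \ \text{ if } \hat\omega(x)\ge 0,\qquad x\in B \ \text{ if } \hat\omega(x)<0.$$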
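The equal-covariance rule can be checked numerically with a small sketch. The toy samples below are hypothetical two-dimensional values, not data from Table B.1, and the function name `omega` is chosen here for illustration. The sketch uses the pooled estimate $(S_1+S_2)/(n_1+n_2-2)$ and R's built-in `mahalanobis()`, which returns the squared distance.

```r
# Toy two-dimensional samples (hypothetical values)
A <- matrix(c(1, 2,
              2, 1,
              2, 3,
              3, 2), ncol = 2, byrow = TRUE)
B <- matrix(c(5, 6,
              6, 5,
              6, 7,
              7, 6), ncol = 2, byrow = TRUE)

mean_a <- colMeans(A)
mean_b <- colMeans(B)

# Pooled covariance (S1 + S2) / (n1 + n2 - 2), with S_i = (n_i - 1) * var(sample i)
pooled <- ((nrow(A) - 1) * var(A) + (nrow(B) - 1) * var(B)) /
  (nrow(A) + nrow(B) - 2)

# Linear discriminant: omega(x) = (x - xbar)^T Sigma^{-1} (mean_a - mean_b)
omega <- function(x) {
  xbar <- (mean_a + mean_b) / 2
  drop((x - xbar) %*% solve(pooled, mean_a - mean_b))
}

x <- c(2.5, 2.0)
d2_a <- mahalanobis(x, mean_a, pooled)  # squared distance from x to A
d2_b <- mahalanobis(x, mean_b, pooled)  # squared distance from x to B

# The difference of squared distances is twice the discriminant value,
# so both give the same sign and the same classification.
print(c(omega = omega(x), half_difference = (d2_b - d2_a) / 2))
label <- if (omega(x) >= 0) "A" else "B"
print(label)
```

Working with `solve(pooled, v)` rather than forming the inverse explicitly avoids an unnecessary matrix inversion.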
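If the two populations have different covariances, that is, $\mu_1\ne\mu_2$ and $\Sigma_1\ne\Sigma_2$, the discriminant function for a sample x is defined as

$$\hat\omega(x)=\bigl(x-\bar x^{(2)}\bigr)^{T}\hat\Sigma_2^{-1}\bigl(x-\bar x^{(2)}\bigr)-\bigl(x-\bar x^{(1)}\bigr)^{T}\hat\Sigma_1^{-1}\bigl(x-\bar x^{(1)}\bigr),$$

where

$$\hat\Sigma_i=\frac{1}{n_i-1}\sum_{j=1}^{n_i}\bigl(x_j^{(i)}-\bar x^{(i)}\bigr)\bigl(x_j^{(i)}-\bar x^{(i)}\bigr)^{T}=\frac{1}{n_i-1}S_i,\quad i=1,2.$$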
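A minimal sketch of the unequal-covariance rule follows. It assumes the same sign convention as the equal-covariance case (a non-negative value assigns x to A); the function name `classify_unequal` and the toy samples are hypothetical and not from the paper.

```r
# Assign each row of x to "A" or "B" by comparing squared Mahalanobis
# distances, each computed with that group's own covariance matrix.
classify_unequal <- function(train_a, train_b, x) {
  w <- mahalanobis(x, colMeans(train_b), var(train_b)) -
       mahalanobis(x, colMeans(train_a), var(train_a))
  ifelse(w >= 0, "A", "B")
}

# Toy samples (made-up values): A is tightly clustered, B is spread out
A <- matrix(c(1, 2,
              2, 1,
              2, 3,
              3, 2), ncol = 2, byrow = TRUE)
B <- matrix(c(4, 6,
              6, 4,
              6, 8,
              8, 6), ncol = 2, byrow = TRUE)

# (3.5, 3.5) lies nearer to A's mean in Euclidean terms, but B's larger
# spread makes it closer to B in Mahalanobis distance.
test_points <- rbind(c(2, 3), c(3.5, 3.5))
labels <- classify_unequal(A, B, test_points)
print(labels)
```

The second test point shows why the group covariances matter: a distance scaled by each group's own spread can reverse the ordering that plain Euclidean distance would give.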
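We next consider a second method, Fisher discrimination, which seeks a discriminant function under the criterion that the within-class variance be as small as possible and the between-class variance as large as possible.
Let the two populations A and B have means $\mu_1$, $\mu_2$ and covariance matrices $\Sigma_1$, $\Sigma_2$. For any test sample x, let its discriminant function be $u=u(x)$, and write

$$u_1=E\bigl(u(x)\mid x\in A\bigr),\qquad u_2=E\bigl(u(x)\mid x\in B\bigr),$$

$$\sigma_1^2=\mathrm{Var}\bigl(u(x)\mid x\in A\bigr),\qquad \sigma_2^2=\mathrm{Var}\bigl(u(x)\mid x\in B\bigr).$$

We require $u(x)$ to minimize the within-class sum of squared deviations $W_0=\sigma_1^2+\sigma_2^2$ and to maximize the between-class sum of squared deviations $B_0=(u_1-\bar u)^2+(u_2-\bar u)^2$, where $\bar u=\tfrac12(u_1+u_2)$. In other words, $u(x)$ should maximize

$$I=\frac{B_0}{W_0}.$$

If $|u(x)-u_1|\le|u(x)-u_2|$ then $x\in A$; otherwise $x\in B$.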
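The derivation yields the discriminant function

$$u(x)=d^{T}S^{-1}(x-\bar x),$$

where

$$d=\bar x^{(2)}-\bar x^{(1)},\qquad S=S_1+S_2,$$

$$\bar x^{(i)}=\frac{1}{n_i}\sum_{j=1}^{n_i}x_j^{(i)},\qquad \bar x=\frac{1}{n}\sum_{i=1}^{2}\sum_{j=1}^{n_i}x_j^{(i)},\qquad n=n_1+n_2,$$

$$S_i=\sum_{j=1}^{n_i}\bigl(x_j^{(i)}-\bar x^{(i)}\bigr)\bigl(x_j^{(i)}-\bar x^{(i)}\bigr)^{T},\quad i=1,2.$$

If $u(x)\le 0$ then $x\in A$; otherwise $x\in B$.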
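The form of this discriminant can be motivated as follows, under the added assumption (not stated in the paper) that $u$ is linear, $u(x)=a^{T}x$. This is a sketch of the standard argument, not the authors' own derivation.

```latex
\begin{aligned}
u_i &= a^{T}\mu_i, \qquad \sigma_i^2 = a^{T}\Sigma_i\,a, \qquad i=1,2,\\
B_0 &= (u_1-\bar u)^2+(u_2-\bar u)^2
     = \tfrac12\bigl(a^{T}(\mu_2-\mu_1)\bigr)^2,\\
W_0 &= a^{T}(\Sigma_1+\Sigma_2)\,a,\\
I(a) &= \frac{B_0}{W_0}
      = \frac{\bigl(a^{T}(\mu_2-\mu_1)\bigr)^2}{2\,a^{T}(\Sigma_1+\Sigma_2)\,a}
\;\Longrightarrow\;
a \propto (\Sigma_1+\Sigma_2)^{-1}(\mu_2-\mu_1).
\end{aligned}
```

The last step follows from the Cauchy-Schwarz inequality, provided $\Sigma_1+\Sigma_2$ is positive definite. Replacing $\mu_i$ by $\bar x^{(i)}$ and $\Sigma_i$ by $S_i/(n_i-1)$ gives, for equal group sizes such as the two groups of 30 cases here, a direction proportional to $S^{-1}d$ with $S=S_1+S_2$ and $d=\bar x^{(2)}-\bar x^{(1)}$.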
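6. Model solution
To solve the model, we implemented the two algorithms above as R programs, entered the samples by hand, and ran them on a computer. The program listings follow.
Mahalanobis distance discrimination: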
# Training sample A: the 30 confirmed nephritis patients.
# One row per person; columns are Zn, Cu, Fe, Ca, Mg, K, Na.
A <- matrix(c(
  166, 15.8, 24.5, 700, 112, 179, 513,
  185, 15.7, 31.5, 701, 125, 184, 427,
  193, 9.80, 25.9, 541, 163, 128, 642,
  159, 14.2, 39.7, 896, 99.2, 239, 726,
  226, 16.2, 23.8, 606, 152, 70.3, 218,
  171, 9.29, 9.29, 307, 187, 45.5, 257,
  201, 13.3, 26.6, 551, 101, 49.4, 141,
  147, 14.5, 30.0, 659, 102, 154, 680,
  172, 8.85, 7.86, 551, 75.7, 98.4, 318,
  156, 11.5, 32.5, 639, 107, 103, 552,
  132, 15.9, 17.7, 578, 92.4, 1314, 1372,
  182, 11.3, 11.3, 767, 111, 264, 672,
  186, 9.26, 37.1, 958, 233, 73.0, 347,
  162, 8.23, 27.1, 625, 108, 62.4, 465,
  150, 6.63, 21.0, 627, 140, 179, 639,
  159, 10.7, 11.7, 612, 190, 98.5, 390,
  117, 16.1, 7.04, 988, 95.5, 136, 572,
  181, 10.1, 4.04, 1437, 184, 101, 542,
  146, 20.7, 23.8, 1232, 128, 150, 1092,
  42.3, 10.3, 9.70, 629, 93.7, 439, 888,
  28.2, 12.4, 53.1, 370, 44.1, 454, 852,
  154, 13.8, 53.3, 621, 105, 160, 723,
  179, 12.2, 17.9, 1139, 150, 45.2, 218,
  13.5, 3.36, 16.8, 135, 32.6, 51.6, 182,
  175, 5.84, 24.9, 807, 123, 55.6, 126,
  113, 15.8, 47.3, 626, 53.6, 168, 627,
  50.5, 11.6, 6.30, 608, 58.9, 58.9, 139,
  78.6, 14.6, 9.70, 421, 70.8, 133, 464,
  90.0, 3.27, 8.17, 622, 52.3, 770, 852,
  178, 28.8, 32.4, 992, 112, 70.2, 169
), ncol = 7, byrow = TRUE)
# Training sample B: the 30 confirmed healthy people (same column order).
B <- matrix(c(
  213, 19.1, 36.2, 2220, 249, 40.0, 168,
  170, 13.9, 29.8, 1285, 226, 47.9, 330,
  162, 13.2, 19.8, 1521, 166, 36.2, 133,
  203, 13.0, 90.8, 1544, 162, 98.90, 394,
  167, 13.1, 14.1, 2278, 212, 46.3, 134,
  164, 12.9, 18.6, 2993, 197, 36.3, 94.5,
  167, 15.0, 27.0, 2056, 260, 64.6, 237,
  158, 14.4, 37.0, 1025, 101, 44.6, 72.5,
  133, 22.8, 31.0, 1633, 401, 180, 899,
  156, 135, 322, 6747, 1090, 228, 810,
  169, 8.00, 308, 1068, 99.1, 53.0, 289,
  247, 17.3, 8.65, 2554, 241, 77.9, 373,
  166, 8.10, 62.8, 1233, 252, 134, 649,
  209, 6.43, 86.9, 2157, 288, 74.0, 219,
  182, 6.49, 61.7, 3870, 432, 143, 367,
  235, 15.6, 23.4, 1806, 166, 68.8, 188,
  173, 19.1, 17.0, 2497, 295, 65.8, 287,
  151, 19.7, 64.2, 2031, 403, 182, 874,
  191, 65.4, 35.0, 5361, 392, 137, 688,
  223, 24.4, 86.0, 3603, 353, 97.7, 479,
  221, 20.1, 155, 3172, 368, 150, 739,
  217, 25.0, 28.2, 2343, 373, 110, 494,
  164, 22.2, 35.5, 2212, 281, 153, 549,
  173, 8.99, 36.0, 1624, 216, 103, 257,
  202, 18.6, 17.7, 3785, 225, 31.0, 67.3,
  182, 17.3, 24.8, 3073, 246, 50.7, 109,
  211, 24.0, 17.0, 3836, 428, 73.5, 351,
  246, 21.5, 93.2, 2112, 354, 71.7, 195,
  164, 16.1, 38.0, 2135, 152, 64.3, 240,
  179, 21.0, 35.0, 1560, 226, 47.9, 330
), ncol = 7, byrow = TRUE)
# Test sample X: the 30 people to be classified (same column order).
X <- matrix(c(
  58.2, 5.42, 29.7, 323, 138, 179, 513,
  106, 1.87, 40.5, 542, 177, 184, 427,
  152, 0.80, 12.5, 1332, 176, 128, 646,
  85.5, 1.70, 3.99, 503, 62.3, 238, 762.6,
  144, 0.70, 15.1, 547, 79.7, 71.0, 218.5,
  85.7, 1.09, 4.2, 790, 170, 45.8, 257.9,
  144, 0.30, 9.11, 417, 552, 49.5, 141.5,
  170, 4.16, 9.32, 943, 260, 155, 680.8,
  176, 0.57, 27.3, 318, 133, 99.4, 318.8,
  192, 7.06, 32.9, 1969, 343, 103, 553,
  188, 8.28, 22.6, 1208, 231, 1314, 1372,
  153, 5.87, 34.8, 328, 163, 264, 672.5,
  143, 2.84, 15.7, 265, 123, 73.0, 347.5,
  213, 19.1, 36.2, 2220, 249, 62.0, 465.8,
  192, 20.1, 23.8, 1606, 156, 40.0, 168,
  171, 10.5, 30.5, 672, 145, 47.0, 330.5,
  162, 13.2, 19.8, 1521, 166, 36.2, 133,
  203, 13.0, 90.8, 1544, 162, 98.9, 394.5,
  164, 20.1, 28.9, 1062, 161, 47.3, 134.5,
  167, 13.1, 14.1, 2278, 212, 36.5, 96.5,
  164, 12.9, 18.6, 2993, 197, 65.5, 237.8,
  167, 15.0, 27.0, 2056, 260, 44.8, 72.0,
  158, 14.4, 37.0, 1025, 101, 180, 899.5,
  133, 22.8, 31.3, 1633, 401, 228, 289,
  169, 8.0, 30.8, 1068, 99.1, 53.0, 817,
  247, 17.3, 8.65, 2554, 241, 77.5, 373.5,
  185, 3.90, 31.3, 1211, 190, 134, 649.8,
  209, 6.43, 86.9, 2157, 288, 74.0, 219.8,
  182, 6.49, 61.7, 3870, 432, 143, 367.5,
  235, 15.6, 23.4, 1806, 166, 68.9, 188
), ncol = 7, byrow = TRUE)
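discri1 <- function(TrnA, TrnB, TstX = NULL, var.equal = FALSE) {
  # TrnA, TrnB and TstX are the patient, healthy and test samples.
  # var.equal defaults to FALSE: the two covariance matrices are not
  # assumed to be equal.
  if (is.null(TstX)) TstX <- rbind(TrnA, TrnB)
  nx <- nrow(TstX)            # number of test samples (30 for X)
  blong <- array(0, c(nx))    # class labels, one per test sample, initialised to 0
  Ab <- apply(TrnA, 2, mean)  # column means of TrnA (length 7)
  Bb <- apply(TrnB, 2, mean)  # column means of TrnB (length 7)
  if (var.equal) {
    # Equal covariances. S is the covariance of the stacked training
    # samples, used as the estimate of the common covariance matrix.
    # Note that this is not identical to the pooled estimate
    # (S1 + S2) / (n1 + n2 - 2) given in Section 5.
    S <- var(rbind(TrnA, TrnB))
    Xb <- (Ab + Bb) / 2
    for (i in 1:nx) {
      w <- (TstX[i, ] - Xb) %*% solve(S, Ab - Bb)  # discriminant value
      if (w > 0)
        blong[i] <- 1   # classified as patient
      else
        blong[i] <- 2   # classified as healthy
    }
  } else {
    # Unequal covariances
    Sa <- var(TrnA); Sb <- var(TrnB)
    for (i in 1:nx) {
      y <- TstX[i, ] - Ab; z <- TstX[i, ] - Bb
      w <- z %*% solve(Sb, z) - y %*% solve(Sa, y)  # d^2(x,B) - d^2(x,A)
      if (w > 0)
        blong[i] <- 1   # classified as patient
      else
        blong[i] <- 2   # classified as healthy
    }
  }
  blong
}
discri1(A, B, X)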
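Calling the function without a test matrix makes it classify the training samples themselves, which gives a simple resubstitution check of the method. A minimal sketch of tabulating such a check follows; the helper `labels_from_distance` and the toy data are hypothetical stand-ins, not the authors' code or data.

```r
# Hypothetical stand-in for a distance-based classifier: returns 1 when a
# row is closer (in Mahalanobis distance) to group A, otherwise 2.
labels_from_distance <- function(train_a, train_b, test_x) {
  w <- mahalanobis(test_x, colMeans(train_b), var(train_b)) -
       mahalanobis(test_x, colMeans(train_a), var(train_a))
  ifelse(w > 0, 1, 2)
}

# Toy training data (made-up values)
train_a <- matrix(c(1, 2,  2, 1,  2, 3,  3, 2,  2.5, 2.5),
                  ncol = 2, byrow = TRUE)
train_b <- matrix(c(5, 6,  6, 5,  6, 7,  7, 6,  6.5, 6.5),
                  ncol = 2, byrow = TRUE)

# True labels of the stacked training samples, and the predicted labels
truth <- c(rep(1, nrow(train_a)), rep(2, nrow(train_b)))
predicted <- labels_from_distance(train_a, train_b, rbind(train_a, train_b))

# Confusion table and resubstitution error rate
confusion <- table(truth = truth,
                   predicted = factor(predicted, levels = 1:2))
error_rate <- mean(predicted != truth)
print(confusion)
print(error_rate)
```

Resubstitution tends to be optimistic because the same samples are used to fit and to test the rule, so a low error rate here is a necessary check and not a guarantee of accuracy on new cases.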
Fisher discrimination:
# A, B and X are the same matrices as in the Mahalanobis distance listing above.
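discri2 <- function(TrnA, TrnB, TstX = NULL) {
  if (is.null(TstX)) TstX <- rbind(TrnA, TrnB)
  nx <- nrow(TstX); blong <- array(0, c(nx))                # labels, initialised to 0
  na <- nrow(TrnA); nb <- nrow(TrnB)                        # group sizes
  Ab <- apply(TrnA, 2, mean); Bb <- apply(TrnB, 2, mean)    # column means
  S <- (na - 1) * var(TrnA) + (nb - 1) * var(TrnB)          # S = S1 + S2
  xb <- na / (na + nb) * Ab + nb / (na + n…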
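The listing above stops partway through the function. As a hedged sketch only, the following shows one way a two-class Fisher rule consistent with the discriminant function $u(x)=d^{T}S^{-1}(x-\bar x)$ of Section 5 could be written. The function name `fisher_classify` and the toy data are hypothetical; this is not the authors' missing code.

```r
# Sketch of a two-class Fisher discriminant.
# Returns 1 for class A (u(x) <= 0) and 2 for class B.
fisher_classify <- function(train_a, train_b, test_x = NULL) {
  if (is.null(test_x)) test_x <- rbind(train_a, train_b)
  if (is.vector(test_x)) test_x <- matrix(test_x, nrow = 1)
  na <- nrow(train_a); nb <- nrow(train_b)
  mean_a <- colMeans(train_a); mean_b <- colMeans(train_b)
  # Within-class scatter S = S1 + S2
  s <- (na - 1) * var(train_a) + (nb - 1) * var(train_b)
  # Overall mean of the stacked training samples
  xbar <- (na * mean_a + nb * mean_b) / (na + nb)
  # u(x) = d^T S^{-1} (x - xbar), with d = mean_b - mean_a
  u <- sweep(test_x, 2, xbar) %*% solve(s, mean_b - mean_a)
  ifelse(drop(u) <= 0, 1, 2)
}

# Toy two-dimensional samples (made-up values)
toy_a <- matrix(c(1, 2,  2, 1,  2, 3,  3, 2), ncol = 2, byrow = TRUE)
toy_b <- matrix(c(5, 6,  6, 5,  6, 7,  7, 6), ncol = 2, byrow = TRUE)

train_labels <- fisher_classify(toy_a, toy_b)          # classify training rows
new_label <- fisher_classify(toy_a, toy_b, c(2.5, 2))  # classify one new point
print(train_labels)
print(new_label)
```

With the paper's data the call would take the form `fisher_classify(A, B, X)`, assuming `A`, `B` and `X` are the 30 x 7 matrices from the listings.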