Data Analysis 33 (Word-format document download).docx
- Document ID: 17705661
- Uploaded: 2022-12-08
- Format: DOCX
- Pages: 13
- Size: 42.41 KB
s not necessarily true that μ = 0.55, σ = 1.
x̄ = 0.55, s = 1
The sample distribution is right skewed.
0.55 is a point estimate for the population mean.
Total
1.00/1.00
Question Explanation
This question refers to the following learning objective(s):
Define sample statistic as a point estimate for a population parameter, for example, the sample mean is used to estimate the population mean, and note that point estimate and sample statistic are synonymous.
Question 2
Researchers studying anthropometry collected various body and skeletal measurements for 507 physically active individuals. The histogram below shows the sample distribution of heights in centimeters. If the 507 individuals are a simple random sample - and let’s assume they are - then the sample mean is a point estimate for the mean height of all active individuals. What measure do we use to quantify the variability of such an estimate?
Compute this quantity using the data from this sample and choose the best answer below.
standard deviation = 0.417
standard error = 0.417
mean squared error = 0.105
standard deviation = 0.019
standard error = 0.019
Incorrect
0.00
We quantify variability in the sample mean by calculating the standard error (of the mean): SE = σ/√n.
0.00/1.00
Calculate the sampling variability of the mean, the standard error, as SE = σ/√n.
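The formula SE = σ/√n can be sketched in a few lines of Python. In practice σ is unknown and the sample standard deviation s is used in its place. The sample size n = 507 is taken from Question 2; the value s = 9.4 cm is an assumption made only for illustration, because the histogram is not reproduced in this text. The function name standard_error is likewise hypothetical.

```python
import math

def standard_error(sample_sd: float, n: int) -> float:
    """Standard error of the sample mean: SE = s / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sample_sd / math.sqrt(n)

# n = 507 comes from Question 2; s = 9.4 cm is an assumed value,
# since the histogram itself is not reproduced in this text.
se = standard_error(sample_sd=9.4, n=507)
print(f"SE = {se:.3f}")
```

Under that assumed s the result is roughly 0.417, far smaller than s itself, because the mean of 507 observations varies much less than a single observation does.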
Question 3
The standard error measures:
the variability of sample statistics
the variability of population parameters
the variability in the population
the variability of the sampled observations
The "variability of the sampled observations" is measured by the sample standard deviation, s.
Distinguish standard deviation (σ or s) and standard error (SE): standard deviation measures the variability in the data, while standard error measures the variability in point estimates from different samples of the same size and from the same population, i.e. measures the sampling variability.
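The distinction between standard deviation and standard error can be made concrete with a small simulation. This is only a sketch: the population (mean 170, SD 10), the sample size of 50, and the 2,000 repetitions are all arbitrary assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 values with mean 170 and SD 10.
population = [random.gauss(170, 10) for _ in range(100_000)]
n = 50  # size of each sample (an arbitrary choice)

# Standard deviation: spread of the observations within one sample.
one_sample = random.sample(population, n)
s = statistics.stdev(one_sample)

# Standard error: spread of the sample mean across many samples of size n.
means = [statistics.mean(random.sample(population, n)) for _ in range(2_000)]
se_simulated = statistics.stdev(means)
se_formula = statistics.pstdev(population) / n ** 0.5

print(f"s = {s:.2f}")
print(f"simulated SE = {se_simulated:.2f}, sigma/sqrt(n) = {se_formula:.2f}")
```

The spread of the sample means should land close to σ/√n and well below s, which describes individual observations rather than point estimates.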
Question 4
Suppose you took a large number of random samples of size n from a large population and calculated the mean of each sample. Then suppose you plotted the distribution of your sample means in a histogram. Now consider the following possible attributes of your collected data and the population from which they were sampled. For which of the following sets of attributes would you not expect your histogram of your sample means to follow a nearly normal distribution?
n = 120. The population distribution is unknown, but the distribution of data in each sample is slightly skewed.
n = 20. The population distribution is nearly normal.
n = 120. The population distribution is slightly skewed.
n = 10. The population distribution is unknown, but the distribution of data in each sample is heavily skewed.
The sample size is small and the population distribution might be skewed, so the sampling distribution is unlikely to be nearly normal.
Recognize that the Central Limit Theorem (CLT) is about the distribution of point estimates, and that given certain conditions, this distribution will be nearly normal.
In the case of the mean the CLT tells us that if
(1a) the sample size is sufficiently large (n ≥ 30) and the data are not extremely skewed or
(1b) the population is known to have a normal distribution, and
(2) the observations in the sample are independent,
then the distribution of the sample mean will be nearly normal, centered at the true population mean and with a standard error of σ/√n:
x̄ ∼ N(mean = μ, SE = σ/√n)
∙ When the population distribution is unknown, condition (1a) can be checked using a histogram or some other visualization of the distribution of the observed data in the sample.
∙ The larger the sample size (n), the less important the shape of the distribution becomes, i.e. when n is very large the sampling distribution will be nearly normal regardless of the shape of the population distribution.
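The effect of sample size on the shape of the sampling distribution can be checked by simulation. The sketch below assumes a right-skewed Exponential(1) population, which is an illustrative choice and not part of the quiz, and compares the skewness of the sample means for n = 10 and n = 120. The helper names skewness and sample_means are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Skewness: mean cubed deviation divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

def sample_means(n, rng, reps=2_000):
    """Means of `reps` independent samples of size n from Exponential(1)."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
skew_small = skewness(sample_means(10, rng))
skew_large = skewness(sample_means(120, rng))

print(f"skewness of sample means, n = 10:  {skew_small:.2f}")
print(f"skewness of sample means, n = 120: {skew_large:.2f}")
```

With n = 10 the sample means should still be visibly right skewed, while with n = 120 the skewness should be much closer to zero, in line with the second bullet above.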
Question 5
The General Social Survey (GSS) is a sociological survey used to collect data on demographic characteristics and attitudes of residents of the United States. In 2010, the survey collected responses from over a thousand US residents. The survey is conducted face-to-face with an in-person interview of a randomly-selected sample of adults. One of the questions on the survey is “For how many days during the past 30 days was your mental health, which includes stress, depression, and problems with emotions, not good?”
Based on responses from 1,151 US residents, the survey reported a 95% confidence interval of 3.40 to 4.24 days in 2010. Given this information, which of the following statements would be most appropriate to make regarding the true average number of days of “not good” mental health in 2010 for US residents?
For all US residents in 2010, there is a 95% probability that the true average number of days of “not good” mental health is between 3.40 and 4.24 days.
For all US residents in 2010, based on this 95% confidence interval, we would reject a null hypothesis stating that the true average number of days of “not good” mental health is 5 days.
There is not sufficient information to calculate the margin of error of this confidence interval.
For these 1,151 residents in 2010, we are 95% confident that the average number of days of “not good” mental health is between 3.40 and 4.24 days.
The confidence intervals only try to capture population parameters, not sample means. We can calculate exactly what the sample mean is.
∙ Interpret a confidence interval as “We are XX% confident that the true population parameter is in this interval”, where XX% is the desired confidence level.
∙ Define margin of error as the distance required to travel in either direction away from the point estimate when constructing a confidence interval.
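The interval reported in Question 5 is enough to recover both the point estimate and the margin of error, assuming the interval is symmetric (point estimate ± margin of error). The sketch below also checks whether the null value of 5 days from one of the answer choices falls inside the interval. The function name interval_summary is hypothetical.

```python
def interval_summary(lower: float, upper: float):
    """Return (point estimate, margin of error) for a symmetric interval."""
    point_estimate = (lower + upper) / 2
    margin_of_error = (upper - lower) / 2
    return point_estimate, margin_of_error

lower, upper = 3.40, 4.24  # 95% confidence interval from Question 5
null_value = 5.0           # null value from one of the answer choices

estimate, margin = interval_summary(lower, upper)
null_inside = lower <= null_value <= upper

print(f"point estimate = {estimate:.2f}, margin of error = {margin:.2f}")
print("null value inside interval:", null_inside)
```

A null value that lies outside a 95% confidence interval is one that a two-sided test at roughly the 5% significance level would reject.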
Question 6
A study suggests that the average college student spends 2 hours per week communicating with others online. You believe that this is an underestimate and decide to collect your own sample for a hypothesis test. You randomly sample 60 students from your dorm and find that on average they spent 3.5 hours a week communicating with others online. Which of the following is the correct set of hypotheses for this scenario?
H0: μ = 2, HA: μ > 2
H0: μ = 2, HA: μ < 2
H0: μ = 3.5, HA: μ … 3.5
H0: x̄ = 2, HA: x̄ < 2
H0: x̄ = 2, HA: x̄ > 2
∙ Always construct hypotheses about population parameters (e.g. population mean, μ) and not the sample statistics (e.g. sample mean, x̄). Note that the population parameter is unknown while the sample statistic is measured using the observed data and hence there is no point in hypothesizing about it.
∙ Define the null value as the value the parameter is set to equal in the null hypothesis.
∙ Note that the alternative hypothesis might be one-sided (μ < or > the null value) or two-sided (μ ≠ the null value), and the choice depends on the research question.
Question 7
Which of the following is the correct definition of the p-value?
P(observed or more extreme sample statistic | H0 true)
P(H0 true)
P(H0 true | HA false)
P(H0 true | observed data)
Review the associated learning objective. This sounds more like the posterior probability: P(hypothesis | data).
Define a p-value as the conditional probability of obtaining a sample statistic at least as extreme as the one observed given that the null hypothesis is true.
p-value = P(observed or more extreme sample statistic | H0 true)
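As a rough illustration of this definition, the sketch below computes a one-sided p-value for the scenario in Question 6 using a normal approximation. The sample mean (3.5), null value (2) and sample size (60) come from that question; the sample standard deviation of 4 hours is a hypothetical value, since the text does not report one. The function name one_sided_p_value is also an assumption.

```python
import math
from statistics import NormalDist

def one_sided_p_value(sample_mean, null_value, sample_sd, n):
    """P(sample mean at least this large | H0 true), normal approximation."""
    se = sample_sd / math.sqrt(n)
    z = (sample_mean - null_value) / se
    return 1 - NormalDist().cdf(z)

# Sample mean 3.5, null value 2 and n = 60 come from Question 6;
# sample_sd = 4 hours is a hypothetical value, not given in the text.
p = one_sided_p_value(sample_mean=3.5, null_value=2, sample_sd=4, n=60)
print(f"p-value = {p:.4f}")
```

The result is a statement about the data given H0, not the probability that H0 is true, which is the distinction the answer choices are testing.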
Question 8
All but one of the following confidence intervals has a margin of error of 0.7. Which is the confidence interval with the different margin of error?
(−4.7, −3.3)
(1.6, 4.4)
The width of a confidence interval is 2 times the margin of error, since we add and subtract the same margin of error to and from the sample statistic to obtain the bounds of the confidence interval. To solve this question we need to calculate the margin of error using this rule for each choice:
|(1.6 − 4.4)/2| = 1.4
(20.3, 21.7)
(−0.5, 0.9)
∙ Recognize that when the sample size increases we would expect the sampling variability to decrease.
∙ Define margin of error as the distance required to travel in either direction away from the point estimate when constructing a confidence interval, i.e. z⋆ × SE.
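The half-width rule can be applied to all four intervals in Question 8 at once. This is a minimal sketch; the function name margin_of_error is hypothetical, and results are rounded to two decimals to avoid floating-point noise in the comparison.

```python
intervals = [(-4.7, -3.3), (1.6, 4.4), (20.3, 21.7), (-0.5, 0.9)]

def margin_of_error(lower, upper):
    """Half the width of a symmetric confidence interval."""
    return abs(upper - lower) / 2

# Round to two decimals so that 0.7000000000000002 compares equal to 0.7.
margins = [round(margin_of_error(lo, hi), 2) for lo, hi in intervals]
odd_ones = [ci for ci, m in zip(intervals, margins) if m != 0.7]

for ci, m in zip(intervals, margins):
    print(ci, "margin of error =", m)
print("different margin of error:", odd_ones)
```

Only (1.6, 4.4) has a margin of error other than 0.7, which agrees with the worked calculation in the explanation above.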
Question 9
A researcher found a 2006-2010 survey showing that the average age of