Data Mining Technology: Chinese-English Parallel Foreign Literature Translation.docx
- Document ID: 28928232
- Upload date: 2023-07-20
- Format: DOCX
- Pages: 13
- Size: 24.39KB
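Data Mining Technology: Foreign Literature Translation with Parallel English and Chinese Text
(The document contains the English original and its Chinese translation)
Parallel Chinese-English Translation of Foreign-Language Material
English Original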
Introduction to Data Mining
Abstract:
Microsoft® SQL Server™ 2005 provides an integrated environment for creating and working with data mining models. This tutorial uses four scenarios, targeted mailing, forecasting, market basket, and sequence clustering, to demonstrate how to use the mining model algorithms, mining model viewers, and data mining tools that are included in this release of SQL Server.
Introduction
The data mining tutorial is designed to walk you through the process of creating data mining models in Microsoft SQL Server 2005. The data mining algorithms and tools in SQL Server 2005 make it easy to build a comprehensive solution for a variety of projects, including market basket analysis, forecasting analysis, and targeted mailing analysis. The scenarios for these solutions are explained in greater detail later in the tutorial.
The most visible components in SQL Server 2005 are the workspaces that you use to create and work with data mining models. The online analytical processing (OLAP) and data mining tools are consolidated into two working environments: Business Intelligence Development Studio and SQL Server Management Studio. Using Business Intelligence Development Studio, you can develop an Analysis Services project disconnected from the server. When the project is ready, you can deploy it to the server. You can also work directly against the server. The main function of SQL Server Management Studio is to manage the server. Each environment is described in more detail later in this introduction. For more information on choosing between the two environments, see "Choosing Between SQL Server Management Studio and Business Intelligence Development Studio" in SQL Server Books Online.
All of the data mining tools exist in the data mining editor. Using the editor, you can manage mining models, create new models, view models, compare models, and create predictions based on existing models.
After you build a mining model, you will want to explore it, looking for interesting patterns and rules. Each mining model viewer in the editor is customized to explore models built with a specific algorithm. For more information about the viewers, see "Viewing a Data Mining Model" in SQL Server Books Online.
Often your project will contain several mining models, so before you can use a model to create predictions, you need to be able to determine which model is the most accurate. For this reason, the editor contains a model comparison tool called the Mining Accuracy Chart tab. Using this tool, you can compare the predictive accuracy of your models and determine the best model.
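The idea of comparing models by predictive accuracy can be illustrated with a minimal Python sketch. It is not how the Mining Accuracy Chart tab computes its charts; it assumes plain accuracy on a held-out set of cases, and the model names and data below are hypothetical.

```python
from typing import Dict, List, Sequence


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of held-out cases where the prediction matches the actual value."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def best_model(predictions: Dict[str, List[int]], actual: List[int]) -> str:
    """Return the name of the model with the highest accuracy on the test cases."""
    return max(predictions, key=lambda name: accuracy(predictions[name], actual))


# Hypothetical held-out outcomes (1 = bought a bike, 0 = did not).
actual = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from three candidate models.
predictions = {
    "decision_tree": [1, 0, 1, 0, 0, 0, 1, 0],
    "clustering": [1, 1, 0, 1, 0, 1, 1, 0],
    "naive_bayes": [1, 0, 1, 1, 0, 1, 0, 0],
}

scores = {name: accuracy(p, actual) for name, p in predictions.items()}
winner = best_model(predictions, actual)
```

Scoring every model against the same held-out cases keeps the comparison fair; a model evaluated on its own training data would look more accurate than it is.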
To create predictions, you will use the Data Mining Extensions (DMX) language. DMX extends SQL, containing commands to create, modify, and predict against mining models. For more information about DMX, see "Data Mining Extensions (DMX) Reference" in SQL Server Books Online. Because creating a prediction can be complicated, the data mining editor contains a tool called Prediction Query Builder, which allows you to build queries using a graphical interface. You can also view the DMX code that is generated by the query builder.
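To give a rough sense of what a DMX prediction query looks like, the Python sketch below assembles the text of a PREDICTION JOIN statement. It only builds a string and does not connect to a server. The model name, data source name, column names, and the helper `build_prediction_query` are all hypothetical and are not taken from this tutorial.

```python
from typing import Sequence


def build_prediction_query(
    model: str,
    data_source: str,
    source_sql: str,
    key_column: str,
    target: str,
    join_columns: Sequence[str],
) -> str:
    """Compose the text of a DMX PREDICTION JOIN statement (illustrative only)."""
    # Map each model column to the source column of the same name.
    on_clause = " AND\n    ".join(
        f"[{model}].[{col}] = t.[{col}]" for col in join_columns
    )
    # The source query is passed as a string literal, so single quotes are doubled.
    escaped_sql = source_sql.replace("'", "''")
    return (
        f"SELECT t.[{key_column}], Predict([{target}]) AS [Predicted {target}]\n"
        f"FROM [{model}]\n"
        f"PREDICTION JOIN OPENQUERY([{data_source}], '{escaped_sql}') AS t\n"
        f"ON {on_clause}"
    )


# All names below are assumptions made for the example.
query = build_prediction_query(
    model="TM Decision Tree",
    data_source="Adventure Works DW",
    source_sql="SELECT CustomerKey, Age, CommuteDistance FROM ProspectiveBuyer",
    key_column="CustomerKey",
    target="Bike Buyer",
    join_columns=["Age", "CommuteDistance"],
)
print(query)
```

The shape of the statement is the point here: a trained model is joined to new cases, and a prediction function is applied to the target column.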
Just as important as the tools that you use to work with and create data mining models are the mechanics by which they are created. The key to creating a mining model is the data mining algorithm. The algorithm finds patterns in the data that you pass it, and it translates them into a mining model; it is the engine behind the process.
Some of the most important steps in creating a data mining solution are consolidating, cleaning, and preparing the data to be used to create the mining models. SQL Server 2005 includes the Data Transformation Services (DTS) working environment, which contains tools that you can use to clean, validate, and prepare your data. For more information on using DTS in conjunction with a data mining solution, see "DTS Data Mining Tasks and Transformations" in SQL Server Books Online.
In order to demonstrate the SQL Server data mining features, this tutorial uses a new sample database called AdventureWorksDW. The database is included with SQL Server 2005, and it supports OLAP and data mining functionality. In order to make the sample database available, you need to select the sample database at installation time in the “Advanced” dialog for component selection.
Adventure Works
AdventureWorksDW is based on a fictional bicycle manufacturing company named Adventure Works Cycles. Adventure Works produces and distributes metal and composite bicycles to North American, European, and Asian commercial markets. The base of operations is located in Bothell, Washington, with 500 employees, and several regional sales teams are located throughout their market base.
Adventure Works sells products wholesale to specialty shops and to individuals through the Internet. For the data mining exercises, you will work with the AdventureWorksDW Internet sales tables, which contain realistic patterns that work well for data mining exercises.
For more information on Adventure Works Cycles, see "Sample Databases and Business Scenarios" in SQL Server Books Online.
Database Details
The Internet sales schema contains information about 9,242 customers. These customers live in six countries, which are combined into three regions:
North America (83%)
Europe (12%)
Australia (7%)
The database contains data for three fiscal years: 2002, 2003, and 2004.
The products in the database are broken down by subcategory, model, and product.
Business Intelligence Development Studio
Business Intelligence Development Studio is a set of tools designed for creating business intelligence projects. Because Business Intelligence Development Studio was created as an IDE environment in which you can create a complete solution, you work disconnected from the server. You can change your data mining objects as much as you want, but the changes are not reflected on the server until after you deploy the project.
Working in an IDE is beneficial for the following reasons:
The Analysis Services project is the entry point for a business intelligence solution. An Analysis Services project encapsulates mining models and OLAP cubes, along with supplemental objects that make up the Analysis Services database. From Business Intelligence Development Studio, you can create and edit Analysis Services objects within a project and deploy the project to the appropriate Analysis Services server or servers.
If you are working with an existing Analysis Services project, you can also use Business Intelligence Development Studio to work connected to the server. In this way, changes are reflected directly on the server without having to deploy the solution.
SQL Server Management Studio
SQL Server Management Studio is a collection of administrative and scripting tools for working with Microsoft SQL Server components. This workspace differs from Business Intelligence Development Studio in that you are working in a connected environment where actions are propagated to the server as soon as you save your work.
After the data has been cleaned and prepared for data mining, most of the tasks associated with creating a data mining solution are performed within Business Intelligence Development Studio. Using the Business Intelligence Development Studio tools, you develop and test the data mining solution, using an iterative process to determine which models work best for a given situation. When you are satisfied with the solution, you deploy it to an Analysis Services server. From this point, the focus shifts from development to maintenance and use, and thus to SQL Server Management Studio. Using SQL Server Management Studio, you can administer your database and perform some of the same functions as in Business Intelligence Development Studio, such as viewing mining models and creating predictions from them.
Data Transformation Services
Data Transformation Services (DTS) comprises the Extract, Transform, and Load (ETL) tools in SQL Server 2005. These tools can be used to perform some of the most important tasks in data mining: cleaning and preparing the data for model creation. In data mining, you typically perform repetitive data transformations to clean the data before using the data to train a mining model. Using the tasks and transformations in DTS, you can combine data preparation and model creation into a single DTS package.
DTS also provides DTS Designer to help you easily build and run packages containing all of the tasks and transformations. Using DTS Designer, you can deploy the packages to a server and run them on a regularly scheduled basis. This is useful if, for example, you collect data weekly and want to perform the same cleaning transformations each time in an automated fashion.
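The kind of repeatable cleaning step described above can be sketched in plain Python. This is not DTS and does not use any DTS API; it only illustrates applying the same trim, validate, and de-duplicate rules to each weekly batch. The column names and rules are assumptions made for the example.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def clean_rows(rows: List[Row]) -> List[Dict[str, object]]:
    """Apply the same cleaning steps to every batch: trim, validate, de-duplicate."""
    seen = set()
    cleaned: List[Dict[str, object]] = []
    for row in rows:
        key = (row.get("customer_id") or "").strip()
        age_text = (row.get("age") or "").strip()
        # Validate: drop rows without a key or with a non-numeric age.
        if not key or not age_text.isdigit():
            continue
        # De-duplicate on the customer key, keeping the first occurrence.
        if key in seen:
            continue
        seen.add(key)
        # Normalize the region label; missing values become "Unknown".
        region = (row.get("region") or "Unknown").strip().title()
        cleaned.append({"customer_id": key, "age": int(age_text), "region": region})
    return cleaned


# Hypothetical weekly batch with a duplicate, an invalid age, and a missing region.
weekly_batch = [
    {"customer_id": " 11000 ", "age": "42", "region": "north america"},
    {"customer_id": "11000", "age": "42", "region": "North America"},
    {"customer_id": "11001", "age": "n/a", "region": "Europe"},
    {"customer_id": "11002", "age": " 35", "region": None},
]
cleaned_batch = clean_rows(weekly_batch)
```

Keeping the rules in one function means every batch is prepared identically before it is used to train a model, which is the benefit of scheduling a package instead of cleaning by hand.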
You can work with a Data Transformation project and an Analysis Services project together as part of a business intelligence solution by adding each project to a solution in Business Intelligence Development Studio.
Mining Model Algorithms
Data mining algorithms are the foundation from which mining models are created. The variety of algorithms included in SQL Server 2005 allows you to perform many types of analysis. For more specific information about the algorithms and how they can be adjusted using parameters, see "Data Mining Algorithms" in SQL Server Books Online.
Microsoft Decision Trees
The Microsoft Decision Trees algorithm supports both classification and regression, and it works well for predictive modeling. Using the algorithm, you can predict both discrete and continuous attributes.
In building a model, the algorithm examines how each input attribute in the dataset affects the result of the predicted attribute, and then it uses the input attributes with the strongest relationship to create a series of splits, called nodes. As new nodes are added to the model, a tree
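The idea of choosing the input attribute with the strongest relationship to the predicted attribute can be sketched in Python. The sketch assumes entropy-based information gain as the measure of strength, which is a common textbook choice and not necessarily the scoring method the Microsoft Decision Trees algorithm uses. The attribute names and rows are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows: List[Dict[str, str]], attribute: str, target: str) -> float:
    """Reduction in target entropy obtained by splitting the rows on one attribute."""
    base = entropy([r[target] for r in rows])
    groups: Dict[str, List[str]] = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_split(rows: List[Dict[str, str]], inputs: Sequence[str], target: str) -> str:
    """Pick the input attribute whose split gives the largest information gain."""
    return max(inputs, key=lambda a: information_gain(rows, a, target))


# Hypothetical cases: commute distance separates buyers perfectly, car ownership does not.
rows = [
    {"commute": "short", "owns_car": "no", "buyer": "yes"},
    {"commute": "short", "owns_car": "yes", "buyer": "yes"},
    {"commute": "long", "owns_car": "no", "buyer": "no"},
    {"commute": "long", "owns_car": "yes", "buyer": "no"},
]
chosen = best_split(rows, ["commute", "owns_car"], "buyer")
```

A full tree builder would repeat this choice on each resulting subset of rows, adding one node per split until no attribute improves the result.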