03Data-Warehousing-and-O.ppt
- Document ID: 202047
- Uploaded: 2022-10-06
- Format: PPT
- Pages: 58
- Size: 3.19 MB

October 6, 2022, Data Mining: Concepts and Techniques, 1

Data Mining: Concepts and Techniques
Chapter 3
Jiawei Han
Department of Computer Science
University of Illinois at Urbana-Champaign
www.cs.uiuc.edu/hanj
© 2006 Jiawei Han and Micheline Kamber. All rights reserved.

Chapter 3: Data Warehousing and OLAP Technology: An Overview
- What is a data warehouse?
- A multi-dimensional data model
- Data warehouse architecture
- Data warehouse implementation
- From data warehousing to data mining

What is Data Warehouse?
- Defined in many different ways, but not rigorously.
  - A decision support database that is maintained separately from the organization's operational database
  - Support information processing by providing a solid platform of consolidated, historical data for analysis.
- "A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process." (W. H. Inmon)
- Data warehousing: The process of constructing and using data warehouses

Data Warehouse: Subject-Oriented
- Organized around major subjects, such as customer, product, sales
- Focusing on the modeling and analysis of data for decision makers, not on daily operations or transaction processing
- Provide a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process

Data Warehouse: Integrated
- Constructed by integrating multiple, heterogeneous data sources
  - relational databases, flat files, on-line transaction records
- Data cleaning and data integration techniques are applied.
  - Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources
    - E.g., Hotel price: currency, tax, breakfast covered, etc.
  - When data is moved to the warehouse, it is converted.
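
The hotel price example can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the exchange rates, the flat breakfast charge, the field names, and the choice of "USD, tax included, breakfast included" as the warehouse convention. None of it comes from the slides.

```python
from dataclasses import dataclass

# Hypothetical exchange rates to USD; a real system would use a reference table.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.10}

# Hypothetical flat breakfast charge in USD, added when a source excludes breakfast.
BREAKFAST_USD = 12.0


@dataclass
class SourceRecord:
    """One hotel price as reported by a single source (illustrative fields)."""
    hotel: str
    price: float
    currency: str
    tax_included: bool
    tax_rate: float
    breakfast_included: bool


def to_warehouse_price(rec: SourceRecord) -> float:
    """Convert a source price to the assumed warehouse convention:
    USD, tax included, breakfast included."""
    usd = rec.price * RATES_TO_USD[rec.currency]
    if not rec.tax_included:
        usd *= 1.0 + rec.tax_rate
    if not rec.breakfast_included:
        usd += BREAKFAST_USD
    return round(usd, 2)


# The same hotel described under two different source conventions.
source_a = SourceRecord("Hotel X", 100.0, "USD", False, 0.10, False)
source_b = SourceRecord("Hotel X", 100.0, "EUR", True, 0.10, True)

print(to_warehouse_price(source_a), to_warehouse_price(source_b))
```

Both sources quote "100", yet the converted values differ, which is the reason conversion happens when data moves into the warehouse rather than at query time.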

Data Warehouse: Time Variant
- The time horizon for the data warehouse is significantly longer than that of operational systems
  - Operational database: current value data
  - Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years)
- Every key structure in the data warehouse
  - Contains an element of time, explicitly or implicitly
  - But the key of operational data may or may not contain "time element"
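
A minimal sketch of what a time element in the key means in practice, using plain dictionaries. The account identifier, dates, and balances are invented for illustration.

```python
from datetime import date

# Operational store: the key has no time element, so an update overwrites the value.
operational = {}
operational["acct-1"] = 500
operational["acct-1"] = 650  # the earlier balance is no longer available

# Warehouse store: the key includes the snapshot date, so history accumulates.
warehouse = {}
warehouse[("acct-1", date(2022, 1, 31))] = 500
warehouse[("acct-1", date(2022, 2, 28))] = 650


def history(store, account):
    """Return (date, value) pairs for one account, oldest first."""
    return sorted((d, v) for (a, d), v in store.items() if a == account)


print(operational["acct-1"])
print(history(warehouse, "acct-1"))
```

The operational store can only answer "what is the balance now?", while the warehouse key structure also supports "what was the balance at the end of January?".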

Data Warehouse: Nonvolatile
- A physically separate store of data transformed from the operational environment
- Operational update of data does not occur in the data warehouse environment
  - Does not require transaction processing, recovery, and concurrency control mechanisms
  - Requires only two operations in data accessing: initial loading of data and access of data
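
The two-operation interface can be sketched as a tiny in-memory table. The class name `WarehouseTable` and its methods are hypothetical; the point is only that the interface exposes loading and reading, with no in-place update or delete.

```python
class WarehouseTable:
    """Illustrative store that supports loading and reading, nothing else."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Append a batch of already-transformed rows."""
        self._rows.extend(tuple(r) for r in rows)

    def scan(self, predicate=lambda row: True):
        """Return the rows that satisfy the predicate, as immutable tuples."""
        return [r for r in self._rows if predicate(r)]


table = WarehouseTable()
table.load([("2022-01", "A", 10), ("2022-02", "A", 20)])
print(table.scan(lambda row: row[2] > 10))
```

Because no caller can modify a loaded row, a store like this has no need for the locking or rollback machinery that an update-heavy operational system requires.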

Data Warehouse vs. Heterogeneous DBMS
- Traditional heterogeneous DB integration: A query driven approach
  - Build wrappers/mediators on top of heterogeneous databases
  - When a query is posed to a client site, a meta-dictionary is used to translate the query into queries appropriate for individual heterogeneous sites involved, and the results are integrated into a global answer set
  - Complex information filtering, compete for resources
- Data warehouse: update-driven, high performance
  - Information from heterogeneous sources is integrated in advance and stored in warehouses for direct query and analysis
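
The contrast between the two approaches can be sketched with two toy sources. The source layouts, function names, and data are assumptions made for illustration; a real mediator would translate queries through a meta-dictionary rather than hard-coded functions.

```python
# Two hypothetical sources holding the same kind of data in different schemas.
SOURCE_A = [{"cust": "ann", "amt": 30}, {"cust": "bob", "amt": 20}]
SOURCE_B = [("ann", 5), ("cat", 45)]


def query_a(customer):
    """Wrapper for source A: rows are dictionaries."""
    return [r["amt"] for r in SOURCE_A if r["cust"] == customer]


def query_b(customer):
    """Wrapper for source B: rows are (name, amount) tuples."""
    return [amt for name, amt in SOURCE_B if name == customer]


def mediator_total(customer):
    """Query-driven: translate the query per source at query time, then merge."""
    return sum(query_a(customer)) + sum(query_b(customer))


def build_warehouse():
    """Update-driven: integrate all sources once, ahead of any query."""
    totals = {}
    for r in SOURCE_A:
        totals[r["cust"]] = totals.get(r["cust"], 0) + r["amt"]
    for name, amt in SOURCE_B:
        totals[name] = totals.get(name, 0) + amt
    return totals


WAREHOUSE = build_warehouse()


def warehouse_total(customer):
    """Answer from the pre-integrated store; the sources are not touched."""
    return WAREHOUSE.get(customer, 0)


print(mediator_total("ann"), warehouse_total("ann"))
```

Both paths return the same answer. The difference is where the cost falls: the mediator touches every source on every query, while the warehouse pays the integration cost once at load time.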

Data Warehouse vs. Operational DBMS
- OLTP (on-line transaction processing)
  - Major task of traditional relational DBMS
  - Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
- OLAP (on-line analytical processing)
  - Major task of data warehouse system
  - Data analysis and decision making
- Distinct features (OLTP vs. OLAP):
  - User and system orientation: customer vs. market
  - Data contents: current, detailed vs. historical, consolidated
  - Database design: ER + application vs. star + subject
  - View: current, local vs. evolutionary, integrated
  - Access patterns: update vs. read-only but complex queries
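
The difference in access patterns can be illustrated with Python's standard `sqlite3` module. The `sales` table and its rows are invented for this sketch, and a single table stands in for both systems purely to keep the example short; it is not a star schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
    [
        ("east", 2021, 100.0),
        ("east", 2022, 150.0),
        ("west", 2021, 80.0),
        ("west", 2022, 120.0),
    ],
)

# OLTP-style access: a short transaction that updates one current record.
with conn:
    conn.execute("UPDATE sales SET amount = amount + 10 WHERE id = ?", (1,))

# OLAP-style access: a read-only query that consolidates many records.
olap_rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(olap_rows)
```

The update touches one row by key, while the analytical query reads every row and returns a summary, which is why the two workloads favor different tuning.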

[Slide: OLTP vs. OLAP]

Why Separate Data Warehouse?
- High performance for both systems
  - DBMS tuned for OLTP: access methods, indexing, concurrency control, recovery
  - Warehouse tuned for OLAP: complex OLAP queries, multidimensional view, consolidation
- Different functions and different data:
  - missing data: Decision support requires historical data which operational DBs do not typically maintain
  - data consolidation: DS requires consolidation (aggregation, summarization) of data from heterogeneous sources
  - data quality: different sources typically use inconsistent data representations, codes and formats which have to be reconciled
- Note: There are more and more systems which perform OLAP analysis directly on relational databases
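
Consolidation and data quality interact: codes have to be reconciled before values from different sources can be aggregated. The sketch below assumes a hypothetical code table (`STATE_CODES`) and two invented sources that spell the same states differently.

```python
from collections import defaultdict

# Hypothetical code table mapping each source's spelling to one warehouse code.
STATE_CODES = {"NY": "NY", "New York": "NY", "IL": "IL", "Illinois": "IL"}

# Two invented sources reporting (state, amount) with inconsistent codes.
source_a = [("NY", 100), ("IL", 40)]
source_b = [("New York", 60), ("Illinois", 10), ("Illinois", 5)]


def consolidate(*sources):
    """Reconcile state codes, then sum amounts per reconciled code."""
    totals = defaultdict(int)
    for source in sources:
        for state, amount in source:
            totals[STATE_CODES[state]] += amount
    return dict(totals)


summary = consolidate(source_a, source_b)
print(summary)
```

Without the reconciliation step, "NY" and "New York" would be aggregated as two separate groups and the summary would be wrong.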