English Literature and Translation.docx
- Document ID: 10184936
- Uploaded: 2023-02-09
- Format: DOCX
- Pages: 13
- Size: 62.69 KB
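English Literature and Translation
Name: Guo Xin (郭鑫)
Student ID: 120360114
Major: Information Management and Information Systems
Class: 1203601
Advisor: Hu Shicheng (胡仕成)
School of Economics and Management
Harbin Institute of Technology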
Analysis of Audit Data Based on Data Mining
[Abstract] Based on the current state of computer auditing, this paper proposes a data mining-based process for analyzing audit data and applies the DBSCAN clustering algorithm to find audit evidence.
[Keywords] Computer Audit, Data Mining, Clustering Algorithms, Noise Data
As the economy and information technology continue to evolve, many companies have introduced ERP and similar systems, which record many of a company's activities in real time and build up large business data warehouses. Obtaining useful audit data from these massive amounts of data is one application of computer audit. For audit staff, the problem is how to find comprehensive, high-quality audit data in the audited entity's mass of data in order to identify audit evidence. This paper uses data mining techniques to discuss this issue and proposes a solution.
Data mining is the process of extracting hidden, previously unknown, but potentially useful information and knowledge from large amounts of incomplete, noisy, fuzzy, and random practical application data [1]. In fact, the quality and storage model of that data are very important for carrying out a computer audit successfully and obtaining audit evidence. Because the hardware and software platforms of the audited entity's information systems are heterogeneous, and because deliberate concealment or fraud is possible, the collected audit data must be checked, controlled, and analyzed to ensure that the computer audit runs smoothly and that the audit findings are correct.
1 Audit Data Collection
Audit data acquisition means obtaining, from the audited entity's financial and business information systems and other data sources, the electronic data needed to carry out a computer audit, and converting it to an appropriate format [3]. In general, the data acquisition methods for computer audit include the following:
(1) Using the data export capabilities of the audited entity's information systems. Most information management systems provide a data export function, which auditors can use to export corporate financial data directly and complete data collection.
(2) Using common data processing software for data acquisition. Software such as Access and SQL Server has powerful data import, export, and conversion functions. Auditors can use such software for data collection; for example, raw data that the audited enterprise provides in text format can be converted to database table format.
(3) Using audit software for data acquisition. Examples include the On-site Audit Implementation System (AO) and the Audit Office System (OA), computer-assisted audit tools built under the national "Golden Audit Project" that began in 2002, as well as other financial audit software and audit data acquisition and analysis software, all of which can complete audit data collection.
(4) Using a dedicated interface for data collection. When the data structures of the audit data provided by the audited entity differ greatly from those of the existing audit data processing software, a dedicated interface program can be developed with the help of specialist programmers to complete data collection, although the cost is relatively high.
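Method (2), converting text-format raw data into a database table, can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for the Access or SQL Server tools named above, and the table name, column names, and sample rows are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical text export from the audited entity (comma-separated).
RAW_TEXT = """voucher_no,account,amount
V001,1001,2500.00
V002,1002,130.50
V003,1001,99.90
"""

def load_text_into_table(raw_text, conn, table="vouchers"):
    """Parse delimited text and store it as rows of a database table."""
    reader = csv.DictReader(io.StringIO(raw_text))
    conn.execute(
        f"CREATE TABLE {table} "
        "(voucher_no TEXT PRIMARY KEY, account TEXT, amount REAL)"
    )
    # Convert the amount column from text to a number while loading.
    rows = [(r["voucher_no"], r["account"], float(r["amount"])) for r in reader]
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
loaded = load_text_into_table(RAW_TEXT, conn)
total = conn.execute("SELECT SUM(amount) FROM vouchers").fetchone()[0]
```

Once the data sits in a table, the auditor can query it with ordinary SQL, as the final `SUM` query does.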
2 Data Cleaning
When data mining classification is used to process audit data, the database must be pre-processed to improve classification accuracy, efficiency, and scalability. Pre-processing includes:
data cleaning, correlation analysis, and data conversion.
Reference [4] defines data cleaning as:
finding and eliminating data errors and inconsistencies to improve the quality of the data. In general, audit data is collected from heterogeneous databases, so errors and inconsistencies are inevitable: fraudulent data, duplicate data, erroneous data, missing data, and so on. In line with the audit data quality characteristics proposed in [5], the collected raw data has to be cleaned, that is, turned from "dirty" to "clean", to improve audit data quality, which is key to ensuring correct audit findings.
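Two of the problems named above, duplicate data and missing data, can be detected with a very small routine. The sketch below is an assumption-laden illustration, not the paper's procedure: the field names and sample records are hypothetical, and `None` is assumed to mark a missing value.

```python
# Hypothetical collected records; None marks a missing value.
records = [
    {"voucher_no": "V001", "account": "1001", "amount": 2500.0},
    {"voucher_no": "V002", "account": "1002", "amount": None},
    {"voucher_no": "V001", "account": "1001", "amount": 2500.0},  # duplicate key
    {"voucher_no": "V003", "account": None, "amount": 99.9},
]

def clean(records, key="voucher_no"):
    """Split records into clean rows and rows flagged for review."""
    seen = set()
    clean_rows, flagged = [], []
    for rec in records:
        if rec[key] in seen:
            flagged.append(("duplicate", rec))
        elif any(value is None for value in rec.values()):
            flagged.append(("missing", rec))
        else:
            clean_rows.append(rec)
        seen.add(rec[key])
    return clean_rows, flagged

clean_rows, flagged = clean(records)
```

Flagged rows are kept rather than discarded, since in an audit setting a "dirty" record may itself be worth examining.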
The general process of data cleaning is shown in Figure 2.
(1) Data analysis:
To obtain clean data, the data must first be analyzed in detail, including its formats and categories, for example the type, width, and meaning of each field in the collected financial data.
(2) Schema (mode) conversion:
Schema conversion mainly means mapping the source data into the target data model, for example converting attributes and field constraints and mapping between different datasets in the database. Sometimes several data tables need to be combined into one two-dimensional table, and sometimes one table has to be split into several two-dimensional tables.
(3) Data validation:
The schema conversion of the previous step should be assessed and tested where possible; the data is cleaned better after repeated analysis, design, calculation, and review. Without data validation, some erroneous data may not be obvious and may not be screened out. For example, when schema conversion splits one dataset into multiple tables, the parent table's primary key values and the child table's foreign key values may become inconsistent, producing isolated records that undermine the correctness of the audit evidence and thereby the accuracy of the audit findings.
(4) Data write-back:
Replace the "dirty" data in the original data source with the "clean" data, so that cleaning does not have to be redone the next time data is collected.
Sometimes data has to be cleaned repeatedly: auditors may need several cleaning passes over the collected electronic data to obtain high-quality audit data.
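The validation problem described in step (3), child rows whose foreign key no longer matches any parent primary key, can be checked with a single query. The following is a minimal sketch under assumptions: SQLite is used for convenience, and the `vouchers` and `entries` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vouchers (voucher_no TEXT PRIMARY KEY);
CREATE TABLE entries (entry_id INTEGER PRIMARY KEY, voucher_no TEXT, amount REAL);
INSERT INTO vouchers VALUES ('V001'), ('V002');
INSERT INTO entries VALUES (1, 'V001', 100.0), (2, 'V002', 50.0), (3, 'V009', 75.0);
""")

def find_orphans(conn):
    """Return child rows whose foreign key matches no parent primary key."""
    return conn.execute("""
        SELECT e.entry_id, e.voucher_no
        FROM entries AS e
        LEFT JOIN vouchers AS v ON v.voucher_no = e.voucher_no
        WHERE v.voucher_no IS NULL
        ORDER BY e.entry_id
    """).fetchall()

orphans = find_orphans(conn)  # entry 3 refers to the non-existent voucher V009
```

An empty result means the split tables are still consistent; any row returned is an isolated record to investigate before the data is used as evidence.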
3 Implementing Data Mining
After pre-processing, the audit database contains a number of datasets, each of which contains a number of data records, or tuples. How to mine meaningful audit data from these two-dimensional tables is crucial. This paper applies a clustering algorithm to audit data mining.
3.1 Overview of the Algorithm
3.1.1 Clustering Algorithm
Clustering groups data objects according to their similarity and discovers the distribution of the data, so that the data within each cluster is highly similar while the data in different clusters is as different as possible [6]. Its main difference from classification is that classification relies on prior knowledge of the characteristics of the data, whereas clustering has to find these characteristics. As a data mining function, cluster analysis can be used as a stand-alone tool to observe the distribution of the data and the characteristics of each class, and to analyze specific classes further. Clustering can also deal effectively with noisy data, such as the isolated points, missing values, and erroneous data that databases generally contain.
Clustering algorithms usually fall into five categories [7]:
① partitioning-based methods, such as CLARANS; ② hierarchical methods, such as CURE and BIRCH; ③ density-based methods, such as DBSCAN, OPTICS, GDBSCAN, and DBRS; ④ grid-based methods, such as STING and WaveCluster; ⑤ model-based methods, such as COBWEB. Among these, the DBSCAN algorithm has the advantage of filtering noise data well. This paper discusses using the DBSCAN algorithm to process audit data, identify abnormal data, and find audit evidence.
3.1.2 DBSCAN Algorithm
The basic idea of the DBSCAN algorithm [8]:
for each (core) object in a cluster, the neighborhood of a given radius d around the object must contain no fewer than a given minimum number of objects, MinPts (also called the density threshold).
To generate a cluster, the DBSCAN algorithm first selects an arbitrary object p from the dataset DB and finds all objects in DB within the neighborhood of radius d around p. If the number of objects in this neighborhood is less than MinPts, p is marked as noise data. Otherwise, the neighborhood of p forms an initial cluster N, which contains p and all objects directly density-reachable from p. Then, for each object q in N, the algorithm determines whether q is a core object; if it is, all objects in the d-neighborhood of q that N does not yet contain are appended to N. The algorithm continues to check whether each newly added object is a core object and, if so, repeats the expansion until the cluster can no longer be extended. DBSCAN then selects another object in DB that has not yet been identified as belonging to a cluster or as noise and repeats the above operation, until every object in DB has been identified either as belonging to a cluster or as noise data.
Running the DBSCAN clustering algorithm over the dataset is a continuous process of querying and comparing. The noise data that remains at the end is what is commonly called abnormal data, and it is very effective in helping auditors form audit judgments. Figure 3 shows noise data and clusters in two-dimensional coordinates.
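The procedure described in this section can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's implementation: it assumes two-dimensional points and Euclidean distance, the function names and sample points are hypothetical, and the neighborhood search is a simple linear scan. One detail follows the standard algorithm rather than the description above: an object first marked as noise is relabeled as a border point if a later cluster reaches it.

```python
import math

NOISE = -1

def region_query(points, i, d):
    """Indices of all points within distance d of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= d]

def dbscan(points, d, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for p in range(len(points)):
        if labels[p] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, p, d)
        if len(neighbors) < min_pts:
            labels[p] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # p is a core object: start a new cluster N
        labels[p] = cluster
        queue = [n for n in neighbors if n != p]
        while queue:
            q = queue.pop()
            if labels[q] == NOISE:
                labels[q] = cluster  # border point: reachable but not core
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = region_query(points, q, d)
            if len(q_neighbors) >= min_pts:  # q is a core object: extend N
                queue.extend(q_neighbors)
    return labels

# Hypothetical audit records mapped to 2-D points; the last one is isolated.
points = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (1.1, 0.8),
          (8.0, 8.0), (8.2, 8.1), (7.9, 8.3), (8.1, 7.8),
          (4.5, 15.0)]
labels = dbscan(points, d=1.0, min_pts=3)
abnormal = [pt for pt, lab in zip(points, labels) if lab == NOISE]
```

The records left in `abnormal` are the candidates an auditor would examine first.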
3.2 Definition of the Data Model
3.2.1 Distance Between Records
Let Ri and Rj be any two records in the dataset DB. The distance between them is defined as:
where Ri(Rix, Riy) and Rj(Rjx, Rjy) are the points that represent records Ri and Rj in two-dimensional coordinates, and dij is the distance between Ri and Rj in that two-dimensional space. If dij is greater than the given value d, Ri and Rj are not neighbors and are not placed in the same cluster on the basis of this pair alone.
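The distance formula itself does not survive in this copy of the text. Assuming the standard Euclidean distance, which is consistent with the description of dij as a distance in two-dimensional space but is not confirmed by the source, the definition would read:

```latex
d_{ij} = \sqrt{(R_{ix} - R_{jx})^{2} + (R_{iy} - R_{jy})^{2}}
```

For example, under this assumption the points (1, 1) and (4, 5) are at distance 5, so with d = 3 they would not be neighbors.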
3.2.2 Pre-processing of Audit Data
Data for mining is selected from a two-dimensional table: first select the columns (fields or attributes), and then select the rows (records or tuples). To obtain valid audit evidence and arrive at correct audit findings, the source data sometimes has to be converted.
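The column-then-row selection, together with a simple data conversion, might look like the sketch below. The table contents, the column names, and the choice to convert text fields to floating-point numbers are all assumptions made for illustration.

```python
# Hypothetical pre-processed table: one dict per record (row), values stored as text.
table = [
    {"voucher_no": "V001", "amount": "2500.00", "quantity": "10"},
    {"voucher_no": "V002", "amount": "130.50", "quantity": "1"},
    {"voucher_no": "V003", "amount": "99.90", "quantity": "3"},
]

def select_points(table, x_col, y_col, keep_row):
    """Select two columns, convert them to numbers, then keep the rows that pass keep_row."""
    projected = [(float(rec[x_col]), float(rec[y_col])) for rec in table]
    return [pt for pt in projected if keep_row(pt)]

# Columns first (amount, quantity), then rows (amounts above 100).
points = select_points(table, "amount", "quantity", lambda pt: pt[0] > 100)
```

The result is a list of two-dimensional points, which is the form a distance-based clustering step expects.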
Because of