Visual Computing: Course Reflections

Document ID: 29605103
Upload date: 2023-07-25
Format: DOCX
Pages: 11
Size: 185.47 KB


Visual Computing

Over the last two weeks, we took the course Visual Computing. The subject is interesting, novel, and substantial. Thanks to the teachers' excellent presentations, I learned more about the subject and became interested in it.

What is Visual Computing?

Visual Computing is an exciting new research area that studies how to make computers efficiently perceive, process, and understand visual data such as images and videos. Computer vision is the study of how to use cameras and computers to obtain the required information about a subject. Figuratively speaking, it gives the computer an eye (the camera) and a brain (the algorithm), so that the computer can perceive its environment.

There is no doubt that a computer cannot extract information from what it sees the way a human can. Humans perceive much more than size, distance, and color. For example, we can judge another person's feelings from a slight movement or a subtle facial expression. A computer cannot do this easily: judging feelings or other abstract things from a camera image is hard for it. Thus, the ultimate goal for computers is to emulate the striking perceptual capability of human eyes and brains, or even to surpass and assist humans in certain ways.

There are many famous scientists in this field. Among them, I think the most famous is D. Marr, who put forward many theories about visual computing. He said that the ultimate goal of vision research is to clarify how the visual system completes visual tasks. He also considered information processing by the nervous system and by a machine to be similar. Vision is a complex information processing task whose purpose is to grasp the various useful and meaningful aspects of the outside world and to represent them. This task must be understood at three different levels:

a. Computational theory; b. Algorithm; c. Mechanism. This division is not very strict, but a description that ignores it, whether it covers one level or several, will not be complete. In visual perception, each of the three levels has its specific place, and the levels are basically independent of each other. Thus, in exploring theoretical issues, we must use new research methods and keep the theory strictly distinguished from algorithms and mechanisms.

The approach put forward by D. Marr is so powerful that it pushed the science of visual information into rapid development and growth, giving it something of the lasting character of physics. Because it rests on a solid foundation of basic physical laws and on a formal description of images of the real world, this level of visual computing theory may develop into a real science. Marr's work ranges from the overarching computational theory down to the analysis of the concrete details of methods. He was a great and excellent scientist!

Coming back to the concrete content of our course, there was much interesting material. At the beginning of the course, the teacher from Italy, named P. LE CALLET, gave an introduction to video coding standards and motion estimation. He got to the main points quickly, which made it a little difficult and confusing for us to follow him, but little by little we were absorbed by what he said and by his interesting jokes.

First, we learned about uncompressed digital video:

bit-rate. The bit-rate describes the transfer speed as the number of bits transferred in one second. All of us know that the bit is the smallest unit of data in a computer system. Obviously, a machine can transfer more data per unit of time at a higher bit-rate. We were shown the bit-rates of some video formats, and we could see that the clearer the video format, the higher its bit-rate. For example, HD is clearer than SD and needs a higher bit-rate.
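To make the relation between picture size and bit-rate concrete, here is a minimal sketch of how the raw bit-rate of uncompressed video can be estimated. The resolutions, the frame rate of 25 frames per second, and the 12 bits per pixel (8-bit samples with 4:2:0 chroma subsampling) are assumptions chosen for illustration, not figures from the course.

```python
def raw_bit_rate(width, height, fps, bits_per_pixel):
    """Bits per second needed to send every pixel of every frame uncompressed."""
    return width * height * fps * bits_per_pixel


# Assumed example formats: 8-bit 4:2:0 video averages 12 bits per pixel.
sd = raw_bit_rate(720, 576, 25, 12)     # standard definition
hd = raw_bit_rate(1920, 1080, 25, 12)   # high definition

print(f"SD: {sd / 1e6:.1f} Mbit/s")     # SD: 124.4 Mbit/s
print(f"HD: {hd / 1e6:.1f} Mbit/s")     # HD: 622.1 Mbit/s
print(f"HD needs {hd / sd:.1f} times the bit-rate of SD")
```

Under these assumptions the HD format carries five times as many pixels per second as the SD format, so its raw bit-rate is five times higher.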

Compressed digital video, in turn, can be divided into two types:

lossy or lossless. The lossy case includes one special case called visually lossless, which is amazing. What is more, for lossy compression we need to define an acceptable visual quality, which may vary with the targeted service.

All of us like to watch HD movies now because of the clearer picture and better visual experience of this format. Lossy compression makes good use of the fact that humans are not sensitive to certain frequency components in images or sound, which allows a certain amount of information to be lost during compression. Although the original data cannot be completely restored, the lost part has little effect on the understanding of the original image, and in return we get a much larger compression ratio. Nowadays, lossy compression is widely used for voice, image, and video data. Common sound, image, and video compression is basically lossy. In multimedia applications, the common compression methods are:

PCM (Pulse Code Modulation), predictive coding, transform coding, interpolation and extrapolation, statistical coding, vector quantization, and sub-band coding; hybrid coding has also been widely used in recent years. The formats MP3, DivX, Xvid, JPEG, RM, RMVB, WMA, and WMV all use lossy compression.

The advantage of lossy compression is that the file is much smaller than with any known lossless method while still meeting the needs of the system. When a user works with a lossy-compressed file, for example to save download time, the decompressed file may be very different from the original file at the bit level, but for most practical purposes the human ear or the human eye cannot tell the difference between them. With lossy compression techniques, certain data is intentionally deleted, and that data can no longer be restored.

For both lossy and lossless compression, we consider that the performance depends on three factors:

bit-rate, the rate-distortion curve, and Moore's law.

There are two major methods for realizing compression. One is lossy transform coding, which cuts the data into small blocks, transforms them into a new space, and then quantizes them; finally, entropy coding is applied to the quantized values. The other is predictive coding, which uses previous data, and later decoded data, to predict the current sound sample or image frame. The error between the predicted data and the actual data is then quantized and coded, together with any other information needed to reproduce the prediction.
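The quantization step is where the loss in lossy coding comes from, so a small illustration may help. The following is a minimal sketch of a uniform quantizer; the function names, the coefficient values, and the step size are assumptions made for the example, not something defined in the course.

```python
def quantize(coeffs, step):
    """Map each coefficient to the index of the nearest multiple of `step`."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Rebuild approximate coefficients from the quantization indices."""
    return [q * step for q in levels]


coeffs = [28.3, -5.1, 0.4, 0.2]        # assumed transform coefficients
levels = quantize(coeffs, step=4)       # small integers, cheap to entropy code
approx = dequantize(levels, step=4)     # what the decoder can recover

print(levels)   # [7, -1, 0, 0]
print(approx)   # [28, -4, 0, 0]
```

The small coefficients collapse to zero and the large ones are only approximated, so the original values cannot be recovered exactly. A larger step gives fewer distinct levels and a lower bit-rate, at the cost of more distortion.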

In information theory, entropy encoding is a lossless data compression scheme that is independent of the specific characteristics of the medium. One of the main types of entropy coding creates and assigns a unique prefix-free code to each unique symbol that occurs in the input. These entropy encoders then compress data by replacing each fixed-length input symbol with the corresponding variable-length prefix-free output codeword. The length of each codeword is approximately proportional to the negative logarithm of the symbol's probability. Therefore, the most common symbols use the shortest codes. Two of the most common entropy encoding techniques are Huffman coding and arithmetic coding. If the approximate entropy characteristics of a data stream are known in advance (especially for signal compression), then a simpler static code may be useful. These static codes include universal codes (such as Elias gamma coding or Fibonacci coding) and Golomb codes (such as unary coding or Rice coding).
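The statement that codeword length follows the negative logarithm of the probability can be checked with a few lines of code. This is a minimal sketch; the four-symbol alphabet and its probabilities are made up for the example.

```python
import math


def ideal_code_lengths(probs):
    """Ideal codeword length in bits for each symbol: -log2(p)."""
    return {s: -math.log2(p) for s, p in probs.items() if p > 0}


def entropy(probs):
    """Average number of bits per symbol that an ideal entropy coder needs."""
    return sum(-p * math.log2(p) for p in probs.values() if p > 0)


# Assumed symbol probabilities, chosen as powers of two for a clean result.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

print(ideal_code_lengths(probs))   # {'a': 1.0, 'b': 2.0, 'c': 3.0, 'd': 3.0}
print(entropy(probs))              # 1.75
```

A fixed-length code for four symbols needs 2 bits per symbol, while the ideal variable-length code here averages 1.75 bits, because the most common symbol gets the shortest codeword.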

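For Huffman coding, the key point is that no codeword is the prefix of another, so the encoded bits can be decoded without ambiguity. A Huffman code is usually represented as a binary tree in which the leaf nodes represent the symbols to be coded, and the branches on the path from the root node to a leaf give the bits of that symbol's codeword. Characters with a high occurrence rate should therefore receive shorter codes than the others. In this way, the expected average length of the encoded string is reduced, which achieves the purpose of compression.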
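The tree construction described above can be sketched in a few lines. This is a minimal illustration under my own assumptions: the function names and the sample string are invented for the example, and the tree is kept implicitly as a dictionary of partial codewords rather than as explicit nodes.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix-free code: frequent symbols get shorter codewords."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:                      # a single symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: codeword so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # the two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def huffman_decode(bits, codes):
    """Read bits until they match a codeword; prefix-freeness makes this safe."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


text = "abracadabra"                        # assumed sample input
codes = huffman_code(text)
bits = "".join(codes[ch] for ch in text)
print(codes)
print(len(bits), "bits instead of", 8 * len(text))
```

The exact codewords depend on how ties between equal weights are broken, but the total encoded length does not: for this sample string it is 23 bits, compared with 88 bits for one byte per character.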

The other method, predictive coding, differs from the entropy coding above in that it is based on the correlation within the information. Predictive coding relies on the fact that there is a certain correlation between discrete signals: it uses the last one or more signals to predict the next signal to be encoded, and then encodes the difference (the prediction error) between the actual value and the predicted value. If the prediction is accurate, the error will be very small. For the same required accuracy, it is then possible to use fewer bits for encoding, which achieves the purpose of compressing the data.
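The idea can be shown with the simplest possible predictor, which guesses that each sample equals the previous one. This is only a sketch under that assumption; real codecs use better predictors and also quantize the error, and the sample values below are invented.

```python
def dpcm_encode(samples):
    """Replace each sample by its prediction error (previous-sample predictor)."""
    residuals = []
    prediction = 0                  # assumed starting prediction
    for s in samples:
        residuals.append(s - prediction)
        prediction = s              # the next prediction is the current sample
    return residuals


def dpcm_decode(residuals):
    """Rebuild the samples by adding each error to the running prediction."""
    samples = []
    prediction = 0
    for r in residuals:
        s = prediction + r
        samples.append(s)
        prediction = s
    return samples


samples = [100, 102, 104, 103, 105]     # assumed, slowly varying signal
residuals = dpcm_encode(samples)
print(residuals)                        # [100, 2, 2, -1, 2]
print(dpcm_decode(residuals) == samples)
```

After the first value, the errors are small numbers close to zero, which need fewer bits than the original samples and are well suited to entropy coding.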

Then we learned about the discrete cosine transform (DCT). The DCT is a transform related to the Fourier transform; it is similar to the discrete Fourier transform (DFT) but uses only real numbers. A DCT is equivalent to a DFT of roughly twice the length operating on a real, even function (the Fourier transform of a real, even function is itself real and even), and in some variants the input or output positions are shifted by half a sample. What is more, the DCT has eight standard types, four of which are common.

Through the DCT, we can transform data from the spatial domain to the frequency domain. There is an example of DCT:
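As a rough illustration (not the example shown in the lecture), the sketch below computes the one-dimensional DCT of type II with orthonormal scaling, which is the type most often meant by "the DCT", together with its inverse. The function names and the sample values are my own assumptions.

```python
import math


def dct2(x):
    """Orthonormal DCT-II: spatial samples -> frequency coefficients."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2: frequency coefficients -> spatial samples."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out


row = [10, 11, 12, 13, 14, 15, 16, 17]   # assumed smooth row of pixel values
coeffs = dct2(row)
print([round(c, 2) for c in coeffs])
print([round(v, 6) for v in idct2(coeffs)])
```

For a smooth input like this one, almost all of the energy ends up in the first few coefficients, and a perfectly flat input has only the first (DC) coefficient different from zero. This is what makes the later quantization and entropy coding steps effective.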

For digital video compression, the general principle is that compression performance depends on how much signal redundancy is removed. The general process is to transform the source, then quantize the result, and then apply entropy coding; finally, we get the compressed data. For the video case, there are 3 types of