English Literature Translation
English Original Text
Speech synthesis
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output. The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood. An intelligible text-to-speech program allows people with visual impairments or reading disabilities to listen to written works on a home computer. Many computer operating systems have included speech synthesizers since the early 1990s.
Overview of text processing
A text-to-speech system (or "engine") is composed of two parts: a front-end and a back-end. The front-end has two major tasks. First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words. This process is often called text normalization, pre-processing, or tokenization. The front-end then assigns phonetic transcriptions to each word, and divides and marks the text into prosodic units, like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end. The back-end, often referred to as the synthesizer, then converts the symbolic linguistic representation into sound. In certain systems, this part includes the computation of the target prosody (pitch contour, phoneme durations), which is then imposed on the output speech.
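The front-end steps described above (text normalization followed by grapheme-to-phoneme conversion) can be sketched as a toy pipeline. Everything below (the abbreviation table, the digit map, and the tiny phoneme lexicon) is an illustrative assumption, not the behavior of any real TTS engine:

```python
import re

# Toy front-end sketch; all tables here are illustrative assumptions,
# not data from any real TTS system.
ABBREVIATIONS = {"Dr.": "doctor", "St.": "street"}
DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
          "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}
LEXICON = {  # tiny grapheme-to-phoneme dictionary (ARPAbet-style symbols)
    "doctor": ["D", "AA", "K", "T", "ER"],
    "smith":  ["S", "M", "IH", "TH"],
    "lives":  ["L", "IH", "V", "Z"],
    "at":     ["AE", "T"],
    "five":   ["F", "AY", "V"],
}

def normalize(text):
    """Text normalization: expand abbreviations and digits into written-out words."""
    words = []
    for token in text.split():
        token = ABBREVIATIONS.get(token, token)
        if token.isdigit():
            token = " ".join(DIGITS[d] for d in token)
        words.extend(re.sub(r"[^\w\s]", "", token).lower().split())
    return words

def to_phonemes(words):
    """Grapheme-to-phoneme conversion by dictionary lookup; unknown words
    fall back to naive letter spelling."""
    return [LEXICON.get(w, list(w.upper())) for w in words]

# The front-end's symbolic output for a short utterance:
symbolic = to_phonemes(normalize("Dr. Smith lives at 5"))
```

A real front-end would also mark prosodic units (phrase and sentence boundaries) and attach pitch and duration targets to this symbolic representation before handing it to the back-end.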
History
Long before electronic signal processing was invented, there were those who tried to build machines to create human speech. Some early legends of the existence of "speaking heads" involved Gerbert of Aurillac (d. 1003 AD), Albertus Magnus (1198–1280), and Roger Bacon (1214–1294).
In 1779, the Danish scientist Christian Kratzenstein, working at the Russian Academy of Sciences, built models of the human vocal tract that could produce the five long vowel sounds (in International Phonetic Alphabet notation, they are [aː], [eː], [iː], [oː] and [uː]).[5] This was followed by the bellows-operated "acoustic-mechanical speech machine" by Wolfgang von Kempelen of Pressburg, Hungary, described in a 1791 paper.[6] This machine added models of the tongue and lips, enabling it to produce consonants as well as vowels. In 1837, Charles Wheatstone produced a "speaking machine" based on von Kempelen's design, and in 1857, M. Faber built the "Euphonia". Wheatstone's design was resurrected in 1923 by Paget.
In the 1930s, Bell Labs developed the vocoder, which automatically analyzed speech into its fundamental tone and resonances. From his work on the vocoder, Homer Dudley developed a manually keyboard-operated voice synthesizer called The Voder (Voice Demonstrator), which he exhibited at the 1939 New York World's Fair.
The Pattern Playback was built by Dr. Franklin S. Cooper and his colleagues at Haskins Laboratories in the late 1940s and completed in 1950. There were several different versions of this hardware device but only one currently survives. The machine converts pictures of the acoustic patterns of speech in the form of a spectrogram back into sound. Using this device, Alvin Liberman and colleagues were able to discover acoustic cues for the perception of phonetic segments (consonants and vowels).
Dominant systems in the 1980s and 1990s were the MITalk system, based largely on the work of Dennis Klatt at MIT, and the Bell Labs system;[8] the latter was one of the first multilingual language-independent systems, making extensive use of natural language processing methods.
Early electronic speech synthesizers sounded robotic and were often barely intelligible. The quality of synthesized speech has steadily improved, but output from contemporary speech synthesis systems is still clearly distinguishable from actual human speech.
As improving cost-performance ratios make speech synthesizers cheaper and more widely accessible, more people will benefit from text-to-speech programs.
Electronic devices
The first computer-based speech synthesis systems were created in the late 1950s. The first general English text-to-speech system was developed by Noriko Umeda et al. in 1968 at the Electrotechnical Laboratory, Japan.[10] In 1961, physicist John Larry Kelly, Jr and colleague Louis Gerstman[11] used an IBM 704 computer to synthesize speech, an event among the most prominent in the history of Bell Labs. Kelly's voice recorder synthesizer (vocoder) recreated the song "Daisy Bell", with musical accompaniment from Max Mathews. Coincidentally, Arthur C. Clarke was visiting his friend and colleague John Pierce at the Bell Labs Murray Hill facility. Clarke was so impressed by the demonstration that he used it in the climactic scene of his screenplay for his novel 2001: A Space Odyssey, where the HAL 9000 computer sings the same song as it is being put to sleep by astronaut Dave Bowman.

Despite the success of purely electronic speech synthesis, research is still being conducted into mechanical speech synthesizers, such as the anthropomorphic talking robot Waseda-Talker series.

Handheld electronics featuring speech synthesis began emerging in the 1970s. One of the first was the Telesensory Systems Inc. (TSI) Speech+ portable calculator for the blind in 1976. Other devices were produced primarily for educational purposes, such as Speak & Spell, produced by Texas Instruments in 1978. Fidelity released a speaking version of its electronic chess computer, the Voice Chess Challenger, in 1979. The first video game to feature speech synthesis was the 1980 shoot 'em up arcade game Stratovox, from Sun Electronics. Another early example was the arcade version of Berzerk, released that same year. The first multi-player electronic game using voice synthesis was Milton from the Milton Bradley Company, which produced the device in 1980.
Synthesizer technologies
The most important qualities of a speech synthesis system are naturalness and intelligibility. Naturalness describes how closely the output sounds like human speech, while intelligibility is the ease with which the output is understood. The ideal speech synthesizer is both natural and intelligible. Speech synthesis systems usually try to maximize both characteristics.
The two primary technologies for generating synthetic speech waveforms are concatenative synthesis and formant synthesis. Each technology has strengths and weaknesses, and the intended uses of a synthesis system will typically determine which approach is used.
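Formant synthesis uses no recorded speech at all: it generates a waveform from an acoustic model. A minimal sketch of the idea, assuming illustrative formant frequencies and an arbitrary decay constant (real formant synthesizers use digital resonator filters driven by a glottal source, not this direct summation):

```python
import math

SAMPLE_RATE = 16000  # Hz; illustrative choice

def formant_vowel(formants, pitch_hz=120.0, duration_s=0.3):
    """Very rough formant-synthesis sketch: re-excite decaying sinusoids
    at the given formant frequencies once per glottal (pitch) period."""
    n = int(SAMPLE_RATE * duration_s)
    period = SAMPLE_RATE / pitch_hz          # samples per glottal pulse
    out = [0.0] * n
    for i in range(n):
        t_since_pulse = (i % period) / SAMPLE_RATE  # time since last excitation
        for f in formants:
            # each pulse restarts the formant's decaying resonance
            out[i] += math.exp(-60.0 * t_since_pulse) * \
                      math.sin(2 * math.pi * f * (i / SAMPLE_RATE))
    peak = max(abs(x) for x in out)
    return [x / peak for x in out]          # normalize to [-1, 1]

# Roughly the first two formants of an [a]-like vowel (values are illustrative)
samples = formant_vowel([700.0, 1200.0])
```

Because the voice is fully parametric, a formant synthesizer can change pitch, speed, and voice quality freely, which is why such systems stay intelligible even at very high speaking rates.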
Concatenative synthesis
Concatenative synthesis is based on the concatenation (or stringing together) of segments of recorded speech. Generally, concatenative synthesis produces the most natural-sounding synthesized speech. However, differences between natural variations in speech and the nature of the automated techniques for segmenting the waveforms sometimes result in audible glitches in the output. There are three main sub-types of concatenative synthesis.
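The glitches mentioned above arise at the joins between segments. One common mitigation is a short crossfade at each concatenation point. A minimal sketch, operating on waveforms represented as plain lists of floats (the fade length is an arbitrary illustrative choice):

```python
def crossfade_concat(segments, fade=32):
    """Concatenate recorded waveform segments, applying a short linear
    crossfade at each join to reduce audible discontinuities.
    A toy sketch; real systems also smooth pitch and spectral features."""
    out = list(segments[0])
    for seg in segments[1:]:
        k = min(fade, len(out), len(seg))
        # blend the tail of the accumulated output with the head of the next segment
        for j in range(k):
            w = (j + 1) / (k + 1)            # ramps from near 0 up to near 1
            out[len(out) - k + j] = out[len(out) - k + j] * (1 - w) + seg[j] * w
        out.extend(seg[k:])
    return out
```

Each join consumes `fade` samples of overlap, so the result is slightly shorter than the sum of the segment lengths; that overlap is what hides the amplitude step that would otherwise be audible as a click.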
Unit selection synthesis
Unit selection synthesis uses large databases of recorded speech. During database creation, each recorded utterance is segmented into some or all of the following: individual phones, diphones, half-phones, syllables, morphemes, words, phrases, and sentences. Typically, the division into segments is done using a specially modified speech recognizer set to a "forced alignment" mode with some manual correction afterward, using visual representations such as the waveform and spectrogram.[12] An index of the units in the speech database is then created based on the segmentation and acoustic parameters like the fundamental frequency (pitch), duration, position in the syllable, and neighboring phones. At run time, the desired target utterance is created by determining the best chain of candidate units from the database (unit selection). This process is typically achieved using a specially weighted decision tree.
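The text above notes that production systems typically use a specially weighted decision tree. As an illustration of the underlying idea only (choosing the chain of candidate units that minimizes a combined target cost and join cost), here is a sketch using a simple Viterbi-style dynamic program over a single pitch feature; all weights, features, and names are assumptions made for the example:

```python
def select_units(targets, candidates, w_target=1.0, w_join=0.5):
    """Pick one candidate unit per target slot, minimizing
    target cost (mismatch with the desired pitch) plus
    join cost (discontinuity with the previous unit's end pitch).

    targets:    list of desired pitch values, one per unit slot
    candidates: per slot, a list of (pitch, end_pitch) tuples
    Returns the index of the chosen candidate at each slot."""
    n = len(targets)
    INF = float("inf")
    cost = [[INF] * len(candidates[i]) for i in range(n)]
    back = [[0] * len(candidates[i]) for i in range(n)]
    for j, (pitch, _) in enumerate(candidates[0]):
        cost[0][j] = w_target * abs(pitch - targets[0])
    for i in range(1, n):
        for j, (pitch, _) in enumerate(candidates[i]):
            tc = w_target * abs(pitch - targets[i])
            for k, (_, prev_end) in enumerate(candidates[i - 1]):
                jc = w_join * abs(pitch - prev_end)  # crude discontinuity proxy
                if cost[i - 1][k] + tc + jc < cost[i][j]:
                    cost[i][j] = cost[i - 1][k] + tc + jc
                    back[i][j] = k
    # backtrack the cheapest chain of units
    j = min(range(len(candidates[-1])), key=lambda j: cost[-1][j])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    return path[::-1]
```

A real unit-selection engine scores many more features per unit (duration, spectral shape, phonetic context, position in syllable), but the trade-off is the same: a unit that matches its target poorly may still win if it joins more smoothly to its neighbors.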
Unit selection provides the greatest naturalness, because it applies only a small amount of digital signal processing (DSP) to the recorded speech. DSP often makes recorded speech sound less natural, although some systems use a small amount of signal processing at the point of concatenation to smooth the waveform. The output from the best unit-selection systems is often indistinguishable from real human voices, especially in contexts for which the TTS system has been tuned. However, maximum naturalness typically requires unit-selection speech databases to be very large, in some systems ranging into the gigabytes of recorded data, representing dozens of hours of speech.[13] Also, unit selection algorithms have been kn