A Chinese-English Machine Translation System
Frederking, Rudnicky and Hogan, 1997: 61-65; Nirenburg, 1996: 96-105; Rayner and Carter, 1997: 107-10). Experiments have shown that the results of a multi-engine MT system are indeed better than those of any single MT engine in the system (Hogan and Frederking, 1998).
In such a system, each engine tries to translate the source sentence separately, gives a series of translations of the phrases in the source sentence, and then puts the resulting output segments into a shared chart-like data structure. All the partial translations can then be given an internal quality score. A chart-walk algorithm is used to find the best combination of the partial translations.
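The chart walk can be pictured as a dynamic program over word positions. The following is only a minimal sketch under stated assumptions: the `Segment` type, the additive scoring, and the function name `chart_walk` are hypothetical and are not taken from the cited systems, whose scoring is not described here.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    start: int    # index of the first source word covered
    end: int      # index one past the last source word covered
    score: float  # hypothetical internal quality score; higher is better
    text: str     # partial translation proposed by some engine

def chart_walk(segments, n):
    """Return (total_score, texts) for the best chain of adjacent segments
    covering words 0..n, or None if no chain covers the whole sentence.
    Assumption: the score of a combination is the sum of its segment scores."""
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for seg in segments:
            if seg.end == end and best[seg.start] is not None:
                prev_score, prev_texts = best[seg.start]
                cand = (prev_score + seg.score, prev_texts + [seg.text])
                if best[end] is None or cand[0] > best[end][0]:
                    best[end] = cand
    return best[n]

segments = [
    Segment(0, 1, 1.0, "I"),
    Segment(1, 3, 1.0, "like to watch"),
    Segment(0, 3, 3.0, "I like watching"),
    Segment(3, 4, 1.0, "movies"),
]
result = chart_walk(segments, 4)
```

Here the single three-word segment outscores the two shorter ones it competes with, so the walk selects it and then appends the last segment.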
In a multi-engine architecture, the engines work independently. That is, an engine cannot make use of the results of other engines. For example, an example-based MT (EBMT) engine can translate the Chinese sentence “我喜歡看電影” (“I like watching movies”) because there is a sentence “我喜歡看電視劇” (“I like watching TV dramas”) in the corpus. But for the sentence “我喜歡看成龍演的這部電影” (“I like watching this movie starring Jackie Chan”), the EBMT engine cannot give a result, because no sample in the corpus matches the phrase “成龍演的這部電影” (“this movie starring Jackie Chan”). It is possible that a rule-based MT (RBMT) engine can translate this phrase correctly. But in the multi-engine system, the EBMT engine cannot use the result given by the RBMT engine.
Here we present a micro-engine approach to machine translation and introduce a Chinese-English machine translation system that uses this approach. Like the multi-engine approach, it can synthesize the results of different MT engines. What is more, the engines in the micro-engine system can interact with each other.
The Micro-engine Architecture
A micro-engine MT system consists of several micro-engines and an engine manager. All the micro-engines share a chart data structure. The engine manager also maintains an active constituent list. An active constituent is a constituent that has been recognized by a micro-engine but has not yet been used to generate new constituents. The engine manager selects the best active constituent from the active constituent list and sends it to all the micro-engines. The micro-engines recognize new constituents using this active constituent and the existing inactive constituents (the edges in the chart). The engine manager then adds these new constituents to the active constituent list, and the previously selected active constituent is moved from the active constituent list to the chart. This process repeats until a constituent covering the whole input sentence is recognized.
The Micro-engine
A micro-engine is a machine translation engine. Unlike a traditional MT engine, a micro-engine does not try to translate the whole input sentence. A micro-engine is specialized: it only tries to find a specific type of constituent in the input sentence and to translate those constituents. All the engines work cooperatively to translate the whole input sentence. That is, an engine can make use of the results generated by other micro-engines.
Normally, a micro-engine should implement two functions:
(1) Recognition
The micro-engine accepts an active constituent, combines it with the existing inactive constituents, and generates a list of new constituents.
(2) Translation
The engine should translate the constituents it has recognized. It may call the Translate function of the micro-engines that recognized the sub-constituents.
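The two functions can be read as an interface that every micro-engine implements. The sketch below is illustrative only: the class names `Constituent`, `MicroEngine`, `WordEngine` and `ConcatEngine`, their fields, and the toy glossary are all assumptions made for this example and do not describe the system's actual code.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

@dataclass
class Constituent:
    start: int            # first source word covered
    end: int              # one past the last source word covered
    label: str            # hypothetical category or surface word
    score: float          # hypothetical quality score
    engine: "MicroEngine" # the engine that recognized this constituent
    children: list = field(default_factory=list)

    def translate(self):
        # Translation is delegated to the engine that recognized the constituent.
        return self.engine.translate(self)

class MicroEngine(ABC):
    @abstractmethod
    def recognize(self, active, chart):
        """Combine the active constituent with chart edges; return new constituents."""

    @abstractmethod
    def translate(self, constituent):
        """Translate a constituent that this engine recognized."""

class WordEngine(MicroEngine):
    """Toy leaf engine: translates single words from a hypothetical glossary."""
    def __init__(self, glossary):
        self.glossary = glossary

    def recognize(self, active, chart):
        return []  # leaves are created by hand in this sketch

    def translate(self, constituent):
        return self.glossary[constituent.label]

class ConcatEngine(MicroEngine):
    """Toy engine: joins the active constituent with a chart edge ending where it starts."""
    def recognize(self, active, chart):
        return [Constituent(left.start, active.end, "pair",
                            left.score + active.score, self, [left, active])
                for left in chart if left.end == active.start]

    def translate(self, constituent):
        # Calls the Translate function of whichever engines recognized the children.
        return " ".join(child.translate() for child in constituent.children)

words = WordEngine({"我": "I", "喜歡": "like"})
left = Constituent(0, 1, "我", 1.0, words)
right = Constituent(1, 2, "喜歡", 1.0, words)
pair = ConcatEngine().recognize(right, [left])[0]
print(pair.translate())  # I like
```

The point of the design is that `ConcatEngine.translate` never needs to know how its sub-constituents were recognized; each child routes the call back to its own engine.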
The Engine Management Algorithm
Data:
    Chart: containing all the inactive constituents
    ActiveList: the active constituent list
    EngineList: the list of micro-engines
Algorithm:
    Use the lexical engine to recognize all the words in the source sentence
    Add these words into ActiveList
    Repeat while ActiveList is not empty
        TheEdge = the constituent with the highest score in ActiveList
        If TheEdge covers the whole input sentence
            Call the Translate function of TheEdge
            Return the result translation text
        End If
        For EachEngine in EngineList
            Call the Recognize function of EachEngine using TheEdge as input
            Add all the output constituents to ActiveList
        End For
        Remove TheEdge from ActiveList
        Add TheEdge to Chart
        Sort ActiveList according to certain criterion
    End Repeat
    If no constituent covering the whole sentence is recognized
        Use the Fail-soft Engine to find a best combination of existing constituents
        Translate the constituents in the combination
End Algorithm
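The control loop above can be sketched in Python as follows. This is a toy rendering under several assumptions, none of which come from the paper: edges carry a precomputed translation string instead of calling a Translate function, the score of a joined edge is the sum of its parts, and the names `Edge`, `lexical_engine`, `JoinEngine`, `fail_soft` and `translate_sentence` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    start: int    # index of the first source word covered
    end: int      # index one past the last source word covered
    score: float  # hypothetical quality score; higher is better
    text: str     # translation, precomputed here for brevity

def lexical_engine(words, glossary):
    """Toy lexical engine: one edge per word found in the glossary."""
    return [Edge(i, i + 1, 1.0, glossary[w])
            for i, w in enumerate(words) if w in glossary]

class JoinEngine:
    """Toy micro-engine: joins the selected edge with adjacent chart edges."""
    def recognize(self, edge, chart):
        found = []
        for other in chart:
            if other.end == edge.start:
                found.append(Edge(other.start, edge.end, other.score + edge.score,
                                  other.text + " " + edge.text))
            if edge.end == other.start:
                found.append(Edge(edge.start, other.end, edge.score + other.score,
                                  edge.text + " " + other.text))
        return found

def fail_soft(chart, n):
    """Toy fail-soft step: greedily take the longest chart edge at each position."""
    pieces, i = [], 0
    while i < n:
        candidates = [e for e in chart if e.start == i]
        if candidates:
            longest = max(candidates, key=lambda e: e.end)
            pieces.append(longest.text)
            i = longest.end
        else:
            i += 1  # no edge starts here; skip the word
    return " ".join(pieces)

def translate_sentence(words, glossary, engines):
    chart = []                                 # inactive constituents
    active = lexical_engine(words, glossary)   # active constituent list
    while active:
        active.sort(key=lambda e: e.score, reverse=True)
        edge = active.pop(0)                   # best active constituent
        if edge.start == 0 and edge.end == len(words):
            return edge.text                   # covers the whole sentence
        for engine in engines:
            active.extend(engine.recognize(edge, chart))
        chart.append(edge)                     # the edge becomes inactive
    return fail_soft(chart, len(words))

glossary = {"我": "I", "喜歡": "like", "電影": "movies"}
full = translate_sentence(["我", "喜歡", "電影"], glossary, [JoinEngine()])
partial = translate_sentence(["我", "喜歡", "看", "電影"], glossary, [JoinEngine()])
```

In the second call the word “看” is deliberately missing from the toy glossary, so no edge ever covers the whole sentence and the result is assembled by the fail-soft step from the edges left in the chart.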
Lexical Engine and Fail-soft Engine
Normally, if the input active constituent is empty, a micro-engine’s Recognition function does nothing. There are two exceptions: the micro-engines called the Lexical Engine and the Fail-soft Engine.
The Lexical Engine carries out the lexical analysis. That is, it looks up words in the dictionary, segments the Chinese sentence into words, and recognizes Chinese personal names and place names. Its Recognition function works only when the input active constituent and the chart are both empty.
The Fail-soft Engine is used when no constituent covering the whole input sentence has been recognized. It selects the best combination of the existing constituents in the chart and generates the translation based on them. Its Recognition function works only when the input constituent is empty and the chart is not empty.
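The conditions under which each kind of engine reacts to a Recognition call can be summarized in a few lines. The function name and the returned labels below are hypothetical, chosen only to illustrate the three cases described above.

```python
def recognizers_that_fire(active, chart):
    """Return which kind of engine reacts to a Recognition call.

    `active` is the active constituent passed in (None when it is empty);
    `chart` is the list of inactive constituents.
    """
    if active is not None:
        return "ordinary micro-engines"
    if not chart:
        return "Lexical Engine"    # empty input and empty chart: start of analysis
    return "Fail-soft Engine"      # empty input but a non-empty chart: recovery
```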
Other Micro-engines
In addition to the Lexical Engine and the Fail-soft Engine, we use five other micro-engines in our Chinese-English machine translation system.
One of the micro-engines is a rule-based engine. This engine is constructed from a traditional rule-based Chinese-English machine translation system (Liu and Yu, 1998: 514-17). There are about 300 syntax rules in this engine. It uses a chart-parsing algorithm to parse the sentence.
Another micro-engine is an example-based engine. We collected a bilingual corpus of about 200,000 words. Most texts in the corpus are news reports or editorials from the Xinhua News Agency, or government white papers.
The third micro-engine is a proper-NP translation engine. This engine can recognize proper noun phrases in the Chinese sentence and translate them into English. The proper noun phrases include person name phrases, place name phrases, organization name phrases, time phrases, number phrases, money phrases, and so on.
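The fourth micro-engine is a title translation engine. The titles of Chinese articles usually have a special syntactic structure, such as “機器翻譯的預處理研究” (“Research on Preprocessing for Machine Translation”), “試論網絡黑客的行為方式” (“On the Behavior Patterns of Network Hackers”) and “魯迅傳” (“A Biography of Lu Xun”). This micro-engine can recognize these kinds of titles and translate them properly.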
The fifth micro-engine is a compound sentence translation engine. This engine can find the logical relations between the simple sentences in a compound sentence according to the conjunctions, and translate the sentence properly.
An Example
Here we give an example to show how the micro-engine MT system works. Consider the Chinese sentence:
演員帕特里克·斯威茲在他最近的一部電影中扮演了一個感人的保鏢角色。
(Actor Patrick Swayze played a touch bouncer in one of his recent movies.)
The Lexical Engine will look up the dictionary and cut the sentence into words:
演員/n 帕/g 特/g 里/f 克/v ·/w 斯/g 威/g 茲/g 在/p 他/r 最近/a 的/u 一/m 部/q 電影/n 中/f 扮演/v 了/u 一/m 個/q 感人/a 的/u 保鏢/n 角色/n 。/w
The labels following each word, such as “n”, “g” and “w”, are the part-of-speech tags of the words.
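For readers who want to work with output in this word/tag notation, the following is a small sketch of how such a line could be split into (word, tag) pairs. It assumes, purely for illustration, that every tag is a run of lowercase Latin letters and that whitespace between tokens is optional; the function name `parse_tagged` is hypothetical.

```python
import re

# A token is a run of non-space characters, a slash, then a lowercase tag.
TOKEN = re.compile(r"\s*(\S+?)/([a-z]+)")

def parse_tagged(line):
    """Split a single line of word/tag output into (word, tag) pairs."""
    return TOKEN.findall(line)

pairs = parse_tagged("演員/n 帕/g ·/w")
```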
The Proper-NP Translation Engine will recognize “帕特里克·斯威茲” as a transliteration of a foreign name:
帕特里克·斯威茲/n (Patrick Swayze)
The Rule-based MT Engine will recognize and translate the constituents as below:
Np 演員帕特里克·斯威茲 (Actor Patrick Swayze)
Pp 在他最近的一部電影中 vp (in his recent a movie)
Vp 扮演了一個感人的保鏢角色 np (played a role of a touch bodyguard)
S 演員帕特里克· (Actor Patrick Swayze played a role of a touch bodyguard in his recent a movie.)
The resulting translation is not very grammatical English.
The Example-based MT Engine, however, can translate some phrases in another way:
Np 在他最近的一部電影中 vp (in one of his recent movies)
Vp 扮演了一個感人的保鏢角色 np (played a touch bodyguard)
These partial translations are better, because the Example-based MT Engine translates phrases by comparing them with similar examples in the corpus, rather than translating them according to manually written rules.
Finally, the Rule-based MT Engine will synthesize the intermediate results into an acceptable translation:
(Actor Patrick Swayze played a touch bodyguard in one of his recent movies.)
In this example, we can see that the different micro-engines work cooperatively and that the translation is better than what any single engine could generate.
Conclusion and Future Work
Both a micro-engine MT system and a multi-engine system can employ different MT technologies in a single system. But there are still differences between them.
The granularity of the engines in a micro-engine system is finer than that of the engines in a multi-engine system. In a multi-engine system, each engine should be a complete MT system that tries to translate the whole input sentence. In a micro-engine system, each engine has its own specialty. A micro-engine does not need to try to translate the whole sentence. It just needs to translate the “familiar” part of the sentence and ignore the rest.
A micro-engine system is a cl