CLEC: Chinese Learner English Corpus
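CLEC contains more than one million words of English written by five types of students: secondary-school students, College English Band 4 and Band 6 students, and lower-year and upper-year English majors. Errors in the texts are annotated.
The aim is to observe the characteristics of each group's English and the errors they make, to describe Chinese learners' English fairly precisely by quantitative and qualitative methods, and to provide useful feedback for English teaching in China.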
Table 1. Distribution of texts in CLEC
Type     Tokens
ST2      208088
ST3      209043
ST4      212855
ST5      214510
ST6      226106
Total    1070602
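The total in Table 1 can be checked, and each subcorpus expressed as a share of the whole, with a few lines of Python. This is a minimal sketch: the counts are copied from Table 1, while the names `TOKENS` and `shares` are illustrative and not part of CLEC.

```python
# Token counts copied from Table 1.
TOKENS = {
    "ST2": 208088,
    "ST3": 209043,
    "ST4": 212855,
    "ST5": 214510,
    "ST6": 226106,
}


def shares(tokens):
    """Return the grand total and each subcorpus's percentage of it."""
    total = sum(tokens.values())
    percentages = {name: round(100 * count / total, 2) for name, count in tokens.items()}
    return total, percentages


total, percentages = shares(TOKENS)
print(total)  # 1070602, matching the total row of Table 1
for name, pct in percentages.items():
    print(name, pct)
```

The five subcorpora are close in size, so raw counts from different levels are roughly, though not exactly, comparable.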
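Principles of error annotation
1. Simple and reasonable, and easy to apply systematically.
Many people take part in the annotation, so an overly elaborate classification scheme would be hard to master.
We use a two-level classification. The first level has 11 categories:
word form (fm), verb phrase (vp), noun phrase (np), pronoun (pr), adjective phrase (aj), adverb (ad), prepositional phrase (pp), conjunction (cj), word (wd), collocation (cc), and sentence (sn).
Each category is subdivided by number.
For example, [cc] marks an improper collocation: [cc1] is a noun/noun collocation, [cc2] a noun/verb collocation, [cc3] a verb/noun collocation, and so on.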
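2. The classification should be of moderate granularity.
A scheme that is too coarse is easy to apply consistently but carries too little information to support the analysis of learners' errors; one that is too fine is hard to apply consistently, and the same error easily ends up in different categories.
Our current approach is to classify common errors finely (vp and np each have 9 subtypes) and rare errors coarsely (cj has only 2 subtypes).
The current scheme has 61 error codes, which makes it a medium-sized classification.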
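The two-level scheme can be written down as a small lookup table. The sketch below is illustrative only: the subtype counts are taken from the classification table in this document, and the names `SUBTYPES` and `all_codes` are assumptions made for the example.

```python
# Number of second-level subtypes per first-level category,
# as listed in the classification table of this document.
SUBTYPES = {
    "fm": 3, "vp": 9, "np": 9, "pr": 6, "aj": 5, "ad": 3,
    "pp": 2, "cj": 2, "wd": 7, "cc": 6, "sn": 9,
}


def all_codes(subtypes):
    """Expand the table into the full list of error codes (fm1, fm2, ...)."""
    return [f"{cat}{i}" for cat, n in subtypes.items() for i in range(1, n + 1)]


codes = all_codes(SUBTYPES)
print(len(SUBTYPES), len(codes))  # 11 61
```

The counts agree with the text: 11 first-level categories and 61 error codes in total.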
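3. Provide sufficient error information (the error itself, its type, and its scope).
For example, in "In the past, people are [vp6,4-] kind to each other…", the error is marked in square brackets placed after the erroneous word.
[vp6] denotes the sixth type (tense) of vp (verb) error; 4- gives the scope of the error, where - marks the position of the error and 4 means that four words precede it.
Only by taking these four words into account can one judge that "are" is wrong.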
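A tag of this shape is easy to read mechanically. The following is a minimal sketch, not CLEC's own tooling: the names `TAG_RE`, `ErrorTag`, and `find_errors` are hypothetical, and the regular expression assumes that a tag is two lowercase letters, a number, a comma, and a scope of the form `N-` (optionally followed by a second number, which is an assumption not shown in the example above).

```python
import re
from typing import List, NamedTuple

# Assumed tag shape: [<2 letters><digits>,<digits?>-<digits?>], e.g. [vp6,4-]
TAG_RE = re.compile(r"\[([a-z]{2})(\d+),(\d*)-(\d*)\]")


class ErrorTag(NamedTuple):
    code: str                # full error code, e.g. "vp6"
    category: str            # first-level category, e.g. "vp"
    error_word: str          # the word immediately before the tag
    left_context: List[str]  # the N words preceding the error word


def find_errors(text: str) -> List[ErrorTag]:
    found = []
    for m in TAG_RE.finditer(text):
        category, number, before, _after = m.groups()
        # Words to the left of this tag; earlier tags and punctuation are dropped.
        left = TAG_RE.sub(" ", text[:m.start()])
        words = re.findall(r"[A-Za-z']+", left)
        if not words:
            continue
        n_before = int(before) if before else 0
        context = words[-1 - n_before:-1] if n_before else []
        found.append(ErrorTag(category + number, category, words[-1], context))
    return found


errors = find_errors("In the past, people are [vp6,4-] kind to each other")
print(errors[0])
```

For the example sentence this yields the code vp6, the error word "are", and the four context words In, the, past, people.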
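4. Openness.
Researchers may add error types or subdivide existing ones as their work requires.
For example, [sn8] marks a structurally deficient sentence, and a researcher may divide this error into several finer classes.
This requires retrieving every sn8 error and then defining third-level categories such as sn81, sn82, and so on.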
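Retrieving all instances of one code and relabelling them at a third level can be sketched as follows. Everything here is illustrative: the sentences in `SAMPLE` are made up for the example rather than taken from CLEC, and `concordance` and `relabel` are hypothetical helper names.

```python
import re
from typing import List, Tuple

# Made-up annotated sentences, for illustration only.
SAMPLE = [
    "He very like play football [sn8,4-] every day.",
    "In the past, people are [vp6,4-] kind to each other.",
    "Because of rain heavily [sn8,3-] we stayed home.",
]

ANY_TAG = re.compile(r"\[[a-z]{2}\d+,[^\]]*\]")


def concordance(sentences: List[str], code: str, width: int = 5) -> List[Tuple[int, str, str]]:
    """List every occurrence of `code` with `width` words of context on each side."""
    tag = re.compile(r"\[" + re.escape(code) + r",[^\]]*\]")
    hits = []
    for idx, sentence in enumerate(sentences):
        for m in tag.finditer(sentence):
            left = ANY_TAG.sub(" ", sentence[:m.start()]).split()[-width:]
            right = ANY_TAG.sub(" ", sentence[m.end():]).split()[:width]
            hits.append((idx, " ".join(left), " ".join(right)))
    return hits


def relabel(sentence: str, old: str, new: str) -> str:
    """Rewrite tags with code `old` to the finer code `new` (e.g. sn8 -> sn81)."""
    return sentence.replace(f"[{old},", f"[{new},")


hits = concordance(SAMPLE, "sn8", width=3)
for hit in hits:
    print(hit)
print(relabel(SAMPLE[0], "sn8", "sn81"))
```

Matching on the code followed by a comma keeps sn8 from also matching an already refined tag such as sn81.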
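5. Register and the source of an error are not annotated for now, because doing so requires more subjective judgment from annotators and is even harder to apply consistently.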
Error classification scheme (total: 61)

Word form (fm)          Verb phrase (vp)           Noun phrase (np)           Pronoun (pr)
Code  Type              Code  Type                 Code  Type                 Code  Type
fm1   spelling          vp1   pattern              np1   pattern              pr1   reference
fm2   word building     vp2   set phrase           np2   set phrase           pr2   anticipatory it
fm3   capitalization    vp3   agreement            np3   agreement            pr3   agreement
                        vp4   finite/non-finite    np4   case                 pr4   case
                        vp5   non-finite           np5   countability         pr5   wh-
                        vp6   tense                np6   number               pr6   indefinite
                        vp7   voice                np7   article
                        vp8   mood                 np8   quantifiers
                        vp9   modal/auxiliary      np9   other determiners

Adjective phrase (aj)            Adverb (ad)           Prepositional phrase (pp)   Conjunction (cj)
Code  Type                       Code  Type            Code  Type                  Code  Type
aj1   pattern                    ad1   order           pp1   pattern               cj1   pattern
aj2   set phrase                 ad2   modification    pp2   set phrase            cj2   set phrase
aj3   degree                     ad3   degree
aj4   -ed/-ing confusion
aj5   predicative/attributive

Word (wd)                Collocation (cc)     Sentence (sn)
Code  Type               Code  Type           Code  Type
wd1   order              cc1   noun/noun      sn1   run-on sentence
wd2   part of speech     cc2   noun/verb      sn2   sentence fragment
wd3   substitution       cc3   verb/noun      sn3   dangling modifier
wd4   absence            cc4   adj/noun       sn4   illogical comparison
wd5   redundancy         cc5   verb/adv       sn5   topic prominence
wd6   repetition         cc6   adv/adj        sn6   coordination
wd7   ambiguity                               sn7   subordination
                                              sn8   structural deficiency
                                              sn9   punctuation
Annotation guide
Code  Class     Type                       Description
fm1   word      spelling                   spelling, coinage, abbreviation, apostrophe
fm2   word      word building              derivation, inflection, compounding, plurality (noun), irregularity (verb), 3rd person singular form (verb), syllabification, hyphenation, word division or fusion
fm3   word      capitalization             lower initial letter for upper initial letter or vice versa
vp1   vbphr     pattern (transitivity)     error in transitivity (vi as vt or vice versa), transitive verb pattern / grammatical (cf. Oxford Advanced Learner's Dictionary of Current English, edited by A. S. Hornby)
vp2   vbphr     set phrase                 phrasal verb and verbal phrase: error in form or use
vp3   vbphr     agreement                  number agreement with its subject (noun or pronoun)
vp4   vbphr     finite/non-finite          finite verb for non-finite verb or vice versa
vp5   vbphr     non-finite                 infinitive error: form and use / infinitive for participle or vice versa / -ed participle for -ing participle or vice versa
vp6   vbphr     tense                      error in tense use within a sentence / the sequence of tenses between sentences
vp7   vbphr     voice                      error in the use of voice: active for passive or vice versa
vp8   vbphr     mood                       error in the use of mood: imperative, subjunctive / improper structure of conditional sentences
vp9   vbphr     modal/auxiliary            misuse of modal/auxiliary verbs / wrong form of modal verb (or auxiliary verb) and verb combination (e.g. tense form, voice form, etc.)
np1   nnphr     pattern                    error in combination with other words / grammatical
np2   nnphr     set phrase                 omission or replacement of a fixed element that goes after a certain noun
np3   nnphr     agreement                  number agreement of a noun with its determiner or a word that refers to it
np4   nnphr     case                       possessive case error: form or use
np5   nnphr     countability               uncountable noun used as countable noun
np6   nnphr     number                     countable noun used with no determiner or -s / a or -s with plural noun
np7   nnphr     article                    a/an confusion or definite/indefinite confusion
np8   nnphr     quantifiers                misuse or confusion between many/much, (a) few/(a) little, some/any, etc.
np9   nnphr     other determiners          misuse or confusion of demonstratives, wh-determiners, numerals, etc.
pr1   pron      reference                  incorrect/ambiguous pronoun reference / anaphoric
pr2   pron      anticipatory it            improper or wrong use of anticipatory it / it replaced by a demonstrative, etc.
pr3   pron      agreement                  number agreement with a noun it refers to
pr4   pron      case                       case error of any personal pronoun
pr5   pron      wh-                        misuse or confusion of interrogative, relative and conjunctive pronouns
pr6   pron      indefinite                 misuse or confusion of indefinite pronouns such as all/both, few/little, some/any, either/neither, etc.
aj1   adj       pattern                    error in the combination with other words / grammatical
aj2   adj       set phrase                 error in the idiomatic use of an adjectival phrase / omission or replacement of a fixed element that goes after a certain adjective
aj3   adj       degree                     adjective degree error: form and use
aj4   adj       -ed/-ing confusion         -ed adjective for -ing adjective or vice versa
aj5   adj       predicative/attributive    predicative adjective used as attributive adjective
ad1   adv       order                      improper adverb placement / wrong position
ad2   adv       modification               adjective modifier used as verb modifier / other kinds of confusion
ad3   adv       degree                     adverb degree error: form and use
pp1   prep      pattern                    unacceptable combination with other words / grammatical
pp2   prep      set phrase                 error in the formation or use of an idiomatic prepositional phrase
cj1   conj      pattern                    unacceptable combination with other words / grammatical
cj2   conj      set phrase                 error in the formation or use of a phrase functioning as a conjunction
wd1   word      order                      misplacement of any word other than an adverb
wd2   word      part of speech             error in part of speech: right root but wrong word class
wd3   word      substitution               error in word choice: right word class but wrong selection (any part of speech)
wd4   word      absence                    omission of a word (any part of speech)
wd5   word      redundancy                 oversuppliance of a word (any part of speech)
wd6   word      repetition                 unnecessary repeating of a word
wd7   word      ambiguity                  not clear word meaning / semantic
cc1   notional  n/n collocation            improper noun (phrase) and noun (phrase) combination / semantic
cc2   notional  n/v collocation            improper noun (phrase) and verb (phrase) combination / semantic
cc3   notional  v/n collocation            improper verb and noun (phrase) combination / semantic
cc4   notional  a/n collocation            improper adjective and noun (phrase) combination / semantic
cc5   notional  v/ad collocation           improper verb and adverb (or ad/v) combination / semantic
cc6   notional  ad/a collocation           improper adverb and adjective combination / semantic
sn1   sentence  run-on sentence            improper addition of clauses / fused sentence
sn2   sentence  sentence fragment          subordinate clause as a sentence / any phrase as a sentence
sn3   sentence  dangling modifier          illogical adverbial modification of a clause
sn4   sentence  illogical comparison       error in the comparison of words or phrases in a sentence which cannot be compared
sn5   sentence  topic prominence           the co-occurrence of an initial noun phrase and its equivalent (usually a pronoun) in the same sentence
sn6   sentence  coordination               faulty parallelism of clauses (or words/phrases) in a sentence
sn7   sentence  subordination              faulty attachment of a subordinate clause to the main clause
sn8   sentence  structural deficiency      error in the grammatical construction of a sentence: improper splitting, pattern shifting, confusing structure, etc.
sn9   sentence  punctuation                overuse, absence, choice, apostrophe, comma splice, etc.
Normalized frequencies and proportions of each error type
Error type  ST2      ST3      ST4      ST5      ST6      Total     Percent (%)
fm1         1928.8   2877.4   2112.6   1826.7   1686.7   10432.2   17.47
fm2         349.3    448.9    438.9    226.9    328.7    1792.7    3
fm3         1474.4   731.8    405.8    694.1    174.6    3480.7    5.83
vp1         259.4    325.9    498.4    103.4    200.8    1387.9    2.32
vp2         179      139.3    61.2     104.2    22.1     505.8     0.85
vp3         374      524.6    785.2    273.1    327      2283.9    3.82
vp4         140.8    159.1    110.8    63.9     51.6     526.2     0.88
vp5         140      118.7    107.4    89.9     46.7     502.7     0.84
vp6         1165.7   356      311.6    379.8    215.6    2428.7    4.07
vp7         172.7    104.1    98.4     63.9     46.7     485.8     0.81
vp8         27.1     16.3     8.3      25.2     11.5     88.4      0.15
vp9         111.4    274.3    278.5    42.9     86.1     793.2     1.33
np1         46.9     33.5     28.9     16.8     10.7     136.8     0.23
np2         24.7     22.4     17.4     19.3     2.5      86.3      0.14
np3         202.1    247.7    249.6    210.9    186      1096.3    1.84
np4         66.8     55.9     26.4     22.7     21.3     193.1     0.32
np5         58.9     98       71.9     60.5     84.4     373.7     0.63
np6         374      654.4    481      358.8    354.1    2222.3    3.72
np7         237.9    107.5    89.3     174.8    54.9     664.4     1.11
np8         35       65.4     47.9     13.4     7.4      169.1     0.28
np9         6.4      41.3     12.4     7.6      5.7      73.4      0.12
pr1         82       236.5    205      89.9     18.9     632.3     1.06
pr2         16.7     78.3     23.1     4.2      0        122.3     0.2
pr3         52.5     54.2     172.7    28.6     60.6     368.6     0.62
pr4         74.8     37       20.7     48.7     10.7     191.9     0.32
pr5         26.3     53.3     14.1     7.6      10.7     112       0.19
pr6         9.5      2.6      5        3.4      0        20.5      0.03
aj1         6.4      18.9     15.7     5        9        55        0.09
aj2         9.5      3.4      9.9      5.9      7.4      36.1      0.06
aj3         38.2     39.6     32.2     43.7     97.5     251.2     0.42
aj4         16.7     2.6      22.3     12.6     5.7      59.9      0.1
aj5         0.8      3.4      7.4      1.7      0        13.3      0.02
ad1         35.8     96.3     39.7     27.7     15.6     215.1     0.36
ad2         42.2     37.8     12.4     9.2      4.9      106.5     0.18
ad3         7.2      12       9.9      1.7      2.5      33.3      0.06
pp1         136.1    98       43       169.7    28.7     475.5     0.8
pp2         25.5     262.3    143.8    37       27.9     496.5     0.83
cj1         27.8     20.6     18.2     21.8     12.3     100.7     0.17
cj2         4        7.7      13.2     5.9      4.9      35.7      0.06
wd1         43.8     151.3    114.1    25.2     37.7     372.1     0.62
wd2         324.6    929.6    772.8    226.9    242.6    2496.5    4.18
wd3         1102     1634.7   1815     757.1    359.8    5668.6    9.49
wd4         585.6    829.8    443.8    403.3    427      2689.5    4.5
wd5         410.6    613.1    518.2    265.5    171.3    1978.7    3.31
wd6         27.1     37       22.3     34.5     29.5     150.4     0.25
wd7         261.8    430.8    261
(The source text breaks off in the wd7 row.)
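Each total in the table is the sum of the five per-level figures that precede it, which gives a simple consistency check on the data. The sketch below assumes that reading of the columns and tests three rows copied from the table; `ROWS` and `row_total` are illustrative names. `Decimal` is used so that the one-decimal figures add up exactly rather than with floating-point rounding.

```python
from decimal import Decimal

# (per-level frequencies, stated total) for three rows of the table.
ROWS = {
    "fm1": (["1928.8", "2877.4", "2112.6", "1826.7", "1686.7"], "10432.2"),
    "fm3": (["1474.4", "731.8", "405.8", "694.1", "174.6"], "3480.7"),
    "vp6": (["1165.7", "356", "311.6", "379.8", "215.6"], "2428.7"),
}


def row_total(values):
    """Sum the per-level frequencies exactly."""
    return sum(Decimal(v) for v in values)


mismatches = [
    code for code, (values, stated) in ROWS.items()
    if row_total(values) != Decimal(stated)
]
print(mismatches)  # [] when every stated total equals the sum of its row
```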