ORA Thesis: "Automatic error detection in non-native English" - uuid:5192f0cb-6e4d-4730-bb54-a97a73d603ed






Reference: Rachele De Felice (2008). Automatic error detection in non-native English. DPhil. University of Oxford.

Title: Automatic error detection in non-native English

Abstract: This thesis describes the development of DAPPER ('Determiner And PrePosition Error Recogniser'), a system designed to automatically acquire models of occurrence for English prepositions and determiners to allow for the detection and correction of errors in their usage, especially in the writing of non-native speakers of the language. Prepositions and determiners are the focus because they are parts of speech whose usage is particularly challenging to acquire, both for students of the language and for natural language processing tools. The work presented in this thesis addresses this problem by developing a system which can acquire models of correct preposition and determiner occurrence, and can use this knowledge to identify divergences from these models as errors. The contexts of these parts of speech are represented by a sophisticated feature set, incorporating a variety of semantic and syntactic elements. DAPPER is found to perform well on preposition and determiner selection tasks in correct native English text. Results on each preposition and determiner are discussed in detail to understand the possible reasons for variations in performance, and whether these are due to problems with the structure of DAPPER or to deeper linguistic reasons. An in-depth analysis of all features used is also offered, quantifying the contribution of each feature individually. This can help establish whether the decision to include complex semantic and syntactic features is justified in the context of this task. Finally, the performance of DAPPER on non-native English text is assessed. The system is found to be robust when applied to text which does not contain any preposition or determiner errors. On an error correction task, results are mixed: DAPPER shows promising results on preposition selection and determiner confusion (definite vs. indefinite) errors, but is less successful in detecting errors involving missing or extraneous determiners.
Several characteristics of learner writing are described, to gain a clearer understanding of what problems arise when natural language processing tools are used with this kind of text. It is concluded that the construction of contextual models is a viable approach to the task of preposition and determiner selection, despite outstanding issues pertaining to the domain of non-native writing.

Digital Origin: Born digital
Type of Award: DPhil
Level of Award: Doctoral
Awarding Institution: University of Oxford
About The Authors
Institution: University of Oxford
Faculty: Mathematical, Physical & Life Sciences Division - Computing Laboratory
Oxford College: St Catherine's College
Funding: Arts and Humanities Research Council
Prof Stephen Pulman
Bibliographic Details
Issue Date: 2008
Copyright Date: 2009
Urn: uuid:5192f0cb-6e4d-4730-bb54-a97a73d603ed
Item Description
Member of collection: ora:thesis
Copyright Holder: Rachele De Felice