Using automatically labelled examples to classify rhetorical relations: An assessment

From AcaWiki



Citation: Caroline Sporleder, Alex Lascarides (2008) Using automatically labelled examples to classify rhetorical relations: An assessment. Natural Language Engineering


Download: http://homepages.inf.ed.ac.uk/alex/pubs/jnle.rhetorical.html

Tagged: Computer Science, discourse analysis, computational linguistics, rhetorical relations, discourse markers, rhetorical structure, discourse relations


Summary:

This paper takes up the question of whether rhetorical relations can be automatically derived and classified. It focuses, in particular, on discourse markers. These may be ambiguous (e.g., 'since' and 'yet' have multiple uses and are sometimes, but not always, discourse markers), and they may also be missing altogether.

The authors comment that "what is needed is a model which can classify rhetorical relations in the absence of an explicit discourse marker" (p. 4). Previous work (e.g., Marcu & Echihabi 2002) has suggested creating training data for a classifier by labelling examples that contain an unambiguous, lexically marked rhetorical relation and then removing the markers. The main purpose of this paper is to test this approach empirically.
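The labelling procedure described above (find an unambiguous marker, record the relation it signals, then remove the marker) can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the marker table, the relation names, and the function `label_and_strip` are all hypothetical, and a real system would work on pairs of text spans and use a carefully vetted list of unambiguous markers.

```python
import re
from typing import Optional, Tuple

# Hypothetical marker-to-relation table, for illustration only.
# It is an assumption of this sketch, not a list taken from the paper.
MARKER_TO_RELATION = {
    "because": "EXPLANATION",
    "although": "CONTRAST",
    "consequently": "RESULT",
}


def label_and_strip(sentence: str) -> Optional[Tuple[str, str]]:
    """Return (sentence with the marker removed, relation label).

    Returns None when the sentence contains no marker from the table,
    or more than one, since such cases cannot be labelled unambiguously.
    """
    found = [
        marker
        for marker in MARKER_TO_RELATION
        if re.search(rf"\b{re.escape(marker)}\b", sentence, flags=re.IGNORECASE)
    ]
    if len(found) != 1:
        return None

    marker = found[0]
    # Remove the marker, an optional trailing comma, and following whitespace.
    stripped = re.sub(
        rf"\b{re.escape(marker)}\b,?\s*",
        "",
        sentence,
        count=1,
        flags=re.IGNORECASE,
    )
    # Collapse any leftover runs of whitespace.
    stripped = re.sub(r"\s+", " ", stripped).strip()
    return stripped, MARKER_TO_RELATION[marker]


example = label_and_strip("She stayed home because it was raining.")
print(example)
```

The stripped sentence and its label would then serve as one training example. Whether the remaining context still carries enough information to recover the relation is exactly what the paper sets out to assess.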

The paper also makes an interesting theoretical observation: training on marked examples works well only if two conditions hold.

"First, there has to be a certain amount of redundancy between the discourse marker and the general linguistic context, i.e. removing the discourse marker should still leave enough residual information for the classifier to learn how to distinguish different relations."

Second, marked and unmarked examples must be similar enough for a classifier trained on the former to generalize to the latter.

The paper suggests that texts with lexically marked and lexically unmarked rhetorical relations may be inherently different, insofar as removing a discourse marker may change the meaning of a sentence. Classifiers trained on labelled sentences from which the markers have been removed perform little better than chance.

Example

Selected References

Theoretical and practical relevance:

Section 2, on related research, summarizes a number of important related works.

Examples of both lexically marked and unmarked rhetorical relations, given in the introduction and in the appendices, will be useful elsewhere.


