Using automatically labelled examples to classify rhetorical relations: An assessment

From AcaWiki

Citation: Caroline Sporleder, Alex Lascarides (2008) Using automatically labelled examples to classify rhetorical relations: An assessment. Natural Language Engineering
Download: http://homepages.inf.ed.ac.uk/alex/pubs/jnle.rhetorical.html
Tagged: Computer Science, discourse analysis, computational linguistics, rhetorical relations, discourse markers, rhetorical structure, discourse relations

Summary

This paper takes up the question of whether rhetorical relations can be automatically derived and classified. It focuses in particular on discourse markers. These may be ambiguous (e.g. 'since' and 'yet' have multiple uses and are sometimes, but not always, discourse markers), and they may also be missing altogether.

The authors comment that "what is needed is a model which can classify rhetorical relations in the absence of an explicit discourse marker" (p. 4). Previous work (e.g. Marcu & Echihabi 2002) has suggested creating training data for a classifier by labelling examples which contain an unambiguous lexically marked rhetorical relation, then removing the markers. The main purpose of this paper is to test that approach empirically.
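The labelling idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the marker table, the relation names, and the function names `label_and_strip` and `build_training_set` are all assumptions made for the example, and a real system would also split each example into its two spans.

```python
import re

# Hypothetical marker-to-relation table, for illustration only.
# It is not the marker inventory used in the paper.
MARKER_TO_RELATION = {
    "because": "explanation",
    "although": "contrast",
    "as a result": "result",
}


def label_and_strip(sentence):
    """Label a sentence by its marker, then remove the marker.

    Returns (text_without_marker, relation), or None when the sentence
    contains no marker from the table or more than one distinct marker.
    """
    hits = []
    for marker, relation in MARKER_TO_RELATION.items():
        # Match the marker as a whole word, plus a trailing comma and spaces.
        pattern = re.compile(r"\b" + re.escape(marker) + r"\b,?\s*", re.IGNORECASE)
        if pattern.search(sentence):
            hits.append((pattern, relation))
    if len(hits) != 1:
        return None
    pattern, relation = hits[0]
    stripped = pattern.sub("", sentence, count=1)
    # Collapse any doubled whitespace left behind by the removal.
    stripped = re.sub(r"\s+", " ", stripped).strip()
    return stripped, relation


def build_training_set(sentences):
    """Keep only the sentences that could be labelled automatically."""
    examples = []
    for sentence in sentences:
        result = label_and_strip(sentence)
        if result is not None:
            examples.append(result)
    return examples
```

Calling `build_training_set` on a list of raw sentences would return only those with exactly one known marker, each paired with its assumed relation label and with the marker removed. The resulting text is what a classifier would be trained on, which is why the two conditions discussed next matter.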

The paper also makes an interesting theoretical observation. Two conditions are needed for training on marked examples to work well:

"First, there has to be a certain amount of redundancy between the discourse marker and the general linguistic context, i.e. removing the discourse marker should still leave enough residual information for the classifier to learn how to distinguish different relations."

Second, similarity between marked and unmarked examples is needed so that a classifier can make generalizations.

The paper suggests that texts with lexically marked and lexically unmarked rhetorical relations may be inherently different, insofar as removing a discourse marker may change the meaning of a sentence. It also finds that classifiers trained on examples from which the markers have been removed work little better than chance.

Example

Selected References

Theoretical and Practical Relevance

Section 2, on related research, summarizes a number of important related works.

Examples of both lexically marked and unmarked rhetorical relations, given in the introduction and in the appendices, will be useful elsewhere.