Argumentative Zoning for improved citation indexing

From AcaWiki


Citation: Simone Teufel Argumentative Zoning for improved citation indexing.



Tagged: Computer Science, citations, citation indexing, argumentation, argumentation zoning, information retrieval


Summary:

This paper addresses the "citation indexing" task, which was suggested in "Task-based evaluation of summary quality: Describing relationships between scientific papers" as a possible application of argumentative zoning.

Rhetorical citation maps could provide a glanceable summary of how the work in a particular paper relates to the literature it cites. First, contrasts and continuations are distinguished (using grey vs. black lines). Second, "the most important textual sentence about each citation can be displayed" directly in the citation map; this is taken to be an evaluative sentence about the citation.

Since evaluative statements about a citation may appear in neighboring sentences or in another section, the authors use machine learning (a Naive Bayes classifier based on Kupiec 1995) to classify and extract them. The 15 features used are summarized in Figure 5 (see 'Features' below). They are related to the features tested in "Discourse-level argumentation in scientific articles: Human and automatic annotation", but may have been renamed (e.g. "AbsLoc" was previously called "Relative Location"), added (e.g. SentLength: is the sentence longer than a certain threshold?), or dropped (e.g. verb negation).
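As a rough illustration of the classification step, the sketch below implements a small Naive Bayes classifier over discrete feature values. It is not the authors' implementation. The feature names "Citation" and "Action" echo features named in this summary, but their values, the add-one smoothing, and the six training rows are assumptions invented for illustration.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over discrete feature values, with add-one smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.values = defaultdict(set)            # feature -> values seen in training
        for row, label in zip(rows, labels):
            for feature, value in row.items():
                self.value_counts[(label, feature)][value] += 1
                self.values[feature].add(value)
        return self

    def log_scores(self, row):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior of the class
            for feature, value in row.items():
                count = self.value_counts[(label, feature)][value]
                # Add-one smoothing keeps unseen values from zeroing the product.
                score += math.log((count + 1) / (n + len(self.values[feature])))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Hypothetical toy training data (not taken from the paper).
rows = [
    {"Citation": "yes", "Action": "contrast"},
    {"Citation": "yes", "Action": "failure"},
    {"Citation": "yes", "Action": "continuation"},
    {"Citation": "no", "Action": "change"},
    {"Citation": "no", "Action": "none"},
    {"Citation": "no", "Action": "none"},
]
labels = ["CONTRAST", "CONTRAST", "BASIS", "BASIS", "OWN", "OWN"]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict({"Citation": "yes", "Action": "failure"}))
```

Working in log space avoids numeric underflow when many features are multiplied together, which matters more with 15 features than with the two used here.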

[Figure: rhetorical citation map]

Features

The most important features (in order) are: Absolute Sentence Location, Agent, Citations, Headlines, History, Formulaic, and Action.

Relation to previous work

This work follows from the 'argumentative zoning' project of Simone Teufel's thesis. See also "An annotation scheme for discourse-level argumentation in research articles" (which introduces the zones), "Discourse-level argumentation in scientific articles: Human and automatic annotation", "What's yours and what's mine: Determining intellectual attribution in scientific text", and "Task-based evaluation of summary quality: Describing relationships between scientific papers".

The argumentative zoning categories first introduced in "An annotation scheme for discourse-level argumentation in research articles" are given, along with examples of each. The CONTRAST and BASIS categories are used as a proxy for finer classification schemes in the field of Content Citation Analysis (see Weinstock 1971).

Metadiscourse

This paper gives a useful description of metadiscourse. Based on Myers 1992, the authors use metadiscourse to mean "the set of expressions that talk about the act of presenting research in a paper, rather than the research itself." Drawing from Swales 1990, who observed that "the argumentation of the paper is rather prototypical", they use the Formulaic feature to collect 1762 phrases and their variations. Likewise, the Agent feature represents the grammatical subjects; often this agent is the one being attributed. The Verb features are also related.

Verbs

"There is a set of verbs that is often used when the overall scientific goal of a paper is defined." ("propose, present, report, suggest" -- and to a lesser extent "describe, discuss, give, introduce, put forward, show, sketch, state, and talk about").

Verbs are very useful in distinguishing CONTRAST sentences (which use verbs of failure and contrast - see Figs 8 and 9) from CONTINUATION ones (which use verbs of continuation and change - see Figs 6 and 7).

Verbs of continuation: adopt, agree with, base, be based on, be derived from, be originated in, be inspired by, borrow, build on, follow, originate from, originate in, side with
Verbs of change: adapt, adjust, augment, combine, change, decrease, elaborate on, expand, extend, derive, incorporate, increase, manipulate, modify, optimize, refine, render, replace, revise, substitute, tailor, upgrade
Verbs of failure: abound, aggravate, arise, be cursed, be incapable of, be forced to, be limited to, be problematic, be restricted to, be troubled, be unable to, contradict, damage, degrade, degenerate, fail, fall prey, fall short, force oneself, force, hinder, impair, impede, inhibit, lack, misclassify, misjudge, mistake, misuse, neglect, obscure, overestimate, overfit, overgeneralize, overgenerate, overlook, pose, plague, preclude, prevent, resort to, restrain, run into problems, settle for, spoil, suffer from, threaten, thwart, underestimate, undergenerate, violate, waste, worsen
Verbs of contrast: be different from, be distinct from, conflict, contrast, clash, differ from, distinguish oneself, differentiate, disagree, disagreeing, dissent, oppose
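
To show how such verb lists might be used as a feature, the sketch below matches a small subset of the four verb classes against a sentence and maps the result to a CONTRAST or CONTINUATION hint. This is only a sketch: the regular-expression matching, the crude handling of inflection, the function names, and the example sentences (with "[CITE]" as a citation placeholder) are all assumptions, not the paper's method.

```python
import re

# A subset of the verb lists above, keyed by verb class.
VERB_CLASSES = {
    "continuation": ["adopt", "be based on", "build on", "follow", "borrow"],
    "change": ["adapt", "extend", "refine", "augment", "incorporate"],
    "failure": ["fail", "lack", "suffer from", "be unable to", "overlook"],
    "contrast": ["differ from", "contrast", "disagree", "oppose", "conflict"],
}

BE_FORMS = r"(?:be|is|are|was|were|been)"


def phrase_pattern(phrase):
    """Compile one verb phrase into a case-insensitive regular expression."""
    words = phrase.split()
    if words[0] == "be":
        # "be based on" should also match "is based on", "was based on", ...
        parts = [BE_FORMS] + [re.escape(w) for w in words[1:]]
    else:
        # Crude inflection handling: allow any suffix on the head verb only.
        parts = [re.escape(words[0]) + r"\w*"] + [re.escape(w) for w in words[1:]]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


PATTERNS = {
    name: [phrase_pattern(p) for p in phrases]
    for name, phrases in VERB_CLASSES.items()
}


def verb_classes(sentence):
    """Return the set of verb classes with at least one match in the sentence."""
    return {
        name
        for name, patterns in PATTERNS.items()
        if any(p.search(sentence) for p in patterns)
    }


def stance_hint(sentence):
    """Map matched verb classes to a coarse stance hint, or None if nothing matches."""
    found = verb_classes(sentence)
    if found & {"failure", "contrast"}:
        return "CONTRAST"
    if found & {"continuation", "change"}:
        return "CONTINUATION"
    return None


# Hypothetical example sentences.
print(stance_hint("Our method builds on the parser of [CITE]."))
print(stance_hint("Their approach fails on long sentences and is unable to handle ellipsis."))
print(stance_hint("We describe the corpus in Section 2."))
```

The suffix wildcard will also match nouns such as "failure" or "adaptation"; a real system would more plausibly lemmatize and part-of-speech tag the sentence first.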


Human Annotation

This section briefly reviews the annotation work from "What's yours and what's mine: Determining intellectual attribution in scientific text" and extends it with the citation indexing task in mind.

In this case, annotators needed to identify the citation(s) associated with each evaluative statement (CONTRAST, BASIS). Citations can appear before or after the evaluative statement, so the distance between the evaluative statement and the citation must be indicated as well. Further, each citation may have several statements made about it. Contrastive evaluation statements, for instance, can be several sentences after the citation.
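
One possible way to represent such annotations is sketched below: each record links an evaluative statement to one citation and derives a signed sentence distance. The class name, field names, and citation keys are hypothetical; the paper's actual annotation format is not described in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluativeLink:
    """Links one evaluative statement to one citation (hypothetical record layout)."""

    statement_index: int  # sentence index of the evaluative statement
    citation_index: int   # sentence index of the citation it refers to
    citation_key: str     # identifier of the cited work
    stance: str           # "CONTRAST" or "BASIS"

    @property
    def distance(self):
        """Signed distance in sentences; positive when the statement follows the citation."""
        return self.statement_index - self.citation_index


def statements_by_citation(links):
    """Group links by citation, since one citation may attract several statements."""
    grouped = {}
    for link in links:
        grouped.setdefault(link.citation_key, []).append(link)
    return grouped


# Hypothetical annotations for one paper.
links = [
    EvaluativeLink(12, 9, "cite-A", "CONTRAST"),  # statement 3 sentences after the citation
    EvaluativeLink(3, 5, "cite-B", "BASIS"),      # statement 2 sentences before the citation
    EvaluativeLink(20, 9, "cite-A", "BASIS"),     # a second statement about the same citation
]

grouped = statements_by_citation(links)
for key, items in grouped.items():
    print(key, [(link.stance, link.distance) for link in items])
```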

Citation patterns

The authors identified 6 "patterns of citing and author stance statements":

  1. citation, CONTRAST follows after a few sentences
  2. citation, BASIS follows after a few sentences
  3. approach criticized (CONTRAST), then described in following sentences
  4. citing with no evaluation
  5. BASIS embedded in a paragraph about OWN work
  6. CONTRAST embedded in OWN (without repetition of the citation) -- e.g. to contrast the paper's own results with those of the cited work
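
As a sketch of how patterns 1, 2, and 4 above might be told apart once sentences carry zone labels, the function below looks ahead from a citing sentence for the first CONTRAST or BASIS sentence. The look-ahead window of three sentences, the rule of stopping at the next citation, the "OTHER" label, and the input layout are all assumptions for illustration. Patterns 3, 5, and 6 are not handled.

```python
def pattern_for_citation(zones, cited, index, window=3):
    """Return pattern number 1, 2, or 4 for the citing sentence at `index`.

    zones: zone label for each sentence, e.g. "CONTRAST", "BASIS", "OWN", "OTHER"
    cited: one boolean per sentence, True if the sentence contains a citation
    window: how many following sentences to inspect (an assumed value)
    """
    for offset in range(1, window + 1):
        j = index + offset
        # Stop at the end of the text or when another citation begins.
        if j >= len(zones) or cited[j]:
            break
        if zones[j] == "CONTRAST":
            return 1  # citation, CONTRAST follows after a few sentences
        if zones[j] == "BASIS":
            return 2  # citation, BASIS follows after a few sentences
    return 4  # citing with no evaluation found in the window


# Hypothetical labeled sentence sequence.
zones = ["OTHER", "OTHER", "CONTRAST", "OWN", "OTHER", "BASIS", "OTHER", "OWN"]
cited = [True, False, False, False, True, False, True, False]

patterns = [pattern_for_citation(zones, cited, i) for i, c in enumerate(cited) if c]
print(patterns)
```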

Future Work

The future work section of the paper is detailed, particularly regarding planned evaluations.

