Cats rule and dogs drool!: Classifying stance in online debate


Citation: Pranav Anand, Marilyn Walker, Rob Abbott, Jean E. Fox Tree, Robeson Bowmani, and Michael Minor (2011) Cats rule and dogs drool!: Classifying stance in online debate. Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis.
Tagged: Computer Science, online argumentation, NLP, stance, disagreement, Mechanical Turk, natural language processing, discourse analysis, cue words

Summary

The paper investigates the problem of classifying stance, using 1,113 two-sided debates on 12 topics from the debate website Convinceme.com.

They note that there are three dialogue structure elements at Convinceme.com (a rough sketch of a post record follows this list):

  1. the side (e.g. pro or con)
  2. explicit rebuttal links
  3. temporal context/state of the debate at a particular time.
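A minimal sketch of how a post carrying these three elements might be represented, assuming a simple Java record; the class and field names here are hypothetical, not taken from the paper:

  // Hypothetical per-post record mirroring the three structure elements above.
  import java.time.Instant;
  import java.util.Optional;

  public record DebatePost(
          String author,
          String text,
          String side,                  // which side of the two-sided debate the post argues
          Optional<String> rebuttalOf,  // explicit rebuttal link to a parent post, if any
          Instant postedAt              // temporal context: when the post entered the debate
  ) {
      public boolean isRebuttal() {
          return rebuttalOf.isPresent();
      }
  }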

They identify rebuttals with 63% accuracy and classify the side of a post with 54-69% accuracy using lexical and contextual features (compared to 49-60% for a unigram baseline).

Human annotators, by contrast, can classify the side of a post correctly 73% of the time for rebuttals and 87% of the time for non-rebuttals. Some posts were hard to classify: among posts where only 4-6 of the 9 annotators were correct, 39% were short comments or ad hominem responses, 17% were ambiguous or out-of-context comments, and 10% were meta-debate comments.

They distinguish ideological and non-ideological topics. Ideological topics have:

  • more posts per author
  • more rebuttals per topic
  • more context-dependence

Yet post length is not correlated with these.

Rebuttals have the following characteristics (a small counting sketch follows this list):

  • more "markers of dialogic interaction"
  • more pronouns (you, that, it)
  • more ellipsis
  • more dialogic cue words
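As an illustration of how two of these markers could be measured, here is a small counting sketch; this is not the authors' code, the pronoun list comes from the item above, and the cue-word list is only an illustrative guess:

  import java.util.Arrays;
  import java.util.List;
  import java.util.Map;
  import java.util.stream.Collectors;

  public class DialogicMarkers {
      // "you", "that", "it" are the pronouns named above; the cue words are illustrative.
      private static final List<String> PRONOUNS = List.of("you", "that", "it");
      private static final List<String> CUE_WORDS = List.of("actually", "so", "well", "no", "because");

      private static String[] tokenize(String post) {
          return post.toLowerCase().replaceAll("^\\W+", "").split("\\W+");
      }

      // Count occurrences of each dialogic pronoun in a post.
      public static Map<String, Long> pronounCounts(String post) {
          String[] tokens = tokenize(post);
          return PRONOUNS.stream().collect(Collectors.toMap(
                  p -> p,
                  p -> Arrays.stream(tokens).filter(p::equals).count()));
      }

      // Check whether the post opens with one of the cue words.
      public static boolean startsWithCueWord(String post) {
          String[] tokens = tokenize(post);
          return tokens.length > 0 && CUE_WORDS.contains(tokens[0]);
      }
  }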

Human Annotation

Mechanical Turk was used to collect human annotations judging which side a post was on, without surrounding context. The authors present some interesting results about which posts are easier and harder to place.

Machine Learning

They used the Weka toolkit with two classifiers, NaiveBayes and JRip, and the following feature sets (a hedged sketch of this setup follows the list):

  • Post info (IsRebuttal, Poster)
  • Unigrams
  • Bigrams
  • Cue words (initial unigram, bigram, and trigram)
  • Repeated punctuation (collapsed into ??, !!, ?!)
  • LIWC measures and frequencies
  • Dependencies derived from the Stanford parser
  • Generalized dependencies (POS of the head word, opinion polarity of both words)
  • Context features (the same features computed for the parent post)
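As a concrete illustration of two of the surface features above, here is a rough sketch of extracting the initial n-gram cue words and collapsing repeated punctuation into the classes ??, !!, and ?!. This is not the authors' code; the method names, the cue-word prefix, and the exact collapsing rules are assumptions:

  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  public class SurfaceFeatures {
      private static final Pattern PUNCT_RUN = Pattern.compile("[?!]{2,}");

      // Cue-word features: the initial unigram, bigram, and trigram of the post.
      public static List<String> initialNgrams(String post) {
          String[] tokens = post.trim().toLowerCase().split("\\s+");
          List<String> feats = new ArrayList<>();
          if (tokens.length == 0 || tokens[0].isEmpty()) return feats;
          for (int n = 1; n <= 3 && n <= tokens.length; n++) {
              feats.add("cue_" + String.join("_", Arrays.copyOfRange(tokens, 0, n)));
          }
          return feats;
      }

      // Collapse runs of repeated punctuation into the classes ??, !!, and ?!.
      public static String collapsePunctuation(String post) {
          Matcher m = PUNCT_RUN.matcher(post);
          StringBuilder out = new StringBuilder();
          while (m.find()) {
              String run = m.group();
              String collapsed = run.contains("?") && run.contains("!") ? "?!"
                               : run.startsWith("?") ? "??" : "!!";
              m.appendReplacement(out, Matcher.quoteReplacement(collapsed));
          }
          m.appendTail(out);
          return out.toString();
      }
  }

And a minimal sketch of the modeling setup, assuming the features have already been written to an ARFF file (the file name, attribute layout, and 10-fold cross-validation are assumptions, not details from the paper); it runs the two classifiers named above, NaiveBayes and JRip, through Weka's standard evaluation API:

  import java.util.Random;
  import weka.classifiers.Classifier;
  import weka.classifiers.Evaluation;
  import weka.classifiers.bayes.NaiveBayes;
  import weka.classifiers.rules.JRip;
  import weka.core.Instances;
  import weka.core.converters.ConverterUtils.DataSource;

  public class StanceClassification {
      public static void main(String[] args) throws Exception {
          // Load a (hypothetical) feature file; the last attribute is assumed to be the side label.
          Instances data = new DataSource("stance_features.arff").getDataSet();
          data.setClassIndex(data.numAttributes() - 1);

          for (Classifier clf : new Classifier[] { new NaiveBayes(), new JRip() }) {
              Evaluation eval = new Evaluation(data);
              eval.crossValidateModel(clf, data, 10, new Random(1));
              System.out.printf("%s accuracy: %.1f%%%n",
                      clf.getClass().getSimpleName(), eval.pctCorrect());
          }
      }
  }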

See also

An extended journal version of this paper was submitted as That’s your evidence?: Classifying stance in online political debate

Theoretical and Practical Relevance

Their corpus is available!

The work may contribute to several long-term goals; the authors suggest that it is motivated by:

  1. Automatic summarization
  2. Understanding persuasiveness
  3. "Identifying the linguistic reflexes of perlocutionary acts" (e.g. persuasion, disagreement)

The paper includes a significant review of related work.