Behind the article: Recognizing dialog acts in Wikipedia talk pages

{{Summary
 * title=Behind the article: Recognizing dialog acts in Wikipedia talk pages
 * authors=Oliver Ferschke and Iryna Gurevych and Yevgen Chebotar
 * url=http://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2012/EACL_2012_OF.pdf
 * tags=Wikipedia, Wikipedia Talk pages, speech act theory, dialog acts, collaborative writing, NLP, corpus-building, machine learning, content analysis, Simple English Wikipedia
 * summary=This paper uses speech act theory and dialog acts as a theoretical framework for studying collaborative writing in Wikipedia. The overall goal of this line of research (in its early stages) is to understand how Talk pages contribute to article improvement.

The authors use Simple English Wikipedia in order to have significant coverage. They used the 2011-04-04 snapshot, which has 69900 articles and 5783 Talk pages, of which 683 contained more than 3 contributions. The paper analyzes 100 Talk pages, which the authors describe as "almost 15% of Talk pages from that wiki". That figure is relative to the 683 pages with at least 4 turns (100/683 ≈ 14.6%), since all shorter Talk pages were ignored; measured against all 5783 Talk pages, the sample is under 2%.

It provides:
 * 1) an annotation schema for coordination-related dialog acts,
 * 2) a freely downloadable corpus of 100 segmented and annotated Talk pages, called the Simple English Wikipedia Discussion Corpus,
 * 3) a machine learning procedure for classifying dialog acts on this corpus (described as "a dialog act classification pipeline").

Annotation schema
The authors have 4 high-level classifications broken down into 17 subclassifications:
 * 1) Article Criticism
    * Content incomplete or lacking detail
    * Lack of accuracy or correctness
    * Unsuitable or unnecessary content
    * Structural problems
    * Deficiencies in language or style
    * Objectivity issues
    * Other kind of criticism
 * 2) Explicit Performative Announce
    * Explicit suggestion, recommendation or request
    * Explicit reference or pointer
    * Commitment to an action in the future
    * Report of a performed action
 * 3) Information Content
    * Information providing
    * Information seeking
    * Information correcting
 * 4) Interpersonal
    * Positive attitude towards other contributor or acceptance
    * Partial acceptance or partial rejection
    * Negative attitude towards other contributor or rejection



Page selection
They used three classes, and randomly chose pages from these classes:
 * 1) 4-10 turns (50 pages)
 * 2) 11-20 turns (40 pages)
 * 3) more than 20 turns (10 pages)
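
The selection above amounts to stratified random sampling by turn count. The following is a minimal sketch of that idea, not the authors' code: the function name `sample_pages`, the `turn_counts` input, and the fixed seed are assumptions made for illustration.

```python
import random

# Strata from the list above: (min_turns, max_turns or None, pages to draw).
STRATA = [(4, 10, 50), (11, 20, 40), (21, None, 10)]

def sample_pages(turn_counts, strata=STRATA, seed=0):
    """Draw a stratified random sample of Talk pages.

    turn_counts: dict mapping a page title to its number of turns
    (a hypothetical input; the paper obtains turns by segmentation).
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    sample = []
    for low, high, quota in strata:
        # Sort the pool so the result does not depend on dict ordering.
        pool = sorted(
            title for title, n in turn_counts.items()
            if n >= low and (high is None or n <= high)
        )
        if len(pool) < quota:
            raise ValueError(f"stratum {low}-{high} has only {len(pool)} pages")
        sample.extend(rng.sample(pool, quota))
    return sample
```

Pages with fewer than 4 turns fall outside every stratum and are therefore never drawn, which matches the exclusion described earlier.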

They do not report the distribution of turn counts across Talk pages; that would be useful for judging how representative these three classes are.

Segmentation
They take a thorough approach to automatically segmenting Talk pages into threads, topics, and individual author contributions, using recent libraries such as the Java-based Wikipedia library and the Wikipedia Revision Toolkit, the latter described in Wikipedia Revision Toolkit: Efficiently accessing Wikipedia’s edit history.

Annotation
Annotations were made by two annotators who were trained on 10 (held-out) discussion pages. Annotators could discuss difficult cases and consult the coordinator. A third person reconciled annotations, choosing a value for the gold standard when the others did not agree.
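
The reconciliation step can be sketched as follows. This is an illustrative assumption about how such a merge might be coded, not the authors' tooling: `reconcile`, its arguments, and the `adjudicate` callback (standing in for the third person) are hypothetical names.

```python
def reconcile(ann_a, ann_b, adjudicate):
    """Merge two annotators' labels into a gold standard.

    ann_a, ann_b: dicts mapping a turn id to that annotator's label.
    adjudicate: callable (turn_id, a, b) -> label, standing in for the
    third person who decides when the annotators disagree.
    Returns (gold, disputed_turn_ids).
    """
    if ann_a.keys() != ann_b.keys():
        raise ValueError("both annotators must label the same turns")
    gold, disputed = {}, []
    for turn_id in sorted(ann_a):
        a, b = ann_a[turn_id], ann_b[turn_id]
        if a == b:
            gold[turn_id] = a  # agreement: take the shared label
        else:
            disputed.append(turn_id)
            gold[turn_id] = adjudicate(turn_id, a, b)
    return gold, disputed
```

Returning the disputed turn ids alongside the gold labels makes it easy to see which labels caused the most disagreement.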

Further discussion of the annotation procedures (e.g. an appendix with the annotation manual) would make this work easier to reuse; the authors comment that some labels (e.g. interpersonal category, "other kind of criticism") were particularly problematic, and should be revisited in the future.

Indirectness is one of the difficulties the authors mention, and they accordingly provide a useful discussion of conversational implicature theory, with some examples.

File formats
They used the open source MMAX2 software, and the data from their corpus is released in MMAX2's native format as well as UIMA's XMI format.

Machine learning pipeline
Using Weka, they compare three machine learners (Naive Bayes, a decision tree algorithm, and SMO, a training algorithm for support vector machines) and combine the best performers for each label into a UIMA-based classification pipeline. They describe feature selection and classification results, pointing out some issues with inter-annotator agreement. One interesting finding is that the algorithms can sometimes outperform humans, for instance on the "Other kind of criticism" and "Unsuitable or unnecessary content" classes.
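
The idea of keeping the best performer for each label can be illustrated with a small sketch. It assumes each label is treated as a separate yes/no decision scored by F1; the function names, the input layout, and the tie-breaking rule are assumptions for illustration, and the paper's actual Weka/UIMA setup and evaluation protocol are not reproduced here.

```python
def f1_score(gold, predicted):
    """F1 for one binary label; gold and predicted are lists of booleans."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_learner_per_label(gold, predictions):
    """Pick, for every label, the learner with the highest F1.

    gold: dict mapping a label to a list of booleans (one per turn).
    predictions: dict mapping a learner name to a dict shaped like gold.
    Ties go to the learner whose name sorts first.
    """
    best = {}
    for label, truth in gold.items():
        scores = {name: f1_score(truth, preds[label])
                  for name, preds in predictions.items()}
        best[label] = max(sorted(scores), key=scores.get)
    return best
```

The resulting mapping from label to learner is what a combined pipeline would consult when deciding which model to apply for each label.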

Underlying theory

 * John L. Austin. 1962. How to do things with words. Clarendon Press, Cambridge, UK


 * Paul Grice. 1975. Logic and conversation. In Peter Cole and Jerry L. Morgan, editors, Syntax and Semantics, volume 3. New York: Academic Press.


 * Ilona R. Posner and Ronald M. Baecker. 1992. How people write together. In Proceedings of the 25th Hawaii International Conference on System Sciences, pages 127–138, Wailea, Maui, HI, USA.


 * John R. Searle. 1969. Speech acts. Cambridge University Press, Cambridge, UK.


 * John R. Searle. 1976. A classification of illocutionary acts. Language in Society, 5:1–23.

Dialog act classification in other genres

 * Tamitha Carpenter and Emi Fujioka. 2011. The role and identification of dialog acts in online chat. In Proceedings of the Workshop on Analyzing Microtext at the 25th AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.


 * William W. Cohen, Vitor R. Carvalho, and Tom M. Mitchell. 2004. Learning to classify email into "speech acts". In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pages 309–316, Barcelona, ES.

Information quality

 * Eti Yaari, Shifra Baruchson-Arbib, and Judit Bar-Ilan. 2011. Information quality assessment of community generated content: A user study of Wikipedia. Journal of Information Science, 37:487-498.
 * Besiki Stvilia, Michael B. Twidale, Linda C. Smith, and Les Gasser. 2008. Information quality work organization in Wikipedia. Journal of the American Society for Information Science, 59:983–1001.

Talk pages

 * David Laniado, Riccardo Tasso, Yana Volkovich, and Andreas Kaltenbrunner. 2011. When the Wikipedians talk: Network and tree structure of Wikipedia discussion pages. In Proceedings of the 5th International AAAI Conference on Weblogs and Social Media, Dublin, IE.


 * Jodi Schneider, Alexandre Passant, and John G. Breslin. 2011. Understanding and improving Wikipedia article discussion spaces. In Proceedings of the 26th Symposium on Applied Computing, Taichung, TW.


 * Fernanda Viégas, Martin Wattenberg, Jesse Kriss, and Frank Ham. 2007. Talk before you type: Coordination in Wikipedia. In Proceedings of the 40th Annual Hawaii International Conference on System Sciences, Waikoloa, Big Island, HI, USA.

Wikipedia-specific Tools
The code consists of the Java-based Wikipedia library and the Wikipedia Revision Toolkit, described in two papers:
 * Oliver Ferschke, Torsten Zesch, and Iryna Gurevych. 2011. Wikipedia Revision Toolkit: Efficiently accessing Wikipedia’s edit history. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. System Demonstrations, pages 97–102, Portland, OR, USA. See open-source code for accessing revisions, etc.
 * Torsten Zesch, Christof Müller, and Iryna Gurevych. 2008. Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary. In Proceedings of the 6th International Conference on Language Resources and Evaluation, Marrakech, MA. See open-source code.
 * relevance=Most important is their free downloadable corpus, the Simple English Wikipedia Discussion Corpus, of 100 segmented and annotated Talk pages.

Their segmentation approaches should be considered as a candidate for the standard way to segment Talk pages.

They also provide a useful discussion of conversational implicature theory, with some examples from their corpus.

Useful definitions
The authors provide some useful definitions: "We define a turn (or contribution) as the body of text that is added by an individual contributor in one or more revisions to a single discussion topic until another contributor edits the page."
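
The turn definition can be made concrete with a short sketch that groups a page's revisions into turns. This is one possible reading of the definition, not the authors' segmentation code: the `Revision` record, its fields, and `revisions_to_turns` are hypothetical, and the input is assumed to be in chronological order with each revision already assigned to a topic.

```python
from dataclasses import dataclass

@dataclass
class Revision:
    author: str      # contributor who saved the revision
    topic: str       # discussion topic the added text belongs to
    added_text: str  # text added by this revision

def revisions_to_turns(revisions):
    """Group chronological revisions into (author, topic, text) turns.

    A turn collects everything one contributor adds to one topic until
    another contributor edits the page.
    """
    turns = []
    open_turns = {}  # topic -> index in turns, for the current author's run
    current_author = None
    for rev in revisions:
        if rev.author != current_author:
            # Another contributor edited the page: close all open turns.
            current_author = rev.author
            open_turns = {}
        if rev.topic in open_turns:
            idx = open_turns[rev.topic]
            author, topic, text = turns[idx]
            turns[idx] = (author, topic, text + "\n" + rev.added_text)
        else:
            open_turns[rev.topic] = len(turns)
            turns.append((rev.author, rev.topic, rev.added_text))
    return turns
```

Under this reading, two consecutive revisions by the same contributor to the same topic merge into one turn, while the same contributor returning after someone else has edited starts a new one.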

"a topic (or discussion) is the body of turns that revolve around a single matter. They are usually headed by a topic title."

"the thread structure designates the sequence of turns and their indentation levels on the Talk page."

Graphing method

 * Figure 2 uses an interesting graphing approach: it shows F1-scores for the classification pipeline ("best performance") as a bar chart, along with human and baseline performance marked on the individual bars.

Dialog acts
Their literature review, spread across the "related work" and "classification results" sections, provides a useful introduction to the theory and applications of dialog acts, including some recent work in learning analytics. }}
 * journal=Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
 * pub_date=2012
 * subject=Computer Science