Machine learning techniques for persuasion detection in conversation

{{Summary
 * title=Machine Learning Techniques For Persuasion Detection In Conversation
 * authors=Pedro Ortiz
 * tags=persuasion, argumentation mining, machine learning, theses
 * summary=The core question of this Master's thesis, as the author puts it, is: “Can we learn to identify persuasion as characterized by Cialdini’s model using traditional machine learning techniques?” The author gives a qualified "yes": the methods work, but improvement is needed before they produce usable real-world results. The corpus used was developed in his colleague's Master's thesis, "Persuasion detection in conversation".

Persuasion model
The persuasion model is taken from Cialdini's 6 key principles of persuasion; examples are provided on pages 5-8 of the thesis.

Features

 * Word unigrams and bigrams
 * Gappy word bigrams
 * Orthogonal sparse word bigrams
 * Feature discrimination (stop words, entropy-based pruning)
 * TextTiling: segmentation using three stages (see Hearst [14] and Nomoto and Nitta [15])
 * 1) tokenization
 * 2) lexical score determination ("based on block comparison and vocabulary introduction")
 * 3) boundary identification
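
The word-pair features at the top of this list can be sketched as follows. This is a minimal illustration under one common reading of the terms: gappy bigrams pair words that occur within a small window of each other, and orthogonal sparse bigrams do the same but also record the distance between the two words. The function names, the `window` parameter, the `+`/`@` feature encoding, and the example sentence are all assumptions made for illustration; the thesis may define or parameterize these features differently.

```python
from typing import Iterator, List, Tuple


def _pairs_within(tokens: List[str], window: int) -> Iterator[Tuple[str, str, int]]:
    """Yield (left, right, distance) for every ordered word pair whose
    positions differ by 1..window (distance 1 means adjacent words)."""
    for i, left in enumerate(tokens):
        for dist in range(1, window + 1):
            if i + dist < len(tokens):
                yield left, tokens[i + dist], dist


def bigrams(tokens: List[str]) -> List[str]:
    """Ordinary bigrams: adjacent word pairs only."""
    return [f"{l}+{r}" for l, r, _ in _pairs_within(tokens, 1)]


def gappy_bigrams(tokens: List[str], window: int = 2) -> List[str]:
    """Word pairs within the window; the distance is NOT part of the feature."""
    return [f"{l}+{r}" for l, r, _ in _pairs_within(tokens, window)]


def orthogonal_sparse_bigrams(tokens: List[str], window: int = 2) -> List[str]:
    """Word pairs within the window; the distance IS part of the feature."""
    return [f"{l}+{r}@{d}" for l, r, d in _pairs_within(tokens, window)]


# Hypothetical example utterance, not drawn from the thesis corpus.
example = "please send the money".split()
```

With a window of 2, the pair ("please", "the") yields the same gappy feature wherever the two words occur near each other, while the orthogonal sparse variant keeps pairs at distance 1 and distance 2 as separate features.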

Machine Learning Techniques
Naive Bayes with add-one smoothing, Maximum Entropy, and Support Vector Machines (SVMs) were used. Results were evaluated with precision, recall, and F-score.
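
To make "Naive Bayes with add-one smoothing" concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier. It is not the thesis's implementation: the function names, the label strings, and the toy training data are hypothetical, and skipping words never seen in training is a simplifying assumption.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """docs: iterable of (tokens, label) pairs. Returns the counts needed to classify."""
    label_counts = Counter()            # number of training documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log posterior under add-one smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # assumption: ignore words never seen in training
            # Add-one (Laplace) smoothing: every vocabulary word gets one extra count,
            # so a word unseen for this label never drives the probability to zero.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only.
toy_docs = [
    ("everyone else already agreed".split(), "persuasion"),
    ("only two seats left".split(), "persuasion"),
    ("the weather is nice".split(), "other"),
    ("see you at lunch".split(), "other"),
]
model = train_naive_bayes(toy_docs)
```

Calling `predict(model, "only two left".split())` on this toy model returns `"persuasion"`, because each of the three words has a smoothed probability of 2/24 under that label versus 1/24 under the other.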

Evaluation

 * Precision
 * Recall
 * Accuracy - "the number of correct classifications in proportion to the size of the set being classified"
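
The metrics named in this summary can be computed as in the sketch below. It assumes a binary setting with one "positive" class and uses the balanced F1 as the F-score; the function name, the label strings, and that choice of F-score are assumptions, since this summary does not say which variant the thesis reports.

```python
def evaluate(gold, predicted, positive="persuasion"):
    """Compute precision, recall, F1 and accuracy for one positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Accuracy: correct classifications in proportion to the size of the set.
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical labels, for illustration only.
gold = ["persuasion", "persuasion", "other", "other", "other"]
pred = ["persuasion", "other", "persuasion", "other", "other"]
scores = evaluate(gold, pred)
```

On data where persuasive posts are rare, accuracy can look high even when few persuasive posts are found, which is one reason to report precision and recall alongside it.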

Persuasion model

 * R. Cialdini, Influence: The psychology of persuasion. New York, NY: Collins, 2007.

Texttiling
 * [14] M. Hearst, “TextTiling: Segmenting text into multi-paragraph subtopic passages,” Computational Linguistics, vol. 23, no. 1, pp. 33–64, 1997.
 * [15] T. Nomoto and Y. Nitta, “A grammatico-statistical approach to discourse partitioning,” in Proceedings of the 15th Conference on Computational Linguistics. Morristown, NJ: Association for Computational Linguistics, 1994, pp. 1145–1150.

Previous work in Persuasion detection

 * W.-H. Lin, T. Wilson, J. Wiebe, and A. Hauptmann, “Which side are you on?: Identifying perspectives at the document and sentence levels,” in CoNLL-X ’06: Proceedings of the Tenth Conference on Computational Natural Language Learning. Morristown, NJ, USA: Association for Computational Linguistics, 2006, pp. 109–116.


 * D. Bikel and J. Sorensen, “If we want your opinion,” in ICSC ’07: Proceedings of the International Conference on Semantic Computing. Washington D.C.: IEEE Computer Society, 2007, pp. 493–500.

 * H. T. Gilbert, “Persuasion detection in conversation,” Master’s thesis, Naval Postgraduate School, Monterey, CA, 2010.
 * relevance=

Interesting Observations
"the transcripts from the Davidian standoff in Waco, Texas were significantly different from the rest of the corpus." This may have bearing for other sciences studying these transcripts.

Useful summaries
The summaries of machine learning techniques given in the thesis are particularly interesting.

Suggestions for future work
The author identifies several areas where future work is needed:

Data set improvements

 * more and larger data sets
 * additional genres such as Web pages, blogs, and SMS messages
 * additional information
 * belief annotations
 * distance from the previous persuasive post
 * correct speaker tags
 * dialogue act tags

Feature set improvements

 * topic models
 * combining high recall features with high precision features

Future research

 * segmentation schemes
 * effects of time and sequence
 * the utility of bagging, boosting, and voting
 * the role of speaker type
 * the impact of parts of speech and syntax
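
Of the ensemble ideas listed above, voting is the simplest to illustrate. The sketch below combines the label sequences of several classifiers by simple majority. It is an illustration only, not something the thesis implements: the function name and the example predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """predictions: one list of labels per classifier, all of equal length.
    Returns, for each item, the label chosen by the most classifiers.
    With an odd number of binary classifiers there are no ties; otherwise
    ties go to the label that appears first among the tied classifiers."""
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three classifiers on four posts.
nb_pred = ["persuasion", "other", "other", "persuasion"]
maxent_pred = ["persuasion", "persuasion", "other", "other"]
svm_pred = ["other", "persuasion", "other", "persuasion"]
combined = majority_vote([nb_pred, maxent_pred, svm_pred])
```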