Thematic coherence and quotation practices in OSS design-oriented online discussions

Citation: Flore Barcellini, Françoise Détienne, Jean-Marie Burkhardt, Warren Sack (2005) Thematic coherence and quotation practices in OSS design-oriented online discussions. Proceedings of the 2005 international ACM SIGGROUP conference on Supporting group work
DOI (original publisher): 10.1145/1099203.1099237
Tagged: quotation, coherence, open source software, Open Source software community, CSCW, ontologies, design, open source software design

Summary

Rather than focusing on threads created by reply-to chains, this paper focuses on quotation.

Hypotheses

The authors hypothesize that "quotation-based representations are more relevant than threading-based representations to reconstruct thematic coherence of design-oriented online discussion" and that quotation patterns show the influence of certain participants in open source projects. The study's findings support both hypotheses.

Background

The paper reviews previous work on coherence, identifying quotation as an online analogue of (face-to-face) turn-taking.

Mailing lists and forums are generally displayed by thread and/or posting time. However, previous prototypes (such as Conversation Map and Zest) have automatically identified quotes and represented the quotation links between messages.

Study

This paper is part of Barcellini's ongoing hand-analysis of the Python open source software community. In this paper, quotes are identified through a combination of hand and automatic analysis.

They provide a categorization of message structure:

  1. text-only messages (no quotations)
  2. one-quote messages (one block of quotations, followed by a comment)
  3. multiple-quote messages (alternating quotes and comments)
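
The three categories above can be sketched as a small classifier. This is only an illustration, not the authors' tooling: it assumes quoted lines start with ">" (a common e-mail convention that the summary does not confirm was used), and the function name classify_message and the category labels are hypothetical.

```python
from itertools import groupby


def classify_message(body: str) -> str:
    """Classify a message body by its quote structure.

    Assumption: a quoted line starts with '>' after optional whitespace.
    Attribution lines such as "X wrote:" are treated as ordinary comment text.
    """
    # Ignore blank lines so that a quote block split by empty lines
    # still counts as a single block.
    lines = [ln for ln in body.splitlines() if ln.strip()]

    # Collapse consecutive lines of the same kind into runs.
    kinds = ("quote" if ln.lstrip().startswith(">") else "comment" for ln in lines)
    runs = [kind for kind, _ in groupby(kinds)]

    quote_blocks = runs.count("quote")
    if quote_blocks == 0:
        return "text-only"
    if quote_blocks == 1:
        return "one-quote"
    return "multiple-quotes"
```

A message with a single quoted block followed by a reply is reported as "one-quote"; interleaved quotes and replies are reported as "multiple-quotes".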

Two discussion topics are analysed, both drawn from the Python Enhancement Proposals (PEPs):

  1. "PEP 279 proposes three different enhancements to Python: (1) a new index builtin function; (2) a way to facilitate generator comprehension; and, (3) a means for generator exception passing"
  2. PEP 285 proposes the introduction of booleans as a built-in type

Only 4-8% of the messages in these discussions contained no quote. "Reply-to" threading breaks each discussion into several threads. The authors identify 5 to 6 themes in these discussions, all relating to technical design problems, but most messages discuss multiple themes. Quoting, the authors conclude, helps preserve thematic coherence because it links the messages together.

One interesting finding is that the messages that are NOT linked to other messages (that is, the ones that don't quote other messages) are "pivotal to the overall discussion", sometimes generating new branches of discussion. In this case, reply-to threading is particularly harmful because it obscures the connections other messages have to them.
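
One way to picture this finding is as a graph of quotation links. The sketch below is a hedged illustration rather than the method used in the paper: the message dictionaries, the "quotes" field, and both function names are hypothetical. It lists messages that quote nothing themselves but are quoted by at least one other message, which is one rough proxy for the branch-starting messages described above.

```python
from collections import defaultdict


def quotation_links(messages):
    """Map each message id to the set of ids of messages that quote it."""
    quoted_by = defaultdict(set)
    for msg in messages:
        for target in msg["quotes"]:
            quoted_by[target].add(msg["id"])
    return quoted_by


def unquoting_but_quoted(messages):
    """Ids of messages that quote no one yet are quoted by others."""
    quoted_by = quotation_links(messages)
    return sorted(
        msg["id"]
        for msg in messages
        if not msg["quotes"] and quoted_by.get(msg["id"])
    )


# Made-up example data, not taken from the PEP discussions.
sample = [
    {"id": "m1", "quotes": []},            # starts a branch
    {"id": "m2", "quotes": ["m1"]},
    {"id": "m3", "quotes": ["m1", "m2"]},  # links two earlier messages
    {"id": "m4", "quotes": []},            # quotes nothing, never quoted
]
print(unquoting_but_quoted(sample))
```

Because a message may quote several earlier messages (as m3 does here), the result is a graph rather than the tree that reply-to threading produces.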

Synchronicity

Although the project is distributed geographically, "responses quoting a message are quickly posted": "half (median) of the 1st quotations of posted messages come within 1h (PEP 279) or 2h16 (PEP 285). Furthermore, 3/4 of the quotations occur within 5h (PEP 279) to 7h33 (PEP 285)."
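
Statistics of this kind (median and third quartile of the delay before a message is first quoted) could be computed roughly as follows. This is a minimal sketch under stated assumptions: the message records with "id", "sent" and "quotes" fields are hypothetical, and the paper's exact procedure and quartile convention are not given in this summary.

```python
from datetime import datetime, timedelta
from statistics import median, quantiles


def first_quotation_delays(messages):
    """Return, in minutes, the delay between each quoted message and the
    earliest message that quotes it (sorted ascending)."""
    sent = {m["id"]: m["sent"] for m in messages}
    first = {}
    for m in messages:
        for target in m["quotes"]:
            if target not in sent:
                continue  # quoted message is outside the archive
            delay = m["sent"] - sent[target]
            if target not in first or delay < first[target]:
                first[target] = delay
    return sorted(d.total_seconds() / 60 for d in first.values())


def summarize(delays_min):
    """Median and third quartile; needs at least two delays."""
    _, _, q3 = quantiles(delays_min, n=4)  # default 'exclusive' method
    return {"median_min": median(delays_min), "q3_min": q3}


# Made-up timestamps for illustration only.
t0 = datetime(2002, 1, 1, 12, 0)
sample = [
    {"id": "m1", "sent": t0, "quotes": []},
    {"id": "m2", "sent": t0 + timedelta(minutes=30), "quotes": ["m1"]},
    {"id": "m3", "sent": t0 + timedelta(minutes=90), "quotes": ["m1", "m2"]},
    {"id": "m4", "sent": t0 + timedelta(hours=4), "quotes": ["m3"]},
]
print(summarize(first_quotation_delays(sample)))
```

Only the first quotation of each message counts, so m3 quoting m1 after 90 minutes is ignored in favour of m2's quotation after 30 minutes.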

Roles

Participant roles (e.g. administrator, developer) are also discussed, and the paper reports a relationship between the social structure and participation in the project. This relationship can be observed in where discussions end and where they branch.


Related work

A study of one PEP (rather than the two presented here) is given in "A study of online discussion in an Open-Source community: reconstructing thematic coherence and argumentation from quotation practices", which also considers the rhetorical structure and argumentation moves.

Theoretical and Practical Relevance

Quotation-based visualizations would be helpful for browsing archives, or reading listserv messages, avoiding some of the problems with threading (e.g. "incorrectly divid[ing] some theme-related messages into different threads" and missing the importance of certain messages that are central).

This paper can be helpful for beginning to understand the relationship between social structure and posting structure for open source communities.