Design, discussion, and dissent in open bug reports

{{Summary
 * title=Design, discussion, and dissent in open bug reports
 * authors=Andrew J. Ko, Parmit K. Chilana
 * url=http://faculty.washington.edu/ajko/papers/Ko2011ContentiousBugReports.pdf
 * tags=bug reports, open source software, argumentation, online argumentation
 * summary=Based on a qualitative study of 100 contentious open source bug reports, this paper aims to understand design discussions. They focused on

Data

 * Bugzilla repositories of Firefox, the Linux kernel, and the Facebook API.
 * Focused on reproducible bugs on which a decision had been reached, by limiting the sample to RESOLVED, VERIFIED, and FIXED reports that were resolved as FIXED, INVALID, or WONTFIX.

Methods
They chose to sample only reports with substantial discussion. Word count was not a good measure of this, because reports often include error logs. They instead ranked reports by the frequency of personal pronouns and opinion markers (I, you, we, they, us, IMO, IMHO), which indicate personal involvement (Pennebaker et al., 2003, "Psychological aspects of natural language use: Our words, our selves"). The resulting frequencies followed a power law distribution, and they sampled 100 reports (~1 million words) from the top 300.
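The ranking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' tooling: the function names (`marker_count`, `rank_reports`, `sample_top`), the tokenization rule, and the use of uniform random sampling from the top of the ranking are all hypothetical, since the summary does not say how the 100 reports were drawn from the top 300.

```python
import random
import re

# Pronouns and opinion markers listed in the summary above.
MARKERS = {"i", "you", "we", "they", "us", "imo", "imho"}


def marker_count(text: str) -> int:
    """Count case-insensitive, whole-word occurrences of the markers.

    Assumption: tokens are maximal runs of ASCII letters, so "I'm" counts
    as one "i", and an upper-case "US" is indistinguishable from "us".
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(1 for token in tokens if token in MARKERS)


def rank_reports(reports: dict[str, str]) -> list[tuple[str, int]]:
    """Return (report_id, marker_count) pairs, highest count first."""
    counts = [(rid, marker_count(body)) for rid, body in reports.items()]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


def sample_top(ranked, top_n=300, k=100, seed=0):
    """Draw k reports from the top_n of the ranking.

    Assumption: uniform random sampling without replacement; the summary
    does not state the sampling procedure.
    """
    pool = ranked[:top_n]
    return random.Random(seed).sample(pool, min(k, len(pool)))


if __name__ == "__main__":
    # Hypothetical report bodies, for illustration only.
    reports = {
        "bug-1": "Segfault at 0x0 in module foo. Stack trace attached.",
        "bug-2": "I think we should revert this. IMHO you are breaking what they rely on.",
        "bug-3": "We could add a preference, but I doubt it helps.",
    }
    for report_id, count in rank_reports(reports):
        print(report_id, count)
```

Counting markers rather than words means that a report padded with a long error log (like the hypothetical `bug-1`) ranks below a short but opinionated exchange, which is the motivation given above for rejecting word count.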

The two authors read the conversations and then developed six codes: scope, idea, dimension, rationale, process, and decision. Each author coded half of the sample, informally assessing agreement along the way. Discussions generally proceed from scope and ideas to rationale and process. They present results for each code type, along with examples.

Results

 * Discussions show a philosophical divide between achieving the original intent of a design and adapting to user needs.
 * Measurability helped discussions come to consensus. Value-based qualities called for authoritative decisions.
 * Sequential discussions obscured important points, which got lost in repetitive volumes of text. Design proposals and critiques were detached from one another.
 * They report the qualities referred to in report titles (functionality, usability, flexibility, among others).
 * They discuss moderation, which was not sufficiently effective. They point to a comment asking users with no new information to use the vote mechanism rather than adding a comment. Yet "this request was in the middle of hundreds of comments and most commenters did not notice it."

Rhetorical devices
Table 4 shows the rhetorical devices appearing more than 30 times: Other devices were used, such as
 * anecdotes
 * speculation
 * generalizations
 * hyperbole
 * pragmatism
 * impact
 * logic
 * connotation
 * tradeoffs
 * authorities
 * hypotheticals
 * insults
 * priorities
 * statistics
 * policies
 * sententia ("pithy wisdom")

Decision-Making
Developers' authority and action were the most powerful aspects of decision-making, and pragmatism was very important. Discussion only influenced design decisions when a small number of developers were involved. They quote a commenter, who summarizes this as "Seems like this issue is just a matter of different personal opinions. Personal opinions of those who have power to fix this prevail, by definition."

Overall, reports were not deliberative; the authors speculate that this was affected either by commenters' lack of design experience or developers' interest in suppressing design debate (since bug reports are not an appropriate place for such discussions).

They explore three patterns:
 * 63%: Developers discussing the functional design, quickly leading to consensus and implementation, and moving on to discussing the design of code. "A key characteristic of these reports was how little of the functional design space was explored."
 * 19%: Divergent discussion, involving both developers and users.
 ** In one third, developers would end the discussion, giving a decision, sometimes with a rationale (which would usually be based on pragmatism and impact).
 ** In two thirds, discussion ended when the issue became moot (due to some other change).
 * 18%: Convergent discussion, usually with both developers and users.
 ** Typically developers vetoed a proposed change. Rationale was rare but, when given, usually had to do with inconsistency with prior decisions, pragmatism, and authority.
 ** There was often controversy between sticking with original intent vs. supporting unexpected uses. History of previous designs often played into these discussions.

Decision-Making

 * Espinosa, A., Kraut, R., Slaughter, S., Lerch, J., Herbsleb, J. and Mockus, A. (2002). Shared mental models, familiarity, and coordination: A multi-method study of distributed software teams. Int’l Conf. on Information Systems, 425-433.
 * Hiltz, S., Johnson, K., and Turoff, M. (1986). Experiments in group decision making: communication process and outcome in face-to-face versus computerized conferences. Human Communication Research, 13, 225-252.
 * Lemus, D.R., Seibold, D.R., Flanagin, A.J., and Metzger, M.J. (2004). Argument and decision making in computer-mediated groups. Journal of Communication, 54(2), 302-320.
 * Straus, S.G. and McGrath, J. (1994). Does the medium matter? The interaction of task type and technology on group performance and member reactions. J. of Applied Psychology, 79, 87-97.

Language use

 * Pennebaker J.W., Mehl M.R. and Niederhoffer K.G. (2003). Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology, 54(1), 547-577.

Argumentation
 * Canary D.J., Brossmann B.G., and Seibold D.R. (1987). Argument structures in decision-making groups. Southern Speech Communication Journal, 53(1), 18-37.
 * Kuhn, D. (1991). The skills of argument. Cambridge U. Press.
 * relevance=Understanding decision-making in software communities can help improve these discussions, ultimately impacting the quality of the software itself.
 * journal=iConference
 * pub_date=2011
 * doi=10.1145/1940761.1940776
 * subject=Computer Science
}}