Design, discussion, and dissent in open bug reports

From AcaWiki

Citation: Andrew J. Ko, Parmit K. Chilana (2011) Design, discussion, and dissent in open bug reports. iConference
DOI (original publisher): 10.1145/1940761.1940776
Semantic Scholar (metadata): 10.1145/1940761.1940776
Sci-Hub (fulltext): 10.1145/1940761.1940776
Internet Archive Scholar (search for fulltext): Design, discussion, and dissent in open bug reports
Download: http://faculty.washington.edu/ajko/papers/Ko2011ContentiousBugReports.pdf
Tagged: Computer Science, bug reports, open source software, argumentation, online argumentation

Summary

Based on a qualitative study of 100 contentious open source bug reports, this paper aims to understand design discussions. They focused on


Data

  • Bugzilla repositories of Firefox, the Linux kernel, and the Facebook API.
  • Focused on reproducible bugs that had already been decided, by limiting the data to RESOLVED, VERIFIED, and FIXED reports that were resolved as FIXED, INVALID, or WONTFIX.
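
The filter in the last bullet can be sketched in Python. This is an illustration only, not the authors' tooling: the record layout (dicts with `id`, `status`, and `resolution` keys), the example records, and the name `is_decided` are assumptions, and the status and resolution values are copied from the bullet as written.

```python
# Hypothetical sketch of the "decided reports" filter described in the summary.
# Status and resolution values are copied from the summary as written.
DECIDED_STATUSES = {"RESOLVED", "VERIFIED", "FIXED"}
DECIDED_RESOLUTIONS = {"FIXED", "INVALID", "WONTFIX"}


def is_decided(report: dict) -> bool:
    """Return True if a report has both a decided status and a decided resolution."""
    return (
        report.get("status") in DECIDED_STATUSES
        and report.get("resolution") in DECIDED_RESOLUTIONS
    )


# Made-up records for illustration; real Bugzilla exports have many more fields.
example_reports = [
    {"id": 1, "status": "RESOLVED", "resolution": "FIXED"},
    {"id": 2, "status": "NEW", "resolution": ""},
    {"id": 3, "status": "VERIFIED", "resolution": "WONTFIX"},
    {"id": 4, "status": "RESOLVED", "resolution": "DUPLICATE"},
]

decided_ids = [r["id"] for r in example_reports if is_decided(r)]
print(decided_ids)
```

Reports 2 and 4 are dropped here because one has an undecided status and the other a resolution outside the listed set.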

Methods

They chose to sample only reports with substantial discussion. Word count was not a good measure of this, because reports often include error logs. They instead ranked reports by the frequency of personal pronouns (I, you, we, they, us, IMO, IMHO), which indicate personal involvement (Psychological aspects of natural language use: Our words, our selves). The resulting frequencies followed a power law distribution. From the top 300 reports they sampled 100 (~1 million words).
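
A minimal sketch of this ranking-and-sampling step, in Python, might look as follows. It is not the authors' code. The summary does not say whether "frequency" means a raw count or a rate per word, how the text was tokenized, or how the 100 reports were drawn from the top 300, so the raw count, the regex tokenizer, the seeded random draw, and all function names below are assumptions.

```python
import random
import re

# Marker words listed in the summary, lowercased for matching.
MARKERS = {"i", "you", "we", "they", "us", "imo", "imho"}


def marker_count(text: str) -> int:
    """Count marker words in a report (assumption: raw count, not a rate)."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for word in words if word in MARKERS)


def rank_reports(reports: dict) -> list:
    """Return report ids ordered from most to fewest marker words."""
    return sorted(reports, key=lambda rid: marker_count(reports[rid]), reverse=True)


def sample_from_top(ranked: list, top_n: int = 300, k: int = 100, seed: int = 0) -> list:
    """Draw k ids at random from the top_n ranked ids (assumption: uniform draw)."""
    pool = ranked[:top_n]
    rng = random.Random(seed)
    return rng.sample(pool, min(k, len(pool)))


# Made-up report texts: "b" is long but contains only log output.
example = {
    "a": "I think we should fix this, IMHO.",
    "b": "Stack trace: error at line 42. " * 50,
    "c": "You said they agreed.",
}
ranked = rank_reports(example)
print(ranked)
```

In the example, the long log-only report "b" ranks last despite having the most words, which is the motivation given for not using word count.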

The two authors read the conversations and then developed six codes: scope, idea, dimension, rationale, process, and decision. Each author coded half of the sample, informally assessing agreement along the way. Discussions generally proceeded from scope and ideas to rationale and process. The paper presents results for each code type, along with examples.

Results

  • Discussions show a philosophical divide between achieving the original intent of a design and adapting to user needs.
  • Measurability helped discussions come to consensus. Value-based qualities called for authoritative decisions.
  • Sequential discussions obscured important points, which got lost in repetitive volumes of text. Design proposals and critiques were detached from one another.
  • They report the qualities referred to in report titles (functionality, usability, flexibility, among others).
  • They discuss moderation, which was not effective enough. As an example, they point to a comment asking users with no new information to use the vote mechanism rather than add a comment. Yet "this request was in the middle of hundreds of comments and most commenters did not notice it."

Rhetorical devices

Table 4 shows the rhetorical devices appearing more than 30 times:

  • anecdotes
  • speculation
  • generalizations
  • hyperbole
  • pragmatism
  • impact
  • logic
  • connotation
  • tradeoffs
  • authorities

Other devices were used, such as

  • hypotheticals
  • insults
  • priorities
  • statistics
  • policies
  • sententia ("pithy wisdom")

Decision-Making

Developers' authority and action were the most powerful aspects of decision-making, and pragmatism was very important. Discussion only influenced design decisions when a small number of developers were involved. They quote a commenter, who summarizes this as "Seems like this issue is just a matter of different personal opinions. Personal opinions of those who have power to fix this prevail, by definition."

Overall, reports were not deliberative; the authors speculate that this was affected either by commenters' lack of design experience or developers' interest in suppressing design debate (since bug reports are not an appropriate place for such discussions).

They explore three patterns:

  • 63%: Developers discussing the functional design, quickly leading to consensus and implementation, and moving on to discussing the design of code. "A key characteristic of these reports was how little of the functional design space was explored."
  • 19%: Divergent discussion, involving both developers and users.
    • In one third of these, developers would end the discussion by giving a decision, sometimes with a rationale (which would usually be based on pragmatism and impact).
    • In two thirds, discussion ended when the issue became moot (due to some other change).
  • 18%: Convergent discussion, usually with both developers and users.
    • Typically developers vetoed a proposed change. Rationale was rare but, when given, usually had to do with inconsistency with prior decisions, pragmatism, and authority.
    • There was often controversy between sticking with original intent vs. supporting unexpected uses. History of previous designs often played into these discussions.


Theoretical and Practical Relevance

Understanding decision-making in software communities can help improve these discussions, ultimately impacting the quality of the software itself.