Information quality work organization in Wikipedia


Citation: Besiki Stvilia, Michael B. Twidale, Linda C. Smith, Les Gasser (2008) Information quality work organization in Wikipedia. Journal of the American Society for Information Science and Technology
DOI (original publisher): 10.1002/asi.20813
Tagged: Computer Science, information quality, Wikipedia, Featured Articles, information science

Summary

This paper, based on a line of earlier work (e.g. Assessing information quality of a community-based encyclopedia) in the author's dissertation research, discusses information quality (IQ) in Wikipedia.

Research Questions

  • What are the quality criteria in Wikipedia?
  • What are some of the information quality assurance processes (including tasks, roles, social arrangements, and strategies)?
  • How did these quality assurance processes evolve?
  • What are the quality dynamics of different types of information objects? Can they be modeled?
  • What are some of the motivations that make people contribute?
  • What are some of the information quality intervention strategies used, and in what circumstances?

Data analyzed

  • Used 3 data dumps from March 2005 to September 2007
    • March 9, 2005 (500,623 articles; 236 featured articles (FA))
    • November 30, 2006 (1,774,062 articles; 715 featured articles)
    • September 8, 2007 (2,234,170 articles; 822 featured articles)
  • Collected pages along with edit histories and discussion pages for each
    • Randomly selected 1,000 articles from each dump
    • Collected all featured articles from each dump
    • Randomly selected 1,000 pages from each of the project, category, and template collections in each dump
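
The counts listed above imply that featured articles made up a very small, and shrinking, share of all articles across the three dumps. A minimal Python sketch of that arithmetic follows; the function and variable names are illustrative and do not come from the paper.

```python
# Article and featured-article counts per dump, copied from the list above.
dumps = {
    "2005-03-09": (500_623, 236),
    "2006-11-30": (1_774_062, 715),
    "2007-09-08": (2_234_170, 822),
}


def featured_share_percent(total_articles, featured_articles):
    """Return featured articles as a percentage of all articles."""
    return 100.0 * featured_articles / total_articles


shares = {
    date: featured_share_percent(total, featured)
    for date, (total, featured) in dumps.items()
}

for date, share in shares.items():
    print(f"{date}: {share:.4f}% of articles are featured")
```

Each of the three shares comes out below 0.05%, and the share falls from the first dump to the last even though the absolute number of featured articles grows.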

Analysis

  • Manual examination of a small sample of "process support artifacts" -- user home pages, vote logs, policy pages, and discussion pages.
  • Content analysis of 60 discussion pages (30 random, 30 FA) from the March 9, 2005 dump
  • Content analysis of 120 featured article removal candidate votes from July 2004-May 2005
  • Content analysis of 100 profile pages of users who edited any of the articles in the random sample from the November 30, 2006 dump.
  • Analysis of the user profile pages and discussion pages of the three editors with the highest closeness centrality scores in each of the November 30, 2006 sample networks (featured, random, category, project, and template). In each network, the vertices are editors who edited at least one page in the sample, and the edges connect editors who edited the same page.
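
To make the last step concrete, the following is a minimal sketch of how editors could be ranked by closeness centrality in a co-editing network. It is not the authors' implementation: the edit records are hypothetical, the helper names are invented for illustration, and closeness is computed here as the number of reachable editors divided by the sum of shortest-path distances to them, which is one common definition and an assumption about what the paper used.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical edit records (page -> set of editors); not data from the paper.
page_editors = {
    "Page A": {"ann", "bob"},
    "Page B": {"bob", "cat"},
    "Page C": {"cat", "dan"},
}


def build_coediting_graph(page_editors):
    """Link two editors whenever they edited at least one common page."""
    graph = defaultdict(set)
    for editors in page_editors.values():
        for editor in editors:
            graph[editor]  # ensure sole editors of a page still become vertices
        for a, b in combinations(sorted(editors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def closeness(graph, source):
    """Reachable editors divided by the sum of shortest-path lengths to them."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest paths in an unweighted graph
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


graph = build_coediting_graph(page_editors)
scores = {editor: closeness(graph, editor) for editor in graph}
# Highest closeness first; ties are broken alphabetically for a stable order.
top_three = sorted(scores, key=lambda e: (-scores[e], e))[:3]
print(top_three)
```

In this toy network the editors in the middle of the chain score highest, which matches the intuition that closeness centrality picks out editors who are few co-editing steps away from everyone else.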

The Information Quality Assurance Context

Three types of processes:

  • evaluating article quality and directly affecting it (modifying, deleting, or changing its status)
  • evaluating the performance of editors and selecting admins and bots
  • building/maintaining work coordination artifacts

Four types of roles:

  • editors
  • information quality assurance agents (who monitor changes & revert vandalism)
  • malicious agents
  • environmental agents (who change the representational information quality of articles through changes in real-world states; this may degrade information quality, or may enhance it by "aligning the real-world state with the information contained in an article")

Administrators, Bots, and Heavy Contributors

The article includes a significant discussion of the administration process, including RfA and RfB, and user comments about the roles of administrators. The growth in the number of administrators, and their share of edits in each of the snapshots, are also discussed.

Bots are also briefly discussed.

They note that edits and votes appear to follow a power law, increasing the importance of heavy contributors who "are familiar with the IQ policies and norms". This raises the question of whether the decrease in the administrators' share of edits points to a robust community or to a potential scalability problem.

Discussion pages

Discussion pages, they point out, are routinely used for feedback on quality, notices and warnings, cross-article communication, and general coordination. Yet they are also used by people outside the editorial group to ask questions or to solicit assistance elsewhere.

These pages are viewed as "work articulation artifacts". Featured articles have better-developed discussion pages, which are 10 times longer, more organized, and more readable (according to Flesch readability scores).
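
As a rough illustration of how such a readability comparison can be computed, the sketch below implements the standard Flesch Reading Ease formula, 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word). The summary does not say which Flesch variant or which tooling the authors used, so this is an assumption. The syllable counter is a crude vowel-group heuristic and the sample sentences are invented.

```python
import re


def count_syllables(word):
    """Rough heuristic: count runs of vowels, with at least one syllable per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Flesch Reading Ease score; higher values indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Invented sample texts, only to show that the score separates easy from dense prose.
simple = "The cat sat. The dog ran."
dense = "Institutional coordination necessitates comprehensive organizational deliberation."
print(flesch_reading_ease(simple) > flesch_reading_ease(dense))
```

Short sentences made of short words score high, while long sentences with many polysyllabic words score low, so a higher score for featured-article discussion pages would indicate easier reading.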

As they note, long discussions can indicate either lack of consensus or high article quality.

Article edit histories

History logs are used for coordination, to identify and fight vandalism, and to help in dispute resolution.


Featured Article criteria and processes

They discuss the evolution of the Featured Article process and its guidelines, and compare these to the Crawford model and their own framework (which distinguishes intrinsic, relational, and reputational quality measures).

Deletion

The original deletion criteria were informal, and only 3 administrators could delete pages on request. More recently, deletion has evolved to include several processes that avoid votes where possible. Deletion decisions are based on a page's notability and its appropriateness to an encyclopedia, while quality has not been emphasized.

Criteria

  • Accessibility
  • Accuracy
  • Authority
  • Cohesiveness
  • Complexity
  • Consistency
  • Informativeness
  • Naturalness
  • Relevance
  • Verifiability
  • Volatility

Tradeoffs

  • completeness versus accessibility
  • accuracy versus accessibility
  • completeness versus cohesiveness
  • accessibility versus complexity
  • completeness versus consistency
  • accessibility versus consistency
  • completeness versus complexity
  • volatility versus accessibility

Selected References

  • Crawford, H. (2001). Encyclopedias. In R. Bopp & L. C. Smith (Eds.), Reference and information services: An introduction (3rd ed., pp. 433–459). Englewood, CO: Libraries Unlimited.

Theoretical and Practical Relevance

A paragraph on page 2 provides brief summaries of a number of studies of Wikipedia.

Fewer than 0.05% of articles are Featured Articles.