Interpolating Quality Dynamics in Wikipedia and Demonstrating the Keilana Effect
Citation: Aaron Halfaker (2017/08/23) Interpolating Quality Dynamics in Wikipedia and Demonstrating the Keilana Effect. 2017 Proceedings of OpenSym
DOI (original publisher): https://doi.org/10.1145/3125433.3125475
Tagged: Communication, wikipedia
Summary
This article describes a method, dataset, and API for studying article quality in Wikipedia. The author summarizes the history of research on article quality, beginning with a 2005 report by Jim Giles in Nature that rated Wikipedia's coverage of science topics as equal to or better than that of traditional encyclopedias. Researchers have since explored how non-expert volunteers are able to produce quality articles, focusing on two aspects of article development and maintenance: the sometimes automated processes that detect and remove vandalism, and the cumulative collaboration among an article's contributors, who may be driven by their interest in the topic.
Studies of article quality have generally depended either on expert analysis or on assessments conducted by volunteers within the Wikipedia community. However, expert analysis is expensive, volunteer assessments are not widespread, and neither offers a view of quality as it develops over time.
Building on previous work by Warncke-Wang et al. (e.g. M. Warncke-Wang, D. Cosley, and J. Riedl. Tell me more: An actionable quality model for Wikipedia. In OpenSym, page 8. ACM, 2013.), the author implemented a method to predict article quality. This quality assessment routine was then integrated into the same API platform used within the Wikipedia community for other tasks, such as vandalism detection. In addition to predicting an article's quality class, the API can estimate how close the article is to other quality levels (e.g. whether a "Stub" article is fairly close to being "Start" class). Halfaker lists the six official quality classes in Wikipedia, from lowest to highest, as: Stub, Start, C, B, GA (Good Article), and FA (Featured Article).
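The idea of measuring how close an article is to a neighbouring quality class can be sketched as follows. This is a minimal illustration, not the paper's or the API's own code. It assumes the model returns one probability per class, and that weighting each class by its rank (Stub = 0 through FA = 5) is a reasonable way to collapse those probabilities into a single number. The function names and the example probabilities are hypothetical.

```python
# Quality classes in ascending order, as listed in the summary above.
QUALITY_CLASSES = ["Stub", "Start", "C", "B", "GA", "FA"]


def most_likely_class(probabilities):
    """Return the class with the highest predicted probability."""
    return max(QUALITY_CLASSES, key=lambda c: probabilities[c])


def weighted_score(probabilities):
    """Collapse a class distribution into one number from 0 (Stub) to 5 (FA).

    Assumption: each class is weighted by its position in the ordering.
    """
    return sum(rank * probabilities[c] for rank, c in enumerate(QUALITY_CLASSES))


# Hypothetical model output for one article revision (illustrative only).
example = {"Stub": 0.55, "Start": 0.35, "C": 0.06, "B": 0.02, "GA": 0.01, "FA": 0.01}

print(most_likely_class(example))          # predicted class label
print(round(weighted_score(example), 2))   # continuous position on the 0-5 scale
```

With these made-up numbers the predicted class is "Stub", but the weighted score sits well above 0, which is one way to express that the article is fairly close to "Start" class.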
To demonstrate the quality-assessment tools, Halfaker analyzes the quality of articles on women scientists. Previous studies had found that articles about women scientists tended to be of lower quality and that coverage overall was poor. Using the new tools, Halfaker shows that quality has grown over time for B-class articles about women scientists, and that this growth occurred at a higher rate than the improvement of articles on other topics. This finding may be related to recent activism on the topic.
Theoretical and Practical Relevance
This article contributes to discussions of quality dynamics in peer production by describing, implementing, and making available a method for analyzing article quality over time. It also demonstrates the method by analyzing the quality of articles about women scientists, which in turn informs future work on areas such as under-served topics and the impact of activism on Wikipedia.