The State of the ML-Universe: 10 Years of Artificial Intelligence & Machine Learning Software Development on GitHub

From AcaWiki

Citation: Danielle Gonzalez, Thomas Zimmermann, Nachiappan Nagappan. The State of the ML-Universe: 10 Years of Artificial Intelligence & Machine Learning Software Development on GitHub.
Internet Archive Scholar (search for fulltext): The State of the ML-Universe: 10 Years of Artificial Intelligence & Machine Learning Software Development on GitHub
Wikidata (metadata): Q105186732
Download: https://2020.msrconf.org/details/msr-2020-papers/34/The-State-of-the-ML-universe-10-Years-of-Artificial-Intelligence-Machine-Learning-
Tagged: github

Summary

Examines AI & ML tool repositories (700) and application repositories (4524) on GitHub as a community, compared with 4101 unrelated repositories.

Findings include:

  • The oldest AI & ML repository on GitHub was created in 2009
  • From 2012, AI & ML repositories grew faster than repositories overall
  • Since 2017 there has been a boom (very fast growth) in AI & ML repositories
  • More applications than tools (including libraries and frameworks) are created, but the latter are more popular
  • Python is the most popular language

Repositories were identified using the GitHub API, starting from the topic labels artificial intelligence, deep learning, and machine learning. These labels were used to discover additional AI & ML topics, 439 topics in total. Additional filtering criteria selected for real projects and for availability of data in GHTorrent. Comparison repositories were the most-starred repos active in 2019.
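
The topic-based discovery described above can be illustrated with a minimal sketch. It is not the authors' actual pipeline, which this summary does not detail. It assumes the hyphenated topic slugs used on GitHub, the public repository search endpoint with its topic: qualifier, and a simple co-occurrence rule for finding further topics. The function names, the min_count threshold, and the sample data are hypothetical.

```python
from collections import Counter
from urllib.parse import urlencode

# Assumed GitHub slugs for the three seed topic labels named in the summary.
SEED_TOPICS = ["artificial-intelligence", "deep-learning", "machine-learning"]


def topic_search_url(topic, page=1, per_page=100):
    """Build a GitHub repository-search URL for one topic label.

    Only the URL is constructed here; fetching it (and handling
    authentication and rate limits) is left out of this sketch.
    """
    query = urlencode({
        "q": f"topic:{topic}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
        "page": page,
    })
    return f"https://api.github.com/search/repositories?{query}"


def co_occurring_topics(repos, known_topics, min_count=2):
    """Return topics that appear alongside known topics on the given repos.

    `repos` is a list of dicts with a "topics" list, the shape used in
    GitHub search results. The co-occurrence rule and the threshold are
    assumptions for illustration, not the paper's stated criteria.
    """
    known = set(known_topics)
    counts = Counter()
    for repo in repos:
        topics = set(repo.get("topics", []))
        if topics & known:
            counts.update(topics - known)
    return [topic for topic, n in counts.most_common() if n >= min_count]


# Hypothetical sample data standing in for search results.
sample_repos = [
    {"topics": ["machine-learning", "neural-network"]},
    {"topics": ["deep-learning", "neural-network", "pytorch"]},
    {"topics": ["web", "pytorch"]},
]
candidate_topics = co_occurring_topics(sample_repos, SEED_TOPICS)
```

In a sketch like this, candidate topics would still need manual review before being added to the topic list, since co-occurrence alone also surfaces generic labels.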

Differences in collaboration style are studied across AI tools, AI applications, and comparison repos.