Analyzing the Evolution and Maintenance of ML Models on Hugging Face

From AcaWiki

Citation: Joel Castaño, Silverio Martínez-Fernández, Xavier Franch, Justus Bogner. Analyzing the Evolution and Maintenance of ML Models on Hugging Face.
DOI (original publisher): 10.1145/3643991.3644898
Semantic Scholar (metadata): 10.1145/3643991.3644898
Sci-Hub (fulltext): 10.1145/3643991.3644898
Internet Archive Scholar (search for fulltext): Analyzing the Evolution and Maintenance of ML Models on Hugging Face
Wikidata (metadata): Q135972655
Download: https://dl.acm.org/doi/10.1145/3643991.3644898
Tagged:

Summary

The authors mine ~380k Hugging Face model repos to characterize ecosystem growth, author groups, model-card topics, and—centrally—maintenance. They classify commit messages as perfective, corrective, or adaptive, and cluster repos into high- vs. low-maintenance groups using k-means on features such as commit frequency, commit intervals, and author counts. Findings: commit activity is right-skewed (many models have few commits); the average commit edits ~5 files (median 2); perfective commits dominate (~89%), with adaptive (~6%) and corrective (~2.5%) far smaller; and only ~16.5% of models fall into the high-maintenance cluster (the rest are low-maintenance). High-maintenance repos are larger and more popular (downloads/likes) and tend to have longer model cards; popularity is concentrated in a small number of author groups.
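The clustering step can be sketched as follows. This is a minimal illustration of k-means (k=2) on per-repo commit features, not the authors' actual pipeline: the toy repos, the specific feature columns, and the plain (unstandardized) features are all assumptions made here for brevity; a real analysis would use the paper's full feature set and scale the features first.

```python
import numpy as np

def kmeans(X, k=2, iters=100, seed=0):
    """Plain k-means: random initial centers, then alternate
    nearest-center assignment and center recomputation."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each repo to its nearest center (squared Euclidean distance)
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        # recompute centers; keep the old center if a cluster went empty
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# Toy per-repo features (illustrative, not from the paper):
# [commit count, mean days between commits, distinct authors]
repos = np.array([
    [120.0,   2.0, 6.0],   # actively maintained
    [ 95.0,   3.5, 4.0],
    [  3.0,  90.0, 1.0],   # mostly dormant
    [  2.0,  60.0, 1.0],
    [  2.0, 120.0, 1.0],
])
labels, centers = kmeans(repos)
print("cluster of each repo:", labels)
```

With commit count dominating the distances here, the two active repos end up in one cluster; in practice, mining studies standardize features before clustering so that no single scale dominates.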

Theoretical and Practical Relevance

The study provides ecosystem-level evidence on how open-model projects are actually maintained: most are low-maintenance and dominated by incremental, perfective updates, while a small, active core anchors the platform's maintained models. This sharpens our understanding of open-weight development by distinguishing lineage-level reuse from within-repo upkeep, and by linking maintenance intensity to popularity, size, and documentation depth. Practically, the features the authors surface (commit cadence, author count, model-card length) are useful signals of model health for users. For platform and tool designers, the results motivate ML-specific maintenance tooling (better versioning for data and models, automated drift monitoring) and transparent maintenance logs to improve model selection and trust.