An analysis and metric of reusable data licensing practices for biomedical resources
Citation: Seth Carbon, Robin Champieux, Julie A. McMurry, Lilly Winfree, Letisha R. Wyatt, Melissa A. Haendel. An analysis and metric of reusable data licensing practices for biomedical resources.
Describes the (Re)usable Data Project (RDP) rubric, which defines licensing characteristics of aggregated data resources (collections of digital biomedical data from multiple contributors), and its measures of how these licensing practices affect reuse.
Six license types, with average evaluation scores (see below) for data resources with each license type:
- Permissive (4.5)
- Copyleft (3.0)
- Restrictive (2.6)
- Private pool (1.0)
- Copyright (1.4) (e.g., (c) or all rights reserved notices without a license)
- Unknown (0.7)
Licenses are evaluated against five criteria:
- (A) Clearly stated
- (B) Comprehensive & non-negotiated
- (C) Accessible [at known location, in bulk]
- (D) Kinds of reuse [allowed]
- (E) Who may reuse
RDP’s rubric emphasizes U.S.-based, non-commercial research requirements for data reuse and redistribution.
Data resources are evaluated based on these criteria and receive a score:
- 5 stars: The license unambiguously allows the unfettered (re)use and redistribution of the data.
- 4 stars: The license unambiguously allows (re)use and redistribution of the data under some terms.
- 3 stars: The license is clearly stated, unambiguous, and of a standard type, and has clear access, but has terms that may greatly impact the (re)use and redistribution of the data.
- 2.5 or fewer stars: There are likely issues in definitively finding the license, ambiguities in the license that hamper further analysis, issues with clean data access, or terms that require legal advice.
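The star bands above can be read as a simple lookup. The following is a minimal sketch, not the RDP's own tooling: the function name `interpret_stars` is hypothetical, and the handling of half-star values that the list does not name (such as 3.5 or 4.5) is an assumption, placing them in the next lower band.

```python
def interpret_stars(score: float) -> str:
    """Map a star score (0 to 5) to the interpretation bands listed above.

    Assumption: scores between the named bands (e.g. 3.5, 4.5) are
    assigned to the next lower band; the summary does not specify this.
    """
    if not 0 <= score <= 5:
        raise ValueError("score must be between 0 and 5")
    if score >= 5:
        return "unfettered reuse and redistribution"
    if score >= 4:
        return "reuse and redistribution under some terms"
    if score >= 3:
        return "clear standard license, but terms may greatly impact reuse"
    return "likely issues with finding, interpreting, or accessing the license or data"


# Average scores per license type, as reported in the summary.
averages = {"Permissive": 4.5, "Copyleft": 3.0, "Restrictive": 2.6, "Unknown": 0.7}
for license_type, score in averages.items():
    print(f"{license_type} ({score}): {interpret_stars(score)}")
```

Under this reading, only the permissive and copyleft averages reach the 3-star band or higher.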
Of the 56 data resources evaluated, 10 received 5 stars; see above for the license type breakdown.
Custom licenses were used by 21 of the 56 data resources evaluated.
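To put these counts in proportion, the shares can be worked out directly. This is a small illustrative calculation using only the counts reported here; the constant and function names are hypothetical.

```python
TOTAL_RESOURCES = 56   # data resources evaluated
FIVE_STAR = 10         # resources receiving 5 stars
CUSTOM_LICENSE = 21    # resources using a custom license


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(f"5-star resources: {share(FIVE_STAR, TOTAL_RESOURCES)}%")
print(f"Custom licenses:  {share(CUSTOM_LICENSE, TOTAL_RESOURCES)}%")
```

Roughly 18% of the evaluated resources reached 5 stars, while 37.5% relied on custom licenses.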
Theoretical and practical relevance:
Rubric could be used in license selection, not only evaluation of published data resources.