Reducing Technical Debt with Reproducible Containers
Citation: Tanu Malik (2020/11/04) Reducing Technical Debt with Reproducible Containers. IDEAS-ECP Webinar
Download: http://ideas-productivity.org/wordpress/wp-content/uploads/2020/11/webinar046-technicaldebt.pdf
Tagged: Computer Science, computational science
Summary
- Technical debt := short-term gain at the cost of increased long-term maintenance effort.
- Difficulties with reproducibility include:
  - Reproducibility is usually an afterthought.
  - Identifying all relevant inputs and outputs is hard.
  - There is no mapping from artifacts to the paper.
- "Containers do not reduce technical-debt"
  - They still have incompletely specified dependencies and are still non-deterministic.
- Sciunit can reduce technical debt by putting experiments in an auditable, modifiable package.
  - Use strace to capture inputs and outputs automatically.
- A container either includes the data or excludes it; including the data is expensive, but excluding it makes the container non-reproducible.
  - Why not include the data partially?
  - Use MiDAS (Minimizing DAtaSets):
    - Applications access only a subset of their datasets.
    - Identify only the relevant chunks.
    - Use partial evaluation to prune the codebase.
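
The partial-inclusion idea can be illustrated with a minimal sketch. It is not MiDAS or Sciunit code: it assumes a dataset that fits in memory, and the names RecordingReader, merge_ranges, extract_chunks and demo are hypothetical. The wrapper records which byte ranges an application actually reads, so that only those chunks would need to be packaged.

```python
import io


class RecordingReader:
    """Wrap a binary file object and record every byte range that is read.

    Hypothetical helper for illustration; not part of MiDAS or Sciunit.
    """

    def __init__(self, raw):
        self._raw = raw
        self.ranges = []  # half-open (start, end) byte offsets

    def seek(self, offset, whence=io.SEEK_SET):
        return self._raw.seek(offset, whence)

    def read(self, size=-1):
        start = self._raw.tell()
        data = self._raw.read(size)
        if data:
            self.ranges.append((start, start + len(data)))
        return data


def merge_ranges(ranges):
    """Merge overlapping or touching half-open ranges into a sorted list."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def extract_chunks(data, ranges):
    """Return only the accessed chunks, keyed by their starting offset."""
    return {start: data[start:end] for start, end in merge_ranges(ranges)}


def demo():
    dataset = bytes(range(256)) * 4  # stand-in for a 1024-byte data file
    reader = RecordingReader(io.BytesIO(dataset))

    # Stand-in for an application that touches only parts of the file.
    reader.seek(100)
    reader.read(50)
    reader.seek(140)
    reader.read(20)
    reader.seek(900)
    reader.read(10)

    chunks = extract_chunks(dataset, reader.ranges)
    kept = sum(len(chunk) for chunk in chunks.values())
    return merge_ranges(reader.ranges), kept, len(dataset)


if __name__ == "__main__":
    merged, kept, total = demo()
    print(f"accessed ranges: {merged}; kept {kept} of {total} bytes")
```

In this toy run the application touches two regions of the file, so a package would need to carry only those chunks rather than the whole dataset. A real tool would have to observe accesses at the system-call level (for example with strace) rather than through a Python wrapper.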