Reducing Technical Debt with Reproducible Containers
Citation: Tanu Malik (2020/11/04) Reducing Technical Debt with Reproducible Containers. IDEAS-ECP Webinar
Tagged: Computer Science, computational science
- Technical debt := short-term gain at the cost of increased long-term maintenance effort.
- Difficulties include:
- Reproducibility is usually an afterthought.
- Identifying all relevant I/O.
- No mapping from artifacts to the paper.
- "Containers do not reduce technical debt."
- They still have incompletely specified dependencies and are still non-deterministic.
- Sciunit can reduce technical debt by putting experiments in an auditable, modifiable package.
- It uses strace to capture inputs and outputs automatically.
- Containers either include the data or exclude it; including is expensive, but excluding is not reproducible.
- Why not include only part of the data?
- MiDAS: Minimizing DAtaSets
- Applications only access a subset of datasets.
- Identify only the relevant chunks.
- Use partial evaluation to prune the codebase.
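
The I/O-capture and partial-inclusion ideas above can be illustrated with a small sketch. This is not how Sciunit or MiDAS are implemented; it only assumes a text log in strace's usual output format, and it handles just `openat`, `pread64`, and `close`. The function name `accessed_chunks`, the sample paths, and the 4096-byte chunk size are hypothetical choices made for the example.

```python
import re
from collections import defaultdict

# Hypothetical excerpt in strace's output format; real traces are far longer.
SAMPLE_TRACE = '''\
openat(AT_FDCWD, "/data/input.bin", O_RDONLY) = 3
pread64(3, "..."..., 4096, 0) = 4096
pread64(3, "..."..., 4096, 8192) = 4096
openat(AT_FDCWD, "/data/missing.bin", O_RDONLY) = -1 ENOENT (No such file or directory)
close(3) = 0
'''

# Only successful opens match, because the return value must be a non-negative fd.
OPEN_RE = re.compile(r'^openat\(\w+, "([^"]+)", [^)]*\)\s+= (\d+)')
# pread64(fd, buf, count, offset) = bytes_read
PREAD_RE = re.compile(r'^pread64\((\d+), .*, (\d+), (\d+)\)\s+= (\d+)')
CLOSE_RE = re.compile(r'^close\((\d+)\)\s+= 0')


def accessed_chunks(trace_text, chunk_size=4096):
    """Map each file path to the sorted chunk indices that were actually read."""
    fd_to_path = {}             # currently open file descriptors
    chunks = defaultdict(set)   # path -> set of chunk indices touched
    for line in trace_text.splitlines():
        m = OPEN_RE.match(line)
        if m:
            fd_to_path[int(m.group(2))] = m.group(1)
            continue
        m = PREAD_RE.match(line)
        if m:
            fd, offset, nread = int(m.group(1)), int(m.group(3)), int(m.group(4))
            path = fd_to_path.get(fd)
            if path is not None and nread > 0:
                first = offset // chunk_size
                last = (offset + nread - 1) // chunk_size
                chunks[path].update(range(first, last + 1))
            continue
        m = CLOSE_RE.match(line)
        if m:
            # The fd number may be reused by a later open, so forget it here.
            fd_to_path.pop(int(m.group(1)), None)
    return {path: sorted(ids) for path, ids in chunks.items()}


print(accessed_chunks(SAMPLE_TRACE))
```

In the sample, chunk 1 of `/data/input.bin` is never read, so a partial-inclusion scheme could leave it out of the container. A log of this kind can be produced with something like `strace -f -e trace=openat,pread64,close -o trace.log ./experiment`, where `./experiment` is a placeholder. Plain `read` calls and memory-mapped files are not covered by this sketch.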