Technical Debt in Computational Science
Citation: Konrad Hinsen (2015/10/28) Technical Debt in Computational Science. Computing in Science & Engineering
DOI (original publisher): 10.1109/MCSE.2015.113
Tagged: Computer Science, research software engineering
Technical debt is a future obligation implied by design choices, usually made for short-term gain. Financial debt is an apt analogy: technical debt can accrue interest, and it is sometimes desirable for the sake of growth. E.g., the migration from Python 2 to 3 pays back technical debt incurred mostly by Python 2's poor choice of defaults. Even choosing a high-level platform (language or library) can be debt, because you pay the cost of maintenance whenever the platform updates. (Editor's note: not necessarily, if you pin exact versions; then you only pay when you want to upgrade, and you have no obligation to upgrade. Contrary to the author, I argue choosing a high-level platform is not a form of tech debt.)
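A concrete instance of the "poor choice of defaults" paid back by the 2-to-3 migration is integer division, changed by PEP 238; a minimal Python 3 illustration:

```python
# Python 2 defaulted "/" on integers to floor division, so 1 / 2 == 0,
# silently truncating scientific results. Python 3 changed the default:
assert 1 / 2 == 0.5   # true division is now the default
assert 1 // 2 == 0    # floor division must be requested explicitly
```

Code written against the old default had to be audited line by line during migration, which is exactly the deferred cost the debt metaphor describes.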
Tech debt in research results in bad research. We have some notorious examples of software bugs invalidating published results, and there are probably more that we don't observe. Scientists often treat computational experiments like physical ones, but computational ones have more inherent chaos, according to the author. However, they also offer the possibility of perfect repeatability, an advantage over physical experiments that has largely gone underutilized. See also the author's blog post.
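The perfect-repeatability point can be made concrete with a toy stochastic computation (my sketch, not from the article): fix the random seed and every rerun reproduces the result bit for bit, which no physical experiment can promise.

```python
import random

def simulate(seed):
    """Toy stochastic 'experiment': draw five pseudo-random numbers
    from a generator initialized with an explicit seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(5)]

# Two independent runs with the same seed yield identical trajectories.
run1 = simulate(seed=42)
run2 = simulate(seed=42)
assert run1 == run2
```

In practice this only holds if the whole software stack is also held fixed, which loops back to the reproducibility problem below.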
One (institutional) reason software is not reproducible is that technical debt requires constant maintenance, but publications do not incentivize maintenance. A second (technical and social) reason is that exactly specifying a program is hard: you would have to pin down the linker, compiler, OS, and dozens of other utilities. The technical aspect is tackled by tools like Make (editor's note: add Nix, Docker, and Popper), but too few people use these tools. A third (social) reason is that scientists pursue premature optimization, according to the author.
Solutions: Develop solid infrastructure, educate scientists on software practices (CiSE, Software Carpentry).