Reproducible and User-Controlled Software Environments in HPC with Guix

Citation: Ludovic Courtès, Ricardo Wurmus (2015/12/18) Reproducible and User-Controlled Software Environments in HPC with Guix. Lecture Notes in Computer Science
DOI (original publisher): 10.1007/978-3-319-27308-2_47
Semantic Scholar (metadata): 10.1007/978-3-319-27308-2_47
Sci-Hub (fulltext): 10.1007/978-3-319-27308-2_47
Internet Archive Scholar (search for fulltext): Reproducible and User-Controlled Software Environments in HPC with Guix
Download: https://link.springer.com/chapter/10.1007/978-3-319-27308-2_47
Tagged: Computer Science, reproducibility, high-performance computing

Summary

Functional Package Managers (like Guix and Nix) make it easy to develop reproducible HPC environments.

Theoretical and Practical Relevance

Placeholder

Problem

  1. Sysadmins want a stable system, but developers want to install and upgrade software freely.

Prior attempts

  • System package managers (e.g. apt): packages are too old, packages are built on the publisher's machine rather than the client's, packages are difficult to write, multiple channels are difficult to combine, package management is imperative/stateful.
  • Traditional third-party package managers (e.g. EasyBuild, Spack): they clobber /usr, package management is imperative/stateful, they don't capture system configuration, built artifacts are not safely shareable.
  • Writing down every version: doesn't capture system configuration.
  • Snapshot system images (e.g. Docker images, VM images): hard to ship, hard to verify the environment, hard to compose.
  • Snapshot recipes (e.g. Dockerfile, Vagrantfile): too broad, almost always talk to the internet (which introduces non-determinism), imperative/stateful.

Their Solution: Functional Package Managers (FPM)

  • All packages are pure functions from {files of packages they depend on} to {files produced by the package}.
    • This is not just for libraries; even the C compiler is considered a dependency.
    • This encodes a DAG of packages.
    • Files (both inputs and outputs) are read-only/immutable.
      • Can safely share the cache across machines.
      • Since every node in the network needs the same packages, this reduces build burden.
  • Since packages are pure functions, their results can be cached on disk.
    • Each result stores the hash of its inputs, so we know when the cached result can be safely used.
  • How to maintain purity while maintaining ease-of-use?
    • For purity, the FPM runs the package function in an isolated build environment: a chroot (filesystem isolation), well-defined environment variables, a separate PID namespace, etc.
    • For ease of use, the FPM inserts the packages you depend on into the chrooted filesystem, $PATH, and other environment variables.
    • This makes it easy to explicitly depend on packages, but hard to implicitly do so.
  • Implementations: Guix (considered in this paper) and Nix.
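
The caching idea above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Guix's or Nix's actual store: the names ToyStore, toy_build, and derivation_hash are hypothetical, and a "build" here just returns a string. It only illustrates that a result keyed by the hash of its recipe and inputs can be reused safely, and that changing any input (even the compiler) yields a different key.

```python
import hashlib
import json


def derivation_hash(name, recipe, input_hashes):
    """Hash everything the build may depend on: its recipe and its inputs."""
    payload = json.dumps(
        {"name": name, "recipe": recipe, "inputs": sorted(input_hashes)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def toy_build(name, recipe, input_outputs):
    """Stand-in for a real build: a pure function of its arguments."""
    return f"{name}<{recipe}>({', '.join(sorted(input_outputs))})"


class ToyStore:
    """Hypothetical content-addressed cache; not Guix's actual store."""

    def __init__(self):
        self.items = {}       # derivation hash -> immutable build output
        self.build_count = 0  # number of builds actually performed

    def realize(self, name, recipe, inputs=()):
        """Return (hash, output), building only on a cache miss."""
        key = derivation_hash(name, recipe, [h for h, _ in inputs])
        if key not in self.items:
            self.build_count += 1
            self.items[key] = toy_build(name, recipe, [out for _, out in inputs])
        return key, self.items[key]


store = ToyStore()
gcc = store.realize("gcc", "configure && make")
hello = store.realize("hello", "make", [gcc])
hello_again = store.realize("hello", "make", [gcc])        # cache hit
new_gcc = store.realize("gcc", "configure --enable-lto && make")
hello_rebuilt = store.realize("hello", "make", [new_gcc])  # new input, new hash

print(hello == hello_again)          # True: same inputs, same cached result
print(hello[0] != hello_rebuilt[0])  # True: a changed compiler changes the hash
print(store.build_count)             # 4: the repeated request cost no build
```

Because each entry is never modified after it is built, a cache like this could in principle be shared read-only across machines, which is the property the summary attributes to FPMs.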

Use case of FPM

  • Guix is deployed at the Max Delbrück Center for Molecular Medicine (MDC), Berlin.
  • The package cache is shared among 250 cluster nodes and some user workstations.
  • Custom packages are easy to write.
    • This claim is made by the authors, not the users. I would be curious to know what the users think.

Downsides of FPM

  • The Guix daemon requires root privileges.
  • Guix does not have a remote daemon.
    • At MDC, users have to manage their environment from a specific node. Other nodes can use but not change this environment.
  • FPMs can't easily specify kernel/OS-level things.
  • Guix's sandboxing (chroot + ...) isn't perfect; information can still leak, causing non-determinism.
    • Some packages try to query and specialize for the specific processor, which makes them impure.
  • Guix doesn't have proprietary software.
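
The processor-specialization leak mentioned above can be illustrated with a small Python sketch. Everything here is hypothetical: cache_key and cpu_tuned_build are made-up names, the input hashes are placeholder strings, and host_cpu_flags stands in for whatever a real build might probe from the machine. The point is only that when a build reads something outside its declared inputs, two machines compute the same cache key but produce different outputs.

```python
import hashlib


def cache_key(recipe, input_hashes):
    """Key derived only from the declared inputs."""
    h = hashlib.sha256(recipe.encode("utf-8"))
    for item in sorted(input_hashes):
        h.update(item.encode("utf-8"))
    return h.hexdigest()


def cpu_tuned_build(recipe, input_hashes, host_cpu_flags):
    """Toy build that specializes on an undeclared input.

    host_cpu_flags is an assumption standing in for information the
    build discovers about the machine it happens to run on.
    """
    isa = "avx2" if "avx2" in host_cpu_flags else "generic"
    return f"{recipe} [tuned for {isa}]"


# Placeholder values; real input hashes would come from the package graph.
recipe, inputs = "make fft-library", ["hash-of-gcc", "hash-of-libc"]

key_on_node_a = cache_key(recipe, inputs)
key_on_node_b = cache_key(recipe, inputs)
out_on_node_a = cpu_tuned_build(recipe, inputs, {"sse2", "avx2"})
out_on_node_b = cpu_tuned_build(recipe, inputs, {"sse2"})

print(key_on_node_a == key_on_node_b)  # True: the declared inputs are identical
print(out_on_node_a == out_on_node_b)  # False: the result depends on the host
```

In a shared cache, whichever node builds first would determine the artifact that every other node receives under that key.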