f2aca86502
Fixes #9394 Closes #13217. ## Background Spack provides the ability to enable/disable parallel builds through two options: package `parallel` and configuration `build_jobs`. This PR changes the algorithm to allow multiple, simultaneous processes to coordinate the installation of the same spec (and specs with overlapping dependencies.). The `parallel` (boolean) property sets the default for its package though the value can be overridden in the `install` method. Spack's current parallel builds are limited to build tools supporting `jobs` arguments (e.g., `Makefiles`). The number of jobs actually used is calculated as`min(config:build_jobs, # cores, 16)`, which can be overridden in the package or on the command line (i.e., `spack install -j <# jobs>`). This PR adds support for distributed (single- and multi-node) parallel builds. The goals of this work include improving the efficiency of installing packages with many dependencies and reducing the repetition associated with concurrent installations of (dependency) packages. ## Approach ### File System Locks Coordination between concurrent installs of overlapping packages to a Spack instance is accomplished through bottom-up dependency DAG processing and file system locks. The runs can be a combination of interactive and batch processes affecting the same file system. Exclusive prefix locks are required to install a package while shared prefix locks are required to check if the package is installed. Failures are communicated through a separate exclusive prefix failure lock, for concurrent processes, combined with a persistent store, for separate, related build processes. The resulting file contains the failing spec to facilitate manual debugging. ### Priority Queue Management of dependency builds changed from reliance on recursion to use of a priority queue where the priority of a spec is based on the number of its remaining uninstalled dependencies. Using a queue required a change to dependency build exception handling with the most visible issue being that the `install` method *must* install something in the prefix. Consequently, packages can no longer get away with an install method consisting of `pass`, for example. ## Caveats - This still only parallelizes a single-rooted build. Multi-rooted installs (e.g., for environments) are TBD in a future PR. Tasks: - [x] Adjust package lock timeout to correspond to value used in the demo - [x] Adjust database lock timeout to reduce contention on startup of concurrent `spack install <spec>` calls - [x] Replace (test) package's `install: pass` methods with file creation since post-install `sanity_check_prefix` will otherwise error out with `Install failed .. Nothing was installed!` - [x] Resolve remaining existing test failures - [x] Respond to alalazo's initial feedback - [x] Remove `bin/demo-locks.py` - [x] Add new tests to address new coverage issues - [x] Replace built-in package's `def install(..): pass` to "install" something (i.e., only `apple-libunwind`) - [x] Increase code coverage |
||
---|---|---|
.. | ||
darwin | ||
linux | ||
config.yaml | ||
modules.yaml | ||
packages.yaml | ||
repos.yaml |