If the mimetype returned from `file -h -b --mime-type` contains slashes
in its subtype, the tuple returned from `spack.relocate.mime_type` will
have a size larger than two, which leads to errors.
Change-Id: I31de477e69f114ffdc9ae122d00c573f5f749dbb
Fixes#9394Closes#13217.
## Background
Spack provides the ability to enable/disable parallel builds through two options: package `parallel` and configuration `build_jobs`. This PR changes the algorithm to allow multiple, simultaneous processes to coordinate the installation of the same spec (and specs with overlapping dependencies.).
The `parallel` (boolean) property sets the default for its package though the value can be overridden in the `install` method.
Spack's current parallel builds are limited to build tools supporting `jobs` arguments (e.g., `Makefiles`). The number of jobs actually used is calculated as`min(config:build_jobs, # cores, 16)`, which can be overridden in the package or on the command line (i.e., `spack install -j <# jobs>`).
This PR adds support for distributed (single- and multi-node) parallel builds. The goals of this work include improving the efficiency of installing packages with many dependencies and reducing the repetition associated with concurrent installations of (dependency) packages.
## Approach
### File System Locks
Coordination between concurrent installs of overlapping packages to a Spack instance is accomplished through bottom-up dependency DAG processing and file system locks. The runs can be a combination of interactive and batch processes affecting the same file system. Exclusive prefix locks are required to install a package while shared prefix locks are required to check if the package is installed.
Failures are communicated through a separate exclusive prefix failure lock, for concurrent processes, combined with a persistent store, for separate, related build processes. The resulting file contains the failing spec to facilitate manual debugging.
### Priority Queue
Management of dependency builds changed from reliance on recursion to use of a priority queue where the priority of a spec is based on the number of its remaining uninstalled dependencies.
Using a queue required a change to dependency build exception handling with the most visible issue being that the `install` method *must* install something in the prefix. Consequently, packages can no longer get away with an install method consisting of `pass`, for example.
## Caveats
- This still only parallelizes a single-rooted build. Multi-rooted installs (e.g., for environments) are TBD in a future PR.
Tasks:
- [x] Adjust package lock timeout to correspond to value used in the demo
- [x] Adjust database lock timeout to reduce contention on startup of concurrent
`spack install <spec>` calls
- [x] Replace (test) package's `install: pass` methods with file creation since post-install
`sanity_check_prefix` will otherwise error out with `Install failed .. Nothing was installed!`
- [x] Resolve remaining existing test failures
- [x] Respond to alalazo's initial feedback
- [x] Remove `bin/demo-locks.py`
- [x] Add new tests to address new coverage issues
- [x] Replace built-in package's `def install(..): pass` to "install" something
(i.e., only `apple-libunwind`)
- [x] Increase code coverage
* Buildcache creation change the way prefix is copied to workdir.
* install_tree copies hardlinked files
* tarfile creates hardlinked files on extraction.
* create a temporary tarfile from prefix and extract it to workdir
* Use temp tarfile to move workdir to prefix to preserve hardlinks instead of copying
It's often useful to run a module with `python -m`, e.g.:
python -m pyinstrument script.py
Running a python script this way was hard, though, as `spack python` did
not have a similar `-m` option. This PR adds a `-m` option to `spack
python` so that we can do things like this:
spack python -m pyinstrument ./test.py
This makes it easy to write a script that uses a small part of Spack and
then profile it. Previously thee easiest way to do this was to write a
custom Spack command, which is often overkill.
Fixes#10019
If multiple instances of a package were installed in a single
instance of Spack, and they differed in terms of dependencies, then
"spack find" would not distinguish specs based on their dependencies.
For example if two instances of X were installed, one with Y and one
with Z, then "spack find X ^Y" would display both instances of X.
Using `sys.executable` to run Python in a sub-shell doesn't always work in a virtual environment as the `sys.executable` Python is not necessarily compatible with any loaded spack/other virtual environment.
- revert use of sys.executable to print out subshell environment (#14496)
- try instead to use an available python, then if there *is not* one, use `sys.executable`
- this addresses RHEL8 (where there is no `python` and `PYTHONHOME` issue in a simpler way
When removing packages from a view, extensions were being deactivated
in an arbitrary order. Extensions must be deactivated in preorder
traversal (dependents before dependencies), so when this order was
violated the view update would fail.
This commit ensures that views deactivate extensions based on a
preorder traversal and adds a test for it.
Despite trying very hard to keep dicts out of our hash algorithm, we seem
to still accidentally add them in ways that the tests can't catch. This
can cause errors when hashes are not computed deterministically.
This fixes an error we saw with Python 3.5, where dictionary iteration
order is random. In this instance, we saw a bug when reading Spack
environment lockfiles -- The load would fail like this:
```
...
File "/sw/spack/lib/spack/spack/environment.py", line 1249, in concretized_specs
yield (s, self.specs_by_hash[h])
KeyError: 'qcttqplkwgxzjlycbs4rfxxladnt423p'
```
This was because the hashes differed depending on whether we wrote `path`
or `module` first when recomputing the build hash as part of reading a
Spack lockfile. We can fix it by ensuring a determistic iteration order.
- [x] Fix two places (one that caused an issue, and one that did
not... yet) where our to_node_dict-like methods were using regular python
dicts.
- [x] Also add a check that statically analyzes our to_node_dict
functions and flags any that use Python dicts.
The test found the two errors fixed here, specifically:
```
E AssertionError: assert [] == ['Use syaml_dict instead of ...pack/spack/spec.py:1495:28']
E Right contains more items, first extra item: 'Use syaml_dict instead of dict at /Users/gamblin2/src/spack/lib/spack/spack/spec.py:1495:28'
E Full diff:
E - []
E + ['Use syaml_dict instead of dict at '
E + '/Users/gamblin2/src/spack/lib/spack/spack/spec.py:1495:28']
```
and
```
E AssertionError: assert [] == ['Use syaml_dict instead of ...ack/architecture.py:359:15']
E Right contains more items, first extra item: 'Use syaml_dict instead of dict at /Users/gamblin2/src/spack/lib/spack/spack/architecture.py:359:15'
E Full diff:
E - []
E + ['Use syaml_dict instead of dict at '
E + '/Users/gamblin2/src/spack/lib/spack/spack/architecture.py:359:15']
```
This commit introduces a `--no-check-signature` option for
`spack install` so that unsigned packages can be installed. It is
off by default (signatures required).
VSX alitvec extensions are supported by PowerISA from v2.06 (Power7+), but might
not be listed in features.
FMA has been supported by PowerISA since Power1, but might not be listed in
features.
This commit adds these features to all the power ISA family sets.
Add an optional 'submodules_delete' field to Git versions in Spack
packages that allows them to remove specific submodules.
For example: the nervanagpu submodule has become unavailable for the
PyTorch project (see issue 19457 at
https://github.com/pytorch/pytorch/issues/). Removing this submodule
allows 0.4.1 to build.
* Initialize _cached_specs at the file level and check for spec in it before searching mirrors in try_download_spec.
* Make _cached_specs a set to avoid duplicates
* Fix packaging test
* Ignore build_cache in stage when spec.yaml files are downloaded.
`spack -V` previously always returned the version of spack from
`spack.spack_version`. This gives us a general idea of what version
users are on, but if they're on `develop` or on some branch, we have to
ask more questions.
This PR makes `spack -V` check whether this instance of Spack is a git
repository, and if it is, it appends useful information from `git
describe --tags` to the version. Specifically, it adds:
- number of commits since the last release tag
- abbreviated (but unique) commit hash
So, if you're on `develop` you might get something like this:
$ spack -V
0.13.3-912-3519a1762
This means you're on commit 3519a1762, which is 912 commits ahead of
the 0.13.3 release.
If you are on a release branch, or if you are using a tarball of Spack,
you'll get the usual `spack.spack_version`:
$ spack -V
0.13.3
This should help when asking users what version they are on, since a lot
of people use the `develop` branch.
This PR adds a new command to Spack:
```console
$ spack containerize -h
usage: spack containerize [-h] [--config CONFIG]
creates recipes to build images for different container runtimes
optional arguments:
-h, --help show this help message and exit
--config CONFIG configuration for the container recipe that will be generated
```
which takes an environment with an additional `container` section:
```yaml
spack:
specs:
- gromacs build_type=Release
- mpich
- fftw precision=float
packages:
all:
target: [broadwell]
container:
# Select the format of the recipe e.g. docker,
# singularity or anything else that is currently supported
format: docker
# Select from a valid list of images
base:
image: "ubuntu:18.04"
spack: prerelease
# Additional system packages that are needed at runtime
os_packages:
- libgomp1
```
and turns it into a `Dockerfile` or a Singularity definition file, for instance:
```Dockerfile
# Build stage with Spack pre-installed and ready to be used
FROM spack/ubuntu-bionic:prerelease as builder
# What we want to install and how we want to install it
# is specified in a manifest file (spack.yaml)
RUN mkdir /opt/spack-environment \
&& (echo "spack:" \
&& echo " specs:" \
&& echo " - gromacs build_type=Release" \
&& echo " - mpich" \
&& echo " - fftw precision=float" \
&& echo " packages:" \
&& echo " all:" \
&& echo " target:" \
&& echo " - broadwell" \
&& echo " config:" \
&& echo " install_tree: /opt/software" \
&& echo " concretization: together" \
&& echo " view: /opt/view") > /opt/spack-environment/spack.yaml
# Install the software, remove unecessary deps and strip executables
RUN cd /opt/spack-environment && spack install && spack autoremove -y
RUN find -L /opt/view/* -type f -exec readlink -f '{}' \; | \
xargs file -i | \
grep 'charset=binary' | \
grep 'x-executable\|x-archive\|x-sharedlib' | \
awk -F: '{print $1}' | xargs strip -s
# Modifications to the environment that are necessary to run
RUN cd /opt/spack-environment && \
spack env activate --sh -d . >> /etc/profile.d/z10_spack_environment.sh
# Bare OS image to run the installed executables
FROM ubuntu:18.04
COPY --from=builder /opt/spack-environment /opt/spack-environment
COPY --from=builder /opt/software /opt/software
COPY --from=builder /opt/view /opt/view
COPY --from=builder /etc/profile.d/z10_spack_environment.sh /etc/profile.d/z10_spack_environment.sh
RUN apt-get -yqq update && apt-get -yqq upgrade \
&& apt-get -yqq install libgomp1 \
&& rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/bin/bash", "--rcfile", "/etc/profile", "-l"]
```
* Add binary_distribution::get_spec which takes concretized spec
Add binary_distribution::try_download_specs for downloading of spec.yaml files to cache
get_spec is used by package::try_install_from_binary_cache to download only the spec.yaml
for the concretized spec if it exists.
The Spec parser currently calls `spec.traverse()` after every parse, in
order to set the platform if it's not set. We don't need to do a full
traverse -- we can just check the platforrm as new specs are parsed.
This takes about a second off the time required to import all packages in
Spack (from 8s to 7s).
- [x] simplify platform-setting logic in `SpecParser`.
`filename_for_package_name()` and `dirname_for_package_name()`
automatically construct a Spec from their arguments, which adds a fair
amount of overhead to importing lots of packages. Removing this removes
about 11% of the runtime of importing all packages in Spack (9s -> 8s).
- [x] `filename_for_package_name()` and `dirname_for_package_name()` now
take a string `pkg_name` arguments instead of specs.
* `Environment.__init__` is now synchronized with all writing operations
* `spack uninstall` now synchronizes its updates to any associated environment
* A side effect of this is that the environment is no longer updated piecemeal as specs are uninstalled - all specs are removed from the environment before they are uninstalled
This commit makes two fundamental corrections to tests:
1) Changes 'matches' to the correct 'match' argument for 'pytest.raises' (for all affected tests except those checking for 'SystemExit');
2) Replaces the 'match' argument for tests expecting 'SystemExit' (since the exit code is retained instead) with 'capsys' error message capture.
Both changes are needed to ensure the associated exception message is actually checked.
Updates to environments were not multi-process safe, which prevented them from taking advantage of parallel builds as implemented in #13100. This is a minimal set of changes to enable `spack install` in an environment to be parallelized:
- [x] add an internal lock, stored in the `.spack-env` directory,
to synchronize updates to `spack.yaml` and `spack.lock`
- [x] add `Environment.write_transaction` interface for this lock
- [x] makes use of `Environment.write_transaction` in `install`,
`add`, and `remove` commands
- `uninstall` is not synchronized yet; that is left for a future PR.
Spack commands referring to upstream-installed specs by hash have
been broken since 6b619da (merged September 2019), which added a new
Database function specifically for parsing hashes from command-line
specs; this function was inappropriately attempting to acquire locks
on upstream databases.
This PR updates the offending function to avoid locking upstream
databases and also updates associated tests to catch regression
errors: the upstream database created for these tests was not
explicitly set as an upstream (i.e. initialized with upstream=True)
so it was not guarding against inappropriate accesses.