When we lose a running pod (possibly loss of spot instance) or encounter
some other infrastructure-related failure of this job, we need to retry
it. This retries the job the maximum number of times in those cases.
`reuse` and `when_possible` concretization broke the invariant that
`spec[pkg_name]` has unique keys. This invariant is relied on in tons of
places, such as when setting up the build environment.
When using `when_possible` concretization, one may end up with two or
more `perl`s or `python`s among the transitive deps of a spec, because
concretization does not consider build-only deps of reusable specs.
Until the code base is fixed not to rely on this broken property of
`__getitem__`, we should disable reuse in CI.
When installing some/all specs from a buildcache, build edges are pruned
from those specs. This can result in a much smaller effective DAG. Until
now, `spack env depfile` would always generate a full DAG.
Ths PR adds the `spack env depfile --use-buildcache` flag that was
introduced for `spack install` before. This way, not only can we drop
build edges, but also we can automatically set the right buildcache
related flags on the specific specs that are gonna get installed.
This way we get parallel installs of binary deps without redundancy,
which is useful for Gitlab CI.
Currently "spack ci generate" chooses the first matching entry in
gitlab-ci:mappings to fill attributes for a generated build-job,
requiring that the entire configuration matrix is listed out
explicitly. This unfortunately causes significant problems in
environments with large configuration spaces, for example the
environment in #31598 (spack.yaml) supports 5 operating systems,
3 architectures and 130 packages with explicit size requirements,
resulting in 1300 lines of configuration YAML.
This patch adds a configuraiton option to the gitlab-ci schema called
"match_behavior"; when it is set to "merge", all matching entries
are applied in order to the final build-job, allowing a few entries
to cover an entire matrix of configurations.
The default for "match_behavior" is "first", which behaves as before
this commit (only the runner attributes of the first match are used).
In addition, match entries may now include a "remove-attributes"
configuration, which allows matches to remove tags that have been
aggregated by prior matches. This only makes sense to use with
"match_behavior:merge". You can combine "runner-attributes" with
"remove-attributes" to effectively override prior tags.
* env depfile: allow deps only install
- Refactor `spack env depfile` to use a Jinja template, making it a bit
easier to follow as a human being.
- Add a layer of indirection in the generated Makefile through an
`<prefix>/.install-deps/<hash>` target, which allows one to specify
different options when installing dependencies. For example, only
verbose/debug mode on when installing some particular spec:
```
$ spack -e my_env env depfile -o Makefile --make-target-prefix example
$ make example/.install-deps/<hash> -j16
$ make example/.install/<hash> SPACK="spack -d" SPACK_INSTALL_FLAGS=--verbose -j16
```
This could be used to speed up `spack ci rebuild`:
- Parallel install of dependencies from buildcache
- Better readability of logs, e.g. reducing verbosity when installing
dependencies, and splitting logs into deps.log and current_spec.log
* Silence please!
Caches used by repositories don't reference the global spack.repo.path instance
anymore, but get the repository they refer to during initialization.
Spec.virtual now use the index, and computation done to compute the index
use Repository.is_virtual_safe.
Code to construct mock packages and mock repository has been factored into
a unique MockRepositoryBuilder that is used throughout the codebase.
Add debug print for pushing and popping config scopes.
Changed spack.repo.use_repositories so that it can override or not previous repos
spack.repo.use_repositories updates spack.config.config according to the modifications done
Removed a peculiar behavior from spack.config.Configuration where push would always
bubble-up a scope named command_line if it existed
Basic stack of ML packages we would like to test and generate binaries for in CI.
Spack now has a large CI framework in GitLab for PR testing and public binary generation.
We should take advantage of this to test and distribute optimized binaries for popular ML
frameworks.
This is a pretty extensive initial set, including CPU, ROCm, and CUDA versions of a core
`x96_64_v4` stack.
### Core ML frameworks
These are all popular core ML frameworks already available in Spack.
- [x] PyTorch
- [x] TensorFlow
- [x] Scikit-learn
- [x] MXNet
- [x] CNTK
- [x] Caffe
- [x] Chainer
- [x] XGBoost
- [x] Theano
### ML extensions
These are domain libraries and wrappers that build on top of core ML libraries
- [x] Keras
- [x] TensorBoard
- [x] torchvision
- [x] torchtext
- [x] torchaudio
- [x] TorchGeo
- [x] PyTorch Lightning
- [x] torchmetrics
- [x] GPyTorch
- [x] Horovod
### ML-adjacent libraries
These are libraries that aren't specific to ML but are still core libraries used in ML pipelines
- [x] numpy
- [x] scipy
- [x] pandas
- [x] ONNX
- [x] bazel
Co-authored-by: Jonathon Anderson <17242663+blue42u@users.noreply.github.com>
Remove `module-info mode load` condition that prevents auto-unloading when autoloading is enabled. It looks like this condition was added to work around an issue in environment-modules that is no longer necessary.
Add quotes to make is-loaded happy
Install: Add use-buildcache option to install
* Allow differentiating between top level packages and dependencies when
determining whether to install from the cache or not.
* Add unit test for --use-buildcache
* Use metavar to display use-buildcache options.
* Update spack-completion
PR #32615 deprecated Python versions up to 3.6.X. Since
the "build-systems" pipeline requires Python 3.6.15 to
build "tut", it will fail on the first rebuild that
involves Python.
The "tut" package is meant to perform an end-to-end
test of the "Waf" build-system, which is scarcely
used. The fix therefore is just to remove it from
the pipeline.
amazon linux 2 ships a glibc that is too old to work with cuda toolkit
for aarch64.
For example:
`libcurand.so.10.2.10.50` requires the symbol `logf@@GLIBC_2.27`, but
glibc is at 2.26.
So, these specs are removed.
* ci: restore coverage computation
* Mark "test_foreground_background" as xfail
* Mark "test_foreground_background_output" as xfail
* Make number of processes explicit, remove verbosity on linux
* Run coverage on just 3 Python jobs for linux
* Run coverage on just 3 Python jobs for linux
* Run coverage on just 2 Python jobs for linux
* Add back verbose, since before we didn't encounter the xdist internal error
* Reduce the workers to 2
* Try to use command line
* ci: remove !docs from "core" filters
Written like it is now it causes package only PRs
to run with coverage.
* Try to skip job under condition, see if the workflow proceed
* Try to cancel a running CI job
* Simplify linux unit-tests, skip windows unit-tests on package PRs
* Reduce the inputs to unit-tests workflow
* Move control logic to main workflow, remove inputs
* Revert "Move control logic to main workflow, remove inputs"
This reverts commit 0c46fece4c49eb7a37585ec3ba651a31d7f958af.
* Do not compute "with_coverage" since it's always == to "core"
* Remove workflow dispatch from unit tests
* Revert "Revert "Move control logic to main workflow, remove inputs""
This reverts commit dd4e4a4e61a825901e736348fd044d37e88c90b5.
* Try to skip all from the main workflow
* Add back bootstrap to needed checks for "all"
* Restore the correct logic for conditionals
* Add two no-op jobs named "all-prechecks" and "all"
These are a suggestion from @tgamblin, they are stable named markers we
can use from gitlab and possibly for required checks to make CI more
resilient to refactors changing the names of specific checks.
* Enable parallel testing using xdist for unit testing in CI
* Normalize tmp paths to deal with macos
* add -u flag compatibility to spack python
As of now, it is accepted and ignored. The usage with xdist, where it
is invoked specifically by `python -u spack python` which is then passed
`-u` by xdist is the entire reason for doing this. It should never be
used without explicitly passing -u to the executing python interpreter.
* use spack python in xdist to support python 2
When running on python2, spack has many import cycles unless started
through main. To allow that, this uses `spack python` as the
interpreter, leveraging the `-u` support so xdist doesn't error out when
it unconditionally requests unbuffered binary IO.
* Use shutil.move to account for tmpdir being in a separate filesystem sometimes
Move the copying of the buildcache to a root job that runs after all the child
pipelines have finished, so that the operation can be coordinated across all
child pipelines to remove the possibility of race conditions during potentially
simlutandous copies. This lets us ensure the .spec.json.sig and .spack files
for any spec in the root mirror always come from the same child pipeline
mirror (though which pipeline is arbitrary). It also allows us to avoid copying
of duplicates, which we now do.
If you have an environment like
```
$ cat spack.yaml
spack:
specs: [openmpi@4.1.0+cuda]
```
this PR provides a new command `spack change` that you can use to adjust environment specs from the command line:
```
$ spack change openmpi~cuda
$ cat spack.yaml
spack:
specs: [openmpi@4.1.0~cuda]
```
in other words, this allows you to tweak the details of environment specs from the command line.
Notes:
* This is only allowed for environments that do not define matrices
* This is possible but not anticipated to be needed immediately
* If this were done, it should probably only be done for "named"/not-anonymous specs (i.e. we can change `openmpi+cuda` but not spec like `+cuda` or `@4.0.1~cuda`)
This support requires adding the '--tests' option to 'spack ci rebuild'.
Packages whose stand-alone tests are broken (in the CI environment) can
be configured in gitlab-ci to be skipped by adding them to
broken-tests-packages.
Highlights include:
- Restructured 'spack ci' help to provide better subcommand summaries;
- Ensured only one InstallError (i.e., installer's) rather than allowing
build_environment to have its own; and
- Refactored CI and CDash reporting to keep CDash-related properties and
behavior in a separate class.
This allows stand-alone tests from `spack ci` to run when the `--tests`
option is used. With `--tests`, stand-alone tests are run **after** a
**successful** (re)build of the package. Test results are collected
and report(able) using CDash.
This PR adds the following features:
- Adds `-t` and `--tests` to `spack ci rebuild` to run stand-alone tests;
- Adds `--fail-fast` to stop stand-alone tests after the first failure;
- Ensures a *single* `InstallError` across packages
(i.e., removes second class from build environment);
- Captures skipping tests for externals and uninstalled packages
(for CDash reporting);
- Copies test logs and outputs to the CI artifacts directory to facilitate
debugging;
- Parses stand-alone test results to report outputs from each `run_test` as
separate test parts (CDash reporting);
- Logs a test completion message to allow capture of timing of the last
`run_test` part;
- Adds the runner description to the CDash site to better distinguish entries
in CDash tables;
- Adds `gitlab-ci` `broken-tests-packages` to CI configuration to skip
stand-alone testing for packages with known issues;
- Changes `spack ci --help` so description of each subcommand is a single line;
- Changes `spack ci <subcommand> --help` to provide the full description of
each command (versus no description); and
- Ensures `junit` test log file ends in an `.xml` extension (versus default where
it does not).
Tasks:
- [x] Include the equivalent of the architecture information, or at least the host target, in the CDash output
- [x] Upload stand-alone test results files as `test` artifacts
- [x] Confirm tests are run in GitLab
- [x] Ensure CDash results are uploaded as artifacts
- [x] Resolve issues with CDash build-and test results appearing on same row of the table
- [x] Add unit tests as needed
- [x] Investigate why some (dependency) packages don't have test results (e.g., related from other pipelines)
- [x] Ensure proper parsing and reporting of skipped tests (as `not run`) .. post- #28701 merge
- [x] Restore the proper CDash URLand or mirror ONCE out-of-band testing completed
* modified list.py and added functionality for --tag
* Removed long and very long, shifted rest of code above return statement
* removed results variable
* added import statement at top
* added the line accidentally deleted
* added line accidentally deleted
* changed p.name to p, added line inside if statement
* line order switched
* [@spackbot] updating style on behalf of sparkyniner
* ran update completion command
* add tests
* Update lib/spack/spack/test/cmd/list.py
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
* [@spackbot] updating style on behalf of sparkyniner
* changed argument to mock_packages and moved code under filter by tag
* removed bad rebase code and added additional test
* [@spackbot] updating style on behalf of sparkyniner
* added line removed earlier
* added line removed earlier
* replaced function
* added more recommended changes
Co-authored-by: sairaj <sairaj@sairajs-MacBook-Pro.local>
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
On PR pipelines we need to override the buildcache destination to
point to the "spack-binaries-prs" bucket, otherwise, those pipelines
try to push to the default mirror in a bucket for which they don't
have write permission.
`LD_LIBRARY_PATH` can break system executables (e.g., when an enviornment is loaded) and isn't necessary thanks to `RPATH`s. Packages that require `LD_LIBRARY_PATH` can set this in `setup_run_environment`.
- [x] Prefix inspections no longer set `LD_LIBRARY_PATH` by default
- [x] Document changes and workarounds for people who want `LD_LIBRARY_PATH`
`spack style` tests were annoyingly brittle because we could not easily be
specific about which tools to run (we had to use `--no-black`, `--no-isort`,
`--no-flake8`, and `--no-mypy`). We should be able to specify what to run OR
what to skip.
Now you can run, e.g.:
spack style --tool black,flake8
or:
spack style --skip black,isort
- [x] Remove `--no-black`, `--no-isort`, `--no-flake8`, and `--no-mypy` args.
- [x] Add `--tool TOOL` argument.
- [x] Add `--skip TOOL` argument.
- [x] Allow either `--tool black --tool flake8` or `--tool black,flake8` syntax.
Black will automatically fix a lot of the exceptions we previously allowed for
directives, so we don't need them in our custom `flake8_formatter` anymore.
- [x] remove `E501` (long line) exceptions for directives from `flake8_formatter`,
as they won't help us now.
- [x] Refine exceptions for long URLs in the `flake8_formatter`.
- [x] Adjust the mock `flake8-package` to exhibit the exceptions we still allow.
- [x] Update style tests for new `flake8-package`.
- [x] Blacken style test.
This adds necessary configuration for flake8 and black to work together.
This also sets the line length to 99, per the data here:
* https://github.com/spack/spack/pull/24718#issuecomment-876933636
Given the data and the spirit of black's 88-character limit, we set the limit to 99
characters for all of Spack, because:
* 99 is one less than 100, a nice round number, and all lines will fit in a
100-character wide terminal (even when the text editor puts a \ at EOL).
* 99 is just past the knee the file size curve for packages, and it means that packages
remain readable and not significantly longer than they are now.
* It doesn't seem to hurt core -- files in core might change length by a few percent but
seem like they'll be mostly the same as before -- just a bit more roomy.
- [x] set line length to 99
- [x] remove most exceptions from `.flake8` and add the ones black cares about
- [x] add `[tool.black]` to `pyproject.toml`
- [x] make `black` run if available in `spack style --fix`
Co-Authored-By: Tom Scogland <tscogland@llnl.gov>
We adopted the convention of putting binaries for each stack into
a dedicated mirror named after the directory in which the stack
(spack.yaml file) resides. This fixes the mirror url of the
radiuss-aws-aarch64 stack to follow that convention.
Add spack stacks targeted at Spack + AWS + ARM HPC User Group hackathon. Includes
a list of miniapps and full-apps that are ready to run on both x86_64 and aarch64.
Co-authored-by: Scott Wittenburg <scott.wittenburg@kitware.com>
Add two new stacks targeted at x86_64 and arm, representing an initial list of packages
used by current and planned AWS Workshops, and built in conjunction with the ISC22
announcement of the spack public binary cache.
Co-authored-by: Scott Wittenburg <scott.wittenburg@kitware.com>
Explicitly import package utilities in all packages, and corresponding fallout.
This includes:
* rename `spack.package` to `spack.package_base`
* rename `spack.pkgkit` to `spack.package`
* update all packages in builtin, builtin_mock and tutorials to include `from spack.package import *`
* update spack style
* ensure packages include the import
* automatically add the new import and remove any/all imports of `spack` and `spack.pkgkit`
from packages when using `--fix`
* add support for type-checking packages with mypy when SPACK_MYPY_CHECK_PACKAGES
is set in the environment
* fix all type checking errors in packages in spack upstream
* update spack create to include the new imports
* update spack repo to inject the new import, injection persists to allow for a deprecation period
Original message below:
As requested @adamjstewart, update all packages to use pkgkit. I ended up using isort to do this,
so repro is easy:
```console
$ isort -a 'from spack.pkgkit import *' --rm 'spack' ./var/spack/repos/builtin/packages/*/package.py
$ spack style --fix
```
There were several line spacing fixups caused either by space manipulation in isort or by packages
that haven't been touched since we added requirements, but there are no functional changes in here.
* [x] add config to isort to make sure this is maintained going forward
This PR supports the creation of securely signed binaries built from spack
develop as well as release branches and tags. Specifically:
- remove internal pr mirror url generation logic in favor of buildcache destination
on command line
- with a single mirror url specified in the spack.yaml, this makes it clearer where
binaries from various pipelines are pushed
- designate some tags as reserved: ['public', 'protected', 'notary']
- these tags are stripped from all jobs by default and provisioned internally
based on pipeline type
- update gitlab ci yaml to include pipelines on more protected branches than just
develop (so include releases and tags)
- binaries from all protected pipelines are pushed into mirrors including the
branch name so releases, tags, and develop binaries are kept separate
- update rebuild jobs running on protected pipelines to run on special runners
provisioned with an intermediate signing key
- protected rebuild jobs no longer use "SPACK_SIGNING_KEY" env var to
obtain signing key (in fact, final signing key is nowhere available to rebuild jobs)
- these intermediate signatures are verified at the end of each pipeline by a new
signing job to ensure binaries were produced by a protected pipeline
- optionallly schedule a signing/notary job at the end of the pipeline to sign all
packges in the mirror
- add signing-job-attributes to gitlab-ci section of spack environment to allow
configuration
- signing job runs on special runner (separate from protected rebuild runners)
provisioned with public intermediate key and secret signing key