The code in Spack to generate install and test reports currently suffers from unneeded complexity. For
instance, we have classes in Spack core packages, like `spack.reporters.CDash`, that need an
`argparse.Namespace` to be initialized and have "hard-coded" string literals on which they branch to
change their behavior:
```python
if do_fn.__name__ == "do_test" and skip_externals:
package["result"] = "skipped"
else:
package["result"] = "success"
package["stdout"] = fetch_log(pkg, do_fn, self.dir)
package["installed_from_binary_cache"] = pkg.installed_from_binary_cache
if do_fn.__name__ == "_install_task" and installed_already:
return
```
This PR attempt to polish the major issues encountered in both `spack.report` and `spack.reporters`.
Details:
- [x] `spack.reporters` is now a package that contains both the base class `Reporter` and all
the derived classes (`JUnit` and `CDash`)
- [x] Classes derived from `spack.reporters.Reporter` don't take an `argparse.Namespace` anymore
as argument to `__init__`. The rationale is that code for commands should be built upon Spack
core classes, not vice-versa.
- [x] An `argparse.Action` has been coded to create the correct `Reporter` object based on command
line arguments
- [x] The context managers to generate reports from either `spack install` or from `spack test` have
been greatly simplified, and have been made less "dynamic" in nature. In particular, the `collect_info`
class has been deleted in favor of two more specific context managers. This allows for a simpler
structure of the code, and less knowledge required to client code (in particular on which method to patch)
- [x] The `InfoCollector` class has been turned into a simple hierarchy, so to avoid conditional statements
within methods that assume a knowledge of the context in which the method is called.
On systems with remote groups, the primary user group may be remote and may not exist on
the local system (i.e., it might just be a number). On the CLI, it looks like this:
```console
> touch foo
> l foo
-rw-r--r-- 1 gamblin2 57095 0 Dec 29 22:24 foo
> chmod 2000 foo
chmod: changing permissions of 'foo': Operation not permitted
```
Here, the local machine doesn't know about per-user groups, so they appear as gids in
`ls` output. `57095` is also `gamblin2`'s uid, but the local machine doesn't know that
`gamblin2` is in the `57095` group.
Unfortunately, it seems that Python's `os.chmod()` just fails silently, setting
permissions to `0o0000` instead of `0o2000`. We can avoid this by ensuring that the file
has a group the user is known to be a member of.
- [x] Add `ensure_known_group()` in the permissions tests.
- [x] Call `ensure_known_group()` on tempfile in `test_chmod_real_entries_ignores_suid_sgid`.
There are a number of places in our docstrings where we write "list of X" as the type, even though napoleon doesn't actually support this. It ends up causing warnings when generating docs.
Now that we require Python 3, we don't have to rely on type hints in docs -- we can just use Python type hints and omit the types of args and return values from docstrings.
We should probably do this for all types in docstrings eventually, but this PR focuses on the ones that generate warnings during doc builds.
Some `mypy` annoyances we should consider in the future:
1. Adding some of these type annotations gets you:
```
note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs [annotation-unchecked]
```
because they are in unannotated functions (like constructors where we don't really need any annotations).
You can silence these with `disable_error_code = "annotation-unchecked"` in `pyproject.toml`
2. Right now we support running `mypy` in Python `3.6`. That means we have to support `mypy` `.971`, which does not support `disable_error_code = "annotation-unchecked"`, so I just filter `[annotation-unchecked]` lines out in `spack style`.
3. I would rather just turn on `check_untyped_defs` and get more `mypy` coverage everywhere, but that will require about 1,000 fixes. We should probably do that eventually.
4. We could also consider only running `mypy` on newer python versions. This is not easy to do while supporting `3.6`, because you have to use `if TYPE_CHECKING` for a lot of things to ensure that 3.6 still parses correctly. If we only supported `3.7` and above we could use [`from __future__ import annotations`](https://mypy.readthedocs.io/en/stable/runtime_troubles.html#future-annotations-import-pep-563), but we have to support 3.6 for now. Sigh.
- [x] Convert a number of docstring types to Python type hints
- [x] Get rid of "list of" wherever it appears
Local `git` tests will fail with `fatal: transport 'file' not allowed` when using git 2.38.1 or higher, due to a fix for `CVE-2022-39253`.
This was fixed in CI in #33429, but that doesn't help the issue for anyone's local environment. Instead of fixing this with git config in CI, we should ensure that the tests run anywhere.
- [x] Introduce `spack.util.git`.
- [x] Use `spack.util.git.get_git()` to get a git executable, instead of `which("git")` everywhere.
- [x] Make all `git` tests use a `git` fixture that goes through `spack.util.git.get_git()`.
- [x] Add `-c protocol.file.allow=always` to all `git` invocations under `pytest`.
- [x] Revert changes from #33429, which are no longer needed.
`spack graph` has been reworked to use:
- Jinja templates
- builder objects to construct the template context when DOT graphs are requested.
This allowed to add a new colored output for DOT graphs that highlights both
the dependency types and the nodes that are needed at runtime for a given spec.
`spack solve` is supposed to show you times you can compare. setup, ground, solve, etc.
all in a list. You're also supposed to be able to compare easily across runs. With
`pretty_seconds()` (introduced in #33900), it's easy to miss the units, e.g., spot the
bottleneck here:
```console
> spack solve --timers tcl
setup 22.125ms
load 16.083ms
ground 8.298ms
solve 848.055us
total 58.615ms
```
It's easier to see what matters if these are all in the same units, e.g.:
```
> spack solve --timers tcl
setup 0.0147s
load 0.0130s
ground 0.0078s
solve 0.0008s
total 0.0463s
```
And the units won't fluctuate from run to run as you make changes.
-[x] make `spack solve` timings consistent like before
Avoid text decoding and encoding when combining log files, instead
combine in binary mode.
Also do a buffered copy which is sometimes faster for large log files.
Currently, the Spack docs show documentation for submodules *before* documentation for
submodules on package doc pages. This means that if you put docs in `__init__.py` in
some package, the docs in there will be shown *after* the docs for all submodules of the
package instead of at the top as an intro to the package. See, e.g.,
[the lockfile docs](https://spack.readthedocs.io/en/latest/spack.environment.html#module-spack.environment),
which should be at the
[top of that page](https://spack.readthedocs.io/en/latest/spack.environment.html).
- [x] add the `--module-first` option to sphinx so that it generates module docs at top of page.
The sticky property will prevent clingo from changing the amdgpu_target
to work around conflicts. This is the same behaviour as was adopted for
cuda_arch in 055c9d125d.
Implement an alternative strategy to do index.json invalidation.
The current approach of pairs of index.json / index.json.hash is
problematic because it leads to races.
The standard solution for cache invalidation is etags, which are
supported by both http and s3 protocols, which allows one to do
conditional fetches.
This PR implements that for the http/https schemes. It should also work
for s3 schemes, but that requires other prs to be merged.
Also it improves unit tests for index.json fetches.
Interim fix for #34559
Spack's MSVC compiler definition uses ifx as the Fortran compiler.
Prior to #33385, the Spack MSVC compiler definition required the
executable to be called "ifx.exe"; #33385 replaced this with just
"ifx", which inadvertently led to ifx falsely indicating the
presence of MSVC on non-Windows systems (which leads to future
errors when attempting to query/use those compiler objects).
This commit applies a short-term fix by updating MSVC Fortran
version detection to always indicate a failure on non-Windows.
fixes#34518
Fix an issue due to the MRO chain of the package wrapper
during build. Before this PR we were always returning
False when the builder object was created before the
run_tests method was monkey patched.
This reverts commit 8035eeb36d.
And also removes logic around an additional HEAD request to prevent
a more expensive GET request on wrong content-type. Since large files
are typically an attachment and only downloaded when reading the
stream, it's not an optimization that helps much, and in fact the logic
was broken since the GET request was done unconditionally.
The main issue that's fixed is that Spack passes paths (as strings) to
functions that require urls. That wasn't an issue on unix, since there
you can simply concatenate `file://` and `path` and all is good, but on
Windows that gives invalid file urls. Also on Unix, Spack would not deal with uri encoding like x%20y for file paths.
It also removes Spack's custom url.parse function, which had its own incorrect interpretation of file urls, taking file://x/y to mean the relative path x/y instead of hostname=x and path=/y. Also it automatically interpolated variables, which is surprising for a function that parses URLs.
Instead of all sorts of ad-hoc `if windows: fix_broken_file_url` this PR
adds two helper functions around Python's own path2url and reverse.
Also fixes a bug where some `spack buildcache` commands
used `-d` as a flag to mean `--mirror-url` requiring a URL, and others
`--directory`, requiring a path. It is now the latter consistently.
When installing binary tarballs, Spack has to download from its
binary mirrors.
Sometimes Spack has cache available for these mirrors.
That cache helps to order mirrors to increase the likelihood of
getting a direct hit.
However, currently, when Spack can't find a spec in any local cache
of mirrors, it's very dumb:
- A while ago it used to query each mirror to see if it had a spec,
and use that information to order the mirror again, only to go
about and do exactly a part of what it just did: fetch the spec
from that mirror confused
- Recently, it was changed to download a full index.json, which
can be multiple dozens of MBs of data and may take a minute to
process thanks to the blazing fast performance you get with
Python.
In a typical use case of concretizing with reuse, the full index.json
is already available, and it likely that the local cache gives a perfect
mirror ordering on install. (There's typically no need to update any
caches).
However, in the use case of Gitlab CI, the build jobs don't have cache,
and it would be smart to just do direct fetches instead of all the
redundant work of (1) and/or (2).
Also, direct fetches from mirrors will soon be fast enough to
prefer these direct fetches over the excruciating slowness of
index.json files.
Writing a long dependency like:
```python
depends_on(
"llvm"
"targets=amdgpu,bpf,nvptx,webassembly"
"version_suffix=jl +link_llvm_dylib ~internal_unwind"
)
```
when it should be formatted like this:
```python
depends_on(
"llvm"
" targets=amdgpu,bpf,nvptx,webassembly"
" version_suffix=jl +link_llvm_dylib ~internal_unwind"
)
```
can cause really subtle errors. Specifically, you'll get something like this in
the package sanity tests:
```
AttributeError: 'NoneType' object has no attribute 'rpartition'
```
because Spack happily constructs a class that has a dependency with name `None`.
We can catch this earlier by banning anonymous dependency specs directly in
`depends_on()`. This causes the package itself to fail to parse, and emits
a much better error message:
```
==> Error: Invalid dependency specification in package 'julia':
llvmtargets=amdgpu,bpf,nvptx,webassemblyversion_suffix=jl +link_llvm_dylib ~internal_unwind
```
* `url_exists` improvements (take 2)
Make `url_exists` do HEAD request for http/https/s3 protocols
Rework the opener: construct it once and only once, dynamically dispatch
to the right one based on config.
It's very common for us to tell users to grep through the existing Spack packages to
find examples of what they want, and it's also very common for package developers to do
it. Now, searching packages is even easier.
`spack pkg grep` runs grep on all `package.py` files in repos known to Spack. It has no
special options other than the search string; all options passed to it are forwarded
along to `grep`.
```console
> spack pkg grep --help
usage: spack pkg grep [--help] ...
positional arguments:
grep_args arguments for grep
options:
--help show this help message and exit
```
```console
> spack pkg grep CMakePackage | head -3
/Users/gamblin2/src/spack/var/spack/repos/builtin/packages/3dtk/package.py:class _3dtk(CMakePackage):
/Users/gamblin2/src/spack/var/spack/repos/builtin/packages/abseil-cpp/package.py:class AbseilCpp(CMakePackage):
/Users/gamblin2/src/spack/var/spack/repos/builtin/packages/accfft/package.py:class Accfft(CMakePackage, CudaPackage):
```
```console
> spack pkg grep -Eho '(\S*)\(PythonPackage\)' | head -3
AwsParallelcluster(PythonPackage)
Awscli(PythonPackage)
Bueno(PythonPackage)
```
## Return Value
This retains the return value semantics of `grep`:
* 0 for found,
* 1 for not found
* >1 for error
## Choosing a `grep`
You can set the ``SPACK_GREP`` environment variable to choose the ``grep``
executable this command should use.
Unit tests on Windows are supposed to pass for any PR to pass CI.
However, the return code for the unit test command was not being
checked, which meant this check was always passing (effectively
disabled). This PR
* Properly checks the result of the unit tests and fails if the
unit tests fail
* Fixes (or disables on Windows) a number of tests which have
"drifted" out of support on Windows since this check was
effectively disabled