fixes#34518
Fix an issue due to the MRO chain of the package wrapper
during build. Before this PR we were always returning
False when the builder object was created before the
run_tests method was monkey patched.
This reverts commit 8035eeb36d.
And also removes logic around an additional HEAD request to prevent
a more expensive GET request on wrong content-type. Since large files
are typically an attachment and only downloaded when reading the
stream, it's not an optimization that helps much, and in fact the logic
was broken since the GET request was done unconditionally.
The main issue that's fixed is that Spack passes paths (as strings) to
functions that require urls. That wasn't an issue on unix, since there
you can simply concatenate `file://` and `path` and all is good, but on
Windows that gives invalid file urls. Also on Unix, Spack would not deal with uri encoding like x%20y for file paths.
It also removes Spack's custom url.parse function, which had its own incorrect interpretation of file urls, taking file://x/y to mean the relative path x/y instead of hostname=x and path=/y. Also it automatically interpolated variables, which is surprising for a function that parses URLs.
Instead of all sorts of ad-hoc `if windows: fix_broken_file_url` this PR
adds two helper functions around Python's own path2url and reverse.
Also fixes a bug where some `spack buildcache` commands
used `-d` as a flag to mean `--mirror-url` requiring a URL, and others
`--directory`, requiring a path. It is now the latter consistently.
When installing binary tarballs, Spack has to download from its
binary mirrors.
Sometimes Spack has cache available for these mirrors.
That cache helps to order mirrors to increase the likelihood of
getting a direct hit.
However, currently, when Spack can't find a spec in any local cache
of mirrors, it's very dumb:
- A while ago it used to query each mirror to see if it had a spec,
and use that information to order the mirror again, only to go
about and do exactly a part of what it just did: fetch the spec
from that mirror confused
- Recently, it was changed to download a full index.json, which
can be multiple dozens of MBs of data and may take a minute to
process thanks to the blazing fast performance you get with
Python.
In a typical use case of concretizing with reuse, the full index.json
is already available, and it likely that the local cache gives a perfect
mirror ordering on install. (There's typically no need to update any
caches).
However, in the use case of Gitlab CI, the build jobs don't have cache,
and it would be smart to just do direct fetches instead of all the
redundant work of (1) and/or (2).
Also, direct fetches from mirrors will soon be fast enough to
prefer these direct fetches over the excruciating slowness of
index.json files.
Writing a long dependency like:
```python
depends_on(
"llvm"
"targets=amdgpu,bpf,nvptx,webassembly"
"version_suffix=jl +link_llvm_dylib ~internal_unwind"
)
```
when it should be formatted like this:
```python
depends_on(
"llvm"
" targets=amdgpu,bpf,nvptx,webassembly"
" version_suffix=jl +link_llvm_dylib ~internal_unwind"
)
```
can cause really subtle errors. Specifically, you'll get something like this in
the package sanity tests:
```
AttributeError: 'NoneType' object has no attribute 'rpartition'
```
because Spack happily constructs a class that has a dependency with name `None`.
We can catch this earlier by banning anonymous dependency specs directly in
`depends_on()`. This causes the package itself to fail to parse, and emits
a much better error message:
```
==> Error: Invalid dependency specification in package 'julia':
llvmtargets=amdgpu,bpf,nvptx,webassemblyversion_suffix=jl +link_llvm_dylib ~internal_unwind
```
* `url_exists` improvements (take 2)
Make `url_exists` do HEAD request for http/https/s3 protocols
Rework the opener: construct it once and only once, dynamically dispatch
to the right one based on config.
It's very common for us to tell users to grep through the existing Spack packages to
find examples of what they want, and it's also very common for package developers to do
it. Now, searching packages is even easier.
`spack pkg grep` runs grep on all `package.py` files in repos known to Spack. It has no
special options other than the search string; all options passed to it are forwarded
along to `grep`.
```console
> spack pkg grep --help
usage: spack pkg grep [--help] ...
positional arguments:
grep_args arguments for grep
options:
--help show this help message and exit
```
```console
> spack pkg grep CMakePackage | head -3
/Users/gamblin2/src/spack/var/spack/repos/builtin/packages/3dtk/package.py:class _3dtk(CMakePackage):
/Users/gamblin2/src/spack/var/spack/repos/builtin/packages/abseil-cpp/package.py:class AbseilCpp(CMakePackage):
/Users/gamblin2/src/spack/var/spack/repos/builtin/packages/accfft/package.py:class Accfft(CMakePackage, CudaPackage):
```
```console
> spack pkg grep -Eho '(\S*)\(PythonPackage\)' | head -3
AwsParallelcluster(PythonPackage)
Awscli(PythonPackage)
Bueno(PythonPackage)
```
## Return Value
This retains the return value semantics of `grep`:
* 0 for found,
* 1 for not found
* >1 for error
## Choosing a `grep`
You can set the ``SPACK_GREP`` environment variable to choose the ``grep``
executable this command should use.
Unit tests on Windows are supposed to pass for any PR to pass CI.
However, the return code for the unit test command was not being
checked, which meant this check was always passing (effectively
disabled). This PR
* Properly checks the result of the unit tests and fails if the
unit tests fail
* Fixes (or disables on Windows) a number of tests which have
"drifted" out of support on Windows since this check was
effectively disabled
## Motivation
Our parser grew to be quite complex, with a 2-state lexer and logic in the parser
that has up to 5 levels of nested conditionals. In the future, to turn compilers into
proper dependencies, we'll have to increase the complexity further as we foresee
the need to add:
1. Edge attributes
2. Spec nesting
to the spec syntax (see https://github.com/spack/seps/pull/5 for an initial discussion of
those changes). The main attempt here is thus to _simplify the existing code_ before
we start extending it later. We try to do that by adopting a different token granularity,
and by using more complex regexes for tokenization. This allow us to a have a "flatter"
encoding for the parser. i.e., it has fewer nested conditionals and a near-trivial lexer.
There are places, namely in `VERSION`, where we have to use negative lookahead judiciously
to avoid ambiguity. Specifically, this parse is ambiguous without `(?!\s*=)` in `VERSION_RANGE`
and an extra final `\b` in `VERSION`:
```
@ 1.2.3 : develop # This is a version range 1.2.3:develop
@ 1.2.3 : develop=foo # This is a version range 1.2.3: followed by a key-value pair
```
## Differences with the previous parser
~There are currently 2 known differences with the previous parser, which have been added on purpose:~
- ~No spaces allowed after a sigil (e.g. `foo @ 1.2.3` is invalid while `foo @1.2.3` is valid)~
- ~`/<hash> @1.2.3` can be parsed as a concrete spec followed by an anonymous spec (before was invalid)~
~We can recover the previous behavior on both ones but, especially for the second one, it seems the current behavior in the PR is more consistent.~
The parser is currently 100% backward compatible.
## Error handling
Being based on more complex regexes, we can possibly improve error
handling by adding regexes for common issues and hint users on that.
I'll leave that for a following PR, but there's a stub for this approach in the PR.
## Performance
To be sure we don't add any performance penalty with this new encoding, I measured:
```console
$ spack python -m timeit -s "import spack.spec" -c "spack.spec.Spec(<spec>)"
```
for different specs on my machine:
* **Spack:** 0.20.0.dev0 (c9db4e50ba045f5697816187accaf2451cb1aae7)
* **Python:** 3.8.10
* **Platform:** linux-ubuntu20.04-icelake
* **Concretizer:** clingo
results are:
| Spec | develop | this PR |
| ------------- | ------------- | ------- |
| `trilinos` | 28.9 usec | 13.1 usec |
| `trilinos @1.2.10:1.4.20,2.0.1` | 131 usec | 120 usec |
| `trilinos %gcc` | 44.9 usec | 20.9 usec |
| `trilinos +foo` | 44.1 usec | 21.3 usec |
| `trilinos foo=bar` | 59.5 usec | 25.6 usec |
| `trilinos foo=bar ^ mpich foo=baz` | 120 usec | 82.1 usec |
so this new parser seems to be consistently faster than the previous one.
## Modifications
In this PR we just substituted the Spec parser, which means:
- [x] Deleted in `spec.py` the `SpecParser` and `SpecLexer` classes. deleted `spack/parse.py`
- [x] Added a new parser in `spack/parser.py`
- [x] Hooked the new parser in all the places the previous one was used
- [x] Adapted unit tests in `test/spec_syntax.py`
## Possible future improvements
Random thoughts while working on the PR:
- Currently we transform hashes and files into specs during parsing. I think
we might want to introduce an additional step and parse special objects like
a `FileSpec` etc. in-between parsing and concretization.
This commit reworks the bootstrapping procedure to use Spack environments
as much as possible.
The `spack.bootstrap` module has also been reorganized into a Python package.
A distinction is made among "core" Spack dependencies (clingo, GnuPG, patchelf)
and other dependencies. For a number of reasons, explained in the `spack.bootstrap.core`
module docstring, "core" dependencies are bootstrapped with the current ad-hoc
method.
All the other dependencies are instead bootstrapped using a Spack environment
that lives in a directory specific to the interpreter and the architecture being used.
All the vermin annotations we were using were for optional features introduced in early
Python 3 versions. We no longer need any of them, as we only support Python 3.6+. If we
start optionally using features from newer Pythons than 3.6, we will need more vermin
annotations.
Co-authored-by: Harmen Stoppels <harmenstoppels@gmail.com>
Co-authored-by: Harmen Stoppels <harmenstoppels@gmail.com>
We no longer support Python <3.6, so we don't need to check whether Python supports SSL
verification in `spack.util.web`.
- [x] Remove a bunch of logic we needed to appease Python 2
We've stopped supporting Python 2, and contributors are noticing that our CI no longer
allows Python 2.7 comment type hints. They end up having to adapt them, but this adds
extra unrelated work to PRs.
- [x] Move to 3.6 type hints across the entire code base
All Spec attributes are now represented as `attr(attribute_name, ... args ...)`, e.g.
`attr(node, "hdf5")` instead of `node("hdf5")`, as we *have* to maintain the `attr()`
form anyway, and it simplifies the encoding to just maintain one form of the Spec
information.
Background
----------
In #20644, we unified the way conditionals are done in the concretizer, but this
introduced a nasty aspect to the encoding: we have to maintain everything we want in
general conditions in two forms: `predicate(...)` and `attr("predicate", ...)`. For
example, here's the start of the table of spec attributes we had to maintain:
```prolog
node(Package) :- attr("node", Package).
virtual_node(Virtual) :- attr("virtual_node", Virtual).
hash(Package, Hash) :- attr("hash", Package, Hash).
version(Package, Version) :- attr("version", Package, Version).
...
```
```prolog
attr("node", Package) :- node(Package).
attr("virtual_node", Virtual) :- virtual_node(Virtual).
attr("hash", Package, Hash) :- hash(Package, Hash).
attr("version", Package, Version) :- version(Package, Version).
...
```
This adds cognitive load to understanding how the concretizer works, as you have to
understand the equivalence between the two forms of spec attributes. It also makes the
general condition logic in #20644 hard to explain, and it's easy to forget to add a new
equivalence to this list when adding new spec attributes (at least two people have been
bitten by this).
Solution
--------
- [x] remove the equivalence list from `concretize.lp`
- [x] simplify `spec_clauses()`, `condition()`, and other functions in `asp.py` that need
to deal with `Spec` attributes.
- [x] Convert all old-form spec attributes in `concretize.lp` to the `attr()` form
- [x] Simplify `display.lp`, where we also had to maintain a list of spec attributes. Now
we only need to show `attr/2`, `attr/3`, and `attr/4`.
- [x] Simplify model extraction logic in `asp.py`.
Performance
-----------
This seems to result in a smaller grounded problem (as there are no longer duplicated
`attr("foo", ...)` / `foo(...)` predicates in the program), but it also adds a slight
performance overhead vs. develop. Ultimately, simplifying the encoding will be a win,
particularly for improving error messages.
Notes
-----
This will simplify future node refactors in `concretize.lp` (e.g., not identifying nodes
by package name, which we need for separate build dependencies).
I'm still not entirely used to reading `attr()` notation, but I thnk it's ultimately
clearer than what we did before. We need more uniform naming, and it's now clear what is
part of a solution. We should probably continue making the encoding of `concretize.lp`
simpler and more self-explanatory. It may make sense to rename `attr` to something like
`node_attr` and to simplify the names of node attributes. It also might make sense to do
something similar for other types of predicates in `concretize.lp`.
This reverts commit d06fd26c9a.
The problem is that Bitbucket's API forwards download requests to an S3 bucket using a temporary URL. This URL includes a signature for the request, which embeds the HTTP verb. That means only GET requests are allowed, and HEAD requests would fail verification, leading to 403 erros. The same is observed when using `curl -LI ...`
Using `-Werror` is good practice for development and testing, but causes us a great
deal of heartburn supporting multiple compiler versions, especially as newer compiler
versions add warnings for released packages. This PR adds support for suppressing
`-Werror` through spack's compiler wrappers. There are currently three modes for
the `flags:keep_werror` setting:
* `none`: (default) cancel all `-Werror`, `-Werror=*` and `-Werror-*` flags by
converting them to `-Wno-error[=]*` flags
* `specific`: preserve explicitly selected warnings as errors, such as
`-Werror=format-truncation`, but reverse the blanket `-Werror`
* `all`: keeps all `-Werror` flags
These can be set globally in config.yaml, through the config command-line flags, or
overridden by a particular package (some packages use Werror as a proxy for determining
support for other compiler features). We chose to use this approach because:
1. removing `-Werror` flags entirely broke *many* build systems, especially autoconf
based ones, because of things like checking `-Werror=feature` and making the
assumption that if that did not error other flags related to that feature would also work
2. Attempting to preserve `-Werror` in some phases but not others caused similar issues
3. The per-package setting came about because some packages, even with all these
protections, still use `-Werror` unsafely. Currently there are roughly 3 such packages
known.
For reasons beyond me Python thinks it's a great idea to upgrade HEAD
requests to GET requests when following redirects. So, this PR adds a
better `HTTPRedirectHandler`, and also moves some ad-hoc logic around
for dealing with disabling SSL certs verification.
Also, I'm stumped by the fact that Spack's `url_exists` does not use
HEAD requests at all, so in certain cases Spack awkwardly downloads
something first to see if it can download it, and then downloads it
again because it knows it can download it. So, this PR ensures that both
urllib and botocore use HEAD requests.
Finally, it also removes some things that were there to support currently
unsupported Python versions.
Notice that the HTTP spec [section 10.3.2](https://datatracker.ietf.org/doc/html/rfc2616.html#section-10.3.2) just talks about how to deal
with POST request on redirect (whether to follow or not):
> If the 301 status code is received in response to a request other
> than GET or HEAD, the user agent MUST NOT automatically redirect the
> request unless it can be confirmed by the user, since this might
> change the conditions under which the request was issued.
> Note: When automatically redirecting a POST request after
> receiving a 301 status code, some existing HTTP/1.0 user agents
> will erroneously change it into a GET request.
Python has a comment about this, they choose to go with the "erroneous change".
But they then mess up the HEAD request while following the redirect, probably
because they were too busy discussing how to deal with POST.
See https://github.com/python/cpython/pull/99731
This adds super-lazy maintainer mode to `spack checksum`: Instead of
only printing the new checksums to the terminal, `-a` and
`--add-to-package` will add the new checksums to the `package.py` file
and open it in the editor afterwards for final checks.