TL;DR: there are matching groups trying to match 1 or more occurrences of
something. We don't use the matching group. Therefore it's sufficient to test
for 1 occurrence. This reduce quadratic complexity to linear time.
---
When parsing logs of an mpich build, I'm getting a 4 minute (!!) wait
with 16 threads for regexes to run:
```
In [1]: %time p.parse("mpich.log")
Wall time: 4min 14s
```
That's really unacceptably slow...
After some digging, it seems a few regexes tend to have `O(n^2)` scaling
where `n` is the string / log line length. I don't think they *necessarily*
should scale like that, but it seems that way. The common pattern is this
```
([^:]+): error
```
which matches `: error` literally, and then one or more non-colons before that. So
for a log line like this:
```
abcdefghijklmnopqrstuvwxyz: error etc etc
```
Any of these are potential group matches when using `search` in Python:
```
abcdefghijklmnopqrstuvwxyz
bcdefghijklmnopqrstuvwxyz
cdefghijklmnopqrstuvwxyz
⋮
yz
z
```
but clearly the capture group should return the longest match.
My hypothesis is that Python has a very bad implementation of `search`
that somehow considers all of these, even though it can be implemented
in linear time by scanning for `: error` first, and then greedily expanding
the longest possible `[^:]+` match to the left. If Python indeed considers
all possible matches, then with `n` matches of length `1 .. n` you
see the `O(n^2)` slowness (i verified this by replacing + with {1,k}
and doubling `k`, it doubles the execution time indeed).
This PR fixes this by removing the `+`, so effectively changing the
O(n^2) into a O(n) worst case.
The reason we are fine with dropping `+` is that we don't use the
capture group anywhere, so, we just ensure `:: error` is not a match
but `x: error` is.
After going from O(n^2) to O(n), the 15MB mpich build log is parsed
in `1.288s`, so about 200x faster.
Just to be sure I've also updated `^CMake Error.*:` to `^CMake Error`,
so that it does not match with all the possible `:`'s in the line.
Another option is to use `.*?` there to make it quit scanning as soon as
possible, but what line that starts with `CMake Error` that does not have
a colon is really a false positive...
When PR #26659 is merged, boost version boost@1.54.0 will be available for build too.
Co-authored-by: Bernhard Kaindl <43588962+bernhardkaindl@users.noreply.github.com>
Installing packages with a lot of dependencies does not have an easy way
of judging the current progress (apart from running `spack spec -I pkg`
in another terminal). This change allows Spack to update the terminal's
title with status information, including its current progress as well as
information about the current and total number of packages.
* kahip: update to cmake for v3.11, retain scons for older versions
* kahip: update build system to cmake for v3.11, retain SCons for older versions
* address PR comments and add maintainer
* address PR comments - correct version to 2.10, add deprecated and url, and remove scons version
- Do not store the full list of environment variables in
<prefix>/.spack/spack-build-env.txt because it may contain user secrets.
- Only store environment variable modifications upon installation.
- Variables like PATH may still contain user and system paths to make
spack-build-env.txt sourceable. Variables containing paths are
modified through prepending/appending, and if we don't apply these
to the current environment variable, we end up with statements like
`export PATH=/path/to/spack/bin` with system paths missing, meaning
no system binaries are in the path, which is a bad user experience.
- Do write the full environment to spack-build-env.txt in the staging dir,
but ensure it is readonly for the current user, to make it a bit safer
on shared systems.
Creates an environment in a temporary directory and activates it, which
is useful for a quick ephemeral environment:
```
$ spack env activate -p --temp
[spack-1a203lyg] $ spack add zlib
==> Adding zlib to environment /tmp/spack-1a203lyg
==> Updating view at /tmp/spack-1a203lyg/.spack-env/view
```
PR #25904 moved the `--with-tcl` option to only older versions. However,
without this option, the build breaks:
```
checking for Tcl configuration... configure: error: Can't find Tcl configuration definitions. Use --with-tcl to specify a directory containing tclConfig.sh
```
The DB should be what is trusted for certain operations.
If it is not present when read we should assume the
corresponding store is empty, rather than trying a
write operation during a read.
* Add a unit test
* Document what needs to be there in tests
* py-twisted,py-storm: dep on zope.interface, bump storm version
py-twisted and py-storm's import tests need zope.interface.
py-storm: Use pypi and add version 0.25. It didn't change reqs.
zope.infterface@4.5.0 imports removed Feature: Use setuptools@:45
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
* py-storm: all deps updated with type=('build', 'run')
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
* py-bcrypt, py-bleach, py-decorator, py-pygdal: fix python dependency
* Update var/spack/repos/builtin/packages/py-bleach/package.py
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
* py-matplotlib: fix 3.4.3
* Update var/spack/repos/builtin/packages/py-matplotlib/package.py
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
* py-keras-preprocessing: Add missing deps: six@1.9.0: and numpy@1.9.1:
Add deps: pip download --no-binary :all: keras-preprocessing==1.1.2
Collecting numpy>=1.9.1
Installing build dependencies: started
Collecting six>=1.9.0
* Update var/spack/repos/builtin/packages/py-keras-preprocessing/package.py
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>