6ee2eb21dd
This changes the hash algorithm so that it does much less object allocation and copying, and so that it is correct. The old version of `_cmp_key()` would call `sorted_deps`, which would call `flat_dependencies` to get a list of dependencies so that it could sort them in alphabetical order. This isn't necessary in the `_cmp_key()`, and in fact we want more DAG structure than that to be included in the `_cmp_key()`. The new version constructs a tuple without copying the Spec DAG, and the tuple contains hashes of sub-DAGs that are computed recursively in-place. This is way faster than the previous algorithm and reduces the numebr of copies significantly. It is also a correct DAG hash. Example timing and copy counts for the different hashing algorithms we've tried: Original (wrong) Spec hash: ``` 106,170 copies real 0m5.024s user 0m4.949s sys 0m0.104s ``` Spec hash using YAML `dag_hash()`: ``` 3,794 copies real 0m5.024s user 0m4.949s sys 0m0.104s New no-copy, no-YAML hash: ``` 3,594 copies real 0m2.543s user 0m2.435s sys 0m0.104s ``` So now we have a hash that is correct AND faster. The remaining ~3k copies happen mostly during concretization, and as all packages are initially loaded. I believe this is because Spack currently has to load all packages to figure out virtual dependency information; it could also be becasue there ar a lot of lookups of partial specs in concretize. I can investigate this further. |
||
---|---|---|
.. | ||
spack |