28c0dd9148
Fixes #9166 This is intended to reduce errors related to lock timeouts by making the following changes: * Improves error reporting when acquiring a lock fails (addressing #9166) - there is no longer an attempt to release the lock if an acquire fails * By default locks taken on individual packages no longer have a timeout. This allows multiple spack instances to install overlapping dependency DAGs. For debugging purposes, a timeout can be added by setting 'package_lock_timeout' in config.yaml * Reduces the polling frequency when trying to acquire a lock, to reduce impact in the case where NFS is overtaxed. A simple adaptive strategy is implemented, which starts with a polling interval of .1 seconds and quickly increases to .5 seconds (originally it would poll up to 10^5 times per second). A test is added to check the polling interval generation logic. * The timeout for Spack's whole-database lock (e.g. for managing information about installed packages) is increased from 60s to 120s * Users can configure the whole-database lock timeout using the 'db_lock_timout' setting in config.yaml Generally, Spack locks (those created using spack.llnl.util.lock.Lock) now have no timeout by default This does not address implementations of NFS that do not support file locking, or detect cases where services that may be required (nfslock/statd) aren't running. Users may want to be able to more-aggressively release locks when they know they are the only one using their Spack instance, and they encounter lock errors after a crash (e.g. a remote terminal disconnect mentioned in #8915). |
||
---|---|---|
.. | ||
spack/defaults |