docs: overhaul module_file_support.rst (#42585)

The section was highly outdated as it referred to old defaults, and
failed to mention `hide_implicits: true`.

This commit restructures it, moves some deeply nested sections a level
up, and promotes `hide_implicits: true` + `autoload: direct` before
talking about `exclude`.
This commit is contained in:
Harmen Stoppels 2024-02-09 13:32:43 +01:00 committed by GitHub
parent a832d31ccd
commit c59d2d5b1c
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -10,7 +10,7 @@ Modules (modules.yaml)
======================
The use of module systems to manage user environment in a controlled way
is a common practice at HPC centers that is often embraced also by
is a common practice at HPC centers that is sometimes embraced also by
individual programmers on their development machines. To support this
common practice Spack integrates with `Environment Modules
<http://modules.sourceforge.net/>`_ and `Lmod
@ -21,14 +21,38 @@ Modules are one of several ways you can use Spack packages. For other
options that may fit your use case better, you should also look at
:ref:`spack load <spack-load>` and :ref:`environments <environments>`.
----------------------------
Using module files via Spack
----------------------------
-----------
Quick start
-----------
If you have installed a supported module system you should be able to
run ``module avail`` to see what module
files have been installed. Here is sample output of those programs,
showing lots of installed packages:
In the current version of Spack, module files are not generated by default. To get started, you
can generate module files for all currently installed packages by running either
.. code-block:: console
$ spack module tcl refresh
or
.. code-block:: console
$ spack module lmod refresh
Spack can also generate module files for all future installations automatically through the
following configuration:
.. code-block:: console
$ spack config add modules:default:enable:[tcl]
or
.. code-block:: console
$ spack config add modules:default:enable:[lmod]
Assuming you have a module system installed, you should now be able to use the ``module`` command
to interact with them:
.. code-block:: console
@ -65,33 +89,17 @@ scheme used at your site.
Module file customization
-------------------------
Module files are generated by post-install hooks after the successful
installation of a package.
.. note::
Spack only generates modulefiles when a package is installed. If
you attempt to install a package and it is already installed, Spack
will not regenerate modulefiles for the package. This may lead to
inconsistent modulefiles if the Spack module configuration has
changed since the package was installed, either by editing a file
or changing scopes or environments.
Later in this section there is a subsection on :ref:`regenerating
modules <cmd-spack-module-refresh>` that will allow you to bring
your modules to a consistent state.
The table below summarizes the essential information associated with
the different file formats that can be generated by Spack:
+-----------------------------+--------------------+-------------------------------+----------------------------------------------+----------------------+
| | **Hook name** | **Default root directory** | **Default template file** | **Compatible tools** |
+=============================+====================+===============================+==============================================+======================+
| **Tcl - Non-Hierarchical** | ``tcl`` | share/spack/modules | share/spack/templates/modules/modulefile.tcl | Env. Modules/Lmod |
+-----------------------------+--------------------+-------------------------------+----------------------------------------------+----------------------+
| **Lua - Hierarchical** | ``lmod`` | share/spack/lmod | share/spack/templates/modules/modulefile.lua | Lmod |
+-----------------------------+--------------------+-------------------------------+----------------------------------------------+----------------------+
+-----------+--------------+------------------------------+----------------------------------------------+----------------------+
| | Hierarchical | **Default root directory** | **Default template file** | **Compatible tools** |
+===========+==============+==============================+==============================================+======================+
| ``tcl`` | No | share/spack/modules | share/spack/templates/modules/modulefile.tcl | Env. Modules/Lmod |
+-----------+--------------+------------------------------+----------------------------------------------+----------------------+
| ``lmod`` | Yes | share/spack/lmod | share/spack/templates/modules/modulefile.lua | Lmod |
+-----------+--------------+------------------------------+----------------------------------------------+----------------------+
Spack ships with sensible defaults for the generation of module files, but
@ -102,7 +110,7 @@ In general you can override or extend the default behavior by:
2. writing specific rules in the ``modules.yaml`` configuration file
3. writing your own templates to override or extend the defaults
The former method let you express changes in the run-time environment
The former method lets you express changes in the run-time environment
that are needed to use the installed software properly, e.g. injecting variables
from language interpreters into their extensions. The latter two instead permit to
fine tune the filesystem layout, content and creation of module files to meet
@ -110,79 +118,62 @@ site specific conventions.
.. _overide-api-calls-in-package-py:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Override API calls in ``package.py``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Setting environment variables dynamically in ``package.py``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There are two methods that you can override in any ``package.py`` to affect the
content of the module files generated by Spack. The first one:
There are two methods that you can implement in any ``package.py`` to dynamically affect the
content of the module files generated by Spack. The most important one is
``setup_run_environment``, which can be used to set environment variables in the module file that
depend on the spec:
.. code-block:: python
def setup_run_environment(self, env):
pass
if self.spec.satisfies("+foo"):
env.set("FOO", "bar")
can alter the content of the module file associated with the same package where it is overridden.
The second method:
The second, less commonly used, is ``setup_dependent_run_environment(self, env, dependent_spec)``,
which allows a dependency to set variables in the module file of its dependents. This is typically
used in packages like ``python``, ``r``, or ``perl`` to prepend the dependent's prefix to the
search path of the interpreter (``PYTHONPATH``, ``R_LIBS``, ``PERL5LIB`` resp.), so it can locate
the packages at runtime.
For example, a simplified version of the ``python`` package could look like this:
.. code-block:: python
def setup_dependent_run_environment(self, env, dependent_spec):
pass
if dependent_spec.package.extends(self.spec):
env.prepend_path("PYTHONPATH", dependent_spec.prefix.lib.python)
can instead inject run-time environment modifications in the module files of packages
that depend on it. In both cases you need to fill ``env`` with the desired
list of environment modifications.
.. admonition:: The ``r`` package and callback APIs
An example in which it is crucial to override both methods
is given by the ``r`` package. This package installs libraries and headers
in non-standard locations and it is possible to prepend the appropriate directory
to the corresponding environment variables:
================== =================================
LD_LIBRARY_PATH ``self.prefix/rlib/R/lib``
PKG_CONFIG_PATH ``self.prefix/rlib/pkgconfig``
================== =================================
with the following snippet:
.. literalinclude:: _spack_root/var/spack/repos/builtin/packages/r/package.py
:pyobject: R.setup_run_environment
The ``r`` package also knows which environment variable should be modified
to make language extensions provided by other packages available, and modifies
it appropriately in the override of the second method:
.. literalinclude:: _spack_root/var/spack/repos/builtin/packages/r/package.py
:pyobject: R.setup_dependent_run_environment
and would make any package that ``extends("python")`` have its library directory added to the
``PYTHONPATH`` environment variable in the module file. It's much more convenient to set this
variable here, than to repeat it in every Python extension's ``setup_run_environment`` method.
.. _modules-yaml:
^^^^^^^^^^^^^^^^^^^^^^^^^^
Write a configuration file
^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The ``modules.yaml`` config file and module sets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The configuration files that control module generation behavior
are named ``modules.yaml``. The default configuration:
The configuration files that control module generation behavior are named ``modules.yaml``. The
default configuration looks like this:
.. literalinclude:: _spack_root/etc/spack/defaults/modules.yaml
:language: yaml
activates the hooks to generate ``tcl`` module files and inspects
the installation folder of each package for the presence of a set of subdirectories
(``bin``, ``man``, ``share/man``, etc.). If any is found its full path is prepended
to the environment variables listed below the folder name.
You can define one or more **module sets**, each of which can be configured separately with regard
to install location, naming scheme, inclusion and exclusion, autoloading, et cetera.
Spack modules can be configured for multiple module sets. The default
module set is named ``default``. All Spack commands which operate on
modules default to apply the ``default`` module set, but can be
applied to any module set in the configuration.
The default module set is aptly named ``default``. All
:ref:`Spack commands that operate on modules <maintaining-module-files>` apply to the ``default``
module set, unless another module set is specified explicitly (with the ``--name`` flag).
"""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^
Changing the modules root
"""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^
As shown in the table above, the default module root for ``lmod`` is
``$spack/share/spack/lmod`` and the default root for ``tcl`` is
@ -224,25 +215,32 @@ location could be confusing to users of your modules. In the next
section, we will discuss enabling and disabling module types (module
file generators) for each module set.
""""""""""""""""""""
Activate other hooks
""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Automatically generating module files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Any other module file generator shipped with Spack can be activated adding it to the
list under the ``enable`` key in the module file. Currently the only generator that
is not active by default is ``lmod``, which produces hierarchical lua module files.
Each module system can then be configured separately. In fact, you should list configuration
options that affect a particular type of module files under a top level key corresponding
to the generator being customized:
Spack can be configured to automatically generate module files as part of package installation.
This is done by adding the desired module systems to the ``enable`` list.
.. code-block:: yaml
modules:
default:
enable:
- tcl
- lmod
- tcl
- lmod
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Configuring ``tcl`` and ``lmod`` modules
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can configure the behavior of either module system separately, under a key corresponding to
the generator being customized:
.. code-block:: yaml
modules:
default:
tcl:
# contains environment modules specific customizations
lmod:
@ -253,16 +251,70 @@ either change the layout of the module files on the filesystem, or they will aff
their content. For the latter point it is possible to use anonymous specs
to fine tune the set of packages on which the modifications should be applied.
.. _autoloading-dependencies:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Autoloading and hiding dependencies
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A module file should set the variables that are needed for an application to work. But since an
application often has many dependencies, where should all the environment variables for those be
set? In Spack the rule is that each package sets the runtime variables that are needed by the
package itself, and no more. This way, dependencies can be loaded standalone too, and duplication
of environment variables is avoided.
That means however that if you want to use an application, you need to load the modules for all its
dependencies. Of course this is not something you would want users to do manually.
Since Spack knows the dependency graph of every package, it can easily generate module files that
automatically load the modules for its dependencies recursively. It is enabled by default for both
Lmod and Environment Modules under the ``autoload: direct`` config option. The former system has
builtin support through the ``depends_on`` function, the latter simply uses a ``module load``
statement. Both module systems (at least in newer versions) do reference counting, so that if a
module is loaded by two different modules, it will only be unloaded after the others are.
The ``autoload`` key accepts the values ``none``, ``direct``, and ``all``. To disable it, use
``none``, and to enable, it's best to stick to ``direct``, which only autoloads the direct link and
run type dependencies, relying on recursive autoloading to load the rest.
A common complaint about autoloading is the large number of modules that are visible to the user.
Spack has a solution for this as well: ``hide_implicits: true``. This ensures that only those
packages you've explicitly installed are exposed by ``module avail``, but still allows for
autoloading of hidden dependencies. Lmod should support hiding implicits in general, while
Environment Modules requires version 4.7 or higher.
.. note::
If supported by your module system, we highly encourage the following configuration that enables
autoloading and hiding of implicits. It ensures all runtime variables are set correctly,
including those for dependencies, without overwhelming the user with a large number of available
modules. Further, it makes it easier to get readable module names without collisions, see the
section below on :ref:`modules-projections`.
.. code-block:: yaml
modules:
default:
tcl:
hide_implicits: true
all:
autoload: direct
lmod:
hide_implicits: true
all:
autoload: direct
.. _anonymous_specs:
""""""""""""""""""""""""""""
Selection by anonymous specs
""""""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Setting environment variables for selected packages in config
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In the configuration file you can use *anonymous specs* (i.e. specs
that **are not required to have a root package** and are thus used just
to express constraints) to apply certain modifications on a selected set
of the installed software. For instance, in the snippet below:
In the configuration file you can filter particular specs, and make further changes to the
environment variables that go into their module files. This is very powerful when you want to avoid
:ref:`modifying the package itself <overide-api-calls-in-package-py>`, or when you want to set
certain variables on multiple selected packages at once.
For instance, in the snippet below:
.. code-block:: yaml
@ -305,12 +357,28 @@ the variable ``FOOBAR`` will be unset.
.. note::
Order does matter
The modifications associated with the ``all`` keyword are always evaluated
first, no matter where they appear in the configuration file. All the other
spec constraints are instead evaluated top to bottom.
first, no matter where they appear in the configuration file. All the other changes to
environment variables for matching specs are evaluated from top to bottom.
""""""""""""""""""""""""""""""""""""""""""""
.. warning::
As general advice, it's often better to set as few unnecessary variables as possible. For
example, the following seemingly innocent and potentially useful configuration
.. code-block:: yaml
all:
environment:
set:
"{name}_ROOT": "{prefix}"
sets ``BINUTILS_ROOT`` to its prefix in modules for ``binutils``, which happens to break
the ``gcc`` compiler: it uses this variable as its default search path for certain object
files and libraries, and by merely setting it, everything fails to link.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Exclude or include specific module files
""""""""""""""""""""""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can use anonymous specs also to prevent module files from being written or
to force them to be written. Consider the case where you want to hide from users
@ -330,14 +398,19 @@ you will prevent the generation of module files for any package that
is compiled with ``gcc@4.4.7``, with the only exception of any ``gcc``
or any ``llvm`` installation.
It is safe to combine ``exclude`` and ``autoload``
:ref:`mentioned above <autoloading-dependencies>`. When ``exclude`` prevents a module file to be
generated for a dependency, the ``autoload`` feature will simply not generate a statement to load
it.
.. _modules-projections:
"""""""""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Customize the naming of modules
"""""""""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The names of environment modules generated by spack are not always easy to
The names of environment modules generated by Spack are not always easy to
fully comprehend due to the long hash in the name. There are three module
configuration options to help with that. The first is a global setting to
adjust the hash length. It can be set anywhere from 0 to 32 and has a default
@ -353,6 +426,13 @@ shows how to set hash length in the module file names:
tcl:
hash_length: 7
.. tip::
Using ``hide_implicits: true`` (see :ref:`autoloading-dependencies`) vastly reduces the number
modules exposed to the user. The hidden modules always contain the hash in their name, and are
not influenced by the ``hash_length`` setting. Hidden implicits thus make it easier to use a
short hash length or no hash at all, without risking name conflicts.
To help make module names more readable, and to help alleviate name conflicts
with a short hash, one can use the ``suffixes`` option in the modules
configuration file. This option will add strings to modules that match a spec.
@ -365,12 +445,12 @@ For instance, the following config options,
tcl:
all:
suffixes:
^python@2.7.12: 'python-2.7.12'
^python@3.12: 'python-3.12'
^openblas: 'openblas'
will add a ``python-2.7.12`` version string to any packages compiled with
python matching the spec, ``python@2.7.12``. This is useful to know which
version of python a set of python extensions is associated with. Likewise, the
will add a ``python-3.12`` version string to any packages compiled with
Python matching the spec, ``python@3.12``. This is useful to know which
version of Python a set of Python extensions is associated with. Likewise, the
``openblas`` string is attached to any program that has openblas in the spec,
most likely via the ``+blas`` variant specification.
@ -468,41 +548,11 @@ that are already in the Lmod hierarchy.
For hierarchies that are deeper than three layers ``lmod spider`` may have some issues.
See `this discussion on the Lmod project <https://github.com/TACC/Lmod/issues/114>`_.
""""""""""""""""""""""
Select default modules
""""""""""""""""""""""
By default, when multiple modules of the same name share a directory,
the highest version number will be the default module. This behavior
of the ``module`` command can be overridden with a symlink named
``default`` to the desired default module. If you wish to configure
default modules with Spack, add a ``defaults`` key to your modules
configuration:
.. code-block:: yaml
modules:
my-module-set:
tcl:
defaults:
- gcc@10.2.1
- hdf5@1.2.10+mpi+hl%gcc
These defaults may be arbitrarily specific. For any package that
satisfies a default, Spack will generate the module file in the
appropriate path, and will generate a default symlink to the module
file as well.
.. warning::
If Spack is configured to generate multiple default packages in the
same directory, the last modulefile to be generated will be the
default module.
.. _customize-env-modifications:
"""""""""""""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Customize environment modifications
"""""""""""""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can control which prefixes in a Spack package are added to
environment variables with the ``prefix_inspections`` section; this
@ -600,9 +650,9 @@ stack to users who are likely to inspect the modules to find full
paths to software, when it is desirable to present the users with a
simpler set of paths than those generated by the Spack install tree.
""""""""""""""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Filter out environment modifications
""""""""""""""""""""""""""""""""""""
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Modifications to certain environment variables in module files are there by
default, for instance because they are generated by prefix inspections.
@ -622,49 +672,37 @@ do so by using the ``exclude_env_vars``:
The configuration above will generate module files that will not contain
modifications to either ``CPATH`` or ``LIBRARY_PATH``.
^^^^^^^^^^^^^^^^^^^^^^
Select default modules
^^^^^^^^^^^^^^^^^^^^^^
.. _autoloading-dependencies:
"""""""""""""""""""""
Autoload dependencies
"""""""""""""""""""""
Often it is required for a module to have its (transient) dependencies loaded as well.
One example where this is useful is when one package needs to use executables provided
by its dependency; when the dependency is autoloaded, the executable will be in the
PATH. Similarly for scripting languages such as Python, packages and their dependencies
have to be loaded together.
Autoloading is enabled by default for Lmod and Environment Modules. The former
has builtin support for through the ``depends_on`` function. The latter uses
``module load`` statement to load and track dependencies.
Autoloading can also be enabled conditionally:
By default, when multiple modules of the same name share a directory,
the highest version number will be the default module. This behavior
of the ``module`` command can be overridden with a symlink named
``default`` to the desired default module. If you wish to configure
default modules with Spack, add a ``defaults`` key to your modules
configuration:
.. code-block:: yaml
modules:
default:
tcl:
all:
autoload: none
^python:
autoload: direct
modules:
my-module-set:
tcl:
defaults:
- gcc@10.2.1
- hdf5@1.2.10+mpi+hl%gcc
The configuration file above will produce module files that will
load their direct dependencies if the package installed depends on ``python``.
The allowed values for the ``autoload`` statement are either ``none``,
``direct`` or ``all``.
These defaults may be arbitrarily specific. For any package that
satisfies a default, Spack will generate the module file in the
appropriate path, and will generate a default symlink to the module
file as well.
.. note::
Tcl prerequisites
In the ``tcl`` section of the configuration file it is possible to use
the ``prerequisites`` directive that accepts the same values as
``autoload``. It will produce module files that have a ``prereq``
statement, which autoloads dependencies on Environment Modules when its
``auto_handling`` configuration option is enabled. If Environment Modules
is installed with Spack, ``auto_handling`` is enabled by default starting
version 4.2. Otherwise it is enabled by default since version 5.0.
.. warning::
If Spack is configured to generate multiple default packages in the
same directory, the last modulefile to be generated will be the
default module.
.. _maintaining-module-files:
------------------------
Maintaining Module Files