Configuration

Various default parameters for pipeline operation are stored in a configuration file. The contents of this file are available to recipes as self.config, which is an instance of SafeConfigParser from the Python Standard Library.

Accepted parameters

Section DEFAULT

This section contains global pipeline configuration parameters. They can be referred to in other sections of the configuration file using %(name)s interpolation syntax; see the Python documentation (linked above) for details.
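For example, a value defined in this section can be reused elsewhere in the configuration file (the paths below are purely illustrative):

    [DEFAULT]
    runtime_directory = /data/pipeline/runtime

    [layout]
    ; %(runtime_directory)s expands to the value given in the DEFAULT section
    job_directory = %(runtime_directory)s/jobs/example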

runtime_directory

Overall pipeline framework runtime directory. Pipeline framework metadata which is shared between multiple jobs will be written here.

This parameter is required.

recipe_directories
List of directories in which to search for recipes. Multiple directories can be specified using a Python list-like syntax: [/path/to/dir/1, /path/to/dir/2].
task_files
List of task definition files. Multiple entries may be specified in the same way as above.
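As an illustration, a minimal DEFAULT section combining the parameters above might read (all paths are hypothetical):

    [DEFAULT]
    runtime_directory = /data/pipeline/runtime
    recipe_directories = [/opt/pipeline/recipes, /home/user/my_recipes]
    task_files = [%(runtime_directory)s/tasks.cfg]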

Section layout

This section contains paths which may be referenced by individual pipeline recipes, for example to save state, locate parset files or write logs.

job_directory

This directory will contain configuration information (parsets, etc.) for a given pipeline job. Metadata referring to an ongoing run may be written into this directory (and will normally be removed when the run finishes), and logs and output files are collected here.

This directory should be available (e.g. NFS mounted) to every node involved in the pipeline run.

This parameter is required.
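For example, assuming runtime_directory is defined in the DEFAULT section (the job name shown is a placeholder):

    [layout]
    ; kept on shared (e.g. NFS-mounted) storage so every node can reach it
    job_directory = %(runtime_directory)s/jobs/my_job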

Section cluster

This section describes the layout of a cluster which can be used for distributed processing.

clusterdesc

The full path to a clusterdesc file (see distproc) which describes the cluster configuration to be used by the pipeline.

This parameter is required if remote jobs are being used.

task_furl and multiengine_furl

Filenames which will be used for the FURL files needed for connecting to an IPython cluster.

These parameters are only required if IPython is being used within the pipeline.
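An illustrative cluster section might look as follows; the clusterdesc path is hypothetical, and the two FURL entries can be omitted when IPython is not used:

    [cluster]
    clusterdesc = /opt/cluster/cep.clusterdesc
    ; only needed when IPython is used within the pipeline
    task_furl = %(runtime_directory)s/task.furl
    multiengine_furl = %(runtime_directory)s/multiengine.furl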

Section deploy

This section describes the environment used for starting up jobs on remote hosts.

engine_lpath and engine_ppath

The values of $LD_LIBRARY_PATH and $PYTHONPATH which will be used for all remote commands. Note that these are not inherited from the environment on the pipeline head node.

These parameters are required if remote jobs are being used.

controller_ppath
The value of $PYTHONPATH which will be used for an IPython controller started on the head node. Note that this is not used (or required) if IPython is not being used.
script_path
The location of scripts needed for starting IPython engines on remote hosts. This is not used (or required) if IPython is not being used, or if a non-pipeline method is used for starting the engines.
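A sketch of a deploy section, with placeholder paths; note that engine_lpath and engine_ppath must list everything the remote commands need, since the head node's environment is not inherited:

    [deploy]
    engine_lpath = /opt/pipeline/lib:/usr/local/lib
    engine_ppath = /opt/pipeline/lib/python/site-packages
    ; the following are only relevant when IPython is used
    controller_ppath = /opt/pipeline/lib/python/site-packages
    script_path = /opt/pipeline/bin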

Section logging

This section enables the user to customise the pipeline logging output. Note that it is entirely optional: a log file with default settings will be written to the job_directory if this section is omitted.

log_file
Output filename for logging.
format and datefmt
Format for log entries and dates, respectively. These are used exactly as per the Python logging system; see its documentation for details.
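For example (hypothetical paths; note that the %(...)s tokens in format are Python logging placeholders rather than configuration-file interpolation, so this assumes the framework reads this section without interpolating its values):

    [logging]
    log_file = /data/pipeline/runtime/logs/pipeline.log
    format = %(asctime)s %(levelname)-8s %(name)s: %(message)s
    datefmt = %Y-%m-%d %H:%M:%S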

Section remote

This section contains parameters for configuring the remote command execution strategy. It is intended for expert use only.
