Pipeline layout

The pipeline system is designed to be organised in a standard directory structure. Insofar as is possible, this contains all information needed to manage a cluster and assosciated set of pipeline tasks. It is not designed to contain the actual data that is being processed. It is assumed that this directory will be available to all the various cluster nodes, presumably using NFS.

The root directory of this structure is the runtime directory. This contains all the information about a given “cluster” – that is, all the disks, compute nodes, management node, etc described by a given clusterdesc file. This top level directory contains that clusterdesc, and, if appropriate, information about an associated IPython cluster:

  • A PID file (ipc.pid) and log files from the IPython controller (named according to the pattern ipc.log${pid}.log)
  • An engines directory, which contains PID (ipengine${N}.pid, where N simply increments according to the number of engines on the host) files and log files (log${PID}.log) from the various engines in the cluster, organised into subdirectories by hostname.
  • The files engine.furl, multiengine.furl and task.furl, which provide access to the IPython engines, multiengine client and task client resepectively. See the IPython documentation for details.

Of course, a single cluster (and hence runtime directory) may process many different jobs. These are stored in the jobs subdirectory, and are organised by job name – an arbitrary user-supplied string.

Within each job directory, three further subdirectories are found:

logs
Processing logs; where appropriate, these are filed by sub-band name.
parsets
Paramaeter sets providing configuration information for the various pipeline components. These should provide the static parameters used by tasks such as DPPP and the imager; dynamic parameters, such as the name of the files to be processed, can be added by the pipeline at runtime.
vds
Contains VDS and GDS files pointing to the location of the data to be processed on the cluster.

Previous topic

Running basic pipelines

Next topic

Jobs

This Page