The LOFAR pipeline system is built using the Python programming language. Certain features build upon the following libraries and tools. The short descriptions given here should serve as background material for those who simply wish to use the framework: directly interacting with these components should rarely be necessary. Developers, of course, will wish to learn in detail about all of these libraries.
IPython, billed as “an enhanced interactive Python”, also provides a comprehensive and easy-to-use suite of tools for parallel processing across a cluster of compute nodes using Python. This capability may be used for writing recipes in the pipeline system.
The parallel computing capabilities are only available in recent (post-0.9) releases of IPython. The reader may wish to refer to the IPython documentation for more information, or, for a summary of the capabilities of the system, to the Notes on IPython document on the LOFAR wiki.
A slight enhancement to the standard 0.9 IPython release is included with the pipeline system. We subclass IPython.kernel.task.StringTask to create pipeline.support.LOFARTask. This adds the dependargs named argument to the standard StringTask, which, in turn, is fed to the task’s depend() method. This makes the dependency system significantly more useful. See the DPPP recipe for an example of its use.
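The subclassing pattern described above can be illustrated with a minimal, self-contained sketch. It does not import IPython and is not the actual pipeline.support.LOFARTask code: StandInStringTask is a hypothetical stand-in for IPython.kernel.task.StringTask, and the check_depend() hook, the "datasets" property and the file name are all assumptions made for illustration. Only the dependargs named argument and the idea of passing it on to depend() come from the description above.

```python
class StandInStringTask:
    """Hypothetical stand-in for IPython.kernel.task.StringTask (not the real class)."""

    def __init__(self, expression, depend=None):
        self.expression = expression
        self.depend = depend  # callable deciding whether a node may run the task

    def check_depend(self, properties):
        # Assumed hook: the scheduler asks whether a node with these
        # properties satisfies the task's dependency.
        if self.depend is None:
            return True
        return self.depend(properties)


class SketchLOFARTask(StandInStringTask):
    """Sketch of a task that accepts an extra dependargs named argument."""

    def __init__(self, expression, depend=None, dependargs=None):
        super().__init__(expression, depend=depend)
        self.dependargs = dependargs

    def check_depend(self, properties):
        # Feed dependargs to depend() alongside the node properties.
        if self.depend is None:
            return True
        return self.depend(properties, self.dependargs)


def data_is_local(properties, dependargs):
    """Example dependency: run only where the named dataset is available."""
    return dependargs in properties.get("datasets", [])


# "example.MS" is a made-up dataset name used purely for illustration.
task = SketchLOFARTask(
    "result = process(ms)",
    depend=data_is_local,
    dependargs="example.MS",
)
```

With a single dependency function parameterised by dependargs, the same function can be reused for many tasks, each tied to a different dataset, instead of writing one dependency function per task.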
An alternative method of starting a distributed process across the cluster is to use the distproc system by Ger van Diepen. This system is used internally by various pipeline components, such as the MWImager; the interested reader is referred to the MWImager Manual for an overview of the operation of this system.
Infrastructure for supporting the distproc system is well embedded within various pipeline components, so the new framework has been designed to make use of that where possible. In particular, the reader’s attention is drawn to two file types:
The information contained in these files is used by both task distribution systems to schedule jobs on the appropriate compute nodes.