Job node distribution for CLC Assembly Cell

Multiple instances of CLC Assembly Cell can be run in parallel on a multi-node cluster. Because almost every cluster is set up differently, we provide the Perl script below as an example; it is free to download, use, and modify. Please note that this is not an off-the-shelf solution guaranteed to work on your computer cluster, but you are welcome to adjust it to fit your needs.

The cluster_schedule script distributes the jobs defined in the schedule file across a number of nodes. One example is the distribution of CLC Assembly Cell reference assembly jobs. This requires CLC Assembly Cell to be installed on each node, and performance is best when the reference sequence is stored locally on each node.

Each job is a list of commands that cluster_schedule runs in order on one node. If one of the commands in a job fails (its exit code is not zero), no further commands in the job are executed and the job is considered failed. If all commands in a job complete successfully (all exit codes are zero), the job is a success.
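The success and failure rule above can be sketched in a few lines. This is an illustrative Python sketch, not the Perl script itself: the function name run_job is hypothetical, and the commands are run locally through the shell here rather than on a remote node.

```python
import subprocess

def run_job(commands):
    """Run the commands of one job in order (hypothetical helper).

    Stops at the first command whose exit code is not zero.
    Returns (succeeded, number_of_commands_executed).
    """
    executed = 0
    for command in commands:
        executed += 1
        # Assumption: each command is a single shell command line.
        result = subprocess.run(command, shell=True)
        if result.returncode != 0:
            # A failed command ends the job; later commands are skipped.
            return False, executed
    return True, executed
```

A job whose second command fails therefore never reaches its third command, and the whole job is reported as failed.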

The nodes on which the jobs are run can be defined on the command line or in the schedule_file. Nodes defined on the command line replace all nodes defined in the schedule_file.
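The precedence rule can be illustrated as follows. This is a minimal Python sketch under the assumption that both node lists have already been parsed; the function name select_nodes and the node names in the usage note are hypothetical, and how the Perl script reads either source is not shown here.

```python
def select_nodes(command_line_nodes, schedule_file_nodes):
    """Pick the node list for a run (hypothetical helper).

    Nodes given on the command line replace the schedule file's
    nodes entirely; the two lists are never merged.
    """
    if command_line_nodes:
        return list(command_line_nodes)
    return list(schedule_file_nodes)
```

For example, select_nodes(["node03"], ["node01", "node02"]) yields only node03, while an empty command-line list falls back to node01 and node02.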

Each job is run on one node and each command is executed on the node using ssh.

Therefore, to use cluster_schedule, make sure that all nodes are set up for automatic (non-interactive) ssh authentication.
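One way to check this requirement before scheduling any jobs is sketched below in Python. It is an assumption-laden illustration rather than part of cluster_schedule: the helper names ssh_argv and node_is_reachable are hypothetical, and the check relies on the OpenSSH client option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```python
import subprocess

def ssh_argv(node, command):
    """Build the argument list for running one command on a node via ssh."""
    # BatchMode=yes: never prompt for a password or passphrase, so a node
    # without automatic authentication fails quickly instead of hanging.
    return ["ssh", "-o", "BatchMode=yes", node, command]

def node_is_reachable(node, timeout=10):
    """Return True if a trivial command runs on the node without a prompt."""
    try:
        result = subprocess.run(
            ssh_argv(node, "true"),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the node did not answer in time.
        return False
    return result.returncode == 0
```

Calling node_is_reachable on each node name before a run gives an early indication of which nodes still need their ssh keys set up.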

Download the cluster_schedule script
Download an example schedule file for cluster_schedule
The scripts are provided for free, without warranty or support.

Read the user manual for more information about CLC Assembly Cell

© QIAGEN 2017. All rights reserved - Trademarks & Disclaimers