taskmaster

Summary:

taskmaster submits a job that resubmits every Auto job in the job database that has finished running but whose taskstatus is still 'Incomplete' (for example, because the job hit its walltime limit or failed to converge). It then resubmits itself to run again after a delay, so the cycle repeats. Not all compute resources allow this behavior, so remember to check the policy before using taskmaster on a new compute resource.

The job submission options can be customized by editing the prisms-jobs configuration file.

--help documentation:

Automatically resubmit jobs.

‘taskmaster’ submits itself with instructions to be run after an amount of time specified by --delay (default=15:00). When it runs, it continues all Auto prisms_jobs jobs in the database that are incomplete and then re-submits itself to execute again after the specified delay.

The specifics of ‘taskmaster’ submission can be customized by editing the ‘taskmaster_job_kwargs’ object in the prisms_jobs configuration file: $PRISMS_JOBS_DIR/config.json.
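Before editing the configuration, it can help to see what is currently set. The following is a minimal sketch, not part of prisms_jobs itself: the helper name `load_taskmaster_job_kwargs` is hypothetical, and it assumes that ‘taskmaster_job_kwargs’ is a top-level key in config.json. It makes no assumption about which options that object contains.

```python
import json
import os
from pathlib import Path


def load_taskmaster_job_kwargs(config_path):
    """Return the 'taskmaster_job_kwargs' object from a config.json file.

    Hypothetical helper: assumes the object is a top-level key and
    returns an empty dict when the key is absent.
    """
    with open(config_path) as f:
        config = json.load(f)
    return config.get("taskmaster_job_kwargs", {})


if __name__ == "__main__":
    # $PRISMS_JOBS_DIR/config.json is the location named in the help text.
    config_dir = os.environ.get("PRISMS_JOBS_DIR")
    if config_dir and (Path(config_dir) / "config.json").exists():
        kwargs = load_taskmaster_job_kwargs(Path(config_dir) / "config.json")
        print(json.dumps(kwargs, indent=2))
```

Run as a script, this prints the current settings if the file exists and prints nothing otherwise.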

usage: taskmaster [-h] [-d DELAY] [--hold | --release | --kill]

Named Arguments

-d, --delay

How long to delay (“[[[DD:]HH:]MM:]SS”) between executions. Default is “15:00”.

Default: “15:00”
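To make the “[[[DD:]HH:]MM:]SS” format concrete, here is a minimal sketch of how such a string maps to a number of seconds. The function `delay_to_seconds` is a hypothetical illustration and is not taken from the taskmaster source, which may parse or pass the value through differently.

```python
def delay_to_seconds(delay: str) -> int:
    """Convert a "[[[DD:]HH:]MM:]SS" string to a number of seconds."""
    fields = delay.split(":")
    if not 1 <= len(fields) <= 4:
        raise ValueError(f"expected [[[DD:]HH:]MM:]SS, got {delay!r}")
    # Multipliers for SS, MM, HH, DD, matched starting from the rightmost field.
    multipliers = (1, 60, 3600, 86400)
    return sum(int(f) * m for f, m in zip(reversed(fields), multipliers))


print(delay_to_seconds("15:00"))       # 900 (the default: 15 minutes)
print(delay_to_seconds("2:00:00"))     # 7200 (2 hours)
print(delay_to_seconds("1:00:00:00"))  # 86400 (1 day)
```

Fields are read from the right, so a bare “30” means 30 seconds and “15:00” means 15 minutes, not 15 hours.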

--hold

Place a hold on the currently running taskmaster

Default: False

--release

Release the currently running taskmaster

Default: False

--kill

Kill the currently running taskmaster

Default: False
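The usage line shows --hold, --release and --kill as alternatives (`[--hold | --release | --kill]`), meaning at most one of them may be given per invocation. The sketch below reproduces that interface with Python's standard `argparse` module to illustrate the behavior. It is an assumption-based reconstruction from the help text above, not the actual taskmaster source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented taskmaster interface."""
    parser = argparse.ArgumentParser(
        prog="taskmaster", description="Automatically resubmit jobs."
    )
    parser.add_argument(
        "-d", "--delay", default="15:00",
        help='How long to delay ("[[[DD:]HH:]MM:]SS") between executions.',
    )
    # At most one of these three flags is accepted at a time.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--hold", action="store_true",
                       help="Place a hold on the currently running taskmaster")
    group.add_argument("--release", action="store_true",
                       help="Release the currently running taskmaster")
    group.add_argument("--kill", action="store_true",
                       help="Kill the currently running taskmaster")
    return parser


args = build_parser().parse_args(["--delay", "30:00"])
print(args.delay, args.hold, args.release, args.kill)  # 30:00 False False False
```

With this parser, passing two of the flags together (for example `--hold --kill`) makes `argparse` report an error and exit instead of running.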

configuration file: config.html