prisms_jobs.interface.slurm

Functions for interfacing between slurm and the prisms_jobs module

Functions

alter(jobid, arg) – scontrol update job.
delete(jobid) – scancel a job.
hold(jobid) – scontrol delay a job.
job_id([all, name]) – Get job IDs
job_rundir(jobid) – Return the directory job was run in using squeue.
job_status([jobid]) – Return job status using squeue
read(job, qsubstr) – Raise exception
release(jobid) – scontrol un-delay a job.
sub_string(job) – Write Job as a string suitable for slurm
submit(substr[, write_submit_script]) – Submit a job using sbatch.
prisms_jobs.interface.slurm._squeue(jobid=None, username='bpuchala', full=False, sformat=None)[source]

Return the stdout of squeue minus the header lines.

Parameters:
  • jobid – A job ID string or a list of job ID strings
  • username – Defaults to the current user
  • full – The ‘-f’ option
  • sformat – A squeue format string (e.g., “%A %i %j %c”)
Returns:str – the text of squeue, minus the header lines
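
The jobid, username, and sformat options above correspond to standard squeue flags. The following is a minimal sketch of assembling such a command line. build_squeue_args is a hypothetical helper, not part of prisms_jobs, and it is not necessarily how _squeue is implemented; the ‘full’ option is left out.

```python
def build_squeue_args(jobid=None, username=None, sformat=None):
    """Assemble an argument list for squeue (hypothetical helper)."""
    # --noheader asks squeue to omit the header line from its output.
    args = ["squeue", "--noheader"]
    if username is not None:
        args.append("--user=" + username)
    if jobid is not None:
        # Accept a single ID or a list of IDs; squeue takes a
        # comma-separated list of job IDs.
        ids = [jobid] if isinstance(jobid, str) else list(jobid)
        args.append("--jobs=" + ",".join(ids))
    if sformat is not None:
        args.append("--format=" + sformat)
    return args
```

The resulting list could be passed to subprocess.run on a machine where squeue is installed.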
prisms_jobs.interface.slurm.alter(jobid, arg)[source]

scontrol update job.

Parameters:
  • jobid (str) – ID of job to alter
  • arg (str) – A scontrol command option string, for instance “-a 201403152300.19”
Returns:

int – scontrol returncode

prisms_jobs.interface.slurm.delete(jobid)[source]

scancel a job.

Parameters:jobid (str) – ID of job to cancel
Returns:int – scancel returncode
prisms_jobs.interface.slurm.hold(jobid)[source]

scontrol delay a job.

Parameters:jobid (str) – ID of job to delay (for 30 days)
Returns:int – scontrol returncode
prisms_jobs.interface.slurm.job_id(all=False, name=None)[source]

Get job IDs

Parameters:
  • all (bool) – If True, use squeue to query all user jobs. Else, check the environment variable SLURM_JOBID for the ID of the current job.
  • name (str) – If all==True, use name to filter results.
Returns:

One of str, List(str), or None – Returns a str if all==False and SLURM_JOBID exists, a List(str) if all==True, else None.
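
The all==False case described above amounts to an environment lookup. The following is a minimal sketch of that lookup only. current_job_id is a hypothetical name, not the library's function, and the optional environ argument is an assumption added so the behavior can be checked outside a Slurm job.

```python
import os


def current_job_id(environ=None):
    """Return the current job's ID as a str, or None outside a job.

    Hypothetical helper: reads SLURM_JOBID from the given mapping,
    falling back to the process environment.
    """
    env = os.environ if environ is None else environ
    return env.get("SLURM_JOBID")
```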

prisms_jobs.interface.slurm.job_rundir(jobid)[source]

Return the directory job was run in using squeue.

Parameters:jobid (str or List(str)) – IDs of the jobs whose run directory is wanted
Returns:dict – A dict, with id:rundir pairs.
prisms_jobs.interface.slurm.job_status(jobid=None)[source]

Return job status using squeue

Parameters:jobid (None, str, or List(str)) – IDs of jobs to query for status. None for all user jobs.
Returns:dict of dict – The outer dict uses jobid as key; the inner dict contains:
  • “name” – Job name
  • “nodes” – Number of nodes
  • “procs” – Number of processors
  • “walltime” – Walltime
  • “jobstatus” – status (“Q”, “C”, “R”, etc.)
  • “qstatstr” – result of squeue -f jobid, None if not found
  • “elapsedtime” – None if not started, else seconds as int
  • “starttime” – None if not started, else seconds since epoch as int
  • “completiontime” – None if not completed, else seconds since epoch as int
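
The nested return value can be filtered with ordinary dict operations. The following sketch assumes that “R” denotes a running job. running_jobs is a hypothetical helper, and example_status is made-up data containing only a subset of the documented keys.

```python
def running_jobs(status):
    """Return the sorted IDs of jobs whose "jobstatus" is "R"."""
    return sorted(
        jobid for jobid, info in status.items() if info["jobstatus"] == "R"
    )


# Illustrative data only, shaped like the documented return value.
example_status = {
    "1001": {"name": "relax", "jobstatus": "R", "elapsedtime": 120},
    "1002": {"name": "static", "jobstatus": "Q", "elapsedtime": None},
}

print(running_jobs(example_status))
```

In real use, the argument would be the dict returned by job_status().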
prisms_jobs.interface.slurm.read(job, qsubstr)[source]

Raise exception

prisms_jobs.interface.slurm.release(jobid)[source]

scontrol un-delay a job.

Parameters:jobid (str) – ID of job to release
Returns:int – scontrol returncode
prisms_jobs.interface.slurm.sub_string(job)[source]

Write Job as a string suitable for slurm

Parameters:job (prisms_jobs.Job) – Job to be submitted
prisms_jobs.interface.slurm.submit(substr, write_submit_script=None)[source]

Submit a job using sbatch.

Parameters:
  • substr (str) – The submit script string
  • write_submit_script (bool, optional) – If True, submit via a file, skipping lines containing ‘#SBATCH -J’; otherwise, submit via the command line. If not specified, uses prisms_jobs.config['write_submit_script'].
Returns:

str – ID of submitted job

Raises:

JobsError – If a submission error occurs
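
On success, sbatch prints a line of the form “Submitted batch job <id>”. The following is a minimal sketch of turning that output into a job ID string or an error. parse_sbatch_output and SubmitError are hypothetical names used for illustration; SubmitError stands in for JobsError, and this is not necessarily how submit processes the output.

```python
import re


class SubmitError(Exception):
    """Hypothetical stand-in for the JobsError raised by submit."""


def parse_sbatch_output(stdout):
    """Extract the job ID (as a str) from sbatch's standard output."""
    match = re.search(r"Submitted batch job (\d+)", stdout)
    if match is None:
        # No job ID found: treat the submission as failed.
        raise SubmitError("could not find a job ID in: " + repr(stdout))
    return match.group(1)
```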