prisms_jobs.interface.slurm
Functions for interfacing between slurm and the prisms_jobs module
Functions
alter(jobid, arg) | scontrol update job.
delete(jobid) | scancel a job.
hold(jobid) | scontrol delay a job.
job_id([all, name]) | Get job IDs.
job_rundir(jobid) | Return the directory the job was run in using squeue.
job_status([jobid]) | Return job status using squeue.
read(job, qsubstr) | Raise exception.
release(jobid) | scontrol un-delay a job.
sub_string(job) | Write Job as a string suitable for slurm.
submit(substr[, write_submit_script]) | Submit a job using sbatch.
-
prisms_jobs.interface.slurm._squeue(jobid=None, username='bpuchala', full=False, sformat=None)
Return the stdout of squeue minus the header lines.
By default, ‘username’ is set to the current user.
‘full’ is the ‘-f’ option.
‘jobid’ is a string or list of strings of job ids.
‘sformat’ is a squeue format string (e.g., “%A %i %j %c”).
Returns: str – the text of squeue, minus the header lines
-
prisms_jobs.interface.slurm.alter(jobid, arg)
scontrol update job.
Parameters:
- jobid (str) – ID of job to alter
- arg (str) – ‘arg’ is a scontrol command option string. For instance, “-a 201403152300.19”
Returns: int – scontrol returncode
-
prisms_jobs.interface.slurm.delete(jobid)
scancel a job.
Parameters: jobid (str) – ID of job to cancel
Returns: int – scancel returncode
-
prisms_jobs.interface.slurm.hold(jobid)
scontrol delay a job.
Parameters: jobid (str) – ID of job to delay (for 30 days)
Returns: int – scontrol returncode
-
prisms_jobs.interface.slurm.job_id(all=False, name=None)
Get job IDs.
Parameters:
- all (bool) – If True, use squeue to query all user jobs. Else, check the environment variable SLURM_JOBID for the ID of the current job.
- name (str) – If all==True, use name to filter results.
Returns: One of str, List(str), or None – Returns a str if all==False and SLURM_JOBID exists, a List(str) if all==True, else None.
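The all==False branch described above amounts to an environment lookup. The following is a minimal sketch of that idea, not the module's actual code; the function name `current_job_id` and its `environ` parameter are hypothetical and exist only so the lookup can be tried outside a Slurm job.

```python
import os


def current_job_id(environ=None):
    """Return the current Slurm job ID as a str, or None outside a job.

    `environ` is a hypothetical hook for illustration; by default the
    real process environment is consulted.
    """
    env = os.environ if environ is None else environ
    # Slurm exports SLURM_JOBID inside a running job; it is absent otherwise.
    return env.get("SLURM_JOBID")


if __name__ == "__main__":
    print(current_job_id({"SLURM_JOBID": "123456"}))
    print(current_job_id({}))
```

Returning None rather than raising matches the documented behavior of returning None when SLURM_JOBID does not exist.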
-
prisms_jobs.interface.slurm.job_rundir(jobid)
Return the directory the job was run in using squeue.
Parameters: jobid (str or List(str)) – IDs of jobs to get the run directory
Returns: dict – A dict, with id:rundir pairs.
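To illustrate the id:rundir mapping, here is a small sketch that turns header-free squeue text into such a dict. It assumes squeue was called with a two-field format such as "%i %Z" (job ID and working directory); whether job_rundir uses that format is an assumption, and `parse_rundirs` and the sample paths are hypothetical.

```python
def parse_rundirs(squeue_text):
    """Map job ID -> working directory from header-free 'jobid workdir' lines."""
    rundirs = {}
    for line in squeue_text.splitlines():
        # Split once so working directories containing spaces stay whole.
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or incomplete lines
        jobid, rundir = parts
        rundirs[jobid] = rundir
    return rundirs


if __name__ == "__main__":
    # Invented sample text standing in for squeue output.
    sample = "1001 /home/user/run_a\n1002 /home/user/run b\n\n"
    print(parse_rundirs(sample))
```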
-
prisms_jobs.interface.slurm.job_status(jobid=None)
Return job status using squeue.
Parameters: jobid (None, str, or List(str)) – IDs of jobs to query for status. None for all user jobs.
Returns: dict of dict – The outer dict uses jobid as key; the inner dict contains:
- "name" – Job name
- "nodes" – Number of nodes
- "procs" – Number of processors
- "walltime" – Walltime
- "jobstatus" – status ("Q", "C", "R", etc.)
- "qstatstr" – result of squeue -f jobid, None if not found
- "elapsedtime" – None if not started, else seconds as int
- "starttime" – None if not started, else seconds since epoch as int
- "completiontime" – None if not completed, else seconds since epoch as int
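The nested return value can be consumed as in the sketch below, which selects job IDs by status code. The helper `jobs_with_status` is hypothetical, the sample data is invented, and only a subset of the documented keys is shown.

```python
def jobs_with_status(status, code):
    """Return the sorted job IDs whose 'jobstatus' equals `code`."""
    return sorted(
        jobid for jobid, info in status.items() if info["jobstatus"] == code
    )


# Invented data in the documented shape: the outer key is the job ID,
# the inner dict holds the per-job fields (subset shown).
example_status = {
    "1001": {"name": "relax", "jobstatus": "R",
             "elapsedtime": 120, "starttime": 1700000000, "completiontime": None},
    "1002": {"name": "static", "jobstatus": "Q",
             "elapsedtime": None, "starttime": None, "completiontime": None},
}

if __name__ == "__main__":
    print(jobs_with_status(example_status, "R"))
```

Fields such as "elapsedtime" and "starttime" are None until the job starts, so code that does arithmetic on them should check for None first.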
-
prisms_jobs.interface.slurm.release(jobid)
scontrol un-delay a job.
Parameters: jobid (str) – ID of job to release
Returns: int – scontrol returncode
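One way a delay and un-delay could be expressed with scontrol is by updating the job's StartTime. The sketch below only builds the argument lists and does not run anything. That hold and release work this way is an assumption, not something stated in this reference, and the helper names `hold_command` and `release_command` are hypothetical.

```python
def hold_command(jobid, days=30):
    """Build an scontrol command that postpones a job's start by `days` days."""
    return ["scontrol", "update", "JobId={}".format(jobid),
            "StartTime=now+{}days".format(days)]


def release_command(jobid):
    """Build an scontrol command that makes a delayed job eligible to start now."""
    return ["scontrol", "update", "JobId={}".format(jobid), "StartTime=now"]


if __name__ == "__main__":
    # Join for display only; pass the list form to subprocess to avoid quoting issues.
    print(" ".join(hold_command("1001")))
    print(" ".join(release_command("1001")))
```

On a system with Slurm installed, such a list could be passed to `subprocess.call`, whose integer result would correspond to the documented scontrol returncode.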
-
prisms_jobs.interface.slurm.sub_string(job)
Write Job as a string suitable for slurm.
Parameters: job (prisms_jobs.Job) – Job to be submitted
-
prisms_jobs.interface.slurm.submit(substr, write_submit_script=None)
Submit a job using sbatch.
Parameters:
- substr (str) – The submit script string
- write_submit_script (bool, optional) – If true, submit via file skipping lines containing ‘#SBATCH -J’; otherwise, submit via commandline. If not specified, uses prisms_jobs.config['write_submit_script'].
Returns: str – ID of submitted job
Raises: JobsError – If a submission error occurs
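On success sbatch prints a line of the form "Submitted batch job <id>", so recovering the ID as a str can be sketched as below. This is an illustration rather than the module's implementation; `parse_sbatch_job_id` is a hypothetical helper, and `SubmitError` is a stand-in for the JobsError that submit is documented to raise.

```python
import re


class SubmitError(Exception):
    """Hypothetical stand-in for the JobsError raised by submit."""


def parse_sbatch_job_id(stdout):
    """Extract the job ID (as a str) from sbatch's standard output."""
    match = re.search(r"Submitted batch job (\d+)", stdout)
    if match is None:
        raise SubmitError("could not find a job ID in: {!r}".format(stdout))
    return match.group(1)


if __name__ == "__main__":
    print(parse_sbatch_job_id("Submitted batch job 123456\n"))
```

Keeping the ID as a str matches the documented return type and the str job IDs accepted by the other functions on this page.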