Frequently Asked Questions
- … How do I access a compute node?
An interactive session can be started by typing:
srun --account=<accountname> --pty bash --login
- … What account should I use when requesting resources?
Faculty members should use --account=faculty. Staff and collaborators should use a PI-associated account that is linked to their user account, for example --account=pi-<ID>. PhD students may use either --account=phd or a PI-associated account, depending on whether the research is independent or associated with a PI.
- … My job has been in the queue for a long time. Why is my job not running?
Mercury may be heavily loaded, in which case your job is waiting for resources. To check the status of your job, view the output of squeue and note the “State” column.
- State=(Resources)
There are currently no available resources to run your job.
- State=(Priority)
Other jobs with higher priority are scheduled ahead of your job.
- State=(QOSGrpBilling)
Your other jobs have exceeded the concurrent resource limits. Your job will become eligible to run as your other jobs finish.
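The reasons listed above can also be looked up from the command line. The following is a minimal sketch, assuming a standard Slurm squeue in which the %r format field prints the pending reason. The helper name explain_reason is hypothetical, and its messages simply restate the list above.

```shell
#!/bin/bash
# explain_reason is a hypothetical helper, not part of Slurm. It maps a
# pending reason to the explanation given in this FAQ.
explain_reason() {
    case "$1" in
        Resources)     echo "no resources are currently available" ;;
        Priority)      echo "higher-priority jobs are ahead of this job" ;;
        QOSGrpBilling) echo "your other jobs exceed the concurrent resource limits" ;;
        *)             echo "see the squeue documentation for this reason" ;;
    esac
}

# List your pending jobs and explain why each one is waiting.
# -h drops the header; %i is the job ID and %r is the pending reason.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER" -t PENDING -h -o "%i %r" |
    while read -r jobid reason; do
        echo "$jobid: $reason ($(explain_reason "$reason"))"
    done
fi
```

On a machine without Slurm the script prints nothing, since the squeue call is skipped.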
- … How do I use a Python virtual environment in Jupyter?
You must first register the kernel using the following steps. Activate the virtual environment, then issue the command:
ipython kernel install --user --name <kernel-name>
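The registration steps above might look like the sketch below. The environment path $HOME/venvs/myenv and the kernel name myenv are placeholder assumptions, and register_kernel is a hypothetical wrapper. The sketch also assumes that the ipykernel package has to be installed into the environment before the kernel can be registered.

```shell
#!/bin/bash
# Hypothetical location and name; substitute your own.
VENV_DIR="$HOME/venvs/myenv"
KERNEL_NAME="myenv"

# register_kernel is a hypothetical wrapper around the steps described above.
register_kernel() {
    # Activate the virtual environment (it must already exist).
    source "$VENV_DIR/bin/activate" || return 1
    # Install ipykernel, on the assumption that it is not yet present.
    pip install ipykernel || return 1
    # Register the environment as a Jupyter kernel for the current user.
    ipython kernel install --user --name "$KERNEL_NAME"
}

# Run the registration only when the environment is present.
if [ -f "$VENV_DIR/bin/activate" ]; then
    register_kernel
else
    echo "No virtual environment found at $VENV_DIR"
fi
```

Afterwards, jupyter kernelspec list should include the new kernel name.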
Error Messages
- … sbatch: error: Batch job submission failed: Memory required by task is not available
You have requested more than the allowable memory. Modify
#SBATCH --mem
- … sbatch: error: Batch job submission failed: Node count specification invalid
You have requested more than the allowable number of nodes. Modify
#SBATCH --nodes
- … sbatch: error: Batch job submission failed: More processors requested than permitted
You have requested more than the allowable number of CPUs. Modify
#SBATCH --cpus-per-task
or submit to an alternate partition that allows it.
- … sbatch: error: Batch job submission failed: Requested time limit is invalid (missing or exceeds some limit)
You have requested more than the allowable runtime for your job. Modify
#SBATCH --time
or submit to an alternate partition that allows it.
- … sbatch: error: Batch job submission failed: Invalid partition name specified
Self-explanatory
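The directives named in the errors above can be collected in a single batch script header. The following is a sketch with illustrative values only: the resource amounts are assumptions, not Mercury's actual limits, and <accountname> and <partition> are placeholders to replace before submitting.

```shell
#!/bin/bash
# Illustrative values only; they are assumptions, not Mercury's limits.
# <accountname> and <partition> are placeholders to replace before submitting.
#SBATCH --account=<accountname>
#SBATCH --partition=<partition>
#SBATCH --nodes=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=4G
#SBATCH --time=01:00:00

# Slurm sets these variables inside a job; the fallbacks apply elsewhere.
summary="job ${SLURM_JOB_ID:-none} using ${SLURM_CPUS_PER_TASK:-1} CPU(s)"
echo "$summary"
```

If a request exceeds what the chosen partition allows, sbatch rejects the script with one of the errors listed above.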
- … sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified
Self-explanatory. Make sure you have access to the account you specified. The following command lists the accounts associated with your user:
sacctmgr show association where user=<BoothID>
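The check above can be scripted. This is a minimal sketch, assuming that $USER matches your BoothID and using phd purely as an example account name. has_account is a hypothetical helper. The -n and -P options ask sacctmgr for header-free, parsable output.

```shell
#!/bin/bash
# has_account is a hypothetical helper: it reads account names on stdin,
# one per line, and succeeds if the requested account is among them.
has_account() {
    grep -qxF -- "$1"
}

# List the accounts associated with your user and look for the one you
# plan to use. Assumes $USER is the same as your BoothID.
if command -v sacctmgr >/dev/null 2>&1; then
    if sacctmgr -n -P show association where user="$USER" format=account |
        has_account "phd"; then
        echo "You can submit with --account=phd"
    else
        echo "No association found for account phd"
    fi
fi
```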
- … sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user’s size and/or time limits)
You cannot have more than 100 jobs, running and pending combined, in the queue. This limit is 500 for faculty-associated accounts. Also, make sure you have set your account type, e.g.
#SBATCH --account=phd
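To see how close you are to the job limit before submitting, you can count your queued jobs. The sketch below assumes the 100-job limit stated above, so set JOB_LIMIT to 500 for a faculty-associated account. jobs_remaining is a hypothetical helper. The -r option lists each job array element on its own line, on the assumption that array tasks count individually toward the limit.

```shell
#!/bin/bash
# Limit taken from the answer above; faculty-associated accounts allow 500.
JOB_LIMIT=100

# jobs_remaining is a hypothetical helper: given the number of jobs already
# in the queue, print how many more can be submitted (never below zero).
jobs_remaining() {
    remaining=$(( JOB_LIMIT - $1 ))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    echo "$remaining"
}

# Count your running and pending jobs; -h drops the header line.
if command -v squeue >/dev/null 2>&1; then
    queued=$(squeue -u "$USER" -h -r -t PENDING,RUNNING | wc -l)
    echo "$queued job(s) queued, $(jobs_remaining "$queued") submission(s) left"
fi
```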
- … slurmstepd: error: * JOB CANCELLED AT 2018-01-01T12:00:00 DUE TO TIME LIMIT *
The job has exceeded the wallclock limit. Modify
#SBATCH --time=
or submit to an alternate partition that allows longer jobs.
- … srun: error: Unable to create step for job: Memory required by task is not available
Make sure you are not using nested interactive sessions. Interactive jobs should only be launched from a login node (mfe01 or mfe02).
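One way to check for a nested session before calling srun is to test the SLURM_JOB_ID variable, which Slurm sets inside every job. This is a minimal sketch, and in_slurm_job is a hypothetical helper name.

```shell
#!/bin/bash
# in_slurm_job is a hypothetical helper. Slurm sets SLURM_JOB_ID inside
# every job, so a non-empty value means this shell is already in a job.
in_slurm_job() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_job; then
    echo "Already inside job $SLURM_JOB_ID; exit it before starting a new session."
else
    echo "Not inside a job; srun can be launched from a login node."
fi
```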