Frequently Asked Questions

… How do I access a compute node?

To start an interactive session on a compute node, run the following from a login node: srun --account=<accountname> --pty bash --login

… What account should I use when requesting resources?

Faculty members should use --account=faculty. Staff and collaborators should use the PI-associated account linked to their user account, for example --account=pi-<ID>. PhD students may use either --account=phd or a PI-associated account, depending on whether the research is independent or associated with a PI.
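The rules above can be sketched as a small shell function. This is only an illustration: account_flag is a hypothetical helper, not something installed on Mercury, and the role names it accepts are assumptions made for the example.

```shell
# Hypothetical helper (not part of Mercury): choose an --account value by role.
# Usage: account_flag ROLE [PI_ID]
account_flag() {
  local role="$1" pi_id="${2:-}"
  case "$role" in
    faculty)
      echo "--account=faculty" ;;
    staff|collaborator)
      # Staff and collaborators always need a PI-associated account.
      if [ -z "$pi_id" ]; then
        echo "account_flag: $role requires a PI ID" >&2
        return 1
      fi
      echo "--account=pi-$pi_id" ;;
    phd)
      # PI-associated research uses the PI account; independent research uses phd.
      if [ -n "$pi_id" ]; then
        echo "--account=pi-$pi_id"
      else
        echo "--account=phd"
      fi ;;
    *)
      echo "account_flag: unknown role '$role'" >&2
      return 1 ;;
  esac
}
```

For example, a PhD student doing independent research could start an interactive session with srun $(account_flag phd) --pty bash --login.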

… My job has been in the queue for a long time. Why is my job not running?

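Mercury may be heavily loaded, in which case your job is waiting for resources. To check the status of your job, view the output of squeue and note the “State” column, which gives the reason a pending job is waiting (in default squeue output this reason appears in parentheses under NODELIST(REASON)).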

State=(Resources)

There are currently no resources available to run your job.

State=(Priority)

Other jobs with higher priority are ahead of your job in the queue.

State=(QOSGrpBilling)

Your other jobs have reached the concurrent resource limit. This job will become eligible to run as your other jobs finish.
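As a rough sketch, the three reasons above can be looked up for all of your pending jobs at once. explain_reason is a hypothetical helper written for this example, and its wording is a paraphrase of the descriptions above. The squeue options used (-u, -h, -t, and the %i and %r format fields) are standard Slurm, but the loop only runs where squeue is installed.

```shell
# Hypothetical helper: translate a pending reason from squeue into plain language.
# Only the three reasons described above are covered.
explain_reason() {
  local reason="$1"
  # Accept the value with or without surrounding parentheses.
  reason="${reason#\(}"
  reason="${reason%\)}"
  case "$reason" in
    Resources)     echo "waiting for resources to become free" ;;
    Priority)      echo "queued behind higher-priority jobs" ;;
    QOSGrpBilling) echo "held until your other jobs finish" ;;
    *)             echo "see the squeue documentation for '$reason'" ;;
  esac
}

# Print the job ID and an explanation for each of your pending jobs (needs Slurm).
if command -v squeue >/dev/null 2>&1; then
  squeue -u "$USER" -h -t PENDING -o "%i %r" |
    while read -r job_id reason; do
      echo "$job_id: $(explain_reason "$reason")"
    done
fi
```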

… How do I use a Python virtual environment in Jupyter?

You must first register the virtual environment as a Jupyter kernel. Activate the virtual environment, make sure the ipykernel package is installed in it, then issue the command: ipython kernel install --user --name <kernel-name>
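The registration steps might be wrapped up as follows. This is a minimal sketch: register_kernel and the DRY_RUN switch are hypothetical names invented for the example, and it assumes the virtual environment already exists and has ipython and ipykernel installed.

```shell
# Hypothetical helper: register an existing virtual environment as a Jupyter kernel.
# Usage: register_kernel VENV_DIR KERNEL_NAME
# Set DRY_RUN=1 to print the registration command instead of running it.
register_kernel() {
  local venv_dir="$1" kernel_name="$2"
  if [ ! -f "$venv_dir/bin/activate" ]; then
    echo "register_kernel: no virtual environment at $venv_dir" >&2
    return 1
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "ipython kernel install --user --name $kernel_name"
    return 0
  fi
  # Activate in a subshell so the caller's environment is left unchanged.
  (
    . "$venv_dir/bin/activate" &&
      ipython kernel install --user --name "$kernel_name"
  )
}
```

For example, register_kernel ~/venvs/myproject myproject (both names are placeholders) would register the environment under the kernel name myproject, which you can then select when you open a notebook.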

Error Messages

… sbatch: error: Batch job submission failed: Memory required by task is not available

You have requested more memory than is allowed. Lower the value of #SBATCH --mem

… sbatch: error: Batch job submission failed: Node count specification invalid

You have requested more than the allowable number of nodes. Modify #SBATCH --nodes

… sbatch: error: Batch job submission failed: More processors requested than permitted

You have requested more than the allowable number of CPUs. Modify #SBATCH --cpus-per-task or submit to an alternate partition that allows it.

… sbatch: error: Batch job submission failed: Requested time limit is invalid (missing or exceeds some limit)

You have requested more than the allowable runtime for your job. Modify #SBATCH --time or submit to an alternate partition that allows it.
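For reference, the directives named in the four errors above sit together at the top of a batch script, as in the sketch below. Every value shown is an illustrative assumption, not a Mercury limit; adjust each one to what your partition allows.

```shell
#!/bin/bash
# Illustrative values only: the actual limits depend on the partition.
#SBATCH --account=phd
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=01:00:00

# The job's commands start here; Slurm sets SLURM_JOB_ID when the job runs.
echo "Job ${SLURM_JOB_ID:-unknown} running on ${HOSTNAME:-unknown}"
```

Save the script under a name of your choice (job.sh here is a placeholder) and submit it with sbatch job.sh. If one of the errors above appears, lower the corresponding directive and resubmit.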

… sbatch: error: Batch job submission failed: Invalid partition name specified

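The partition named in #SBATCH --partition does not exist. Check the spelling; the sinfo command lists the available partitions.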

… sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified

Make sure you have access to the account you specified. To list the accounts associated with your user, run: sacctmgr show association where user=<BoothID>

… sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user’s size and/or time limits)

You cannot have more than 100 jobs in the queue, counting running and pending jobs together. This limit is 500 for faculty-associated accounts. Also, make sure you have set your account type, e.g. #SBATCH --account=phd
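Before submitting a large batch of jobs, you can check how close you are to the limit. The sketch below is hedged: can_submit and JOB_LIMIT are hypothetical names, and it assumes each line of squeue output counts as one job toward the limit.

```shell
# Combined running + pending limit from the text above (500 for faculty-associated accounts).
JOB_LIMIT=100

# Hypothetical helper: succeeds if one more job still fits under the limit.
# Usage: can_submit CURRENT_COUNT [LIMIT]
can_submit() {
  local current="$1" limit="${2:-$JOB_LIMIT}"
  [ "$current" -lt "$limit" ]
}

# Count your running and pending jobs (only where Slurm is installed).
if command -v squeue >/dev/null 2>&1; then
  current=$(squeue -u "$USER" -h -t PENDING,RUNNING | wc -l | tr -d ' ')
  if can_submit "$current"; then
    echo "$current jobs in queue: OK to submit."
  else
    echo "$current jobs in queue: at the limit of $JOB_LIMIT, wait for jobs to finish."
  fi
fi
```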

… slurmstepd: error: *** JOB CANCELLED AT 2018-01-01T12:00:00 DUE TO TIME LIMIT ***

The job has exceeded its wallclock limit. Increase #SBATCH --time= or submit to an alternate partition that allows longer jobs.

… srun: error: Unable to create step for job: Memory required by task is not available

Make sure you are not using nested interactive sessions. Interactive jobs should only be launched from a login node (mfe01 or mfe02).
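One way to tell whether you are already inside a session is to look for the SLURM_JOB_ID environment variable, which Slurm sets inside every job, including interactive ones. The function name in_slurm_job below is a hypothetical helper for this sketch.

```shell
# Succeeds when the current shell is already running inside a Slurm job.
in_slurm_job() {
  [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_job; then
  echo "Already inside job $SLURM_JOB_ID: exit to a login node before running srun."
else
  echo "Not inside a Slurm job: OK to start an interactive session."
fi
```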