Why is my job not starting?
There is no single answer to this question, because a job can be held back for many reasons.
Some of them are:
- There are jobs with higher priority ahead of you. Newly submitted jobs that have a higher priority than yours are placed ahead of it in the queue.
- Even if you see idle nodes with sinfo, these might have been reserved by the scheduler so that a large job can start.
- You may have requested an impossible combination of resources (node type/memory). In this case your job will never start.
- Your job requested more time than the QoS allows (check the available QoS: VSC-5 QoS/VSC-4 QoS). In this case it will never start.
- You can check the meaning of the reason codes displayed by squeue here.
Warning
If your job has been in the queue for months, check that you requested resources appropriately.
- Nodes might be reserved. Compute nodes may have been set aside for a particular event, such as a training or a course.
- The cluster is down!
- You have run out of your allocated time.
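Most of the causes listed above can be checked from a login node with standard Slurm commands. The following is a minimal sketch, not an official tool: `explain_reason` and `why_pending` are hypothetical helper names, the reason codes shown are only a small subset of those Slurm can report, and which limits are actually enforced depends on the cluster configuration.

```shell
# Hypothetical helper names; not part of Slurm or the cluster tooling.

# Translate a few common squeue reason codes into plain language.
explain_reason() {
    case "$1" in
        Priority)
            echo "higher-priority jobs are ahead of yours in the queue" ;;
        Resources)
            echo "the job is waiting for the requested resources to become free" ;;
        QOSMaxWallDurationPerJobLimit)
            echo "the requested time exceeds the QoS limit" ;;
        ReqNodeNotAvail)
            echo "a required node is unavailable (for example down, drained or reserved)" ;;
        *)
            echo "see the Slurm documentation for reason code: $1" ;;
    esac
}

# Print the usual diagnostics for your pending jobs.
why_pending() {
    if ! command -v squeue >/dev/null 2>&1; then
        echo "squeue not found; run this on a login node" >&2
        return 1
    fi
    # Job ID, partition, QoS, state, time limit and reason code
    squeue -u "$USER" -t PENDING -o "%.12i %.10P %.20q %.8T %.11l %r"
    # Expected start times, where the scheduler has estimated them
    squeue -u "$USER" --start
    # Priority factors of your pending jobs
    sprio -u "$USER"
    # Reservations, for example for courses or maintenance
    scontrol show reservation
    # Maximum wall time per QoS
    sacctmgr show qos format=Name,MaxWall
}
```

After sourcing the snippet, run `why_pending` and pass the value from the last column of the first table to `explain_reason`, for example `explain_reason Priority`.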
Tip
Request resources efficiently. Do not ask for more nodes or time than you need. Larger and longer jobs wait longer in the queue. If possible, run tests to estimate the required resources. Use the devel QoS!
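Once test runs have given you an estimate of how long the full job takes, you can turn that estimate into a time request with a small margin instead of asking for the maximum. The sketch below is one way to do this: `suggest_time` is a hypothetical helper, and the 20% safety margin is an arbitrary assumption that you should adapt to how variable your run times are.

```shell
# Hypothetical helper: pad an estimated run time (in seconds) by 20%
# and print it in Slurm's days-hours:minutes:seconds format.
suggest_time() {
    elapsed=$1                          # estimated run time in seconds
    padded=$(( elapsed * 120 / 100 ))   # add a 20% safety margin (assumption)
    days=$(( padded / 86400 ))
    hours=$(( padded % 86400 / 3600 ))
    mins=$(( padded % 3600 / 60 ))
    secs=$(( padded % 60 ))
    printf '%d-%02d:%02d:%02d\n' "$days" "$hours" "$mins" "$secs"
}
```

The elapsed time of a finished test job can be read with `sacct -j <jobid> --format=JobID,Elapsed,MaxRSS,State`. The result can then be passed on submission, for example `sbatch --time="$(suggest_time 3000)" job.sh`, where `job.sh` stands for your own job script.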