The ARCHER Service is now closed and has been superseded by ARCHER2.

Contact Us

support@archer.ac.uk

Troubleshooting Guide

Commonly occurring errors

| Type | Error | Notes | Solution |
|---|---|---|---|
| BATCH SYSTEM | apsched: request exceeds max nodes, alloc | aprun is being used on a login node. | Use aprun within a PBS job script submitted to the compute nodes using qsub. |
| BATCH SYSTEM | apsched: claim exceeds reservation's node-count | The number of nodes required for an aprun command within a PBS job script is larger than the number of nodes requested with the -l select option. | Change the number of nodes requested in the -l select option to match the number required for the aprun command. If the aprun option -n alone is used, the number of nodes required is the number of processes divided by 24, rounded up. If other options are used to change the number of processes per node (e.g., -N, -S, -j), the calculation is less straightforward: please see the aprun section in the User Guide and the aprun man page. |
| BATCH SYSTEM | qsub: Archer: Please use the select/place resource selection language | Encountered if the mppwidth/mppnppn combination is used in the job script instead of -l select=[nodes]. | Use -l select=[nodes] and remove the mppwidth/mppnppn combination, as described in the ARCHER User Guide. |
| BATCH SYSTEM | qsub: request rejected as filter hook 'update_user_environment' encountered an exception. Please inform Admin. | Encountered if there is no select statement in the job script. | Use -l select=[nodes] as described in the ARCHER User Guide. |
| BATCH SYSTEM | The command 'qstat -f [job id]' shows the information 'comment = Not Running: Insufficient amount of resource vntype (cray_compute != )' | Encountered when the system is full. There are not enough free nodes for the job to run. | The job will run when enough resources become free. |
| BATCH SYSTEM | The command 'qstat -f [job id]' shows the information 'comment = Not Running: Insufficient amount of resource ncpus (R: 264 A: 234 T: 118320)' (the values for R, A and T will vary) | Encountered when the system is full. There are not enough free nodes for the job to run. | The job will run when enough resources become free. |
| BATCH SYSTEM | Jobid 0000.sdb - will not start (comment = Not Running: Host set host=archer_3071 has too few free resources) | Encountered when the system is full. There are not enough free nodes for the job to run. | The job will run when enough resources become free. |
| BATCH SYSTEM | Jobid 00000.sdb (user auser) - will not start (comment = Not Running: PBS Error: ARCHER: User auser is not in XXXXXX) | Seen when account auser is not a member of project group XXXXXX but is trying to access that budget via the #PBS -A directive in the job submission script. | Check that your ARCHER account auser belongs to the correct project group on resource pool XC (ARCHER). Do this via SAFE. Note: project group membership is not automatically inherited from HECToR. |
| BATCH SYSTEM | Jobid 00000.sdb (user auser) - will not start (comment = Not Running: PBS Error: budget XXXXX does not have enough resource) | Seen when budget XXXXX has been used up since the job was submitted. | Ask the PI to add additional time resource to budget XXXXX, then release the job from the held state using the command qrls -h u <job ID> |
| BATCH SYSTEM | [NID 00600] 2013-11-22 16:26:43 Exec my_app.x failed: chdir /home3/x01/x01/user/dir No such file or directory | Seen when trying to run a job on the compute nodes from the /home filesystem. | Resubmit the job using the /work filesystem. (/home is not accessible on the compute nodes.) |
| LOGINS AND PASSWORDS | Authentication token manipulation error. | Seen when trying to change a password manually on the command line more than once per day. | Request the change via SAFE instead. |
| COMPILING | Illegal Instruction | Seen when running code on the postprocessing/serial nodes that has been compiled for the compute nodes. | Recompile for the postprocessing/serial nodes; see "Compiling for Postprocessing/Serial Nodes". |
| LOGIN NODES | Process terminates unexpectedly | Processes running on the login nodes that consume more than 10 minutes of CPU time may be killed to prevent overloading of the nodes. | Submit a job to the serial queue; see "Postprocessing/Serial Jobs". |
| LOGIN NODES | /usr/bin/xauth: error in locking authority file /home/... | The most likely cause is /home usage exceeding quota, meaning files such as .Xauthority cannot be saved in your home space. | Check disk usage within the /home file system against quota with SAFE, then delete files or request an increased quota from the PI. |
| COMPILING | forrtl: severe (168): Program Exception - illegal instruction | Seen when running code that has been compiled with the Intel Fortran compiler with the -real-size 64 or -r8 options. The compiler produces AVX2 instructions, which are not provided by the processors in the compute nodes. | Recompile with the -xAVX option; see "Useful compiler options". |
| COMPILING | Please verify that both the operating system and the processor support Intel(R) F16C instructions. | Runtime error seen when running serial code in job scripts without using "aprun". The job launcher nodes have Sandy Bridge processors, so they do not support some instructions produced for Ivy Bridge processors. | Run the executable on the compute nodes by prefixing it with "aprun". If this is a serial application then use "aprun -n 1"; otherwise specify the number of parallel tasks using the "-n" option. |
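
The node-count rule given for the "claim exceeds reservation's node-count" error (processes divided by 24, rounded up, when only the aprun option -n is used) can be sketched as a small shell helper. This is an illustrative sketch, not part of the ARCHER tooling: the function names nodes_for and select_and_aprun are hypothetical, and the calculation does not apply once options such as -N, -S or -j change the number of processes per node.

```shell
# Illustrative sketch only. The helper names are hypothetical; the figure of
# 24 processes per node and the "-l select" / "aprun -n" options are taken
# from the table above.

CORES_PER_NODE=24

# Nodes needed when only "aprun -n <processes>" is used:
# processes divided by 24, rounded up (integer ceiling division).
nodes_for() {
    processes=$1
    echo $(( (processes + CORES_PER_NODE - 1) / CORES_PER_NODE ))
}

# Print a matching "-l select" directive and aprun line for a job script,
# so that the node request and the aprun claim stay consistent.
select_and_aprun() {
    processes=$1
    executable=$2
    echo "#PBS -l select=$(nodes_for "$processes")"
    echo "aprun -n ${processes} ${executable}"
}

# Example: 100 processes need 5 nodes (100 / 24 = 4.17, rounded up).
select_and_aprun 100 ./my_app.x
```

Keeping both lines derived from the same process count is one way to avoid the mismatch that triggers the error; the executable name ./my_app.x is a placeholder.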

Copyright © Design and Content 2013-2019 EPCC. All rights reserved.
