The ARCHER Service is now closed and has been superseded by ARCHER2.

ARCHER Best Practice Guide

Version 4.1 (January 2019)

Andrew Turner (EPCC) a.turner@epcc.ed.ac.uk
Xu Guo (EPCC) xguo@epcc.ed.ac.uk
Lilit Axner (KTH) lilit@kth.se
Mark Filipiak (EPCC) m.filipiak@epcc.ed.ac.uk

This guide covers the ARCHER hardware, serial and parallel code optimisation, profiling and debugging. It is continually updated with new content.

The Best Practice Guide is based on work by staff at EPCC and KTH (Sweden) as part of the PRACE (Partnership for Research and Advanced Computing in Europe) initiative.

Contents

1. Introduction

Description of the guide and useful links.

  • Useful Links

2. System Architecture and Configuration

Detailed description of the ARCHER hardware and system software. This includes an in-depth look at the Intel Xeon E5-2697 v2 (Ivy Bridge) architecture and memory layout.

  • Processor architecture
    • Vector-type instructions
  • Memory architecture
  • Available file systems
  • Operating system (CLE)

3. Programming Environment

Details on how to compile codes and how to use numerical libraries, MPI libraries and other parallel programming options.

  • Modules environment
  • Available compilers
    • Partitioned Global Address Space (PGAS)
  • Available (vendor optimised) numerical libraries
    • Math Kernel Library (MKL)
  • MPI
    • Maximum MPI_TAG value
  • OpenMP
    • Compiler flags
  • Shared Memory Access (SHMEM)

4. Job Submission System

Advanced use of the ARCHER batch system.

  • Multiple aprun commands in a single job
  • Job arrays
  • Job chaining

5. Performance analysis

How to use the performance analysis tools installed on the system.

  • Performance Analysis Tools
  • Cray Performance Analysis Tool (CrayPat)
    • Instrumenting a code with pat_build
    • Analysing profile results using pat_report
    • Using hardware counters
  • Cray Reveal
  • MAP Profiler (Arm Forge)
    • Create a MAP profiling MPI library
    • Re-compile the code, linking to the MAP MPI libraries
    • Run the code to collect profiling information
    • Visualise the profiling information
  • Tuning and Analysis Utilities (TAU)
  • Valgrind
    • Memory profiling with Massif
  • General hints for interpreting profiling results
    • Spotting load-imbalance

6. Tuning

Tips on how to optimise the serial and parallel performance of your code.

  • Optimisation summary
  • Serial (single-core) optimisation
    • Compiler optimisation flags
    • Using Libraries
    • Writing Optimal Serial Code
    • Cache Optimisation
  • Parallel optimisation
    • Load-imbalance
    • MPI Optimisation
    • Mapping tasks/threads onto cores
  • Advanced OpenMP usage
    • Environment variables
    • Intel OpenMP Affinity and Helper Thread
    • Compiler optimisations affecting numerical accuracy
  • Memory optimisation
    • Memory affinity
    • Memory allocation (malloc) tuning
    • Using huge pages
  • Intel Hyper Threading
    • Common Usage
    • Unpacked Nodes
    • Example Hyper Threading Job Script

7. Debugging

How to use the debugging tools installed on the system.

  • Available Debuggers
  • Cray ATP
    • ATP Example
  • STAT
    • STAT Example
  • DDT Debugger (Arm Forge)
    • Download and install the remote client
    • Compile the code for debugging
    • Setup the remote client to access ARCHER
    • Setup the debugger to submit jobs to ARCHER
    • Run your debugging session on your program
    • Finishing your debugging session
    • Using DDT directly on the compute nodes
    • Memory debugging of statically-linked programs

8. I/O on ARCHER

This section provides information on getting the best performance out of the parallel /work file systems on ARCHER when writing data, particularly using parallel I/O patterns.

  • Common I/O Patterns
    • Single file, single writer (Serial I/O)
    • File-per-process (FPP)
    • Single file, multiple writers without collective operations
    • Single Shared File with collective writes (SSF)
    • Schemes benchmarked
  • Lustre parallel file systems
    • Lustre basic concepts
    • Stripe Count
    • Stripe Size
    • ARCHER /work default stripe settings and OST counts
    • How can I find out which file system I am on?
  • Do I have a problem with my I/O?
    • What can I measure and tune?
  • Summary of performance advice
  • File Per Process (FPP)
  • Single Shared File (SSF)
    • MPI-IO
  • I/O Profiling

9. Tools

  • bbcp
  • Allinea Forge Client

Copyright © Design and Content 2013-2019 EPCC. All rights reserved.
