
The ARCHER Service is now closed and has been superseded by ARCHER2.

Optimising MPI Rank Placement at Runtime

Each generation of HPC systems in the multi-core era has delivered nodes with ever larger numbers of cores. A brief review of Cray architectures from HECToR to ARCHER shows a move from two to potentially forty-eight CPUs per node.

In the majority of applications, this results in more than one MPI rank running on the same node (even with hybrid MPI/OpenMP applications), and the ordering of the ranks on the nodes is left to the system default. However, the ordering of ranks on the Cray XC30 nodes can be changed easily, with no changes to the application itself. This allows users to take advantage of the memory shared between ranks on the same node and exploit it to improve overall application performance.
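To make the idea of rank ordering concrete, the following is a minimal sketch in plain Python of three placement schemes. It assumes the schemes that Cray MPICH is understood to select through the `MPICH_RANK_REORDER_METHOD` environment variable (round-robin, SMP-style and folded); that variable is not named in the text above, and the function names here are hypothetical, chosen only for illustration. Each function returns the node index assigned to every rank.

```python
def round_robin(n_ranks, n_nodes):
    """Cyclic placement: rank r is placed on node r mod n_nodes."""
    return [r % n_nodes for r in range(n_ranks)]


def smp_style(n_ranks, ranks_per_node):
    """Block placement: consecutive ranks fill one node before the next."""
    return [r // ranks_per_node for r in range(n_ranks)]


def folded(n_ranks, n_nodes):
    """Ranks sweep forward across the nodes, then backward, and so on."""
    placement = []
    for r in range(n_ranks):
        sweep, pos = divmod(r, n_nodes)
        # Even sweeps run left to right, odd sweeps run right to left.
        placement.append(pos if sweep % 2 == 0 else n_nodes - 1 - pos)
    return placement


if __name__ == "__main__":
    # Illustrative case: 8 ranks on 4 nodes with 2 ranks per node.
    print("round-robin:", round_robin(8, 4))
    print("SMP-style:  ", smp_style(8, 2))
    print("folded:     ", folded(8, 4))
```

In this model, ranks 0 and 1 share a node only under the SMP-style scheme, so which ranks can communicate through shared memory depends directly on the placement chosen.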

Codes that use a predictable communication pattern can reorder the layout of ranks on nodes to maximise the amount of intra-node communication via the shared memory system, which has a higher bandwidth and lower latency than inter-node communication over the network. This can result in significantly improved communication (and thus application) performance and can be done entirely through environment variables and a single additional input file. In many cases the Cray Performance Analysis Toolkit (CrayPAT) can detect communication patterns and provide an optimised layout for future runs. This tech-forum aims to cover:

  • How awareness of the properties of multi-core nodes can be exploited to improve application performance, even for non-threaded or hybrid applications.
  • How to use the Cray-supplied rank-reordering techniques via environment variables.
  • How to generate customised rank reorder mappings for Cartesian communication patterns.
  • How to use CrayPAT to automatically detect and optimise your communication patterns.
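The benefit of a customised mapping for a Cartesian pattern can be illustrated with a small model. The sketch below is not the Cray tooling itself: it assumes a non-periodic 2D process grid with row-major rank numbering (the convention used by MPI Cartesian topologies), nearest-neighbour communication, and grid dimensions that divide evenly into tiles. All function names are hypothetical. It counts how many neighbouring pairs end up on different nodes for the default consecutive ordering and for an ordering that packs a square tile of the grid onto each node.

```python
def blocked_order(px, py, bx, by):
    """List ranks so that each run of bx*by ranks is one bx-by-by tile.

    Assumes px is divisible by bx and py is divisible by by.
    """
    order = []
    for tx in range(0, px, bx):
        for ty in range(0, py, by):
            for x in range(tx, tx + bx):
                for y in range(ty, ty + by):
                    order.append(x * py + y)  # row-major rank number
    return order


def node_map(order, ranks_per_node):
    """Map each rank to a node, filling nodes in the sequence given by order."""
    node_of = [0] * len(order)
    for slot, rank in enumerate(order):
        node_of[rank] = slot // ranks_per_node
    return node_of


def off_node_pairs(px, py, node_of):
    """Count nearest-neighbour pairs whose two ranks sit on different nodes."""
    count = 0
    for x in range(px):
        for y in range(py):
            r = x * py + y
            if x + 1 < px and node_of[r] != node_of[(x + 1) * py + y]:
                count += 1
            if y + 1 < py and node_of[r] != node_of[x * py + y + 1]:
                count += 1
    return count


if __name__ == "__main__":
    # Illustrative case: 8x8 process grid, 16 ranks per node (4 nodes).
    default = list(range(64))
    tiled = blocked_order(8, 8, 4, 4)
    print("default off-node pairs:", off_node_pairs(8, 8, node_map(default, 16)))
    print("tiled off-node pairs:  ", off_node_pairs(8, 8, node_map(tiled, 16)))
    # A comma-separated rank list is the form assumed here for a custom order file.
    print(",".join(str(r) for r in tiled))
```

For this illustrative case the default ordering leaves 24 neighbour pairs communicating across the network, while the 4x4 tiling leaves 16. The final line prints the tiled ordering as a comma-separated list, which is the form assumed here for the single additional input file (understood to be named `MPICH_RANK_ORDER` in Cray MPICH and read when `MPICH_RANK_REORDER_METHOD` is set to 3); check the local MPI documentation before relying on that format.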

Copyright © Design and Content 2013-2019 EPCC. All rights reserved.
