AWS OFI NCCL v1.12.0

@AmedeoSapio released this 08 Oct 01:41
v1.12.0-aws

This release is intended only for use on AWS P* instances. A general release that supports other Libfabric networks will be made in the near future. This release requires Libfabric v1.18.0 or later and supports NCCL 2.23.4-1 while maintaining backward compatibility with older NCCL versions (NCCL v2.17.1 and later).
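
The version requirements above can be checked mechanically before installing. The following is a minimal sketch, not part of the plugin: the helper names `parse_version` and `meets_minimum` are hypothetical, and how the installed Libfabric and NCCL version strings are obtained is left to the caller.

```python
# Minimum versions taken from the release notes above.
MIN_LIBFABRIC = (1, 18, 0)
MIN_NCCL = (2, 17, 1)


def parse_version(text):
    """Turn 'v1.18.0' or '2.23.4-1' into a 3-tuple of ints.

    Hypothetical helper: a leading 'v' and anything after the first '-'
    (such as the '-1' packaging suffix) are ignored.
    """
    core = text.strip().lstrip("v").split("-", 1)[0]
    parts = tuple(int(piece) for piece in core.split("."))
    # Pad short versions such as '1.18' so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def meets_minimum(found, minimum):
    """Return True if the version string `found` is at least `minimum`."""
    return parse_version(found) >= minimum
```

For example, `meets_minimum("2.23.4-1", MIN_NCCL)` and `meets_minimum("v1.18.0", MIN_LIBFABRIC)` both hold, while a Libfabric version string of `"1.17.2"` is rejected.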

New Features:

  • Support for tuner v3 APIs
  • Support for AllGather and ReduceScatter in the tuner
  • Support for PAT algorithm in the tuner

Bug fixes:

  • Fixed NULL pointer access in the endpoint per communicator path
  • Replaced the NVLSTree option in the tuner with RING if nRanks==nNodes
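
The second fix can be read as a simple substitution rule. The sketch below only restates that rule for illustration. It is not the plugin's code, and the function name, argument names, and string labels are assumptions.

```python
def adjust_algorithm(algorithm, n_ranks, n_nodes):
    """Illustrative restatement of the rule in the release notes.

    Assumption: algorithms are represented here as plain strings; the
    real tuner uses its own internal representation.
    """
    if algorithm == "NVLSTree" and n_ranks == n_nodes:
        # nRanks == nNodes: the NVLSTree option is replaced with Ring.
        return "Ring"
    # Every other combination is left unchanged.
    return algorithm
```

Under this reading, a job with 8 ranks on 8 nodes that would have been given NVLSTree gets Ring instead, while 64 ranks on 8 nodes keeps NVLSTree.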

The plugin has been tested with the following Libfabric providers, using the tests bundled in the source code and the nccl-tests suite:

  • efa

Checksum (sha512) for the release tarball:

7d9e41ce04253a32a13542e7f4c2d20c2a5a43cdfb575fe153954c5faed8cf85eb08dab76ee0f883109f7610bb43cb8b703fe2f1e98b8f02bbfa866dd1c268e1  aws-ofi-nccl-1.12.0-aws.tar.gz
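
One way to compare a downloaded tarball against the checksum above is sketched here using Python's standard `hashlib` module. The function names and the assumption that the tarball sits at a local path of your choosing are illustrative only.

```python
import hashlib

# SHA-512 digest published in the release notes above, split in two halves
# for readability.
EXPECTED_SHA512 = (
    "7d9e41ce04253a32a13542e7f4c2d20c2a5a43cdfb575fe153954c5faed8cf85"
    "eb08dab76ee0f883109f7610bb43cb8b703fe2f1e98b8f02bbfa866dd1c268e1"
)


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`.

    The file is read in chunks so that large tarballs are not loaded
    into memory all at once.
    """
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tarball_matches(path, expected=EXPECTED_SHA512):
    """Return True if the file's digest equals the expected digest."""
    return sha512_of(path) == expected.lower()
```

Calling `tarball_matches("aws-ofi-nccl-1.12.0-aws.tar.gz")` from the download directory returns `True` only when the file's digest equals the published one; a `False` result suggests a corrupted or altered download.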