Horovod 0.23.0-foss-2021a-CUDA-11.3.1-TensorFlow-2.6.0

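Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. This build is installed against TensorFlow 2.6.0 and CUDA 11.3.1.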

Accessing Horovod 0.23.0-foss-2021a-CUDA-11.3.1-TensorFlow-2.6.0

To load the module for Horovod 0.23.0-foss-2021a-CUDA-11.3.1-TensorFlow-2.6.0 please use this command on the BEAR systems (BlueBEAR, BEARCloud VMs, and CaStLeS VMs):

module load Horovod/0.23.0-foss-2021a-CUDA-11.3.1-TensorFlow-2.6.0
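
Once the module is loaded, a short Python script can confirm that Horovod initialises and sees the expected workers. The following is a minimal sketch, not an official BEAR example: the function names `scale_learning_rate` and `report_horovod_setup` are hypothetical, and the linear learning-rate scaling is a common heuristic rather than a requirement. It uses only the `hvd.init()`, `hvd.rank()`, `hvd.size()` and `hvd.local_rank()` calls from `horovod.tensorflow`.

```python
def scale_learning_rate(base_lr: float, world_size: int) -> float:
    """Linear scaling heuristic: multiply the base rate by the worker count."""
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    return base_lr * world_size


def report_horovod_setup(base_lr: float = 0.001) -> None:
    """Initialise Horovod, pin one GPU per process, and print the layout."""
    # Imported here so the helper above works even without the module loaded.
    import tensorflow as tf
    import horovod.tensorflow as hvd

    hvd.init()

    # Pin each process to a single GPU, chosen by its rank on the local node.
    gpus = tf.config.experimental.list_physical_devices("GPU")
    if gpus and hvd.local_rank() < len(gpus):
        tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], "GPU")

    lr = scale_learning_rate(base_lr, hvd.size())
    print(
        f"rank {hvd.rank()} of {hvd.size()} "
        f"(local rank {hvd.local_rank()}), scaled learning rate {lr}"
    )
```

Assuming the code is saved as `check_hvd.py` (a hypothetical filename) with a final line calling `report_horovod_setup()`, it could be launched with Horovod's own launcher, for example `horovodrun -np 2 python check_hvd.py`. The appropriate launch method inside a batch job may differ, so check the BlueBEAR Job Submission page.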

BEAR Apps Version

2021a

Architectures

EL8-haswell (GPUs: NVIDIA P100) — EL8-icelake (GPUs: NVIDIA A100, NVIDIA A30)

The listed architectures consist of two parts: OS-CPU.

  • BlueBEAR: The OS used on BlueBEAR is represented by EL and there are several different processor (CPU) types available on BlueBEAR. More information about the processor types on BlueBEAR is available on the BlueBEAR Job Submission page.
  • BEAR and CaStLeS Cloud VMs: These VMs can have one of two OSes. Those with access to a BEAR Cloud or CaStLeS VM should check that the listed architectures for an application include the OS of the VM being used. The VMs, irrespective of OS, will use the haswell CPU type.
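
The OS-CPU naming can be illustrated with a small sketch that splits an architecture label into its two parts. The `Architecture` type and `parse_architecture` function are hypothetical names introduced here for illustration. They are not part of any BEAR tooling.

```python
from typing import NamedTuple


class Architecture(NamedTuple):
    os_name: str  # e.g. "EL8"
    cpu: str      # e.g. "haswell"


def parse_architecture(label: str) -> Architecture:
    """Split an 'OS-CPU' label at the first hyphen."""
    os_name, sep, cpu = label.partition("-")
    if not sep or not os_name or not cpu:
        raise ValueError(f"expected 'OS-CPU', got {label!r}")
    return Architecture(os_name, cpu)


# The two architectures listed for this module.
for label in ("EL8-haswell", "EL8-icelake"):
    arch = parse_architecture(label)
    print(f"{label}: OS={arch.os_name}, CPU={arch.cpu}")
```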

Extensions

  • cloudpickle 2.0.0
  • horovod 0.23.0
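
To check that the Python environment provides these extension versions, something like the following sketch could be run after loading the module. It relies on the standard library's `importlib.metadata` (Python 3.8 or later). The helper names `installed_version` and `compare_versions` are hypothetical, and the sketch assumes the distribution names match the extension names listed above.

```python
from importlib import metadata
from typing import Dict, Optional

# Versions listed on this page for the module's extensions.
EXPECTED = {"cloudpickle": "2.0.0", "horovod": "0.23.0"}


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each expected distribution name to the version actually found."""
    return {name: installed_version(name) for name in expected}


for name, found in compare_versions(EXPECTED).items():
    status = "ok" if found == EXPECTED[name] else "mismatch or missing"
    print(f"{name}: expected {EXPECTED[name]}, found {found} ({status})")
```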

More Information

For more information visit the Horovod website.

Dependencies

This version of Horovod has a direct dependency on:

  • CUDA/11.3.1
  • foss/2021a
  • NCCL/2.10.3-GCCcore-10.3.0-CUDA-11.3.1
  • Python/3.9.5-GCCcore-10.3.0
  • PyYAML/5.4.1-GCCcore-10.3.0
  • TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1
  • UCX-CUDA/1.10.0-GCCcore-10.3.0-CUDA-11.3.1

Other Versions

These versions of Horovod are available on the BEAR systems (BlueBEAR, BEARCloud VMs, and CaStLeS VMs). These will be retained in accordance with our Applications Support and Retention Policy.

Last modified on 17th January 2022