Focus demo 5 (GPU)
GPU implementations for core astronomical libraries
Manipulating and processing massive data sets is challenging. In astrophysics, as in most research communities, the conventional approach is to break the data up and analyze it in smaller chunks, an inefficient and error-prone process. The problem is exacerbated on GPUs, where available memory is even more limited.
Popular solutions for distributing NumPy/SciPy computations are based on task parallelism, which introduces significant runtime overhead, complicates implementation, and often restricts GPU support to specific vendors.
In this demo, I want to show you an alternative based on data parallelism. The open-source library Heat [1, 2] builds on PyTorch and mpi4py to simplify the migration of NumPy/SciPy-based code to GPUs (CUDA, ROCm), including multi-GPU, multi-node clusters. Under the hood, Heat distributes massive, memory-intensive operations across multi-node resources via MPI communication. From a user's perspective, its NumPy-like API lets Heat plug seamlessly into the Python array ecosystem.
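To give a sense of the programming model, here is a minimal sketch (the array sizes and the `device="gpu"` placement are illustrative, and a CUDA- or ROCm-enabled PyTorch build is assumed):

```python
import heat as ht

# Create a distributed array: rows are partitioned across MPI processes
# (split=0), and each process keeps its local chunk on its own GPU.
# device="gpu" assumes a CUDA/ROCm-enabled PyTorch; use "cpu" otherwise.
x = ht.random.randn(100_000, 1_000, split=0, device="gpu")

# NumPy-style call; the reduction runs distributed under the hood.
col_means = ht.mean(x, axis=0)
```

Launched with, say, `mpirun -n 4 python script.py`, the same code transparently spreads the array over four processes/GPUs; without an MPI launcher it simply runs as a single process.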
I'll show you some practical examples (sketched in code after the list):
- distributed (multi-GPU) I/O from shared memory
- accelerating memory-intensive operations in existing code (e.g. matrix multiplication)
- Heat as a backend for your pipeline: array manipulations, statistics, signal processing, machine learning...
- prototype on your laptop, run on a cluster
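A rough sketch of what the first three items look like in Heat; the HDF5 file and dataset names below are hypothetical placeholders:

```python
import heat as ht

# Parallel I/O: each MPI process reads only its own slice of the
# dataset. File and dataset names are hypothetical.
X = ht.load("observations.h5", dataset="data", split=0, device="gpu")

# Memory-intensive operation, distributed: no single GPU ever holds
# the full operands at once.
G = X.T @ X  # Gram matrix via Heat's distributed matmul

# Drop-in statistics through the NumPy-like API.
mu = ht.mean(X, axis=0)
sigma = ht.std(X, axis=0)
```

The last bullet falls out of the same script: prototype it on a laptop without an MPI launcher (and with `device="cpu"`), then launch the identical file on a cluster with e.g. `mpirun -n 16 python pipeline.py`.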
I'll also touch upon Heat's implementation roadmap, particularly regarding signal and image processing, and potential avenues for collaboration.
[1] https://github.com/helmholtz-analytics/heat
[2] M. Götz et al., "HeAT – a Distributed and GPU-accelerated Tensor Framework for Data Analytics," 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 2020, pp. 276-287, doi: 10.1109/BigData50022.2020.9378050.