the two github links in the description are broken, is there a chance to update those?
@badhatharry4323 - thanks for pointing this out. The links have been fixed.
See fluidnumerics.github.io/gpu-programming/codelabs/build-a-gpu-app-openmp-fortran/#0 for the codelab
Thanks for the video! I'm curious whether there will be a performance difference compared to a native CUDA Fortran implementation of the smoothing kernel. Also, does it make sense to parallelize a hydrodynamic code (an atmosphere model, for example) in this OpenMP "offload" manner?
It's possible that a kernel-based approach, like CUDA Fortran (or HIP + HIPFort), will give you different performance. Kernel-based approaches to GPU programming give you more direct control over the operations each thread executes. With directive-based approaches, like OpenMP or OpenACC, you hint to the compiler what needs to be done, and the performance is highly dependent on the compiler. The nice thing about directive-based approaches is that you can easily start offloading your code to the GPU. For large hydrodynamic codes, this can be a good way to get started. Then, use your profiler to identify the hot spots that would benefit from hand-written kernels.
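To illustrate what "hinting to the compiler" looks like, here is a minimal sketch (not taken from the codelab; the program name, array names, and sizes are assumptions) of offloading a three-point smoothing loop in Fortran with OpenMP target directives:

```fortran
! Minimal sketch of OpenMP GPU offload in Fortran (assumed names/sizes).
! Requires an offload-capable compiler, e.g.
!   gfortran -fopenmp -foffload=nvptx-none smooth.f90
! With no device present, the loop falls back to host execution.
program smooth_demo
  implicit none
  integer, parameter :: n = 1024
  real :: f(n), fsmooth(n)
  integer :: i

  f = 1.0
  fsmooth = f  ! boundary points keep their input values

  ! Map the arrays to the device and distribute the loop
  ! iterations across GPU teams and threads.
  !$omp target teams distribute parallel do map(to: f) map(tofrom: fsmooth)
  do i = 2, n - 1
     fsmooth(i) = (f(i-1) + f(i) + f(i+1)) / 3.0
  end do
  !$omp end target teams distribute parallel do
end program smooth_demo
```

The single `!$omp target teams distribute parallel do` line is the whole offload: the compiler decides the thread/block decomposition, which is exactly where performance can differ from a hand-written CUDA Fortran kernel.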