Dynamic Loading in the CUDA Runtime

Introduction
- New CUDA runtime APIs add dynamic loading of GPU device code: the application chooses which device code to load and when, without tying that code to a particular CUDA context.

Benefits of dynamic GPU device code loading
- Explicit control over which GPU device code is loaded and when. This is especially useful when the device code is built or modified separately from the compilation unit that loads it.

Static loading in the CUDA runtime
- The CUDA runtime sets up and tracks the state of GPU device code modules during initialization. Which modules it loads is determined by what the compilation tools compiled and linked into the program.
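
As a minimal sketch of static loading, the kernel below is compiled and linked into the program, so the host code launches it without any explicit load call. The names `fill42` and `run_static_demo` are made up for this illustration.

```cuda
#include <cuda_runtime.h>

// Ordinary kernel: nvcc embeds its device code in the executable.
__global__ void fill42(int* out) { *out = 42; }

// Hypothetical helper: returns the value written by the kernel, or -1 on error.
int run_static_demo() {
    int* d_out = nullptr;
    int h_out = 0;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;

    // No explicit load step: the runtime already tracks the module that
    // contains fill42 because it was linked into this program.
    fill42<<<1, 1>>>(d_out);

    cudaError_t rc = cudaGetLastError();
    if (rc == cudaSuccess)
        rc = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return rc == cudaSuccess ? h_out : -1;
}
```

Calling `run_static_demo()` from `main` and building the file with `nvcc` is enough to try it. The convenience comes at a price: the set of kernels is fixed when the program is built.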

Dynamic loading in the CUDA driver
- The CUDA driver API has no implicit static loading: the application must explicitly load the GPU device code it wants to execute, and it must manage more state itself, such as CUDA contexts.
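
The extra bookkeeping in the driver API can be sketched as follows. This is an illustration under assumptions, not code from the original text: the hand-written PTX string, its `sm_52` target, the kernel name `fill42`, and the helper `run_driver_demo` are all invented for the example, and the primary context is used to keep context handling short.

```cuda
#include <cuda.h>

// Hypothetical hand-written PTX: one kernel that stores 42 to its argument.
static const char* kPtx = R"(
.version 7.0
.target sm_52
.address_size 64

.visible .entry fill42(.param .u64 fill42_param_0)
{
    .reg .b64 %rd<3>;
    .reg .b32 %r<2>;
    ld.param.u64 %rd1, [fill42_param_0];
    cvta.to.global.u64 %rd2, %rd1;
    mov.u32 %r1, 42;
    st.global.u32 [%rd2], %r1;
    ret;
}
)";

// Run the next call only if everything so far succeeded.
#define TRY(call) do { if (rc == CUDA_SUCCESS) rc = (call); } while (0)

// Hypothetical helper: returns the value written by the kernel, or -1 on error.
int run_driver_demo() {
    CUdevice dev = 0;
    CUcontext ctx = nullptr;
    CUmodule mod = nullptr;
    CUfunction fn = nullptr;
    CUdeviceptr d_out = 0;
    int h_out = 0;

    CUresult rc = cuInit(0);                       // explicit driver initialization
    TRY(cuDeviceGet(&dev, 0));
    TRY(cuDevicePrimaryCtxRetain(&ctx, dev));      // the application owns context state
    bool have_ctx = (rc == CUDA_SUCCESS);
    TRY(cuCtxSetCurrent(ctx));
    TRY(cuModuleLoadData(&mod, kPtx));             // explicit load into the current context
    TRY(cuModuleGetFunction(&fn, mod, "fill42"));  // look the kernel up by name
    TRY(cuMemAlloc(&d_out, sizeof(int)));

    void* args[] = { &d_out };
    TRY(cuLaunchKernel(fn, 1, 1, 1, 1, 1, 1, 0, nullptr, args, nullptr));
    TRY(cuCtxSynchronize());
    TRY(cuMemcpyDtoH(&h_out, d_out, sizeof(int)));

    if (d_out) cuMemFree(d_out);
    if (mod) cuModuleUnload(mod);
    if (have_ctx) cuDevicePrimaryCtxRelease(dev);
    return rc == CUDA_SUCCESS ? h_out : -1;
}
```

The program has to link against the driver library (for example `-lcuda`). Every step that the runtime handles implicitly in static loading appears here as an explicit call.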

Dynamic loading in the CUDA runtime
- The CUDA runtime now supports dynamic loading as well, so an application can load GPU device code explicitly at run time instead of relying only on what was linked into the program.
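
A minimal sketch of the runtime path, assuming a toolkit recent enough to provide the `cudaLibrary*` entry points. The embedded PTX, its `sm_52` target, the kernel name `fill42`, and the helper `run_runtime_dynamic_demo` are assumptions made for this illustration.

```cuda
#include <cuda_runtime.h>

// Hypothetical hand-written PTX: one kernel that stores 42 to its argument.
static const char* kPtx = R"(
.version 7.0
.target sm_52
.address_size 64

.visible .entry fill42(.param .u64 fill42_param_0)
{
    .reg .b64 %rd<3>;
    .reg .b32 %r<2>;
    ld.param.u64 %rd1, [fill42_param_0];
    cvta.to.global.u64 %rd2, %rd1;
    mov.u32 %r1, 42;
    st.global.u32 [%rd2], %r1;
    ret;
}
)";

// Run the next call only if everything so far succeeded.
#define TRY(call) do { if (rc == cudaSuccess) rc = (call); } while (0)

// Hypothetical helper: returns the value written by the kernel, or -1 on error.
int run_runtime_dynamic_demo() {
    cudaLibrary_t library = nullptr;
    cudaKernel_t kernel = nullptr;
    int* d_out = nullptr;
    int h_out = 0;

    // Load device code at run time; no JIT or library options are passed.
    cudaError_t rc = cudaLibraryLoadData(&library, kPtx,
                                         nullptr, nullptr, 0,
                                         nullptr, nullptr, 0);
    TRY(cudaLibraryGetKernel(&kernel, library, "fill42"));
    TRY(cudaMalloc(&d_out, sizeof(int)));

    // The kernel handle is passed where a host function address normally goes.
    void* args[] = { &d_out };
    TRY(cudaLaunchKernel(reinterpret_cast<const void*>(kernel),
                         dim3(1), dim3(1), args, 0, nullptr));
    TRY(cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost));

    cudaFree(d_out);
    if (library) cudaLibraryUnload(library);
    return rc == cudaSuccess ? h_out : -1;
}
```

Compared with the driver API, there is no explicit initialization and no context handling. `cudaLibraryLoadFromFile` takes a file name in place of the in-memory image and is otherwise used the same way.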

Benefits of dynamic loading in the CUDA runtime
- Dynamic loading with the CUDA runtime API alone, handle types that are interchangeable between the CUDA driver and the CUDA runtime, and handles that can be shared between CUDA runtime instances.
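
To illustrate type interchangeability, the sketch below loads a kernel through the runtime and launches it through the driver API by casting the `cudaKernel_t` handle. It assumes the runtime has already made the device's primary context current on the calling thread (the `cudaMalloc` call takes care of that here). The PTX, its `sm_52` target, and all names are hypothetical.

```cuda
#include <cuda.h>
#include <cuda_runtime.h>

// Hypothetical hand-written PTX: one kernel that stores 42 to its argument.
static const char* kPtx = R"(
.version 7.0
.target sm_52
.address_size 64

.visible .entry fill42(.param .u64 fill42_param_0)
{
    .reg .b64 %rd<3>;
    .reg .b32 %r<2>;
    ld.param.u64 %rd1, [fill42_param_0];
    cvta.to.global.u64 %rd2, %rd1;
    mov.u32 %r1, 42;
    st.global.u32 [%rd2], %r1;
    ret;
}
)";

// Hypothetical helper: returns the value written by the kernel, or -1 on error.
int run_interop_demo() {
    cudaLibrary_t library = nullptr;
    cudaKernel_t kernel = nullptr;
    int* d_out = nullptr;
    int h_out = -1;

    // Load and look up the kernel with the runtime API.
    bool ok = cudaLibraryLoadData(&library, kPtx, nullptr, nullptr, 0,
                                  nullptr, nullptr, 0) == cudaSuccess
           && cudaLibraryGetKernel(&kernel, library, "fill42") == cudaSuccess
           && cudaMalloc(&d_out, sizeof(int)) == cudaSuccess;

    if (ok) {
        // Launch the same handle with the driver API. A kernel handle cast to
        // CUfunction runs in the current context when the stream is NULL.
        void* args[] = { &d_out };
        ok = cuLaunchKernel(reinterpret_cast<CUfunction>(kernel),
                            1, 1, 1, 1, 1, 1, 0, nullptr, args, nullptr)
                 == CUDA_SUCCESS
          && cudaMemcpy(&h_out, d_out, sizeof(int),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(d_out);
    if (library) cudaLibraryUnload(library);
    return ok ? h_out : -1;
}
```

Because the handle is used as-is on both sides, code that already depends on the driver API can accept kernels that a runtime-only component loaded. The program has to link against the driver library as well (for example `-lcuda`).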

Sharing of CUDA kernel handles
- Libraries can share a CUDA kernel by passing its handle from one library to another.
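
Handle passing can be sketched in a single file, with a namespace standing in for a second library. The namespace `other_lib`, the function `launch_with_handle`, the kernel `fill42`, and the helper `run_shared_handle_demo` are hypothetical; in a real project the two sides would live in separately built libraries.

```cuda
#include <cuda_runtime.h>

// Kernel owned by the "first library".
__global__ void fill42(int* out) { *out = 42; }

namespace other_lib {
// Stand-in for a second library: it receives only an opaque kernel handle
// and never sees the fill42 symbol itself.
cudaError_t launch_with_handle(cudaKernel_t kernel, int* d_out) {
    void* args[] = { &d_out };
    return cudaLaunchKernel(reinterpret_cast<const void*>(kernel),
                            dim3(1), dim3(1), args, 0, nullptr);
}
}  // namespace other_lib

// Hypothetical helper: returns the value written by the kernel, or -1 on error.
int run_shared_handle_demo() {
    cudaKernel_t kernel = nullptr;
    int* d_out = nullptr;
    int h_out = 0;

    // Turn the statically linked kernel into a handle that can be passed around.
    cudaError_t rc = cudaGetKernel(&kernel,
                                   reinterpret_cast<const void*>(&fill42));
    if (rc == cudaSuccess) rc = cudaMalloc(&d_out, sizeof(int));
    if (rc == cudaSuccess) rc = other_lib::launch_with_handle(kernel, d_out);
    if (rc == cudaSuccess)
        rc = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_out);
    return rc == cudaSuccess ? h_out : -1;
}
```

The receiving side depends only on the `cudaKernel_t` type, so it does not need to be compiled together with the kernel it launches.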

Get started with CUDA runtime dynamic loading
- The new CUDA runtime APIs load and execute device code on the GPU, and they offer a simpler path when an application needs only the CUDA runtime API.