Dynamic Loading in the CUDA Runtime

- Introduction: New CUDA runtime APIs enable dynamic loading of GPU device code, giving applications a context-independent way to select and load that code.
 
- Benefits of dynamic GPU device code loading: Explicit control over GPU device code loading, which is especially useful when the device code is modified separately from the compilation unit that loads it.
 
- Static loading in the CUDA runtime: The CUDA runtime tracks which GPU device code modules are loaded during initialization; that set is determined by what was compiled and linked together by the compilation tools.
 
- Dynamic loading in the CUDA driver: The CUDA driver requires GPU device code to be loaded dynamically before it can execute, and the application must manage more state, such as CUDA contexts.
 
- Dynamic loading in the CUDA runtime: Recent changes in CUDA add support for dynamic loading in the runtime, so GPU device code can be loaded flexibly at run time.
 
- Benefits: Applications can use the CUDA runtime API alone, types are interchangeable between the CUDA driver and the runtime, and handles can be shared between runtime instances.
 
- Sharing of CUDA kernel handles: CUDA kernels can be shared between libraries by passing kernel handles from one library to another.
 
- Get started with CUDA runtime dynamic loading: The new CUDA runtime APIs load and execute device code on the GPU, offering a simpler approach when only the CUDA runtime API is needed.
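
The runtime-only flow outlined above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation. It assumes a CUDA toolkit and driver recent enough to provide the `cudaLibrary*` runtime APIs (believed to be 12.8 or later) and a GPU that can JIT-compile PTX targeting `sm_52`. The kernel name `write42`, its hand-written PTX, and the `CHECK` macro are hypothetical and exist only for this example.

```cuda
// Sketch: load device code from an in-memory PTX string using only the
// CUDA runtime API, launch the kernel, and read back its result.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical hand-written PTX: a kernel that stores 42 through its
// pointer argument. The kernel name and sm_52 target are assumptions.
static const char *kPtx = R"ptx(
.version 7.0
.target sm_52
.address_size 64

.visible .entry write42(.param .u64 write42_param_0)
{
    .reg .b32 %r<2>;
    .reg .b64 %rd<3>;

    ld.param.u64 %rd1, [write42_param_0];
    cvta.to.global.u64 %rd2, %rd1;
    mov.u32 %r1, 42;
    st.global.u32 [%rd2], %r1;
    ret;
}
)ptx";

// Illustrative helper: abort with a message if a runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s failed: %s\n", #call,           \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

int main() {
    // Device buffer the kernel writes into.
    int *d_out = nullptr;
    CHECK(cudaMalloc(&d_out, sizeof(int)));

    // Load the device code dynamically; no JIT or library options are set.
    cudaLibrary_t library;
    CHECK(cudaLibraryLoadData(&library, kPtx, nullptr, nullptr, 0,
                              nullptr, nullptr, 0));

    // Look up the kernel handle by its (unmangled) entry name.
    cudaKernel_t kernel;
    CHECK(cudaLibraryGetKernel(&kernel, library, "write42"));

    // Launch one thread; the handle is passed as the function pointer.
    void *args[] = { &d_out };
    CHECK(cudaLaunchKernel(static_cast<const void *>(kernel),
                           dim3(1), dim3(1), args, 0, nullptr));
    CHECK(cudaDeviceSynchronize());

    // Copy the result back and report it.
    int h_out = 0;
    CHECK(cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost));
    std::printf("kernel wrote %d\n", h_out);

    // Release the library and the buffer.
    CHECK(cudaLibraryUnload(library));
    CHECK(cudaFree(d_out));
    return h_out == 42 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

The sketch contains no driver API calls and no explicit context management. To load device code that was built separately, `cudaLibraryLoadFromFile` should be usable in place of `cudaLibraryLoadData`, taking a file name instead of an in-memory image.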
 
