
NVIDIA Math Libraries for the Python Ecosystem - GitHub
Device-side APIs: nvmath-python exposes NVIDIA's device-side (Dx) APIs, which let developers call NVIDIA library functions inside their own custom device kernels. For example, a Numba JIT function can …
nvmath-python/examples/fft/example19_convolution_prolog ... - GitHub
NVIDIA Math Libraries for the Python Ecosystem. Contribute to NVIDIA/nvmath-python development by creating an account on GitHub.
nvmath-python/examples/device/cufftdx_convolution_performance
Inquiry about the implementation feasibility of a FFT-based ... - GitHub
I am curious whether we can implement the whole iterative algorithm as a single kernel using this library (rather than just fusing the convolution op). That said, is there any room I can exploit to further accelerate …
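For orientation, the "convolution op" mentioned in the question above can be sketched on the host as three steps: forward FFT, pointwise multiply in the frequency domain, inverse FFT. The block below is a minimal NumPy illustration of that pattern only. It does not use nvmath-python or cuFFTDx, and the function names `fft_circular_convolve` and `direct_circular_convolve` are hypothetical helpers introduced here, not part of any NVIDIA library. A fused device kernel would presumably perform the same three steps without writing intermediates back to global memory, but that is an assumption about the examples listed here rather than something the snippets state.

```python
import numpy as np


def fft_circular_convolve(signal, kernel):
    """Circular convolution via the convolution theorem (host-side NumPy sketch).

    Hypothetical helper for illustration; not an nvmath-python API.
    """
    n = signal.shape[-1]
    signal_spectrum = np.fft.fft(signal)        # step 1: forward FFT
    kernel_spectrum = np.fft.fft(kernel, n=n)   # kernel zero-padded to length n
    product = signal_spectrum * kernel_spectrum  # step 2: pointwise multiply
    return np.fft.ifft(product)                 # step 3: inverse FFT


def direct_circular_convolve(signal, kernel):
    """O(n*m) reference implementation used only to check the FFT version."""
    n = len(signal)
    out = np.zeros(n, dtype=complex)
    for i in range(n):
        for j in range(len(kernel)):
            out[i] += kernel[j] * signal[(i - j) % n]
    return out


# Compare both implementations on random complex input.
rng = np.random.default_rng(0)
x = rng.standard_normal(64) + 1j * rng.standard_normal(64)
h = rng.standard_normal(64)
assert np.allclose(fft_circular_convolve(x, h), direct_circular_convolve(x, h))
```

An iterative algorithm built on this op would repeat the three steps once per iteration; whether all iterations can live in one kernel depends on details the truncated question does not show.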
nvmath-python/examples/device/cufftdx_autotuning.py at main - GitHub
cufftdx_convolution_signal.py cufftdx_fft_2d.py cufftdx_fft_2d_r2c_c2r.py cufftdx_fft_2d_single_kernel.py
nvmath-python/examples/device/common_numba.py at main - GitHub
cufftdx_convolution.py cufftdx_convolution_performance.py cufftdx_convolution_r2c_c2r.py cufftdx_convolution_r2c_c2r_packed_fold_optimized.py cufftdx_convolution_signal.py …
cublasdx_simple_gemm_fp32.py - GitHub
cufftdx_convolution.py cufftdx_convolution_performance.py cufftdx_convolution_r2c_c2r.py cufftdx_convolution_r2c_c2r_packed_fold_optimized.py cufftdx_convolution_signal.py