CUDA | GopherTrunk

Also known as: Compute Unified Device Architecture

CUDA is NVIDIA’s parallel computing platform and programming model that lets ordinary, general-purpose code run on the GPU instead of only the CPU.¹

Overview

A CUDA program expresses its parallel work as a kernel: a small function executed in parallel by thousands of lightweight threads, each typically handling one element of the data. Threads are grouped into blocks, and the blocks of one launch form a grid. The platform exposes the GPU through extensions to C and C++ (and bindings for Python, Fortran, and others), plus tuned libraries such as cuBLAS for linear algebra and cuDNN for neural networks. Because CUDA is proprietary and runs only on NVIDIA hardware, it competes with the cross-vendor OpenCL and with newer portable frameworks, but its mature tooling has made it the de facto standard for GPU computing.²
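
The one-thread-per-element idea can be sketched without a GPU. The block below is a CPU-side emulation in plain C++, not real CUDA: `ThreadCoords`, `scale_kernel`, and `launch_scale` are hypothetical names invented for illustration, and the nested loops stand in for what the hardware would run in parallel. In actual CUDA C++ the kernel would be a `__global__` function and the index would come from the built-in `blockIdx.x`, `blockDim.x`, and `threadIdx.x`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's built-in blockIdx.x, blockDim.x, threadIdx.x.
struct ThreadCoords {
    std::size_t block_idx;   // which block this logical thread belongs to
    std::size_t block_dim;   // threads per block
    std::size_t thread_idx;  // position of the thread inside its block
};

// The "kernel": the body every logical thread runs exactly once.
void scale_kernel(const ThreadCoords& t, const float* in, float* out,
                  float gain, std::size_t n) {
    // Global element index, computed the same way a 1-D CUDA kernel would.
    std::size_t i = t.block_idx * t.block_dim + t.thread_idx;
    // Bounds check: the last block may contain threads past the end of the data.
    if (i < n) {
        out[i] = gain * in[i];
    }
}

// Emulated launch: sequential loops replace the GPU's parallel scheduling.
std::vector<float> launch_scale(const std::vector<float>& in, float gain,
                                std::size_t threads_per_block) {
    assert(threads_per_block > 0);
    const std::size_t n = in.size();
    // Round up so that every element is covered by some thread.
    const std::size_t blocks = (n + threads_per_block - 1) / threads_per_block;
    std::vector<float> out(n, 0.0f);
    for (std::size_t b = 0; b < blocks; ++b) {
        for (std::size_t t = 0; t < threads_per_block; ++t) {
            scale_kernel(ThreadCoords{b, threads_per_block, t},
                         in.data(), out.data(), gain, n);
        }
    }
    return out;
}
```

Calling `launch_scale` on five samples with four threads per block would emulate two blocks (eight logical threads), three of which do nothing because of the bounds check. The rounding-up of the block count and the `i < n` guard are the two details that carry over directly to a real kernel launch.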

Where it fits

CUDA is the bridge that turned the GPU from a graphics device into a general accelerator (see GPGPU), and it underpins most modern AI accelerator workloads on NVIDIA hardware. For a signal-processing pipeline like GopherTrunk, a CUDA kernel can run massively parallel work, such as large FFTs across many channels or batched filtering, far faster than a CPU. For a handful of narrowband channels, though, the overhead of copying data to and from the GPU often outweighs the gain.
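
The trade-off between compute gain and transfer overhead can be made concrete with a rough break-even model. This is only a sketch under stated assumptions: `OffloadModel`, `gpu_seconds`, and `offload_pays_off` are hypothetical names, the model treats GPU time as a fixed per-batch overhead plus copy time plus a scaled-down compute time, and any parameter values would have to come from measuring the actual hardware rather than from this page.

```cpp
#include <cassert>

// Hypothetical cost model; every field is an assumption to be measured, not a spec.
struct OffloadModel {
    double fixed_overhead_s;  // per-batch launch and synchronization cost, in seconds
    double bus_bytes_per_s;   // effective host <-> GPU copy bandwidth
    double gpu_speedup;       // how many times faster the GPU computes than the CPU
};

// Estimated wall time when the batch is offloaded to the GPU.
double gpu_seconds(const OffloadModel& m, double cpu_compute_s, double bytes_moved) {
    assert(m.bus_bytes_per_s > 0.0 && m.gpu_speedup > 0.0);
    const double copy_s = bytes_moved / m.bus_bytes_per_s;
    const double compute_s = cpu_compute_s / m.gpu_speedup;
    return m.fixed_overhead_s + copy_s + compute_s;
}

// True when the estimated GPU time beats doing the work on the CPU.
bool offload_pays_off(const OffloadModel& m, double cpu_compute_s, double bytes_moved) {
    return gpu_seconds(m, cpu_compute_s, bytes_moved) < cpu_compute_s;
}
```

With illustrative placeholder values (1 ms fixed overhead, 1 GB/s copy bandwidth, a 10x compute speedup), a batch that takes 0.5 ms on the CPU would not be worth offloading because the fixed overhead alone exceeds it, while a batch that takes 100 ms and moves 10 MB would be. The model ignores overlap of copies with computation, which a real pipeline could exploit to shift the break-even point.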

Sources

¹ CUDA (Wikipedia): NVIDIA’s parallel computing platform and programming model.
² CUDA Zone: NVIDIA’s developer site for the CUDA toolkit and libraries.
