Compute Unified Device Architecture

The ComputeUnifiedDeviceArchitecture (CUDA) defines a set of extensions to CeeLanguage which enable general purpose programming of graphics processors supplied by NvidiaCorporation. The extensions also work in CeePlusPlus.

OpenCl is an open standard alternative.

There is a lot of work published on the use of this architecture to speed up numerical computation. This can be done either using the memory and processors available in the graphics processor of the computer, or made available by adding extra cards for this purpose.

Note: Successive versions of CUDA differ significantly. CUDA 4.0 has a lot of features not present in the previous versions, including CUDA 3.2. Also, older graphics devices will not be able to use all the features of later software. CUDA includes software to detect the hardware version of the graphics display so that programs can test for needed features. There is a lot of example code available which illustrates this.

CUDA 4.1 also differs significantly and is said to have a different C and C++ compiler based on LLVM (see LowLevelVirtualMachine). It also has Thrust 1.5.1 (See ThrustLibrary)
See also GeneralPurposeGraphicsProcessUnits, MagmaLibrary, CudaMpi, ProgrammingCudaCee.
CUDA Resources

There is a series of articles on CUDA online from DrDobbsJournal - the first is at by Rob Farber

Others in the series can be found by searching on his name:

CudaApplicationDesignAndDevelopment by Rob Farber is a new book (2012) which looks extremely useful. -- JohnFletcher
CategoryCee CategoryCpp CategoryFortran ParallelProgrammingModel CategoryGpgpu

View edit of March 5, 2012 or FindPage with title or text search