
CUDA Succinctly®
by Chris Rose


CHAPTER 1

Introduction



CUDA stands for Compute Unified Device Architecture. It is a suite of technologies for programming NVIDIA graphics hardware. CUDA C is an extension of C and C++; there are also extensions for other languages such as Fortran, Python, and C#. CUDA is the official GPGPU architecture developed by NVIDIA. It is a mature architecture that has been actively developed since 2007; it is regularly updated, and there is an abundance of documentation and libraries available.

GPGPU stands for general-purpose computing on graphics processing units. General-purpose programming refers to any programming task that performs computation other than standard graphics processing (although CUDA is also excellent at graphics processing). Although graphics cards were originally intended to process graphics, there are very good reasons to harness their processing power for solving other problems. The most obvious reason is that they are extremely powerful processing units that can take a lot of the workload off the CPU. The GPU often performs its processing simultaneously with the CPU and is very efficient at certain types of computation—much more efficient than the CPU.

Parallel programming has become increasingly important in recent years and will only grow in importance. The core clock speed of a CPU cannot increase indefinitely, and we have almost reached the limit of this technology as it stands today: pushing the core clock speed beyond the 3.5 GHz to 4.0 GHz range makes a processor increasingly expensive to power and keep cool. The alternative to increasing the clock speed is simply to include more than one processor in the same system. This is exactly the idea behind graphics cards. They contain many hundreds (even thousands) of low-powered compute cores. Most graphics cards (at least the ones we will be programming) are called massively parallel devices. They work best when there are hundreds or thousands of active threads, whereas a CPU is designed to execute perhaps four or five simultaneous threads. CUDA is all about harnessing the power of thousands of concurrent threads, splitting large problems up, and turning them inside out. It is about efficiently using graphics hardware instead of leaving the GPU idle while the CPU struggles through problems with its handful of threads.
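To make the idea of thousands of lightweight threads concrete, the following is a minimal sketch of a CUDA C program that adds two large vectors on the GPU. It is an illustrative example rather than code from this book: the kernel name addVectors and the launch configuration are assumptions chosen for the sketch. Each GPU thread handles a single element, so a one-million-element addition is spread across roughly a million threads.

#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: each thread adds exactly one pair of elements.
__global__ void addVectors(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique global index for this thread
    if (i < n)
        c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;                  // one million elements
    const size_t bytes = n * sizeof(float);

    // Allocate and fill host (CPU) arrays.
    float *a = new float[n], *b = new float[n], *c = new float[n];
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Allocate device (GPU) memory and copy the inputs across.
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, b, bytes, cudaMemcpyHostToDevice);

    // Launch enough blocks of 256 threads each to cover all n elements.
    int threadsPerBlock = 256;
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    addVectors<<<blocks, threadsPerBlock>>>(dA, dB, dC, n);

    // Copy the result back to the host and check one element.
    cudaMemcpy(c, dC, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", c[0]);            // expect 3.0

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    delete[] a; delete[] b; delete[] c;
    return 0;
}

Note how the problem is "turned inside out": instead of one CPU thread looping over a million elements, a million short-lived GPU threads each perform a single addition. Later chapters cover kernels, thread indexing, and memory transfers in detail.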

Studying CUDA gives us particular insight into how NVIDIA's hardware works, which is of great benefit even to programmers who use these devices for graphics processing. The view of the hardware from the perspective of CUDA tends to be at a much lower level than that of a programmer who uses GPUs only to produce graphics. CUDA gives us insight into the structure and workings of these devices outside the verbose and often convoluted syntax of modern graphics APIs.

This book is aimed at readers who wish to explore GPGPU on NVIDIA hardware using CUDA. It assumes at least some background knowledge of C++, since all of the code examples use this language. I will be using the Visual Studio Express 2012 integrated development environment (IDE), but the examples should be easy to follow with the 2010 or 2013 versions of Visual Studio. Chapter 9 focuses on Nsight, which is only applicable to the Professional editions of Visual Studio, but the Express edition will suffice for all the other chapters.
