By Jason Sanders, Edward Kandrot

"This book is required reading for anyone working with accelerator-based computing systems."
–From the Foreword by Jack Dongarra, University of Tennessee and Oak Ridge National Laboratory
CUDA is a computing architecture designed to facilitate the development of parallel programs. In conjunction with a comprehensive software platform, the CUDA architecture enables programmers to draw on the immense power of graphics processing units (GPUs) when building high-performance applications. GPUs, of course, have long been available for demanding graphics and game applications. CUDA now brings this valuable resource to programmers working on applications in other domains, including science, engineering, and finance. No knowledge of graphics programming is required, just the ability to program in a modestly extended version of C.

[b]CUDA by Example,[/b] written by senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through working examples. After a concise introduction to the CUDA platform and architecture, as well as a quick-start guide to CUDA C, the book details the techniques and trade-offs associated with each key CUDA feature. You'll discover when to use each CUDA C extension and how to write CUDA software that delivers truly outstanding performance.

Major topics covered include

  • Parallel programming
  • Thread cooperation
  • Constant memory and events
  • Texture memory
  • Graphics interoperability
  • Atomics
  • Streams
  • CUDA C on multiple GPUs
  • Advanced atomics
  • Additional CUDA resources

All the CUDA software tools you'll need are freely available for download from NVIDIA.

Best design books

Optimum Design of Structures: With Special Reference to Alternative Loads Using Geometric Programming

This book presents the integrated approach to the analysis and optimal design of structures. This approach, which is simpler than the so-called nested approach, has the drawback of generating a large optimization problem. To overcome this problem, a methodology of multilevel decomposition is developed.

Surface Plasmon Resonance Sensors: A Materials Guide to Design and Optimization

This book addresses the important physical phenomenon of Surface Plasmon Resonance, or Surface Plasmon Polaritons, in thin metal films, a phenomenon that is exploited in the design of a large variety of physico-chemical optical sensors. In this treatment, the crucial materials aspects for the design and optimization of SPR sensors are investigated and described in detail.

Multifunctional Polymeric Nanocomposites Based on Cellulosic Reinforcements

Multifunctional Polymeric Nanocomposites Based on Cellulosic Reinforcements introduces the innovative applications of polymeric materials based on nanocellulose, and covers extraction methods, functionalization approaches, and assembly methods to enable these applications. The book presents the state of the art of this novel nano-filler and how it enables new applications in many different sectors, beyond existing products.

Additional resources for CUDA by Example: An Introduction to General-Purpose GPU Programming

Sample text

From the available downloads, you need to download the CUDA Toolkit in order to build the code examples contained in this book. Additionally, you are encouraged, although not required, to download the GPU Computing SDK code samples, which contain dozens of helpful example programs. The GPU Computing SDK code samples will not be covered in this book, but they nicely complement the material we intend to cover, and as with learning any style of programming, the more examples, the better. You should also take note that although nearly all the code in this book will work on the Linux, Windows, and Mac OS platforms, we have targeted the applications toward Linux and Windows.

The NVIDIA tools simply feed this host compiler your code, and everything behaves as it would in a world without CUDA. Now we see that CUDA C adds the __global__ qualifier to standard C. This mechanism alerts the compiler that a function should be compiled to run on a device instead of the host. In this simple example, nvcc gives the function kernel() to the compiler that handles device code, and it feeds main() to the host compiler as it did in the previous example. So, what is the mysterious call to kernel(), and why must we vandalize our standard C with angle brackets and a numeric tuple?
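As a minimal sketch of what this passage describes, the following CUDA C program declares an empty `__global__` function named `kernel()` and launches it from `main()` with the angle-bracket syntax. The helper `launch_empty_kernel()` and its error checks are illustrative additions, not part of the sample text. The launch configuration `<<<1, 1>>>` (one block containing one thread) is assumed here as the simplest possible choice.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// __global__ tells nvcc to compile this function for the device (GPU)
// rather than the host. It does nothing; it only demonstrates the qualifier.
__global__ void kernel(void) {
}

// Hypothetical helper (not from the book): launches the empty kernel and
// returns 0 on success, 1 if the launch or the execution reported an error.
int launch_empty_kernel(void) {
    // The two numbers in angle brackets are the launch configuration:
    // number of blocks, then threads per block.
    kernel<<<1, 1>>>();

    cudaError_t err = cudaGetLastError();   // errors detected at launch time
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();      // wait for the kernel; catch runtime errors
    }
    return err == cudaSuccess ? 0 : 1;
}

// main() carries no qualifier, so nvcc hands it to the host compiler.
int main(void) {
    if (launch_empty_kernel() != 0) {
        printf("kernel launch failed\n");
        return 1;
    }
    printf("Hello, World!\n");
    return 0;
}
```

Assuming the file is saved as `hello.cu` (a hypothetical name), it can be built with `nvcc hello.cu -o hello`. Running it requires a CUDA-capable GPU and driver; without one, the helper reports a failure instead of printing the greeting.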

The University of Cambridge, in a great tradition started by Charles Babbage, is home to active research into advanced parallel computing. Dr. Graham Pullan and Dr. Tobias Brandvik of the “many-core group” correctly identified the potential in NVIDIA’s CUDA Architecture to accelerate computational fluid dynamics to unprecedented levels. Their initial investigations indicated that acceptable levels of performance could be delivered by GPU-powered, personal workstations. Later, the use of a small GPU cluster easily outperformed their much more costly supercomputers and further confirmed their suspicions that the capabilities of NVIDIA’s GPU matched extremely well with the problems they wanted to solve.