Books on OpenCL and CUDA
Dear colleagues, we would like to present books on OpenCL and CUDA that were published in 2010-2014.
You can find other related books at the following ››› LINK ‹‹‹ on our site, along with PhD and master’s ››› THESES ‹‹‹.
Moreover, you can study programming techniques directly from the
source code provided by the authors. A list of open-source OpenCL, CUDA,
and related code can be found ››› HERE ‹‹‹.
The CUDA Handbook: A Comprehensive Guide to GPU Programming
The CUDA Handbook begins where CUDA By Example leaves off, discussing both CUDA hardware and software in detail that will engage any CUDA developer, from the casual to the most hardcore. Newer CUDA developers will see how the hardware processes commands and the driver checks progress; hardcore CUDA developers will appreciate topics such as the driver API, context migration, and how best to structure CPU/GPU data interchange and synchronization.
The book is partly a reference and partly a cookbook. Careful descriptions of hardware and software abstractions, best practices, and example source code are included. Much of the source code appears in the form of reusable “microbenchmarks” or “microdemos” designed to expose specific hardware characteristics or highlight specific use cases. Best practices are discussed and accompanied with source code. One idea emphasized is the “EERS Principle” (Empirical Evidence Reigns Supreme): that is, determining the fastest way to perform a given operation is best done empirically.
The book includes an extensive glossary, because it’s difficult to write about this topic without throwing word salad at the reader.
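The EERS Principle mentioned above can be illustrated with a small host-side sketch. This is not code from the book: it is a plain Python analogy in which two hypothetical candidates (`sum_with_loop` and `sum_with_builtin`) perform the same operation, and the faster one is chosen by measurement rather than by guessing. On a GPU the same idea would typically be applied to kernels, timed with CUDA events instead of `timeit`.

```python
import timeit


def sum_with_loop(values):
    """Candidate A (hypothetical): explicit Python loop."""
    total = 0
    for v in values:
        total += v
    return total


def sum_with_builtin(values):
    """Candidate B (hypothetical): the built-in sum()."""
    return sum(values)


def fastest(candidates, data, repeats=5, number=20):
    """Time every candidate on the same data and return (winner, timings).

    The best of several repeats is kept to reduce noise from other processes.
    """
    timings = {}
    for name, fn in candidates.items():
        runs = timeit.repeat(lambda: fn(data), repeat=repeats, number=number)
        timings[name] = min(runs) / number  # seconds per call
    winner = min(timings, key=timings.get)
    return winner, timings


data = list(range(10_000))
candidates = {"loop": sum_with_loop, "builtin": sum_with_builtin}
winner, timings = fastest(candidates, data)

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:8s} {seconds * 1e6:10.1f} us per call")
print("fastest on this machine:", winner)
```

The winner is deliberately not hard-coded: the result depends on the machine and the input size, which is the point of measuring.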
OpenCL in Action is a thorough, hands-on presentation of OpenCL, with an eye toward showing developers how to build high-performance applications of their own. It begins by presenting the core concepts behind OpenCL, including vector computing, parallel programming, and multi-threaded operations, and then guides you step-by-step from simple data structures to complex functions.
About the technology
Whatever system you have, it probably has more raw processing power than you’re using. OpenCL is a high-performance programming language that maximizes computational power by executing on CPUs, graphics processors, and other number-crunching devices. It’s perfect for speed-sensitive tasks like vector computing, matrix operations, and graphics acceleration.
About this book
OpenCL in Action blends the theory of parallel computing with the practical reality of building high-performance applications using OpenCL. It first guides you through the fundamental data structures in an intuitive manner. Then, it explains techniques for high-speed sorting, image processing, matrix operations, and fast Fourier transform. The book concludes with a deep look at the all-important subject of graphics acceleration. Numerous challenging examples give you different ways to experiment with working code.
A background in C or C++ is helpful, but no prior exposure to OpenCL is needed.
- Learn OpenCL step by step
- Tons of annotated code
- Tested algorithms for maximum performance
Performance Analysis and Tuning for General-Purpose Graphics Processing Units (GPGPU)
Morgan & Claypool Publishers, Amazon, Google books, hgpu.org
General-purpose graphics processing units (GPGPU) have emerged as an important class of shared memory parallel processing architectures, with widespread deployment in every computer class from high-end supercomputers to embedded mobile platforms. Relative to more traditional multicore systems of today, GPGPUs have distinctly higher degrees of hardware multithreading (hundreds of hardware thread contexts vs. tens), a return to wide vector units (several tens vs. 1-10), memory architectures that deliver higher peak memory bandwidth (hundreds of gigabytes per second vs. tens), and smaller caches/scratchpad memories (less than 1 megabyte vs. 1-10 megabytes).
In this book, we provide a high-level overview of current GPGPU architectures and programming models. We review the principles that are used in previous shared memory parallel platforms, focusing on recent results in both the theory and practice of parallel algorithms, and suggest a connection to GPGPU platforms. We aim to give architects hints for understanding the algorithmic aspects of GPGPU. We also provide detailed performance analysis and guide optimizations from high-level algorithms to low-level instruction level optimizations. As a case study, we use the fast multipole method (FMM) for n-body particle simulations. We also briefly survey the state-of-the-art in GPU performance analysis tools and techniques.
Table of Contents:
- GPU Design, Programming, and Trends
- Performance Principles
- From Principles to Practice: Analysis and Tuning
- Using Detailed Performance Analysis to Guide Optimization
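To make the bandwidth comparison in this description concrete, here is a back-of-the-envelope sketch. It is not taken from the book. The workload (a vector addition) and the two peak bandwidth figures are assumptions, chosen only to fall in the "hundreds vs. tens of gigabytes per second" ranges quoted above. The estimate is a simple lower bound: time is at least bytes moved divided by peak bandwidth.

```python
def min_transfer_time_s(num_bytes, bandwidth_gb_per_s):
    """Lower bound on the time needed to stream num_bytes at a peak bandwidth."""
    return num_bytes / (bandwidth_gb_per_s * 1e9)


# Hypothetical workload: c = a + b on float32 vectors (two reads, one write).
n = 100_000_000
bytes_moved = 3 * n * 4

# Assumed peak bandwidths in GB/s; illustrative values, not measurements.
gpu_bw, cpu_bw = 300.0, 30.0

gpu_t = min_transfer_time_s(bytes_moved, gpu_bw)
cpu_t = min_transfer_time_s(bytes_moved, cpu_bw)
print(f"GPU bound: {gpu_t * 1e3:.1f} ms, CPU bound: {cpu_t * 1e3:.1f} ms")
```

For a memory-bound operation like this one, the ratio of the two bounds equals the ratio of the bandwidths, regardless of how many arithmetic units either device has.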
Heterogeneous Computing with OpenCL
“Heterogeneous Computing with OpenCL” teaches OpenCL and parallel programming for complex systems that may include a variety of device architectures: multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units (APUs) such as AMD Fusion technology. Designed to work on multiple platforms and with wide industry support, OpenCL will help you more effectively program for a heterogeneous future.
Written by leaders in the parallel computing and OpenCL communities, this book will give you hands-on OpenCL experience to address a range of fundamental parallel algorithms. The authors explore memory spaces, optimization techniques, graphics interoperability, extensions, and debugging and profiling. Intended to support a parallel programming course, Heterogeneous Computing with OpenCL includes detailed examples throughout, plus additional online exercises and other supporting materials.
- Explains principles and strategies to learn parallel programming with OpenCL, from understanding the four abstraction models to thoroughly testing and debugging complete applications.
- Covers image processing, web plugins, particle simulations, video editing, performance optimization, and more.
- Shows how OpenCL maps to an example target architecture and explains some of the tradeoffs associated with mapping to various architectures.
- Addresses a range of fundamental programming techniques, with multiple examples and case studies that demonstrate OpenCL extensions for a variety of hardware platforms.
OpenCL Programming Guide
Using the new OpenCL (Open Computing Language) standard, you can write applications that access all available programming resources: CPUs, GPUs, and other processors such as DSPs and the Cell/B.E. processor. Already implemented by Apple, AMD, Intel, IBM, NVIDIA, and other leaders, OpenCL has outstanding potential for PCs, servers, handheld/embedded devices, high performance computing, and even cloud systems. This is the first comprehensive, authoritative, and practical guide to OpenCL 1.1 specifically for working developers and software architects.
Written by five leading OpenCL authorities, OpenCL Programming Guide covers the entire specification. It reviews key use cases, shows how OpenCL can express a wide range of parallel algorithms, and offers complete reference material on both the API and OpenCL C programming language.
Through complete case studies and downloadable code examples, the authors show how to write complex parallel programs that decompose workloads across many different devices. They also present all the essentials of OpenCL software performance optimization, including probing and adapting to hardware. Coverage includes
- Understanding OpenCL’s architecture, concepts, terminology, goals, and rationale
- Programming with OpenCL C and the runtime API
- Using buffers, sub-buffers, images, samplers, and events
- Sharing and synchronizing data with OpenGL and Microsoft’s Direct3D
- Simplifying development with the C++ Wrapper API
- Using OpenCL Embedded Profiles to support devices ranging from cellphones to supercomputer nodes
- Case studies dealing with physics simulation; image and signal processing, such as image histograms, edge detection filters, Fast Fourier Transforms, and optical flow; math libraries, such as matrix multiplication and high-performance sparse matrix multiplication; and more
The OpenCL Programming Book
The OpenCL Programming Book starts with the basics of parallelization, then covers the main concepts and terminology, also teaching how to set up a development environment for OpenCL, while concluding with a walkthrough of the source code of an implementation of the fast Fourier transform (FFT) and Mersenne twister algorithms written in OpenCL.
The revised edition includes a summary of changes made in the OpenCL Specification, version 1.2, including a reference of new corresponding functions, and updates to the execution environments.
It is highly recommended for those wishing to get started with programming in OpenCL.
CUDA Application Design and Development
As the computer industry retools to leverage massively parallel graphics processing units (GPUs), this book is designed to meet the needs of working software developers who need to understand GPU programming with CUDA and increase efficiency in their projects. CUDA Application Design and Development starts with an introduction to parallel computing concepts for readers with no previous parallel experience, and focuses on issues of immediate importance to working software developers: achieving high performance, maintaining competitiveness, analyzing CUDA benefits versus costs, and determining application lifespan.
The book then details the thought behind CUDA and teaches how to create, analyze, and debug CUDA applications. Throughout, the focus is on software engineering issues: how to use CUDA in the context of existing application code, with existing compilers, languages, software tools, and industry-standard API libraries.
Using an approach refined in a series of well-received articles at Dr. Dobb’s Journal, author Rob Farber takes the reader step-by-step from fundamentals to implementation, moving from language theory to practical coding.
- Includes multiple examples building from simple to more complex applications in four key areas: machine learning, visualization, vision recognition, and mobile computing
- Addresses the foundational issues for CUDA development: multi-threaded programming and the different memory hierarchy
- Includes teaching chapters designed to give a full understanding of CUDA tools, techniques and structure.
CUDA by Example: An Introduction to General-Purpose GPU Programming
“This book is required reading for anyone working with accelerator-based computing systems.”
– from the Foreword by Jack Dongarra, University of Tennessee and Oak Ridge National Laboratory
CUDA is a computing architecture designed to facilitate the development of parallel programs. In conjunction with a comprehensive software platform, the CUDA Architecture enables programmers to draw on the immense power of graphics processing units (GPUs) when building high-performance applications. GPUs, of course, have long been available for demanding graphics and game applications. CUDA now brings this valuable resource to programmers working on applications in other domains, including science, engineering, and finance. No knowledge of graphics programming is required, just the ability to program in a modestly extended version of C.
CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through working examples. After a concise introduction to the CUDA platform and architecture, as well as a quick-start guide to CUDA C, the book details the techniques and trade-offs associated with each key CUDA feature. You’ll discover when to use each CUDA C extension and how to write CUDA software that delivers truly outstanding performance.
Major topics covered include
- Parallel programming
- Thread cooperation
- Constant memory and events
- Texture memory
- Graphics interoperability
- CUDA C on multiple GPUs
- Advanced atomics
- Additional CUDA resources
All the CUDA software tools you’ll need are freely available for download from NVIDIA.
GPU Computing Gems: Jade Edition
This is the second volume of Morgan Kaufmann’s GPU Computing Gems, offering an all-new set of insights, ideas, and practical “hands-on” skills from researchers and developers worldwide. Each chapter gives you a window into the work being performed across a variety of application domains, and the opportunity to witness the impact of parallel GPU computing on the efficiency of scientific research.
GPU Computing Gems: Jade Edition showcases the latest research solutions with GPGPU and CUDA, including:
- Improving memory access patterns for cellular automata using CUDA
- Large-scale gas turbine simulations on GPU clusters
- Identifying and mitigating credit risk using large-scale economic capital simulations
- GPU-powered MATLAB acceleration with Jacket
- Biologically-inspired machine vision
- An efficient CUDA algorithm for the maximum network flow problem
- 30 more chapters of innovative GPU computing ideas, written to be accessible to researchers from any industry
GPU Computing Gems: Jade Edition contains 100% new material covering a variety of application domains: algorithms and data structures, engineering, interactive physics for games, computational finance, and programming tools.
- This second volume of GPU Computing Gems offers 100% new material of interest across industry, including finance, medicine, imaging, engineering, gaming, environmental science, green computing, and more
- Covers new tools and frameworks for productive GPU computing application development and offers immediate benefit to researchers developing improved programming environments for GPUs
- Even more hands-on, proven techniques demonstrating how general purpose GPU computing is changing scientific research
GPU Computing Gems: Emerald Edition
“…the perfect companion to Programming Massively Parallel Processors by Hwu & Kirk.” – Nicolas Pinto, Research Scientist at Harvard & MIT, NVIDIA Fellow 2009-2010
Graphics processing units (GPUs) can do much more than render graphics. Scientists and researchers increasingly look to GPUs to improve the efficiency and performance of computationally-intensive experiments across a range of disciplines.
GPU Computing Gems: Emerald Edition brings their techniques to you, showcasing GPU-based solutions including:
- Black hole simulations with CUDA
- GPU-accelerated computation and interactive display of molecular orbitals
- Temporal data mining for neuroscience
- GPU-based parallelization for fast circuit optimization
- Fast graph cuts for computer vision
- Real-time stereo on GPGPU using progressive multi-resolution adaptive windows
- GPU image demosaicing
- Tomographic image reconstruction from unordered lines with CUDA
- Medical image processing using GPU-accelerated ITK image filters
- 41 more chapters of innovative GPU computing ideas, written to be accessible to researchers from any domain
GPU Computing Gems: Emerald Edition is the first volume in Morgan Kaufmann’s Applications of GPU Computing Series, offering the latest insights and research in computer vision, electronic design automation, emerging data-intensive applications, life sciences, medical imaging, ray tracing and rendering, scientific simulation, signal and audio processing, statistical modeling, and video / image processing.
- Covers the breadth of industry from scientific simulation and electronic design automation to audio / video processing, medical imaging, computer vision, and more
- Many examples leverage NVIDIA’s CUDA parallel computing architecture, the most widely-adopted massively parallel programming solution
GPU Solutions to Multi-scale Problems in Science and Engineering
This book covers the new topic of GPU computing through many applications taken from diverse fields such as networking, seismology, fluid mechanics, nano-materials, data mining, earthquakes, mantle convection, and visualization. It shows the public why GPU computing is important and easy to use, and it explains why GPU computing is useful and how to implement codes in everyday situations.
Using OpenCL: Programming Massively Parallel Computers
In 2011 many computer users were exploring the opportunities and the benefits of the massive parallelism offered by heterogeneous computing. In 2000 the Khronos Group, a not-for-profit industry consortium, was founded to create standard open APIs for parallel computing, graphics and dynamic media. Among them has been OpenCL, an open system for programming heterogeneous computers with components made by multiple manufacturers. This publication explains how heterogeneous computers work and how to program them using OpenCL. It also describes how to combine OpenCL with OpenGL for displaying graphical effects in real time. Chapter 1 briefly describes two older de facto standard and highly successful parallel programming systems: MPI and OpenMP. Collectively, the MPI, OpenMP, and OpenCL systems cover programming of all major parallel architectures: clusters, shared-memory computers, and the newest heterogeneous computers. Chapter 2, the technical core of the book, deals with OpenCL fundamentals: programming, hardware, and the interaction between them. Chapter 3 adds important information about such advanced issues as double-versus-single arithmetic precision, efficiency, memory use, and debugging. Chapters 2 and 3 contain several examples of code and one case study on genetic algorithms. These examples are related to linear algebra operations, which are very common in scientific, industrial, and business applications. Most of the book’s examples can be found on the enclosed CD, which also contains basic projects for Visual Studio, MinGW, and GCC. This supplementary material will assist the reader in getting a quick start on OpenCL projects.
CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs
If you need to learn CUDA but don’t have experience with parallel computing, CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs offers a detailed introduction to CUDA with a grounding in parallel fundamentals. It starts by introducing CUDA and bringing you up to speed on GPU parallelism and hardware, then delves into CUDA installation. Chapters on core concepts including threads, blocks, grids, and memory focus on both parallel and CUDA-specific issues. Later, the book demonstrates CUDA in practice for optimizing applications, adjusting to new hardware, and solving common problems.
- Comprehensive introduction to parallel programming with CUDA, for readers new to both
- Detailed instructions help readers optimize the CUDA software development kit
- Practical techniques illustrate working with memory, threads, algorithms, resources, and more
- Covers CUDA on multiple hardware platforms: Mac, Linux and Windows with several NVIDIA chipsets
- Each chapter includes exercises to test reader knowledge
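The thread, block, and grid concepts named in this description can be sketched without a GPU. The following Python simulation is an illustration only, not material from the book. It mirrors the usual CUDA expression `blockIdx.x * blockDim.x + threadIdx.x` for a one-dimensional launch; the helper names `global_thread_ids` and `blocks_needed` and the sizes used are hypothetical.

```python
def global_thread_ids(grid_dim, block_dim):
    """Enumerate (block index, thread index, global index) for a 1D launch.

    Mirrors the CUDA expression blockIdx.x * blockDim.x + threadIdx.x.
    """
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            yield block_idx, thread_idx, block_idx * block_dim + thread_idx


def blocks_needed(n, block_dim):
    """Round up so that grid_dim * block_dim covers all n elements."""
    return (n + block_dim - 1) // block_dim


n = 10         # hypothetical problem size
block_dim = 4  # hypothetical number of threads per block
grid_dim = blocks_needed(n, block_dim)

out = [0] * n
for block_idx, thread_idx, i in global_thread_ids(grid_dim, block_dim):
    if i < n:  # guard: the last block has threads past the end of the data
        out[i] = i * i

print(grid_dim, out)
```

The loops run sequentially here, whereas on a GPU each (block, thread) pair would execute concurrently. The index arithmetic and the bounds guard are the parts that carry over.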
GPU Pro 2
This book focuses on advanced rendering techniques that run on the DirectX and/or OpenGL run-time with any shader language available. It includes articles on the latest and greatest techniques in real-time rendering, including MLAA, adaptive volumetric shadow maps, light propagation volumes, wrinkle animations, and much more. The book emphasizes techniques for handheld programming to reflect the increased importance of graphics on mobile devices. It covers geometry manipulation, effects in image space, shadows, 3D engine design, GPGPU, and graphics-related tools.
GPU Pro 3: Advanced Rendering Techniques
GPU Pro 3, the third volume in the GPU Pro book series, offers practical tips and techniques for creating real-time graphics that are useful to beginners and seasoned game and graphics programmers alike.
Section editors Wolfgang Engel, Christopher Oat, Carsten Dachsbacher, Wessam Bahnassi, and Sebastien St-Laurent have once again brought together a high-quality collection of cutting-edge techniques for advanced GPU programming. With contributions by more than 50 experts, GPU Pro 3: Advanced Rendering Techniques covers battle-tested tips and tricks for creating interesting geometry, realistic shading, real-time global illumination, and high-quality shadows, for optimizing 3D engines, and for taking advantage of the advanced power of the GPGPU.
Sample programs and source code to accompany some of the chapters are available at http://www.akpeters.com/gpupro
“GPU Pro 4: Advanced Rendering Techniques” presents ready-to-use ideas and procedures that can help solve many of your day-to-day graphics programming challenges. Focusing on interactive media and games, the book covers up-to-date methods for producing real-time graphics.
Section editors Wolfgang Engel, Christopher Oat, Carsten Dachsbacher, Michal Valient, Wessam Bahnassi, and Sebastien St-Laurent have once again assembled a high-quality collection of cutting-edge techniques for advanced graphics processing unit (GPU) programming. Divided into six sections, the book begins with discussions on the ability of GPUs to process and generate geometry in exciting ways. It next introduces new shading and global illumination techniques for the latest real-time rendering engines and explains how image space algorithms are becoming a key way to achieve a more realistic and higher quality final image. Moving on to the difficult task of rendering shadows, the book describes the state of the art in real-time shadow maps. It then covers game engine design, including quality, optimization, and high-level architecture. The final section explores approaches that go beyond the normal pixel and triangle scope of GPUs as well as techniques that take advantage of the parallelism of modern graphic processors in a variety of applications.
Useful to beginners and seasoned game and graphics programmers alike, this color book offers practical tips and techniques for creating real-time graphics. Example programs and source code are available for download on the book’s CRC Press web page. The directory structure of the online material closely follows the book structure by using the chapter numbers as the name of the subdirectory.
Programming Massively Parallel Processors: A Hands-on Approach (Applications of GPU Computing Series)
Multi-core processors are no longer the future of computing; they are the present-day reality. A typical mass-produced CPU features multiple processor cores, while a GPU (Graphics Processing Unit) may have hundreds or even thousands of cores. With the rise of multi-core architectures has come the need to teach advanced programmers a new and essential skill: how to program massively parallel processors.
Programming Massively Parallel Processors: A Hands-on Approach shows both student and professional alike the basic concepts of parallel programming and GPU architecture. Various techniques for constructing parallel programs are explored in detail. Case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs.
- Teaches computational thinking and problem-solving techniques that facilitate high-performance parallel computing.
- Utilizes CUDA (Compute Unified Device Architecture), NVIDIA’s software development tool created specifically for massively parallel environments.
The revised edition also covers performance, floating-point format, parallel patterns, and dynamic parallelism in depth.
This best-selling guide to CUDA and GPU parallel programming has been revised with more parallel programming examples, commonly-used libraries such as Thrust, and explanations of the latest tools. With these improvements, the book retains its concise, intuitive, practical approach based on years of road-testing in the authors’ own parallel computing courses.
Updates in this new edition include:
- New coverage of CUDA 5.0, improved performance, enhanced development tools, increased hardware support, and more
- Increased coverage of related technology, OpenCL and new material on algorithm patterns, GPU clusters, host programming, and data parallelism
- Two new case studies (on MRI reconstruction and molecular visualization) explore the latest applications of CUDA and GPUs for scientific research and high-performance computing
Virtually all semiconductor market domains, including PCs, game consoles, mobile handsets, servers, supercomputers, and networks, are converging to concurrent platforms. There are two important reasons for this trend. First, these concurrent processors can potentially offer more effective use of chip space and power than traditional monolithic microprocessors for many demanding applications. Second, an increasing number of applications that traditionally used Application Specific Integrated Circuits (ASICs) are now implemented with concurrent processors in order to improve functionality and reduce engineering cost. The real challenge is to develop applications software that effectively uses these concurrent processors to achieve efficiency and performance goals. The aim of this course is to provide students with knowledge and hands-on experience in developing applications software for processors with massively parallel computing resources. In general, we refer to a processor as massively parallel if it has the ability to complete more than 64 arithmetic operations per clock cycle. Many commercial offerings from NVIDIA, AMD, and Intel already offer such levels of concurrency. Effectively programming these processors will require in-depth knowledge about parallel programming principles, as well as the parallelism models, communication models, and resource limitations of these processors. The target audiences of the course are students who want to develop exciting applications for these processors, as well as those who want to develop programming tools and future implementations for these processors. Visit the CS193G companion website for course materials.
Code Optimization Techniques for Graphics Processing Units
Books on parallel programming theory often talk about such weird beasts as the PRAM model, a hypothetical machine that would provide the programmer with a number of processors proportional to the input size of the problem at hand. Modern general-purpose computers afford only a few processing units; four is currently a reasonable number. This limitation makes the development of highly parallel applications quite difficult for the average computer user. However, the low cost and increasing programmability of graphics processing units, popularly known as GPUs, are helping to overcome this difficulty. Presently, the application developer can have access, for a few hundred dollars, to hardware boasting hundreds of processing elements. This brave new world that is now open to many programmers brings, alongside the incredible possibilities, difficulties and challenges as well. Perhaps for the first time since the popularization of computers, it makes sense to open the compiler books at the final chapters, which talk about very unusual concepts such as polyhedral loops, iteration spaces, and Fourier-Motzkin transformations, to name only a few of these chimerical creatures. This material covers, in a very condensed way, some code generation and optimization techniques that a compiler would use to produce efficient code for graphics processing units. Through these techniques, the compiler writer tries to free the application developer from the intricacies and subtleties of GPU programming, giving them more freedom to focus on algorithms instead of micro-optimizations. We will discuss a little of what GPUs are, which applications should target them, how the compiler sees a GPU program, and how the compiler can transform this program so that it gets more out of this very powerful hardware.
Contemporary High Performance Computing: From Petascale toward Exascale
Amazon, ACM, CRC, Google books
“Contemporary High Performance Computing: From Petascale toward Exascale” focuses on the ecosystems surrounding the world’s leading centers for high performance computing (HPC). It covers many of the important factors involved in each ecosystem: computer architectures, software, applications, facilities, and sponsors.
The first part of the book examines significant trends in HPC systems, including computer architectures, applications, performance, and software. It discusses the growth from terascale to petascale computing and the influence of the TOP500 and Green500 lists. The second part of the book provides a comprehensive overview of 18 HPC ecosystems from around the world. Each chapter in this section describes programmatic motivation for HPC and their important applications; a flagship HPC system overview covering computer architecture, system software, programming systems, storage, visualization, and analytics support; and an overview of their data center/facility. The last part of the book addresses the role of clouds and grids in HPC, including chapters on the Magellan, FutureGrid, and LLGrid projects.
With contributions from top researchers directly involved in designing, deploying, and using these supercomputing systems, this book captures a global picture of the state of the art in HPC.
“High Performance Computing: Programming and Applications” presents techniques that address new performance issues in the programming of high performance computing (HPC) applications. Omitting tedious details, the book discusses hardware architecture concepts and programming techniques that are the most pertinent to application developers for achieving high performance. Even though the text concentrates on C and Fortran, the techniques described can be applied to other languages, such as C++ and Java.
Drawing on their experience with chips from AMD and systems, interconnects, and software from Cray Inc., the authors explore the problems that create bottlenecks in attaining good performance. They cover techniques that pertain to each of the three levels of parallelism:
- Message passing between the nodes
- Shared memory parallelism on the nodes or the multiple instruction, multiple data (MIMD) units on the accelerator
- Vectorization on the inner level
After discussing architectural and software challenges, the book outlines a strategy for porting and optimizing an existing application to a large massively parallel processor (MPP) system. With a look toward the future, it also introduces the use of general purpose graphics processing units (GPGPUs) for carrying out HPC computations. A companion website at www.hybridmulticoreoptimization.com contains all the examples from the book, along with updated timing results on the latest released processors.
GPUs may have started life as graphics processors, but recently they’ve emerged as a fantastic numerical co-processor for high-performance general applications on the CPU. This book not only teaches you the fundamentals of parallel programming with GPUs, it helps you think in parallel. You learn best practices, algorithms, and designs for achieving greater application performance with these processors.
Amazon recently added GPU supercomputing to its cloud-computing platform, a clear sign that parallel programming is becoming an essential skill. This book includes valuable input from major CPU and GPU manufacturers (Intel, NVIDIA, and AMD) to help experienced programmers get a head start on programming GPU applications.
- Understand the differences between parallel and sequential programming
- Learn about GPU architecture, including the runtime environment, threads, and memory
- Build and deploy GPU applications and libraries, and port existing applications
- Use debugging and profiling tools and techniques
- Write GPU programs for clusters and the cloud
GPUs in the Cloud
GPUs have been increasingly used as tools for high-performance computation, in addition to graphics. They play a major role in finance, bioinformatics, image processing, artificial intelligence, and other areas that require extremely high computational performance. Amazon has noticed this trend, and recently added GPU instances to EC2. This book shows you how to use Amazon’s EC2 GPU instances effectively, whatever your problem domain.
Parallel and Concurrent Programming in Haskell: Techniques for Multicore and Multithreaded Programming
This book covers the breadth of Haskell’s diverse selection of programming APIs for concurrent and parallel programming. It is split into two parts. The first part, on parallel programming, covers the techniques for using multiple processors to speed up CPU-intensive computations, including methods for using parallelism in both idiomatic Haskell and numerical array-based algorithms, and for running computations on a GPU. The second part, on concurrent programming, covers techniques for using multiple threads, including overlapping multiple I/O operations, building concurrent network servers, and distributed programming across multiple machines.
Intel Xeon Phi Coprocessor High Performance Programming
Authors Jim Jeffers and James Reinders spent two years helping educate customers about the prototype and pre-production hardware before Intel introduced the first Intel Xeon Phi coprocessor. They have distilled their own experiences coupled with insights from many expert customers, Intel Field Engineers, Application Engineers and Technical Consulting Engineers, to create this authoritative first book on the essentials of programming for this new architecture and these new products.
This book is useful even before you ever touch a system with an Intel Xeon Phi coprocessor. To ensure that your applications run at maximum efficiency, the authors emphasize key techniques for programming any modern parallel computing system whether based on Intel Xeon processors, Intel Xeon Phi coprocessors, or other high performance microprocessors. Applying these techniques will generally increase your program performance on any system, and better prepare you for Intel Xeon Phi coprocessors and the Intel MIC architecture.
- A practical guide to the essentials of the Intel Xeon Phi coprocessor
- Presents best practices for portable, high-performance computing and a familiar and proven threaded, scalar-vector programming model
- Includes simple but informative code examples that explain the unique aspects of this new highly parallel and high performance computational product
- Covers wide vectors, many cores, many threads and high bandwidth cache/memory architecture
Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors
This book will guide you to the mastery of parallel programming with Intel® Xeon® family products: Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors. It includes a detailed presentation of the programming paradigm for the Intel® Xeon® product family, optimization guidelines, and hands-on exercises on systems equipped with Intel® Xeon Phi™ coprocessors, as well as instructions on using the Intel software development tools and libraries included in Intel Parallel Studio XE.
This book is targeted toward developers familiar with C/C++ programming in Linux. Developers with little parallel programming experience will be able to grasp the core concepts of these subjects from the detailed commentary in Chapter 3. For advanced developers familiar with multi-core and/or GPU programming, the ebook offers materials specific to Intel compilers and Intel® Xeon® family products, as well as optimization advice pertinent to Many Integrated Core (MIC) architecture.
We have written these materials relying on key elements for efficient learning: practice and repetition. As a consequence, the reader will find a great number of code listings in the main section of these materials. In the extended appendix, we provide numerous hands-on exercises that one can complete either under an instructor's supervision or autonomously in a self-paced training environment.
This document is different from a typical book on computer science, because we intended it to be used as a lecture plan in an intensive learning course. Speaking in programming terms, a typical book traverses material with a depth-first algorithm, describing every detail of each method or concept before moving on to the next method. In contrast, this document traverses the scope of materials with a breadth-first algorithm. First, we give an overview of multiple methods to address a certain issue. In the subsequent chapter, we re-visit these methods, this time in greater detail. We may go into even more depth down the line. In this way, we expect that developers will have enough time to absorb and comprehend the variety of programming and optimization methods presented here.
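The difference between the two traversal orders can be sketched in a few lines of Python. This is only an illustration: the `OUTLINE` dictionary and its topic names are hypothetical and are not taken from the book's actual table of contents.

```python
from collections import deque

# Hypothetical course outline: each topic maps to its sub-topics.
# The names are illustrative only and do not come from the book.
OUTLINE = {
    "course": ["vectorization", "threading"],
    "vectorization": ["vectorization in detail"],
    "threading": ["threading in detail"],
    "vectorization in detail": [],
    "threading in detail": [],
}


def depth_first(outline, root):
    """Cover a topic and everything beneath it before moving to its sibling."""
    order = []
    stack = [root]
    while stack:
        topic = stack.pop()
        order.append(topic)
        # Push children in reverse so the first child is visited first.
        stack.extend(reversed(outline[topic]))
    return order


def breadth_first(outline, root):
    """Give an overview of every topic on one level before going deeper."""
    order = []
    queue = deque([root])
    while queue:
        topic = queue.popleft()
        order.append(topic)
        queue.extend(outline[topic])
    return order


if __name__ == "__main__":
    print("depth-first:  ", depth_first(OUTLINE, "course"))
    print("breadth-first:", breadth_first(OUTLINE, "course"))
```

In the depth-first order, "vectorization in detail" comes directly after "vectorization"; in the breadth-first order, both overview topics come first and the detailed topics follow, which mirrors the overview-then-revisit structure described above.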
High Performance Programming for Soft Computing
This book examines the present and future of soft computing techniques. It explains how to use the latest technological tools, such as multicore processors and graphics processing units, to implement highly efficient intelligent system methods using a general purpose computer.
Multicore Computing: Algorithms, Architectures, and Applications
Every area of science and engineering today has to process voluminous data sets. Using exact, or even approximate, algorithms to solve intractable problems in critical areas, such as computational biology, takes time that is exponential in some of the underlying parameters. Parallel computing addresses this issue and has become affordable with the advent of multicore architectures. However, programming multicore machines is much more difficult due to oddities existing in the architectures.
Offering insights into different facets of this area, “Multicore Computing: Algorithms, Architectures, and Applications” focuses on the architectures, algorithms, and applications of multicore computing. It will help readers understand the intricacies of these architectures and prepare them to design efficient multicore algorithms.
Contributors at the forefront of the field cover the memory hierarchy for multicore and manycore processors, the caching strategy Flexible Set Balancing, the main features of the latest SPARC architecture specification, the Cilk and Cilk++ programming languages, the numerical software library Parallel Linear Algebra Software for Multicore Architectures (PLASMA), and the exact multipattern string matching algorithm of Aho-Corasick. They also describe the architecture and programming model of the NVIDIA Tesla GPU, discuss scheduling directed acyclic graphs onto multi/manycore processors, and evaluate design trade-offs among Intel and AMD multicore processors, IBM Cell Broadband Engine, and NVIDIA GPUs. In addition, the book explains how to design algorithms for the Cell Broadband Engine and how to use the backprojection algorithm for generating images from synthetic aperture radar data.
Enrich your 3D scenes with the power of GLSL!
- Learn about shaders in a step-by-step, interactive manner
- Create stunning visual effects using vertex and fragment shaders
- Simplify your CPU code and improve your overall performance with instanced drawing through the use of geometry shaders
Shader programming has been the largest revolution in graphics programming. OpenGL Shading Language (abbreviated GLSL or GLslang) is a high-level shading language based on the syntax of the C programming language. With GLSL you can execute code on your GPU (aka graphics card). More sophisticated effects can be achieved with this technique. Therefore, knowing how OpenGL works and how the shader types interact with one another, as well as how they are integrated into the system, is imperative for graphics programmers. This knowledge is crucial in order to be familiar with the mechanisms for rendering 3D objects.
GLSL Essentials is the only book on the market that teaches you about shaders from the very beginning. It shows you how graphics programming has evolved, in order to understand why you need each stage in the Graphics Rendering Pipeline, and how to manage it in a simple but concise way. This book explains how shaders work in a step-by-step manner, with an explanation of how they interact with the application assets at each stage.
This book will take you through the graphics pipeline and will describe each section in an interactive and clear way. You will learn how the OpenGL state machine works and all its relevant stages. Vertex shaders, fragment shaders, and geometry shaders will be covered, as well as some use cases and an introduction to the math needed for lighting algorithms or transforms. Generic GPU programming (GPGPU) will also be covered.
After reading GLSL Essentials you will be ready to generate any rendering effect you need.
What you will learn from this book
- Use vertex shaders to dynamically displace or deform a mesh on the fly
- Colorize your pixels unleashing the power of fragment shaders
- Learn the basics of the Phong Illumination model to add emphasis to your scenes
- Combine textures to make your scene more realistic
- Save CPU and GPU cycles by performing instanced drawing
- Save bandwidth by generating geometry on the fly
- Learn about GPU Generic programming concepts
- Convert algorithms from CPU to GPU to increase performance
This book is a practical guide to the OpenGL Shading Language, which contains several real-world examples that will allow you to grasp the core concepts easily and the use of the GLSL for graphics rendering applications.
Who this book is written for
If you want to upgrade your skills, or are new to shader programming and want to learn about graphics programming, this book is for you. If you want a clearer idea of shader programming, or simply want to upgrade from fixed pipeline systems to state-of-the-art shader programming and are familiar with any C-based language, then this book will show you what you need to know.
Over 60 highly focused, practical recipes to maximize your OpenGL Shading language use
- A full set of recipes demonstrating simple and advanced techniques for producing high-quality, real-time 3D graphics using GLSL 4.0
- How to use the OpenGL Shading Language to implement lighting and shading techniques
- Use the new features of GLSL 4.0 including tessellation and geometry shaders
- How to use textures in GLSL as part of a wide variety of techniques from basic texture mapping to deferred shading
- Simple, easy-to-follow examples with GLSL source code, as well as a basic description of the theory behind each technique
The OpenGL Shading Language (GLSL) is a programming language used for customizing parts of the OpenGL graphics pipeline that were formerly fixed-function, and are executed directly on the GPU. It provides programmers with unprecedented flexibility for implementing effects and optimizations utilizing the power of modern GPUs. With version 4.0, the language has been further refined to provide programmers with greater flexibility, and additional features have been added such as an entirely new stage called the tessellation shader.
The OpenGL Shading Language 4.0 Cookbook provides easy-to-follow examples that first walk you through the theory and background behind each technique then go on to provide and explain the GLSL and OpenGL code needed to implement it. Beginning level through to advanced techniques are presented including topics such as texturing, screen-space techniques, lighting, shading, tessellation shaders, geometry shaders, and shadows.
The OpenGL Shading Language 4.0 Cookbook is a practical guide that takes you from the basics of programming with GLSL 4.0 and OpenGL 4.0, through basic lighting and shading techniques, to more advanced techniques and effects. It presents techniques for producing basic lighting and shading effects; examples that demonstrate how to make use of textures for a wide variety of effects and as part of other techniques; examples of screen-space techniques, shadowing, tessellation and geometry shaders, noise, and animation.
The OpenGL Shading Language 4.0 Cookbook provides examples of modern shading techniques that can be used as a starting point for programmers to expand upon to produce modern, interactive, 3D computer graphics applications.
What you will learn from this book
- Compile, install, and communicate with shader programs
- Use new features of GLSL 4.0 such as subroutines and uniform blocks
- Implement basic lighting and shading techniques such as diffuse and specular shading, per-fragment shading, and spotlights
- Apply single or multiple textures
- Use textures as environment maps for simulating reflection or refraction
- Implement screen-space techniques such as gamma correction, blur filters, and deferred shading
- Implement geometry and tessellation shaders
- Learn shadowing techniques including shadow mapping and screen space ambient occlusion
- Use noise in shaders
- Use shaders for animation
This hands-on guide cuts short the preamble and gets straight to the point – actually creating graphics, instead of just theoretical learning. Each recipe is specifically tailored to satisfy your appetite for producing real-time 3-D graphics using GLSL 4.0.
Who this book is written for
If you are an OpenGL programmer looking to use the modern features of GLSL 4.0 to create real-time, three-dimensional graphics, then this book is for you. Familiarity with OpenGL programming, along with the typical 3D coordinate systems, projections, and transformations is assumed. It can also be useful for experienced GLSL programmers who are looking to implement the techniques that are presented here.
Over 70 recipes demonstrating simple and advanced techniques for producing high-quality, real-time 3D graphics using OpenGL and GLSL 4.x
- Discover simple and advanced techniques for leveraging modern OpenGL and GLSL
- Learn how to use the newest features of GLSL including compute shaders, geometry, and tessellation shaders
- Get to grips with a wide range of techniques for implementing shadows using shadow maps, shadow volumes, and more
- Clear, easy-to-follow examples with detailed explanations and full, cross-platform source code available from GitHub
OpenGL Shading Language (GLSL) is a programming language used for customizing parts of the OpenGL graphics pipeline that were formerly fixed-function, and are executed directly on the GPU. It provides programmers with unprecedented flexibility for implementing effects and optimizations utilizing the power of modern GPUs. With Version 4, the language has been further refined to provide programmers with greater power and flexibility, with new stages such as tessellation and compute.
OpenGL 4 Shading Language Cookbook provides easy-to-follow examples that first walk you through the theory and background behind each technique, and then go on to provide and explain the GLSL and OpenGL code needed to implement it. Beginner level through to advanced techniques are presented including topics such as texturing, screen-space techniques, lighting, shading, tessellation shaders, geometry shaders, compute shaders, and shadows.
OpenGL 4 Shading Language Cookbook is a practical guide that takes you from the fundamentals of programming with modern GLSL and OpenGL, through to advanced techniques. The recipes build upon each other and take you quickly from novice to advanced level code.
You’ll see essential lighting and shading techniques; examples that demonstrate how to make use of textures for a wide variety of effects and as part of other techniques; examples of screen-space techniques including HDR rendering, bloom, and blur; shadowing techniques; tessellation, geometry, and compute shaders; how to use noise effectively; and animation with particle systems.
OpenGL 4 Shading Language Cookbook provides examples of modern shading techniques that can be used as a starting point for programmers to expand upon to produce modern, interactive, 3D computer graphics applications.
What you will learn from this book
- Compile, debug, and communicate with shader programs
- Use new features of GLSL 4 such as subroutines, sampler objects, and uniform blocks
- Implement core lighting and shading techniques such as diffuse and specular shading, per-fragment shading, and spotlights
- Use textures for a variety of effects including cube maps for reflection or refraction
- Implement screen-space techniques such as HDR, bloom, blur filters, order-independent transparency, and deferred shading
- Utilize noise in shaders
- Use shaders for animation
- Make use of compute shaders for physics, animation, and general computing
- Learn how to use new OpenGL features such as shader storage buffer objects, and image load/store
OpenCL Programming by Example
A comprehensive guide on OpenCL programming with examples
- Learn about all of the OpenCL Architecture and major APIs
- Learn OpenCL programming with simple examples from Image Processing, Pattern Recognition and Statistics with detailed code explanation
- Explore several aspects of optimization techniques, with code examples to guide you through the process
- Understand how to use OpenCL in your problem domains
Research in parallel programming has been a mainstream topic for a decade, and will continue to be so for many decades to come. Many parallel programming standards and frameworks exist, but they take into account only one type of hardware architecture. Today, computing platforms come with many heterogeneous devices. OpenCL provides a royalty-free standard for programming heterogeneous hardware.
This guide offers you a compact coverage of all the major topics of OpenCL programming. It explains optimization techniques and strategies in-depth, using illustrative examples and also provides case studies from diverse fields. Beginners and advanced application developers will find this book very useful.
Beginning with the discussion of the OpenCL models, this book explores their architectural view, programming interfaces and primitives. It slowly demystifies the process of identifying the data and task parallelism in diverse algorithms.
It presents examples from different domains to show how problems within those domains can be solved more efficiently using OpenCL. You will learn about parallel sorting, histogram generation, JPEG compression, linear and parabolic regression, and k-nearest neighborhood, a clustering algorithm in pattern recognition. Following on from this, optimization strategies are explained with matrix multiplication examples. You will also learn how to make OpenGL and OpenCL interoperate.
“OpenCL Programming by Example” explains OpenCL in the simplest possible language, which beginners will find easy to understand. Developers and programmers from different domains who want to achieve acceleration for their applications will find this book very useful.
What you will learn from this book
- Understand OpenCL Platform Model, Execution Model, Memory Model, and Programming Model
- Explore the different OpenCL objects and the APIs for building kernels, memory allocation, data transfer, synchronization, and much more
- Get to grips with API explanations, featuring simple examples
- Create image processing examples such as Image histogram and Image convolution
- Learn optimization techniques with Matrix Multiplication examples
- Develop Bitonic sort in OpenCL
- Build a JPEG decoder using OpenCL
- Construct linear and parabolic regression equations in OpenCL
- Compose the k-nearest neighborhood clustering algorithm from pattern recognition
- Use OpenCL with OpenGL interoperability
This book follows an example-driven, simplified, and practical approach to using OpenCL for general purpose GPU programming.
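As a rough illustration of the kind of data-parallel algorithm listed above, the following is a minimal sketch of a bitonic sorting network. It is written in plain Python rather than OpenCL and is not the book's code; the function name `bitonic_sort` is hypothetical. The inner `for i` loop stands in for what would be one work-item per index on a GPU, since every compare-and-swap within a pass touches a distinct pair of elements.

```python
def bitonic_sort(values):
    """Return an ascending copy of `values`; the length must be a power of two."""
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    k = 2  # size of the bitonic sequences being merged in this stage
    while k <= n:
        j = k // 2  # distance between the two elements of each compared pair
        while j >= 1:
            # Each iteration of this loop is independent of the others,
            # so on a GPU every index i could be handled by its own work-item.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Blocks of size k alternate between ascending and descending.
                    ascending = (i & k) == 0
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data


if __name__ == "__main__":
    print(bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5]))
```

The sequence of comparisons depends only on the input length, not on the data, which is why this family of algorithms maps well to GPUs. Inputs whose length is not a power of two would need padding, which this sketch does not attempt.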
Over 40 recipes to help you learn, understand, and implement modern OpenGL in your applications
- Explores current graphics programming techniques including GPU-based methods from the outlook of modern OpenGL 3.3
- Includes GPU-based volume rendering algorithms
- Discover how to employ GPU-based path and ray tracing
- Create 3D mesh formats and skeletal animation with GPU skinning
- Explore graphics elements including lights and shadows in an easy to understand manner
OpenGL is the leading cross-language, multi-platform API used by masses of modern games and applications in a vast array of different sectors. Developing graphics with OpenGL lets you harness the increasing power of GPUs and really take your visuals to the next level.
OpenGL Development Cookbook is your guide to graphics programming techniques, such as implementing 3D mesh formats and skeletal animation, that help you learn and understand OpenGL.
OpenGL Development Cookbook introduces you to modern OpenGL, beginning with vertex-based deformations, common mesh formats, and skeletal animation with GPU skinning, and going on to demonstrate the different shader stages in the graphics pipeline. It focuses on providing you with practical examples on complex topics, such as variance shadow mapping and GPU-based path and ray tracing. By the end you will be familiar with the latest advanced GPU-based volume rendering techniques.
What you will learn from this book
- Create an OpenGL 3.3 rendering context
- Get to grips with camera-based viewing and object picking techniques
- Learn off-screen rendering and environment mapping techniques to render mirrors
- Discover shadow mapping techniques, including variance shadow mapping
- Implement a particle system using shaders
- Learn about GPU-based methods for global illumination using spherical harmonics and SSAO
- Understand translucent geometry and order independent transparency using dual depth peeling
- Explore GPU-based volumetric lighting using half angle slicing and physically based simulation on the GPU using transform feedback
This hands-on Cookbook cuts through the preamble and gets straight to the point. OpenGL Development Cookbook is perfect for intermediate C++ programmers. It is full of practical techniques for implementing amazing computer graphics and visualizations using OpenGL. Crammed full of useful recipes, OpenGL Development Cookbook will help you exploit OpenGL to its full potential.
Who this book is written for
OpenGL Development Cookbook is geared toward intermediate OpenGL programmers who want to move to the next level and create amazing OpenGL graphics in their applications.
Videogame Graphics, BigData & Analytics:
A review of the evolution of videogames graphics processing, the importance of triangles, the efficiency of geometry, its employment within the rendering pipeline, its emergence as a driving force behind GPU development, the new world of GPU Computing and the convergence of gaming technology in the world of BigData & Analytics
The purpose of this coffee shop read is to highlight the criticality of videogames as a component of the “Convergence” of some amazing technologies (in particular: Cloud, Gaming/MMOG, Gamification and BigData) that is clear to many inside the IT world. I am not a deep technical “guru”; I am a businessman who seeks to understand these technologies in order to find a means by which they can be leveraged, ultimately for commercial gain.
This short book is the output of my investigation of videogames and Massively Multi-user Online Games (MMOG). It is written in as chronological an order as could be achieved, to take other business, non-IT, and non-programming-literate readers on the journey I took. That journey deepened my understanding of why the once humble graphics processing capabilities have become part of the bedrock for our future exploitation of computer processing as a whole.
In doing so it is hoped this short book has answered some seemingly simple questions during the journey, namely:
- Why were GPUs developed?
- Why are triangles so important to graphics processing?
- Why are high degrees of parallelism becoming increasingly important?
- How are GPUs being utilized to deliver significant gains in industries and market sectors far beyond the original design criteria for the GPU?
- Why can GPUs not wholly replace CPUs, and why is the future most likely a symbiosis of the two capabilities, leveraging each for its inherent strengths?
For much more on the Convergence of these technologies please review my website:
Designing Scientific Applications on GPUs
Many of today’s complex scientific applications now require a vast amount of computational power. General purpose graphics processing units (GPGPUs) enable researchers in a variety of fields to benefit from the computational power of all the cores available inside graphics cards.
Understand the Benefits of Using GPUs for Many Scientific Applications
“Designing Scientific Applications on GPUs” shows you how to use GPUs for applications in diverse scientific fields, from physics and mathematics to computer science. The book explains the methods necessary for designing or porting your scientific application on GPUs. It will improve your knowledge about image processing, numerical applications, methodology to design efficient applications, optimization methods, and much more.
Everything You Need to Design/Port Your Scientific Application on GPUs
The first part of the book introduces the GPUs and Nvidia’s CUDA programming model, currently the most widespread environment for designing GPU applications. The second part focuses on significant image processing applications on GPUs. The third part presents general methodologies for software development on GPUs and the fourth part describes the use of GPUs for addressing several optimization problems. The fifth part covers many numerical applications, including obstacle problems, fluid simulation, and atomic physics models. The last part illustrates agent-based simulations, pseudorandom number generation, and the solution of large sparse linear systems for integer factorization. Some of the codes presented in the book are available online.