
NVIDIA CUDA TOOLKIT


Download CUDA Toolkit

CUDA® is a parallel computing platform and programming model developed by
NVIDIA. It allows you to significantly improve computing performance by
harnessing the power of the graphics processing unit (GPU).

CUDA was developed with several design goals in mind:

 * Provide a small set of extensions to standard programming languages, such as
   C, that enable straightforward implementations of parallel algorithms. With
   CUDA C/C++, programmers can focus on parallelizing algorithms rather than on
   the mechanics of their implementation.
 * Support for heterogeneous computing, where applications use both the CPU and
   GPU. The sequential portions of the applications are executed on the CPU,
   while the parallel portions are offloaded to the GPU. In this way, CUDA can
   be gradually applied to existing applications. The CPU and GPU are treated as
   separate devices that have their own memory areas. This configuration also
   allows simultaneous computation on the CPU and GPU without competing for
   memory resources.
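As a sketch of this host/device split (the vector size and variable names below are illustrative, not from the original), a minimal CUDA C++ program prepares data on the CPU, copies it into the GPU's separate memory area, offloads the parallel portion as a kernel, and copies the result back:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: the parallel portion, offloaded to the GPU.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host (CPU) and device (GPU) memories are separate areas.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }  // sequential portion on the CPU

    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes); cudaMalloc(&d_b, bytes); cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Parallel portion: one thread per element.
    vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %f\n", h_c[0]);
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

Because the two memory areas are independent, the CPU could continue with other work between the kernel launch and the final copy without competing with the GPU for memory.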

GPUs that support CUDA have hundreds of cores that can collectively run
thousands of compute threads. These cores share resources, including a register
file and shared memory. On-chip shared memory allows parallel tasks running on
these cores to communicate without sending data across the system memory bus.
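As an illustration of this (kernel and buffer names here are illustrative), a block-level sum reduction lets the threads of one block exchange partial results entirely through on-chip __shared__ memory:

```cuda
#include <cuda_runtime.h>

// Each block sums 256 elements; partial results move through on-chip
// shared memory and never cross the system memory bus.
__global__ void blockSum(const float *in, float *out) {
    __shared__ float buf[256];            // shared by all threads in this block
    int t = threadIdx.x;
    buf[t] = in[blockIdx.x * blockDim.x + t];
    __syncthreads();                      // wait until every thread has written

    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (t < s) buf[t] += buf[t + s];  // exchange values via shared memory
        __syncthreads();
    }
    if (t == 0) out[blockIdx.x] = buf[0]; // one result per block to global memory
}
```

Launched as `blockSum<<<numBlocks, 256>>>(d_in, d_out)`, only the single per-block result touches global memory.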

This guide shows you how to install and verify that the CUDA development tools
are working properly.


FEATURES AND HIGHLIGHTS

 * GPU timestamp: launch timestamp
 * Method: the name of the GPU method. This is either "memcpy*" for memory
   copies or the name of the GPU kernel. Memory copies have a suffix that
   describes the type of memory transfer; for example, "memcpyDtoHAsync" means
   an asynchronous transfer from device memory to host memory
 * GPU time: the time it takes to execute the method on the GPU
 * CPU time: the sum of the GPU time and the CPU overhead of launching the
   method. At the driver-generated data level, CPU time is only the CPU
   overhead for non-blocking methods; for blocking methods, it is the sum of
   GPU time and CPU overhead. All kernel launches are non-blocking by default,
   but if any profiler counters are enabled, kernel launches become blocking.
   Asynchronous memory copy requests in different streams are not blocking
 * Thread ID: identification number for the host thread

Columns for kernel methods only:

 * Occupancy: the ratio of the number of active warps per multiprocessor to
   the maximum number of active warps
 * Profiler counters: for a list of supported counters, see Profiler Counters
 * grid size: the number of blocks in the grid along the X, Y, and Z
   dimensions, displayed as [num_blocks_X num_blocks_Y num_blocks_Z] in one
   column
 * block size: the number of threads in a block along the X, Y, and Z
   dimensions, displayed as [num_threads_X num_threads_Y num_threads_Z] in one
   column
 * dyn smem per block: dynamic shared-memory size per block, in bytes
 * sta smem per block: static shared-memory size per block, in bytes
 * reg per thread: the number of registers used per thread

Columns for memcopy methods only:

 * memory transfer size: the size of the memory transferred, in bytes
 * Host memory transfer type: indicates whether the transfer uses pageable
   or page-locked (pinned) host memory
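To connect these columns back to source code, a hypothetical kernel launch (all names below are illustrative) shows where the grid size, block size, and static vs. dynamic shared-memory figures come from:

```cuda
#include <cuda_runtime.h>

__global__ void kern(float *p) {
    __shared__ float sbuf[128];       // static smem per block: 128 * 4 = 512 bytes
    extern __shared__ float dbuf[];   // dynamic smem per block: size set at launch
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    dbuf[threadIdx.x] = sbuf[threadIdx.x % 128] = p[i];
}

void launch(float *d_p) {
    dim3 grid(16, 8, 1);              // grid size column: [16 8 1]
    dim3 block(32, 4, 1);             // block size column: [32 4 1]
    // Third launch parameter = dynamic shared-memory bytes per block.
    kern<<<grid, block, 128 * sizeof(float)>>>(d_p);
}
```

The profiler reads these values from the launch configuration, so the [X Y Z] columns mirror the `dim3` arguments exactly.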


VERIFYING THE INSTALLATION

Follow these steps to verify the installation −

Step 1 − Check the version of CUDA toolkit by entering nvcc -V at the command
line.

Step 2 − Run deviceQuery.exe, located at C:\ProgramData\NVIDIA Corporation\CUDA
Samples\v9.1\bin\win64\Release, to view information about your video card. The
result will look like this −



Step 3 − Run the bandwidth test (bandwidthTest.exe), located at C:\ProgramData\NVIDIA
Corporation\CUDA Samples\v9.1\bin\win64\Release. This ensures that the host and
device can communicate with each other correctly. The output will look like this
−



If any of the above tests fail, it means that the toolkit was not installed
properly. Repeat the installation following the instructions above.
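If the precompiled samples are not available, a minimal stand-in for deviceQuery (a sketch using the CUDA runtime API, not the sample itself) can confirm that a CUDA-capable device is visible:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    // Fails or reports zero if the driver/toolkit installation is broken.
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA-capable device found\n");
        return 1;
    }
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("Device 0: %s, compute capability %d.%d, %d multiprocessors\n",
           prop.name, prop.major, prop.minor, prop.multiProcessorCount);
    return 0;
}
```

Compile with `nvcc check.cu -o check` and run it; any output naming your GPU indicates the toolkit and driver can talk to the device.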

