/** 
* @file:   README
* CVS:     $Id$
* @author: Asim YarKhan yarkhan@icl.utk.edu
* @author: Heike McCraw mccraw@icl.utk.edu
* @defgroup papi_components Components
* @brief Component Specific Readme file: CUDA
*/

/** @page component_readme Component Readme 

@section Component Specific Information

cuda/ 

CUDA component update: Support for CUPTI metrics (early release)


Known problems and limitations in early release of metric support
-----------------------------------------------------------------

* Only sets of metrics and events that can be gathered in a single
  pass are supported.  Transparent multi-pass support is expected in
  a later release.
* All metrics are returned as long long integers, which means that
  CUPTI double precision values will be truncated, possibly severely
  (for example, a fractional value below 1 would be reported as 0).
* The NVLink metrics have been disabled for this alpha release.


General information
-------------------

The PAPI CUDA component is a hardware performance counter measurement
technology for the NVIDIA CUDA platform which provides access to the
hardware counters inside the GPU. PAPI CUDA is based on CUPTI support
in the NVIDIA driver library. In any environment where the
CUPTI-enabled driver is installed, the PAPI CUDA component should be
able to provide detailed performance counter information regarding
events on the GPU kernels.  

NOTE: When adding CUDA-related events or metrics to the CUDA
component, each event can be added within a user-specified CUDA
context.  If the event is added outside its context or in no context,
a default CUDA context will be created for the event.

NOTE: In order to disable and destroy the CUDA eventGroup properly,
the user has to call PAPI_cleanup_eventset( EventSet ) before calling
PAPI_shutdown() in the application. This is important since it also
frees the performance monitoring hardware on the GPU.

This PAPI CUDA component has been developed and tested using CUDA
version 10.1 and the associated CUPTI library. CUPTI is released with
the CUDA Tools SDK.

How to install PAPI with the CUDA component?
-------------------------------------------- 

There is ONE required environment variable: PAPI_CUDA_ROOT. It is
required both at compile time and at runtime.

An example that works on ICL's Saturn system (at this writing):
export PAPI_CUDA_ROOT=/usr/local/cuda-10.1

Within PAPI_CUDA_ROOT, we expect the following standard directories:
PAPI_CUDA_ROOT/include
PAPI_CUDA_ROOT/lib64
PAPI_CUDA_ROOT/extras/CUPTI/include
PAPI_CUDA_ROOT/extras/CUPTI/lib64
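
The layout above can be checked from the shell before configuring.
The following is a minimal sketch, not something shipped with PAPI:
the function name check_papi_cuda_root is hypothetical, and it only
tests that the four directories exist, not that their contents are
usable.

```shell
# Hypothetical helper (not part of PAPI): report which of the four
# expected directories exist under a CUDA root.
# Returns 0 if all are present, 1 if any is missing, 2 if no root is given.
check_papi_cuda_root() {
    # Use the first argument if given, otherwise fall back to PAPI_CUDA_ROOT.
    root="${1:-${PAPI_CUDA_ROOT:-}}"
    if [ -z "$root" ]; then
        echo "PAPI_CUDA_ROOT is not set" >&2
        return 2
    fi
    missing=0
    for sub in include lib64 extras/CUPTI/include extras/CUPTI/lib64; do
        if [ -d "$root/$sub" ]; then
            echo "found:   $root/$sub"
        else
            echo "missing: $root/$sub"
            missing=1
        fi
    done
    return "$missing"
}
```

Once the function is defined in the current shell, it can be run as:
    % check_papi_cuda_root "$PAPI_CUDA_ROOT"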

On a system with a standard installation, PAPI_CUDA_ROOT is the only
environment variable required at compile time and at runtime.

System configurations can vary. Some systems use Spack, a package
manager, to automatically keep paths straight. Others (like our own
ICL Saturn System) require "module load" commands to provide some
services, e.g. 'module load cuda-10.1', and these may also set
environment variables and change the LD_LIBRARY_PATH search order.

Users may require the help of sysadmin personnel to navigate these
facilities and gain access to the correct libraries.

Configure PAPI with CUDA enabled. We presume you have navigated to the
directory papi/src. In that directory:
    % ./configure --with-components="cuda"

Build with PAPI_CUDA_ROOT specified (ICL's Saturn example again):
    % export PAPI_CUDA_ROOT=/usr/local/cuda-10.1
    % make 

To test that the component is installed, still from papi/src:
    % utils/papi_component_avail

For the CUDA component to be operational, it must find the dynamic
libraries libcuda.so, libcudart.so, and libcupti.so.

If any of these are not found (or are not functional), the component
will be listed as "disabled" with a reason explaining the problem. If
a library was not found, it is not in any of the expected places. The
component can be configured to look for each of these libraries in a
specific place, and under an alternate name if desired. Detailed
instructions are contained in the Rules.cuda file. They are
technical; users may wish to enlist the help of a sysadmin.
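
When a library is reported missing, one way to start looking is to
scan the directories in LD_LIBRARY_PATH for the three names. The
sketch below is only a heuristic and is NOT the component's own lookup
logic (that is described in Rules.cuda): the dynamic loader also
consults other locations, such as the system linker cache, which this
does not cover. The function name find_cuda_libs is hypothetical.

```shell
# Hypothetical helper (not part of PAPI): for each library the CUDA
# component needs, print the first LD_LIBRARY_PATH directory holding it.
# Returns 0 if all three were found this way, 1 otherwise.
find_cuda_libs() {
    status=0
    for lib in libcuda.so libcudart.so libcupti.so; do
        hit=""
        # Split LD_LIBRARY_PATH on ':' and walk it in search order;
        # the first directory containing the library wins.
        old_ifs=$IFS
        IFS=:
        for dir in ${LD_LIBRARY_PATH:-}; do
            if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
                hit="$dir/$lib"
                break
            fi
        done
        IFS=$old_ifs
        if [ -n "$hit" ]; then
            echo "$lib: $hit"
        else
            echo "$lib: not found in LD_LIBRARY_PATH"
            status=1
        fi
    done
    return "$status"
}
```

A "not found" line here does not by itself mean the component will be
disabled; it only narrows down where to look next.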

To list the events supported by the CUDA component:
    % utils/papi_native_avail | grep -i CUDA

*/
