error with 7a068df #35

Open
edisonchan opened this issue Nov 26, 2022 · 2 comments

Comments

@edisonchan

OS: Windows 11
Compiler: Visual Studio 2022 MSVC
OpenCL SDK: KhronosGroup OpenCL SDK (https://github.com/KhronosGroup/OpenCL-Guide/blob/main/chapters/getting_started_windows.md)

mixbench-ocl

mixbench-ocl ()
Use "-h" argument to see available options
------------------------ Device specifications ------------------------
Platform: NVIDIA CUDA
Device: NVIDIA GeForce RTX 4080/NVIDIA Corporation
Driver version: 526.98
Address bits: 64
GPU clock rate: 2505 MHz
Total global mem: 16375 MB
Max allowed buffer: 4093 MB
OpenCL version: OpenCL 3.0 CUDA
Total CUs: 76

Buffer size: 256MB
Workgroup size: 256
Elements per workitem: 8
Workitem fusion degree: 4
Workitem stride: NDRange
Buffer allocation: Device allocated
Timer: CL event based
Warning: Half precision computations are not supported
Loading kernel source file...
Precompilation of kernels... OpenCL error in file 'G:\git\mixbench\mixbench-opencl\mix_kernels_ocl.cpp' in line 89 : Code -30.

@ekondis
Owner

ekondis commented Nov 26, 2022

Thank you for reporting this. Code -30 is CL_INVALID_VALUE, returned here during OpenCL kernel compilation, but it is not clear what causes it.

Do other OpenCL programs run correctly? For example, clpeak: https://github.com/krrishnarraj/clpeak
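
To make a report like "Code -30" easier to read, the numeric status can be translated to its symbolic name before printing. The following is only a minimal sketch: `cl_error_name` and `describe_cl_error` are hypothetical helpers, not part of mixbench, and the table covers just a subset of the status codes defined in `CL/cl.h`.

```cpp
#include <string>

// Hypothetical helper (not part of mixbench): maps a subset of OpenCL
// status codes to their symbolic names. The numeric values are the ones
// defined in CL/cl.h; codes not listed here fall through to the default.
const char* cl_error_name(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -1:  return "CL_DEVICE_NOT_FOUND";
        case -3:  return "CL_COMPILER_NOT_AVAILABLE";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -30: return "CL_INVALID_VALUE";
        case -33: return "CL_INVALID_DEVICE";
        case -43: return "CL_INVALID_BUILD_OPTIONS";
        case -44: return "CL_INVALID_PROGRAM";
        default:  return "UNKNOWN_CL_ERROR";
    }
}

// Formats a status code as "NAME (number)" for log messages.
std::string describe_cl_error(int code) {
    return std::string(cl_error_name(code)) + " (" + std::to_string(code) + ")";
}
```

With such a helper, `describe_cl_error(-30)` yields "CL_INVALID_VALUE (-30)". Note that a failure inside the kernel source itself would normally surface as CL_BUILD_PROGRAM_FAILURE (-11), so a -30 at this point more likely indicates an invalid argument passed to an API call than a syntax error in the kernel, though the log alone does not confirm which call failed.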

@edisonchan
Author

edisonchan commented Nov 27, 2022

clpeak builds and runs fine here:

Platform: NVIDIA CUDA
  Device: NVIDIA GeForce RTX 4080
    Driver version  : 526.98 (Win64)
    Compute units   : 76
    Clock frequency : 2505 MHz

    Global memory bandwidth (GBPS)
      float   : 612.28
      float2  : 631.80
      float4  : 639.96
      float8  : 648.81
      float16 : 656.37

    Single-precision compute (GFLOPS)
      float   : 52304.35
      float2  : 51823.82
      float4  : 52095.66
      float8  : 51354.73
      float16 : 51322.97

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 853.48
      double2  : 852.69
      double4  : 850.52
      double8  : 846.52
      double16 : 838.56

    Integer compute (GIOPS)
      int   : 26660.84
      int2  : 26533.69
      int4  : 26473.44
      int8  : 26544.63
      int16 : 26350.34

    Integer compute Fast 24bit (GIOPS)
      int   : 26459.70
      int2  : 26463.14
      int4  : 26457.42
      int8  : 26354.03
      int16 : 25947.06

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 15.07
      enqueueReadBuffer               : 13.99
      enqueueWriteBuffer non-blocking : 15.06
      enqueueReadBuffer non-blocking  : 14.00
      enqueueMapBuffer(for read)      : 21.76
        memcpy from mapped ptr        : 22.84
      enqueueUnmap(after write)       : 26.33
        memcpy to mapped ptr          : 22.43

    Kernel launch latency : 8.61 us

There is no problem with mixbench 0.04 either.
