512-filter nets do not work with the CUDA backend on GTX cards #1706

Open
Kovax007 opened this issue Mar 7, 2022 · 1 comment

Comments


Kovax007 commented Mar 7, 2022

BUG REPORT

Describe the bug
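Lc0 with the cuda backend fails on a GTX card when a 512-filter net is loaded: the worker thread aborts with "CUDA error: too many resources requested for launch".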

Steps to Reproduce

  1. Run lc0 with the cuda backend and a 512-filter net (the problem occurs only with 512-filter nets).
    Expected behavior: running normally, as it does with cudnn and --backend-opts=custom_winograd=false.
    Observed behavior: Unhandled exception in worker thread: CUDA error: too many resources requested for launch (C:\lc0\src\neural\cuda\winograd_helper.inc:665). If I manually reduce kOpInpTransformBlockSize, the error becomes: Unhandled exception in worker thread: CUDA error: too many resources requested for launch (C:\lc0\src\neural\cuda\winograd_helper.inc:819).

Lc0 version
Lc0 master (commit 025105e), Windows 10 with CUDA 11.5, compiled binary, cuda backend

Lc0 parameters
With and without: --backend-opts=max_batch=256

Hardware

  • GTX 1080
  • 16 GB
  • i7-4790K

PS D:\Leela\0.28\ONNX> .\lc0_master.exe
       _
|   _ | |
|_ |_ |_| v0.29.0-dev+git.dirty built Mar 2 2022
Detected 4 core(s) and 8 thread(s) in 1 group(s).
Group 0 has 4 core(s) and 8 thread(s).
Found configuration file: D:\Leela\0.28\ONNX/lc0.config
go nodes 1
Loading Syzygy tablebases from Z:/Syzygy/dtz;Z:/Syzygy/wdl
Found 510 WDL, 0 DTM and 510 DTZ tablebase files.
Found pb network file: D:\Leela\0.28\ONNX/512x15-t79_9-swa-256000.pb.gz
Creating backend [cuda]...
CUDA Runtime version: 11.5.0
Latest version of CUDA supported by the driver: 11.6.0
GPU: NVIDIA GeForce GTX 1080
GPU memory: 7.99969 Gb
GPU clock frequency: 1885.5 MHz
GPU compute capability: 6.1
Unhandled exception in worker thread: CUDA error: too many resources requested for launch (C:\lc0\src\neural\cuda\winograd_helper.inc:665)
PS D:\Leela\0.28\ONNX> .\lc0_launchbond.exe
       _
|   _ | |
|_ |_ |_| v0.29.0-dev+git.dirty built Mar 7 2022
Detected 4 core(s) and 8 thread(s) in 1 group(s).
Group 0 has 4 core(s) and 8 thread(s).
Found configuration file: D:\Leela\0.28\ONNX/lc0.config
go nodes 1
Loading Syzygy tablebases from Z:/Syzygy/dtz;Z:/Syzygy/wdl
Found 510 WDL, 0 DTM and 510 DTZ tablebase files.
Found pb network file: D:\Leela\0.28\ONNX/512x15-t79_9-swa-256000.pb.gz
Creating backend [cuda]...
CUDA Runtime version: 11.5.0
Latest version of CUDA supported by the driver: 11.6.0
GPU: NVIDIA GeForce GTX 1080
GPU memory: 7.99969 Gb
GPU clock frequency: 1885.5 MHz
GPU compute capability: 6.1
Unhandled exception in worker thread: CUDA error: too many resources requested for launch (C:\lc0\src\neural\cuda\winograd_helper.inc:819)
PS D:\Leela\0.28\ONNX>
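
For context on the error itself: CUDA reports "too many resources requested for launch" when a kernel's per-block demands (threads, registers, or shared memory) exceed what the device allows for a single block. The sketch below is a host-only illustration of that check, not lc0 code. The limits are the per-block values documented for compute capability 6.1 (the GTX 1080 in the logs above); `BlockLimits`, `LaunchConfig`, and `LaunchFits` are hypothetical names, and the check deliberately ignores register allocation granularity. The real per-thread register count of a kernel would have to be read from `cudaFuncGetAttributes()` or from `nvcc --resource-usage`; none of the numbers here are measured from the winograd kernels.

```cpp
// Per-block limits for compute capability 6.1, as listed in the CUDA C++
// Programming Guide. Assumption: other architectures need other values.
struct BlockLimits {
  int max_threads_per_block = 1024;
  int max_regs_per_thread = 255;
  int max_regs_per_block = 65536;        // 32-bit registers
  int max_smem_bytes_per_block = 48 * 1024;
};

// Hypothetical description of one kernel launch.
struct LaunchConfig {
  int threads_per_block;
  int regs_per_thread;
  int smem_bytes;
};

// Returns true if the launch stays within every per-block limit.
// Simplification: real hardware rounds register allocation up per warp,
// so a launch close to the limit may still fail on the device.
bool LaunchFits(const LaunchConfig& cfg,
                const BlockLimits& lim = BlockLimits()) {
  if (cfg.threads_per_block > lim.max_threads_per_block) return false;
  if (cfg.regs_per_thread > lim.max_regs_per_thread) return false;
  if (cfg.smem_bytes > lim.max_smem_bytes_per_block) return false;
  const long long regs_needed =
      1LL * cfg.threads_per_block * cfg.regs_per_thread;
  return regs_needed <= lim.max_regs_per_block;
}
```

Under these assumptions, a kernel that needs 128 registers per thread fits with 512 threads per block (65,536 registers in total) but not with 1,024. That is why reducing a block-size constant, or capping register use with `__launch_bounds__`, can make a failing launch succeed, and also why fixing one kernel may simply move the error to the next kernel that is over budget.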

borg323 (Member) commented Mar 8, 2022

@ankan-ban can you take a look at this?
