How to compile LC0 on Windows 11 ARM, when using a MacBook Pro M1 MAX and Parallels Desktop? #1800

Open
Chess321 opened this issue Nov 29, 2022 · 4 comments

@Chess321

How to compile LC0 on Windows 11 ARM, when using a MacBook Pro M1 MAX and Parallels Desktop?

When will there be a Windows ARM build of the latest dev version?

It looks like I need to create an LC0.exe that can use the Apple MacBook ARM cores, so that I can run LC0 inside ChessBase 17 on Windows 11 ARM.

Can someone try to compile it on Windows 11 ARM or with Apple's Terminal?

Maybe this could help you to get an idea: official-stockfish/Stockfish#4241

@gsobala
Contributor

gsobala commented Nov 29, 2022

Compile natively on macOS and just link to it from the VM using ssh / putty / inbetween.exe. It's quicker.
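
The ssh route can be sketched as a small relay that starts the engine on the macOS host and passes the GUI's input and output straight through. This is only a minimal sketch, assuming the Mac is reachable over ssh with key-based login. The user `me`, the host `mac-host`, and the lc0 path are hypothetical placeholders, and a GUI such as ChessBase would still need an executable it can launch in place of a script.

```python
import subprocess


def build_ssh_command(user, host, engine_path):
    """Return the argv that starts the remote engine over ssh."""
    # -T disables pseudo-terminal allocation so UCI text passes through unmodified.
    return ["ssh", "-T", f"{user}@{host}", engine_path]


def relay(argv):
    """Run the command with this process's stdin/stdout attached to it."""
    # The child inherits stdin/stdout, so whatever the GUI writes reaches the
    # remote engine and the engine's replies come back unchanged.
    return subprocess.call(argv)


# Hypothetical values: replace with your own user, host, and lc0 location.
command = build_ssh_command("me", "mac-host", "/usr/local/bin/lc0")
```

Calling `relay(command)` would then hand the session over to the remote engine until it exits.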

@borg323
Member

borg323 commented Nov 29, 2022

As far as I know, nobody has tested lc0 on windows arm. It is likely there are some assumptions that windows builds are on x64 (or x86), so code changes may be necessary. If you are still interested in trying, please ask in the #help channel of our discord chat http://lc0.org/chat - I'm certainly interested in getting this done.

@Chess321
Author

Chess321 commented Nov 30, 2022

> As far as I know, nobody has tested lc0 on windows arm. It is likely there are some assumptions that windows builds are on x64 (or x86), so code changes may be necessary. If you are still interested in trying, please ask in the #help channel of our discord chat http://lc0.org/chat - I'm certainly interested in getting this done.

@borg323
I tried your lc0.exe, which I saw on discord.
Could you please post it here too for other people?

It's extremely slow:
       _
|   _ | |
|_ |_ |_| v0.30.0-dev+git.dirty built Nov 29 2022
Detected 8 core(s) and 8 thread(s) in 1 group(s).
Group 0 has 8 core(s) and 8 thread(s).
go nodes 100
Found pb network file: \Mac\Home\Desktop\LC0/d0ed346c32fbcc9eb2f0bc7e957d188c8ae428ee3ef7291fd5aa045fc6ef4ded
Creating backend [eigen]...
Using Eigen version 3.3.7
Eigen max batch size is 256.
info depth 1 seldepth 2 time 40669 nodes 3 score cp 13 nps 0 tbhits 0 pv d2d4 g8f6
info depth 1 seldepth 2 time 42339 nodes 4 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6
info depth 1 seldepth 2 time 47458 nodes 4 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6
info depth 1 seldepth 2 time 52541 nodes 4 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6
info depth 1 seldepth 2 time 57562 nodes 4 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6
info depth 2 seldepth 3 time 57894 nodes 7 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3
info depth 2 seldepth 3 time 62949 nodes 8 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3
info depth 2 seldepth 3 time 68015 nodes 8 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3
info depth 2 seldepth 3 time 73049 nodes 8 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3
info depth 2 seldepth 4 time 76831 nodes 11 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5
info depth 2 seldepth 4 time 81897 nodes 17 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5
info depth 2 seldepth 4 time 86908 nodes 17 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5
info depth 2 seldepth 4 time 91924 nodes 17 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5
info depth 2 seldepth 4 time 96949 nodes 17 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5
info depth 2 seldepth 4 time 102065 nodes 17 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5
info depth 2 seldepth 4 time 107213 nodes 17 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5
info depth 3 seldepth 5 time 107463 nodes 22 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 5 time 112530 nodes 22 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 5 time 117654 nodes 26 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 5 time 122657 nodes 31 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 5 time 127658 nodes 31 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 5 time 132671 nodes 39 score cp 15 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 5 time 137704 nodes 39 score cp 15 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 5 time 142766 nodes 39 score cp 15 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 5 time 147782 nodes 39 score cp 15 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 6 time 152077 nodes 47 score cp 15 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 6 time 157171 nodes 56 score cp 15 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 6 time 162267 nodes 64 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 6 time 167281 nodes 64 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 6 time 172358 nodes 64 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 6 time 177373 nodes 64 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 6 time 182393 nodes 64 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 6 time 187502 nodes 64 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 6 time 192577 nodes 64 score cp 14 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 7 time 195249 nodes 75 score cp 15 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 3 seldepth 7 time 200270 nodes 75 score cp 15 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3
info depth 4 seldepth 7 time 204965 nodes 100 score cp 15 nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3 e7e6
bestmove e2e4 ponder c7c6
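
The `nps 0` in every line above does not mean that no nodes were searched. The actual rate can be worked out from the `time` (milliseconds) and `nodes` fields, and it appears to be below one node per second, which presumably shows up as 0 when reported as an integer. A minimal sketch of that arithmetic, using the last info line from the log (the helper names are illustrative, not part of lc0):

```python
def parse_info(line):
    """Extract the integer fields 'time' (ms) and 'nodes' from a UCI info line."""
    tokens = line.split()
    fields = {}
    for key in ("time", "nodes"):
        # Each key is followed by its value in the token stream.
        fields[key] = int(tokens[tokens.index(key) + 1])
    return fields


def nodes_per_second(line):
    """Compute nodes per second as a float from one info line."""
    fields = parse_info(line)
    return fields["nodes"] * 1000.0 / fields["time"]


last = ("info depth 4 seldepth 7 time 204965 nodes 100 score cp 15 "
        "nps 0 tbhits 0 pv e2e4 c7c6 g1f3 d7d5 d2d3 e7e6")
print(round(nodes_per_second(last), 2))  # prints 0.49
```

In other words, the 100 requested nodes took about 205 seconds, roughly one node every two seconds.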

Is eigen using only the CPU?
Is it possible to use only 1-2 CPU cores plus the 32 GPU cores, like the native macOS version does in BanksiaGUI?

How do I run the benchmark?
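
For reference, lc0 has a `benchmark` mode that is started from the command line. The sketch below only assembles such a command and is an assumption about a typical invocation, not a tested recipe for Windows ARM. The executable path and the weights file name are hypothetical placeholders, and `lc0 benchmark --help` should be checked for the options a given build accepts.

```python
import subprocess


def benchmark_command(lc0_path, weights, backend="eigen", threads=2):
    """Build the argv for lc0's benchmark mode."""
    return [
        lc0_path,
        "benchmark",
        f"--weights={weights}",   # network file to load
        f"--backend={backend}",   # eigen is the CPU backend seen in the log
        f"--threads={threads}",   # number of search threads
    ]


def run_benchmark(argv):
    """Run the benchmark and return whatever it wrote to standard output."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical paths: adjust to where lc0.exe and the net actually live.
command = benchmark_command("lc0.exe", "network.pb.gz")
```

Passing `command` to `run_benchmark` would run the benchmark and return its output as text.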

It runs fine with a net (782344) in ChessBase 17.
But it's extremely slow. The speed is the same whether I select 1 or up to 8 CPU cores, and it also makes no difference whether the Buddy engine is on or off.

When I open a new board, in most cases it takes 30 to 40 seconds before the first depth and evaluation are available.
Sometimes I get depth 3 after those 30 seconds, and sometimes I get depth 2 after 60 seconds.

But note that when I also run the Buddy engine, Buddy very quickly reaches a depth between 9 and 29, depending on what it is searching and showing.

@borg323
Member

borg323 commented Nov 30, 2022

There is no point in posting the binary: it will always be very slow because it only uses the CPU. You can make it a bit faster with the correct settings, but this was only meant as a proof of concept, and now we know it works. It may be possible to use the GPU with OpenCL, but I can't take this further without access to the hardware. Moreover, the OpenCL backend does not support the latest nets, so the solution to your issue is really the one outlined in #1800 (comment) and detailed in discord.
