
Feature Request - implement nmap style model loading #1120

Open
michieal opened this issue Oct 20, 2023 · 1 comment

Comments

michieal commented Oct 20, 2023

I didn't see a template for this...

I saw that llama.cpp (the GitHub project) had a PR (at the time I saw it) that used mmap, the system call that lets model files be mapped into the process's address space (like a dedicated swap file). I was wondering whether TorchSharp has something like that, or could work with something written in C# that does.

Mapping the model files' memory directly to the files on the drive lets them load near-instantaneously. It also reduces the required memory by gigabytes.

I'm not 100% sure about all of the finer details, but I do know that the difference between running the standard llama.cpp and the mmap llama.cpp with the same model files was like night and day. It made getting the program up and running take less than a minute, and it didn't slow down the model in a noticeable way.

So, I was wondering if something like this could be implemented, as that would be awesome (and would work cross-platform, too).
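For reference, the memory-mapping behavior described above can be sketched with Python's standard-library `mmap` module; the file name and contents here are illustrative, not part of any real model format. (In .NET, `System.IO.MemoryMappedFiles.MemoryMappedFile` provides the equivalent facility.)

```python
import mmap
import os
import struct

# Write a tiny "model file" of raw float32 weights to disk.
path = "weights.bin"
weights = [0.5, 1.5, 2.5, 3.5]
with open(path, "wb") as f:
    f.write(struct.pack("<4f", *weights))

# Map the file into the process's address space: the OS pages the
# data in on demand instead of copying the whole file into RAM up
# front, which is why startup is near-instantaneous for large files.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        loaded = list(struct.unpack("<4f", mm[:16]))

print(loaded)  # the same weights, read through the mapping
os.remove(path)
```

Because the mapping is shared with the page cache, several processes mapping the same model file can reuse one physical copy of the data.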

NiklasGustafsson (Contributor) commented Oct 20, 2023

I believe that torch.from_file already does this for tensors, but not for modules. In other words, the building blocks already exist, and we would use that instead of loading a state dict. It will require some mulling over in order to get it right, I think.
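The PyTorch building block mentioned above can be sketched as follows; the file name and values are illustrative, and a real loader would still need to carve individual weight tensors out of the mapped storage.

```python
import struct

import torch

# Persist four float32 values as raw bytes (a stand-in for a weight file).
path = "weights.bin"
with open(path, "wb") as f:
    f.write(struct.pack("<4f", 1.0, 2.0, 3.0, 4.0))

# Map the file as a tensor: with shared=True the tensor's storage is
# backed by the file itself rather than a private in-memory copy.
t = torch.from_file(path, shared=True, size=4, dtype=torch.float32)
print(t)
```

TorchSharp exposes the same libtorch functionality, so a module-level loader could in principle build its parameters over such mapped storages instead of materializing a full state dict.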
