Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Package Cython-generated *.cpp and *.c files #4701

Closed
cphyc opened this issue Oct 10, 2023 · 5 comments · Fixed by #4905
Closed

Package Cython-generated *.cpp and *.c files #4701

cphyc opened this issue Oct 10, 2023 · 5 comments · Fixed by #4905
Labels
enhancement Making something better question

Comments

@cphyc
Copy link
Member

cphyc commented Oct 10, 2023

At the moment, we are including both *.cpp and *.c files generated by Cython in the wheels and, subsequently, in the locally-installed directory. That means that, for each .so file we compile, we have a corresponding source file.

While this isn't a problem per se, it inflates quite significantly the space occupied by yt. To give some number for yt 4.0.3, we have 51M of .c files, 23M of .cpp files and 13M of .so files once decompressed.

To measure:

find /usr/lib/python3.11/site-packages/yt -type f -name '*.so' | xargs du -ch
find /usr/lib/python3.11/site-packages/yt -type f -name '*.cpp' | xargs du -ch
find /usr/lib/python3.11/site-packages/yt -type f -name '*.c' | xargs du -ch

I dug into this because yt's AUR package for Arch Linux ends up installing more than 100M of data, which surprised me!

@cphyc cphyc added enhancement Making something better question labels Oct 10, 2023
@matthewturk
Copy link
Member

Wowza, that is a lot!

My understanding -- which may in fact be incorrect -- is that a lot of the code in these is identical between files. For instance,

grep -Fxf yt/frontends/gamer/cfields.c yt/frontends/ramses/io_utils.c|wc

returns about 1.1M of data over 32000 lines. I believe that if we had these linked into the same object file, it would not. That's ... probably too much of a lift, though.

Alternately, I think we would be in the clear if we did not package the generated source in the wheels, but retained it in the source tarballs.

@neutrinoceros
Copy link
Member

@cphyc
Copy link
Member Author

cphyc commented Oct 11, 2023

Note that, since installing yt requires Cython and Cython builds are deterministic, I don't really see why we would include the generated source files anymore. The only advantage of doing so would be to be able to compile directly the source files without needing Cython, but (a) we're currently not supporting this and (b) if we were to support it, that would make our setup more complex. Is it worth it?

@neutrinoceros
Copy link
Member

I also don't think it's worth it. I think those C files were a useful resource some time back, before the build system configuration was standardised (pyproject.toml), but it's most likely useless now that we don't support any version of Python that's older than this spec any more.

@matthewturk
Copy link
Member

matthewturk commented Oct 11, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Making something better question
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants