Refactor handling of embedded `magic.mgc`. #4989
base: dev
It was rewritten from C++ to CMake, and compression is now done with CMake's commands. This allows us to pack the upstream `magic.mgc` file and stop keeping a pre-compressed and pre-escaped one in source control. The size of the uncompressed file used to be kept at the start of the binary file. CMake cannot easily modify binary files, so the script now generates a complete header, alongside a constexpr variable with the uncompressed size.
It was also simplified a bit, and `gzip_wrappers.cc` is now unused and was removed.
Compressed size dropped from 333067 to 270578 bytes. Changes to the gzip compressor were reverted. The script was also renamed and slightly updated.
Higher levels require CMake 3.26+.
…thout needing a pool.
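The move from a size prefix inside the binary to a generated `constexpr` can be illustrated with a small consumer-side sketch. Everything here is an assumption made for illustration: the namespace, the variable names, the placeholder bytes, and the 64-bit little-endian width of the old prefix are hypothetical and not taken from the PR. Actual decompression is left out on purpose.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the generated header: the compressed payload plus
// a constexpr holding the uncompressed size. Bytes and size are placeholders.
constexpr unsigned char embedded_bytes[] = {0x01, 0x02, 0x03, 0x04};
constexpr std::size_t embedded_decompressed_size = 1024;

// Old layout (assumed here to be a little-endian 64-bit prefix): the size must
// be parsed out of the blob at run time before the payload can be used.
inline std::uint64_t size_from_prefix(const unsigned char* blob, std::size_t len) {
  if (len < 8) {
    throw std::invalid_argument("blob too short for a size prefix");
  }
  std::uint64_t size = 0;
  for (int i = 7; i >= 0; --i) {
    size = (size << 8) | blob[i];
  }
  return size;
}

// New layout: the whole array is payload, and the output buffer can be sized
// from the compile-time constant without inspecting the data.
inline std::size_t payload_size() {
  return sizeof(embedded_bytes);
}

inline std::vector<unsigned char> make_output_buffer() {
  return std::vector<unsigned char>(embedded_decompressed_size);
}

}  // namespace sketch
```

With a constant like this, a loader could allocate the output buffer up front and pass the full array to the decompressor, instead of first skipping over and decoding a prefix.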
add_custom_command(
  OUTPUT "${MGC_GZIPPED_H_OUTPUT_FILE}"
  DEPENDS "${libmagic_DICTIONARY}" "${PROJECT_SOURCE_DIR}/scripts/generate_embedded_data_header.cmake"
- DEPENDS "${libmagic_DICTIONARY}" "${PROJECT_SOURCE_DIR}/scripts/generate_embedded_data_header.cmake"
+ DEPENDS "${libmagic_DICTIONARY}"
The Docker build apparently does not like the dependency on `generate_embedded_data_header.cmake`. I tried to reproduce it locally on WSL (with similar versions of CMake and make) but failed. I can just remove it, but `magic_mgc.zst.h_` will not be regenerated in the (rare) case the script changes.
I managed to reproduce this locally on Docker, and spent lots of time trying to figure it out without success. I even `docker cp`ed the generated build tree and compared the makefiles to those on my plain WSL environment (it builds there for some reason), but could not find anything suspicious. 😕
SC-25167
SC-47655
SC-47656
SC-47657
SC-47658
This PR overhauls the facilities to embed and load the `magic.mgc` file that is needed by libmagic:

The most important change is the removal of `magic_mgc_gzipped.bin.tar.bz2`. This file contained a copy of `magic.mgc` that was compressed, converted to escaped C characters, packed and compressed again to take less space, and stored in source control, to be unpacked at build time and `#include`d in `mgc_dict.cc`. Because this file was prepared ahead of time by a manually invoked C++ program, this approach tied the Core to a specific version of libmagic. This became evident in "Update libmagic to version 5.45" (#4673), where just updating libmagic was not enough; we also had to update `magic_mgc_gzipped.bin.tar.bz2`.

What we do now is rely on CMake to find `magic.mgc` and perform its entire preparation at build time. The C++ program was rewritten as a CMake script, which makes it much simpler and enables it to run in cross-compilation scenarios. The script accepts the uncompressed `magic.mgc` file, compresses it and produces a header file of the following format:

The algorithm to compress `magic.mgc` was changed from gzip to zstd, resulting in a 17.9% reduction of the compressed size (from 333067 to 273500 bytes).

Tests for `mgc_dict` were also updated to use Catch2, and were wired to run along with the other standalone unit tests.

`mgc_dict`, which was done as well.

Validated by successfully running `unit_mgc_dict` locally.

TYPE: BUILD
DESC: Improve embedding of `magic.mgc` and allow compiling with any libmagic version.