For the 2018.1d release (and onwards) I'd like to focus on reducing the header and executable size, together with improving compile time (and, as a side effect, runtime performance). This was last done in 2013 (see the blog article) and while current workflow enforces enough rules to prevent worsening of this problem, it's not actively improving it either.
The ultimate goal is being able to ship useful utilities as "single-header" libraries without being laughed at for compile times and having compile times competitive with C header-only libs, yet staying in C++. Which, of course, means much better compile times than other C++ projects (json.hpp and Eigen, I'm looking at you).
MSVC STL has a definition with a default template argument directly in <vector>, which rules out any forward declarations
Replace with our own types in all internals and APIs, keep only in STL compatibility headers
Provide (and use) a header wrapping platform-specific forward-declaration headers for std::reference_wrapper because <functional> alone is 22k lines (and grows to 44k in C++17, WTAF!!) -- using our own Containers::Reference instead
<type_traits> on libstdc++, libc++ and MSVC STL as well (there it has a full definition, yay)
Provide (and use) a header wrapping platform-specific forward-declaration headers for std::tuple (<tuple> is >20k lines on libstdc++) -- mosra/corrade@2345195
Replace with Containers::Pair / Containers::Triple / custom structs in all internals and APIs, keep only in STL compatibility headers
Investigate possibility (and viability) of forward declarations for other types
does std::pair have a forward declaration somewhere? though <utility> is just 4k lines -- nope, but we have Containers::Pair now
<array> is also big (~20k) but we're not really using it in public APIs -- yes, mosra/corrade@fd8030d
The Utility/TypeTraits.h header (used by Debug and thus basically everything else) includes a quite heavy <iterator> that's needed only for std::begin() in libc++, it's quite big so include it conditionally only for libc++ -- mosra/corrade@9b258d7
it also includes <utility>, is it needed? decay() is in <type_traits> already -- mosra/corrade@9b258d7
Many headers conditionally include <algorithm> just to have std::min() for scalars on MSVC -- mosra/corrade@1719c57, 563dee0
Hide usage of <map>, <unordered_map> and other huge containers from headers, PIMPL these (no, I'm not going to implement them myself just yet) -- it's just Interconnect and Text libraries left, which need a significant cleanup anyway
Investigate opting into __make_integer_seq compiler builtins instead of our own GenerateSequence, might have a significant effect on build times especially with long vectors, large matrices or constexpr constructors of the new MaterialData (Material Data rework #459) -- https://reviews.llvm.org/rL252036 -- converted to an O(log n) implementation in mosra/corrade@0b82814, which should be good enough
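For illustration, a minimal sketch of how an index sequence can be generated with logarithmic recursion depth instead of one instantiation per element. The names (Sequence, Concat, Generate, sequenceSize) are made up for this example and this is not necessarily how GenerateSequence in mosra/corrade@0b82814 is written; it only shows the general divide-and-conquer idea.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical names for illustration, not Corrade's actual implementation
template<std::size_t ...> struct Sequence {};

// Concatenates two sequences, offsetting the second one by the size of the
// first
template<class, class> struct Concat;
template<std::size_t ...a, std::size_t ...b>
struct Concat<Sequence<a...>, Sequence<b...>> {
    typedef Sequence<a..., (sizeof...(a) + b)...> Type;
};

// Generates Sequence<0, 1, ..., n - 1> by splitting n in two halves, so the
// recursion depth grows with log(n) instead of n
template<std::size_t n> struct Generate {
    typedef typename Concat<
        typename Generate<n/2>::Type,
        typename Generate<n - n/2>::Type>::Type Type;
};
template<> struct Generate<0> { typedef Sequence<> Type; };
template<> struct Generate<1> { typedef Sequence<0> Type; };

// Helper that only reports how many indices a sequence holds
template<std::size_t ...i>
constexpr std::size_t sequenceSize(Sequence<i...>) { return sizeof...(i); }

static_assert(std::is_same<Generate<5>::Type, Sequence<0, 1, 2, 3, 4>>::value,
    "unexpected sequence");
```

A compiler builtin such as __make_integer_seq would skip the template recursion altogether; the halving approach above is the portable middle ground.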
Put all separate GL function symbols into a big struct to reduce amount of exported symbols (basically the same way as Vulkan does it now) -- b580458
Investigate gains of -Oz with Emscripten, switch the toolchains to that, fix Emscripten closure compiler (Emscripten closure compiler breaks WebGL 2 calls #211) and investigate how much can it shave off the JS code (around 200 kB?) -- mentioned in the docs as of df6b414, not enabling by default since it has compile time impact
Some preprocessor hook for Utility::Resource that's able to strip license headers off shader sources
needs some MinifierShaderConverter
Reduce repetitive strings in debug output literals for enums (print the prefix just once) -- in progress, 98232f3
Make it possible to compile for Emscripten with -s FILESYSTEM=0 (compile away parts of Utility::Directory, GL::Shader::addFile(), non-callback-based Trade::AbstractImporter::openFile() etc.)
Reimplement printf-based Utility::format() without printf (float conversion with Ryū, integer conversion using "the fastest ever integer conversion" as claimed by the author of fmt) -- float32 tables in Ryū are 624 B and even float64 tables in Ryū are just 10 kB and with Utility::format() if we don't print doubles, the tables won't even get compiled in
Remove all uses of printf() -- a naive copypasted implementation using grisu3 was just 25 kB, shaving > 10 kB compared to printf (and being much faster)
Ensure dependencies (plugins) that matter for WASM don't use it (would be hard to ensure for tinygltf, ugh)
It gets used by libc++'s abort_handler(), patch emscripten to not do that
This all needs a blog post (compare to competing implementations)
Create a direct EmscriptenApplication instead of using Sdl2Application -- should trim down at least the generated *.js file size (the library_sdl.js is 137 kB (though unminified)) Emscripten application #300
Port away from tiny_gltf and json.hpp (json.hpp alone is a 400 kB header and the recent versions are almost 700 kB) -- there's a dependency-less GltfImporter since mosra/magnum-plugins@b7c4c58
just the TinyGltfImporter plugin compilation alone takes around 15 seconds -- for a single file -- which is more than all other plugins combined
I bet it has some effect on WASM output size as well, just don't know how much
Remove hard dependency on GL from the Text library by creating an abstract API-independent base for glyph cache -- 834c5fe
That'll allow the plugins to be built and tested without needing to take care of GL/GLES/WebGL differences -- mosra/magnum-plugins@e6f8792
Make it possible to fully disable the debug output (and then define CORRADE_ASSERT to the C assert) -- needed for the single-header libs, done in mosra/corrade@64c56aa and cee5307
Similarly for configuration value and tweakable literal parsers -- 64bc7f9 and 77a8c0c
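As a rough sketch of the "define the assert to the C assert" idea above: a header can pick between a message-printing assertion and plain assert() based on a single opt-in define, so a single-header build never pulls in the debug output. The macro names SKETCH_ASSERT and SKETCH_STANDARD_ASSERT and the elementAt() function are hypothetical, chosen only for this example; they are not the actual definitions from the commits referenced above.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical macros for illustration, not Corrade's actual definitions.
// Defining SKETCH_STANDARD_ASSERT before including the header drops the
// dependency on any message-printing machinery.
#ifdef SKETCH_STANDARD_ASSERT
#define SKETCH_ASSERT(condition, message) assert(condition)
#else
// The message has to be a string literal so it can be joined with "\n"
#define SKETCH_ASSERT(condition, message)                                   \
    do {                                                                    \
        if(!(condition)) {                                                  \
            std::fputs(message "\n", stderr);                               \
            std::abort();                                                   \
        }                                                                   \
    } while(false)
#endif

// Example use in a library function
inline int elementAt(const int* data, int size, int i) {
    SKETCH_ASSERT(i >= 0 && i < size, "elementAt(): index out of range");
    return data[i];
}
```

With the define set, the message string is never referenced, so it doesn't end up in the binary either.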
A separate repository for all dependencies we currently build manually in every CI job (ANGLE, SDL for WinRT, Bullet, GLFW...) -- https://github.com/mosra/magnum-ci
download the binaries from ci.magnum.graphics
some token authentication so it's not publicly accessible (just restricting to Travis/AppVeyor IPs is not enough, as we want to prevent mainly CI users from abusing the server) -- turned out to not be a problem in practice
Build the ES2/ES3 variants without code that's not API-dependent -- a lot of it still is though including e.g. SceneGraph doc snippets, so it doesn't make that much of a difference
and remove ES2/ES3 jobs from plugins once all plugin interfaces will be API-independent -- mosra/magnum-plugins@e6f8792
Long-term
Opt-in resizing APIs for Containers::Array so we can ditch std::vector as a growable storage, eliminating it from headers completely -- mosra/corrade@3cf41e3 and following
compare with other 3rd party implementations (stb stretchy buffer, folly fbvector), benchmarking with various amounts of appended data
A string view class so we can get rid of all the const char* / const char(&)[n] / const std::string& overloads everywhere, again eliminating std::string from headers completely -- mosra/corrade@72f652d
Gradually convert all other APIs to use it (introduces a lot of backwards-incompatible changes, do it gradually)
Drop all compatibility StringStl.h includes once enough time passes
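To illustrate why a string view collapses the overload sets mentioned above, here's a minimal sketch. StringViewSketch and countSpaces() are hypothetical names for this example only and do not reflect the actual class added in mosra/corrade@72f652d; a conversion from std::string is assumed to live in a separate compatibility header and would go through the pointer + size constructor.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical minimal view type for illustration, not Corrade's actual
// string view class
class StringViewSketch {
    public:
        // Implicit conversion from a null-terminated C string, which also
        // covers string literals
        StringViewSketch(const char* data):
            _data{data}, _size{data ? std::strlen(data) : 0} {}

        // Pointer + size, which is what a std::string conversion in a
        // separate compatibility header could use
        constexpr StringViewSketch(const char* data, std::size_t size):
            _data{data}, _size{size} {}

        constexpr const char* data() const { return _data; }
        constexpr std::size_t size() const { return _size; }

    private:
        const char* _data;
        std::size_t _size;
};

// One overload instead of const char*, const char(&)[n] and
// const std::string& variants, and no <string> include in the header
inline std::size_t countSpaces(StringViewSketch view) {
    std::size_t count = 0;
    for(std::size_t i = 0; i != view.size(); ++i)
        if(view.data()[i] == ' ') ++count;
    return count;
}
```

The view doesn't own the data, so the caller has to keep the original string alive for as long as the view is used.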
Further work
Investigate compiling with a lighter-weight STL implementation (e.g. nanostl, EASTL?) -- most of them have no type_traits, we need type_traits -- not happening, at this point it's easier to just ditch the remaining use of STL altogether and rely on compiler-specific builtins
Can C++20(?) modules help in any way with compile times? So far I didn't see any experiment that would prove a breakthrough in compile times -- probably not
The Utility/Debug.h headers will be still quite heavy after forward-declaring strings and removing <iterator> and since these get used almost everywhere, what to do? -- no it's not, it's fine since mosra/corrade@89da382 (just <utility> and <type_traits>)
Compile time improvements
#pragma once in all headers, as it leads to measurable compile time improvement
CORRADE_TARGET_LIBCXX / CORRADE_TARGET_LIBSTDCXX macro that tells me whether libc++ or libstdc++ is used (needed by the things below) -- detection using <ciso646> and _LIBCPP_VERSION, https://stackoverflow.com/questions/31657499/how-to-detect-stdlib-libc-in-the-preprocessor, there needs to be an exception for GCC < 6.1: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65473 (and a TODO to use <version> in C++20) -- mosra/corrade@b6b37fb
std::string -- mosra/corrade@2345195
<xstring>, ruling out any forward declaration
std::vector -- mosra/corrade@2345195
anything in libstdc++? nope
MSVC STL has a definition with a default template argument directly in <vector>, which rules out any forward declarations
Provide (and use) a header wrapping platform-specific forward-declaration headers for std::reference_wrapper because <functional> alone is 22k lines (and grows to 44k in C++17, WTAF!!) -- using our own Containers::Reference instead
<type_traits> on libstdc++, libc++ and MSVC STL as well (there it has a full definition, yay)
Provide (and use) a header wrapping platform-specific forward-declaration headers for std::tuple (<tuple> is >20k lines on libstdc++) -- mosra/corrade@2345195
<type_traits> on libstdc++
<utility> on MSVC STL, defined next to std::pair
Replace with Containers::Pair / Containers::Triple / custom structs in all internals and APIs, keep only in STL compatibility headers
does std::pair have a forward declaration somewhere? though <utility> is just 4k lines -- nope, but we have Containers::Pair now
<array> is also big (~20k) but we're not really using it in public APIs -- yes, mosra/corrade@fd8030d
The Utility/TypeTraits.h header (used by Debug and thus basically everything else) includes a quite heavy <iterator> that's needed only for std::begin() in libc++, it's quite big so include it conditionally only for libc++ -- mosra/corrade@9b258d7
it also includes <utility>, is it needed? decay() is in <type_traits> already -- mosra/corrade@9b258d7
Many headers conditionally include <algorithm> just to have std::min() for scalars on MSVC -- mosra/corrade@1719c57, 563dee0
Hide usage of <map>, <unordered_map> and other huge containers from headers, PIMPL these (no, I'm not going to implement them myself just yet) -- it's just Interconnect and Text libraries left, which need a significant cleanup anyway
<algorithm> there -- mosra/corrade@1719c57
std::unique_ptr and std::reference_wrapper -- mosra/corrade@4e71957 and mosra/corrade@a874478
<cmath> in C++17: https://twitter.com/czmosra/status/1085993965529255936 -- mosra/corrade@115b56e
Look into https://github.com/RPGillespie6/fastcov to have faster coverage builds -- no longer needed, lcov 2.0 is fast enough
Investigate opting into __make_integer_seq compiler builtins instead of our own GenerateSequence, might have a significant effect on build times especially with long vectors, large matrices or constexpr constructors of the new MaterialData (Material Data rework #459) -- https://reviews.llvm.org/rL252036 -- converted to an O(log n) implementation in mosra/corrade@0b82814, which should be good enough
Executable size reduction / perf improvements (mainly WebAssembly-focused)
-Wglobal-constructors -Wexit-time-destructors on Clang
GL::defaultFramebuffer, GL::Context -- b05c887
Investigate gains of -Oz with Emscripten, switch the toolchains to that, fix Emscripten closure compiler (Emscripten closure compiler breaks WebGL 2 calls #211) and investigate how much can it shave off the JS code (around 200 kB?) -- mentioned in the docs as of df6b414, not enabling by default since it has compile time impact
Some preprocessor hook for Utility::Resource that's able to strip license headers off shader sources
needs some MinifierShaderConverter
std::sort() replaced with a counting sort, saving ~8 kB -- e96996e
std::unordered_map replaced with hand-sorted compile-time array, saving ~8 kB -- 54c42df, e2621fa
Make it possible to compile for Emscripten with -s FILESYSTEM=0 (compile away parts of Utility::Directory, GL::Shader::addFile(), non-callback-based Trade::AbstractImporter::openFile() etc.)
Bigger tasks
std::cout adds 250 kB to JS + WASM size) -- depends on string/stringview classes
Utility::format() -- app using std::printf() has only 40 kB compared to that
Utility::Debug class
Utility::Directory -- mosra/corrade@c1a5eed
Reimplement printf-based Utility::format() without printf (float conversion with Ryū, integer conversion using "the fastest ever integer conversion" as claimed by the author of fmt) -- float32 tables in Ryū are 624 B and even float64 tables in Ryū are just 10 kB and with Utility::format() if we don't print doubles, the tables won't even get compiled in
Remove all uses of printf() -- a naive copypasted implementation using grisu3 was just 25 kB, shaving > 10 kB compared to printf (and being much faster)
It gets used by libc++'s abort_handler(), patch emscripten to not do that
Create a direct EmscriptenApplication instead of using Sdl2Application -- should trim down at least the generated *.js file size (the library_sdl.js is 137 kB (though unminified)) Emscripten application #300
tiny_gltf might be going away from json.hpp on its own (Use rapidjson as a JSON library syoyo/tinygltf#141)
just the TinyGltfImporter plugin compilation alone takes around 15 seconds -- for a single file -- which is more than all other plugins combined
I bet it has some effect on WASM output size as well, just don't know how much
Remove hard dependency on GL from the Text library by creating an abstract API-independent base for glyph cache -- 834c5fe
Make it possible to fully disable the debug output (and then define CORRADE_ASSERT to the C assert) -- needed for the single-header libs, done in mosra/corrade@64c56aa and cee5307
#include <DebugStl.h> to be able to print STL types
std::pair printing to DebugStl.h as well, once it's gone from most headers
CI speedup
download the binaries from ci.magnum.graphics
some token authentication so it's not publicly accessible (just restricting to Travis/AppVeyor IPs is not enough, as we want to prevent mainly CI users from abusing the server) -- turned out to not be a problem in practice
Long-term
Opt-in resizing APIs for Containers::Array so we can ditch std::vector as a growable storage, eliminating it from headers completely -- mosra/corrade@3cf41e3 and following
arrayRemoveUnordered() as well -- mosra/corrade@c9089f7
A string view class so we can get rid of all the const char* / const char(&)[n] / const std::string& overloads everywhere, again eliminating std::string from headers completely -- mosra/corrade@72f652d
String class as well, with small string optimization (https://wg21.link/P1330, chapter 4) and easily convertible from/to the string view -- mosra/corrade@64f5836
Utility::String to use it -- in progress
Utility::format() work with it -- mosra/corrade@ea9f217
formatString() to FormatStl.h -- mosra/corrade@e941c84
Drop all compatibility StringStl.h includes once enough time passes
Further work
Investigate compiling with a lighter-weight STL implementation (e.g. nanostl, EASTL?) -- most of them have no type_traits, we need type_traits -- not happening, at this point it's easier to just ditch the remaining use of STL altogether and rely on compiler-specific builtins
Can C++20(?) modules help in any way with compile times? So far I didn't see any experiment that would prove a breakthrough in compile times -- probably not
The Utility/Debug.h headers will be still quite heavy after forward-declaring strings and removing <iterator> and since these get used almost everywhere, what to do? -- no it's not, it's fine since mosra/corrade@89da382 (just <utility> and <type_traits>)
Further read / references: