Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that macro code doesn't leak into released applications #1892

Closed
jakemac53 opened this issue Oct 8, 2021 · 9 comments
Closed

Ensure that macro code doesn't leak into released applications #1892

jakemac53 opened this issue Oct 8, 2021 · 9 comments
Labels
static-metaprogramming Issues related to static metaprogramming

Comments

@jakemac53
Copy link
Contributor

jakemac53 commented Oct 8, 2021

With the macro proposal as it stands today application code ends up with transitive imports to the macro libraries and all their supporting code - this is because the macro classes themselves are what those users use to apply macros.

Compare that to a solution like build_runner where the builder code is decoupled from the annotations that cause it to be applied - this means application code doesn't end up with analyzer or build dependencies, etc.

One approach to this problem is to just rely on tree shaking, but I don't believe that is a sufficient solution. At a minimum it will involve significant additional code bloat for debug builds, which will degrade the developer experience.

#1831 was one attempt to solve this issue that we decided against, due to complexity as well as a loss of testability.

Possibly we should decouple the macro implementation from the annotation you use to apply it, similar to how build_runner works?

@jakemac53 jakemac53 added the static-metaprogramming Issues related to static metaprogramming label Oct 8, 2021
@munificent
Copy link
Member

munificent commented Oct 8, 2021

We already require that macro definitions be in not just a separate library but a separate library cycle from any place where the macro is used. That means that every library the macro definition imports is also not part of any cycle going back to places where the macro is applied.

I think that means that, unless the user does actually use code from the macro definition at runtime, it should be pretty easy for any implementation to tree-shake out the entire macro definition library and its dependencies. If that's not the case, I'd want to know that before we add complexity to the feature to solve it. If it's not actually a problem, I'd rather keep things simple.

If it does turn out to be a problem, I still think that "compile time" should be a property of the import and not the library. In other words, I like the idea (that @kevmoo initially suggested in #1360) of something like:

const import 'some_macro_lib.dart';

The const modifier here (or some other syntax) means "report an error if any identifier imported from this library is used outside of metadata annotations". That would ensure that if a user expects to have that library tree-shaken from their app (which they indicated by using const import) then the compiler will tell them if it can't be.

@lrhn
Copy link
Member

lrhn commented Oct 11, 2021

Macros are special in that they are run during compilation, at a time when other code is still just syntax, and possibly not yet valid Dart.

Macros are like a separate program - they need to be runnable by themselves, which does mean that their code must be complete and valid Dart at the time they are run. The part of macros that is embedded in the library being macro-modified, the annotation as we are currently doing it, needs to be compiled and run before the library it's part of, so we're really talking multi-stage programming/a multi-level language. Which means we likely do need syntactic markers to distinguish the separate stages, to distinguish the code being run from the code that is data for the current stage.

A const import (or macro import or whatever) is one way to do that, but I also think we need to mark the annotation.
(If we allow macros to introduce new macro annotations, which I believe we do, which then needs to run in a later stage, we may need a const const import and @const const macroAnnotation() in order to generate a const import and @const otherMacroAnnotation() and have that run at a later stage. Or maybe just generating a new const import and @macroAnnotation after the first one is sufficient, since the original one has been removed when it's executed, and any remaining multi-stage code is then necessarily run in a later stage.)

So, I recommend having macro-specific syntax for both import and annotation. Running the macro should remove both and retain only the remaining program syntax, plus anything added by the macro. If that introduces more macros, we run another macro expansion step, which can see the code generated so far.

Effectively, a macro-annotated program is a template, it contains program syntax and macro syntax. Compiling and executing the macro syntax with the program syntax as data will generate a new program. That program may contain more macros, which will then need to be run again, until we get just a plain program syntax with no macros. This is not higher-order macros, just curried code generation, which at least keeps the complexity level down.

@jakemac53
Copy link
Contributor Author

jakemac53 commented Oct 11, 2021

If it does turn out to be a problem, I still think that "compile time" should be a property of the import and not the library. In other words, I like the idea (that @kevmoo initially suggested in #1360) of something like:

const import 'some_macro_lib.dart';

That was my first intuition for how to deal with this, and I think we could make that approach work. But I liked controlling this at the library level instead for a few reasons:

We don't have to introduce a new kind of import for people to learn

This isn't a huge deal, I don't think it adds a ton of complexity for users.

You can't accidentally use a normal import and bloat the app

I think we could mostly resolve by only allowing macro applications to invoke constructors from const imports.

Macros and other code could be safely exposed through a single import.

I do think this was a really nice part about the macro library approach. People will end up needing to have 2 imports for any library that has both runtime code and macros with this proposal.


But it does come with the downside of making testing more difficult. I think for users of macros the library approach is better, and for authors of macros the import approach is better.

@jakemac53
Copy link
Contributor Author

Macros are like a separate program - they need to be runnable by themselves, which does mean that their code must be complete and valid Dart at the time they are run.

This is also part of why I like making macro libraries an explicit thing. It gives you a quick/easy way to correctly identify the libraries that define macros, which only requires parsing the library. This allows a compiler to smartly break up the program into an optimal number of chunks (which become separate apps effectively) early in the process. Specifically it can group macros that don't depend on each other at all together and create a single macro expansion program for each of those groups.

Which means we likely do need syntactic markers to distinguish the separate stages, to distinguish the code being run from the code that is data for the current stage.

The current design, with the 3 phase approach, does achieve this without any special syntax.

If that introduces more macros, we run another macro expansion step, which can see the code generated so far.

I understand the appeal of this iterative approach, but macros don't compose together as well as they do with the 3 phase approach. If two separate macros that don't know about each other are used (explicitly by the user), and one adds a type that the other needs to be able to see, then the explicit 3 phase approach can deal with that while the iterative approach can't.

@lrhn
Copy link
Member

lrhn commented Oct 11, 2021

Macros and other code could be safely exposed through a single import.

I do think this was a really nice part about the macro library approach. People will end up needing to have 2 imports for any > library that has both runtime code and macros with this proposal.

If the macro can inject imports into the modified library (which it probably needs to be able to, if the code it generates can depend on anything), then that should be solvable by just having the macro inject whichever libraries are needed at run-time.

I'm not sold on making macro libraries be special libraries. They contain Dart code, which runs like normal Dart code, possibly triggered by an annotation instead of calling main (but that's not much different from what Isolate.spawn does). If you split your macro code into multiple libraries, the "macro library" will probably be the one exporting the rest ... but it doesn't contain any specific code itself, so there is nothing macro-special about it. (Definitely do not want to make macro-libraries only depend on other macro-libraries).

Importing a library as a macro (first-stage code) allows it to be evaluated (to a constant value) before the surrounding (later-stage) code is even valid. All that requires is that the imported library is complete and compileable.
If macro libraries are not special, you can test them like normal code. If they are somehow special and can only be imported as macros, testing becomes much harder (unless you just test the sub-libraries directly, and then there is no point in having a special macro library).

@jakemac53
Copy link
Contributor Author

If the macro can inject imports into the modified library (which it probably needs to be able to, if the code it generates can depend on anything), then that should be solvable by just having the macro inject whichever libraries are needed at run-time.

We don't have an api for macros to directly inject imports - they can only emit resolved symbols (which will probably be translated into a prefixed import and prefixed reference when visualized to the user). In general that feature seems like it would be likely to cause difficult to resolve/understand conflicts between imported symbols - especially if you could import them without a prefix.

We seem to be generally trending more towards macros compiling to library augmentation files which would have their own imports/scope - and not be able to affect the scope in the macro application library. So if we did add an api to enable you to explicitly add an import, it would only be into the library augmentation file.

Importing a library as a macro (first-stage code) allows it to be evaluated (to a constant value) before the surrounding (later-stage) code is even valid.

Yes in general I agree some way of marking either the import or library itself is a good thing.

If you split your macro code into multiple libraries, the "macro library" will probably be the one exporting the rest ... but it doesn't contain any specific code itself, so there is nothing macro-special about it.

Can you clarify what you mean by this?

I my macro library proposal I allowed you to create a normal library which exports both macro libraries and normal libraries, not sure if this is related.

(Definitely do not want to make macro-libraries only depend on other macro-libraries).

I agree macro libraries should be able to import any kind of library, and possibly export them too but that might be dubious (especially with this proposal, it would likely be a footgun).

If macro libraries are not special, you can test them like normal code. If they are somehow special and can only be imported as macros, testing becomes much harder (unless you just test the sub-libraries directly, and then there is no point in having a special macro library).

Yes I agree they would have to be tested differently and that presents a challenge, but I think they would still be unit testable. The test would have to run as a macro itself so they would look a little different than normal tests, and would fail at compile time and not runtime, but all the functionality could still be there.

@lrhn
Copy link
Member

lrhn commented Oct 12, 2021

Can you clarify what you mean by this?

I my macro library proposal I allowed you to create a normal library which exports both macro libraries and normal libraries, not sure if this is related.

Macros code is plain Dart code, so you'd likely be using a lot of helper libraries and support classes. Not all of that is uniquely macro code. So, like most other packages, you're likely to have dozens of library files in src/ of your package, and maybe export a bunch of them through a top-level library.

You say that a normal library can export a macro library (without becoming one). That worries me in two ways. 

First, you don't export libraries, you export declarations, so if the designation of being a "macro library" is really on the library, then it's not clear what exporting declarations from that library does with the "being a macro library" part.

Also, I was assuming that the top-level library which is exporting the macro definitions would need to be a macro library itself. If importing a macro library is special, in that use them as macro annotations in otherwise incomplete libraries, then a "normal" library containing a macro definition seems like it shouldn't work. (That's why I assumed the top-level library would be the macro library, and it would just be exporting normal libraries as macros).

Is the underlying idea here to have multi-stage imports? Where you have one import, which imports both macro declarations to be used in "stage one" (running macros), and further declarations to be used in "stage two" (compiling after running the macros and removing the stage-one code)?
That does make sense, and with some kind of marker on the export, it might just work.

Consider a "macro library" like:

export "src/macro_decl.dart"late export "src/helper.dart";

which you can then const import "package:macro/macro.dart"; and get the macro available, and also immediately add the helper declarations to library being generated. (And each declaration exported would then remember whether it's available now or later).

So, basically, what does it really mean to be "a macro library"?

I worry that making the distinction between normal and macro libraries too big is not worth the complexity it adds. I'd prefer if being used as a macro is a role, not necessarily an identity.

@munificent
Copy link
Member

Macros are like a separate program - they need to be runnable by themselves, which does mean that their code must be complete and valid Dart at the time they are run.

This is true, but all that's needed to ensure that is to prohibit import cycles. In the absence of a cycle, every Dart library can be completely separately compiled.

@davidmorgan
Copy link
Contributor

This should be resolved with #3728

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
static-metaprogramming Issues related to static metaprogramming
Projects
None yet
Development

No branches or pull requests

4 participants