[eta] Use Eta ADTs from the Java side #690

Open
NickSeagull opened this issue Mar 5, 2018 · 15 comments

@NickSeagull
Collaborator

I'm currently building a simple library to experiment with events. The idea is to write the library in Eta while allowing users to call it from Java.

A simplified version of the library could be something like this

data Event
    = ItemArrived String
    | ItemLeft String

eventStream :: Chan Event
eventStream = unsafePerformIO newChan
{-# NOINLINE eventStream #-}

emit :: Event -> IO ()
emit = writeChan eventStream

lastEvent :: IO Event
lastEvent = readChan eventStream

NOTE: Chan is used here just as an example, I'll use either Kafka, RabbitMQ or others.

The idea would be that a user of this library could use it from the Java side in a form like this for emitting events

Stream.emit(Event.itemArrived("foo"));

And in another place, someone could subscribe to it, for example, by implementing an interface.

public class Consumer implements EtaADTConsumer {
    public void eventArrived(Event event) {
        if (event instanceof ItemArrived) {
            ItemArrived e = (ItemArrived) event;
            System.out.println(e.$1);
        } else if (event instanceof ItemLeft) {
            ItemLeft e = (ItemLeft) event;
            System.out.println(e.$1);
        }
    }
}

I don't mind losing type safety as long as I can get the data from the ADT. I can build some helper functions on top of that 😄

@rahulmutt
Member

rahulmutt commented Mar 5, 2018

So I think this becomes easier once you look at the code Eta generates for the declarations above. We should eventually add what I write below to the docs.

Suppose this module occurred in a package named event:

module A.B.C where

data Event
    = ItemArrived String
    | ItemLeft String

generates:

package event.a.b.c.tycons;

import eta.runtime.stg.DataCon;

public abstract class Event extends DataCon {}

package event.a.b.c.datacons;

import eta.runtime.stg.Closure;
import event.a.b.c.tycons.Event;

public class ItemArrived extends Event {
   public Closure x1; //Corresponds to the String
   // In x1, you can store an evaluated list OR you can store a *thunk* which eventually returns a list.
   public ItemArrived(Closure x1) {
      this.x1 = x1;
   }
}

Note that successive fields in an ADT are consistently named xN, where N starts from 1.
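The naming pattern in the example above can be sketched as a small helper. This is only an illustration inferred from the class names shown in this comment (package `event`, module `A.B.C`, and the `ghc_prim.ghc.types` imports further down); `EtaNames` and its methods are hypothetical and are not part of the Eta toolchain, and the real compiler may apply rules that the examples do not reveal.

```java
import java.util.Locale;

// Hypothetical helper, not part of Eta. It only reproduces the naming
// pattern visible in the examples of this thread.
final class EtaNames {
    private EtaNames() {}

    // "ghc-prim" + "GHC.Types" -> "ghc_prim.ghc.types"
    static String modulePrefix(String etaPackage, String module) {
        return etaPackage.replace('-', '_') + "." + module.toLowerCase(Locale.ROOT);
    }

    // Type constructors end up under <prefix>.tycons
    static String typeClass(String etaPackage, String module, String typeName) {
        return modulePrefix(etaPackage, module) + ".tycons." + typeName;
    }

    // Data constructors end up under <prefix>.datacons
    static String dataConClass(String etaPackage, String module, String conName) {
        return modulePrefix(etaPackage, module) + ".datacons." + conName;
    }
}
```

For instance, `EtaNames.typeClass("event", "A.B.C", "Event")` gives `event.a.b.c.tycons.Event`, which matches the generated code above.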

Now the question becomes, how do I construct lists since you need to supply a String = [Char].

data [a] = [] | a : [a]

This is pseudo-syntax for how the list type would be declared. The constructor names go through a process called Z-encoding to get the corresponding Java class names. This declaration occurs in the ghc-prim package, in the module GHC.Types, hence:

import ghc_prim.ghc.types.datacons.ZMZN; // [] = ZMZN w/ Z-encoding
import ghc_prim.ghc.types.datacons.ZC; // : = ZC w/ Z-encoding
import ghc_prim.ghc.Types; // Exposes all the exported methods of GHC.Types in Z-encoded form as static methods
import event.a.b.c.tycons.Event;
import event.a.b.c.datacons.ItemArrived;

ZC string = new ZC(new Czh(48), new ZC(new Czh(49), Types.DZMZN()));
// Corresponds to 48 : (49 : []) = [48, 49] or, using the ASCII table, "01"
Event e = new ItemArrived(string);
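To make names like ZMZN, ZC and Czh less mysterious, here is a minimal sketch of Z-encoding for a handful of symbols. It covers only a subset of the GHC encoding table and special-cases the unit type; `ZEncoding` is a hypothetical helper written for this explanation, not something shipped with Eta.

```java
// Hypothetical helper: a partial Z-encoder covering only the symbols
// needed for the names that appear in this thread.
final class ZEncoding {
    private ZEncoding() {}

    static String encode(String name) {
        if (name.equals("()")) {
            return "Z0T"; // 0-element tuple (unit)
        }
        StringBuilder out = new StringBuilder();
        for (char c : name.toCharArray()) {
            switch (c) {
                case '[': out.append("ZM"); break;
                case ']': out.append("ZN"); break;
                case ':': out.append("ZC"); break;
                case 'Z': out.append("ZZ"); break; // escape the escape characters
                case 'z': out.append("zz"); break;
                case '#': out.append("zh"); break;
                case '+': out.append("zp"); break;
                case '$': out.append("zd"); break;
                case '.': out.append("zi"); break;
                case '_': out.append("zu"); break;
                default:
                    if (Character.isLetterOrDigit(c)) {
                        out.append(c); // plain characters pass through unchanged
                    } else {
                        throw new IllegalArgumentException(
                            "symbol not covered by this sketch: " + c);
                    }
            }
        }
        return out.toString();
    }
}
```

With this, `encode("[]")` yields `ZMZN` and `encode(":")` yields `ZC`, the two class names imported above.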

You may have noticed that I used Types.DZMZN() instead of new ZMZN(). The reason is that a singleton instance is generated for every nullary data constructor and the static method to retrieve that instance can be found in the corresponding module prefixed with a D.
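Building the list by hand gets tedious quickly, so a helper that converts a Java String is the obvious next step. The sketch below uses stand-in classes (`EtaList`, `Nil`, `Cons`, `EtaStrings` are all hypothetical) instead of the real generated ones so that it compiles on its own; the shape is the same, including the shared instance for the nullary constructor.

```java
// Stand-ins for the generated list classes; the real ones are ZMZN and ZC.
abstract class EtaList {}

final class Nil extends EtaList {
    // One shared instance for the nullary constructor, like Types.DZMZN().
    private static final Nil INSTANCE = new Nil();
    private Nil() {}
    static Nil instance() { return INSTANCE; }
}

final class Cons extends EtaList {
    final char head;
    final EtaList tail;
    Cons(char head, EtaList tail) {
        this.head = head;
        this.tail = tail;
    }
}

final class EtaStrings {
    private EtaStrings() {}

    // Builds the list back to front so every Cons points at a finished tail.
    static EtaList fromJava(String s) {
        EtaList list = Nil.instance();
        for (int i = s.length() - 1; i >= 0; i--) {
            list = new Cons(s.charAt(i), list);
        }
        return list;
    }

    // Walks the list until it reaches Nil and collects the characters.
    static String toJava(EtaList list) {
        StringBuilder sb = new StringBuilder();
        while (list instanceof Cons) {
            Cons cell = (Cons) list;
            sb.append(cell.head);
            list = cell.tail;
        }
        return sb.toString();
    }
}
```

Against the real classes the same loop would wrap each character in a Czh and pass `Types.DZMZN()` as the initial tail.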

Now let's say you wanted to call this method directly:

In package event:

module A.B.C where

emit :: Event -> IO ()
emit = writeChan eventStream

Generates:

package event.a.b;

import eta.runtime.stg.Closure;
import eta.runtime.stg.StgContext;

public class C {

  public static Closure emit(StgContext context, Closure x1) {
    // Implementation generated by the Eta compiler
  }
}

You can call this from Java with the Event e defined above as:

import event.a.b.C;

import eta.runtime.Runtime;

Closure result = C.emit(Runtime.getContext(), e);
// In this case, you can ignore the result since it will be of type unit, or an instance of Z0T (Z-encoding for 0-element tuples)

NOTE: Calling Eta functions directly by passing in the context like that will work only for pure functions and functions that don't throw exceptions or do concurrency.
For complicated functions that require Runtime support, you need to run Runtime.eval*() instead, where you build up an expression (thunk) that performs something similar to what you want. The eval* functions set up a scheduling loop to schedule lightweight threads, etc., if your code uses them.

For example, above you would have done:

Closure result = Runtime.evalIO(new Ap2Upd(C.emit(), e));

An Ap2Upd is a generic thunk that applies its first argument to the second one. C.emit() is overloaded - zero-argument version returns the function Closure and the multiple-argument version used in the former example represents a way to statically invoke it. The function Closure is used in higher-order function cases like map, etc. In fact, under the hood, foreign exports are generated like this.
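The idea of a thunk that applies a function to an argument and then updates itself with the result can be sketched in plain Java. `ApThunk` below is a hypothetical stand-in, not the runtime's Ap2Upd: it works on ordinary Java functions rather than Closures and ignores the StgContext entirely.

```java
import java.util.function.Function;

// Stand-in for the idea behind Ap2Upd; not the Eta runtime class.
final class ApThunk<A, R> {
    private Function<A, R> fn;
    private A arg;
    private R result;
    private boolean evaluated;

    ApThunk(Function<A, R> fn, A arg) {
        this.fn = fn;
        this.arg = arg;
    }

    // The first call applies fn to arg; later calls return the stored
    // result, which is the "update" part of an updatable thunk.
    synchronized R force() {
        if (!evaluated) {
            result = fn.apply(arg);
            evaluated = true;
            fn = null;  // drop references that are no longer needed
            arg = null;
        }
        return result;
    }
}
```

Nothing runs when the thunk is constructed; the function is applied only when `force()` is first called, which is roughly the role Runtime.evalIO plays for the real thunk.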

Now, you don't have to guess or know Eta internals to remember all these rules. Compile some Eta code and then decompile it; it becomes much easier to see the generated interface and work accordingly.

Hope that helps! I know that's a lot to keep in mind, hopefully we'll find ways to make this process easier. Or maybe it's just a matter of documenting it all and putting javadocs for the Runtime helper functions.

@NickSeagull
Collaborator Author

NickSeagull commented Mar 5, 2018

Thank you very much @rahulmutt !

Yes, it does look non-trivial.
Off the top of my head, there are two possible workarounds:

  1. Write a tool/library that allows generating Java classes directly if a datatype derives Generic. On one hand, this would make working with datatypes from the foreign side easier, but still, we have the problem of passing them to methods. Maybe this tool could also generate helper methods that wrap over Closure, Runtime.evalIO, Ap2Upd and such? Is it possible to have custom compiler hooks?

  2. Allow exporting datatypes directly from the compiler, maybe using the @EXPORT annotation discussed in [eta] Annotations and ADT foreign export #601, so that they pass through the FFI as a Closure would but appear under another name on the foreign side. I'm not sure whether this is possible, though.

If it's related to #601, I'll probably start working on 2 first, as I'm trying to push Eta for an internal project in my company and it is required that it can be used from Java 😄

@jneira
Collaborator

jneira commented Mar 6, 2018

+1 to the @export annotation for data types, to make imports and exports balanced (because this way you can import and export both methods/functions and data types).
The export could convert the Eta internal classes to the closest possible representation in Java, with nice names.

@rahulmutt
Member

rahulmutt commented Mar 6, 2018

@NickSeagull Just in case you missed the comment in the wall of text I posted: foreign exports do exactly that. They construct the thunk and execute the expression with evalJava. Right now, though, foreign exports don't let you put Eta types in the type signature, so that's one restriction we can relax. Let's explore what the semantics will be below:

Suppose we wanted to export this:

emit :: Event -> IO ()
emit = writeChan eventStream

foreign export java unsafe "@static [some-class].emit"
  emit :: Event -> IO ()

so that the generated signature is:

public static void emit(Event e);

This is completely doable if we assume that in the emit function the Event argument is treated as strict, which is true if you're coming from Java land: you pass in values directly most of the time.

But there will be cases where you'd want to be able to supply deferred values and creating your own Thunk is a bit cumbersome if you directly extend from the internal Runtime classes. For that, we can make a helper LazyValue class that people can extend that contains code to eventually return an Eta value.

emit :: Event -> IO ()
emit = writeChan eventStream

foreign export java unsafe "@static [some-class].emit"
  emit :: Lazy Event -> IO ()

so that the generated signature is:

public static void emit(Closure a);

This is flexible in that it accepts both Event and LazyValue<Event>.

NOTE: We can add generics to the eta internal types to make things more typesafe, I played around with this here. So the signature above could be:

public static void emit(Closure<Event> a);

Notice that Lazy is a special newtype that will be recognized by the foreign export generator. Notice also that the signature of the export differs from the signature of the original function! I'm sure you thought it was redundant to specify the type signature twice, but it's for cases like these, where we want to control the exported signature, that we specify two of them instead of one.

Here's how LazyValue would look:

package eta.runtime.Runtime;

import java.util.concurrent.Callable;

import eta.runtime.stg.Value;

public abstract class LazyValue<A extends Value> extends UpdatableThunk {

  public abstract A call();

  @Override
  public final Closure thunkEnter(StgContext context) {
    return call();
  }
}

So now creating a lazy Event value is as simple as:

emit(new LazyValue<Event>() {
  @Override
  public Event call() {
    // Do some deferred stuff here.
  }
});

We can also make a version of LazyValue that takes in a functional interface like Supplier so that you can supply lambdas for deferred values and make it Java 8-friendly.
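A Supplier-based variant could look roughly like the sketch below. `SupplierLazyValue` is a hypothetical name, and to keep the example compilable without the Eta runtime it does not extend any thunk class; the real version would presumably subclass LazyValue and return the supplied value from call().

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical stand-in; the real class would extend an Eta runtime thunk.
final class SupplierLazyValue<A> {
    private Supplier<? extends A> supplier;
    private A value;

    private SupplierLazyValue(Supplier<? extends A> supplier) {
        this.supplier = Objects.requireNonNull(supplier, "supplier");
    }

    // Factory method so that call sites can pass a lambda directly.
    static <A> SupplierLazyValue<A> of(Supplier<? extends A> supplier) {
        return new SupplierLazyValue<>(supplier);
    }

    // Runs the supplier on first use and remembers the value afterwards.
    synchronized A get() {
        if (supplier != null) {
            value = supplier.get();
            supplier = null; // the deferred computation is no longer needed
        }
        return value;
    }
}
```

A call site would then shrink to something like `emit(SupplierLazyValue.of(() -> computeEvent()))`, where `computeEvent` is a placeholder for whatever deferred work produces the Event.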

All this is relatively easy to implement right now with little change - we just need to reduce the strictness of typechecking for foreign exports to allow any type to be valid in the FFI. The foreign export generator (in ETA.DeSugar.DsForeigns) will need to be tweaked to generate the right signatures, but the core thunk building code will be unchanged.

With the new @EXPORT syntax the examples above would look like:

@Export(class="[some-class]")
emit :: Event -> IO ()
emit = writeChan eventStream

@Export(class="[some-class]")
emit :: @Lazy Event -> IO ()
emit = writeChan eventStream

Note that if we get annotation support, we can avoid the newtype altogether and use a special @Lazy annotation instead.

@rahulmutt
Member

rahulmutt commented Mar 6, 2018

@jneira Let's play around with how that might work.

data Event
    = ItemArrived String
    | ItemLeft String

So now the question is what Java type to give String when exporting, and we need some internal conversion functions for when the exported method is called.

@Export
data Event
   = ItemArrived (@Type(JString) String)
   | ItemLeft    (@Type(CharSequence) String)

This looks a lot nicer with record notation:

@Export(package="[some-package]")
data Event
   = ItemArrived { payload :: @Type(JString) String }
   | ItemLeft    { payload  :: @Type(CharSequence) String }

Where JString and CharSequence are JWTs. Open for suggestions on the syntax. This also means we need to know how to convert JString to String and CharSequence to String. We can have the compiler search for appropriate JavaConverter instances to get the job done. If an instance is not found, compilation error ensues.

OR maybe we just make it so that a type with @Export may only have JWTs (or Eta types that themselves have the @Export annotation) as fields, so that translation becomes easy, and the burden is on the Eta developer to convert that type into the form needed for internal use in the given library. I feel like this solution is a lot simpler for the user and a lot easier to implement.

@Export(package="com.somecompany")
data Event
   = ItemArrived { payload :: JString }
   | ItemLeft    { payload :: JString, payload2 :: CharSequence }

will generate:

package com.somecompany;

public abstract class Event {
  // When we have sum-types + record notation with common record selectors,
  // we define them up in the parent class.
  // Notice our convention of generating Java getters too.
  public abstract String getPayload();

}

public class ItemArrived extends Event {
  String x1;
  public ItemArrived(String x1) {
    this.x1 = x1;
  }

  public String getPayload() {
    return x1;
  }
}

public class ItemLeft extends Event {
  String x1;
  CharSequence x2;
  public ItemLeft(String x1, CharSequence x2) {
    this.x1 = x1;
    this.x2 = x2;
  }

  public String getPayload() {
    return x1;
  }

  public CharSequence getPayload2() {
    return x2;
  }
}

When foreign exporting an Eta type that has an @Export annotation, the exported method expects the generated version as an argument (or returns the generated version) rather than the Eta internal class, and the compiler takes care of converting to the internal type.

For fields of the form Mutable a where Mutable is a newtype that represents a changing value, setters will also be automatically generated.

I didn't mention this but all field values for Exported classes are assumed to be strict.
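From the Java side, consuming such exported classes would then need nothing Eta-specific. The sketch below repeats compact stand-ins for the generated classes so that it compiles on its own; `EventDescriber` is a hypothetical consumer, not something the compiler would generate.

```java
// Compact stand-ins for the generated classes shown above.
abstract class Event {
    abstract String getPayload();
}

final class ItemArrived extends Event {
    private final String x1;
    ItemArrived(String x1) { this.x1 = x1; }
    @Override String getPayload() { return x1; }
}

final class ItemLeft extends Event {
    private final String x1;
    private final CharSequence x2;
    ItemLeft(String x1, CharSequence x2) {
        this.x1 = x1;
        this.x2 = x2;
    }
    @Override String getPayload() { return x1; }
    CharSequence getPayload2() { return x2; }
}

// Hypothetical consumer written against the exported API only.
final class EventDescriber {
    private EventDescriber() {}

    static String describe(Event event) {
        if (event instanceof ItemArrived) {
            // The shared getter lives on the parent class, so no cast is needed.
            return "arrived: " + event.getPayload();
        } else if (event instanceof ItemLeft) {
            ItemLeft left = (ItemLeft) event;
            return "left: " + left.getPayload() + " / " + left.getPayload2();
        }
        throw new IllegalArgumentException("unknown event: " + event);
    }
}
```

This is close to the consumer from the original post, with getters replacing direct field access.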

@NickSeagull
Collaborator Author

Yep completely missed it, thanks again for your help Rahul😁

@NickSeagull NickSeagull changed the title Use Eta ADTs from the Java side [eta] Use Eta ADTs from the Java side Mar 7, 2018
@rahulmutt
Member

@jneira I just realized I didn't address your question about "nice names". Take for example the built-in list type:

data [a] = [] | a : [a]

The Z-encoding is scary: ZMZN for [] and ZC for :. It would be cool if for certain standard types, there was a way to specify what class names would be generated and also control in what package they are used.

@Export(package="eta.util")
data @Name("List") [a]
   = @Name("Nil") []
   | (@Name("Cons") (:)) a [a]

So we would have eta.util.List, eta.util.Nil and eta.util.Cons which are a lot more friendly than ghc_prim.ghc.types.tycons.List, etc. Thoughts? Feedback on syntax?

@jneira
Collaborator

jneira commented Mar 16, 2018

@rahulmutt mmm, I thought those names would be generated automatically from the Haskell names if the type has the @export annotation, at least for types/constructors with alphanumeric names. I had not taken into account types/constructors that use operators, like list (but for user types they are not very common, right?)
I like the @Name syntax, but IMO it should be required for those "special types" and optional for named types/constructors. Would that be possible?
OTOH, maybe the generated classes, fields and methods should be final, no?

@rahulmutt
Member

rahulmutt commented Mar 16, 2018

Yes, so we can enable the @Export and @Name annotations only for symbol-based type constructors/data constructors. @Export is useful for named constructors when you want to control which Java package contains those Eta types when exporting APIs for consumption by other JVM languages.

This same thing is also useful for value-level operators:

@Export { package="eta.util", class="Utils" }
@Name "append"
(++) :: [a] -> [a] -> [a]

@NickSeagull
Collaborator Author

NickSeagull commented Mar 17, 2018

Should we treat these special annotations as keywords? I'm not sure if it would make sense not doing so, as they only make sense in the Eta Realm 🤔

@rahulmutt
Member

@NickSeagull I think these can be actual annotations in say the eta.lang Java package and you have to import them using the standard import java syntax that was discussed in #647. The compiler will do some extra processing for particular annotations.

The problem with keywords becomes - how do I specify that I want to use the Name annotation from some Java framework that I imported and not the special keyword?

@NickSeagull
Collaborator Author

NickSeagull commented Mar 27, 2018

Maybe we can omit the Export annotation (which can be called FFIExport btw) if the top level declaration has already another annotation?

For example

@GetMapping "/user/{id}"
@Export { package="eta.util", class="Utils" }
foo :: Int -> IO Int
foo _ = return 42

makes sense, but probably

@GetMapping "/user/{id}"
@Export
foo :: Int -> IO Int
foo _ = return 42

doesn't, as an annotated function cannot be used from Java unless it is exported, so we could leave it as

@GetMapping "/user/{id}"
foo :: Int -> IO Int
foo _ = return 42

which is clearer and is not cluttered by the Export annotation.

Another thing we might have in mind is the @Static annotation, which maybe can have the same parameters as the @Export:

@Static
@Export { package = "eta.util", class = "Utils" }
bar :: Int -> String
bar _ = "Hi"

would be replaced by

@Static { package = "eta.util", class = "Utils" }
bar :: Int -> String
bar _ = "Hi"

Or even, we could call the annotation @StaticExport

@jneira
Collaborator

jneira commented Jul 19, 2018

Another use case: I would like to export already existing ADTs from a Haskell package, in my case the ADT representing the AST of the Dhall language.
The export could be:

@Export(package="[some-package]") Dhall.Core.Expr

With the ADT being:

data Const = Type | Kind

data Expr s a
    -- | > Const c                                  ~  c
    = Const Const
    -- | > Var (V x 0)                              ~  x
    --   > Var (V x n)                              ~  x@n
    | Var Var
    -- | > Lam x     A b                            ~  λ(x : A) -> b
    | Lam Text (Expr s a) (Expr s a)
....
deriving (Functor, Foldable, Traversable, Show, Eq, Data)

I would like to have:

public abstract class Const {}
public final class Type extends Const {
   public Type() {}
}
....

public abstract class Expr<S,A>  {
    // A gigantic church encoding of sum types ????, i am afraid it is unusable
    // it could be useful for Maybe or Either though
    public <T> T match(Function<Const,T> f1, Function<Lam<S,A>,T> f2, .... another 50 params) {

    }
}

public final class Lam<S,A> extends Expr<S,A> {
   
   private final Text t;
   private final Expr<S,A>e1,e2;
   
   public Lam(Text t,Expr<S,A> e1, Expr<S,A> e2) {
      ...
   }
   // No sensible default field names (x1,x2,x3??) for product types 
   // so using a poor man's pattern matching
   public <R> R match(Function3<Text,Expr<S,A>,Expr<S,A>,R> f) {
      ... 
   }
   // Not sure about adding type class methods as java methods
   public Lam<T,A> fmap (Function<S,T> f1) {
      ...
   }
  // same for traverse, and other type classes?
}
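For a small type such as Maybe, the Church-encoded match stays manageable. The sketch below is one possible shape, under the assumption that the exported class would expose a single match method with one argument per constructor; `Maybe`, `nothing` and `just` are hypothetical names here, not generated output.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical Java rendering of: data Maybe a = Nothing | Just a
abstract class Maybe<A> {
    private Maybe() {}

    // One argument per constructor: the Church encoding of the sum type.
    abstract <T> T match(Supplier<T> onNothing, Function<A, T> onJust);

    static <A> Maybe<A> nothing() {
        return new Maybe<A>() {
            @Override
            <T> T match(Supplier<T> onNothing, Function<A, T> onJust) {
                return onNothing.get();
            }
        };
    }

    static <A> Maybe<A> just(A value) {
        return new Maybe<A>() {
            @Override
            <T> T match(Supplier<T> onNothing, Function<A, T> onJust) {
                return onJust.apply(value);
            }
        };
    }
}
```

A caller writes `maybe.match(() -> "none", x -> "just " + x)` and cannot forget a case, which is the main attraction; with 50 constructors the same signature becomes unusable, as noted in the comment above.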

Moreover we could mark a module as exportable:

@ExportModule(package="[some-package]") Some.Module.IncludingThis

and export all its public ADTs automatically, if possible

@rahulmutt maybe i am asking for something impossible or not practical 😐

@rahulmutt
Member

rahulmutt commented Jul 22, 2018

@jneira Thanks for presenting a nice use case! I like how you handled Church encoding and typeclass methods.

But the main issue with the Church encoding you presented is that it can have a large number of arguments. I wonder if for the Java side, it's better to present a Builder-like API for constructing pattern matches:

T result = expr.matchConst(c -> ...)
               .matchVar(var -> ...)
               .matchAnything(x -> ...)
               .match();

matchAnything corresponds to the _ case, and all the match* functions return a MatchBuilder specific to the type, in this case maybe Expr.MatchBuilder.

That last call to match will gather the information from the pattern matches and delegate to an inner full-form Church encoding like the one you presented. For large data types, such a huge function call may be inefficient, so the builder could instead construct something that simply does a sequence of instanceof checks. That should probably be a lot more efficient.
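A minimal sketch of the instanceof-based variant follows. The three-constructor `Expr` hierarchy and the `MatchBuilder` class are hypothetical stand-ins, and this version evaluates the first matching case eagerly instead of collecting the cases and dispatching in match(), which is a simplification of the design described above.

```java
import java.util.function.Function;

// Tiny stand-in hierarchy; the real Expr has many more constructors.
abstract class Expr {
    <T> MatchBuilder<T> matchConst(Function<Const, T> f) {
        return new MatchBuilder<T>(this).matchConst(f);
    }
}

final class Const extends Expr {
    final String name;
    Const(String name) { this.name = name; }
}

final class Var extends Expr {
    final String name;
    Var(String name) { this.name = name; }
}

final class Lam extends Expr {}

final class MatchBuilder<T> {
    private final Expr scrutinee;
    private T result;
    private boolean matched;

    MatchBuilder(Expr scrutinee) { this.scrutinee = scrutinee; }

    // First matching case wins; later cases are skipped.
    private <E extends Expr> MatchBuilder<T> on(Class<E> type, Function<? super E, ? extends T> f) {
        if (!matched && type.isInstance(scrutinee)) {
            result = f.apply(type.cast(scrutinee));
            matched = true;
        }
        return this;
    }

    MatchBuilder<T> matchConst(Function<Const, T> f) { return on(Const.class, f); }

    MatchBuilder<T> matchVar(Function<Var, T> f) { return on(Var.class, f); }

    // Plays the role of the _ case.
    MatchBuilder<T> matchAnything(Function<Expr, T> f) { return on(Expr.class, f); }

    T match() {
        if (!matched) {
            throw new IllegalStateException("no case matched");
        }
        return result;
    }
}
```

Unlike the full Church encoding, this does not check exhaustiveness at compile time, so a missing case only shows up as the IllegalStateException at run time.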

@jneira
Collaborator

jneira commented Mar 25, 2019

Hi, after writing https://github.com/eta-lang/dhall-eta I felt the pain of building a lib that essentially creates a Java binding for a Haskell lib.
The main class representing the Dhall ADT is 500 LOC, but another source of annoying boilerplate was writing the JavaConverter instances for them: https://github.com/eta-lang/dhall-eta/blob/master/src/main/eta/Dhall/Eta/Core/Java.hs#L989-L1292

It would be very nice if the tool that generates Java classes from Haskell code also generated those instances automatically.
To implement the instances I had to write some for common Haskell data definitions, like Maybe or Either; it would be nice to have those in some common lib too (base or another one).
