Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CS: cptypes friendly unbox #4211

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

Conversation

gus-massa
Copy link
Contributor

This is a just a initial short draft to make schemify more cptypes-friendly and make cptypes more schemify-friendly.

My idea is instead of using a macro to force unbox to be inlined, modify it so cp0 can inline it. This version expose the checks for the impersonator and chaperone, so cptypes can understand that it's a record and call a helper function to handle the difficult case once the type is confirmed.

In particular this enable some reductions like

#lang racket
(lambda (x y) (unbox x) (car y) (eq? x y))
;==>  
(lambda (x y) (unbox x) (car y) #f)

The first problem is that the new version is too big to be inlined, so I had to modify cp0 increasing score-limit to 40. This change causes a weird error in hash.ms because an auxiliary function gets inlined and a closure is transformed into a normal function.

About the main part of the change, there are a few method to check the if it's a impersonator and chaperone. I think it's better to use box-impersonator? and box-chaperone?. It's enough to mark it as a record in cptypes, but still cptypes can only track one kind of record, so I hope to increase that to two independent records in the future. Other possibility is to make box-chaperone a subrecord of box-impersonator but this alternative needs more changes in the code outside cptypes. A third possibility is to add lens to cptypes, but I'm not sure how long it will take.

I also modified the code in the helper function that deals with impersonator and chaperone, in the case when the content of the impersonator and chaperone is not a box because I don't expect it to be other kind of impersonator, but it may be a mistake.

I'm still not sure if it's better to keep the macro version of unbox to force it to be inlined always, and modify unbox so it's inlined in the more unusual cases.

@gus-massa gus-massa added the racket-cs Specific to Racket-on-Chez label Apr 18, 2022
@@ -1662,7 +1662,7 @@
[(4) (gensym)]
[(5) (open-output-string)]
[(6) (fxvector 15 55)]
[(7) (lambda () x)]
[(7) (let ([g (gensym)]) (lambda () (lisy x g)))]
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[(7) (let ([g (gensym)]) (lambda () (lisy x g)))]
[(7) (let ([g (gensym)]) (lambda () (list x g)))]

I think this is just a typo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Fixed.

@samth
Copy link
Sponsor Member

samth commented Apr 19, 2022

For my edification, what patterns in Schemify output get helped by this?

@gus-massa
Copy link
Contributor Author

Sure. I'll use fake prefixes racket:, rumble: and chez: because otherwise it's very confusing. I'll also replace in the Chez Scheme primitives the #3% with unsafe-.

Currently there is a macro inline:unbox that causes this expansion:

(lambda (x) (racket:unbox x) (eq? x 7))
; ==>
(lambda (x) (inline:unbox x) (eq? x 7))
; ==>
(lambda (x) (if (chez:box? x)
                (cs:unsafe-unbox x)
                ($app/no-inline (top-level-value 'rumble:unbox))
            (eq? x 7))

Where rumble:unbox checks again if it's a real cs:box? or an impersonator(chaperone and in the second case calls a helper function. The call to rumble:unbox is very indirect, so cptypes don't know what it's doing, and then there is not information about x to reduce (eq? x 7).

My proposal is to make rumble:unbox inlinable and move more work there, so the new expansion in cp0 is

(lambda (x) (racket:unbox x) (eq? x 7))
; ==>
(lambda (x) (rumble:unbox x) (eq? x 7))
; ==>
(lambda (x) (if (chez:box? x)
                (cs:unsafe-unbox x)
                (if (or (chaperone-box? x) (impersonator-box? x))
                    ($app/no-inline (top-level-value 'impersonate-unbox))
                    (cs:unbox x))) ; <-- to raise an error
            (eq? x 7))

Now cptypes can see that x must be a cs:box? or a record of type chaperone-box? or impersonator-box? and then can reduce (eq? x 7).

Currently cptypes can't remember two records, so the actual internal type is something like (Union 'box? 'record?), that is enough to reduce (eq? x 7). I hope to improve this in the future, so cptypes can remember two records. (Remembering three records is more difficult because there are too many cases for unions and intersections, but the version for two records is fine.)

The current version is enough to make similar reductions with things that can't be impersonated, like

(lambda (x) (racket:unbox x) (pair? x))

But to reduce

(lambda (x) (racket:unbox x) (racket:box? x))

it's necessary to to make cptypes remember two records, and modify the implementation of racket:box? in a similar way to my proposed change of racket:unbox.

Also, to reduce

(lambda (x) (racket:unbox x) (racket:vector? x))

it's necessary to to make cptypes remember two records, and modify the implementation of racket:vector?. With the version of cptypes that can remember two records cptypes can prove that they are disjoint. Otherwise the two records are collapsed to a generic record, an cptypes will not see that they are disjoint.

@gus-massa
Copy link
Contributor Author

I uncommented the relevant test in optimize.rktl

@mflatt
Copy link
Member

mflatt commented May 1, 2022

I worry about an implementation approach that involves changing the inlining heuristic. The test failure illustrates one kind of pitfall, although that sort of thing mostly happens to tests, which often depend on assumptions about the implementation. But there's also likely to be slowdowns or increased memory use for real programs.

If the goal is to get inlining to happen always, maybe implementing unbox with a macro would be a better way to go?

With the new implementation, cptypes can collect more
information that may be useful for reductions in the
following code.
@gus-massa
Copy link
Contributor Author

About the test: I classify this as a bug in the test, it depends too much in the implementation details.

Anyway, messing with the optimization heuristic is a problem because I don't understand how they affect the tradeoff. (Note: I'm not sure if it's necessary to increase it form 20 to 40, perhaps 21 is enough. I didn't made a very exhaustive test because 21 may be enough for unbox but perhaps later I'll discover that set-box! may need 23 and then ...)

I made another version, that uses a macro to force expanding unbox. I redefined unbox so this may be expanded in every call inside rumble, but I'm not 100% sure this is true because I'm not sure how the files affect each other.

An alternative is to rename my unbox to inline:unbox and also rename general-unbox to unbox and then rename them again in rumble, so all the uses of unbox in racket are expanded but the internal uses of unbox in rumble are not expanded.

I've seen both approaches in other parts of rumble, so I'm not sure which is the best.

@mflatt
Copy link
Member

mflatt commented May 2, 2022

Defining unbox as a macro (instead of defining inline:unbox) seems like a good choice in this case. It appears that the implementation of general-unbox and general-unsafe-unbox could use the unbox and unsafe-unbox macros, respectively, to avoid duplicating the implementation.

For impersonate-unbox, I think the looping else clause is probably still needed to deal with the possibility of extra impersonators that, for example, just add impersonate properties and don't specifically adjust the behavior unbox. I forget if property-only impersonators are implemented exactly that way, but the general idea is allow a mixture of wrappers.

@gus-massa
Copy link
Contributor Author

Do you mean this?

(define-record props-impersonator impersonator ())
(define-record props-chaperone chaperone ())

IIUC this can be a huge problem for my proposed change. I'll try to add some test that are broken by my current version, and take some time to think about it. It's easy to fix this so cptypes detect that pair? and racket:box? are disjoint, but it looks more difficult to detect that racket:vector? and racket:box? are disjoint.

Perhaps the only solution is to make cptypes understand (#2%box? (impersonator-val x)). I like it in the long term, but I'm not sure how many hidden traps it has.

@mflatt
Copy link
Member

mflatt commented May 3, 2022

That's what I had in mind, but maybe it doesn't apply to boxes, after all. The do-impersonate-box function does not use its make-props-impersonator argument.

@gus-massa
Copy link
Contributor Author

I think it does not apply to boxes, but adding #f as an option for the setter and getter of a impersonated box looks like a very sensible idea. It is possible in vectors, structs and procedures. (I'm almost want to post a feature request.) So, I think it's better to keep the current implementation of rumble:box? that assume it's possible, just in case.

Another possibility is to delete props-impersonator completely and just use vector-impersonator, struct-impersonator and procedure-impersonator instead. Is there a lot of improvement using a specialized case for adding just properties?

@mflatt
Copy link
Member

mflatt commented May 5, 2022

I think you may be right that props-impersonator doesn't really help for boxes and vectors. The right layering for impersonators hasn't been obvious, and it's easy to believe that there could be a better one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
racket-cs Specific to Racket-on-Chez
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants