CS: cptypes friendly unbox #4211

gus-massa · 2022-04-18T17:40:03Z

This is a just a initial short draft to make schemify more cptypes-friendly and make cptypes more schemify-friendly.

My idea is instead of using a macro to force unbox to be inlined, modify it so cp0 can inline it. This version expose the checks for the impersonator and chaperone, so cptypes can understand that it's a record and call a helper function to handle the difficult case once the type is confirmed.

In particular this enable some reductions like

#lang racket
(lambda (x y) (unbox x) (car y) (eq? x y))
;==>  
(lambda (x y) (unbox x) (car y) #f)

The first problem is that the new version is too big to be inlined, so I had to modify cp0 increasing score-limit to 40. This change causes a weird error in hash.ms because an auxiliary function gets inlined and a closure is transformed into a normal function.

About the main part of the change, there are a few method to check the if it's a impersonator and chaperone. I think it's better to use box-impersonator? and box-chaperone?. It's enough to mark it as a record in cptypes, but still cptypes can only track one kind of record, so I hope to increase that to two independent records in the future. Other possibility is to make box-chaperone a subrecord of box-impersonator but this alternative needs more changes in the code outside cptypes. A third possibility is to add lens to cptypes, but I'm not sure how long it will take.

I also modified the code in the helper function that deals with impersonator and chaperone, in the case when the content of the impersonator and chaperone is not a box because I don't expect it to be other kind of impersonator, but it may be a mistake.

I'm still not sure if it's better to keep the macro version of unbox to force it to be inlined always, and modify unbox so it's inlined in the more unusual cases.

samth · 2022-04-18T17:58:56Z

racket/src/ChezScheme/mats/hash.ms

@@ -1662,7 +1662,7 @@
          [(4) (gensym)]
          [(5) (open-output-string)]
          [(6) (fxvector 15 55)]
-          [(7) (lambda () x)]
+          [(7) (let ([g (gensym)]) (lambda () (lisy x g)))]


Suggested change

[(7) (let ([g (gensym)]) (lambda () (lisy x g)))]

[(7) (let ([g (gensym)]) (lambda () (list x g)))]

I think this is just a typo.

Thanks! Fixed.

samth · 2022-04-19T16:47:08Z

For my edification, what patterns in Schemify output get helped by this?

gus-massa · 2022-04-20T12:35:26Z

Sure. I'll use fake prefixes racket:, rumble: and chez: because otherwise it's very confusing. I'll also replace in the Chez Scheme primitives the #3% with unsafe-.

Currently there is a macro inline:unbox that causes this expansion:

(lambda (x) (racket:unbox x) (eq? x 7))
; ==>
(lambda (x) (inline:unbox x) (eq? x 7))
; ==>
(lambda (x) (if (chez:box? x)
                (cs:unsafe-unbox x)
                ($app/no-inline (top-level-value 'rumble:unbox))
            (eq? x 7))

Where rumble:unbox checks again if it's a real cs:box? or an impersonator(chaperone and in the second case calls a helper function. The call to rumble:unbox is very indirect, so cptypes don't know what it's doing, and then there is not information about x to reduce (eq? x 7).

My proposal is to make rumble:unbox inlinable and move more work there, so the new expansion in cp0 is

(lambda (x) (racket:unbox x) (eq? x 7))
; ==>
(lambda (x) (rumble:unbox x) (eq? x 7))
; ==>
(lambda (x) (if (chez:box? x)
                (cs:unsafe-unbox x)
                (if (or (chaperone-box? x) (impersonator-box? x))
                    ($app/no-inline (top-level-value 'impersonate-unbox))
                    (cs:unbox x))) ; <-- to raise an error
            (eq? x 7))

Now cptypes can see that x must be a cs:box? or a record of type chaperone-box? or impersonator-box? and then can reduce (eq? x 7).

Currently cptypes can't remember two records, so the actual internal type is something like (Union 'box? 'record?), that is enough to reduce (eq? x 7). I hope to improve this in the future, so cptypes can remember two records. (Remembering three records is more difficult because there are too many cases for unions and intersections, but the version for two records is fine.)

The current version is enough to make similar reductions with things that can't be impersonated, like

(lambda (x) (racket:unbox x) (pair? x))

But to reduce

(lambda (x) (racket:unbox x) (racket:box? x))

it's necessary to to make cptypes remember two records, and modify the implementation of racket:box? in a similar way to my proposed change of racket:unbox.

Also, to reduce

(lambda (x) (racket:unbox x) (racket:vector? x))

it's necessary to to make cptypes remember two records, and modify the implementation of racket:vector?. With the version of cptypes that can remember two records cptypes can prove that they are disjoint. Otherwise the two records are collapsed to a generic record, an cptypes will not see that they are disjoint.

gus-massa · 2022-04-20T18:37:50Z

I uncommented the relevant test in optimize.rktl

mflatt · 2022-05-01T14:31:00Z

I worry about an implementation approach that involves changing the inlining heuristic. The test failure illustrates one kind of pitfall, although that sort of thing mostly happens to tests, which often depend on assumptions about the implementation. But there's also likely to be slowdowns or increased memory use for real programs.

If the goal is to get inlining to happen always, maybe implementing unbox with a macro would be a better way to go?

With the new implementation, cptypes can collect more information that may be useful for reductions in the following code.

gus-massa · 2022-05-02T13:30:11Z

About the test: I classify this as a bug in the test, it depends too much in the implementation details.

Anyway, messing with the optimization heuristic is a problem because I don't understand how they affect the tradeoff. (Note: I'm not sure if it's necessary to increase it form 20 to 40, perhaps 21 is enough. I didn't made a very exhaustive test because 21 may be enough for unbox but perhaps later I'll discover that set-box! may need 23 and then ...)

I made another version, that uses a macro to force expanding unbox. I redefined unbox so this may be expanded in every call inside rumble, but I'm not 100% sure this is true because I'm not sure how the files affect each other.

An alternative is to rename my unbox to inline:unbox and also rename general-unbox to unbox and then rename them again in rumble, so all the uses of unbox in racket are expanded but the internal uses of unbox in rumble are not expanded.

I've seen both approaches in other parts of rumble, so I'm not sure which is the best.

mflatt · 2022-05-02T15:46:18Z

Defining unbox as a macro (instead of defining inline:unbox) seems like a good choice in this case. It appears that the implementation of general-unbox and general-unsafe-unbox could use the unbox and unsafe-unbox macros, respectively, to avoid duplicating the implementation.

For impersonate-unbox, I think the looping else clause is probably still needed to deal with the possibility of extra impersonators that, for example, just add impersonate properties and don't specifically adjust the behavior unbox. I forget if property-only impersonators are implemented exactly that way, but the general idea is allow a mixture of wrappers.

gus-massa · 2022-05-03T17:31:14Z

Do you mean this?

racket/racket/src/cs/rumble/impersonator.ss

Lines 315 to 316 in 10a34f3

    
           (define-record props-impersonator impersonator ()) 
        
           (define-record props-chaperone chaperone ())

IIUC this can be a huge problem for my proposed change. I'll try to add some test that are broken by my current version, and take some time to think about it. It's easy to fix this so cptypes detect that pair? and racket:box? are disjoint, but it looks more difficult to detect that racket:vector? and racket:box? are disjoint.

Perhaps the only solution is to make cptypes understand (#2%box? (impersonator-val x)). I like it in the long term, but I'm not sure how many hidden traps it has.

mflatt · 2022-05-03T17:58:35Z

That's what I had in mind, but maybe it doesn't apply to boxes, after all. The do-impersonate-box function does not use its make-props-impersonator argument.

gus-massa · 2022-05-03T19:57:57Z

I think it does not apply to boxes, but adding #f as an option for the setter and getter of a impersonated box looks like a very sensible idea. It is possible in vectors, structs and procedures. (I'm almost want to post a feature request.) So, I think it's better to keep the current implementation of rumble:box? that assume it's possible, just in case.

Another possibility is to delete props-impersonator completely and just use vector-impersonator, struct-impersonator and procedure-impersonator instead. Is there a lot of improvement using a specialized case for adding just properties?

mflatt · 2022-05-05T22:21:17Z

I think you may be right that props-impersonator doesn't really help for boxes and vectors. The right layering for impersonators hasn't been obvious, and it's easy to believe that there could be a better one.

gus-massa added the racket-cs Specific to Racket-on-Chez label Apr 18, 2022

samth reviewed Apr 18, 2022

View reviewed changes

gus-massa force-pushed the 22-4-Unbox branch from 8fe47c5 to 4a537d5 Compare April 18, 2022 19:23

gus-massa force-pushed the 22-4-Unbox branch from 4a537d5 to 015c331 Compare April 20, 2022 16:19

cs: change implementation of unbox

1d88f46

With the new implementation, cptypes can collect more information that may be useful for reductions in the following code.

gus-massa force-pushed the 22-4-Unbox branch from 015c331 to 1d88f46 Compare May 2, 2022 13:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CS: cptypes friendly unbox #4211

CS: cptypes friendly unbox #4211

gus-massa commented Apr 18, 2022

samth Apr 18, 2022

gus-massa Apr 18, 2022

samth commented Apr 19, 2022

gus-massa commented Apr 20, 2022

gus-massa commented Apr 20, 2022

mflatt commented May 1, 2022

gus-massa commented May 2, 2022

mflatt commented May 2, 2022

gus-massa commented May 3, 2022

mflatt commented May 3, 2022

gus-massa commented May 3, 2022

mflatt commented May 5, 2022

	[(7) (let ([g (gensym)]) (lambda () (lisy x g)))]
	[(7) (let ([g (gensym)]) (lambda () (list x g)))]

CS: cptypes friendly unbox #4211

Are you sure you want to change the base?

CS: cptypes friendly unbox #4211

Conversation

gus-massa commented Apr 18, 2022

samth Apr 18, 2022

Choose a reason for hiding this comment

gus-massa Apr 18, 2022

Choose a reason for hiding this comment

samth commented Apr 19, 2022

gus-massa commented Apr 20, 2022

gus-massa commented Apr 20, 2022

mflatt commented May 1, 2022

gus-massa commented May 2, 2022

mflatt commented May 2, 2022

gus-massa commented May 3, 2022

mflatt commented May 3, 2022

gus-massa commented May 3, 2022

mflatt commented May 5, 2022