Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test the soundness of vm::Value #40

Open
4 of 5 tasks
rpjohnst opened this issue Apr 4, 2020 · 2 comments
Open
4 of 5 tasks

Test the soundness of vm::Value #40

rpjohnst opened this issue Apr 4, 2020 · 2 comments

Comments

@rpjohnst
Copy link
Owner

rpjohnst commented Apr 4, 2020

Edit: This has been addressed; this issue now tracks testing to ensure this doesn't get broken.

The VM stack contains three kinds of raw pointers, with relatively unclear rules around when and how they can be used soundly. This makes it hard to determine whether (modifications to) the interpreter are correct. Worse, it exposes the entirely-safe engine APIs to unsoundness- for example, this is the main blocker for implementing instance_create and instance_destroy.

These are the three kinds of pointers:

  • vm::Value can encode a vm::Array, which is itself an Rc<UnsafeCell>. However, because vm::Value is Copy, the Rc is not helpful at this level- it's there to implement copy-on-write arrays at the GML level.
  • vm::Row is an intermediate ptr::NonNull into a vm::Array. It is intended to be "short-lived" (i.e. not stored in instances) but we may still want Dejavu to be allowed to produce optimized bytecode such that it lives across calls to other scripts or even engine APIs.
  • A with iterator is a ptr::NonNull to some array of vm::Entitys owned by the vm::Thread or vm::World. It is "short-lived" in the same way as vm::Row, but unlike vm::Row is required by GML semantics to live across calls in the body of the with loop.

There are two separate-but-related problems we need to avoid:

  • The most obvious problem is that these pointers are not tracked by the borrow checker, making it easy to use one after its target is gone, on accident. This risk is mitigated somewhat for vm::Row and with iterators because they are confined to the VM stack, but it's still there.
  • The second problem is scarier and more subtle- the interpreter or the engine might violate Rust's aliasing rules and hit undefined behavior. This risk is also mitigated somewhat for raw pointers, but the code was not written with very much attention to this detail.

What I'd like to do to build confidence that this is sound:

  • Figure out a way to resolve the fact that Value contains an Rc but is also Copy- maybe this is okay if we can enforce that there is always a corresponding live Rc on the VM stack, or maybe we need to drop the Copy impl. (Make vm::Value non-Copy #45)
  • Fix the way with and instances work so it doesn't immediately explode if you e.g. instance_create in a with body. @Zekka suggested making the arrays copy-on-write, which solves the use-after-free issue; we just need to work out the specifics. (Implement instance_destroy based on copy-on-write entity lists #46)
  • Encode the above in the types exposed to the engine. We might need to expose the concept of a frame lifetime, add some reference counting, etc.
  • Audit the whole thing and fix any remaining shenanigans with references and other kinds of aliasing.
  • Expand the test suite with this in mind and run it through Miri, ideally on CI. The current test suite passes, but it doesn't exercise much of this stuff.
@rpjohnst rpjohnst changed the title Pointer aliasing soundness for vm::Thread and vm::World is unclear Soundness of pointers in vm::Thread and vm::World is unclear Apr 4, 2020
rpjohnst added a commit that referenced this issue Apr 13, 2020
Includes some `Symbol` clean-up from #42.

Closes #43 by treating NaN as a tagged value. Future work on #40 may take
another approach.
@rpjohnst
Copy link
Owner Author

Regarding the first checkbox (Value contains an Rc but is also Copy) I believe my original intent was that Values outside of the VM stack were really more like RcBorrows- they don't increment the refcount, but they don't decrement it either. But to expand on what the checkbox says: this isn't quite enough, because they don't actually borrow from anything.

This suggests a way to drop the Copy impl without drastically increasing refcount traffic- most of what the interpreter and engine APIs do can be done without a fully owned Value, so they can deal in &Values and use clone when they need to keep it around longer.

On its own, this just replaces all that refcount traffic with a bunch of extra references instead. It may be better to introduce something like a ValueRef<'v> type with the same representation as the Value it borrows from, but Copy like the original implementation of Value.

rpjohnst added a commit that referenced this issue May 16, 2020
This moves us closer to a sound and safe API as discussed in #40. Because
a `vm::Value` may point to an array, users must check its type and perhaps
increment its refcount anywhere they duplicate the value.

As part of overhauling `vm::Value`, this commit also slightly tweaks the
NaN-boxing scheme. The quiet bit is now always set, moving tagged values
up to the top of the available bit-pattern range. This simplifies decoding
such that all real values are contiguous. (See #43)

To avoid introducing new, redundant refcount traffic, there are now also
`vm::ValueRef` and `vm::ArrayRef` types that share a representation with
their owned counterparts, but can only exist as a borrow.

The interpreter relies entirely on codegen to provide it with sound
bytecode that does not violate host language rules. This means that each
opcode handler must carefully choose whether to borrow, move, or clone its
operands, so that codegen can rely on that behavior.
rpjohnst added a commit that referenced this issue May 16, 2020
This moves us closer to a sound and safe API as discussed in #40. Because
a `vm::Value` may point to an array, users must check its type and perhaps
increment its refcount anywhere they duplicate the value.

As part of overhauling `vm::Value`, this commit also slightly tweaks the
NaN-boxing scheme. The quiet bit is now always set, moving tagged values
up to the top of the available bit-pattern range. This simplifies decoding
such that all real values are contiguous. (See #43)

To avoid introducing new, redundant refcount traffic, there are now also
`vm::ValueRef` and `vm::ArrayRef` types that share a representation with
their owned counterparts, but can only exist as a borrow.

The interpreter relies entirely on codegen to provide it with sound
bytecode that does not violate host language rules. This means that each
opcode handler must carefully choose whether to borrow, move, or clone its
operands, so that codegen can rely on that behavior.
rpjohnst added a commit that referenced this issue May 23, 2020
Entity lists are now copy-on-write. This lets the interpreter keep
iterators pointing into them while (potentially) mutating them, and
preserves GML `with` semantics in the presence of such mutation.

In particular, this makes `instance_create` and `instance_destroy` memory
safe when called from GML, as described by #40.
rpjohnst added a commit that referenced this issue May 23, 2020
Entity lists are now copy-on-write. This lets the interpreter keep
iterators pointing into them while (potentially) mutating them, and
preserves GML `with` semantics in the presence of such mutation.

In particular, this makes `instance_create` and `instance_destroy` memory
safe when called from GML, as described by #40.
@rpjohnst rpjohnst changed the title Soundness of pointers in vm::Thread and vm::World is unclear Test the soundness of vm::Value Aug 8, 2020
@rpjohnst
Copy link
Owner Author

rpjohnst commented Aug 8, 2020

I've addressed the known issues here, so the remaining work is to make sure we have good test coverage. This might include running the test suite under Miri and other sanitizers, as well as fuzzing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant