Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NonNull vs *const in Repr #356

Open
overlookmotel opened this issue Jan 4, 2024 · 9 comments
Open

NonNull vs *const in Repr #356

overlookmotel opened this issue Jan 4, 2024 · 9 comments

Comments

@overlookmotel
Copy link
Contributor

overlookmotel commented Jan 4, 2024

This is a question more than a bug report, but just asking in case it reveals something more useful...

  • Repr's 1st field is a *const ().
  • HeapBuffer's 1st field is a ptr::NonNull<u8>.
  • StaticStr's 1st field is a ptr::NonNull<u8>.

HeapBuffer and StaticStr are both transmuted to Repr.

Is there a benefit to HeapBuffer and StaticStr using NonNull<u8> rather than *const? Does it matter that a value with a niche is transmuted to one without?

Second smaller question: Why *const () in Repr instead of *const u8? Only place which accesses this field seems to be as_slice, where it's immediately cast to *const u8. Probably it makes no difference to codegen, but I'm wondering if there's a subtlety I'm missing.

@NobodyXu
Copy link
Contributor

NobodyXu commented Jan 4, 2024

IMO it's probably because the inline repr, where the first character can be \0.

@overlookmotel
Copy link
Contributor Author

overlookmotel commented Jan 5, 2024

Sorry if my question was confused. I understand that Repr's 1st field must be *const not NonNull, because 0 is a valid bit pattern for the first 8 bytes of Repr (e.g. empty string stored inline).

My question was really: Does it matter that HeapBuffer and StaticStr's use of NonNull doesn't match with Repr, which they get transmuted to?

@NobodyXu
Copy link
Contributor

NobodyXu commented Jan 5, 2024

My question was really: Does it matter that HeapBuffer and StaticStr's use of NonNull doesn't match with Repr, which they get transmuted to?

Hmmm yeah it gets ignored currently, but I guess maybe in the future it can be changed to take advantage of that?

For example, we can represent empty inline string using ['\0', 1, ..., LENGTH_MASK] so that the first const ()* will never be null, thus enable using NonNull<()> instead.

@overlookmotel
Copy link
Contributor Author

Nice idea. That would deal with the empty string case.

But there is still one weird case where 0 is valid bit pattern for all of 1st 8 bytes: CompactString::new("\0\0\0\0\0\0\0\0"). I think that rules out using NonNull in Repr.

So question is then: Should HeapBuffer and StaticStr's first field be changed to *const () to match Repr? Or is it OK that they don't match?

And is there a reason why *const () not *const u8?

@NobodyXu
Copy link
Contributor

NobodyXu commented Jan 5, 2024

But there is still one weird case where 0 is valid bit pattern for all of 1st 8 bytes: CompactString::new("\0\0\0\0\0\0\0\0"). I think that rules out using NonNull in Repr.

Maybe we should treat \0\0 as a special case?

If a string only consists of null bytes , then treat it as empty string?

@ParkMyCar
Copy link
Owner

But there is still one weird case where 0 is valid bit pattern for all of 1st 8 bytes: CompactString::new("\0\0\0\0\0\0\0\0"). I think that rules out using NonNull in Repr.

Bingo!

And is there a reason why *const () not *const u8?

No particular reason, Repr was crafted very carefully to get the compiler to emit optimal ASM and we use the pointer as the first element, as opposed to a usize, so tools like miri can check provenance.

Should HeapBuffer and StaticStr's first field be changed to *const () to match Repr? Or is it OK that they don't match?

AFAIU it's okay that they don't match, and HeapBuffers and StaticStrs pointers should be NonNull so semantically that seemed like the best choice

@overlookmotel
Copy link
Contributor Author

AFAIU it's okay that they don't match, and HeapBuffers and StaticStrs pointers should be NonNull so semantically that seemed like the best choice

Thanks for coming back. If I may ask one more probably stupid question: Why "should" their pointers be NonNull? (as opposed to *const ())

@NobodyXu
Copy link
Contributor

NobodyXu commented May 7, 2024

so tools like miri can check provenance.

This reminds me of a problem: with arm CHERI, size of pointer is 128 bits instead of 64 bits, containing an extra tag for the heap allocation.

It looks like we might have to update this crate to support arm CHERI @ParkMyCar

@overlookmotel
Copy link
Contributor Author

overlookmotel commented May 7, 2024

I've opened a separate issue #378 for that point NobodyXu (just because if possible I'd be really interested in answer to my question above, and didn't want conversation here to get sidetracked).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants