RFC: Rethinking the design of the crates API #721

Open · elinorbgr opened this issue May 7, 2024 · 8 comments

@elinorbgr
Member

So, I've been thinking about wayland-rs' API again (gotta love my brain, the running gag is still going strong), and I'm dumping these thoughts here, to see if they can serve as a starting point for a positive discussion.

As a preamble, as far as I can tell, the current API is for most users "good enough". Although some pain points remain (in particular regarding thread-safety), it is mostly usable and reasonable. As such, I don't think it's worth going into yet another redesign of the API unless we have a strong confidence that it would make things significantly better. This is why I'm opening this "RFC" issue: to gather feedback on those ideas, see if/how they can be fleshed out, and evaluate if it is a direction worth pursuing or not.

Design of the backend

In the latest big redesign, I've tried to make wayland-backend into an "as low-level as possible" wrapper around both a Rust implementation of the protocol and libwayland. It turns out that I could have gone deeper into this "going lower-level" route by relaxing a constraint I had set myself at the time: I tried to make the backend not only Rust-safe, but also make it catch a lot of footguns that could trigger protocol errors. I am considering relaxing that.

The backend would be reworked to completely remove the ObjectData trait, and the core API it would expose to process messages would be something akin to:

impl Backend {
    fn incoming_messages(&mut self, f: impl FnMut(Message)) { ... }
}

The user thus provides a callback that is repeatedly invoked for each incoming message. They are free to use it to process messages in real time, or to store them in a buffer for later processing.
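
For illustration, here is a rough usage sketch, assuming the Backend and Message types from the snippet above (the helper name is hypothetical, not a committed API):

fn collect_pending(backend: &mut Backend) -> Vec<Message> {
    let mut pending = Vec::new();
    // Buffer every incoming message for later processing...
    backend.incoming_messages(|msg| pending.push(msg));
    pending
    // ...or alternatively, handle each message inside the closure itself.
}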

At the level of the backend, objects are only represented by their ObjectId, a handle that is Eq + Hash + Clone and keeps uniquely representing the same object, even after it is destroyed.

The backend no longer automatically tries to track object destruction for you, and will expose a method like:

impl Backend {
    fn destroy_object(&mut self, id: ObjectId) { ... }
}

Similarly, the backend no longer checks the signature of outgoing messages, and will happily serialize to the wire whatever you give it. This means that it would be easy to trigger protocol errors (but no Rust-unsafety) if you use these APIs wrong.

Associating data to objects

An important need is the capacity to associate data with specific instances of Wayland objects. With the removal of ObjectData from the backend, we need a new mechanism to cover that.

I'm thinking of introducing a DataMap container, which would in essence be a glorified HashMap<ObjectId, TypeMap> (I'm assuming the runtime cost of the map lookup is negligible). This container has no synchronization baked into it, and the user would be free to use more than one if practical or relevant. Data is not automatically removed from the container, and the user needs to actively remove it when the relevant object is destroyed.
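
As a minimal sketch of what such a container could look like (ObjectId is stood in by a plain integer here, and every name is illustrative rather than a committed API):

use std::any::{Any, TypeId};
use std::collections::HashMap;

type ObjectId = u64; // stand-in for the backend's Eq + Hash + Clone handle

#[derive(Default)]
struct DataMap {
    // outer key: the object; inner key: the type of the stored value
    inner: HashMap<ObjectId, HashMap<TypeId, Box<dyn Any>>>,
}

impl DataMap {
    /// Associate a value of type T with the given object.
    fn insert<T: Any>(&mut self, id: ObjectId, value: T) {
        self.inner
            .entry(id)
            .or_default()
            .insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Borrow the T previously associated with the object, if any.
    fn get<T: Any>(&self, id: ObjectId) -> Option<&T> {
        self.inner.get(&id)?.get(&TypeId::of::<T>())?.downcast_ref()
    }

    /// No automatic cleanup: the user calls this when the object is destroyed.
    fn remove(&mut self, id: ObjectId) {
        self.inner.remove(&id);
    }
}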

This puts data management in a much more manual fashion, as a trade-off for removing a large part of the friction related to thread-safety that was caused by storing the user data inside of the ObjectData of each object.

Dispatching messages

Probably the most central difficulty in API design for the Wayland protocol is the dispatching of incoming messages to the bits of logic that need to handle them. In this new potential design, the backend no longer does anything regarding that, so all of it must be done in wayland-client and wayland-server (or any other thing built on top of wayland-backend).

As a reminder of the necessary or desirable properties of such dispatching:

  • The ordering of messages must be preserved (within a client server-side, and within an event queue client-side)
  • Even when two objects are instances of the same interface, it must be possible to dispatch their incoming messages to different processing logic
  • Most interactions with the protocol (sending messages, creating or destroying objects, etc.) must be possible concurrently with a dispatching session
  • Ideally, a minimal amount of plumbing code would need to be written by the user

My current idea regarding that would be a system comprised of modules somewhat similar to an actor model structure: blocks that take messages as input (either raw messages from the Wayland socket, or higher-level user-defined enums), standardized with a trait, and that can be connected to one another to create a pipeline: a module keeps a reference (via ownership, an Rc<_>, etc.) to the next module in the pipeline, and invokes it as necessary.

The entry point for this system would be integrated into the DataMap: each object could be associated with a handle to the initial module of the chain that needs to process its messages. As a whole, the full pipeline of modules would thus form a DAG. We could then have a method like:

impl DataMap<E> {
    fn dispatch_message(&mut self, msg: Message) -> E { ... }
}

Where the type parameter E represents the final return type of the pipeline, allowing the representation of messages that need to be processed outside the pipeline logic. Each module of the pipeline would have mutable access to the DataMap (or at least a subset of it), and maybe we can fit in something like the global &mut State we are currently using as well.
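
To make this a bit more concrete, a very rough sketch of the trait this could imply (Module and handle are hypothetical names, and the stand-in types exist only to keep the snippet self-contained):

struct Message; // stand-in for the backend's raw message type
struct DataMap<E>(std::marker::PhantomData<E>); // stand-in for the container above

trait Module<E> {
    /// Process one message with mutable access to the DataMap, either
    /// handling it locally or invoking the next module in the pipeline;
    /// the returned E is what eventually bubbles out of dispatch_message().
    fn handle(&mut self, data: &mut DataMap<E>, msg: Message) -> E;
}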

The core idea is that this pipeline can take any shape the user wants, for example:

  • Each object is associated with a single module that does its processing (kinda similar to the current ObjectData)
  • Each object is associated with a module that parses its messages into a common enum that is directly returned by dispatch_message() and handled directly by the main loop (like with a huge match { ... })
  • A hybrid setup where several modules are chained, each doing processing at its level (such as state-tracking) and generating higher-level messages to be processed down the chain.

In this design, the modularity of crates like SCTK or Smithay would be achieved by providing such pre-implemented modules that a user could integrate into their pipeline.

Questions

So, this design is still quite high-level, and there are a lot of details to iron out, but I'd like to get the general feeling of the current users of wayland-rs about these ideas. In particular: compared to what you are currently doing with wayland-rs, do you think this would make your life easier? Harder? What does this evoke for you?

@kchibisov
Member

I don't have the mental energy to go into a big discussion here. But the one thing that frustrates me is that I have very little control over how things are represented internally.

For example, my basic desire is to have things just use &mut and have all the objects !Send + !Sync and local to the queue thread, so I pay nothing for MT safety, because I ensure it myself, and things are a bit more predictable.

I guess the API you suggest could work for my case, while still providing MT safety guarantees for users who actually want that instead of doing queue local processing and preventing objects from moving from the queue thread.

It's a bit unclear to me how composition will work. I'd assume there will be a module which does the parsing and then dispatches to the right module generated by the wayland-scanner crate?

Also, could you mock up how e.g. interaction between different crates will work? I'd assume that crate A should define some interface and implement a trait some core crate provides, then it gets its messages under certain conditions defined by the user? For example, let's say I have my main Queue and I create a wl_surface, and I want to plug in e.g. sctk-adwaita, which also creates a wl_surface, but I don't want to share or know about objects not created by my crate. I'd assume that I'd have a module which tracks my objects, and if an object isn't mine I'd dispatch to sctk-adwaita or something, so it can do its own processing and somehow update the &mut State?

There's a very big chance that I didn't read the proposal entirely correctly.

@Drakulix
Member

Drakulix commented May 7, 2024

Okay, I'll first start with my current pain points with wayland-rs, both from the perspective of a smithay maintainer and a cosmic-comp developer:

Thread-safety

This is a huge one and I am happy this RFC tries to address this problem.

As a smithay maintainer we have two opposed design goals here:

  • Zero cost for people like @kchibisov who don't want to deal with the added complexity or performance costs for single-threaded apps.
  • Safety for applications (like cosmic-comp) that want to do proper multi-threading. Note we are not talking about dispatching messages over multiple threads here (which I think is much more difficult), but just thread-safe mutable access to the State.

I think the DataMap makes this simpler, given a user could just wrap the whole DataMap inside a lock. Which brings up the question for me: can I have multiple DataMaps? E.g. for different protocols? I guess not, as e.g. the fractional_scale protocol might need to store data associated with some wl_surface.

So if I want to make parts of it accessible to different threads, do I always need to lock up the whole thing? Or store Arc<Mutex<_>> data types? The latter wouldn't be better than the current status quo, and the former leaves a lot to be desired for performance.

Which brings me neatly to the composability issue here. In smithay we would want abstractions that can be thread-safe, so are we inherently baking locks into the data types here again? If so, we gain nothing, because anybody serious about eliminating locks would need to re-implement all protocols.

We have so much data in smithay, which we associate either with a specific object or with the State; sometimes it can be read by downstream, sometimes it is a private part of the implementation of the protocol, and tracing access here is challenging. So I would value any approach that offloads the aspect of locking to the user of the library.

Which might mean smithay shouldn't manipulate the DataMap directly, but rather request a ReadGuard or WriteGuard via some trait, which could either simply be a (mutable) reference or some lock. If that is the solution here, then the new API makes this easier, because accessing the user data inside smithay right now is super opaque to downstream. With this approach, we get a very central access structure via the DataMap. Though I wonder if the actual storage of the DataMap should possibly be implemented by downstream, with implementations provided to offload the locking part.
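
To illustrate the shape of that idea, a minimal sketch, written with a closure-based accessor rather than literal guard types to keep it short (all names here are hypothetical):

use std::cell::RefCell;
use std::sync::Mutex;

struct DataMap; // stand-in for the container discussed in the RFC

// Downstream implements this to decide how smithay reaches the DataMap.
trait DataMapAccess {
    fn with_map<R>(&self, f: impl FnOnce(&mut DataMap) -> R) -> R;
}

// Multi-threaded downstream: access goes through a lock.
impl DataMapAccess for Mutex<DataMap> {
    fn with_map<R>(&self, f: impl FnOnce(&mut DataMap) -> R) -> R {
        f(&mut *self.lock().unwrap())
    }
}

// Single-threaded downstream: a plain RefCell borrow, no locking cost.
impl DataMapAccess for RefCell<DataMap> {
    fn with_map<R>(&self, f: impl FnOnce(&mut DataMap) -> R) -> R {
        f(&mut *self.borrow_mut())
    }
}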

State-centric Design

This is obviously a much harder issue. I perceive it as one because we have a couple of places where we are generic over the whole state, since we don't know what parts downstream might need to access. The biggest offender here in smithay is our Seat implementation, which needs the whole state for processing any input.

An attempt to solve this with the current structure was made here (but didn't lead to much feedback and only complicates the API): Smithay/smithay#1384

This RFC feels too vague for me to judge in this regard. Given we wouldn't have a giant State struct anymore, but just the DataMap, it might potentially be easier to separate these issues. But I would need to see more of smithay fleshed out with this new approach.

(Obviously this has also been a huge problem for any attempts at asyncifying wayland-rs. I still don't think that is a particularly good idea, but it is obvious that a lot of the I/O abstractions in Rust have embraced async I/O at this point. So any improvements here might have very positive effects on those attempts. To be clear, I am not saying we should provide an async implementation at this point.)

@PolyMeilex
Member

I don't have much to add for now, but I believe it's worth linking the two (sadly competing) dispatching reworks that we have at the moment:
Smithay/smithay#1384 and #699

The existence of those two kinda shows that the current API is great on its own, but gets annoying when we try to abstract implementations in smithay or SCTK.

So having a way to dispatch to types different than D seems like an important topic, either via my delegate-dispatch work or Drakulix's forward dispatching. My work on this got kinda stalled, as I'm myself not sure which solution is more worth pursuing.

@ids1024
Member

ids1024 commented May 7, 2024

(Not) Async

If wayland-rs used async for its API, it seems like that would naturally involve Streams (or AsyncIterator or whatever). It seems like even if that is useful, Rust's async support remains too immature at present (no stable Stream type in the standard library, no generator syntax, etc.). So yeah, not really something to be doing at the moment. Though if there are any problems currently with creating higher-level async APIs around wayland-rs, those would be good to improve.

Some ideas I might have for async APIs also may not work well with some requirements for the ordering of messages from the server and responses to them (like if you had different async tasks handling messages from different Wayland objects, instead of the well-ordered callback-based approach Wayland is designed around, that might be an issue with some things? Or at least it would somehow have to group things like surfaces and role objects together...).

Backend

Making wayland-backend more minimal and low-level definitely sounds good.

We could also make wayland-backend not a public dependency of wayland-client and wayland-server, wrapping any types instead of re-exporting them. Then the API of the backend doesn't have to be stable. If there isn't much we'd want to expose from the backend, that could be good even if we don't expect to break much.

impl Backend {
    fn incoming_messages(&mut self, f: impl FnMut(Message)) { ... }
}

It's kind of a minor point, but for reis I just went with a next-message method returning an Option, since the backend has to buffer bytes in a VecDeque ring buffer anyway. A non-blocking next API like that is also easy to adapt to an Iterator or Stream.
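
Roughly this shape, assuming the backend already keeps decoded messages in an internal buffer (the types here are stand-ins, not reis's actual API):

use std::collections::VecDeque;

struct Message; // stand-in for the decoded message type

struct Backend {
    decoded: VecDeque<Message>, // messages already parsed off the socket
}

impl Backend {
    /// Non-blocking: hand out the next already-decoded message, if any.
    fn next_message(&mut self) -> Option<Message> {
        self.decoded.pop_front()
    }
}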

Threading and DataMap

For threading, one thing I've noticed is that my client couldn't use types that aren't Send + Sync for object udata, which may be inefficient, a bit more difficult, or impossible if you have a third-party type you want to store that isn't Send. This seems difficult to address, though: QueueHandle would need a non-thread-safe variant that doesn't require Send + Sync when creating objects, and either the ObjectIds and wrappers couldn't be passed to other threads, or you couldn't use Proxy::data?

DataMap sounds like it would help here. And I generally prefer a map like that over C-style user data, at least if it hopefully performs as well. Though I'm not sure exactly how that would work. Would something like dispatch_pending take an &mut DataMap? And if I want to access the data associated with an object in a different thread, would I need to use something like Arc<Mutex<DataMap>> and put a reference somewhere that thread can access? What would the generated methods (like WlCompositor::create_surface) now look like? I guess if WlCompositor is Send, the udata type still has to be.

@bwidawsk

bwidawsk commented May 7, 2024

make it catch a lot of footguns that could trigger protocol errors

Triggering protocol errors is important to me as someone who wants to help with a Wayland test suite. However, I do believe it makes sense to make that ability opt-in. An average user likely has no need for footguns.

The backend no longer automatically tries to track object destruction for you

Is there a way to have both? I think that the huge majority of the time, clients of this library don't want to do this themselves, and the code will be duplicated in each. In other words, the existing destroyed fn in ObjectData is useful.

I'm thinking of introducing a DataMap container, which would in essence be a glorified HashMap<ObjectId, TypeMap>

What is TypeMap here?


Overall, it sounds like a good progression. Some bits are a bit too abstract for me to follow, but I'm happy to attempt to code it up in way-assay at some point. It took some finessing, but we've managed to work around some of the constraints associated with ObjectData. Below is how we handle it; ObjectData still needs to be implemented, but it ends up being pretty neat and tidy IMO. I'd like to make sure it won't be too bad to switch to the new incoming_messages/dispatch_message.

macro_rules! protocol_bind {
    ($reg:expr, $proto:tt, $global:expr, $constraints:expr, $objdata:expr) => {{
        let global_str = paste::paste! { stringify!([<$proto:snake>]) };

        // Bind the specific version set by the caller of the protocol, or fallback to
        // the latest version supported by the server.
        let version: u32 = $constraints
            .iter()
            .find(|r| r.0 .0 == global_str)
            .map_or($global.version, { |w| w.0 .1 });

        (
            match $reg.send_constructor::<$proto>(
                wl_registry::Request::Bind {
                    name: $global.name,
                    id: ($proto::interface(), version),
                },
                $objdata,
            ) {
                Ok(y) => y,
                Err(e) => {
                    info!("Couldn't bind {} {e}", stringify!($proto));
                    continue;
                }
            },
            // Return the name as string, and version bound
            (global_str, version),
        )
    }};
}

...
contents.with_list(|globals| {
    for global in globals.iter() {
        macro_rules! bind {
            ($proto:tt, $objdata:expr, $state:expr) => {{
                let (x, proto) =
                    protocol_bind!(registry, $proto, global, required_globals, $objdata);
                if required_globals.extract_if_higher_protocol(&proto.into()) {
                    $state.replace(x);
                }
            }};
        }
        match &global.interface[..] {
            // TODO: When stable: https://github.com/rust-lang/rust/pull/104087
            "wl_compositor" => {
                bind!(WlCompositor, Arc::new(EventlessState), state.compositor)
            }
            "wl_subcompositor" => bind!(
                WlSubcompositor,
                Arc::new(EventlessState),
                state.subcompositor
            ),
            // shm_state implements ObjectData
            "wl_shm" => bind!(WlShm, shm_state.clone(), state.shm),
            _ => (),
        }
    }
});

@elinorbgr
Member Author

Thanks for that initial feedback, that gives me some food for thought. To quickly answer a few of the questions before I come back at a later point with a more fleshed-out proposal:

@kchibisov

It's a bit unclear to me how composition will work. I'd assume there will be a module which does the parsing and then dispatches to the right module generated by the wayland-scanner crate?

The exact shape is still to be determined, but yes, there would be something like that provided by wayland-client/-server/-protocols.

Also, regarding interaction between multiple crates: yes, my general idea is to still make it possible for them to coexist without needing to be aware of each other, beyond some initial setting-up by the downstream app.

@Drakulix

Can I have multiple DataMaps?

Yes, and the same object can be used as a key in multiple different DataMaps as well. So a module could use an internal DataMap to store its private information about objects, or require to be given access to a downstream-provided DataMap. In the latter case, it is the responsibility of the downstream user to decide how the multiple DataMaps are organised.

So if I want to make parts of it accessible to different threads, do I always need to lock up the whole thing? Or store Arc<Mutex<_>> data types?

You could for example have a DataMap that only stores the data that needs to be shared between threads, and have that map inside an Arc<Mutex<_>> or Arc<RwLock<_>>, and another one for the rest of the data. The thread processing Wayland events would then be responsible for keeping the contents of the first DataMap up-to-date for the other threads to access. How does that sound?
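
In code, the split could look something like this (a sketch only; the struct and field names are illustrative):

use std::sync::{Arc, RwLock};

struct DataMap; // stand-in for the container from the RFC

struct WaylandThreadState {
    // Only the data other threads need to read lives here; the Wayland
    // thread keeps it up-to-date, other threads take the read lock.
    shared: Arc<RwLock<DataMap>>,
    // Everything else stays local to the Wayland thread, with no locking.
    local: DataMap,
}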

Overall, I expect this kind of design to give much more control to downstream (including Smithay/SCTK) on what data is stored how and where. But there are certainly many details to iron out.

@ids1024

Then the API of the backend doesn't have to be stable.

My general expectation for long-term was (and is still) that the API of the backend, if designed in a sufficiently low-level way, would be much more stable than that of wayland-client/wayland-server. To me this could potentially be desirable in two main ways:

  • Allow crates using different versions of wayland-client/wayland-server to coexist in the same project and still remain compatible (provided interoperability is done using the backend types)
  • Allow experimentation around other kinds of client/server APIs, thus alternatives to wayland-client or wayland-server, while remaining compatible with the rest of the Smithay ecosystem

It's kind of a minor point, but for reis I just went with a next-message method returning an Option.

I proposed this callback-based API because it is lower-level than one returning an Option, especially in the case where we are using libwayland under the hood (otherwise, it would force another layer of buffering on top of what libwayland already does). My idea is that such a "next-message" API would be implemented at a higher level, by collecting the messages into a buffer, either before or after the "message processing pipeline" depending on what the downstream user wants to do.
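
A sketch of that layering, reusing the callback-based Backend from the proposal (the adapter and its names are hypothetical):

use std::collections::VecDeque;

struct Message; // stand-in types for illustration
struct Backend;

impl Backend {
    fn incoming_messages(&mut self, _f: impl FnMut(Message)) {
        // would drain the socket (or libwayland), calling _f per message
    }
}

// Higher-level adapter: buffer the callback's output so the downstream
// user gets a pull-style next_message().
struct MessageQueue {
    backend: Backend,
    buffered: VecDeque<Message>,
}

impl MessageQueue {
    fn next_message(&mut self) -> Option<Message> {
        if self.buffered.is_empty() {
            let buffered = &mut self.buffered;
            self.backend.incoming_messages(|msg| buffered.push_back(msg));
        }
        self.buffered.pop_front()
    }
}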

And if I want to access the data associated with an object in a different thread [...] ?

I think this is answered earlier in my message, in my answer to @Drakulix's similar question.

@bwidawsk

Is there a way to have both [automatic tracking of destructors or not]?

My idea is to remove this from the backend, and instead move this kind of tracking into wayland-client/wayland-server, but I'm not yet sure of how that would be implemented exactly.

What is TypeMap here?

Something like that: https://github.com/azriel91/anymap2

@Drakulix
Member

How does that sound?

That sounds really nice actually, and I feel like it might fix a bunch of problems I am fighting against with the current API.

Overall, I expect this kind of design to give much more control to downstream (including Smithay/SCTK) on what data is stored how and where. But there are certainly many details to iron out.

I am all for that!

@ids1024
Member

ids1024 commented May 16, 2024

My general expectation for long-term was (and is still) that the API of the backend, if designed in a sufficiently low-level way, would be much more stable than that of wayland-client/wayland-server. To me this could potentially be desirable in two main ways:

  • Allow crates using different versions of wayland-client/wayland-server to coexist in the same project and still remain compatible (provided interoperability is done using the backend types)
  • Allow experimentation around other kinds of client/server APIs, thus alternatives to wayland-client or wayland-server, while remaining compatible with the rest of the Smithay ecosystem

So far this hasn't really happened, but a stable wayland-backend would be useful for rust-windowing/raw-window-handle#120.

With the sys backend, it doesn't matter as much, since multiple versions are already interoperable with each other and with C libraries. But it could be better to have a stable wayland-backend, especially when there are bug fixes (the sys backend of wayland-backend is where almost all the unsafe code in wayland-rs resides).
