Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scylla: Add support for rustls Fixes #293 #911

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

nemosupremo
Copy link
Contributor

@nemosupremo nemosupremo commented Jan 11, 2024

This PR adds support for rustls behind a feature. Rustls is a pure rust implementation of TLS. With this PR, it obviates the need for clients to install libssl separately, which can be advantageous for users who with to cross compile binaries or just want a simpler building process.

There is one breaking change:

  • Since the rustls and ssl features cannot be used together, the cloud feature no longer automatically pulls in ssl. Instead a compiler error is generated when the cloud feature is enabled without either rustls or ssl. A compiler error is also generated when both rustls and ssl are enabled.

Concerning scylla cloud:

  • I left the rustls implementation close enough to the current openssl one and I thought it would be easy to get scylla cloud to work with rustls. As I looked at the code, and realized I had no way to test it as I'm not a customer I opted to have the cloud feature depend on ssl. I don't have any business mucking around in your guy's cloud code.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

@nemosupremo
Copy link
Contributor Author

While implementing this, I noticed that

  1. The actual hostname that is used to connect to the cluster is not preserved. I think this is because Scylla doesn't actually know the hostnames of other nodes in the cluster, just the IP addresses. This presents a problem for Rustls - the default implementation expects to find the hostname or the IP address in either the common name or subject alternate name of the certificate.

  2. Disabling hostname verification isn't easy and requires some boilerplate. There was a discussion in Allow disabling Hostname Verification rustls/rustls#578 and Dangerous verifiers API proposal rustls/rustls#1197 on making this easier in rustls itself that didn't make it as other database drivers had similar issues. What a driver like github.com/redis-rs/redis-rs opted to do was to manage the configuration internally and handle the implementation of disabling hostname verification. However, scylla-rust-driver's design has a "BYOB" style of configuration and it seems expected for the user to do this themselves. This is what I have done in the rustls example.

@nemosupremo
Copy link
Contributor Author

I'm not sure how to fix the Rust / min_rust (pull_request) test.

@nemosupremo nemosupremo marked this pull request as ready for review January 12, 2024 06:44
@mykaul
Copy link
Contributor

mykaul commented Jan 12, 2024

Do we have some performance numbers to compare between the two implementations?

@Lorak-mmk Lorak-mmk mentioned this pull request Jan 12, 2024
8 tasks
@nemosupremo
Copy link
Contributor Author

As far as the implementations go, rustls is a bit faster than openssl as benchmarked here:

https://jbp.io/2019/07/01/rustls-vs-openssl-performance.html

See those posts for details and analysis. To summarise the results, though, we can say approximately:

  • rustls is 15% quicker to send data.
  • rustls is 5% quicker to receive data.
  • rustls is 20-40% quicker to set up a client connection.
  • rustls is 10% quicker to set up a server connection.
  • rustls is 30-70% quicker to resume a client connection.
  • rustls is 10-20% quicker to resume a server connection.
  • rustls uses less than half the memory of OpenSSL.

However I don't have the infrastructure to test the change in performance against a scylla cluster.

@Lorak-mmk
Copy link
Collaborator

I started to review this, but I decided against posting comments to specific fragments of code. I really like the idea of adding rustls support and we'd like such support to be part of 1.0.
But I think we should take a different approach. In this PR you introduced a new rustls feature, that is conflicting with ssl feature.
This will make documentation confusing (now you have 2 instances of struct with the same name, but enabled for different features).
It will prevent use of e.g. cargo check --all-features and make the features non-additive. Also the amount of cfg directives is quite high and it makes it hard to read the code.

The approach I think should be taken is for SslConfig to be an enum or trait. Enabling rustls / ssl features would introduce implementation of the trait / variants of the enum.
This makes features additive, and users would be able to choose at runtime the SSL implementation they want to use.
I'm not sure which is better here: enum or trait. We also need to think about how to make cloud work best with this approach. WDYT @wprzytula @piodul ?
Trait would enable external implementations. This makes sense if someone is using BoringSSL / new Amazon library or something else we don't (yet) support.
Enum OTOH might provide simpler API / better performance.

Regarding hostnames problem - I'm not familiar with rustls, but if it's possible to edit the config after we receive it from user to disable this verification - I'm for it.

Would you like to try implementing such solution, with our help?

@wprzytula
Copy link
Collaborator

I started to review this, but I decided against posting comments to specific fragments of code. I really like the idea of adding rustls support and we'd like such support to be part of 1.0. But I think we should take a different approach. In this PR you introduced a new rustls feature, that is conflicting with ssl feature. This will make documentation confusing (now you have 2 instances of struct with the same name, but enabled for different features). It will prevent use of e.g. cargo check --all-features and make the features non-additive. Also the amount of cfg directives is quite high and it makes it hard to read the code.

I strongly agree that our features should be additive.

The approach I think should be taken is for SslConfig to be an enum or trait. Enabling rustls / ssl features would introduce implementation of the trait / variants of the enum. This makes features additive, and users would be able to choose at runtime the SSL implementation they want to use. I'm not sure which is better here: enum or trait. We also need to think about how to make cloud work best with this approach. WDYT @wprzytula @piodul ? Trait would enable external implementations. This makes sense if someone is using BoringSSL / new Amazon library or something else we don't (yet) support. Enum OTOH might provide simpler API / better performance.

I'm fine with the enum.
The usual tradeoff between enum and trait is that enum is more efficient, whereas a boxed trait enables user's custom implementations.
I think we could have pros of both this way:

trait SslProvider {
  ...
}

#[cfg(feature = ssl) // consider if putting this under feature is beneficial wrt performance
enum SslConfig {
  #[cfg(feature = "rustls")
  Rustls(RustlsProvider),
  #[cfg(feature = "openssl")
  Openssl(OpensslProvider),
  // maybe a cfg here as well?
  Custom(Arc(dyn CustomSslProvider),
}

impl SslProvider for RustlsProvider { ... }
impl SslProvider for OpensslProvider { ... }
impl SslProvider for SslConfig {
   // match and use trait impl on any of the 3 enum variants
}

@Lorak-mmk
Copy link
Collaborator

Lorak-mmk commented Mar 6, 2024

The approach I think should be taken is for SslConfig to be an enum or trait. Enabling rustls / ssl features would introduce implementation of the trait / variants of the enum. This makes features additive, and users would be able to choose at runtime the SSL implementation they want to use. I'm not sure which is better here: enum or trait. We also need to think about how to make cloud work best with this approach. WDYT @wprzytula @piodul ? Trait would enable external implementations. This makes sense if someone is using BoringSSL / new Amazon library or something else we don't (yet) support. Enum OTOH might provide simpler API / better performance.

I'm fine with the enum. The usual tradeoff between enum and trait is that enum is more efficient, whereas a boxed trait enables user's custom implementations. I think we could have pros of both this way:

trait SslProvider {
  ...
}

#[cfg(feature = ssl) // consider if putting this under feature is beneficial wrt performance
enum SslConfig {
  #[cfg(feature = "rustls")
  Rustls(RustlsProvider),
  #[cfg(feature = "openssl")
  Openssl(OpensslProvider),
  // maybe a cfg here as well?
  Custom(Arc(dyn CustomSslProvider),
}

impl SslProvider for RustlsProvider { ... }
impl SslProvider for OpensslProvider { ... }
impl SslProvider for SslConfig {
   // match and use trait impl on any of the 3 enum variants
}

In your example our functions would accept SslConfig instead of Arc<dyn SslProvider>, right? I like this a lot.

@Lorak-mmk Lorak-mmk added the waiting-on-author Waiting for a response from issue/PR author label Mar 14, 2024
@aldanor
Copy link

aldanor commented Apr 9, 2024

@Lorak-mmk @wprzytula I will take a look at this to move this forward if noone minds.

Correct me if I'm wrong, but one problem with the trait + enum suggestion, if you look closely, is that there's SslContext (which is a different for openssl / rustls) stored in Session. And you can't have that Custom(Arc<dyn>) if you have an associated type in the trait... so the only option if you wanted to retain what you've suggested is to store it as ssl_context: Arc<dyn Any>, would that be suitable?

Another question I had - in regards to feature names, it gets a bit confusing, should ssl be renamed to openssl? (a bit of a breaking change, of course) Does there need to be a separate ssl feature as well then, or perhaps not?

(btw: I couldn't get dockerized tests working on a mac; docker compose up only launches one healthy node and then gets stuck...)

@nemosupremo
Copy link
Contributor Author

This fell off for me as it wasn't clear if rustls was something scylla wanted to support. I see it has the waiting-on-author label now. w.r.t ssl_context could just be SslConfig that does handwritten dynamic dispatch to the enum variants.

That said, I think changing ssl_context may be a more invasive change, and my original goal was to minimize the amount of code I touched. @aldanor let me know if you plan on taking this over - I'm not planning on refactoring this PR unless you are no longer interested.

@wprzytula wprzytula removed the waiting-on-author Waiting for a response from issue/PR author label Apr 13, 2024
@Lorak-mmk
Copy link
Collaborator

@Lorak-mmk @wprzytula I will take a look at this to move this forward if noone minds.

Correct me if I'm wrong, but one problem with the trait + enum suggestion, if you look closely, is that there's SslContext (which is a different for openssl / rustls) stored in Session. And you can't have that Custom(Arc<dyn>) if you have an associated type in the trait... so the only option if you wanted to retain what you've suggested is to store it as ssl_context: Arc<dyn Any>, would that be suitable?

I don't see why would there by any need for associated type in SslProvider trait.
Regarding the enum idea, we discussed it with @wprzytula and this optimization is not necessary here, we can just use Arc<dyn SslProvider>.

Another question I had - in regards to feature names, it gets a bit confusing, should ssl be renamed to openssl? (a bit of a breaking change, of course) Does there need to be a separate ssl feature as well then, or perhaps not?

I think the main reason for those features is to avoid pulling unnecessary dependencies. That means that the trait SslProvider and it's usages could be present unconditionally and only implementation (rustls, openssl) would be put behind features. wdyt @wprzytula ?
Renaming the feature is fine imo.

(btw: I couldn't get dockerized tests working on a mac; docker compose up only launches one healthy node and then gets stuck...)

Did you wait a few minutes? Launching Scylla takes some time.
It sometimes just hangs for me too - trying again (make down && make up) is usually enough.

@Lorak-mmk Lorak-mmk removed their request for review April 29, 2024 15:54
@aldanor
Copy link

aldanor commented May 1, 2024

I don't see why would there by any need for associated type in SslProvider trait.
Regarding the enum idea, we discussed it with @wprzytula and this optimization is not necessary here, we can just use Arc.

@Lorak-mmk As it's currently written, in Session you have an instance of an sslprovider-dependent type which is not the provider itself. Arc<dyn SslProvider> won't help you here unless you agree to store the context as Arc<dyn Any> or perhaps Arc<dyn SslContext> if the latter even makes sense. Or, do something like what @nemosupremo suggested above (config with dynamic dispatch).

Did you wait a few minutes? Launching Scylla takes some time. It sometimes just hangs for me too - trying again (make down && make up) is usually enough.

Yep, tried it all possible ways, and waited over 10 minutes, make up/down, nothing helps, sigh (I'm on an m1 mac, if that's any relevant).


@nemosupremo Thanks for answering (and apologies for taking long to get back).

@aldanor let me know if you plan on taking this over - I'm not planning on refactoring this PR unless you are no longer interested.

I think it makes sense for those present in this thread to jointly get this PR to completion in some way since it's more than halfway there already. To be frank I am barely familiar with either scylla or this particular codebase, so I'd definitely prefer it to be led by someone more knowledgeable than I am :) But if noone else wants or has time to, I could give it a try too, would really like to see it get merged in.

Cheers.

@Lorak-mmk
Copy link
Collaborator

Did you wait a few minutes? Launching Scylla takes some time. It sometimes just hangs for me too - trying again (make down && make up) is usually enough.

Yep, tried it all possible ways, and waited over 10 minutes, make up/down, nothing helps, sigh (I'm on an m1 mac, if that's any relevant).

Could you share logs from Scylla? Either from the node that hangs at boot or from all 3. docker logs <container name>, you can obtain container name from docker ps (unless something is very different on macs...).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants