Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Made built-in tracing broader and OTel compliant #472

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

wprzytula
Copy link
Collaborator

@wprzytula wprzytula commented Jul 11, 2022

In effort of increasing Scylla drivers' observability, I instrumented this one with OpenTelemetry compliant tracing. The solution utilises the tokio-tracing already present in the driver, however adding more spans and tags to them. The structure and content of those are expected to be roughly consistent among various drivers, so I kept close to Java driver's implementation. For converting tokio-tracing traces into OTel traces, I've used tracing_opentelemetry compatibility layer.

Considerations yet to be done

  • configurable Verbosity Level,
  • whether we should turn the new tracing features (including possibly heavy data transfers upon creating spans and filling them with tags) on only when explicitly requested,
  • how we can extend Value and ValueList so that they enable putting their string representation into bound_values and partition_key tags.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I added appropriate Fixes: annotations to PR description.

@wprzytula wprzytula marked this pull request as draft July 11, 2022 12:29
@wprzytula wprzytula force-pushed the otel-tracing branch 2 times, most recently from cbe0dd9 to 8579a70 Compare August 4, 2022 05:20
@wprzytula wprzytula force-pushed the otel-tracing branch 2 times, most recently from 72cc9ad to da2a8de Compare August 23, 2022 08:31
@wprzytula wprzytula force-pushed the otel-tracing branch 3 times, most recently from 4abd1a1 to cf5a580 Compare September 9, 2022 11:48
@wprzytula wprzytula marked this pull request as ready for review September 9, 2022 11:56
@wprzytula wprzytula force-pushed the otel-tracing branch 4 times, most recently from 6451697 to 38d4bc4 Compare September 28, 2022 14:32
@piodul
Copy link
Collaborator

piodul commented Nov 4, 2022

@wprzytula There are conflicts with main - please rebase before we proceed with review.

@wprzytula wprzytula force-pushed the otel-tracing branch 2 times, most recently from 0469dbe to 0e04b9a Compare November 4, 2022 11:15
@wprzytula wprzytula force-pushed the otel-tracing branch 2 times, most recently from fa0419e to 419b1b1 Compare November 18, 2022 11:25
@wprzytula wprzytula force-pushed the otel-tracing branch 2 times, most recently from 37b3fc0 to e38b48b Compare March 15, 2023 16:17
@@ -881,6 +891,11 @@ impl Session {
self.extract_partitioner_name(&prepared, &self.cluster.get_data()),
);

span.record(
"db.scylladb.prepared_id",
&format!("{:X}", prepared.get_id()).as_str(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using format! unconditionally here is a bit costly because it allocates. I think you should call .record and construct the arguments for it only if the span is enabled.

Copy link
Collaborator Author

@wprzytula wprzytula Mar 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances can the span be disabled?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @piodul

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trace_span!, which underlies otel_span!, might not emit anything if necessary criteria are not met (e.g. tracing level is not high enough), right?

I just want to make sure that the allocation cost is avoided if the otel_span! invocation doesn't have any effect. However, approach from #656 is probably simpler, so let's just make sure that we don't allocate when passing information to tracing but rather use format_args! etc.

scylla/src/utils/mod.rs Outdated Show resolved Hide resolved
scylla/src/transport/session.rs Outdated Show resolved Hide resolved
scylla/src/transport/session.rs Outdated Show resolved Hide resolved
scylla/src/utils/mod.rs Outdated Show resolved Hide resolved
@@ -21,6 +22,33 @@ use secrecy::{ExposeSecret, Secret, Zeroize};
/// serialize() should write the Value as [bytes] to the provided buffer
pub trait Value {
fn serialize(&self, buf: &mut Vec<u8>) -> Result<(), ValueTooBig>;

fn for_debug(&self) -> String {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are for_debug and its friends really needed?. Wouldn't it suffice to just require Values to implement Display trait?

Comment on lines 475 to 489
let gen_bound_values_repr = |columns: &mut dyn Iterator<Item = usize>| {
let mut s = String::new();
for i in columns {
s.push_str(&metadata.col_specs[i].name);
s.push('=');
bound_values.for_debug_append_nth(&mut s, i).ok()?;
s.push_str(", ");
}
if metadata.col_count > 0 {
s.pop();
s.pop();
}
Some(s)
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use the std formatting framework in order to avoid allocations.

Comment on lines +147 to +160
fn for_debug(&self) -> String {
let mut s = String::new();
self.for_debug_append(&mut s);
s
}

fn for_debug_append(&self, s: &mut String) {
s.push_str("<Serialized>")
}

fn for_debug_append_nth(&self, s: &mut String, _n: usize) -> Result<(), NotEnoughValuesError> {
s.push_str("<Serialized>");
Ok(())
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar concern here as in Value. Is it important to always print query arguments as a comma-separated list of string? Can we get away with just requiring Display?

Copy link
Collaborator

@piodul piodul Mar 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides, this is quite fragile - I think we have some public methods which take ValueList, call serialized() and then just call other methods that also take ValueList. If the tracing logic is situated in the latter, then we will always get "<Serialized>" in tracing and this all will be useless.

Maybe, instead, we could just print raw bytes from a serialized form? We already have SerializedValues::iter() that allows going over serialized values. It doesn't look as nice, but will be more consistent and less prone to such issues. Then, we wouldn't also need Display on ValueList or Value.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the serialized form of the values be as readable and helpful in debug as the Display form is?

scylla/src/utils/mod.rs Show resolved Hide resolved
scylla/src/transport/session.rs Outdated Show resolved Hide resolved
@wprzytula
Copy link
Collaborator Author

v2:

  • got rid of unnecessary allocations,
  • reduced visibility of helper traits to pub(crate).

Added otel_span! macro and SpanExt trait. Their purpose is to avoid
boilerplate across driver code when instrumenting it with
OTel-compatible tracing.
As a way to both test the subsequently added OTel tracing features and
present their demo, I added this example. It involves emitting a trace
and sending it to Zipkin.
We'll use it for display in OTel traces.
We'll use it for display in OTel traces.
We'll use it for display in OTel traces.
In an effort to increase ScyllaDB observability, we want to make the
Rust driver emit traces complying to OpenTelemetry standard. Therefore,
I adjusted the names of existing tags and added new ones, so the tag set
resembles one already used in the Java driver and complies to OTel.
This will enable adding "partition_key" and "bound_values" tags to
produced OTel traces.
@wprzytula
Copy link
Collaborator Author

v3:

  • cleaned up and encapsulated in a module spans creation in Session. Now, there are distinct types for the spans as well as dedicated setters for values to be recorded outside the mentioned module.

@piodul piodul added this to the 1.1.0 milestone Mar 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants