Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GraphQL mutations with @atomic directive and TTL are not persisted #2875

Open
marksurnin opened this issue Jan 22, 2024 · 4 comments · May be fixed by #2893 or #2895
Open

GraphQL mutations with @atomic directive and TTL are not persisted #2875

marksurnin opened this issue Jan 22, 2024 · 4 comments · May be fixed by #2893 or #2895
Assignees

Comments

@marksurnin
Copy link
Contributor

marksurnin commented Jan 22, 2024

Hi Stargate team,

We've encountered a potential bug in Stargate v2. Atomic mutations (logged batches) with any number of mutations that use TTL are not persisted. Moreover, Stargate responds with "applied": true for those mutations, which is misleading. I've reproduced this in the v2 images stargateio/coordinator-4_0:v2 and stargateio/graphqlapi:v2, as well as on the v2.1 branch.

Request (using the standard data_endpoint_auth keyspace):

mutation insert_atomic_ttl @atomic {
  m1: inserttoken(
    value: {
      auth_token: "12735332-b914-11ee-a506-0242ac120002"
      username: "2a"
    }
    options:{ttl:86400}
  ) {
    applied
  }
  m2: inserttoken(
    value: {
      auth_token: "15cdf1e0-b914-11ee-a506-0242ac120002"
      username: "2b"
    }
    options:{ttl:86400}
  ) {
    applied
  }
}

Response

{
  "data": {
    "m1": {
      "applied": true
    },
    "m2": {
      "applied": true
    }
  }
}

Subsequently querying data_endpoint_auth.token table shows that no rows have been inserted. Note that rows are persisted correctly as long as I use either @atomic directive or ttl in the mutations.

I haven't found the root cause of the failed mutation but it's concerning that applied: true and it's not persisted. This used to work in v1. Do you know what could be causing this? Thanks!

@marksurnin
Copy link
Contributor Author

Some troubleshooting findings:

  • I'm aware of special handling of @atomic mutations in MutationFetcher.java. However, this kicks in if the number of selections (i.e. mutations is > 1) but I've reproduced this behavior even in a single mutation with @atomic and TTL.
  • I've debugged Cassandra BatchMessage code too and I see the ttl value present as an attribute on the statement as seen in the screenshot. As a side note, this method returns EMPTY_RESULT regardless of whether I pass TTL or not (which also at the moment determines whether the mutation has actually been persisted).
    Screenshot 2024-01-22 at 3 11 49 PM

@marksurnin
Copy link
Contributor Author

Further investigation showed that in v2 all mutations with the @atomic directive insert rows with timestamp 1970-01-01T00:00:00Z if not otherwise specified in the query. TTL is computed relative to the write timestamp, so passing options:{ttl:86400} would result in a TTL timestamp of 1970-01-02.

Cassandra interprets such rows as having already expired (since they supposedly TTL'ed out back in 1970) and thus omits them from the results. I verified this in the commitlog and sstables using sstabledump. Moreover, this row would eventually expire, likely after a major compaction. Please see mutations.txt and sstabledump.txt attached.

"liveness_info" : { "tstamp" : "1970-01-01T00:00:00Z", "ttl" : 86400, "expires_at" : "1970-01-02T00:00:00Z", "expired" : true },
"cells" : [
  { "name" : "username", "value" : "2b" }
]

Root cause details

  1. query.proto defines timestamp and now_in_seconds as google.protobuf.Int64Value timestamp = 4;. If this value is null invoking the generated com.google.protobuf.Int64Value getTimestamp() method returns 0. This is because it's serialized to 0 by default if it is unset.
  2. In StargateGraphqlContext.java#L145-L147 the toBatchParameters method sets timestamp and nowInSeconds:
.setTimestamp(queryParameters.getTimestamp())
.setNowInSeconds(queryParameters.getNowInSeconds())

Inspecting these with a debugger shows that at runtime the timestamp field is null (screenshot) but invoking getTimestamp returns 0 as mentioned in step 1 (screenshot).

  1. At the coordinator bridge level these 0 values are passed to the actual query in BatchHandler.java. This explains the write timestamp being 1970-01-01T00:00:00Z.
if (parameters.hasTimestamp()) {
  builder.defaultTimestamp(parameters.getTimestamp().getValue());
}

if (parameters.hasNowInSeconds()) {
  builder.nowInSeconds(parameters.getNowInSeconds().getValue());
}

This can be fixed by changing StargateGraphQLContext's toBatchParameters method to set timestamp and nowInSeconds only if they are non-null. This takes advantage of the generated hasTimestamp method that makes batch parameters' timestamp field behave similar to an Optional.

private BatchParameters toBatchParameters(QueryParameters queryParameters) {
  BatchParameters.Builder builder =
    BatchParameters.newBuilder()
      .setConsistency(queryParameters.getConsistency())
      // omitted a few lines
      .setSkipMetadata(queryParameters.getSkipMetadata());
  
  if (queryParameters.hasTimestamp()) {
    builder.setTimestamp(queryParameters.getTimestamp());
  }
  if (queryParameters.hasNowInSeconds()) {
    builder.setNowInSeconds(queryParameters.getNowInSeconds());
  }
  return builder.build();
}

@jeffreyscarpenter
Copy link
Collaborator

@kathirsvn can you review/comment?

@marksurnin marksurnin linked a pull request Feb 12, 2024 that will close this issue
4 tasks
@kathirsvn
Copy link
Contributor

Sorry for the delay, I was out on a vacation.
This analysis for the bug sounds reasonable to me. Great find @marksurnin. It seems that the introduction of proto definitions (for the gRPC bridge) in the v2 code base introduced this bug.

@kathirsvn kathirsvn linked a pull request Feb 14, 2024 that will close this issue
4 tasks
@sync-by-unito sync-by-unito bot closed this as completed Feb 16, 2024
@sync-by-unito sync-by-unito bot reopened this Feb 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants