
internal server error when fetching ditto connections via the api #1827

Open
JeffreyThijs opened this issue Nov 29, 2023 · 8 comments

@JeffreyThijs
Contributor

We recently upgraded from ditto 3.3.7 to 3.4.1 and we now encounter the following error when fetching all full connections via the connections API:

curl http://ditto-nginx:8080/api/2/connections --user "devops:mypassword" -s | jq
{
  "status": 500,
  "error": "connectivity:internalerror",
  "message": "There was a rare case of an unexpected internal error.",
  "description": "Please contact the service team."
}

When I add the ids-only or fields=id query parameter, it correctly returns a list of the ditto connections (with only the IDs, of course), but any additional or other fields yield the same error message. In the logs of both ditto-gateway and ditto-connectivity I see nothing that indicates that something is going wrong. Maybe bedaaee introduced this problem?
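
For reference, the variants that do still return data look roughly like this (same endpoint as above; parameter spelling as I used it):

# returns only the connection IDs
curl "http://ditto-nginx:8080/api/2/connections?ids-only=true" --user "devops:mypassword" -s | jq
# same result via field selection
curl "http://ditto-nginx:8080/api/2/connections?fields=id" --user "devops:mypassword" -s | jq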

If you need more info, please let me know.

@thjaeckle
Member

Hi @JeffreyThijs

Some logs would be nice, e.g. of the gateway and connectivity service.
Maybe there even is a stacktrace?

@JeffreyThijs
Contributor Author

Found the following stacktrace in the gateway:

kb logs -l "app.kubernetes.io/name=ditto-gateway" -f | grep -i 'ConnectionNotAccessibleException' | jq
{
  "@timestamp": "2023-11-30T08:49:19.926+01:00",
  "@version": "1",
  "message": "Received response <{}> from connectivity.service.",
  "logger_name": "org.eclipse.ditto.gateway.service.endpoints.actors.DittoConnectionsRetrievalActor",
  "thread_name": "Thread-52011",
  "level": "DEBUG",
  "level_value": 10000,
  "stack_trace": "org.eclipse.ditto.connectivity.model.signals.commands.exceptions.ConnectionNotAccessibleException: The Connection with ID '327bfa0b-1984-4c50-8700-9adb1579e6e1' could not be found or requester had insufficient permissions to access it.\n\tat org.eclipse.ditto.connectivity.model.signals.commands.exceptions.ConnectionNotAccessibleException$Builder.doBuild(ConnectionNotAccessibleException.java:119)\n\tat org.eclipse.ditto.connectivity.model.signals.commands.exceptions.ConnectionNotAccessibleException$Builder.doBuild(ConnectionNotAccessibleException.java:101)\n\tat org.eclipse.ditto.base.model.exceptions.DittoRuntimeExceptionBuilder.build(DittoRuntimeExceptionBuilder.java:176)\n\tat org.eclipse.ditto.base.model.exceptions.DittoRuntimeException.fromJson(DittoRuntimeException.java:245)\n\tat org.eclipse.ditto.connectivity.model.signals.commands.exceptions.ConnectionNotAccessibleException.fromJson(ConnectionNotAccessibleException.java:84)\n\tat jdk.internal.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)\n\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.base/java.lang.reflect.Method.invoke(Method.java:568)\n\tat org.eclipse.ditto.base.model.signals.AnnotationBasedJsonParsable.parse(AnnotationBasedJsonParsable.java:94)\n\tat org.eclipse.ditto.base.model.signals.AbstractJsonParsableRegistry.fromJson(AbstractJsonParsableRegistry.java:97)\n\tat org.eclipse.ditto.base.model.signals.AbstractJsonParsableRegistry.lambda$parse$0(AbstractJsonParsableRegistry.java:79)\n\tat org.eclipse.ditto.base.model.exceptions.DittoJsonException.wrapJsonRuntimeException(DittoJsonException.java:161)\n\tat org.eclipse.ditto.base.model.signals.AbstractJsonParsableRegistry.parse(AbstractJsonParsableRegistry.java:78)\n\tat org.eclipse.ditto.base.model.signals.GlobalErrorRegistry.parse(GlobalErrorRegistry.java:91)\n\tat org.eclipse.ditto.base.model.signals.GlobalErrorRegistry.parse(GlobalErrorRegistry.java:36)\n\tat org.eclipse.ditto.internal.utils.cluster.AbstractJsonifiableWithDittoHeadersSerializer.deserializeJson(AbstractJsonifiableWithDittoHeadersSerializer.java:423)\n\tat org.eclipse.ditto.internal.utils.cluster.AbstractJsonifiableWithDittoHeadersSerializer.createJsonifiableFrom(AbstractJsonifiableWithDittoHeadersSerializer.java:325)\n\tat org.eclipse.ditto.internal.utils.cluster.AbstractJsonifiableWithDittoHeadersSerializer.tryToCreateKnownJsonifiableFrom(AbstractJsonifiableWithDittoHeadersSerializer.java:296)\n\tat org.eclipse.ditto.internal.utils.cluster.AbstractJsonifiableWithDittoHeadersSerializer.fromBinary(AbstractJsonifiableWithDittoHeadersSerializer.java:273)\n\tat org.apache.pekko.serialization.Serialization.deserializeByteBuffer(Serialization.scala:279)\n\tat org.apache.pekko.remote.MessageSerializer$.deserializeForArtery(MessageSerializer.scala:111)\n\tat org.apache.pekko.remote.artery.Deserializer$$anon$3.onPush(Codecs.scala:683)\n\tat org.apache.pekko.stream.impl.fusing.GraphInterpreter.processPush(GraphInterpreter.scala:555)\n\tat org.apache.pekko.stream.impl.fusing.GraphInterpreter.processEvent(GraphInterpreter.scala:506)\n\tat org.apache.pekko.stream.impl.fusing.GraphInterpreter.execute(GraphInterpreter.scala:400)\n\tat org.apache.pekko.stream.impl.fusing.GraphInterpreterShell.runBatch(ActorGraphInterpreter.scala:662)\n\tat org.apache.pekko.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:532)\n\tat 
org.apache.pekko.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:637)\n\tat org.apache.pekko.stream.impl.fusing.ActorGraphInterpreter.org$apache$pekko$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:813)\n\tat org.apache.pekko.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:831)\n\tat org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547)\n\tat org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545)\n\tat org.apache.pekko.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:729)\n\tat org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590)\n\tat org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557)\n\tat org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280)\n\tat org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241)\n\tat org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253)\n\tat java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)\n\tat java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)\n\tat java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)\n\tat java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)\n\tat java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)\n",
  "correlation-id": "d24b587e-8ec1-4581-8d17-38298b1def20"
}

@thjaeckle
Member

I fear with that little information I cannot be of any help.

@BobClaerhout
Contributor

So today I was trying to figure out what's going on here, since not being able to fetch the existing connections is quite blocking for us.
As @JeffreyThijs stated, we were able to fetch the IDs only. When I did that, I got back a list of approx. 20 connections being active in ditto. I manually tried fetching them because I wanted to check the contents of the connections. I stumbled upon one with id 327bfa0b-1984-4c50-8700-9adb1579e6e1 (the ID itself doesn't matter, but if you look closely, it's the one that is mentioned in the log line Jeffrey posted before).
I went looking for it in MongoDB and saw 46 records in the connection_journal collection with pid connection:327bfa0b-1984-4c50-8700-9adb1579e6e1. They looked like the following record:

  {
    _id: ObjectId("646f79710d77134491007ae2"),
    pid: 'connection:327bfa0b-1984-4c50-8700-9adb1579e6e1',
    from: Long("60"),
    to: Long("60"),
    events: [
      {
        v: 1,
        pid: 'connection:327bfa0b-1984-4c50-8700-9adb1579e6e1',
        sn: Long("60"),
        _t: 'bson',
        p: {
          type: 'persistence-actor-internal:empty-event',
          effect: 'alwaysAlive',
          __hh: {}
        },
        _tg: [ 'always-alive', 'priority-0' ],
        manifest: 'persistence-actor-internal:empty-event',
        _w: '9393e517-419d-4ade-9338-500780c519eb'
      }
    ],
    v: 1,
    _tg: [ 'always-alive', 'priority-0' ]
  }

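For reference, a mongosh query along these lines is roughly how I counted and inspected them (collection name as above):

// count the journal entries for the suspicious connection
db.connection_journal.countDocuments({ pid: 'connection:327bfa0b-1984-4c50-8700-9adb1579e6e1' })
// list the entries that only contain the internal empty-event payload shown above
db.connection_journal.find({ 'events.p.type': 'persistence-actor-internal:empty-event' })
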
Since I don't know what these records are for, I manually removed all of them from the collection and restarted the ditto-connectivity pods. After that, we were able to fetch the ditto connections again using the HTTP GET endpoint.
Something to note: the connection count in ditto dropped from 20 to 7.
Any idea what might be going on? We experienced this in our development environment now and would like to avoid it in our production environment. It's worth mentioning that we recently upgraded ditto from 3.3.7 to 3.4.3.
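
For completeness, the cleanup boiled down to something like this (the filter is approximate, and the connectivity pod label is an assumption based on the gateway label above; as noted, this also dropped our connection count, so handle with care):

// mongosh: remove the journal entries that only carry the internal empty-event payload
db.connection_journal.deleteMany({ 'events.p.type': 'persistence-actor-internal:empty-event' })

# then restart the connectivity pods, e.g. (label assumed analogous to the gateway one above)
kubectl delete pod -l app.kubernetes.io/name=ditto-connectivity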

@alstanchev
Contributor

alstanchev commented Dec 7, 2023

This record looks very strange, like it was exported with mongoexport and then restored in the database, which would cause the numeric values to look like the from:, to: and sn: values above. If this is what you did, you should use mongodump, as it works with binary BSON and preserves types, unlike mongoexport which works with JSON.
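
Roughly the difference I mean (the database name "ditto" is just an example, adjust to your setup):

# mongodump/mongorestore work on BSON and keep types like Long("60") intact
mongodump    --db ditto --collection connection_journal --out /backup
mongorestore --db ditto /backup/ditto

# mongoexport/mongoimport go through (Extended) JSON, where a Long may come
# back as a plain number after the round trip
mongoexport --db ditto --collection connection_journal --out journal.json
mongoimport --db ditto --collection connection_journal --file journal.json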

@BobClaerhout
Contributor

I agree it looks strange. I double-checked with my colleagues and we did not do any manual actions on the database. The origin of these records is therefore unknown.
On the other hand: it's not good that this makes the "list connections" call fail, is it? Wouldn't it be better to just return the connections it is able to load and skip the ones that fail, instead of failing the whole call?

@alstanchev
Contributor

alstanchev commented Dec 12, 2023

I would prefer to make the error more visible with a more descriptive message (something like "Corrupted record or unable to deserialize") instead of returning only what is possible.
Otherwise such corruptions will remain hidden.
But still, I doubt this was persisted by the service. Is it possible that, by mistake, an unnoticed edit was done while examining the DB with MongoDB Compass or the Atlas UI?

@BobClaerhout
Contributor

BobClaerhout commented Jan 29, 2024

I doubt the edit. It's not that easy to change documents in the Atlas UI, let alone make multiple accidental edits.
Providing a more descriptive error would also be fine!
