Spurious concurrent modification exception in annotation editor #4454

Open
reckart opened this issue Jan 25, 2024 · 10 comments
@reckart
Member

reckart commented Jan 25, 2024

Describe the bug
Occasionally, when performing an action in the annotation editor, a concurrent modification of the CAS is detected.

To Reproduce
At the moment, it is unclear how to provoke this error.

Expected behavior
Unless any of the following situations occurs, no concurrent modification should be reported:

  • It was observed that the file timestamps on certain Azure shared storages are
    not fully reliable. For this reason, the "cas-storage.file-system-timestamp-accuracy"
    property was introduced to allow INCEpTION to work on such systems. Basically, it
    suppresses the error when the delta between the remembered timestamp and the
    on-disk timestamp is smaller than the value set in the property.

  • When a user opens a document in multiple browser windows or browsers. A document
    must only be open by one user in one browser window at a time. A special case
    is curation where only one curator may curate a document at a time since all
    curation data is normally written to the same file on disk. Such a mode of use
    is not supported.

  • When a user uses forward/backwards buttons in the browser to navigate from/to
    the annotation page. Access the annotation/curation pages only through
    the respective links in the browser.

  • When you try running multiple INCEpTION instances accessing the same database
    and data folder on disk. Such a deployment mode is not supported.

  • When a project manager performs certain actions such as changing the layer/feature
    configuration or running CAS Doctor repairs. Such actions should be performed when
    there is low risk that annotators are currently working on the affected project.
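
To illustrate the timestamp-accuracy rule from the first item above, here is a minimal sketch of such a check. It is not INCEpTION's actual implementation: the class name `TimestampAccuracyCheck`, the method `isConcurrentModification`, and the sample timestamps are hypothetical, and treating a delta exactly equal to the accuracy as an error is an assumption based on the wording "smaller than".

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; not taken from the INCEpTION code base.
final class TimestampAccuracyCheck {

    /**
     * Returns true if the gap between the remembered timestamp and the
     * timestamp found on storage is NOT smaller than the tolerated accuracy,
     * i.e. if a concurrent modification would be reported.
     */
    static boolean isConcurrentModification(Instant remembered, Instant onStorage,
            Duration accuracy) {
        // Use the absolute difference so the direction of the drift does not matter.
        Duration delta = Duration.between(onStorage, remembered).abs();
        return delta.compareTo(accuracy) >= 0;
    }

    public static void main(String[] args) {
        Instant remembered = Instant.parse("2024-01-01T10:00:00Z"); // made-up sample value
        Duration accuracy = Duration.ofMillis(1000);

        // 400 ms drift is inside the tolerance, so the error is suppressed.
        System.out.println(isConcurrentModification(remembered,
                remembered.plusMillis(400), accuracy)); // prints "false"

        // 5 s drift exceeds the tolerance, so the error is reported.
        System.out.println(isConcurrentModification(remembered,
                remembered.plusSeconds(5), accuracy)); // prints "true"
    }
}
```

With a check of this shape, raising the accuracy value only widens the window in which differences are ignored; it does not prevent the underlying timestamp mismatch.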

Screenshots

Error: While [writing], the file system CAS storage detected a concurrent modification to the annotation CAS for user [USER@DOMAIN.onmicrosoft.com] in document [FILE.pdf](866) or project [Project 5 - Batch 3](40) (expected: 2024-01-24 07:19:48.928 actual on storage: 2024-01-24 06:56:21.310, delta: 00:23:27.618), accuracy: 1000ms.

Please complete the following information:

  • Version and build ID: 30.0
  • OS: [e.g. Windows, Linux, OS X]
  • Browser: [e.g. chrome, safari]

Additional context

@reckart reckart added 🐛Bug Something isn't working Module: Annotation labels Jan 25, 2024
@reckart reckart added this to the 30.4 milestone Jan 25, 2024
@reckart reckart self-assigned this Jan 25, 2024
@reckart reckart added this to 🔖 To do in Kanban via automation Jan 25, 2024
@77neel

77neel commented Jan 29, 2024

Environment Detail:

  • Operating system on which INCEpTION is running? - Linux
  • Deployment: local workstation, onsite server, cloud (which one?) - Cloud (Azure)
  • Which file system is used for storing the data files? - SMB Protocol File Share
  • Do multiple INCEpTION instances access the same data folder and/or database? - No

@reckart
Member Author

reckart commented Jan 29, 2024

On the mailing list, you wrote that you see lost annotations. To what extent do you see those? Is only the last action performed lost, or is the document actually reset to a much older state, or does it even lose all of its annotations?

On the mailing list, you note that Azure Defender might mess with the timestamp. IMHO Azure Defender should not change files, right? So it should not change the "last modified" timestamp of the file that INCEpTION uses - if anything, it should only change the "last access" timestamp.

@77neel

77neel commented Jan 29, 2024

Out of 7 files, 4 lost annotations (up to approximately 80%).

About Azure Defender, sorry I am not sure.

@reckart
Member Author

reckart commented Jan 29, 2024

Could you set the log level of de.tudarmstadt.ukp.inception.annotation.storage to trace and add the line cas-storage.trace-access=true to your settings.properties? That should give us a lot of information about who is reading or writing what, and when.

Do you get the error message suddenly during annotation or e.g. after a while of not having done anything (e.g. after a pause)?

@reckart
Member Author

reckart commented Jan 29, 2024

@77neel A user on the mailing list reported that timestamp problems on Azure disappeared when using an Azure VM with attached storage. Maybe that is an option that you could try.

@reckart reckart modified the milestones: 30.4, 30.5 Jan 29, 2024
@77neel

77neel commented Jan 30, 2024

@reckart I got the error message mentioned above suddenly during annotation, while the "cas-storage.file-system-timestamp-accuracy" value had not yet been increased (to 30 minutes).

After increasing the value, no error appeared.

Annotated data went missing both before and after changing the property value.

One more error appeared in the log recently while curating a file:

USER@DOMAIN.onmicrosoft.com: Unable to perform action: Cannot invoke "de.tudarmstadt.ukp.inception.recommendation.api.model.Predictions.getPredictionByVID(de.tudarmstadt.ukp.clarin.webanno.model.SourceDocument, de.tudarmstadt.ukp.inception.rendering.vmodel.VID)" because "predictions" is null

@reckart
Member Author

reckart commented Jan 30, 2024

The "predictions is null" error is likely unrelated, but can you provide a stack trace for it please? That would help with locating the problem in the code.

@reckart
Member Author

reckart commented Jan 30, 2024

@77neel If you increase cas-storage.file-system-timestamp-accuracy to a very high value, the error message is suppressed. But if you still lose data, what is the point in suppressing the message? When you do not get the message, you will not even know that you might have lost data.

Also, if you get the error message and have set cas-storage.trace-access=true as well, then there should be two stack traces in the logs: one for the last successful write and one for the current write that encounters the wrong timestamp. Along with the trace logging, maybe that helps us narrow down what is happening.
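
For readers unfamiliar with this kind of diagnostic, the following is a minimal sketch of how a storage layer could remember where the last successful write came from, so that it can be logged next to the stack trace of a failing write. It is an assumption-laden illustration, not INCEpTION's code: the class `WriteTraceRegistry`, its methods, and the key format are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not taken from the INCEpTION code base.
final class WriteTraceRegistry {

    // Maps a CAS key (e.g. document id plus user name) to a Throwable whose
    // stack trace marks the code path of the last successful write.
    private final Map<String, Throwable> lastWrites = new ConcurrentHashMap<>();

    /** Call after a write has succeeded; captures the current call stack. */
    void recordSuccessfulWrite(String casKey) {
        lastWrites.put(casKey, new Throwable("Last successful write of " + casKey));
    }

    /** Returns the recorded trace, or null if no write was recorded for the key. */
    Throwable lastSuccessfulWrite(String casKey) {
        return lastWrites.get(casKey);
    }

    public static void main(String[] args) {
        WriteTraceRegistry registry = new WriteTraceRegistry();
        registry.recordSuccessfulWrite("document-866/user"); // made-up key

        // When a later write detects a timestamp mismatch, both traces can be
        // printed: the remembered one and the one of the current call.
        registry.lastSuccessfulWrite("document-866/user").printStackTrace();
        new Throwable("Current write with mismatching timestamp").printStackTrace();
    }
}
```

Comparing the two traces shows whether the earlier write and the failing write originate from the same code path, for example the same page or a background task.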

@reckart
Member Author

reckart commented Jan 31, 2024

@77neel In your case, the problem appears on writing and, at least according to the timestamps, the data that is going to be written appears to be way newer than the data on disk. So what we could try is to add an option to relax the check a bit in such a way that no error is generated if the data to be written remembers a newer timestamp - because hopefully the data in fact is newer. That may remove the problem for you. However, in general, I would not find this a particularly satisfactory solution because we still do not know the root cause of the problem...
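
A minimal sketch of what such a relaxed check could look like, using the two timestamps from the error message in this issue. This is only an illustration of the idea: the class `RelaxedWriteCheck`, the method `shouldReportOnWrite`, and the flag `tolerateNewerRemembered` are hypothetical names, and no such option is confirmed to exist in INCEpTION.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical illustration only; not taken from the INCEpTION code base.
final class RelaxedWriteCheck {

    /**
     * Decides whether a write should report a concurrent modification.
     * With tolerateNewerRemembered set, a remembered timestamp that is newer
     * than the one on storage is accepted without an error.
     */
    static boolean shouldReportOnWrite(LocalDateTime remembered, LocalDateTime onStorage,
            Duration accuracy, boolean tolerateNewerRemembered) {
        if (tolerateNewerRemembered && remembered.isAfter(onStorage)) {
            return false; // data to be written is assumed to be newer
        }
        Duration delta = Duration.between(onStorage, remembered).abs();
        return delta.compareTo(accuracy) >= 0;
    }

    public static void main(String[] args) {
        // Timestamps from the reported error message (time zone not stated there).
        LocalDateTime expected = LocalDateTime.parse("2024-01-24T07:19:48.928");
        LocalDateTime actual = LocalDateTime.parse("2024-01-24T06:56:21.310");
        Duration accuracy = Duration.ofMillis(1000);

        // Strict check: the delta of 23 min 27.618 s exceeds 1000 ms.
        System.out.println(shouldReportOnWrite(expected, actual, accuracy, false)); // prints "true"

        // Relaxed check: the remembered timestamp is newer, so no error.
        System.out.println(shouldReportOnWrite(expected, actual, accuracy, true)); // prints "false"
    }
}
```

Note that the relaxed variant still reports the case where the file on storage is newer than what was remembered, which is the case most likely to indicate a real concurrent write.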

@77neel

77neel commented Feb 1, 2024

@reckart I decreased the "cas-storage.file-system-timestamp-accuracy" value to 500ms, but I haven't encountered that error again and no data was lost. I am also not sure why it is working fine now. I will let you know once I find more details on this matter.

@reckart reckart modified the milestones: 30.5, 31.1 Feb 6, 2024
@reckart reckart modified the milestones: 31.1, 31.2 Feb 20, 2024
@reckart reckart modified the milestones: 31.2, 31.3, 31.4 Mar 5, 2024
@reckart reckart modified the milestones: 31.4, 31.5 Mar 26, 2024
@reckart reckart modified the milestones: 31.4, 32.1 Apr 2, 2024
@reckart reckart modified the milestones: 32.1, 32.2 Apr 29, 2024
@reckart reckart modified the milestones: 32.2, 32.3 May 9, 2024
@reckart reckart modified the milestones: 32.3, 32.4 May 28, 2024