Resource merge transactions can cause lost data in metadata workflow #7952

Open
ianwallen opened this issue Apr 13, 2024 · 1 comment
ianwallen commented Apr 13, 2024

Describe the bug
If there are errors while merging a working copy record into a published version, it is possible for the resource files to be lost.

To Reproduce

The following scenario has occurred in our environment.

The user attempts to merge the working copy into the published version (part 1).
The system performed the following:
- Copied metadata 2222 (working copy) to 1111 (published version).
- Merged/copied the files from 2222 to 1111 and then deleted the files of 2222.
- An error occurred before the transaction completed, causing the database to roll back. The metadata copy was therefore rolled back, but the file changes were not.

In our case, a deadlock occurred in the transaction and caused it to be rolled back. We have not identified the cause of the deadlock.

After noticing the error, the user attempts to merge the working copy into the published version again (part 2).
- Copy metadata 2222 (working copy) to 1111 (published version).
- Merge/copy files from 2222 to 1111, then delete 2222. This time 2222 is empty, so the merge deletes all files from 1111 in order to make 1111 match 2222. At this point all files are lost.
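
To make the two-step failure easier to follow, here is a minimal in-memory sketch of it. It is not GeoNetwork code: the class `LostResourcesDemo`, its methods, and the file names are all hypothetical, and the "resource store" is just a map from record ID to file names. The only behaviour it assumes is what is described above: file changes are applied immediately and are not undone when the database transaction rolls back.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the scenario: file changes are not covered by the database rollback. */
final class LostResourcesDemo {
    // Hypothetical stand-in for the resource store: record id -> attached file names.
    final Map<String, Set<String>> files = new HashMap<>();

    /** Makes the target hold exactly the source's files, then empties the source. */
    void mergeResources(String source, String target) {
        Set<String> sourceFiles = files.getOrDefault(source, Set.of());
        files.put(target, new TreeSet<>(sourceFiles)); // files only in target are dropped
        files.put(source, new TreeSet<>());            // working copy files are deleted
    }

    /** One merge attempt; returns false when the simulated database transaction rolls back. */
    boolean attemptMerge(String source, String target, boolean databaseFails) {
        mergeResources(source, target); // applied immediately, outside database control
        return !databaseFails;          // a rollback undoes metadata only, never the files
    }

    /** Replays part 1 and part 2 and returns the files left on the published record. */
    static Set<String> runScenario() {
        LostResourcesDemo demo = new LostResourcesDemo();
        demo.files.put("1111", new TreeSet<>(Set.of("report.pdf")));
        demo.files.put("2222", new TreeSet<>(Set.of("report.pdf", "data.csv")));
        demo.attemptMerge("2222", "1111", true);  // part 1: deadlock, rollback
        demo.attemptMerge("2222", "1111", false); // part 2: retry, 2222 is already empty
        return demo.files.get("1111");            // empty: every file is gone
    }
}
```

In this model `LostResourcesDemo.runScenario()` returns an empty set, which corresponds to the state we ended up in after the second attempt.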

Expected behavior
I expect no data loss.

Desktop (please complete the following information):
This issue was produced in GeoNetwork version 3.12.11; however, to my knowledge the same issue exists in the latest 4.4.3 release.

Additional context

Here is a potential change that could help resolve this issue. I'm not sure how complicated it would be to apply or whether it would cause other issues. I would like others' input.

Current process

  • transaction start
  • metadata merge
  • resource merge
    • merge 2222 to 1111
    • delete 2222
  • transaction ends

Proposed process

  • transaction start
  • metadata merge
  • resource merge
    • backup 1111
    • merge 2222 to 1111
  • transaction ends
  • cleanup
    • if transaction failed
      • recover 1111 backup
    • if transaction succeeded
      • delete 1111 backup
      • delete 2222
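
As a rough illustration of the proposed process, the sketch below uses the same kind of in-memory stand-in for the resource store. Everything in it is an assumption made for illustration: `BackupMergeSketch`, `mergeWithBackup`, and the `commit` callback are hypothetical names, and the callback simply throws a `RuntimeException` to simulate a rolled-back transaction. How this would hook into the real transaction handling is exactly the open question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class BackupMergeSketch {
    // Hypothetical stand-in for the resource store: record id -> attached file names.
    final Map<String, Set<String>> files = new HashMap<>();

    /**
     * Merges source into target. The commit callback stands in for the end of the
     * database transaction and throws a RuntimeException to simulate a rollback.
     */
    boolean mergeWithBackup(String source, String target, Runnable commit) {
        Set<String> backup = new TreeSet<>(files.getOrDefault(target, Set.of())); // backup 1111
        boolean committed;
        try {
            // merge 2222 to 1111
            files.put(target, new TreeSet<>(files.getOrDefault(source, Set.of())));
            commit.run(); // transaction ends
            committed = true;
        } catch (RuntimeException rollback) {
            committed = false;
        }
        // Cleanup runs only once the transaction outcome is known.
        if (committed) {
            files.remove(source);      // delete 2222 (the in-memory backup is just discarded)
        } else {
            files.put(target, backup); // recover 1111 backup; 2222 is left untouched
        }
        return committed;
    }
}
```

With this ordering, a failed first attempt leaves both 1111 and 2222 as they were, so a retry starts from the original state instead of from an empty working copy.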
@wangf1122

It is not clear how the transaction is rolled back or handled, so I'm not sure where the cleanup process would fit into the transaction. However, moving the resource deletion to the end of the draft merge would provide short-term relief and help prevent resources from being accidentally lost.

Here is the pull request for that change #8100
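
For readers who want the reordering idea in concrete form, here is a minimal sketch of it. It is not the code from the pull request: `DeferredDeleteSketch`, `mergeDraft`, and `remainingSteps` are hypothetical names, the store is an in-memory map, and `remainingSteps` stands in for whatever work can still fail after the files have been copied.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DeferredDeleteSketch {
    // Hypothetical stand-in for the resource store: record id -> attached file names.
    final Map<String, Set<String>> files = new HashMap<>();

    /** Copies the draft's files to the target and deletes the draft's files only at the end. */
    void mergeDraft(String source, String target, Runnable remainingSteps) {
        // Copy first: the target now mirrors the draft, and the draft is still intact.
        files.put(target, new TreeSet<>(files.getOrDefault(source, Set.of())));
        // Anything that can still fail runs here; an exception skips the delete below.
        remainingSteps.run();
        // Deletion is the very last step of the merge.
        files.remove(source);
    }
}
```

In this model, a failure in `remainingSteps` leaves the draft's files in place, so a retry copies the same files again rather than an empty set. It does not restore the previous files of the published record, which is why it is only short-term relief compared with the backup approach.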
