Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make resume data storage backups #17929

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

glassez
Copy link
Member

@glassez glassez commented Oct 24, 2022

Makes backup of resume data from time to time (only SQLite based storage is supported).
It is intended to protect against corruption of the main resume data, so it does not create a new copy every time, but simply overwrites the single one.

@glassez glassez added the Core label Oct 24, 2022
@glassez glassez added this to the 4.5.0 milestone Oct 24, 2022
@glassez glassez requested review from Chocobo1 and a team October 24, 2022 12:04
@thalieht
Copy link
Contributor

thalieht commented Oct 24, 2022

I deleted the .db file but it doesn't recreate it from .db.bak on qBt restart. It re-creates it from BT_backup (moved BT_backup to make sure)
When i started qBt without a .db or BT_bacup folder, it overwrote the .bak file with a 20 KB file (empty i guess, it was ~5.9 MB)
and created an empty BT_backup folder even though it is set to sqlite db.
Shouldn't there be a log message when it's restored from backup?
The .bak file is ~100 KB smaller than the .db, is that normal?

/off topic
Why is the .db created on the next qBt startup? Edge case: if someone changes to sqlite and deletes BT_backup before the next qBt start he will be in trouble.

@glassez
Copy link
Member Author

glassez commented Oct 24, 2022

I deleted the .db file but it doesn't recreate it from .db.bak on qBt restart.

Only "making resume data storage backups" is implemented. For now it is intended for manual troubleshooting. So everything else works as before.

@@ -4647,6 +4649,9 @@ void SessionImpl::handleTorrentResumeDataReady(TorrentImpl *const torrent, const
m_resumeDataStorage->remove(iter.value());
m_changedTorrentIDs.erase(iter);
}

if (m_numResumeData == 0)
m_resumeDataStorage->makeBackup();
Copy link
Member

@Chocobo1 Chocobo1 Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe only make backups at program start?
If the db is big, this might be too slow and too much disk wearing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The application can run for quite a long time from the moment of start until the storage is damaged (which is more likely to occur when shutting down). If you make a copy only at the start, then backup copy may be too outdated. In addition, writing several tens of megabytes from time to time is not so much compared to the torrent in itself.
In any case, we need to start somewhere and wait for feedback.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a not uncommon usage: assuming a user has 10 MB database. Now he adds 10 .torrent files slowly, one by one. How many disk write for backup file occurred? 100 MB! And maybe the torrents were added in paused state and hasn't even start downloading. This is just too much IMO.

If you make a copy only at the start, then backup copy may be too outdated. In addition, writing several tens of megabytes from time to time is not so much compared to the torrent in itself.

A better idea is to use "Save resume data interval" and generate backups on this period. But users might still argue that when all torrents are paused (no statistics were updated) it is wasteful to make such backups.
So maybe add another option (under advanced tab) for it? so users can tweak it.

Also is database corruption really a common thing? So far only one user reported it and the cause is his own fault (force killing qbt). Do we really need such feature?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now he adds 10 .torrent files slowly, one by one. How many disk write for backup file occurred?

None. Resume data is initially saved in other place of code.

A better idea is to use "Save resume data interval" and generate backups on this period. But users might still argue that when all torrents are paused (no statistics were updated) it is wasteful to make such backups.

If resume data isn't changed the saving is not requested.

So maybe add another option (under advanced tab) for it? so users can tweak it.

This is certainly too much for the initial implementation, IMO.

Also is database corruption really a common thing?

Fortunately, it doesn't seem to be. But it does not become less dangerous to leave it without the slightest protection. Since in case of SQLite we have a single file, the user may lose all data if it is corrupted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also users requested backups in corresponding discussion.

Copy link
Member

@Chocobo1 Chocobo1 Oct 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now he adds 10 .torrent files slowly, one by one. How many disk write for backup file occurred?

None. Resume data is initially saved in other place of code.

A better idea is to use "Save resume data interval" and generate backups on this period. But users might still argue that when all torrents are paused (no statistics were updated) it is wasteful to make such backups.

If resume data isn't changed the saving is not requested.

Have you really see how frequent the backup file is written? The backup file is generated at torrent add/pause/resume, which hardly counts as from time to time, more like always update it.
IIRC "Save resume data interval" was tweaked to current 1 hour in order to minimize disk wearing and I don't see why we want to go backwards now (generate more disk writes than necessary).

@qbittorrent/bug-handlers
Please test the disk wearing scenario mentioned in #17929 (comment). Notice that backup file modification time will be updated at every torrent add/pause/resume. Is the frequent db backup acceptable if someone has some moderate size db file (>= 10MB)? Shouldn't we care about the disk wearing it caused?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a settings for backup frequency.
Like
Once daily.
Once weekly.
Automatic.

Once daily/weekly option will keep track of last backup time and take action accordingly.

Automatic option will take backups whenever resume data is saved(per save interval).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am open to any suggestions for tweaking it. I still think that doing a backup once when launching the application is not enough (of course, it's still better than not doing it at all). Alternatively, we can simply bind to "Save resume data interval".

The backup file is generated at torrent add/pause/resume,

The counter question is how often do you do the above?

In any case, some fixed interval would be a good compromise, wouldn't it? But I don't want to add a separate option for it yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can simply bind to "Save resume data interval".

OK. But the implementation should be smart enough to not make backups when the db has not changed from the last save.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the implementation should be smart enough to not make backups when the db has not changed from the last save.

Sure.

@glassez
Copy link
Member Author

glassez commented Oct 25, 2022

The .bak file is ~100 KB smaller than the .db, is that normal?

Yes. Backup is created using VACUUM INTO clause so resulting file can be more optimized that original one.

@thalieht
Copy link
Contributor

Only "making resume data storage backups" is implemented.

Ok then, i found no problems.

@glassez glassez marked this pull request as draft November 21, 2022 19:56
@glassez glassez modified the milestones: 4.5.0, 4.5.1 Nov 27, 2022
@baccccccc
Copy link

baccccccc commented Dec 4, 2022

This sounds like a fantastic feature. If I understand that correctly, it could save me from so much trouble when I got (indirectly) impacted by this bug and got pretty much all of my categories deleted on error. Thanks for bringing this to life!

@glassez glassez modified the milestones: 4.5.1, 4.6.0 Dec 19, 2022
@glassez glassez modified the milestones: 4.6.0, 5.0 Jul 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants