Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

more than one file (2) in download dir: be lenient #29

Open
karan opened this issue Nov 15, 2020 · 6 comments
Open

more than one file (2) in download dir: be lenient #29

karan opened this issue Nov 15, 2020 · 6 comments
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@karan
Copy link

karan commented Nov 15, 2020

Moving from JakeWharton/docker-gphotos-sync#38:

$ docker run --name gphotos-cdp-sync --rm -v /gphotos-sync:/tmp/gphotos-cdp -v /dl/cdp-gphotos-sync:/download jakewharton/gphotos-sync:0.3.1 /app/sync.sh

...snip
2020/11/15 21:26:24 Running /app/fix_time.sh on /download/AF1QipBizz_jZXshGM1QuBcdxkMmiP-_EDMi7eLjKxHY/DSC00874.JPG
/download/AF1QipBizz_jZXshGM1QuBcdxkMmiP-_EDMi7eLjKxHY/DSC00874.JPG
2020/11/15 21:26:25 Event: {Type:keyDown Modifiers:Shift Timestamp:<nil> Text: UnmodifiedText: KeyIdentifier: Code:KeyD Key:D WindowsVirtualKeyCode:68 NativeVirtualKeyCode:68 AutoRepeat:false IsKeypad:false IsSystemKey:false Location:0}
2020/11/15 21:26:25 Event: {Type:keyUp Modifiers:Shift Timestamp:<nil> Text: UnmodifiedText: KeyIdentifier: Code:KeyD Key:D WindowsVirtualKeyCode:68 NativeVirtualKeyCode:68 AutoRepeat:false IsKeypad:false IsSystemKey:false Location:0}
2020/11/15 21:26:25 ERROR: unhandled page event *page.EventDownloadWillBegin
2020/11/15 21:26:26 more than one file (2) in download dir "/download"
[cmd] /app/sync.sh exited 1
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.

This seems to be happening after a few photos are downloaded.

And the files:

-rwxr-xr-x 1 user grp       0 Nov 15 13:26 DSC00853.JPG*
-rwxr-xr-x 1 user grp 3009501 Nov 15 13:26 DSC00853.JPG.crdownload*
-rwxr-xr-x 1 user grp      76 Nov 15 13:26 .lastdone*

Reading the code, the cdp code will freak out if it sees more than 1 file in the directory (as the error says).

I put a watch on my download dir, and here's what I can tell is happening:

  1. Chrome starts the download. File name is 100_1291.JPG.crdownload.
  2. Chrome finishes the download. File is copied in to 100_1291.JPG.

For that brief period, both files exist. If you get unlucky, the if check can happen during that period, causing the container to die.

I think this can be resolved by one of:

  1. ignore .crdownload files.
  2. if >1 files, continue after waiting a beat (and let the disk state reconcile
@mpl
Copy link
Collaborator

mpl commented Nov 17, 2020

hmm, I wonder why you seem to be the only one to have .crdownload files. Of course I have seen it happening when downloading things manually on Chrome, but I've never seen it happen with the API.
Anyway, I suppose it's ok to make the check ignore .crdownload files for now. Are you interested in proposing a PR?

@mpl mpl added bug Something isn't working good first issue Good for newcomers labels Nov 17, 2020
@czheo
Copy link

czheo commented Nov 26, 2020

Why do we need this check at the first place?

@excerebrose
Copy link

@mpl I also seemed to notice the same issue along with 'DS_Store'
Hence a quick patch:

	if strings.HasSuffix(v.Name(), ".crdownload") {
				continue
			}
			if v.Name() == ".DS_Store" {
				continue
			}

@danielverkamp
Copy link

I'm seeing the same issue, although I think the sequence of events is slightly different. It seems like ioutil.ReadDir() is just observing two different names for the same underlying file, although I don't think both actually exist simultaneously. At least in my case, downloading onto ext4 on Linux, both the .crdownload file and the completed download have the exact same inode number and a link count of 1 (checked by printing out fileEntries in the "more than one file" error message), so I suspect it is just being renamed, not copied. In any case, this seems to be a valid behavior of readdir, even if renames are atomic on the filesystem in question (for example, see https://yarchive.net/comp/linux/readdir_nonatomicity.html).

I hacked around it in my local copy by just changing the check to accept 2 files instead of 1 in the download directory, but this doesn't seem very robust. I also considered just filtering out .crdownload files in the loop before the "more than one file" check, but this prevents the code below that from extending the deadline when a download is in progress.

Maybe allowing two entries if one's filename is equal to the other's filename + ".crdownload" would be best? That would make the code a bit clunky, though.

@gc-ss
Copy link

gc-ss commented Dec 31, 2021

Is this a "harmless" race condition?

Can I just restart this after it throws this specific error without concern of data corruption?

@andyseubert
Copy link

andyseubert commented Mar 17, 2022

I made a watchdog script which I run every 15 minutes to re-run the sync app since it stops almost every time with this error. I'm not sure it's helping yet

if ! pgrep -f chrome &> /dev/null 2>&1; then
  # chrome is not running
  /app/sync.sh
fi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

7 participants