
interrupted system call error while cloning repository in gcsfuse mounted directory #1016

Open
Pardeep009 opened this issue Mar 20, 2023 · 27 comments
Assignees
Labels
bug Error or flaw in the code with unintended result known-issues Issues that are known or not supported yet. p2 P2

Comments

@Pardeep009

Pardeep009 commented Mar 20, 2023

Hi Team.

I have mounted a GCS bucket on a directory on my Ubuntu GCE VM using gcsfuse:
gcsfuse -o allow_other --gid=XXX --uid=XXX <gcs-name> <directory>
where gid and uid are the values of a user who also has access to this shared directory. The command used to clone the repository is git clone --branch <branch-name> --depth=1 --single-branch <repo-path> <destination-path>, but it fails with the error

fatal: cannot create directory at '<path>': Interrupted system call

It does download some of the repository content, but it keeps failing, every time at a different path.

While running gcsfuse in debug mode with the flags --debug_fuse --debug_gcs, the errors captured are:

gcs: Req            0x3c7: -> StatObject("<a-file-path-in-repository>") (10.194465ms): Error in fetching object attributes: context canceled
FlushFile: interrupted system call, FileInode.Sync: StatObject: Error in fetching object attributes: context canceled
fuse_debug: Op 0x000033a0        connection.go:500] -> Error: "interrupted system call"
fuse: *fuseops.FlushFileOp error: interrupted system call

I am currently running Ubuntu 20.04 with gcsfuse version 0.42.2 (Go version go1.19.5).
Kindly help.

@vadlakondaswetha
Collaborator

This is a known issue with the FUSE filesystem. There is no official support for gcsfuse when used with other tools like git clone.

The interrupts come from the kernel and have nothing to do with GCSFuse. It would be good if you could run gcsfuse with strace and see why the kernel is throwing interrupts intermittently.
Ref existing issues: #288, #402
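For reference, one way to capture such a trace might look like the following sketch (the log path is a placeholder, and this assumes gcsfuse is already running as a single process findable by name):

```shell
# Sketch: attach strace to the running gcsfuse process, follow forks,
# and write the trace to a file.
strace -f -o /tmp/gcsfuse.strace -p "$(pgrep -x gcsfuse)"
# ...reproduce the failure in another terminal, stop strace (Ctrl-C),
# then filter for interrupt-related lines:
grep -E 'EINTR|ERESTARTSYS|interrupted' /tmp/gcsfuse.strace
```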

@Pardeep009
Author

I tried running with strace -f, searched for the keyword "interrupted" in the logs, and found:

[pid 79356] write(2, "FlushFile: interrupted system ca"..., 118 <unfinished ...>
[pid 79345] futex(0xc00045bd48, FUTEX_WAKE_PRIVATE, 1FlushFile: interrupted system call, FileInode.Sync: StatObject: Error in fetching object attributes: context canceled
 <unfinished ...>
[pid 79354] <... futex resumed>)        = 0
[pid 79345] <... futex resumed>)        = 1
[pid 79356] <... write resumed>)        = 118
[pid 79354] futex(0xc00045bd48, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 79356] futex(0xc00045bd48, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 79354] <... futex resumed>)        = -1 EAGAIN (Resource temporarily unavailable)
[pid 79345] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 79356] <... futex resumed>)        = 0
[pid 79354] nanosleep({tv_sec=0, tv_nsec=3000},  <unfinished ...>
[pid 79356] write(2, "fuse: *fuseops.FlushFileOp error"..., 58fuse: *fuseops.FlushFileOp error: interrupted system call
) = 58
[pid 79354] <... nanosleep resumed>NULL) = 0
[pid 79356] write(2, "MkDir: interrupted system call, "..., 330MkDir: interrupted system call, CreateChildDir: error in closing writer : Post "https://storage.googleapis.com/upload/storage/v1/b/<bucket-name>/o?alt=json&ifGenerationMatch=0&name=<any-folder-path>&prettyPrint=false&projection=full&uploadType=multipart": context canceled
 <unfinished ...>
[pid 79345] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 79356] <... write resumed>)        = 330
[pid 79356] futex(0xc00045b148, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 79353] <... futex resumed>)        = 0
[pid 79345] <... nanosleep resumed>NULL) = 0
[pid 79356] write(2, "fuse: *fuseops.MkDirOp error: in"..., 54fuse: *fuseops.MkDirOp error: interrupted system call
 <unfinished ...>
[pid 79353] futex(0xc00045b148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>

@sethiay sethiay added the known-issues Issues that are known or not supported yet. label May 31, 2023
@Mistic92

Mistic92 commented Jun 5, 2023

We have encountered this error today too. @Pardeep009 have you found any solution?

@Pardeep009
Author

@Mistic92 I have found no solution as of now. By the way, just curious: which use case are you trying to implement?
We have since shifted to another central storage solution (NFS) for our use case.

@sethiay sethiay added bug Error or flaw in the code with unintended result and removed known-issues Issues that are known or not supported yet. labels Jun 6, 2023
@vadlakondaswetha vadlakondaswetha added the p1 P1 label Jun 8, 2023
@sethiay sethiay added p2 P2 and removed p1 P1 labels Jul 18, 2023
@yelinaung

Hello, thanks for this awesome tool.
Any update on this? I am facing a similar issue (although doing just a simple local cp). I did not find any interrupt log with strace.

read(8, "5\20`\r'!q\303a\242\375\202\372\275\225\252\326r\214*\222\303\316\233\35\337\205g\r\357\333\350"..., 3684) = 3684
writev(12, [{iov_base="\20\20\0\0\0\0\0\0\200\0\0\0\0\0\0\0", iov_len=16}, {iov_base="\\\365\345\324\21b]\3006z\217\353H\363\200\241\342$\23\323\234W\1\26\271\33D\308`FG"..., iov_len=4096}], 2) = 4112
futex(0x17f1a28, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc000101148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc000101148, FUTEX_WAKE_PRIVATE, 1) = 1
read(8, "\32I\204\273\263\272pv\22(\350\215\355\227c\320C\334G3.5\2470\237\247\330Ls\353\247I"..., 3684) = 3684
writev(12, [{iov_base="\20\20\0\0\0\0\0\0\202\0\0\0\0\0\0\0", iov_len=16}, {iov_base="\225\256\330\307'\203\354o\255fX\244\212\31\32_\225\210\332\0\0\3\323{0<W\317\32\377\0\210"..., iov_len=4096}], 2) = 4112
futex(0x17f1a28, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc000101148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc000101148, FUTEX_WAKE_PRIVATE, 1) = 1
read(8, "W\324\326\236`\2vH\216\fq;>|\0\371F\234\240J\261c\207\34\1!s\257#\0\346a"..., 3684) = 3684
writev(12, [{iov_base="\20\20\0\0\0\0\0\0\204\0\0\0\0\0\0\0", iov_len=16}, {iov_base="4\303f6U-\n\257\257G\377\0\4\375\204\305!\4\325H\356\2142-\225\366\"\271\330\254\321\222"..., iov_len=4096}], 2) = 4112
futex(0x17f1a28, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc000101148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc000101148, FUTEX_WAKE_PRIVATE, 1) = 1
read(8, "V+\246{k\260\241\177\\\357\331\326\342\274\vfU\255@<\266a\203\372\373?xMK\310\267j"..., 3684) = 2496
read(8, 0xc0002cc600, 4864)             = -1 EAGAIN (Resource temporarily unavailable)
write(2, "{\"name\":\"root\",\"levelname\":\"ERRO"..., 208{"name":"root","levelname":"ERROR","severity":"ERROR","message":"ReadFile: input/output error, fh.reader.ReadAt: Reader returned 152 too many bytes\n","timestampSeconds":1692959388,"timestampNanos":83636309}
) = 208
write(2, "{\"name\":\"root\",\"levelname\":\"ERRO"..., 177{"name":"root","levelname":"ERROR","severity":"ERROR","message":"fuse: *fuseops.ReadFileOp error: input/output error\n","timestampSeconds":1692959388,"timestampNanos":83677508}
) = 177
write(12, "\20\0\0\0\373\377\377\377\206\0\0\0\0\0\0\0", 16) = 16
futex(0x17f1a28, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xc000076948, FUTEX_WAKE_PRIVATE, 1) = 1
getrandom("\xe7\x3f\xf0\x7a\xe8\x57\x7f\x4f\x20\x11\xfe\xa7\x30\x51\xc5\xc7", 16, 0) = 16
futex(0xc000076948, FUTEX_WAKE_PRIVATE, 1) = 1
write(8, "\27\3\3\4\364\226\300)]M\361\304\322\243\322IVr\33p\317\30s\375\223\213\361\214\227sd\241"..., 1273) = 1273
futex(0xc000076948, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x17f1a28, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xc000101148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x17f1a28, FUTEX_WAIT_PRIVATE, 0, NULL{"name":"root","levelname":"ERROR","severity":"ERROR","message":"ReadFile: input/output error, fh.reader.ReadAt: Reader returned 647 too many bytes\n","timestampSeconds":1692959388,"timestampNanos":124015700}
{"name":"root","levelname":"ERROR","severity":"ERROR","message":"fuse: *fuseops.ReadFileOp error: input/output error\n","timestampSeconds":1692959388,"timestampNanos":124052660}
) = 0
futex(0x17f1a28, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
write(6, "\0", 1)                       = 1
futex(0x17f1a28, FUTEX_WAIT_PRIVATE, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
rt_sigreturn({mask=[]})                 = 202
futex(0x17f1a28, FUTEX_WAIT_PRIVATE, 0, NULL

A bunch of "Resource temporarily unavailable", though.

@sethiay
Collaborator

sethiay commented Aug 29, 2023

@yelinaung Thanks for reaching out. Could you please share more information, particularly:

  • Is this issue reproducible at your end? If yes, please share the steps to reproduce.
  • Which version of GCSFuse are you using?
  • Can you also share GCSFuse logs, by mounting GCSFuse with the flags --debug_gcs --debug_fuse --log-format text --log-file <local file path>?
  • The machine spec (including OS version) on which you are running GCSFuse.
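A full mount invocation with those flags might look like this sketch (bucket name, mount point, and log path are placeholders):

```shell
# Sketch: mount with GCS and FUSE debug logging written to a text log file.
gcsfuse \
  --debug_gcs --debug_fuse \
  --log-format text \
  --log-file /tmp/gcsfuse.log \
  my-bucket /mnt/gcs
```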

@yelinaung

Hello @sethiay, I noticed there is a new gcsfuse version, so I downloaded it:

gcsfuse version 1.1.0 (Go version go1.20.5)

Strangely, for one file, I am getting the gzipped version of the file instead of the original. After extracting the gzip, I can see the original file (uncorrupted). The rest of the files seem to work fine!

I checked the file in Cloud Storage and its content type is image/jpeg.

I have attached the log:

I0901 09:30:49.068689 Start gcsfuse/1.1.0 (Go version go1.20.5) for app "" using mount point: /raw_data
I0901 09:30:49.068712 Creating Storage handle...
I0901 09:30:49.068979 Creating a mount at "/raw_data"
I0901 09:30:49.068986 Creating a new server...
I0901 09:30:49.068990 Set up root directory for bucket redated-bucket-name
D0901 09:30:49.068998 gcs: Req              0x0: <- ListObjects("")
D0901 09:30:49.156208 gcs: Req              0x0: -> ListObjects("") (87.201552ms): OK
I0901 09:30:49.156251 Mounting file system "redated-bucket-name"...
D0901 09:30:49.156263 fuse_debug: Beginning the mounting kickoff process
D0901 09:30:49.156270 fuse_debug: Parsing fuse file descriptor
D0901 09:30:49.156275 fuse_debug: Preparing for direct mounting
D0901 09:30:49.156290 fuse_debug: Successfully opened the /dev/fuse in blocking mode
D0901 09:30:49.156299 fuse_debug: Starting the unix mounting
D0901 09:30:49.156311 fuse_debug: Directmount failed. Trying fallback.
D0901 09:30:49.156381 fuse_debug: Creating a socket pair
D0901 09:30:49.156395 fuse_debug: Creating files to wrap the sockets
D0901 09:30:49.156401 fuse_debug: Starting fusermount/os mount
D0901 09:30:49.157941 fuse_debug: Wrapping socket pair in a connection
D0901 09:30:49.157965 fuse_debug: Checking that we have a unix domain socket
D0901 09:30:49.157970 fuse_debug: Read a message from socket
D0901 09:30:49.157980 fuse_debug: Successfully read the socket message.
D0901 09:30:49.157983 fuse_debug: Converting FD into os.File
D0901 09:30:49.157994 fuse_debug: Completed the mounting kickoff process
D0901 09:30:49.157997 fuse_debug: Creating a connection object
D0901 09:30:49.159228 fuse_debug: Op 0x00000002        connection.go:416] <- init
D0901 09:30:49.159258 fuse_debug: Op 0x00000002        connection.go:498] -> OK ()
D0901 09:30:49.159271 fuse_debug: Successfully created the connection
D0901 09:30:49.159278 fuse_debug: Waiting for mounting process to complete
I0901 09:30:49.159282 File system has been successfully mounted.
D0901 09:30:49.159336 fuse_debug: Op 0x00000004        connection.go:416] <- GetInodeAttributes (inode 1, PID 1207836)
D0901 09:30:49.159458 fuse_debug: Op 0x00000004        connection.go:498] -> OK ()
D0901 09:30:49.159575 fuse_debug: Op 0x00000006        connection.go:416] <- LookUpInode (parent 1, name ".Trash", PID 1207836)
D0901 09:30:49.159944 gcs: Req              0x1: <- StatObject(".Trash/")
D0901 09:30:49.159979 gcs: Req              0x2: <- StatObject(".Trash")
D0901 09:30:49.182485 gcs: Req              0x1: -> StatObject(".Trash/") (22.559307ms): gcs.NotFoundError: storage: object doesn't exist
D0901 09:30:49.205699 gcs: Req              0x2: -> StatObject(".Trash") (45.725138ms): gcs.NotFoundError: storage: object doesn't exist
D0901 09:30:49.205757 fuse_debug: Op 0x00000006        connection.go:500] -> Error: "no such file or directory"
D0901 09:30:49.205867 fuse_debug: Op 0x00000008        connection.go:416] <- LookUpInode (parent 1, name ".Trash-1000", PID 1207836)
D0901 09:30:49.205930 gcs: Req              0x3: <- StatObject(".Trash-1000")
D0901 09:30:49.205936 gcs: Req              0x4: <- StatObject(".Trash-1000/")
D0901 09:30:49.227742 gcs: Req              0x3: -> StatObject(".Trash-1000") (21.818041ms): gcs.NotFoundError: storage: object doesn't exist
D0901 09:30:49.241936 gcs: Req              0x4: -> StatObject(".Trash-1000/") (36.008498ms): gcs.NotFoundError: storage: object doesn't exist
D0901 09:30:49.241975 fuse_debug: Op 0x00000008        connection.go:500] -> Error: "no such file or directory"
D0901 09:30:50.681803 fuse_debug: Op 0x0000000a        connection.go:416] <- LookUpInode (parent 1, name "9Y6HRN4FxefkkcDRvH3NW8", PID 1186731)
D0901 09:30:50.681922 gcs: Req              0x6: <- StatObject("9Y6HRN4FxefkkcDRvH3NW8")
D0901 09:30:50.682003 gcs: Req              0x5: <- StatObject("9Y6HRN4FxefkkcDRvH3NW8/")
D0901 09:30:50.706577 gcs: Req              0x5: -> StatObject("9Y6HRN4FxefkkcDRvH3NW8/") (24.66311ms): gcs.NotFoundError: storage: object doesn't exist
D0901 09:30:50.711482 gcs: Req              0x6: -> StatObject("9Y6HRN4FxefkkcDRvH3NW8") (29.570608ms): OK
D0901 09:30:50.711569 fuse_debug: Op 0x0000000a        connection.go:498] -> OK (inode 2)
D0901 09:30:50.711662 fuse_debug: Op 0x0000000c        connection.go:416] <- LookUpInode (parent 1, name "9Y6HRN4FxefkkcDRvH3NW8", PID 1186731)
D0901 09:30:50.711743 fuse_debug: Op 0x0000000c        connection.go:498] -> OK (inode 2)
D0901 09:30:51.422436 fuse_debug: Op 0x0000000e        connection.go:416] <- LookUpInode (parent 1, name "9Y6HRN4FxefkkcDRvH3NW8", PID 1186731)
D0901 09:30:51.422633 fuse_debug: Op 0x0000000e        connection.go:498] -> OK (inode 2)
D0901 09:30:51.422718 fuse_debug: Op 0x00000010        connection.go:416] <- LookUpInode (parent 1, name "9Y6HRN4FxefkkcDRvH3NW8", PID 1186731)
D0901 09:30:51.422790 fuse_debug: Op 0x00000010        connection.go:498] -> OK (inode 2)
D0901 09:30:51.429046 fuse_debug: Op 0x00000012        connection.go:416] <- LookUpInode (parent 1, name "9Y6HRN4FxefkkcDRvH3NW8", PID 1266594)
D0901 09:30:51.429147 fuse_debug: Op 0x00000012        connection.go:498] -> OK (inode 2)
D0901 09:30:51.429191 fuse_debug: Op 0x00000014        connection.go:416] <- LookUpInode (parent 1, name "9Y6HRN4FxefkkcDRvH3NW8", PID 1266594)
D0901 09:30:51.429279 fuse_debug: Op 0x00000014        connection.go:498] -> OK (inode 2)
D0901 09:30:51.429324 fuse_debug: Op 0x00000016        connection.go:416] <- OpenFile (inode 2, PID 1266594)
D0901 09:30:51.429377 fuse_debug: Op 0x00000016        connection.go:498] -> OK ()
D0901 09:30:51.429574 fuse_debug: Op 0x00000018        connection.go:416] <- ReadFile (inode 2, PID 1266594, handle 0, offset 0, 208896 bytes)
D0901 09:30:51.429696 gcs: Req              0x7: <- Read("9Y6HRN4FxefkkcDRvH3NW8", [0, 208249))
D0901 09:30:51.570400 gcs: Req              0x7: -> Read("9Y6HRN4FxefkkcDRvH3NW8", [0, 208249)) (140.72551ms): OK
D0901 09:30:51.570461 fuse_debug: Op 0x00000018        connection.go:498] -> OK ()
D0901 09:30:51.570954 fuse_debug: Op 0x0000001a        connection.go:416] <- FlushFile (inode 2, PID 1266594)
D0901 09:30:51.571011 fuse_debug: Op 0x0000001a        connection.go:498] -> OK ()
D0901 09:30:51.571040 fuse_debug: Op 0x0000001c        connection.go:416] <- ReleaseFileHandle (PID 0)
D0901 09:30:51.571096 fuse_debug: Op 0x0000001c        connection.go:498] -> OK ()

I am running the following OS and kernel version:

Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy

6.2.0-31-generic

@gargnitingoogle
Collaborator

Strangely, for one file, I am getting gzipped version of the file instead of the original file. […]

@yelinaung Does this GCS object have Content-Encoding set to gzip in its metadata?
To check this, you can run gcloud storage objects describe gs://<bucket>/path/to/object.
If yes, then this is an intended change in the August release (1.1.0): any file that is uploaded to a GCS bucket with content-encoding gzip is now served from the gcsfuse mount directory exactly as it is stored in the bucket. Earlier, we hadn't handled GCS objects with content-encoding gzip, so any such objects were auto-decompressed by GCS when read through GCSFuse, but there were some failing scenarios. This is documented in the File Transcoding section here as well as here.

If no, then we need to take a look at it.
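In case it helps anyone checking this, the same metadata field can also be inspected with the older gsutil tool (bucket and object names below are placeholders):

```shell
# Sketch: gsutil stat lists the object's metadata, including
# Content-Encoding when it is set.
gsutil stat gs://my-bucket/path/to/object | grep -i 'content-encoding'
```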

@yelinaung

Hey @gargnitingoogle, you are right about the Content-Encoding being set to gzip. I didn't notice that in the file, nor in the changelog of 1.1.0!

@gargnitingoogle
Collaborator

Hey @gargnitingoogle, you are right about the Content-Encoding being set to gzip. I didn't notice that in the file, nor in the changelog of 1.1.0!

@yelinaung I am glad that it is the intended behaviour.

@bjornleffler
Member

bjornleffler commented Dec 8, 2023

(Common update to issues #1016 #288, #539 and #562)

We have tracked the interrupt issue down to Golang sending SIGURG signals, which are then delivered as interrupts to FUSE operations.
https://github.com/golang/proposal/blob/master/design/24543-non-cooperative-preemption.md

I sent jacobsa/fuse#153 to add an option to ignore interrupts. Once that's approved, we can change gcsfuse to ignore interrupts by default.

@bkanuka

bkanuka commented Jan 19, 2024

I can confirm that I'm seeing this error when running a heavy-I/O Go application against gcsfuse. The logs look something like the following:

ReadFile: interrupted system call, fh.reader.ReadAt: readFull: context canceled
E0119 16:53:36.829320 fuse: *fuseops.ReadFileOp error: interrupted system call
E0119 16:53:37.106603 LookUpInode: interrupted system call, StatObject: Error in fetching object attributes: context canceled
E0119 16:53:37.106671 fuse: *fuseops.LookUpInodeOp error: interrupted system call
E0119 16:53:37.384868 LookUpInode: interrupted system call, StatObject: Error in fetching object attributes: context canceled
E0119 16:53:37.384922 fuse: *fuseops.LookUpInodeOp error: interrupted system call
E0119 16:53:37.406941 LookUpInode: interrupted system call, StatObject: Error in fetching object attributes: context canceled
E0119 16:53:37.406991 fuse: *fuseops.LookUpInodeOp error: interrupted system call

Edit: After adding the env var GODEBUG="asyncpreemptoff=1" to my container (the main one, not the gcsfuse sidecar), I no longer see these logs. I'm not sure whether this "fixes" anything, but it's a big win for me because the logs were generating significant chatter (and cost!).
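For anyone applying the same workaround, setting the variable for the application process might look like this sketch (the binary name is a placeholder; note this disables Go's async preemption for the whole process, which can affect scheduling of long-running goroutines):

```shell
# Sketch: disable Go's async preemption so the runtime stops sending
# SIGURG to the application's threads.
# Dockerfile equivalent:  ENV GODEBUG=asyncpreemptoff=1
export GODEBUG=asyncpreemptoff=1
./my-go-app   # placeholder for the application binary
```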

@dennissetiawan

I am encountering the same problem when running a dockerized Java application inside Google Cloud Run. Is there any update on this?

@marcoa6 marcoa6 added the known-issues Issues that are known or not supported yet. label Jan 31, 2024
@marcoa6
Collaborator

marcoa6 commented Jan 31, 2024

@dennissetiawan We are able to reproduce the issue but have no timeline yet for a fix. We will update this issue once we have more details.

@IchordeDionysos

Are there any updates on this? :)

@ashmeenkaur
Collaborator

Are there any updates on this? :)

Hi @IchordeDionysos, we are still working on this and will update here as soon as we have a fix available.

@Chhinna

Chhinna commented Apr 30, 2024

Are there any updates on this? :)

@ashmeenkaur
Collaborator

ashmeenkaur commented May 2, 2024

Hi @Chhinna,

We're rolling out the changes behind a flag initially. You should be able to access them around May 27th, 2024, when our next release (v2.1.0) goes live. I'll share the specific flags/configuration needed to enable the feature soon.

Thanks,
Ashmeen

@IchordeDionysos

IchordeDionysos commented May 2, 2024

@ashmeenkaur am I correct that Cloud Storage FUSE for Cloud Run is using gcsfuse?

Which version is being used by Cloud Run?
Can I use v2.1.0 once it goes live via Cloud Run?
Would the flags/configuration also work with the built-in Cloud Storage FUSE (via env variables), i.e. where I'm not configuring/starting gcsfuse on my own?

Just trying to figure out whether we need to configure and start our Docker image on our own again to utilize the changes :)

@ashmeenkaur
Collaborator

@IchordeDionysos Yes, Cloud Storage FUSE for Cloud Run uses GCSFuse. Right now they are on GCSFuse version 2.0.0.

Can I use v2.1.0 once it goes live via Cloud Run?

Based on my understanding, you won't be able to use it on Cloud Run until the Cloud Run team upgrades and allowlists the new flag. Looping in @jeremiele from the Cloud Run team to provide more information.

@jeremiele

Cloud Run will be updating to 2.1.0 once it is available. Please let us know the flag so we can prioritize having it allowlisted as well. Thanks.

@ashmeenkaur
Collaborator

ashmeenkaur commented May 6, 2024

Hi everyone,

We've made some updates to how GCSFuse handles interrupts, and we're hoping you can help us test the changes.

You can test these changes by installing GCSFuse from source using this guide: https://cloud.google.com/storage/docs/gcsfuse-install#install-source-code
Remember to use the --ignore-interrupts=true flag when mounting GCSFuse to enable the new behavior.

Your early feedback is invaluable as we work towards the official release (planned for May 27).

Thank you in advance!

@williamstein

we're hoping you can help us test the changes.

I'm a different person coming here for the same reason: I want to support using git clone with a gcsfuse mount, and it doesn't work by default. I just built the client from source, tried it with the ignore-interrupts option, and it works very well.

As an aside, I've been using gcsfuse for 5+ years, and hadn't looked at it for a while, and I'm super impressed with the progress you have made recently, especially the improved support for caching to support ML workloads!

@IchordeDionysos

Hey @ashmeenkaur, I've just tested it and it works amazingly 😍
Thanks for the work you all put into this 👌

I will wait until the new version and flag are available natively in Cloud Run (as it's just easier to set up and configure).

@marcoa6
Collaborator

marcoa6 commented May 24, 2024

Update: As of version 2.1 on 23-May-2024, gcsfuse offers enhanced control over how it responds to interruptions during file system operations. This addresses failures such as running git clone on a gcsfuse-mounted directory.

You can configure GCSFuse to ignore interrupts during file system operations via the CLI using the --ignore-interrupts flag (disabled by default), or via the config file using the following:

file-system:
    ignore-interrupts: true

Once we get enough feedback, we will enable this by default (targeting next month).
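For anyone trying this out, a CLI mount invocation with the new flag might look like this sketch (bucket name and mount point are placeholders):

```shell
# Sketch: mount with interrupt handling disabled (requires gcsfuse >= 2.1).
gcsfuse --ignore-interrupts my-bucket /mnt/gcs
```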

@IchordeDionysos

@jeremiele how can we stay up-to-date when this new version and the flag will be available in Cloud Run? :)

@jeremiele

We are looking into the upgrade. Unclear on the timeline yet.
