Skip unsupported names in listed prefixes #1876

Open: gargnitingoogle wants to merge 13 commits into master from gargnitin/fix-empty-directory-list-issue

gargnitingoogle (Collaborator) commented Apr 29, 2024

Description

Skip unsupported directory names in listed prefixes, and warn on unsupported object names returned by a listing.

GCS supports object paths of the form <bucket>/<path1>//<path2>
and treats them as different from <bucket>/<path1>/<path2>, whereas a Linux filesystem treats the two as the same path.
GCSFuse, being a POSIX-compliant filesystem, supports only the latter and throws an error on the former. This error can prevent the listing of every directory in the parent directory, i.e. <path1>.
This change skips listed prefixes (directory names) that are empty, such
as "/" above, which avoids the error, and logs each skipped prefix as a warning.

Similarly, GCS supports <bucket>/<path1>/./<path2> and <bucket>/<path1>/../<path2>, but "." and ".." have reserved meanings in a Linux filesystem, so these two prefixes are skipped as well.

Similarly, a Linux filesystem does not support '\0' in names of files or directories, while it
is supported in GCS object names, so names containing it are also skipped in the GCS list call output.
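
The filtering described above could look roughly like the following Go sketch. The two lists are copied from the snippets under review in this PR. The helper names `isUnsupportedPrefix` and `filterPrefixes` are hypothetical and are not taken from the GCSFuse code. Matching the prefix list against the start of the listed name is also an assumption, and the real change logs a warning instead of returning the skipped names.

```go
package main

import (
	"fmt"
	"strings"
)

// Both lists mirror the values quoted in this PR's review snippets.
var (
	unsupportedObjectNameSubstrings  = []string{"//", "/./", "/../", "\000"}
	unsupportedDirectoryNamePrefixes = []string{"/", "./", "../"}
)

// isUnsupportedPrefix is a hypothetical helper: it reports whether a listed
// prefix starts with an unsupported directory name or contains an
// unsupported substring anywhere in its path.
func isUnsupportedPrefix(name string) bool {
	for _, p := range unsupportedDirectoryNamePrefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	for _, s := range unsupportedObjectNameSubstrings {
		if strings.Contains(name, s) {
			return true
		}
	}
	return false
}

// filterPrefixes (also hypothetical) splits listed prefixes into the ones a
// POSIX filesystem can represent and the ones that would be skipped.
func filterPrefixes(prefixes []string) (supported, skipped []string) {
	for _, p := range prefixes {
		if isUnsupportedPrefix(p) {
			skipped = append(skipped, p)
			continue
		}
		supported = append(supported, p)
	}
	return supported, skipped
}

func main() {
	listed := []string{"a/", "/", "a//b/", "a/./", "a/../", "a/b/"}
	supported, skipped := filterPrefixes(listed)
	fmt.Println("supported:", supported)
	for _, p := range skipped {
		// Stand-in for the warning log described above.
		fmt.Printf("WARNING: ignoring unsupported prefix %q\n", p)
	}
}
```

Skipping instead of failing means one bad prefix no longer hides its supported siblings in the same listing.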

Link to the issue in case of a bug fix.

NA

Testing details

  1. Manual -
    • GCS bucket structure: 1. <bucket>//hello.txt, 2. <bucket>/a/hello.txt
    • GCSFuse mount command: gcsfuse --implicit-dirs --log-file=$logfile --debug_fuse --log-format=text $bucket $mountpath
    • Without the fix,
      ls $mountpath
      ls: reading directory $mountpath: Input/output error
      cat $logfile
      - No warning/error log -
    • With the fix,
          ls $mountpath
          a
          cat $logfile
          ...
          time=... severity=WARNING message="Ignoring unsupported prefix \"/\""
          ...
  2. Unit tests - NA
  3. Integration tests - NA

@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from 3d1d4bf to dfcbf92 Compare April 29, 2024 11:44
@gargnitingoogle gargnitingoogle added the execute-integration-tests Run only integration tests label Apr 29, 2024
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch 2 times, most recently from ed0d51d to 9024dc6 Compare April 30, 2024 09:11
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from 9024dc6 to c1195c4 Compare May 4, 2024 20:45
internal/util/util_test.go (outdated; two resolved review threads)
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch 4 times, most recently from 02488f1 to cca204a Compare May 6, 2024 09:56
@gargnitingoogle gargnitingoogle marked this pull request as ready for review May 6, 2024 10:00
@gargnitingoogle gargnitingoogle requested review from ashmeenkaur and a team as code owners May 6, 2024 10:00
@gargnitingoogle gargnitingoogle changed the title Skip empty-directory names in listed prefixes Skip unsupported names in listed prefixes May 6, 2024
@gargnitingoogle
Collaborator Author

Presubmit tests passed: logs

@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from 80306c2 to fbbc0b1 Compare May 8, 2024 09:18
Collaborator

@ashmeenkaur ashmeenkaur left a comment

Still need to take a look at the tests

)

var (
UnsupportedObjectNameSubstrings = []string{"//", "/./", "/../", "\000"}

Collaborator

Nit: I think we can make these vars un-exported. Also are we sure that "\0" always shows up as "\000" in object names?

Collaborator

On my linux machine, I am able to create files/directories with \000 or \0 substring. Is this expected?
Ex: touch \\0
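
A small Go sketch (illustrative only, not part of this PR) bears on the two questions above. In Go source, the interpreted literal "\000" is an octal escape for a single NUL byte, whereas `touch \\0` in a shell creates a file whose name is the two characters backslash and 0. A real NUL byte cannot appear in a Linux file name because paths are NUL-terminated strings. The helper name `containsNUL` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// containsNUL is a hypothetical helper that reports whether name holds a
// real NUL byte (0x00), which is what the "\000" entry in the list matches.
func containsNUL(name string) bool {
	return strings.Contains(name, "\000")
}

func main() {
	nul := "\000"   // interpreted literal: octal escape, one byte 0x00
	literal := `\0` // raw literal: a backslash followed by '0', two bytes

	fmt.Println(len(nul), len(literal))

	// A name with a real NUL byte matches; a name with the two visible
	// characters backslash and '0' (as created by `touch \\0`) does not.
	fmt.Println(containsNUL("a\x00b"), containsNUL(`a\0b`))
}
```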

internal/util/util.go (outdated; resolved review thread)
// a/./ as a/, a//b/ as a/b/. 'a/\00/b' is not a valid substring file/directory
// in any unix file/directory name. GCSFuse simulates the same behaviour
// by ignoring the GCS objects which have these specially reserved/unsupported unix names/substrings.
logger.Warnf("Ignoring unsupported object-prefix (implicit-directory): \"%s\"", attrs.Prefix)

Collaborator

Are we sure that explicit directories cannot be created with these names? I believe \0 is a valid explicit directory on GCS.
If so, we might want to remove implicit-directory keyword from logs


var (
UnsupportedObjectNameSubstrings = []string{"//", "/./", "/../", "\000"}
UnsupportedDirectoryNamePrefixes = []string{"/", "./", "../"}

Collaborator

Shouldn't \0 also be here?

var err error

// Set up contents.
AssertEq(

Collaborator

Let's use stretchr/testify package instead of "github.com/jacobsa/oglematchers" and "github.com/jacobsa/ogletest"


// ReadDirPicky on mountdir/foo should fail as the unsupported sub-directories in that can not be read.
_, err = fusetesting.ReadDirPicky(path.Join(mntDir, "foo"))
AssertNe(nil, err)

Collaborator

Isn't readDir supposed to silently ignore unsupported directories?

Collaborator Author

> Isn't readDir supposed to silently ignore unsupported directories?

ReadDir is, but ReadDirPicky isn't.

Collaborator

Can we add another test with WalkDirPath and validate that all the supported directories are present?
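
One way such a walk-based check might be shaped is sketched below, using only the Go standard library. This is an assumption about what the requested test could look like, not the test added in this PR: it uses `filepath.WalkDir` in place of the `WalkDirPath` mentioned above, walks a temporary directory instead of a GCSFuse mount, and the helper names `listDirs` and `walkTempTree` are hypothetical.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// listDirs (hypothetical) walks root and returns the slash-separated paths,
// relative to root, of every directory below it, in lexical walk order.
func listDirs(root string) ([]string, error) {
	var dirs []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if !d.IsDir() || p == root {
			return nil
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		dirs = append(dirs, filepath.ToSlash(rel))
		return nil
	})
	return dirs, err
}

// walkTempTree (hypothetical) builds a small directory tree in a temporary
// location, standing in for a mounted bucket, and lists its directories.
func walkTempTree() ([]string, error) {
	root, err := os.MkdirTemp("", "walkdir-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	for _, dir := range []string{"foo/bar", "baz"} {
		if err := os.MkdirAll(filepath.Join(root, filepath.FromSlash(dir)), 0o755); err != nil {
			return nil, err
		}
	}
	return listDirs(root)
}

func main() {
	dirs, err := walkTempTree()
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println(dirs)
}
```

In a real integration test the walk would start at the mount point, and the expected list would contain only the supported directories created in the bucket.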

@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from fbbc0b1 to dc55c22 Compare May 9, 2024 07:24
GCS supports object paths such as
<bucket>/<path1>//<path2>
and treats them as different from
<bucket>/<path1>/<path2>.
GCSFuse, being a POSIX-compliant filesystem,
can support only the latter and
throws an error on the former. This error can
prevent the listing of all directories in the
parent directory, i.e. <path1>.
This change skips listed prefixes
(directory names) that are empty, such
as "/" above, to avoid the error, and logs
each skipped prefix as a warning.

Similarly, GCS supports <bucket>/<path1>/./<path2>
and <bucket>/<path1>/../<path2>, but "." and ".."
have reserved meanings in a Linux filesystem,
so these two prefixes are skipped as well.

Similarly, a Linux filesystem does not support
'\0' in names of files or directories, while it
is supported in GCS object names, so names
containing it are also skipped in the GCS list call output.
Unsupported names are directory names
containing '.', '..', '/' or '\0'.
These tests check that such objects don't show
up in the list of prefixes returned by
bucketHandle.ListObjects.
This adds tests for unsupported directories
(directories which are named '', '.'
or '..', or which have '//', '/./' or '/../' in their
full path in the bucket).
* un-export unnecessarily exported internal constants
* update comment
* fix a log
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from dc55c22 to 1d76261 Compare May 31, 2024 06:35
Labels
execute-integration-tests Run only integration tests

2 participants