Introduce support of GCS encryption for both CMEK and CSEK #7608

Closed

@emulatorchen emulatorchen commented Mar 27, 2024

Closes #(Insert issue number closed by this PR)

Change Description

Background

Add support for GCS encryption with both CMEK (customer-managed encryption keys) and CSEK (customer-supplied encryption keys).

New Feature

Issue link: #7557

  • CMEK configuration is supported
  • CSEK configuration is supported
  • Generating a PreSignedURL for a GCS object fails with an error when CSEK is configured, because the user would need the key from the configuration file to use the URL
    • A frontend change is required
    • If the user must hold the configured key, CSEK makes little sense for PreSignedURL

Testing Details

How were the changes tested?

  • CMEK and CSEK cannot be configured at the same time; if both are set, the server fails to start
  • The server fails to start if the CSEK is not a valid 32-byte AES-256 key
  • Reading a CSEK-encrypted object fails when CSEK is not configured or the key is wrong
  • Reading a CMEK-encrypted object fails when CMEK is not configured or the key is wrong
  • Uploading a file to a CMEK-enabled bucket with CMEK configured succeeds
  • Checked the content of the CMEK-encrypted file with CMEK configured
  • Uploading a file with CSEK configured succeeds
  • Checked the content of the CSEK-encrypted file with CSEK configured
  • When CSEK is configured and PreSignedURL is enabled, generating a PreSignedURL for a CSEK-encrypted object fails
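
For anyone reproducing the CSEK tests, a key of the right shape can be produced with a few lines of Go. This is only a sketch: newHexKey is a hypothetical helper, and the hex encoding is an assumption based on the hex.DecodeString call in the config code under review.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// aesKeyLength is the size in bytes of an AES-256 key.
const aesKeyLength = 32

// newHexKey returns a random AES-256 key encoded as a hex string.
// Hypothetical helper; the hex format is assumed from the PR's config parsing.
func newHexKey() (string, error) {
	key := make([]byte, aesKeyLength)
	if _, err := rand.Read(key); err != nil {
		return "", fmt.Errorf("generate key: %w", err)
	}
	return hex.EncodeToString(key), nil
}

func main() {
	k, err := newHexKey()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print only the length; the key itself is a secret.
	fmt.Println("hex key length:", len(k))
}
```

A key that is shorter, longer, or not valid hex should trigger the startup failure described in the list above.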

Breaking Change?

Does this change break any existing functionality? (API, CLI, Clients)

No breaking change: no API has changed, and only server configuration is required.

Additional info

Logs, outputs, screenshots of changes if applicable (CLI / GUI changes)

Contact Details

By GitHub account

CLAassistant commented Mar 27, 2024

CLA assistant check
All committers have signed the CLA.

@Jonathan-Rosenberg Jonathan-Rosenberg added the include-changelog label Apr 8, 2024
@Jonathan-Rosenberg Jonathan-Rosenberg linked an issue Apr 8, 2024 that may be closed by this pull request
@talSofer talSofer added the P3 label Apr 10, 2024
@itaiad200
Contributor

Hey @emulatorchen, thank you for your contribution! I should have reviewed sooner; it took me a bit longer since I needed to ramp up my GCP/KMS knowledge. @Jonathan-Rosenberg and I were debating some CSEK/CMEK questions. As neither of us is very familiar with GCP, I wonder if you could help us answer the questions below:

  1. What's the flow for key rotation in either of the two settings? We're obviously very worried about user data becoming inaccessible due to a rotation, but it's not just data. lakeFS stores immutable committed metadata in the object store using the same gs adapter we're modifying, so a rotation that goes wrong could break lakeFS completely.
  2. Can the user set CSEK/CMEK after using lakeFS for some time? Will previously saved objects remain accessible?
  3. Would you say it's a lakeFS-level setting? From what we understand, it's common for different gs buckets to have different encryption keys. A single lakeFS installation can manage multiple repositories across different buckets. What additional value does lakeFS give the user? It seems like we're forcing a single encryption key for all the repos. Isn't the user better off setting this at the bucket level and keeping lakeFS unaware of encryption (again, not sure if that's possible)?

@emulatorchen
Author

emulatorchen commented Apr 18, 2024

Really appreciate the review! I'm honored to have these critical questions raised.

  1. In short, rotating a key in KMS does not affect existing objects:
    a. CMEK (available only at the bucket level)
      1. Decryption uses the key version that the object was encrypted with
      2. Moving an object to the latest key version requires making a new copy of the object; to my understanding this is the same as in S3
    b. CSEK (available only at the bucket level)
      1. As with CMEK, changing the key requires making a new copy of the object
  2. As explained above, anything that happens at the KMS level is fine. But if the key path or the key value must change, the only way is to copy each object to a new object, and that is not possible with the existing implementation. So for CMEK, automatic and manual rotation within KMS still work. Changing the key path in CMEK (which may still be possible as long as the permissions and the default key have been set) or the key value in CSEK is not supported, and the old objects will no longer be accessible.
  3. Yes, and we had thought about this before. There are at least two perspectives on encryption: internal service and multi-tenant.
    • Internal service: all users are internal and most likely in the same team, so the point is to ensure the data is encrypted and access is granted only to specific accounts
    • Multi-tenant: users are mostly external and would like to manage their own encryption
    To be honest, we are closer to a hybrid. So ideally we would like a different key setup for each repository.

I made this a lakeFS-level setting because there is already an S3 precedent at the lakeFS level, which I thought would make it easier for you to accept, so that we can meet the internal need first. After that I would like to look for a possible multi-tenant solution. Both should be able to exist in lakeFS; they just cannot be configured at the same time. But we all know that will be more complicated, especially since lakeFS provides a universal UI/API/CLI over different storage options. Different keys require not just the permissions but also key management in lakeFS itself. Otherwise we may hit issues with CSEK (e.g. not being able to preview in the UI, which is similar to the reason I dropped PreSignedURL support when CSEK is enabled) or when the bucket's key is not the default key of the project.

For now I am implementing CSEK just to provide an option; we will focus more on CMEK. The reason is that CMEK can be enabled with almost no implementation in lakeFS, as long as the key is the default key of the project and access is properly authenticated. CSEK, on the other hand, gives the user more control when a risk materializes. So the implementation targets the case where we use CMEK as an internal protection for now, and I am happy to remove CSEK if you think it's risky.

Contributor

@Jonathan-Rosenberg Jonathan-Rosenberg left a comment

Thank you for the detailed response and the PR!
I have added some comments, so let's get through them and continue on.

Comment on lines 140 to 145
if params.ServerSideEncryptionCustomerSupplied != nil {
opts = append(opts, gs.WithServerSideEncryptionCustomerSupplied(params.ServerSideEncryptionCustomerSupplied))
}
if params.ServerSideEncryptionKmsKeyID != "" {
opts = append(opts, gs.WithServerSideEncryptionKmsKeyID(params.ServerSideEncryptionKmsKeyID))
}
Contributor

Please use a switch with no condition here; it is more idiomatic and emphasizes that only one config type can be set.
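
One way to read this suggestion is a switch with no condition in which each case is one encryption mode. The sketch below is an illustration only: gsParams and encryptionMode are hypothetical names rather than the PR's code, and rejecting the case where both keys are set is an assumption taken from the testing notes in the PR description.

```go
package main

import (
	"errors"
	"fmt"
)

// gsParams is a hypothetical stand-in for the GS block parameters.
type gsParams struct {
	CustomerSuppliedKey []byte // CSEK: raw AES-256 key bytes
	KmsKeyID            string // CMEK: KMS key resource name
}

var errBothKeys = errors.New("CSEK and CMEK cannot both be configured")

// encryptionMode picks exactly one encryption mode for the given parameters.
// The switch has no condition, so the first case that evaluates to true wins.
func encryptionMode(p gsParams) (string, error) {
	switch {
	case p.CustomerSuppliedKey != nil && p.KmsKeyID != "":
		return "", errBothKeys
	case p.CustomerSuppliedKey != nil:
		return "csek", nil
	case p.KmsKeyID != "":
		return "cmek", nil
	default:
		return "none", nil
	}
}

func main() {
	mode, err := encryptionMode(gsParams{KmsKeyID: "example-key-id"})
	fmt.Println(mode, err)
}
```

Compared with two independent if statements, the switch makes it visible at a glance that at most one branch runs.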

if c.Blockstore.GS.ServerSideEncryptionCustomerSupplied != "" {
v, err := hex.DecodeString(c.Blockstore.GS.ServerSideEncryptionCustomerSupplied)
if err != nil || len(v) != GcpAESKeyLength {
logging.ContextUnavailable().WithError(err).Fatalf("Value of customer-supplied server side encryption is not a valid %d bytes AES key", GcpAESKeyLength)
Contributor

Please return an error instead. Logging is handled by a middleware, and the failure will still be fatal when the block parameters aren't returned properly (i.e. when an error is returned).

}
customerSuppliedKey = v
if c.Blockstore.GS.ServerSideEncryptionKmsKeyID != "" {
logging.ContextUnavailable().Fatal("Setting both kms and customer supplied encryption will result failure when reading/writing object")
Contributor

same
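
The two comments above could be addressed along these lines. This is a sketch under assumptions, not the PR's final code: parseCustomerSuppliedKey and the error variables are hypothetical names, and the caller is assumed to propagate the returned error instead of logging fatally.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// gcpAESKeyLength is the size in bytes of an AES-256 key.
const gcpAESKeyLength = 32

var (
	errInvalidKey = errors.New("customer-supplied key is not a valid AES-256 key")
	errBothSet    = errors.New("KMS key ID and customer-supplied key cannot both be set")
)

// parseCustomerSuppliedKey decodes a hex-encoded CSEK and validates its length.
// It returns (nil, nil) when no CSEK is configured. Hypothetical helper.
func parseCustomerSuppliedKey(hexKey, kmsKeyID string) ([]byte, error) {
	if hexKey == "" {
		return nil, nil
	}
	if kmsKeyID != "" {
		return nil, errBothSet
	}
	key, err := hex.DecodeString(hexKey)
	if err != nil {
		return nil, fmt.Errorf("%w: %v", errInvalidKey, err)
	}
	if len(key) != gcpAESKeyLength {
		return nil, fmt.Errorf("%w: got %d bytes, want %d", errInvalidKey, len(key), gcpAESKeyLength)
	}
	return key, nil
}

func main() {
	_, err := parseCustomerSuppliedKey("abcd", "")
	fmt.Println(err)
}
```

Wrapping errInvalidKey with %w lets callers test for it with errors.Is while the message still carries the detail.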

@@ -486,17 +489,35 @@ func (c *Config) BlockstoreLocalParams() (blockparams.Local, error) {
return params, nil
}

const (
GcpAESKeyLength = 32
Contributor

Suggested change
- GcpAESKeyLength = 32
+ gcpAESKeyLength = 32

Comment on lines 105 to 110
Ideally we assume all the objects should be encrypted with AES key at the very beginning
But it can also be the case that some of existing objects are created before the key is introduced
So we determined the object and see if it's encrypted. But we don't support multiple keys for
individual objects, so it will generate the error when an improper key is supplied.

We don't check the key from KMS as it will be decrypted as long as the service account with proper permissions
Contributor

Suggested change
- Ideally we assume all the objects should be encrypted with AES key at the very beginning
- But it can also be the case that some of existing objects are created before the key is introduced
- So we determined the object and see if it's encrypted. But we don't support multiple keys for
- individual objects, so it will generate the error when an improper key is supplied.
- We don't check the key from KMS as it will be decrypted as long as the service account with proper permissions
+ prepareEncryptedReadHandle will assume that all of the objects are encrypted with an AES key at the very beginning.
+ It may be that some existing objects were created before the key was introduced, so objects will be examined and checked for encryption.
+ Multiple keys for individual objects aren't supported! An error will be generated if an improper key is supplied.
+ Keys are not validated in KMS. You should use a proper service account with permissions to access them.

h := (*storage.ObjectHandle)(o)
att, err := h.Attrs(ctx)
if err == nil {
a.log(ctx).Debug("Object has attribute customerKeySHA256 means it is encrypted by Customer-Supplied key: ", att.CustomerKeySHA256)
Contributor

I think this log should be inside the if clause below.
Also, change to lower case (Object -> object)

}
}
// Assume no decryption needed when attrs is not found
a.log(ctx).Infof("Object Attrs get error %s", err)
Contributor

Lower case here as well.
Please change this to debug level.

return h
}

type storageObjectHandle storage.ObjectHandle
Contributor

Please explain the reasoning behind this type. Why not use functions (instead of methods) that accept storage.ObjectHandle as parameters?

Contributor

@Jonathan-Rosenberg Jonathan-Rosenberg left a comment

Thank you once more!
Since you introduce a new type that wraps another, the methods and functions should use this type instead of the wrapped one. I gave you some comments regarding that, but in general, since the methods of the wrapped struct (storage.ObjectHandle) can be called over the wrapping type, you can use the wrapping type only. This will create consistency and a single way of using the methods.
Other than that LGTM!
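
The point about calling the wrapped type's methods through the wrapper relies on Go struct embedding. The sketch below illustrates it with a hypothetical objectHandle that stands in for storage.ObjectHandle, so it runs without the GCS client; wrappedHandle and withKey are illustrative names, not the PR's code.

```go
package main

import "fmt"

// objectHandle is a hypothetical stand-in for storage.ObjectHandle.
type objectHandle struct {
	name string
	key  []byte
}

// Key returns a copy of the handle that uses the given encryption key,
// mirroring the copy-returning style of storage.ObjectHandle.Key.
func (h *objectHandle) Key(k []byte) *objectHandle {
	c := *h
	c.key = k
	return &c
}

// Name returns the object name.
func (h *objectHandle) Name() string { return h.name }

// wrappedHandle embeds *objectHandle, so Key and Name are promoted
// and can be called directly on a *wrappedHandle.
type wrappedHandle struct {
	*objectHandle
}

// withKey swaps the embedded handle for a keyed one and returns the wrapper,
// so callers only ever deal with the wrapping type.
func (w *wrappedHandle) withKey(k []byte) *wrappedHandle {
	if k != nil {
		w.objectHandle = w.Key(k)
	}
	return w
}

func main() {
	w := &wrappedHandle{&objectHandle{name: "obj"}}
	w = w.withKey([]byte("secret"))
	fmt.Println(w.Name(), len(w.key))
}
```

Note that promotion only applies to an embedded field. A defined type such as `type storageObjectHandle storage.ObjectHandle` does not inherit the methods of the underlying type, which is why a conversion is needed in that form.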

Comment on lines 104 to 109
/*
Ideally we assume all the objects should be encrypted with AES key at the very beginning
It may be that some existing objects were created before the key is was introduced, so objects will be examined and checked for encryption.
Multiple keys for individual objects aren't supported! An error will be generated if an improper key is supplied.
Keys in KMS will not be validated. You should use a proper service account with permissions to access them.
*/
Contributor

Comments on methods/functions in Go begin with the method/function's name. Please rephrase this one accordingly.
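
As an illustration of that convention, a doc comment can open with the identifier it documents and read as a sentence about it. The names below (objectAttrs, needsCustomerKey) are hypothetical; the condition mirrors the check on CustomerKeySHA256 in the code under review.

```go
package main

import "fmt"

// objectAttrs is a hypothetical stand-in for storage.ObjectAttrs.
type objectAttrs struct {
	CustomerKeySHA256 string
}

// needsCustomerKey reports whether an object must be read with the
// customer-supplied key: a key has to be configured and the object's
// attributes have to carry a CustomerKeySHA256 value.
func needsCustomerKey(configuredKey []byte, attrs objectAttrs) bool {
	return configuredKey != nil && attrs.CustomerKeySHA256 != ""
}

func main() {
	attrs := objectAttrs{CustomerKeySHA256: "abc"}
	fmt.Println(needsCustomerKey([]byte("key"), attrs))
}
```

Tools such as go doc show these comments next to the identifier, which is why the name comes first.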

return c
}

func (o *storageObjectHandle) newComposer(a *Adapter, srcs ...*storage.ObjectHandle) (*storage.ObjectHandle, *storage.Composer) {
Contributor

Just as you did with newCopier, please change newComposer and newWriter to return pointers to the structs they create.


func (o *storageObjectHandle) prepareWriteHandle(a *Adapter) *storageObjectHandle {
if a.ServerSideEncryptionCustomerSupplied != nil {
return &storageObjectHandle{o.Key(a.ServerSideEncryptionCustomerSupplied)}
Contributor

Suggested change
- return &storageObjectHandle{o.Key(a.ServerSideEncryptionCustomerSupplied)}
+ o.ObjectHandle = o.Key(a.ServerSideEncryptionCustomerSupplied)

*storage.ObjectHandle
}

func (o *storageObjectHandle) prepareWriteHandle(a *Adapter) *storageObjectHandle {
Contributor

This is a classic case for the with prefix:

Suggested change
- func (o *storageObjectHandle) prepareWriteHandle(a *Adapter) *storageObjectHandle {
+ func (o *storageObjectHandle) withWriteHandle(a *Adapter) *storageObjectHandle {

Multiple keys for individual objects aren't supported! An error will be generated if an improper key is supplied.
Keys in KMS will not be validated. You should use a proper service account with permissions to access them.
*/
func (o *storageObjectHandle) prepareReadHandle(ctx context.Context, a *Adapter) *storage.ObjectHandle {
Contributor

prepareReadHandle and prepareWriteHandle should return the same type for consistency.
Please return a *storageObjectHandle in both.

if err == nil {
a.log(ctx).Debug("object has attribute customerKeySHA256 means it is encrypted by Customer-Supplied key: ", att.CustomerKeySHA256)
if a.ServerSideEncryptionCustomerSupplied != nil && att.CustomerKeySHA256 != "" {
return o.Key(a.ServerSideEncryptionCustomerSupplied)
Contributor

Suggested change
- return o.Key(a.ServerSideEncryptionCustomerSupplied)
+ o.ObjectHandle = o.Key(a.ServerSideEncryptionCustomerSupplied)

}
// Assume no decryption needed when attrs is not found
a.log(ctx).Debugf("object Attrs get error %w", err)
return o.ObjectHandle
Contributor

Suggested change
- return o.ObjectHandle
+ return o

@Jonathan-Rosenberg Jonathan-Rosenberg added the no stale label May 25, 2024
Contributor

@Jonathan-Rosenberg Jonathan-Rosenberg left a comment

Thank you @emulatorchen.
Your contribution is highly appreciated 🙌
We should also add docs to https://github.com/treeverse/lakeFS/blob/master/docs/howto/deploy/gcp.md, but this could be done in a different PR.

@Jonathan-Rosenberg
Contributor

@emulatorchen you would need to sign the CLA in order to contribute...

emulatorchen and others added 14 commits May 27, 2024 18:46
…everse#7609)

* Use existing 6MiB file for S3 multipart copy test on Azure ADLS2

Fixes treeverse#7485: it just takes too long to upload some _new_ data there.

* [bug] Use correct path for existing 6 MiB object on ADLS gen2
* Configurable persistence of friendly name to KV
* Update docs and sample config with new property
* [Docs] Improve consent visibility
* Optimize resource loading for a better user experience (LCP)
* Bump ubuntu version for actions

* Fix changes

* test

* test2

* test3

* test4

* Good for now

---------

Co-authored-by: Nir Ozery <nir.ozery@treeverse.io>
N-o-Z and others added 26 commits May 27, 2024 18:47
…reeverse#7708)

* Document the correct Spark client version (0.13.0, maybe assembled)

* [CR] [bug] Fix tab heading

Tab headings don't support `` `...` `` so don't use that there.  Also add
words how to use the assembled JAR with spark-shell and friends.

* [bug] Avoid indentation in <div>-tabs notation
* make sure creds are refreshed before expiry

* small fix

* small fix

* small fix
* Task: Create Rust API Client

* Add linguist-generated

* CR Fixes
* lakectl - Support Retries
* Remove linked addresses from KV

* CR Fixes

* Add changelog

* Change delimiter and encoding

* More fixes

* More fixes 2

* More More Fixes

* Update esti/lakectl_util.go

Co-authored-by: Ariel Shaqed (Scolnicov) <ariels@treeverse.io>

* Fix azure regex

---------

Co-authored-by: Ariel Shaqed (Scolnicov) <ariels@treeverse.io>
* Add a section about the Cloud Scalability Model

* Update lakefs-cloud.md

* Apply suggestions from code review

Co-authored-by: N-o-Z <ozery.nir@gmail.com>

* Update lakefs-cloud.md

---------

Co-authored-by: N-o-Z <ozery.nir@gmail.com>
* Organize Enterprise docs

* PR fixes

* Fix Enterprise docs

* idp
* Update lakectl abuse to use set/link physical address

* CR Fixes

* Fix test
* Update CHANGELOG.md

* Update CHANGELOG.md

Co-authored-by: itaiad200 <itaiad200@gmail.com>

---------

Co-authored-by: itaiad200 <itaiad200@gmail.com>
* Support presigned image URLs in markdown
* Migrate from react-markdown to rehype-react + remark-rehype
* Fix background staging token behaviour

* fix
* Change MaxConnectionsPerHost DDB

* Change config param name
* Add Mirroring to the list of lakeFS Cloud features

* add deprecation notice for Unity Delta sharing
@emulatorchen
Author

I think I just screwed up this PR when I tried to fix the email on my commits, so I will create another PR for it.

Labels
contributor, feature-request, include-changelog, no stale, P3
Projects
None yet
Development

Successfully merging this pull request may close these issues.

KMS encryption support in GCS