Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[skip ci] Add connectorTestSuitesOptions to metadata #38188

Conversation

alafanechere
Copy link
Contributor

@alafanechere alafanechere commented May 14, 2024

What

Closes https://github.com/airbytehq/airbyte-internal-issues/issues/7549
Closes https://github.com/airbytehq/airbyte-internal-issues/issues/7550

This PR seeds the connectortTestSuitesOptions new metadata field on all our connectors

How

I generated the values with an ad-hoc airbyte-ci command (that might not be merged).
I checked test suite existence in the following manner:

For python connectors

A connector has unit tests or integration tests if these test folders have files with the test substring in it. This is how pytest discovers tests/

For java connectors

If a connector has src/test I consider they have unit test. If a connector has src/test-integration I consider that have integration test

I considered a connector has acceptanceTest if they have a file named acceptance-test-config.yml in their folder.

Secret validation

I pulled secrets from GSM and mapped a secret to a connector thanks to the connector secret label.
I ran a check with the secret config against a connector to validate that the secret configuration are valid.
Invalid secrets are not declared in the metadata files.

User Impact

None as it field it not yet used.
They'll be used when work on the following issues will be done:
https://github.com/airbytehq/airbyte-internal-issues/issues/7551
https://github.com/airbytehq/airbyte-internal-issues/issues/7552

Safe to merge without CI?

Yes I think so. I don't want to run CI on all our connectors as it will not pass on a lot of connectors.
This field introduction is not changing the connector behavior and I manually tested that the metadata files are valid.

@bnchrch merging this will trigger metadata upload of 335 connectors. Just warning you that we might observe an heavy load on Dagster.

Copy link

vercel bot commented May 14, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview May 15, 2024 1:01pm

@octavia-squidington-iii octavia-squidington-iii added the area/connectors Connector related issues label May 14, 2024
@alafanechere alafanechere force-pushed the augustin/declare-connector-test-suites-options-in-metadata branch from 9bdacfb to 8619f3d Compare May 15, 2024 12:51
@alafanechere alafanechere marked this pull request as ready for review May 15, 2024 12:55
@alafanechere alafanechere requested review from a team as code owners May 15, 2024 12:55
@alafanechere alafanechere requested a review from bnchrch May 15, 2024 12:57
@alafanechere alafanechere force-pushed the augustin/declare-connector-test-suites-options-in-metadata branch from 8619f3d to 8229cef Compare May 15, 2024 13:01
Copy link
Contributor

@bnchrch bnchrch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments.

The major one being how many secrets we have. Which feels incorrect?

Let me know your thoughts there before you merge.

Otherwise I'm going to approve now so as not to be a blocker

the next scheduled sync to start the upgrade early.

"
message: "**Do not upgrade until you have run a test upgrade as outlined [here](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2/#testing-destinations-v2-for-a-single-connection)**.\nThis version introduces [Destinations V2](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2/#what-is-destinations-v2), which provides better error handling, incremental delivery of data for large syncs, and improved final table structures. To review the breaking changes, and how to upgrade, see [here](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2/#quick-start-to-upgrading). These changes will likely require updates to downstream dbt / SQL models, which we walk through [here](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2/#updating-downstream-transformations).\nSelecting `Upgrade` will upgrade **all** connections using this destination at their next sync. You can manually sync existing connections prior to the next scheduled sync to start the upgrade early.\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❗ were risking that the platform doesnt know how to parse the newlines.

I think thats fine though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatively, define this message field value as a text block, c.f. https://yaml-multiline.info/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was hard to modify yaml and keep ordering, format and comments.
I used ruamel and could not find a way to not make it wrap stuff and add new lines...

@@ -50,4 +36,113 @@ data:
supportsDbt: true
tags:
- language:java
connectorTestSuitesOptions:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ Wow. This is more individual secrets than I expected. Are we sure they are all actually used? Did we cross reference them with the integration test file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnchrch Sort of... For instance: https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-snowflake/src/test-integration/java/io/airbyte/integrations/destination/snowflake/typing_deduping/SnowflakeInternalStagingCaseInsensitiveTypingDedupingTest.java#18

Some other are referenced in docs.
If they exists there it means that in any case:

  • They pass the check call
  • They are already fetched and mounted for tests as ci_credentials would have fetched them according to the secret label

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong but this seems like something for the destinations team to worry about. FWIW the number of secrets seems plausible to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

number of secrets is a bit crazy, I'll admit. Not sure why that is.
Do we have doc around the meaning of the alias on a testSecret field? I'm not clear if we modify the way we should add new secrets to a connector either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephane-airbyte

number of secrets is a bit crazy, I'll admit. Not sure why that is.

These are all the secrets declared for this connector on GSM at the moment... You can remove unused one from the list safely if they're not used in your tests.

Do we have doc around the meaning of the alias on a testSecret field?

Not yet, it's coming. The alias is just a way to not expose the GCP project id where the secret store is. A mapping between aliases and project id will be kept in a GHA secret.

The slight change in the way we add secret to a connector is that following the creation of a secret in GSM you'd have to update the metadata file to declare a pointer to this secret so that airbyte-ci will fetch them .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mostly asking because I have a PR in flight that adds a new secret to destination-s3: #38204
The secrets are already in GSM, so you can add them in this PR as well

Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend making all message fields multiline text blocks, instead of escaping newlines.

the next scheduled sync to start the upgrade early.

"
message: "**Do not upgrade until you have run a test upgrade as outlined [here](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2/#testing-destinations-v2-for-a-single-connection)**.\nThis version introduces [Destinations V2](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2/#what-is-destinations-v2), which provides better error handling, incremental delivery of data for large syncs, and improved final table structures. To review the breaking changes, and how to upgrade, see [here](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2/#quick-start-to-upgrading). These changes will likely require updates to downstream dbt / SQL models, which we walk through [here](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2/#updating-downstream-transformations).\nSelecting `Upgrade` will upgrade **all** connections using this destination at their next sync. You can manually sync existing connections prior to the next scheduled sync to start the upgrade early.\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatively, define this message field value as a text block, c.f. https://yaml-multiline.info/

@@ -50,4 +36,113 @@ data:
supportsDbt: true
tags:
- language:java
connectorTestSuitesOptions:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong but this seems like something for the destinations team to worry about. FWIW the number of secrets seems plausible to me.

Copy link
Collaborator

@bazarnov bazarnov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, i'd like to highlight the fact, that some of the CREDS in GSM are either outdated or not used anymore, it's hard to say confidently which ones are fine to remove, it requires manual check for each source and know what secrets exactly is pulled/pulled and override the previously pulled one, etc

Let's merge, so this is not blocked on us, but when any source or destination is under future maintenance - the engineer should be able to validate the secrets pulled and clean the broken ones / duplicated once from the metadata.yaml

alias: airbyte-connector-testing-secret-store
- suite: acceptanceTests
testSecrets:
- name: SECRET_SOURCE-SALESFORCE_BULK_CREDS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used anywhere for the source-salesforce

secretStore:
type: GSM
alias: airbyte-connector-testing-secret-store
- name: SECRET_SOURCE-SALESFORCE_REST_CREDS
Copy link
Collaborator

@bazarnov bazarnov May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In CAT, the config.json and config_sandbox.json are used and fetched from GSM, other ones - are useless, as far as I can see.

- suite: unitTests
- suite: integrationTests
testSecrets:
- name: SECRET_SOURCE-SALESFORCE_BULK_CREDS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used.

secretStore:
type: GSM
alias: airbyte-connector-testing-secret-store
- name: SECRET_SOURCE-HUBSPOT_OAUTH_NO_START_DATE_CREDS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 2 cases for config_oauth_no_start_date.json, but we should probably include only 1 of them. I'm not sure which exactly. Maybe @maxi297 or @strosek could advise here?

@alafanechere
Copy link
Contributor Author

Looks good to me, i'd like to highlight the fact, that some of the CREDS in GSM are either outdated or not used anymore, it's hard to say confidently which ones are fine to remove, it requires manual check for each source and know what secrets exactly is pulled/pulled and override the previously pulled one, etc

Let's merge, so this is not blocked on us, but when any source or destination is under future maintenance - the engineer should be able to validate the secrets pulled and clean the broken ones / duplicated once from the metadata.yaml

💯 agree @bazarnov The single thing I know for sure is that any config listed here are passing the check command, but it does not mean it's actually use.

@alafanechere
Copy link
Contributor Author

/approve-and-merge reason="bypassing CI as it's a metedata field introduction touching all connectors"

@octavia-approvington
Copy link
Contributor

Myomoto says it looks good
thats a niiice

@octavia-approvington octavia-approvington merged commit 31c95da into master May 15, 2024
25 of 26 checks passed
@octavia-approvington octavia-approvington deleted the augustin/declare-connector-test-suites-options-in-metadata branch May 15, 2024 20:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants