Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Feature/azure service principal #5765

Open
wants to merge 7 commits into
base: develop
Choose a base branch
from

Conversation

FrsECM
Copy link

@FrsECM FrsECM commented Apr 22, 2024

PR fulfills these requirements

  • Commit message(s) and PR title follows the format [fix|feat|ci|chore|doc]: TICKET-ID: Short description of change made ex. fix: DEV-XXXX: Removed inconsistent code usage causing intermittent errors
  • Tests for the changes have been added/updated (for bug fixes/features)
  • Docs have been added/updated (for bug fixes/features)
  • Best efforts were made to ensure docs/code are concise and coherent (checked for spelling/grammatical errors, commented out code, debug logs etc.)
  • Self-reviewed and ran all changes on a local instance (for bug fixes/features)

Change has impacts in these area(s)

  • Product design
  • Backend (Database)
  • Backend (API)
  • Frontend

Describe the reason for change

For the moment, there is only an authentication through account key to Azure.
The problem is that account key give a lot of rights on the storage that may not be necessary in order to perform readonly operations.

What does this fix?

Now it is possible to have an azure integration that is based on a service principal.

What is the new behavior?

We have to create an app registration in Azure :

We give this registration specific rights depending on what we want to do in labelstudio :

In the UI, we create a new storage integration :

Under the hood :
image

What is the current behavior?

There is no impact on the previous behavior, it's just a new one.

What libraries were added/updated?

Azure Identity

Does this change affect performance?

Get a delegation key is quite slow, that's why there is :

  • 24h cache for delegation key
  • 15mn cache for sas token

There is a second minor change :

  • When a "thumbnail" tag is present in the datarow, it will be used in the grid view
  • When there is multiple layer in the image, the gridview will only render the first channel.
    (Today there is a buggy behavior).

Does this change affect security?

It should improve the security for people who just need specific rights on the container.
For the moment, the "write" behavior have not been tested / developped on the container.

What alternative approaches were there?

We can use an enterprise application and SSO in order to be linked to a user right and act on behalf of him.

What feature flags were used to cover this change?

None

Does this PR introduce a breaking change?

(check only one)

  • Yes, and covered entirely by feature flag(s)
  • Yes, and covered partially by feature flag(s)
  • No
  • Not sure (briefly explain the situation below)

What level of testing was included in the change?

  • e2e
  • integration
  • unit

Which logical domain(s) does this change affect?

  • Authentication
  • UI

Copy link

netlify bot commented Apr 22, 2024

👷 Deploy request for label-studio-docs-new-theme pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit a9b1e67

Copy link

netlify bot commented Apr 22, 2024

👷 Deploy request for heartex-docs pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit a9b1e67

@github-actions github-actions bot added the feat label Apr 22, 2024
@FrsECM FrsECM marked this pull request as ready for review April 22, 2024 15:48
Copy link
Member

@makseq makseq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add pytest for this storage? There should be examples in tests/ folder for aws/gcs storages.

@FrsECM FrsECM force-pushed the feature/azure-service-principal branch from 14e0ca5 to de5ad0d Compare April 23, 2024 07:44
@FrsECM
Copy link
Author

FrsECM commented Apr 23, 2024

I fixed the write access this morning.
You can now create an export integration :

Without the right right in Azure, you can't use

But if you add :
image

It will work and save the annotation :

@FrsECM
Copy link
Author

FrsECM commented Apr 23, 2024

Could you also add pytest for this storage? There should be examples in tests/ folder for aws/gcs storages.

Can you help me on this part ?
I'am not sure to understand how i'm suppose to create tests.
The test i did locally were done by the UI and on my organisation private azure storage. I'm not able to replicate this in a public env.

@FrsECM FrsECM requested a review from makseq May 3, 2024 14:08
@makseq
Copy link
Member

makseq commented May 23, 2024

Sorry for the long delay, tests are located here:
https://github.com/HumanSignal/label-studio/blob/develop/label_studio/tests/io_storages.tavern.yml#L1168

@FrsECM
Copy link
Author

FrsECM commented May 23, 2024

Sorry for the long delay, tests are located here: https://github.com/HumanSignal/label-studio/blob/develop/label_studio/tests/io_storages.tavern.yml#L1168

Thanks,

I updated the get_import_export_storage_types test case to match the new syntax.

I saw that other things for gcp and s3, but in my case i have a strong validation that the connection is valid and can be consume, then i can't use dummy client_id/client_secret/tenant_id.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants