Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add 'revoked' field to ThrottledApplication to enable easily revoking access to client applications violating openverse TOS #4334

Conversation

madewithkode
Copy link
Contributor

@madewithkode madewithkode commented May 15, 2024

Fixes

Fixes #4321 by @sarayourfriend

Description

Add 'revoked' field to ThrottledApplication to enable easily revoking access to client applications violating openverse TOS.

Testing Instructions

  • Create a ThrottledApplication on Django admin and go ahead to use the generated credentials to authenticate and make request(s) to the API. This should work as expected and return results.

  • Head back to your ThrottledApplication on Django admin, edit this application by checking the 'revoked' field and save. Try making requests to the API and you should get a 401 PermissionDenied.

Additionally, a unit test exists for this at api/test/integration/test_auth.py::test_revoked_application_access

Checklist

  • My pull request has a descriptive title (not a vague title likeUpdate index.md).
  • My pull request targets the default branch of the repository (main) or a parent feature branch.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.
  • I ran the DAG documentation generator (just catalog/generate-docs for catalog
    PRs) or the media properties generator (just catalog/generate-docs media-props
    for the catalog or just api/generate-docs for the API) where applicable.

Developer Certificate of Origin

Developer Certificate of Origin
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@openverse-bot openverse-bot added the 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work label May 15, 2024
@madewithkode
Copy link
Contributor Author

madewithkode commented May 15, 2024

Hi @sarayourfriend, this is my first attempt at navigating this issue, many thanks for the headstart! I was able to get the tests in test_auth.py to pass by making a simple tweak in the rate limiting test parameters. Again, this was more like a workaround as my current knowledge of how everything ties together is still pretty limited so feel free to point out what can be improved.
Also, I kept running into some weird recursion issues whenever I added the rate-limit-checking condition in conf/oauth2_extensions.py/OAuth2Authentication/authenticate() not sure why that was happening or how to fix yet.

Furthermore, I have a few questions as to how things work. First is, how are we able to get the new RevokedOAuth2IdRateThrottle to work without necessarily overriding allow_request to return False in the new class(this is my assumption though cos you didn’t explicitly point it out). Also, how are we able to apply this new rate limiting rule without also explicitly including it in the throttle_classes of the relevant views/viewsets, I recall you mentioned this can somehow be done dynamically through Django admin but essentially I’d like to know how it works under the hood.

@openverse-bot openverse-bot added 🟧 priority: high Stalls work on the project or its dependents 🧰 goal: internal improvement Improvement that benefits maintainers, not users 🕹 aspect: interface Concerns end-users' experience with the software 🏷 status: label work required Needs proper labelling before it can be worked on labels May 15, 2024
@sarayourfriend
Copy link
Contributor

Also, how are we able to apply this new rate limiting rule without also explicitly including it in the throttle_classes of the relevant views/viewsets, I recall you mentioned this can somehow be done dynamically through Django admin but essentially I’d like to know how it works under the hood.

The questions you're running into are good ones, and I think expose the fact that maybe a throttle isn't the right place to handle authorisation. The use of a throttle was a bit of a hack, and actually I'm realising that a new throttle class isn't necessary at all, because the revoked rate_limit_model would never actually allow any request at the start of the request and throttle classes would never run. That authentication class runs before throttling (which makes sense, because throttling depends on authentication to identify the rate limit).

For what it's worth, if we did need the throttle class to run, it would need to be added to the DEFAULT_THROTTLE_CLASSES tuple here:

DEFAULT_THROTTLE_CLASSES = (
"api.utils.throttle.BurstRateThrottle",
"api.utils.throttle.SustainedRateThrottle",
"api.utils.throttle.OpenverseReferrerBurstRateThrottle",
"api.utils.throttle.OpenverseReferrerSustainedRateThrottle",
"api.utils.throttle.OAuth2IdBurstRateThrottle",
"api.utils.throttle.OAuth2IdSustainedRateThrottle",
"api.utils.throttle.EnhancedOAuth2IdBurstRateThrottle",
"api.utils.throttle.EnhancedOAuth2IdSustainedRateThrottle",
"api.utils.throttle.ExemptOAuth2IdRateThrottle",
)

We don't define specific throttle classes on any routes except on a few routes that are more restrictive like thumbnails.

However, with your questions in mind, it really seems to me a throttle scope isn't the right choice, and a new boolean field on ThrottledApplication, checked in the same authentication method we were checking the rate limit model, would be clearer. There wouldn't be these dangling questions of "how does this throttle class get used" for a throttle class that indeed, never would.

I kept running into some weird recursion issues whenever I added the rate-limit-checking condition in conf/oauth2_extensions.py/OAuth2Authentication/authenticate() not sure why that was happening or how to fix yet.

This makes sense, I hadn't realised it would happen, but I see now that it's unavoidable with the exact conditional check I suggested.

The problem is that request.auth is a property that runs the authentication schemes for the application.

https://github.com/encode/django-rest-framework/blob/HEAD/rest_framework/request.py#L247-L256

https://github.com/encode/django-rest-framework/blob/HEAD/rest_framework/request.py#L377-L394

Those are the two code places to keep in mind.

So the authentication runs the first time some code accesses the request.auth property. The authentication routine includes calling into OAuth2Authentication::authenticate to do the authentication. But then in there, I suggested we access request.auth... This access request.auth in the middle (as it were) of request.auth already trying to resolve!

We can get around this by knowing that authenticate returns a tuple of user, auth, that each get assigned to the respective properties on request. So in the authenticate method, request.auth is result[1] would eventually be true. Instead of keeping result as an anonymous result, we can unpack the tuple knowing that the rest framework authentication API is stable. Then instead of request.auth.application, use auth.application.

diff --git a/api/conf/oauth2_extensions.py b/api/conf/oauth2_extensions.py
index 2ece1b497..4ebe41a45 100644
--- a/api/conf/oauth2_extensions.py
+++ b/api/conf/oauth2_extensions.py
@@ -11,7 +11,7 @@ class OAuth2Authentication(BaseOAuth2Authentication):
     keyword = "Bearer"
 
     def authenticate(self, request):
-        result = super().authenticate(request)
+        user, auth = super().authenticate(request)
         if getattr(request, "oauth2_error", None):
             # oauth2_error is only defined on requests that had errors
             # it will be undefined or empty for anonymous requests and
@@ -19,7 +19,11 @@ class OAuth2Authentication(BaseOAuth2Authentication):
             # `request` is mutated by `super().authenticate`
             raise AuthenticationFailed()
 
-        return result
+        if application := getattr(auth, "application", None):
+            if application.revoked:
+                raise PermissionDenied()
+
+        return user, auth
 
 
 class OAuth2OpenApiAuthenticationExtension(TokenScheme):

To summarise: your questions are excellent and expose a deep problem with my suggested implementation. Instead of a new throttle class, let's add a boolean revoked to ThrottledApplication, and check that in the authenticate method. I think the Django admin for ThrottledApplication will automatically allow making changes to revoked, but if not, here's where that model's admin is declared:

@admin.register(ThrottledApplication)
class ThrottledApplicationAdmin(admin.ModelAdmin):
search_fields = ("client_id", "name", "rate_limit_model")
list_display = ("client_id", "name", "created", "rate_limit_model")
ordering = ("-created",)
readonly_fields = (
"skip_authorization",
"verified",
"client_id",
"name",
"user",
"algorithm",
"redirect_uris",
"post_logout_redirect_uris",
"client_type",
"authorization_grant_type",
"client_secret",
)

Sorry for leading you astray with my implementation suggestion! Please let me know if this new approach makes sense, and thank you very much for asking these questions. I wouldn't have realised the problem with my suggested implementation otherwise.

@madewithkode
Copy link
Contributor Author

Hi @sarayourfriend, Once again, thank you for this detailed insights and explanations. Glad to hear my question helped make things more clearer. And yes, your new suggested approach is very clear and intuitive.

@madewithkode madewithkode force-pushed the 4321_add_revoked_throttle_scope_to_enable_easily_revoking_access_to_client_applications_violating_openverse_terms_of_service branch from ad8bdc1 to 47a402c Compare May 16, 2024 10:12
@madewithkode
Copy link
Contributor Author

Hi @sarayourfriend all suggested changes have been implemented and I can confirm that things work as expected locally. Would go ahead and undraft this PR.

@madewithkode madewithkode marked this pull request as ready for review May 16, 2024 10:14
@madewithkode madewithkode requested a review from a team as a code owner May 16, 2024 10:14
@sarayourfriend sarayourfriend added 🧱 stack: api Related to the Django API migrations Modifications to Django migrations and removed 🏷 status: label work required Needs proper labelling before it can be worked on 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work labels May 16, 2024
Copy link
Contributor

@sarayourfriend sarayourfriend left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome @madewithkode! This LGTM, fantastic work getting this turned around after my initial misguidance 🎉 🚀. Thank you very much for the contribution 🙏

if getattr(request, "oauth2_error", None):
# oauth2_error is only defined on requests that had errors
# it will be undefined or empty for anonymous requests and
# requests with valid credentials
# `request` is mutated by `super().authenticate`
raise AuthenticationFailed()
# if this is an authed request, check and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor nit-pick for extra whitespace to help visually break the function up.

Suggested change
# if this is an authed request, check and
# if this is an authed request, check and

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sarayourfriend Updated!

@madewithkode madewithkode force-pushed the 4321_add_revoked_throttle_scope_to_enable_easily_revoking_access_to_client_applications_violating_openverse_terms_of_service branch from 47a402c to 5d48b22 Compare May 17, 2024 08:51
@madewithkode madewithkode changed the title Add 'revoked' throttle scope to enable easily revoking access to client applications violating openverse TOS Add 'revoked' field to ThrottledApplication to enable easily revoking access to client applications violating openverse TOS May 20, 2024
Copy link
Member

@zackkrida zackkrida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, works great locally! @sarayourfriend I wanted to ask a question but feel free to merge this. Should we send a different response to the user other than

{
    "detail": "You do not have permission to perform this action."
}

I can see this response leading to folks generating new credentials. Maybe something like

{
    "detail": "Your registered application has been revoked for violating Openverse's Terms of Service. Learn more here: https://docs.openverse.org/terms_of_service.html."
}

@sarayourfriend
Copy link
Contributor

I can see this response leading to folks generating new credentials. Maybe something like

To be fair, either response would provoke a bad actor to do so; this is really just a signal of "we're aware of your behaviour" more than anything.

I don't think we necessarily need to give the reason why the request was forbidden, and we should discuss internally what the policy would be if someone asked us directly about a specific client application.

@sarayourfriend
Copy link
Contributor

sarayourfriend commented May 20, 2024

@madewithkode this just needs a rebase to resolve the conflict. I'll do that later during my workday today (it's 10 AM here for me) if you don't get to it first, then I'll merge afterwards.

Thanks again!

…#4364)

Co-authored-by: krysal <9145885+krysal@users.noreply.github.com>
@madewithkode madewithkode force-pushed the 4321_add_revoked_throttle_scope_to_enable_easily_revoking_access_to_client_applications_violating_openverse_terms_of_service branch from 5d48b22 to 8f35bf2 Compare May 21, 2024 09:18
@madewithkode madewithkode requested a review from a team as a code owner May 21, 2024 09:18
@madewithkode
Copy link
Contributor Author

@sarayourfriend rebased and pushed the changes.

@sarayourfriend sarayourfriend merged commit 896ca24 into WordPress:main May 22, 2024
45 of 46 checks passed
@sarayourfriend
Copy link
Contributor

Thanks! Sorry I wasn't able to get to it yesterday in the end. Merged 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🕹 aspect: interface Concerns end-users' experience with the software 🧰 goal: internal improvement Improvement that benefits maintainers, not users migrations Modifications to Django migrations 🟧 priority: high Stalls work on the project or its dependents 🧱 stack: api Related to the Django API
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Add ability to revoke access to specific Openverse API registered client applications
4 participants