
URL Safety #1961

Open · tghosth opened this issue May 16, 2024 · 20 comments

Labels: 1) Discussion ongoing (Issue is opened and assigned but no clear proposal yet), V5 (Temporary label for grouping input validation, sanitization, encoding, escaping related requirements), _5.0 - prep (This needs to be addressed to prepare 5.0)

Comments

@tghosth (Collaborator) commented May 16, 2024

This is a spinoff from #1589 with a focus on URLs.

My first question is: are we aiming this requirement at the dynamic generation of URLs or at the processing of URLs?

@jmanico

@tghosth added the 1) Discussion ongoing, _5.0 - prep and V5 labels on May 16, 2024
@elarlang (Collaborator) commented May 16, 2024

Related: #1590 (comment)

Also related:

> V5.5.5 [ADDED, MERGED FROM 13.1.1] Verify that different parsers used in the application for the same data type (e.g. JSON parsers, XML parsers, URL parsers), perform parsing in a consistent way and use the same character encoding mechanism to avoid issues such as JSON Interoperability vulnerabilities or different URI or file parsing behavior being exploited in Remote File Inclusion (RFI) or Server-side Request Forgery (SSRF) attacks.
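To illustrate the kind of mismatch this requirement is aimed at (the example below is mine, not from the thread, and uses Python's standard library purely for illustration): a simplistic URL check and a spec-compliant URL parser can disagree about which host a URL actually points to, which is exactly the sort of inconsistency that gets exploited in SSRF and open-redirect bypasses.

```python
# Illustrative sketch: two ways of "parsing" the same URL disagree.
# A naive substring check accepts it, while an RFC 3986 parser shows the
# request would actually go to evil.test (the "userinfo@" trick).
from urllib.parse import urlsplit

url = "https://good.example@evil.test/path"

naive_ok = "good.example" in url        # naive check: looks fine
real_host = urlsplit(url).hostname      # spec-compliant parse: 'evil.test'

print(naive_ok, real_host)              # True evil.test
```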

@jmanico (Member) commented May 16, 2024 via email

@jmanico (Member) commented May 16, 2024 via email

@elarlang (Collaborator) commented:

> I believe both. However, I believe it's not our job to set requirements for web servers in my opinion. I state with respect. I do believe we need to be aware of the differences between web servers.

vs

> One of my concerns around URL safety is that bad URL handling can lead to a variety of different major security issues. This includes XSS, SSRF and open redirects.

I think those two statements are in conflict: if a web server handles URLs incorrectly, it is exactly our goal to provide requirements for that and to point to the specs for how it must be done. It should not be the case that a program is safe on one web server but not on another just because they handle URLs differently.

The provided topics are quite vague to comment on. For example, how do "URL output encoding", "complete URL output encoding" and "safe complete URL output encoding" differ from the requirement we already have? Please provide example attacks or vulnerabilities that should be addressed.

@jmanico (Member) commented May 18, 2024 via email

@elarlang (Collaborator) commented:

If it behaves the way you have implemented it, it is done incorrectly. Recommended reading: https://datatracker.ietf.org/doc/html/rfc3986#section-2.4

> Under normal circumstances, the only time when octets within a URI are percent-encoded is during the process of producing the URI from its component parts. Once produced, a URI is always in its percent-encoded form.
>
> When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters.
>
> Implementations must not percent-encode or decode the same string more than once, as decoding an already decoded string might lead to misinterpreting a percent data octet as the beginning of a percent-encoding, or vice versa in the case of percent-encoding an already percent-encoded string.

In total, you listed three problems: please provide examples for each of them, one by one, covering the full scope of the problem you raised.
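For reference, a minimal sketch of what the quoted RFC 3986 rule looks like in practice (my illustration, not elarlang's code; names like `build_url` are hypothetical): each untrusted component is percent-encoded exactly once, at the moment the URI is produced from its parts, and is never encoded or decoded again afterwards.

```python
# Sketch: percent-encode each component once while producing the URI.
from urllib.parse import quote, urlencode

def build_url(base, path_segments, query_params):
    # Encode each path segment separately; safe="" means "/" is also encoded,
    # so untrusted data cannot introduce extra path delimiters.
    path = "/".join(quote(seg, safe="") for seg in path_segments)
    # urlencode percent-encodes query keys and values exactly once.
    query = urlencode(query_params)
    return f"{base}/{path}" + (f"?{query}" if query else "")

print(build_url("https://example.com/api",
                ["users", "alice/../admin"],             # untrusted
                {"redirect": "https://a.b/c?x=1"}))      # untrusted
# https://example.com/api/users/alice%2F..%2Fadmin?redirect=https%3A%2F%2Fa.b%2Fc%3Fx%3D1
```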

@elarlang (Collaborator) commented:

> Like I demonstrated on my own server, if you url encode untrusted data and place it on a path, path traversal is possible on some (popular) HTTP servers. So URL encoding untrusted data on a path is not universally safe. This is why the JWT standard opted for base64url encoding and not url encoding.

Correct URL encoding in this situation is needed for the functionality to work correctly.

If you have a path traversal vulnerability on the server side, you cannot fix it with URL encoding on the client; the client is not a trusted service layer in this context. You need to fix it on the server side - with validation and sanitization, like every other user input.

So your point of view (that path traversal on the server side is the result of encoding the URL incorrectly for the client side) is also conceptually just wrong.
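For clarity, here is a small sketch of the base64url approach jmanico keeps referring to (this is my illustration of that alternative, not something proposed verbatim in the thread): the untrusted value is turned into characters that contain no URL delimiters at all, and the server decodes it explicitly instead of relying on its percent-decoding behaviour.

```python
# Sketch of base64url-encoding untrusted data for use as a path segment,
# in the style of the JWT standard (unpadded base64url).
import base64

def to_path_segment(untrusted: str) -> str:
    # Alphabet is A-Z a-z 0-9 - _ only, so no path or query delimiters survive.
    return base64.urlsafe_b64encode(untrusted.encode()).decode().rstrip("=")

def from_path_segment(segment: str) -> str:
    padded = segment + "=" * (-len(segment) % 4)    # restore stripped padding
    return base64.urlsafe_b64decode(padded).decode()

seg = to_path_segment("../../etc/passwd")
print(seg)                      # Li4vLi4vZXRjL3Bhc3N3ZA
print(from_path_segment(seg))   # ../../etc/passwd
```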

@jmanico (Member) commented May 18, 2024 via email

@elarlang (Collaborator) commented May 18, 2024

> The point I'm making is that poor URL handling is the reality (in both URL parsing classes, servers and more) that we must and should recognize, regardless of standards. Many of our other requirements would not be needed if devs and components just followed standards, but they do not.

> Many of our other requirements would not be needed if devs and components just followed standards, but they do not.

This is the first statement I agree with - I recommend you start with your own server :) We are not going to tolerate vulnerabilities just because they are widespread.

After three attempts here and tens of comments before, I was not able to get answers or to understand what the problem with the current requirements is, or what the proposals to improve them are, so I'll hand it over to @tghosth.

My summary:

Output encoding

I stand by the statement that output encoding is covered by 5.3.1.

  1. safe complete URL output encoding (XSS)
  2. safe inserting of untrusted data into query string parameters and paths (url encoding and possibly base64encoding)

Safe inserting means that we need to transfer the data correctly to the next component. If the problem is in the next component, it cannot be solved with output encoding.

More detailed comment for that is written here: #1589 (comment)

Using base64url or whatever other "URL-safe" encoding is a question of application logic for transferring data between the application's components. If URL encoding is done correctly, we cannot require the use of base64url instead.

Validation

> JavaScript URLs alone cause over a third of real-world XSS attacks and are on the rise.

  3. url validation (SSRF and open redirects)

We have requirement 5.1.4 for validating data from the user; we can mention URLs to give them more spotlight (issue #1552).

| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 5.1.4 | [GRAMMAR] Verify that structured data is strongly typed and validated against a defined schema including allowed characters, length and pattern (e.g. credit card numbers, e-mail addresses, telephone numbers, or validating that two related fields are reasonable, such as checking that suburb and zipcode match). | | | | 20 |

If validation was done and was not enough, the value must be sanitized.

| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 5.2.2 | Verify that unstructured data is sanitized to enforce safety measures such as allowed characters and length. | | | | 138 |

For open redirect and SSRF we have:

| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 5.1.5 | [MODIFIED, SPLIT TO 50.7.1] Verify that the application will only automatically redirect the user to a different URL directly from an application URL where the destination appears on an allow list. | | | | 601 |
| 5.2.6 | Verify that the application protects against SSRF attacks, by validating or sanitizing untrusted data or HTTP file metadata, such as filenames and URL input fields, and uses allow lists of protocols, domains, paths and ports. | | | | 918 |
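As a rough sketch of the allow-list approach described in 5.1.5 and 5.2.6 (the helper name and lists below are illustrative assumptions, not ASVS text): parse the complete URL, then check the scheme and host against explicit allow lists before redirecting to it or fetching it.

```python
# Sketch: validate a complete URL for redirect/SSRF use via allow lists.
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"example.com", "static.example.com"}

def is_allowed_target(url: str) -> bool:
    parts = urlsplit(url)
    return (parts.scheme.lower() in ALLOWED_SCHEMES
            and (parts.hostname or "").lower() in ALLOWED_HOSTS)

print(is_allowed_target("https://example.com/account"))     # True
print(is_allowed_target("javascript:alert(1)"))              # False
print(is_allowed_target("https://example.com@evil.test/"))   # False (host is evil.test)
```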

@jmanico (Member) commented May 18, 2024 via email

@tghosth (Collaborator, Author) commented May 19, 2024

Thanks both for the comments and direction.

> However, I believe it's not our job to set requirements for web servers in my opinion. I state with respect. I do believe we need to be aware of the differences between web servers.

I think this is a subtle but important point. We are not writing requirements for how to write a web server (e.g. nginx or Apache or IIS or whatever); we are providing requirements for applications which run on these web servers.

As such, we need to advise on generating standards-compliant URLs and on how to manually process URLs in a safe way, but I don't think we can necessarily account for every edge case or every instance where the underlying web server does not work in a standards-compliant way.

In terms of the areas to cover:

1) safe complete URL output encoding (XSS)

@jmanico I would appreciate a more specific example of how this specific point should be covered. The way I understand it, a URL needs to be encoded for a web page based on the context where it is being written (HTML element, HTML attribute, JavaScript, etc.), in the same way that any content would need to be. I am not sure what is needed here in addition to what is in 5.3.1.

2) safe inserting of untrusted data into query string parameters and paths (url encoding and possibly base64encoding)

I agree with @jmanico that this has some specific points which would benefit from their own requirement. Do you think you could provide a suggested draft text for a requirement here @jmanico?

3) url validation (SSRF and open redirects)

In this case I think I agree with @elarlang that we already have a number of requirements that handle these aspects, @jmanico I am not sure we need anything additional here unless you wanted to suggest specific changes to the requirements Elar noted above.

@jmanico (Member) commented May 19, 2024 via email

@tghosth (Collaborator, Author) commented May 19, 2024

> Output encoding complete URLs in a src or href context is not a complete defense since javascript: URLs still execute XSS even when encoded.

Specifically, you mean that the JavaScript URL will execute when it is clicked on, i.e. it does not execute when the page is rendered but rather when the link is clicked (which is the theoretical purpose of the JavaScript URL anyway). Did I understand correctly?

> First, build dynamic URLs safely....

Is there any OWASP cheatsheet for this as it feels like the rules here are going to be a little complicated?

> Also, validate host names and validate schemes/protocols using allow-list validation.

I am going to look to see how we currently talk about this related to SSRF and other stuff and whether we need to add more content to the existing requirements. (I will get back to you :) )
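To make the javascript: URL point concrete (my own sketch, not code from the thread): HTML attribute encoding keeps the value inside the attribute, but the browser decodes those entities before interpreting the href, so a javascript: URL still runs when the link is clicked. Rejecting the scheme against an allow list is the control that actually stops it.

```python
# Sketch: attribute encoding alone vs. scheme allow-listing for an href value.
from html import escape
from urllib.parse import urlsplit

untrusted = "javascript:alert(document.cookie)"

# Output encoding for the attribute context: nothing here needs escaping,
# so the javascript: URL passes through and executes on click.
print(f'<a href="{escape(untrusted, quote=True)}">profile</a>')

def safe_href(url: str, fallback: str = "#") -> str:
    # Only allow http(s) URLs; anything else (javascript:, data:, ...) is dropped.
    return url if urlsplit(url).scheme.lower() in {"http", "https"} else fallback

print(f'<a href="{escape(safe_href(untrusted), quote=True)}">profile</a>')
# <a href="#">profile</a>
```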

@tghosth (Collaborator, Author) commented May 23, 2024

Notes from call:

  1. Validating a complete URL to safely place in a webpage

     Handled in #1589

  2. Output encoding a complete URL when placing it into an HTML document

     Handled in #1589

  3. Validating a complete URL for server-side use (redirect/SSRF)

     @tghosth to check what requirements we already have for SSRF/redirects, i.e. where we have control over the interpreter.

  4. Assembling a dynamic URL safely

     4a) Placing untrusted data in a querystring

     Need a new requirement(s) @jmanico

     4b) Placing untrusted data in the path of a URL (REST style)

     Need a new requirement(s) @jmanico

     4c) Any other URL contexts?

     Need a new requirement(s) @jmanico

@elarlang (Collaborator) commented:

> The point I'm making is that poor URL handling is the reality (in both URL parsing classes, servers and more) that we must and should recognize, regardless of standards.

caddyserver/caddy#1582 - worth reading in the context of what is expected and accepted behavior by a web server and what is not.

My point is - the ASVS should not tolerate going against specifications.

@jmanico (Member) commented May 23, 2024

This is not just a Caddy issue. Other servers have the same struggle. I understand your point, but I want to acknowledge the reality that URL encoding is not a universal defense; base64url encoding is.

I plan to build a requirement that suggests URL encoding and also suggests base64url encoding when URL encoding is not sufficient to address both of our concerns. Again, this is why the JWT standard uses base64url encoding. I'm not the only one who acknowledges this reality.

@elarlang (Collaborator) commented:

URL handling

So, when I point to the RFC, you say (#1589 (comment)) that your understanding of URL handling is still right. When I point to the Caddy issue, which describes the same behavior as a bug/security issue, you keep saying that your own server, which handles the URL in the same incorrect way, is still right?

> Other servers have the same struggle.

Please provide names and facts. We should only talk about facts and examples, not "but the others". Also provide the reasons and explanations for why they do so - or whether it is someone's configuration issue.

URL building

> I plan to build a requirement that suggests URL encoding and also suggests base64url encoding

I'm waiting for your proposals, but note that a suggestion or recommendation is not a security requirement.

This is what I have pointed out before - if you want to suggest that it is easier for the interpreter to handle the URL, you can send the data base64url-encoded. It is not a security hole when someone does not use base64url encoding and instead just uses URL encoding correctly.

@jmanico (Member) commented May 24, 2024 via email

@tghosth (Collaborator, Author) commented Jun 2, 2024

Hi @jmanico , I believe you have suggested the following requirements:

  1. If building dynamic URLs, be sure to URL encode when placing untrusted data in a querystring parameter, and validate the complete URL

and

  2. If building dynamic URLs, be sure to URL encode or base64url encode when placing untrusted data in a path parameter, and validate the complete URL

Do you think these could be merged?
Do these cover all cases related to URLs? Do we need to consider javascript: URLs or something as well?

@jmanico (Member) commented Jun 6, 2024

I can live with these being merged. Perhaps:

Verify that when building dynamic URLs, untrusted data is URL encoded or base64url encoded, and ensure the complete URL is validated to confirm it is legal and the protocol is safe.
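As a hedged sketch of how that merged requirement might look in code (the helper below is my illustration; the names and allow lists are assumptions, not part of the proposal): untrusted data is encoded while the URL is assembled, and the finished URL is then validated for a safe protocol and an expected host before it is used.

```python
# Sketch: build a dynamic URL from untrusted parts, then validate the result.
from urllib.parse import quote, urlencode, urlsplit

ALLOWED_HOSTS = {"example.com"}

def build_and_validate(base: str, path_part: str, params: dict) -> str:
    url = f"{base}/{quote(path_part, safe='')}?{urlencode(params)}"
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"URL failed validation: {url}")
    return url

print(build_and_validate("https://example.com/files",
                         "report 2024/summary.pdf",      # untrusted
                         {"lang": "en&admin=true"}))     # untrusted
# https://example.com/files/report%202024%2Fsummary.pdf?lang=en%26admin%3Dtrue
```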
