Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean Up Logging #8144

Open
2 tasks
EricSoroos opened this issue Apr 3, 2024 · 0 comments
Open
2 tasks

Clean Up Logging #8144

EricSoroos opened this issue Apr 3, 2024 · 0 comments

Comments

@EricSoroos
Copy link
Contributor

CKAN version

2.9, 2.10, Master

Describe the bug

Following the recent CVE for log injection I was reviewing what we're doing with logs, and I think we should go through and change a bunch of logging calls.

For context:

The Logging api is generally:

log.[severity]("% format string", repl1, repl2, ...)

This later gets run through a log formatter to add the wrapper for the log format string -- timestamp, module, etc. It gets % evaluated at that point. Also, for performance reasons, the log formatting isn't run if the log is not going to be emitted.

However, in a lot of places, we're doing one of:

log.debug("string %s" % variable)
log.debug("string {0}".format(variable))
log.debug(f"string {variable}")

e.g.:

./ckan/lib/search/query.py281:        log.debug('Package query: %r' % query)
./ckan/logic/action/create.py1108:    log.debug('Created user {name}'.format(name=user.name))

https://termbin.com/r3jw

This is probably not directly dangerous if there's no substitution into the message anywhere in the stack. Performance wise, In the debug case, by default, all this string formatting is wasted.

The recommendations from the links are:

  • Don’t log untrusted text. Python’s logging library doesn’t protect you from newlines or other unicode characters which allow attackers to mess up — or even forge — logs.
  • Don’t format logs yourself (with f-strings or otherwise). In certain situations this could leave you vulnerable to denial-of-service attacks or even sensitive data exposure.

However, using a filter, we can prevent newlines or other interesting things injected into the logs anywhere using a logging.Filter to prevent that crlf from being added to the logs via a parameter.

e.g.

def filter(record):
   if instanceof(record.args, tuple):
      record.args = tuple(clean(s.__str__()) for s in records)
   elif isinstance(record.args, dict):
      for k,v in record.items():
         record[k] = clean(v.__str__())

Todo

  • Update all the logging calls to use log.level(format, variable, ...)
  • Create a log filter, and add it to the root logger
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant