common: introduce reproducible http requests #542

Open
josephjclark opened this issue May 7, 2024 · 4 comments

Comments

@josephjclark
Collaborator

This is a speculative/design issue

A common pattern coming out of job work is: I called an adaptor function with some data, it did some stuff and sent an HTTP request, and the server returned an error.

How can I SEE my raw request and response, and furthermore, how can I repeat it?

I feel like I want to be able to easily copy the request out of stdout, paste it somewhere, and run curl request.txt or something to reproduce it.

We could use the CLI to reproduce requests, so long as the command is easy. But what about exporting to other tools, like postman or curl? How can I reproduce a request outside of the openfn context? This feels kinda useful to me

I think we need something like this:

  • our common request helper needs to be callable with a single object with all the input instructions
  • every request input needs to be saved to state with a unique id (maybe?). We should do this after all of our manipulation and messing around - this should be the exact payload we would send to postman or curl
  • If there's an error, we write the input object to state (better)
  • I need some easy way to extract that error payload and re-run it. Perhaps we can add a CLI command. Or export into a curl or postman friendly format
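
To make the "single object" idea concrete, here is a minimal TypeScript sketch. The `RequestSpec` shape and the `toCurl` and `shellQuote` helpers are hypothetical names invented for illustration; nothing like them exists in the common adaptor today. It assumes the object holds the final request after all manipulation, and renders it as a curl command using curl's standard `-X`, `-H` and `--data` options.

```typescript
// Hypothetical shape for a fully resolved request: everything needed to replay it.
interface RequestSpec {
  id: string; // unique id, so a saved request can be found on state later
  method: string;
  url: string;
  headers: Record<string, string>;
  body?: unknown;
}

// Quote a value for a POSIX shell by wrapping it in single quotes.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Render the spec as a curl command line.
function toCurl(spec: RequestSpec): string {
  const parts: string[] = ["curl", "-X", spec.method];
  for (const [name, value] of Object.entries(spec.headers)) {
    parts.push("-H", shellQuote(`${name}: ${value}`));
  }
  if (spec.body !== undefined) {
    // Strings are sent as-is; anything else is serialised as JSON.
    const data =
      typeof spec.body === "string" ? spec.body : JSON.stringify(spec.body);
    parts.push("--data", shellQuote(data));
  }
  parts.push(shellQuote(spec.url));
  return parts.join(" ");
}
```

Because the spec is plain JSON, the same object could in principle feed a postman export or a CLI replay command; only the renderer would change.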
@josephjclark
Collaborator Author

josephjclark commented May 8, 2024

Imagine this:

  • You call an adaptor function. It fails.
  • The adaptor logs an error message as best it can
  • The adaptor serialises the entire request (url, headers, body) onto state, where it can easily be inspected.
  • The adaptor logs a message saying "the request failed and has been saved to your state. To reproduce, run openfn request state.json"
  • The job ends on that error
  • You then run openfn request state.json and the same request is called without all the overhead of the runtime and job - we just call out with undici. Maybe we make the code super visible. We output the result to stdout or a file.
  • If you're not satisfied with that, you can do openfn request state.json --postman, which will instead output a config file that you can load into postman. Or openfn request state.json --curl which will output a curl config file or command (I'd need to check against the curl api)

This requires:

  • the common http helper to write the request and maybe response to state
  • AND to log a friendly message (including a cli command to reproduce)
  • BEFORE it throws an exception out to the adaptor
  • The adaptor could choose to mutate the state further, and maybe needs a flag to control this behaviour (?)
  • Obviously we also need a new CLI command to facilitate this
  • A lightning integration?
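
The first three requirements could be sketched roughly as below. This is an illustration under assumptions, not the common helper's real API: `requestWithCapture`, the `failedRequest` key on state, and the injected `send` function are all hypothetical, and `openfn request` is the command proposed in this issue, which does not exist yet.

```typescript
interface RequestSpec {
  method: string;
  url: string;
  headers: Record<string, string>;
  body?: unknown;
}

interface HttpResult {
  status: number;
  body: unknown;
}

// Hypothetical state shape: failedRequest is an invented key.
interface State {
  failedRequest?: RequestSpec;
  [key: string]: unknown;
}

// The transport is injected so the sketch does not depend on a specific client.
type Send = (spec: RequestSpec) => Promise<HttpResult>;

async function requestWithCapture(
  state: State,
  spec: RequestSpec,
  send: Send,
  log: (message: string) => void = console.log
): Promise<HttpResult> {
  try {
    const result = await send(spec);
    if (result.status >= 400) {
      throw new Error(`Request failed with status ${result.status}`);
    }
    return result;
  } catch (error) {
    // Save the exact request and log the hint BEFORE rethrowing to the adaptor.
    state.failedRequest = spec;
    log(
      "The request failed and has been saved to your state. " +
        "To reproduce, run: openfn request state.json"
    );
    throw error;
  }
}
```

The adaptor still receives the original error, so it can add its own handling or mutate state further before the job ends.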

Problems with this:

  • It'll only work for adaptors using new http, which basically means a re-write of dhis2

I think this is neat and I'd like to do it going forward. But is there a more immediate solution to #541?

@josephjclark
Collaborator Author

Thinking about the previous post some more.

Do we want at least the option to post the request object even if the request succeeded? False positives do happen.

If we fail and write the response, we're writing potentially sensitive headers/credentials and data back to the state object. That will be saved back to lightning as a dataclip and be subject to the data retention policy.

I am not comfortable with this idea as a default. Maybe even as an opt-in it's dubious - users likely don't know all the data being saved back to lightning and how secure it is.

Dumping the raw request in the log to be awkwardly copied out is no better. We're still exposing credentials, names and IDs in the logs. Also, credentials will be scrubbed from the logs, so the request isn't reusable anyway.

What I'm looking for is a safe way to "eject" that request (with its sensitive payload and credentials) into a JSON object which is downloaded for use in the CLI and then discarded by Lightning. And we just don't have the machinery for that.
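
The scrubbing trade-off can be seen in a small sketch. The header list and the `redactHeaders` name are assumptions made for illustration, not how Lightning or the runtime actually scrub credentials.

```typescript
// Assumed list of headers that commonly carry credentials.
const SENSITIVE_HEADERS = new Set([
  "authorization",
  "proxy-authorization",
  "cookie",
  "x-api-key",
]);

// Return a copy of the headers with sensitive values masked.
function redactHeaders(
  headers: Record<string, string>
): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    // Header names are case-insensitive, so compare in lower case.
    safe[name] = SENSITIVE_HEADERS.has(name.toLowerCase())
      ? "[REDACTED]"
      : value;
  }
  return safe;
}
```

A request saved this way is safe to persist but can no longer be replayed as-is, since the credential would have to be supplied again at replay time.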

@josephjclark
Collaborator Author

The only way I can really see it is if a user says "hey, it's cool to export my sensitive data and credentials for this one run. It's more important that I fix the issue". Which is the use-case we're chasing.

That basically means the common helper needs to use a flag which says "eject on error", which means freely dumping stuff to state.

This could be:

  • A flag passed to the request helper. Each adaptor will need to support and forward this option, and the flag must be set in job code. This feels like the worst.
  • An env var. This could be specific like EJECT_REQUEST_ON_ERROR or general like DEBUG_MODE. In both cases Lightning needs to set the env var, and the runtime needs to feed it properly into the sandbox.
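
A minimal sketch of the env var option, assuming the variable names floated above (`EJECT_REQUEST_ON_ERROR` and `DEBUG_MODE`) and assuming that "1" or "true" switches the behaviour on. The `shouldEjectOnError` function is hypothetical, and the environment is passed in as a plain object so the check stays independent of how the runtime feeds values into the sandbox.

```typescript
type Env = Record<string, string | undefined>;

// Treat "1" and "true" (any case) as enabled; everything else is off.
function isEnabled(value: string | undefined): boolean {
  return value === "1" || value?.toLowerCase() === "true";
}

// Eject only when the user has explicitly opted in via either variable.
function shouldEjectOnError(env: Env): boolean {
  return isEnabled(env.EJECT_REQUEST_ON_ERROR) || isEnabled(env.DEBUG_MODE);
}
```

In Node the caller would pass `process.env`; defaulting to off means nothing is dumped to state unless someone deliberately opts in.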

@josephjclark
Collaborator Author

The other way of approaching this whole problem is to say: look, making your requests reproducible is a huge and unacceptable security leak, sorry.

But if you run in the CLI, we assume you're in a trusted environment (the credential is on your system after all). So download and run the workflow locally with the CLI and take a close look at the output there.

That approach needs us to have some kind of env var like DEBUG_MODE or SAFE_MODE=false, which the CLI sets; adaptor code can then use it to make different assumptions about how careful its output should be.

This would be free to implement and doesn't require any features lightning side (we could even later allow lightning to run in unsafe mode).

The gotcha would be that it's not easy yet to download your workflow and credential to run in the CLI. But that's a problem we need to solve anyway....

Labels
None yet
Projects
Status: Icebox
Development

No branches or pull requests

1 participant