Datapusher / Xloader / ... #95

Open
amercader opened this issue Oct 12, 2023 · 1 comment
@amercader
Member

The current default compose setup runs DataPusher to get data automatically into the DataStore. We should probably work towards defaulting to Xloader though, and making it easier to integrate other alternatives like Datapusher-Plus. There are different steps we can take towards that, which can be done separately.

1. Decoupling Datapusher

The way DataPusher is installed now is not very flexible: there are commands hardcoded in start_ckan.sh, and it is assumed that the datapusher plugin will be among the enabled plugins. This means that users who don't want to use it need to override the whole start_ckan.sh file. The same applies if, for instance, you enable the expire_api_token plugin and need to add extra params to the user token add command.

A good initial step would be to 1) remove datapusher from the default plugins and 2) consolidate all setup commands in a docker-entrypoint.d/01_setup_datapusher.sh file. This could look like:

#!/bin/sh

if [ -n "$CKAN_DATAPUSHER_URL" ] ; then
   # Setup datapusher
   # Set api token
   # Add plugin to ini file
   :  # no-op so the placeholder body is valid shell
fi

This way it's easy to turn it off completely, and if you need to tweak the setup commands you just need to override this file in your setup.
I think this file should live in ckan-docker, not in ckan-docker-base.
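To make the placeholder steps above more concrete, here is a minimal sketch of what such a file might contain. It is not the merged implementation: `append_plugin` is a hypothetical helper, the `ckan_admin` user name and the `xxx` temporary token are assumptions, and the `command -v ckan` guard is only there so the sketch does nothing outside a CKAN container. The temporary token is set first because `ckan` commands fail when the datapusher plugin is enabled without one.

```shell
#!/bin/sh
# Sketch of docker-entrypoint.d/01_setup_datapusher.sh (assumptions noted inline).

# Hypothetical helper: print the plugin list in $1 with plugin $2 appended,
# unless $2 is already in the list.
append_plugin() {
    case " $1 " in
        *" $2 "*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "${1:+$1 }$2" ;;
    esac
}

# Only act when a DataPusher URL is configured and the ckan CLI is available.
if [ -n "${CKAN_DATAPUSHER_URL:-}" ] && command -v ckan >/dev/null 2>&1; then
    # Temporary token value so that later `ckan` commands can load the plugin.
    ckan config-tool "$CKAN_INI" "ckan.datapusher.api_token=xxx"

    # Add plugin to ini file.
    ckan config-tool "$CKAN_INI" \
        "ckan.plugins = $(append_plugin "${CKAN__PLUGINS:-}" datapusher)"

    # Set api token (assumes a sysadmin user called ckan_admin exists).
    token=$(ckan -c "$CKAN_INI" user token add ckan_admin datapusher \
        | tail -n 1 | tr -d '\t')
    ckan config-tool "$CKAN_INI" "ckan.datapusher.api_token=$token"
fi
```

Leaving `CKAN_DATAPUSHER_URL` unset skips the whole block, which is the "turn it off completely" case. Anyone needing extra params on the token command would override only this file.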

2. Using xloader

To use Xloader instead, we need to 1) install the extension, 2) run the worker process, and 3) configure it and enable the plugin.
Looking at the wiki page, it seems that the suggestion is to install xloader and run the worker process in the same container as the CKAN web process? It's hard to tell because I couldn't find the source for the ckan/ckan-base-xloader image.

In any case, perhaps we can either run the worker in the same ckan container (using supervisor to manage the process) or add a separate service in the compose setup that runs the ckan worker process. That could use exactly the same image but with a different command (ckan jobs worker instead of uwsgi or ckan run, which could be handled in start_ckan.sh using a CKAN_WORKER env var or something similar).
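The second option (same image, different command) could be sketched roughly as follows. This is only an illustration of the idea: `CKAN_WORKER` is the tentative variable name floated above, the accepted truthy values and the `ckan_role` / `start_ckan` function names are hypothetical, and the production image would keep its existing uwsgi invocation in the web branch rather than `ckan run`.

```shell
#!/bin/sh
# Sketch of role selection for the end of start_ckan.sh (names are hypothetical).

# Map the value of the CKAN_WORKER env var to a role name.
ckan_role() {
    case "$1" in
        1|true|yes|on) echo worker ;;
        *)             echo web ;;
    esac
}

# Replace the shell with the process for the selected role.
start_ckan() {
    case "$(ckan_role "${CKAN_WORKER:-}")" in
        worker) exec ckan jobs worker ;;  # background jobs, e.g. xloader
        web)    exec ckan run ;;          # or the existing uwsgi command
    esac
}
```

The entrypoint would end with a call to `start_ckan`. A second compose service could then reuse the same image and set `CKAN_WORKER=true` in its environment.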

Of course ckanext-xloader needs to be installed. We could do it in the ckan-docker-base image, but perhaps it is better to add the commands in this repo, in ckan/Dockerfile.

I think 1 is a good change to introduce, and I'm open to more suggestions on 2, but keen to hear your thoughts @kowh-ai

amercader added a commit to ckan/ckan-docker-base that referenced this issue Oct 17, 2023
As discussed in ckan/ckan-docker#95

Remove datapusher from default plugin list env var and remove setup
commands from the entrypoint scripts
(start_ckan.sh/start_ckan_development.sh)

Sadly we still need the one to add a temporary value on the api token
config option, otherwise all `ckan` commands will fail, but at least we
only run it if datapusher is listed in CKAN__PLUGINS
amercader added a commit that referenced this issue Oct 17, 2023
As discussed in #95, depends on this one to be
merged/pushed first:

ckan/ckan-docker-base#32
@amercader
Member Author

@kowh-ai I had a go at point 1 in these two PRs, let me know what you think:

ckan/ckan-docker-base#32
#97
