bad url join in celery task when site_url has a path component #20

dlax · 2015-11-03T09:01:45Z

Consider ckan.site_url = http://somehost/ckan in the CKAN configuration file. With this setting, building api_url yields the wrong URL, http://somehost/api/action, because urljoin drops the "ckan" path component. One solution would be to add a trailing / to the site_url configuration option, but this is apparently not recommended. So I guess some URL manipulation is needed on the extension side.

Note that other extensions (such as archiver and datastorer) have the same problem.
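The behaviour described above can be reproduced with a minimal sketch. It assumes Python 3, where urljoin lives in urllib.parse (the code in this thread uses the Python 2 urlparse module), and it reuses the somehost example value from the report:

```python
# Python 3 import; on Python 2 this would be `from urlparse import urljoin`.
from urllib.parse import urljoin

# Example value taken from the report above.
site_url = "http://somehost/ckan"

# Without a trailing slash, "ckan" is treated as the last path segment
# and is replaced by the relative reference.
dropped = urljoin(site_url, "api/action")

# With a trailing slash, "ckan/" is treated as a directory and is kept.
kept = urljoin(site_url + "/", "api/action")

print(dropped)  # http://somehost/api/action
print(kept)     # http://somehost/ckan/api/action
```

This follows the relative reference resolution rules urljoin implements: the last segment of the base path is discarded unless the base ends with a slash.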


davidread · 2015-11-03T10:10:09Z

You're quite right.

How about changing it to:

api_url = urlparse.urljoin(context['site_url'] + '/', 'api/action')

Perhaps you could create some pull requests with this or similar change?
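One caveat with appending '/' unconditionally is that a site_url that already ends with a slash would end up with a double slash. A possible variant, sketched here with a hypothetical helper named join_site_url (not part of CKAN or of this extension) and written for Python 3's urllib.parse, normalises the trailing slash first:

```python
# Python 3 import; on Python 2 this would be `from urlparse import urljoin`.
from urllib.parse import urljoin


def join_site_url(site_url, path):
    """Hypothetical helper: join `path` onto `site_url`, keeping any path
    component of `site_url` whether or not it ends with a slash."""
    # Ensure the base ends with exactly one slash so urljoin keeps its path.
    base = site_url.rstrip("/") + "/"
    # Strip leading slashes so the path is resolved relative to the base
    # instead of replacing the base path.
    return urljoin(base, path.lstrip("/"))


api_url = join_site_url("http://somehost/ckan", "api/action")
print(api_url)  # http://somehost/ckan/api/action
```

The same call gives the same result for http://somehost/ckan/ and still works when site_url has no path component at all.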

dlax · 2015-11-03T12:42:31Z

Yes, I could. However, I'm not sure what the proper way to fix this is. Your suggestion would certainly work, but since this appears to be broken in other extensions as well, I was looking for a standard way to handle it. Looking into the CKAN source code, I found a few different "solutions":

controllers/feed.py has '/'.join([site_url, resource_path]) and also base_url + h.url_for(controller='package', ...)
controllers/package.py has
c.datastore_api = '%s/api/action' % \
    config.get('ckan.site_url', '').rstrip('/')
similar things in lib/dictization/model_dictize.py or model/package.py

lib/cli.py has

fetch_url = config['ckan.site_url']
url = h.url_for(controller='package', action='read', id=dd['name'])
url = urlparse.urljoin(fetch_url, url[1:]) + '.rdf'

which, I guess, would also fail in case site_url has a path component.
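To compare the patterns listed above, here is a small sketch under a few assumptions: it uses Python 3's urllib.parse instead of the Python 2 urlparse module, and the values of resource_path and url are made-up stand-ins for what the CKAN code would compute (in particular, url stands in for the result of h.url_for):

```python
# Python 3 import; on Python 2 this would be `from urlparse import urljoin`.
from urllib.parse import urljoin

site_url = "http://somehost/ckan"  # example value from this issue

# Pattern from controllers/feed.py: plain string join.
resource_path = "api/action"  # stand-in value, assumed for illustration
joined = "/".join([site_url, resource_path])

# Pattern from controllers/package.py: format after stripping the slash.
formatted = "%s/api/action" % site_url.rstrip("/")

# Pattern from lib/cli.py: urljoin with the leading slash removed.
url = "/dataset/example"  # stand-in for the h.url_for(...) result
rdf_url = urljoin(site_url, url[1:]) + ".rdf"

print(joined)     # http://somehost/ckan/api/action
print(formatted)  # http://somehost/ckan/api/action
print(rdf_url)    # http://somehost/dataset/example.rdf
```

With these inputs the two string-based patterns keep the "ckan" component, while the urljoin-based one drops it, which matches the guess above.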

Did I miss something, or should we really go and fix all usages of urljoin the way you suggest?

wardi · 2015-11-03T12:54:21Z

site_url shouldn't have path components; that's what root_path is for. I think the url_for helper can be used to generate full paths correctly.


amercader mentioned this issue Feb 24, 2016

Datapusher error when root_path is set ckan/ckan#2866

Closed

HristijanVilos pushed a commit to keitaroinc/ckanext-qa that referenced this issue Jul 18, 2022

Merge pull request ckan#20 from qld-gov-au/QOL-8059-github-actions

4a07253

QOL-8059 Add GitHub actions workflow
