A request is issued every time pyingest/config is loaded #110

Open
marblestation opened this issue Oct 23, 2020 · 2 comments
@marblestation
Contributor

marblestation commented Oct 23, 2020

The config file issues a request to GitHub and downloads the list of UAT terms every time it is loaded:

uat_request = requests.get(UAT_URL)
uat_data = uat_request.json()
get_uat(uat_request.json(), UAT_ASTRO_URI_DICT)
UAT_ASTRO_KEYWORDS = list(UAT_ASTRO_URI_DICT.keys())
UAT_ASTRO_URI_DICT = dict((k.lower(), v) for k, v in list(UAT_ASTRO_URI_DICT.items()))

This makes pyingest depend on GitHub being available, and its behavior can change over time because the downloaded file can change and is not under our control (we cannot guarantee reproducibility). It would be preferable to include a specific version of that file in pyingest and update it whenever necessary (committing it to our repo and making a new release).
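
A minimal sketch of what reading a bundled copy could look like, assuming a JSON snapshot committed to the repo. The names `load_uat_snapshot` and `build_keyword_tables` are hypothetical and not part of pyingest; the second function only mirrors the last two lines of the snippet above, and the existing `get_uat` helper is not reproduced here.

```python
import json


def load_uat_snapshot(path):
    """Read a UAT JSON snapshot from disk; no network access is involved.

    `path` is assumed to point at a file committed alongside the package.
    """
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def build_keyword_tables(uri_dict):
    """Return (keywords, lowercased_dict), as the config snippet does.

    `keywords` keeps the original casing of the keys; the returned dict
    maps lowercased keys to the same values.
    """
    keywords = list(uri_dict.keys())
    lowered = {k.lower(): v for k, v in uri_dict.items()}
    return keywords, lowered
```

With this shape, the result depends only on the file shipped with a given release, so two runs of the same release see the same terms.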

@seasidesparrow
Member

@aaccomazzi may want to comment here, but this was by design: this method makes sure the current version of the UAT is in use whenever ingest is run. ADS also has substantial control over, and knowledge of, this file and its format, because EH and AA are both on the UAT team.

@aaccomazzi
Member

While I understand the reason behind the current implementation, I am also uncomfortable with the dependency on an external resource.

I suggest we change the behavior to use a cached version of the UAT, and have a small setup script that downloads it on demand if and when a curator chooses to do so. Among other things, I know Katie is working on modifying some of the assets within the GitHub repo, and we wouldn't want that to break our pipeline.
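
One possible shape for such an on-demand refresh step, as a sketch only. The function names, the `fetch` parameter, and the cache path are assumptions made for illustration and do not exist in pyingest; the standard library's `urllib` is used here in place of `requests` to keep the example dependency-free.

```python
import json
import os
import tempfile
import urllib.request


def fetch_json(url, timeout=30):
    """Download and parse a JSON document (used only by the refresh step)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def refresh_uat_cache(url, cache_path, fetch=fetch_json):
    """Download the UAT and replace the cached copy at `cache_path`.

    Meant to be run by hand (for example by a curator), never at import
    time. If the download or the write fails, the existing cache file is
    left untouched.
    """
    data = fetch(url)  # raises before anything on disk is modified
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
        # Swap the new file into place in a single step.
        os.replace(tmp_path, cache_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return data
```

The pipeline itself would then only ever read `cache_path` with `json.load`, so a change or outage upstream affects nothing until someone deliberately reruns the refresh.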

@seasidesparrow seasidesparrow self-assigned this Nov 3, 2020