-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove corrupt bandpass downloads from cache #340
Conversation
@kbarbary Would you mind taking a quick look at this? It's given our production system some issues. |
I'm sorry we ignored this PR for so long. Clearly, my notification settings need to be changed. I can review this work, but for some reason the regression testing did not execute. Do you (or @kboone) have any idea why the GitHub actions did not run on this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but I have a few questions before we merge these changes.
@@ -68,6 +68,8 @@ def read_bandpass(fname, fmt='ascii', wave_unit=u.AA, | |||
raise ValueError("format {0} not supported. Supported formats: 'ascii'" | |||
.format(fmt)) | |||
t = ascii.read(fname, names=['wave', 'trans']) | |||
if len(t) == 0: | |||
raise RuntimeError(f"Bandpass file {fname} corrupt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a suggested action for the user? I am not sure what I would do if I got this error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are willing to move the corrupt cache file removal into this function, then you could give a more detailed set of instructions here. In the current patch, _safe_read_bandpass
may be a better place to customize it.
sncosmo/builtins.py
Outdated
try: | ||
return read_bandpass(abspath, wave_unit=u.AA, | ||
trim_level=BANDPASS_TRIM_LEVEL, name=name) | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to not copy and past code? As we add bandpass options, I would rather not require future developers to remember to write a 5 line return statement.
_, wave, trans = np.loadtxt(abspath, unpack=True) | ||
except ValueError: | ||
rm(abspath) | ||
raise RuntimeError(f"Bandpass file {abspath} corrupt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this function have an extra line compared to the previous code blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this function does not use _safe_read_bandpass
.
Still not sure how to get the GitHub CI to run, but I ran the tests on my local machine. It looks like there are just some code style issues.
|
An extremely late chime-in here. (Apologies; the maintenance of this package has mostly been taken over by others.) This approach looks like it will work, but has the drawback of repeated code that @benjaminrose mentioned. Also, it doesn't address possible corruption in non-bandpass built-in files, and leaves open the possibility that a file would be incomplete in a way that doesn't trigger an exception when reading (missing full lines in a csv file, for example). The well-traveled solution to this sort of problem is to include checksums to check downloaded files against: then you know for sure that you got the complete and correct file. However, that would probably require substantial changes to the The @mcoughlin Were you getting edge cases where the file was non-empty but corrupt in a way that caused an exception when reading it? Or was it mostly empty files? This is probably way way too late to solve your problem, but for future viewers: a work-around for production systems would be to circumvent the sncosmo code download entirely: Download all the files you'll need once (ensuring they work); tar the entire |
@benjaminrose I can't see why the |
@kbarbary We did experience some actually corrupt files yes. |
We just got this from SVO on our production machine:
|
This looks like SVO was down and not an SNCosmo issue. |
@stefanv and @mcoughlin, I am working on getting a set of issues/PRs for a v2.10 release. I am hoping to get through them over the next few weeks. If you are able to address the comments, we can add this to the 2.10 release. |
@mcoughlin's comment above is relevant: the download succeeds, but contains invalid data (HTML instead of numerical). The service does not make a checksum available, so the best check we have is to ensure that the data is at least loadable. Re: code duplication, easiest may be to write a small wrapper around EDIT: I've added the thin wrapper. There are multiple ways to structure the code, such as including the cache removal inside |
b6397e5
to
b680097
Compare
I have rewritten the patch according to how I think it's supposed to look. Since I no longer use sncosmo directly, I am unable to verify that it works as intended; @mcoughlin would you be willing to help with that? |
No description provided.