Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove corrupt bandpass downloads from cache #340

Closed
wants to merge 2 commits into from

Conversation

stefanv
Copy link

@stefanv stefanv commented May 27, 2022

No description provided.

@stefanv stefanv changed the title When bandpasses cannot be loaded from cached downloads, delete them Remove corrupt bandpass downloads from cache May 27, 2022
@mcoughlin
Copy link
Contributor

@kbarbary Would you mind taking a quick look at this? It's given our production system some issues.

@benjaminrose benjaminrose self-requested a review July 5, 2022 12:42
@benjaminrose
Copy link
Member

I'm sorry we ignored this PR for so long. Clearly, my notification settings need to be changed.

I can review this work, but for some reason the regression testing did not execute. Do you (or @kboone) have any idea why the GitHub actions did not run on this PR?

Copy link
Member

@benjaminrose benjaminrose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I have a few questions before we merge these changes.

@@ -68,6 +68,8 @@ def read_bandpass(fname, fmt='ascii', wave_unit=u.AA,
raise ValueError("format {0} not supported. Supported formats: 'ascii'"
.format(fmt))
t = ascii.read(fname, names=['wave', 'trans'])
if len(t) == 0:
raise RuntimeError(f"Bandpass file {fname} corrupt")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a suggested action for the user? I am not sure what I would do if I got this error.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are willing to move the corrupt cache file removal into this function, then you could give a more detailed set of instructions here. In the current patch, _safe_read_bandpass may be a better place to customize it.

try:
return read_bandpass(abspath, wave_unit=u.AA,
trim_level=BANDPASS_TRIM_LEVEL, name=name)
except:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to not copy and past code? As we add bandpass options, I would rather not require future developers to remember to write a 5 line return statement.

_, wave, trans = np.loadtxt(abspath, unpack=True)
except ValueError:
rm(abspath)
raise RuntimeError(f"Bandpass file {abspath} corrupt")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this function have an extra line compared to the previous code blocks?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this function does not use _safe_read_bandpass.

@benjaminrose
Copy link
Member

Still not sure how to get the GitHub CI to run, but I ran the tests on my local machine. It looks like there are just some code style issues.

❯ tox -e codestyle
codestyle create: /Users/benrose/sncosmo/.tox/codestyle
codestyle installdeps: pycodestyle
codestyle installed: pycodestyle==2.8.0
codestyle run-test-pre: PYTHONHASHSEED='3289138844'
codestyle run-test: commands[0] | ./checkstyle
sncosmo/builtins.py:81:5: E722 do not use bare 'except'
sncosmo/builtins.py:91:5: E722 do not use bare 'except'
sncosmo/builtins.py:101:5: E722 do not use bare 'except'
sncosmo/builtins.py:111:5: E722 do not use bare 'except'
ERROR: InvocationError for command /Users/benrose/sncosmo/checkstyle (exited with code 1)
____________________________________________________ summary ____________________________________________________
ERROR:   codestyle: commands failed
❯ tox -e py3

============================================== test session starts ==============================================
platform darwin -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0
cachedir: .tox/py3/.pytest_cache
rootdir: /Users/benrose/sncosmo, configfile: pyproject.toml
plugins: remotedata-0.3.3, astropy-header-0.2.1, mock-3.8.2, astropy-0.10.0, filter-subpackage-0.1.1, hypothesis-6.50.1, openfiles-0.5.0, doctestplus-0.12.0, cov-3.0.0, arraydiff-0.5.0
collected 123 items

../lib/python3.10/site-packages/sncosmo/tests/test_bandpasses.py .........                                [  7%]
../lib/python3.10/site-packages/sncosmo/tests/test_builtins.py .....................                      [ 24%]
../lib/python3.10/site-packages/sncosmo/tests/test_fitting.py sssssssss                                   [ 31%]
../lib/python3.10/site-packages/sncosmo/tests/test_io.py ............                                     [ 41%]
../lib/python3.10/site-packages/sncosmo/tests/test_magsystems.py ....                                     [ 44%]
../lib/python3.10/site-packages/sncosmo/tests/test_models.py .............                                [ 55%]
../lib/python3.10/site-packages/sncosmo/tests/test_photdata.py .                                          [ 56%]
../lib/python3.10/site-packages/sncosmo/tests/test_plotting.py s                                          [ 56%]
../lib/python3.10/site-packages/sncosmo/tests/test_registry.py .....................                      [ 73%]
../lib/python3.10/site-packages/sncosmo/tests/test_salt2source.py ..                                      [ 75%]
../lib/python3.10/site-packages/sncosmo/tests/test_salt2utils.py .....                                    [ 79%]
../lib/python3.10/site-packages/sncosmo/tests/test_simulation.py ..                                       [ 81%]
../lib/python3.10/site-packages/sncosmo/tests/test_snanaio.py .......                                     [ 86%]
../lib/python3.10/site-packages/sncosmo/tests/test_spectrum.py .......sss                                 [ 95%]
../lib/python3.10/site-packages/sncosmo/tests/test_sugarsource.py .                                       [ 95%]
../lib/python3.10/site-packages/sncosmo/tests/test_utils.py .....                                         [100%]

=============================================== warnings summary ================================================
../lib/python3.10/site-packages/sncosmo/__init__.py:122
  /Users/benrose/sncosmo/.tox/py3/lib/python3.10/site-packages/sncosmo/__init__.py:122: AstropyDeprecationWarning: The update_default_config function is deprecated and may be removed in a future version.
    update_default_config(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================== 110 passed, 13 skipped, 1 warning in 36.14s ==================================

@kbarbary
Copy link
Member

kbarbary commented Aug 1, 2022

An extremely late chime-in here. (Apologies; the maintenance of this package has mostly been taken over by others.)

This approach looks like it will work, but has the drawback of repeated code that @benjaminrose mentioned. Also, it doesn't address possible corruption in non-bandpass built-in files, and leaves open the possibility that a file would be incomplete in a way that doesn't trigger an exception when reading (missing full lines in a csv file, for example).

The well-traveled solution to this sort of problem is to include checksums to check downloaded files against: then you know for sure that you got the complete and correct file. However, that would probably require substantial changes to the DataMirror class and maybe the built-in registry. (I didn't think of this problem when designing those.)

The DataMirror class does already check for exceptions during a download and remove the file if any occur (see here). As a stop-gap, it would be simple to also add a check for empty files to that download functionality, and remove the file and raise an exception if so. That would at least ensure that all downloaded files are at least non-empty, but doesn't cover other ways that the file could be non-empty but still corrupt.

@mcoughlin Were you getting edge cases where the file was non-empty but corrupt in a way that caused an exception when reading it? Or was it mostly empty files?

This is probably way way too late to solve your problem, but for future viewers: a work-around for production systems would be to circumvent the sncosmo code download entirely: Download all the files you'll need once (ensuring they work); tar the entire $SNCOSMO_DATA_DIR; redistribute and unpack that file yourself on worker machines before running sncosmo. Then you can ensure the integrity of the file upon download. sncosmo won't ever try to download anything if the necessary files already exist in $SNCOSMO_DATA_DIR.

@kbarbary
Copy link
Member

kbarbary commented Aug 1, 2022

@benjaminrose I can't see why the Tests action didn't run on this PR. run_tests.yml clearly has pull_request in the on: section, and it did work on the past.

@mcoughlin
Copy link
Contributor

@kbarbary We did experience some actually corrupt files yes.

@mcoughlin
Copy link
Contributor

We just got this from SVO on our production machine:

<style>
body {
color                : #000;
  margin               : 0px;
  margin-top           : 60px;
  margin-left          : 60px;
  margin-right         : 90px;
  font-size            : 12px;
  font-family          : Verdana,Arial, Helvetica, sans-serif;
  background: #000;
  background-image: url("orion2.jpg");
  background-repeat: no-repeat;
  background-position: right;
}

td {
color: #000;
background: #fff;
border: 2px #3A50A0 solid;
padding: 50px;
}
h1 {
color: #3A50A0;
}
.logo {
  position: absolute;  left: 10px;
  top: 2px;
  z-index: 0;
}

</style>
<div class=logo><img src="logo-svo.png"></div>

<table><tr><td>

<h1>SVO theoretical services</h1>

<h3>Service temporarilly unavailable.</h3>

  <p>We are having some technical problems and our services will be unavailable for some days.
  <p>We apologize for any inconvenience.

</td></tr></table>
<div><img src="logo-compl-cab+mdm_b60.png"></div><style>
body {
color                : #000;
  margin               : 0px;
  margin-top           : 60px;
  margin-left          : 60px;
  margin-right         : 90px;
  font-size            : 12px;
  font-family          : Verdana,Arial, Helvetica, sans-serif;
  background: #000;
  background-image: url("orion2.jpg");
  background-repeat: no-repeat;
  background-position: right;
}

td {
color: #000;
background: #fff;
border: 2px #3A50A0 solid;
padding: 50px;
}
h1 {
color: #3A50A0;
}
.logo {
  position: absolute;  left: 10px;
  top: 2px;
  z-index: 0;
}

</style>
<div class=logo><img src="logo-svo.png"></div>

<table><tr><td>

<h1>SVO theoretical services</h1>

<h3>Service temporarilly unavailable.</h3>

  <p>We are having some technical problems and our services will be unavailable for some days.
  <p>We apologize for any inconvenience.

</td></tr></table>
<div><img src="logo-compl-cab+mdm_b60.png"></div>

@benjaminrose
Copy link
Member

This looks like SVO was down and not an SNCosmo issue.

@benjaminrose
Copy link
Member

@stefanv and @mcoughlin, I am working on getting a set of issues/PRs for a v2.10 release. I am hoping to get through them over the next few weeks. If you are able to address the comments, we can add this to the 2.10 release.

@stefanv
Copy link
Author

stefanv commented Apr 12, 2023

@mcoughlin's comment above is relevant: the download succeeds, but contains invalid data (HTML instead of numerical). The service does not make a checksum available, so the best check we have is to ensure that the data is at least loadable.

Re: code duplication, easiest may be to write a small wrapper around read_bandpass, but load_bandpass_remote_wfc3 will still have to be handled separately.

EDIT: I've added the thin wrapper. There are multiple ways to structure the code, such as including the cache removal inside read_bandpass; I don't have enough knowledge of the project style/structure to make that call.

@stefanv
Copy link
Author

stefanv commented Apr 12, 2023

I have rewritten the patch according to how I think it's supposed to look. Since I no longer use sncosmo directly, I am unable to verify that it works as intended; @mcoughlin would you be willing to help with that?

@stefanv stefanv closed this Sep 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants