best_model.hdf5 #26

Open
shahmustafa opened this issue May 7, 2020 · 10 comments

shahmustafa commented May 7, 2020

Does it generate best_model.hdf5, or how is it supposed to work? I am getting:

OSError: Unable to open file (unable to open file: name = '/data1/prjs/code/ABTS/dl_4_tsc//results/fcn/UCRArchive_2018_itr_8/Coffee/best_model.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

hfawaz (Owner) commented May 8, 2020

Yes, it uses model checkpoint.
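
Something along these lines (a minimal self-contained sketch of the pattern, not the repo's exact code; the toy model and monitor='loss' are assumptions):

import os
import numpy as np
from tensorflow import keras

# ModelCheckpoint with save_best_only=True writes best_model.hdf5 each time the
# monitored metric improves, so the file is created during fit(), not before it.
output_directory = './results/fcn/UCRArchive_2018_itr_8/Coffee/'
os.makedirs(output_directory, exist_ok=True)

model = keras.Sequential([keras.layers.Dense(2, activation='softmax', input_shape=(4,))])
model.compile(optimizer='adam', loss='categorical_crossentropy')

checkpoint = keras.callbacks.ModelCheckpoint(
    filepath=output_directory + 'best_model.hdf5',
    monitor='loss', save_best_only=True)

x = np.random.rand(16, 4).astype('float32')
y = keras.utils.to_categorical(np.random.randint(0, 2, size=16), num_classes=2)
model.fit(x, y, epochs=2, callbacks=[checkpoint], verbose=0)

If training crashes or is interrupted before the callback ever fires, best_model.hdf5 is never written.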

@shahmustafa (Author)

It only generates last_model.hdf5 and model_init.hdf5, and I get a FileNotFoundError for best_model.hdf5.

hfawaz (Owner) commented May 8, 2020

This means that your code did not execute successfully.
Do you see any error when running it?

shahmustafa (Author) commented May 8, 2020

python = 3.6.8
tensorflow = 1.14

When running python main.py UCRArchive_2018 Coffee fcn _itr_8, I get this error:

Traceback (most recent call last):
  File "main.py", line 152, in <module>
    fit_classifier()
  File "main.py", line 44, in fit_classifier
    classifier.fit(x_train, y_train, x_test, y_test, y_true)
  File "/data1/prjs/code/ABTS/dl_4_tsc/classifiers/fcn.py", line 80, in fit
    model = keras.models.load_model(self.output_directory+'best_model.hdf5')
  File "/data1/prjs/code/ABTS/venv/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py", line 146, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
  File "/data1/prjs/code/ABTS/venv/lib/python3.6/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 200, in load_model_from_hdf5
    f = h5py.File(filepath, mode='r')
  File "/data1/prjs/code/ABTS/venv/lib/python3.6/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/data1/prjs/code/ABTS/venv/lib/python3.6/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = '/data1/prjs/code/ABTS/dl_4_tsc//results/fcn/UCRArchive_2018_itr_8/Coffee/best_model.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

hfawaz (Owner) commented May 8, 2020

This means that the model was not saved; maybe recheck the paths.
If that does not work, I suggest installing TF 2.0 and working with the new version.
The code works with TF 2.0 now.
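
To recheck, a snippet along these lines (hypothetical, not part of the repo) confirms what is actually on disk at the path from your error message:

import os

# Path copied from the OSError above, double slash kept as-is.
output_directory = '/data1/prjs/code/ABTS/dl_4_tsc//results/fcn/UCRArchive_2018_itr_8/Coffee/'
print('directory exists:', os.path.isdir(output_directory))
print('best_model.hdf5 exists:', os.path.isfile(output_directory + 'best_model.hdf5'))
print('directory contents:', os.listdir(output_directory) if os.path.isdir(output_directory) else 'n/a')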

@shahmustafa (Author)

With TF 2.0 I am getting this:

OSError: SavedModel file does not exist at: saved_model_dir/{saved_model.pbtxt|saved_model.pb}
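
As far as I can tell, TF 2.x falls back to the SavedModel loader when the given path is not a readable HDF5 file, so this message can simply mean best_model.hdf5 is still missing. A quick self-check (hypothetical snippet, assuming the same path as before):

import os
import h5py

model_path = '/data1/prjs/code/ABTS/dl_4_tsc/results/fcn/UCRArchive_2018_itr_8/Coffee/best_model.hdf5'
print('file exists:', os.path.isfile(model_path))
print('readable as HDF5:', h5py.is_hdf5(model_path))  # also False when the file is missing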

hfawaz (Owner) commented May 11, 2020

I think it may be write permissions for the target directory, not quite sure though.
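
To rule that out, a check along these lines (hypothetical snippet, not part of the repo):

import os

output_directory = '/data1/prjs/code/ABTS/dl_4_tsc/results/fcn/UCRArchive_2018_itr_8/Coffee/'
# os.access reports whether the current user may create files in the directory.
print('writable:', os.access(output_directory, os.W_OK))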

nabito commented Jun 14, 2020

@hfawaz @shahmustafa I'm experiencing the same issue. Here is my env:
Mac OS X: 10.15.5
Python (conda): 3.8
tensorflow 2.2.0
h5py 2.10.0 py38h3134771_0
hdf5 1.10.4
keras 2.3.1

The error seems to suggest an out-of-memory problem when the code tries to save an intermediate result in HDF5; here are some clues:

h5py/h5py#1176
https://stackoverflow.com/questions/44117315/goes-out-of-memory-when-saving-large-array-with-hdf5-py...
http://www.pytables.org/cookbook/inmemory_hdf5_files.html
https://www.pytables.org/cookbook/inmemory_hdf5_files.html
https://stackoverflow.com/questions/40449659/does-h5py-read-the-whole-file-into-memory

In my case, the problem only arose when I switched to slightly larger training data (1.8 MB instead of 30 KB); 30 KB would of course not cause such a problem.

nabito commented Jun 14, 2020

Here is the error log:

Traceback (most recent call last):
  File "main.py", line 155, in <module>
    fit_classifier()
  File "main.py", line 44, in fit_classifier
    classifier.fit(x_train, y_train, x_test, y_test, y_true)
  File "/mnt/batch/tasks/shared/LS_root/jobs/datascience-ml/azureml/resnet-timeseries_1592133278_dfbeddf7/mounts/workspaceblobstore/azureml/resnet-timeseries_1592133278_dfbeddf7/classifiers/resnet.py", line 142, in fit
    y_pred = self.predict(x_val, y_true, x_train, y_train, y_val,
  File "/mnt/batch/tasks/shared/LS_root/jobs/datascience-ml/azureml/resnet-timeseries_1592133278_dfbeddf7/mounts/workspaceblobstore/azureml/resnet-timeseries_1592133278_dfbeddf7/classifiers/resnet.py", line 160, in predict
    model = keras.models.load_model(model_path)
  File "/azureml-envs/azureml_eca0112c9008c12b467c806af1888db3/lib/python3.8/site-packages/tensorflow/python/keras/saving/save.py", line 189, in load_model
    loader_impl.parse_saved_model(filepath)
  File "/azureml-envs/azureml_eca0112c9008c12b467c806af1888db3/lib/python3.8/site-packages/tensorflow/python/saved_model/loader_impl.py", line 110, in parse_saved_model
    raise IOError("SavedModel file does not exist at: %s/{%s|%s}" %
OSError: SavedModel file does not exist at: /mnt/batch/tasks/shared/LS_root/jobs/datascience-ml/azureml/resnet-timeseries_1592133278_dfbeddf7/mounts/workspaceblobstore/azureml/resnet-timeseries_1592133278_dfbeddf7/results/resnet/rtpcr_itr_9/qtower/best_model.hdf5/{saved_model.pbtxt|saved_model.pb}

@hfawaz Could you give us the specific versions of all dependencies that worked for you at publication time?

@arieell25

Has anyone been able to fix this problem?
