Bug when reproducing "demo_tutorial/galaxy10/Galaxy10_Tutorial.ipynb" #27

Open
Junjie-Jin opened this issue May 4, 2024 · 7 comments

Comments

@Junjie-Jin

Junjie-Jin commented May 4, 2024

System information

  • Have I written custom code?:
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04 or Windows 10 v1709 x64): Mac M1
  • astroNN (Build or Version):
  • Did you try the latest astroNN commit?:
  • TensorFlow installed from (source or binary, official build?):
  • TensorFlow version: 2.16.1
  • Python version: 3.9
  • CUDA & cuDNN version (if applicable):
  • GPU model and memory (if applicable):
  • Exact command/script to reproduce (if applicable):

Describe the problem

The following error occurs when training the neural net:

Source code / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached. Try to provide a reproducible test case that is the bare minimum necessary to generate the problem.


TypeError Traceback (most recent call last)
Cell In[12], line 3
1 # To train the nerual net
2 # astroNN will normalize the data by default
----> 3 galaxy10net.train(train_images, train_labels)

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/shared/warnings.py:55, in deprecated_copy_signature.<locals>.deco.<locals>.tgt(*args, **kwargs)
49 warnings.warn(
50 f"Call to function {target.__name__}() is deprecated and will be removed in "
51 + f"future. Use {signature_source.__name__}() instead.",
52 stacklevel=2,
53 )
54 inspect.signature(signature_source).bind(*args, **kwargs)
---> 55 return target(*args, **kwargs)

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/models/base_cnn.py:702, in CNNBase.train(self, *args, **kwargs)
700 @deprecated_copy_signature(fit)
701 def train(self, *args, **kwargs):
--> 702 return self.fit(*args, **kwargs)

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/models/base_cnn.py:394, in CNNBase.fit(self, input_data, labels, sample_weight)
380 """
381 Train a Convolutional neural network
382
(...)
391 :History: 2017-Dec-06 - Written - Henry Leung (University of Toronto)
392 """
393 # Call the checklist to create astroNN folder and save parameters
--> 394 self.pre_training_checklist_child(input_data, labels, sample_weight)
396 reduce_lr = ReduceLROnPlateau(
397 monitor="val_loss",
398 factor=0.5,
(...)
403 verbose=self.verbose,
404 )
406 early_stopping = EarlyStopping(
407 monitor="val_loss",
408 min_delta=self.early_stopping_min_delta,
(...)
411 mode="min",
412 )

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/models/base_cnn.py:319, in CNNBase.pre_training_checklist_child(self, input_data, labels, sample_weight)
315 norm_labels = self.labels_normalizer.normalize(labels, calc=False)
316 if (
317 self.keras_model is None
318 ): # only compile if there is no keras_model, e.g. fine-tuning does not required
--> 319 self.compile()
321 norm_data = self._tensor_dict_sanitize(norm_data, self.keras_model.input_names)
322 norm_labels = self._tensor_dict_sanitize(
323 norm_labels, self.keras_model.output_names
324 )

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/models/base_cnn.py:235, in CNNBase.compile(self, optimizer, loss, metrics, weighted_metrics, loss_weights, sample_weight_mode)
229 raise RuntimeError(
230 'Only "regression", "classification" and "binary_classification" are supported'
231 )
233 self.keras_model = self.model()
--> 235 self.keras_model.compile(
236 loss=loss_func,
237 optimizer=self.optimizer,
238 metrics=self.metrics,
239 weighted_metrics=weighted_metrics,
240 loss_weights=loss_weights,
241 sample_weight_mode=sample_weight_mode,
242 )
244 # inject custom training step if needed
245 try:

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/keras/src/utils/traceback_utils.py:122, in filter_traceback.<locals>.error_handler(*args, **kwargs)
119 filtered_tb = _process_traceback_frames(e.__traceback__)
120 # To get the full stack trace, call:
121 # keras.config.disable_traceback_filtering()
--> 122 raise e.with_traceback(filtered_tb) from None
123 finally:
124 del filtered_tb

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/keras/src/utils/tracking.py:26, in no_automatic_dependency_tracking.<locals>.wrapper(*args, **kwargs)
23 @wraps(fn)
24 def wrapper(*args, **kwargs):
25 with DotNotTrackScope():
---> 26 return fn(*args, **kwargs)

TypeError: compile() got an unexpected keyword argument 'sample_weight_mode'
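
The TypeError above is raised because `sample_weight_mode` is passed to a `compile()` that no longer accepts that keyword. One defensive pattern, shown here only as a minimal sketch, is to drop keyword arguments that the target callable does not declare. The names `filter_supported_kwargs` and `compile_stub` are hypothetical and exist only for illustration; they are not part of astroNN or Keras, and whether the real `compile()` can be introspected this way is an assumption.

```python
import inspect


def filter_supported_kwargs(func, kwargs):
    """Return only the entries of kwargs that func can accept by keyword."""
    params = inspect.signature(func).parameters
    # If func declares **kwargs, every keyword is accepted; pass all through.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {name for name, p in params.items() if p.kind in keyword_kinds}
    return {k: v for k, v in kwargs.items() if k in accepted}


# Hypothetical stand-in for a compile() that lacks sample_weight_mode.
def compile_stub(optimizer="rmsprop", loss=None, metrics=None,
                 weighted_metrics=None, loss_weights=None):
    return {"optimizer": optimizer, "loss": loss}


requested = {"optimizer": "adam", "loss": "mse", "sample_weight_mode": None}
supported = filter_supported_kwargs(compile_stub, requested)
result = compile_stub(**supported)  # no TypeError: the extra keyword is gone
```

Silently dropping an argument is only reasonable when its value is a no-op default (such as `None` here); otherwise raising a clear error is probably the safer choice.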

Suggestion

Optional, if you have any idea how to fix the issue

@Junjie-Jin
Author

Maybe the output of "pip list" will help; it shows which package versions are installed.

@Junjie-Jin
Author

This is a software compatibility issue. On a Mac M1, the problem can be solved with the following installation:

conda create -n tensorflow-gpu python=3.8
conda activate tensorflow-gpu
python -m pip install -U pip
python -m pip install tensorflow-macos==2.12.0
python -m pip install tensorflow-metal
pip install scikit-learn
pip install tensorflow-probability==0.19.0

To check whether the installation is OK:
import sys
import platform
import tensorflow as tf
import tensorflow.keras
print(f"Python Platform: {platform.platform()}")
print(f"Tensor Flow Version: {tf.__version__}")
print(f"Keras Version: {tensorflow.keras.__version__}")
print()
print(f"Python {sys.version}")
# True if TensorFlow can see at least one GPU device
gpu = len(tf.config.list_physical_devices('GPU')) > 0
print("GPU is", "available" if gpu else "NOT AVAILABLE")

You should get:
Python Platform: macOS-12.3-arm64-i386-64bit
Tensor Flow Version: 2.12.0
Keras Version: 2.12.0

Python 3.8.19 | packaged by conda-forge | (default, Mar 20 2024, 12:49:57)
[Clang 16.0.6 ]
GPU is available
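
The same check can be made without importing TensorFlow by comparing the installed package versions against the pins from the installation commands above. This is a minimal sketch: `check_pins` and `PINS` are hypothetical names introduced for illustration, and the injectable `lookup` parameter is an assumption made so the function can be exercised without the packages installed.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins, lookup=version):
    """Return {package: (expected, installed)} for every pin that is not met.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# Exact pins taken from the installation commands above.
PINS = {"tensorflow-macos": "2.12.0", "tensorflow-probability": "0.19.0"}

if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every pinned package is installed at exactly the pinned version.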

@henrysky
Owner

henrysky commented May 6, 2024

Thanks for the bug report!

Indeed, this is an ongoing issue with the latest version of TensorFlow (which is separating Keras out again) and Keras v3. If you want to quickly train a neural network to classify Galaxy10, here is a notebook that fine-tunes ResNet-V2 with Keras v3 on Galaxy10 images loaded with astroNN.

https://drive.google.com/file/d/1GnrsZAPZFTfBrhuQ09zqh4n8x1QYEPOb/view?usp=sharing

Please let me know if the notebook works for you locally (it is unlikely that you can run it online with Google Colab, as you will get a resource-exhausted error due to the limited compute resources there).

@Junjie-Jin
Author

Running it locally reports the error "ModuleNotFoundError: No module named 'keras.dtype_policies'".

@henrysky
Owner

henrysky commented May 7, 2024

Are you using Keras v3? I think you need to use at least Keras v3 (and maybe at least TensorFlow v2.16) in order to run that notebook.

@Junjie-Jin
Author

Yes, the Keras version is 3.3.3. I get the error:


ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 11
10 try:
---> 11 from keras.src.dtype_policies.dtype_policy import set_dtype_policy
12 except ImportError:

ModuleNotFoundError: No module named 'keras.src'

During handling of the above exception, another exception occurred:

ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 13
11 from keras.src.dtype_policies.dtype_policy import set_dtype_policy
12 except ImportError:
---> 13 from keras.dtype_policies.dtype_policy import set_dtype_policy
15 pylab_style(paper=True)
16 set_dtype_policy("mixed_float16")

ModuleNotFoundError: No module named 'keras.dtype_policies'

@henrysky
Owner

henrysky commented May 7, 2024

Aghh, I see: I should check for ModuleNotFoundError, not ImportError. I have fixed the notebook, or you can simply change

except ImportError

to

except ModuleNotFoundError
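
For reference, in the traceback above the first import failed and the fallback import inside the except block then failed as well. In Python, ModuleNotFoundError is a subclass of ImportError, so either except clause catches a missing module. A minimal sketch of trying several candidate module paths in order follows; `import_first_available` is a hypothetical helper, and the module paths in the example are stand-ins rather than the notebook's actual Keras paths.

```python
import importlib


def import_first_available(candidates):
    """Import and return the first module path in candidates that exists."""
    errors = []
    for path in candidates:
        try:
            return importlib.import_module(path)
        except ImportError as exc:  # also catches ModuleNotFoundError
            errors.append(f"{path}: {exc}")
    # Nothing worked: report every attempt instead of only the last one.
    raise ImportError("no candidate could be imported; " + "; ".join(errors))


# ModuleNotFoundError derives from ImportError (Python 3.6+).
assert issubclass(ModuleNotFoundError, ImportError)

# Stand-in paths: the first does not exist, the second is in the stdlib.
mod = import_first_available(["no_such_package_xyz.sub", "json.decoder"])
```

When every candidate fails, the raised error lists each attempted path, which makes it easier to see that the installed package layout matches none of them.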
