Integration with Hugging Face: the demo should add report_to="none" to TrainingArguments #3136

Open
dshwei opened this issue Apr 17, 2024 · 0 comments
Labels
help wanted Extra attention is needed type / bug Issue type: something isn't working

🐛 Bug

  File "/demo/project/pre_train_ft_7b.py", line 200, in run_training
    trainer = Trainer(
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/transformers/trainer.py", line 503, in __init__
    self.callback_handler = CallbackHandler(
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/transformers/trainer_callback.py", line 313, in __init__
    self.add_callback(cb)
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/transformers/trainer_callback.py", line 330, in add_callback
    cb = callback() if isinstance(callback, type) else callback
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/transformers/integrations/integration_utils.py", line 597, in __init__
    from torch.utils.tensorboard import SummaryWriter  # noqa: F401
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/torch/utils/tensorboard/__init__.py", line 12, in <module>
    from .writer import FileWriter, SummaryWriter  # noqa: F401
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/torch/utils/tensorboard/writer.py", line 10, in <module>
    from tensorboard.compat.proto import event_pb2
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/tensorboard/compat/proto/event_pb2.py", line 17, in <module>
    from tensorboard.compat.proto import summary_pb2 as tensorboard_dot_compat_dot_proto_dot_summary__pb2
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/tensorboard/compat/proto/summary_pb2.py", line 17, in <module>
    from tensorboard.compat.proto import tensor_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__pb2
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/tensorboard/compat/proto/tensor_pb2.py", line 16, in <module>
    from tensorboard.compat.proto import resource_handle_pb2 as tensorboard_dot_compat_dot_proto_dot_resource__handle__pb2
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/tensorboard/compat/proto/resource_handle_pb2.py", line 16, in <module>
    from tensorboard.compat.proto import tensor_shape_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__shape__pb2
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/tensorboard/compat/proto/tensor_shape_pb2.py", line 36, in <module>
    _descriptor.FieldDescriptor(
  File "/demo/miniconda3/envs/sqlcode/lib/python3.9/site-packages/google/protobuf/descriptor.py", line 553, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
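For reference, workaround 2 from the error message can also be applied from inside the training script rather than the shell. This is only a minimal sketch: it assumes the variable is set before anything imports google.protobuf (directly, or indirectly through tensorboard or transformers), since protobuf picks its implementation at import time. As the message itself warns, pure-Python parsing is much slower.

```python
import os

# Assumption: this runs at the very top of the script, before any import
# that pulls in google.protobuf (tensorboard, transformers, ...).
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

# Imports that depend on protobuf would go below this line.
print(os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"])
```

Workaround 1 (downgrading protobuf to 3.20.x or lower) is a package-manager change and needs no code.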

To reproduce

Expected behavior

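In the demo, training_args = TrainingArguments(output_dir="test_trainer") should be changed to the line below, so that Trainer does not instantiate the TensorBoard callback that raises the error above:

training_args = TrainingArguments(output_dir="test_trainer", report_to="none")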
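If the demo should still log to TensorBoard where it works, a softer variant is to probe the import first and fall back to "none" only when it fails. The following is a hedged sketch, not part of the Aim demo: the helper name `importable` is hypothetical, and the probed module is simply the one that fails in the traceback above.

```python
import importlib


def importable(module_name: str) -> bool:
    """Return True if the module imports without raising (hypothetical helper)."""
    try:
        importlib.import_module(module_name)
    except Exception:
        # ImportError when the package is missing; the traceback above shows
        # a TypeError when tensorboard's generated protobuf files are stale.
        return False
    return True


# "none" disables the built-in logging integrations. A callback passed
# explicitly via callbacks=[...] (such as AimCallback) is not affected.
report_to = "tensorboard" if importable("tensorboard.compat.proto.event_pb2") else "none"
print(report_to)
```

The resulting value would then be passed as TrainingArguments(output_dir="test_trainer", report_to=report_to).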

Environment

  • Aim Version (e.g., 3.0.1)
  • Python version 3.9
  • pip version 3
  • OS (e.g., Linux) Ubuntu
  • Any other relevant information

Additional context

import evaluate
import numpy as np
from aim.hugging_face import AimCallback
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)


def tokenize_function(examples):
    return tokenizer(examples['text'], padding='max_length', truncation=True)


def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)


dataset = load_dataset('yelp_review_full')


tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')

tokenized_datasets = dataset.map(tokenize_function, batched=True)

small_train_dataset = tokenized_datasets['train'].shuffle(seed=42).select(range(1000))
small_eval_dataset = tokenized_datasets['test'].shuffle(seed=42).select(range(1000))

metric = evaluate.load('accuracy')

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=5
)

training_args = TrainingArguments(output_dir="test_trainer")

aim_callback = AimCallback(experiment="example_experiment")


trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
    callbacks=[aim_callback],
)

trainer.train()