
Overfitting issues while using pre-trained weights #12794

Closed
rees8 opened this issue May 18, 2024 · 8 comments
Labels
question Further information is requested

Comments


rees8 commented May 18, 2024

Search before asking

Question

I'm training a custom dataset using pre-trained weights for YOLOv8.
The lines of code I used:

from ultralytics import YOLO
model = YOLO('yolov8n.pt')
results = model.train(data='/kaggle/input/isl-images/ISL_DATASET_DRIVE/data.yaml', epochs=100, imgsz=640, device=0)

I have implemented the trained model in an Android application, and it seems to recognize the images it was trained on but not any new images. I believe this is a case of overfitting. I am really new to YOLO, and I'm not sure how to reduce overfitting while using pre-trained weights, since the training parameters are not being explicitly specified.

I would like to know how to approach this problem and, if possible, get a sample code snippet that I can experiment with.

Additional

No response

@rees8 rees8 added the question Further information is requested label May 18, 2024

👋 Hello @rees8, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

If the Ultralytics CI badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

@glenn-jocher
Member

Hello! It looks like you're facing overfitting with your custom-trained YOLOv8 model. Overfitting often occurs when the model learns the training data too well, including its noise and details, at the expense of its ability to generalize to new data.

Here are a few strategies you might consider to reduce overfitting:

  1. Data Augmentation: Enrich your training dataset by transforming the training images (e.g., rotations, flipping, scaling). This helps the model generalize better.
  2. Regularization Techniques: Try using techniques like dropout, or adjusting the learning rate.
  3. Early Stopping: Monitor the model's performance on a validation set and stop training before it begins to over-fit.

Here's a code snippet that incorporates data augmentation. In Ultralytics YOLOv8, augmentation hyperparameters are passed directly to model.train() rather than set in the dataset YAML:

from ultralytics import YOLO

# Load a pretrained model
model = YOLO('yolov8n.pt')

# Train with explicit augmentation hyperparameters
results = model.train(
    data='/kaggle/input/isl-images/ISL_DATASET_DRIVE/data.yaml',
    epochs=100,
    imgsz=640,
    device=0,
    degrees=10.0,  # random rotation (+/- degrees)
    fliplr=0.5,    # horizontal flip probability
    scale=0.5,     # random scaling gain
    hsv_s=0.7,     # saturation augmentation
)

Check the Ultralytics documentation for the full list of augmentation hyperparameters and their default values.

These steps should help mitigate overfitting, giving your model a better chance to generalize to new images not seen during training. Give it a try and see how it improves your application's performance. Good luck! 🚀
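On point 3, the patience-based early-stopping idea can be sketched in plain Python (a simplified illustration of the concept, not Ultralytics' actual implementation; the metric values and `patience=5` are hypothetical):

```python
def early_stop_epoch(val_metrics, patience=5):
    """Return the epoch at which training would stop, given per-epoch
    validation metrics (higher is better), or None if it never stops."""
    best = float('-inf')
    best_epoch = 0
    for epoch, metric in enumerate(val_metrics):
        if metric > best:
            best, best_epoch = metric, epoch
        elif epoch - best_epoch >= patience:
            return epoch  # no improvement for `patience` consecutive epochs
    return None
```

Ultralytics exposes this behavior through the `patience` argument of `model.train()`, so in practice you set that rather than implement the loop yourself.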


rees8 commented May 19, 2024

Thank you, will try it out. Just as a double confirmation if this is indeed a case of overfitting or if there are some other issues, I am also attaching the performance metrics of the trained model:

[Attached: confusion matrix, F1/P/PR/R curves, and results plots]

@glenn-jocher
Member

Thanks for sharing the performance metrics! 📊 Based on these charts, there does appear to be some overfitting, as suggested by the gap between training and validation performance, along with some likely classification bias (as hinted by the confusion matrix). You might also want to check for class imbalance or data inconsistency, which could affect the model.

Implementing the earlier suggested strategies might help in improving the generalization. Additionally, experimenting with different varieties of data augmentation could also prove beneficial. If adjustments in training don't fully resolve the issue, consider reviewing the dataset's diversity and representation.

Keep experimenting, and feel free to reach out if you encounter further issues. Good luck! 🚀
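One quick way to check for the class imbalance mentioned above is to count instances per class in the YOLO-format label files (a sketch; `labels_dir` is a placeholder for your dataset's labels directory):

```python
from collections import Counter
from pathlib import Path

def count_class_instances(labels_dir):
    """Count bounding-box instances per class ID across YOLO-format
    label files (one 'class x y w h' row per object)."""
    counts = Counter()
    for label_file in Path(labels_dir).glob('*.txt'):
        for line in label_file.read_text().splitlines():
            if line.strip():
                counts[int(line.split()[0])] += 1
    return counts
```

If one or two classes dominate the counts, consider collecting or augmenting more examples of the rarer classes before retraining.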


rees8 commented May 20, 2024

I came across a blog mentioning that Roboflow offers augmentation options when creating a dataset, so I combined my custom dataset with one from Roboflow to add variation, then applied augmentations like shear, saturation, and crop. My dataset previously had around 3388 training images and 569 validation images; after augmentation, there are 12438 in train and 1036 in valid. I began re-training the model for 150 epochs but stopped at 130 because the P and R values had plateaued over the last 4 epochs. My model is starting to recognize the objects in real time now. I am attaching the new performance metrics below. What I want to know is whether this model is good enough to include in my results section (this is my capstone project), or whether you would suggest any further changes. I am not quite sure what the ideal metrics should be, so I can't make that decision myself. This model's metrics seem too good to be true, but then again, it is performing better than before.
(Sorry for bombarding you with such seemingly basic questions)

[Attached: confusion matrix and F1/P/PR/R curves]

@glenn-jocher
Member

@rees8 hello! It's great to hear about your progress and the improvements in your model's performance after applying augmentations. 🚀 From the metrics and the confusion matrix you've shared, it seems like your model is performing quite well, especially considering the significant increase in training data and the diversity introduced by augmentations.

For your capstone project, if the model is performing well in real-time and the metrics are stable (as indicated by the plateau in P and R values), it sounds like a solid candidate to include in your results section. The key here is the consistency of performance across different scenarios and the robustness your augmentations have introduced.

If you're looking for further validation, you might consider:

  • Conducting a few more tests in different real-world conditions to ensure the model's robustness.
  • Comparing these results with a baseline model or previous iterations to highlight the improvements.

Your approach and the results look promising! Don't hesitate to include these findings in your project, and good luck with your capstone presentation! 🌟
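To help interpret curves like the ones attached, precision, recall, and F1 for any single class can be recomputed from raw confusion-matrix counts (the numbers in the test below are hypothetical, not taken from the attached plots):

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true-positive,
    false-positive, and false-negative counts for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Recomputing these by hand for a couple of classes is a useful sanity check when reported metrics seem too good to be true.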


rees8 commented May 20, 2024

Thanks a lot for your assistance!

@glenn-jocher
Member

You're welcome! If you have any more questions or need further assistance as you continue working with YOLOv8, feel free to reach out. Happy coding! 😊

@rees8 rees8 closed this as completed May 20, 2024