New: Ultralytics YOLO-Human #12702

Open
wants to merge 106 commits into
base: main

Conversation

Laughing-q
Member

@Laughing-q Laughing-q commented May 15, 2024

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Introducing new YOLOHuman model for human attribute detection! 🚀

📊 Key Changes

  • Added YOLOHuman class as part of the model imports.
  • Introduced a new YAML configuration for the YOLOv8 human detection model.
  • Implemented additional augmentations to include human attribute data handling.
  • Established a new dataset class, HumanDataset, for loading and processing human-related datasets.
  • Included Human object in results to encapsulate detected human attributes.
  • Enriched model __init__.py to include YOLOHuman.
  • Formulated HumanPredictor, HumanTrainer, and HumanValidator under the new YOLO human module for prediction, training, and validation.

🎯 Purpose & Impact

  • Enhances Model Catalog: Expands Ultralytics' model offerings to include human-specific attribute detection.
  • Improves Dataset Handling: Offers streamlined process for datasets involving human features.
  • Facilitates Human-centric Applications: Paves the way for more sophisticated applications such as demographic analysis, security enhancements, and personalized customer experiences.

@Laughing-q
Member Author

@glenn-jocher @ambitious-octopus ok I've re-uploaded the weights and now everything passes in tests except the hub dataset, which I guess will be fine once the PR that @ambitious-octopus opened is merged. :)
pic-240531-1705-48

And now several GPUs have been freed up on our server, so I'll launch several training runs right now.

@ambitious-octopus
Member

docs image
val_batch0_labels

Member

@Burhan-Q Burhan-Q left a comment

I did not serve the docs locally, just quickly reviewed the raw markdown on GitHub. A few notes and suggestions, but overall looks excellent!

docs/en/reference/models/yolo/human/predict.md (outdated)
docs/en/tasks/human.md
docs/en/tasks/human.md (outdated)
docs/en/tasks/human.md
docs/en/tasks/human.md (outdated)
docs/en/datasets/human/index.md (outdated)
docs/en/models/yolo-human.md (outdated)
docs/en/models/yolo-human.md (outdated)
docs/en/models/yolo-human.md (outdated)
docs/en/tasks/human.md (outdated)
@glenn-jocher
Member

@glenn-jocher @ambitious-octopus Guys, I removed the YOLOHuman class since it's not needed now that we treat human as a new YOLO task. I've also fixed the save_one_txt issue for the human task. There's one more update from me: I figured we can use the save_txt method in Results directly instead of recreating something similar/redundant for val mode in each task.

    def save_txt(self, txt_file, save_conf=False):
        """
        Save predictions into txt file.
        Args:
            txt_file (str): txt file path.
            save_conf (bool): save confidence score or not.
        """
        is_obb = self.obb is not None
        boxes = self.obb if is_obb else self.boxes
        masks = self.masks
        probs = self.probs
        kpts = self.keypoints
        texts = []
        if probs is not None:
            # Classify
            [texts.append(f"{probs.data[j]:.2f} {self.names[j]}") for j in probs.top5]
        elif boxes:
            # Detect/segment/pose
            for j, d in enumerate(boxes):
                c, conf, id = int(d.cls), float(d.conf), None if d.id is None else int(d.id.item())
                line = (c, *(d.xyxyxyxyn.view(-1) if is_obb else d.xywhn.view(-1)))
                if masks:
                    seg = masks[j].xyn[0].copy().reshape(-1)  # reversed mask.xyn, (n,2) to (n*2)
                    line = (c, *seg)
                if kpts is not None:
                    kpt = torch.cat((kpts[j].xyn, kpts[j].conf[..., None]), 2) if kpts[j].has_visible else kpts[j].xyn
                    line += (*kpt.reshape(-1).tolist(),)
                line += (conf,) * save_conf + (() if id is None else (id,))
                texts.append(("%g " * len(line)).rstrip() % line)
        if texts:
            Path(txt_file).parent.mkdir(parents=True, exist_ok=True)  # make directory
            with open(txt_file, "a") as f:
                f.writelines(text + "\n" for text in texts)

    def save_one_txt(self, predn, save_conf, shape, file):
        """Save YOLO detections to a txt file in normalized coordinates in a specific format."""
        gn = torch.tensor(shape)[[1, 0, 1, 0]]  # normalization gain whwh
        for *xyxy, conf, cls in predn.tolist():
            xywh = (ops.xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
            line = (cls, *xywh, conf) if save_conf else (cls, *xywh)  # label format
            with open(file, "a") as f:
                f.write(("%g " * len(line)).rstrip() % line + "\n")

So I updated our detect/obb/human tasks to use Results.save_txt. The other tasks, i.e. segment/pose, actually have the save_one_txt part commented out, so I left them as-is for now; let's handle those two in another PR later.

    # if self.args.save_txt:
    #    save_one_txt(predn, save_conf, shape, file=save_dir / 'labels' / f'{path.stem}.txt')
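For readers unfamiliar with the label format both methods write, here is a minimal sketch of one detection line, written without torch. The helper names `xyxy_to_xywhn` and `format_label_line` are hypothetical and are not part of Ultralytics; the box and image size are arbitrary illustration values.

```python
def xyxy_to_xywhn(xyxy, img_w, img_h):
    """Convert pixel (x1, y1, x2, y2) to normalized (x_center, y_center, w, h)."""
    x1, y1, x2, y2 = xyxy
    return (
        (x1 + x2) / 2 / img_w,
        (y1 + y2) / 2 / img_h,
        (x2 - x1) / img_w,
        (y2 - y1) / img_h,
    )


def format_label_line(cls, xywhn, conf=None, track_id=None, save_conf=False):
    """Build one label line: class, box, then optional confidence and track id."""
    line = (cls, *xywhn)
    # Multiplying a tuple by a bool keeps it (True) or empties it (False),
    # mirroring `(conf,) * save_conf` in the quoted save_txt code.
    line += (conf,) * save_conf + (() if track_id is None else (track_id,))
    return ("%g " * len(line)).rstrip() % line


# Hypothetical 400x500 image with one box of class 0.
box = xyxy_to_xywhn((100, 50, 300, 250), img_w=400, img_h=500)
plain = format_label_line(0, box)
with_conf = format_label_line(0, box, conf=0.875, save_conf=True)
print(plain)
print(with_conf)
```

Because the line is assembled as a tuple first, the optional confidence and track id can be appended without extra branching, which seems to be what lets one method serve several tasks.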

@Laughing-q really nice! I was thinking the same thing. I think some of the val methods are copied just because one line needs to change from the base class, which is unfortunate.

We want to reduce copies/duplicates of classes as much as possible. Thinking about this some more, YOLO-Human should really be a more flexible class that is essentially object-detection + features, with a parameterized way to define those features in the labels/loss/plots, but for now our hard-coded implementation is a good start.
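One way to picture a "parameterized way to define those features" is a small declarative spec that the labels, loss, and plots could all read from. The sketch below is only an illustration: `AttributeSpec`, `head_width`, and `split_outputs` are hypothetical names, the `example_class` entry is a made-up placeholder, and the numeric values are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeSpec:
    """One extra per-box output. All names here are hypothetical."""

    name: str
    kind: str  # "regression" or "classification"
    num_outputs: int = 1  # 1 for a scalar regression, n for an n-way class


def head_width(specs):
    """Total number of extra channels the head must predict per box."""
    return sum(s.num_outputs for s in specs)


def split_outputs(values, specs):
    """Slice a flat per-box output vector into a dict keyed by attribute name."""
    out, start = {}, 0
    for s in specs:
        out[s.name] = values[start:start + s.num_outputs]
        start += s.num_outputs
    return out


# weight/height/age are the regression targets mentioned in this thread;
# "example_class" is a placeholder added only to show a multi-output entry.
SPECS = [
    AttributeSpec("weight", "regression"),
    AttributeSpec("height", "regression"),
    AttributeSpec("age", "regression"),
    AttributeSpec("example_class", "classification", 2),
]

width = head_width(SPECS)
parsed = split_outputs([70.0, 175.0, 30.0, 0.9, 0.1], SPECS)
print(width, parsed)
```

With a spec like this, adding or removing an attribute would be a one-line change to the list rather than a new subclass, at the cost of one more layer of indirection.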

@glenn-jocher
Member

glenn-jocher commented Jun 1, 2024

@Laughing-q also, yes, I changed the task from detect to human in places because otherwise the task was not being guessed correctly for short commands like yolo predict model=yolov8n-human.pt

EDIT: Would be nice to be able to read the task directly from the model instead of guessing, maybe directly from the checkpoint dictionary or the torch model.
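The idea of reading the task first and guessing only as a fallback can be sketched in a few lines. This is not the Ultralytics implementation: the `"task"` checkpoint key, the suffix table, and both function names are assumptions made for illustration, and the checkpoint is modelled as a plain dict.

```python
# Hypothetical filename-suffix rules; the real guessing logic may differ.
SUFFIX_TO_TASK = (
    ("-human", "human"),
    ("-seg", "segment"),
    ("-pose", "pose"),
    ("-obb", "obb"),
    ("-cls", "classify"),
)


def guess_task_from_name(filename):
    """Fallback: infer the task from the filename, defaulting to detect."""
    stem = filename.rsplit(".", 1)[0]
    for suffix, task in SUFFIX_TO_TASK:
        if stem.endswith(suffix):
            return task
    return "detect"


def resolve_task(checkpoint, filename):
    """Prefer a task stored in the checkpoint; guess from the name otherwise."""
    task = checkpoint.get("task")  # assumed key, not confirmed by this thread
    return task if task else guess_task_from_name(filename)


print(resolve_task({}, "yolov8n-human.pt"))
print(resolve_task({"task": "human"}, "custom-name.pt"))
print(resolve_task({}, "custom-name.pt"))
```

The second call shows the benefit: a renamed weights file still resolves correctly when the task travels with the checkpoint.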

@glenn-jocher
Member

glenn-jocher commented Jun 1, 2024

@Laughing-q maybe there are too many regression parameters for the new attributes? We have significant overfitting on YOLOv8x-human for attribute val losses, and I noticed the size difference is significant for YOLOv8n vs YOLOv8n-human:

  • YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs
  • YOLOv8n-human summary (fused): 225 layers, 4008539 parameters, 0 gradients, 12.0 GFLOPs
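To put the two summaries side by side, the overhead of the human head can be computed directly from the numbers quoted above. This is plain arithmetic on the reported figures, not a new measurement.

```python
# Figures copied from the two fused model summaries quoted above.
base = {"layers": 168, "params": 3_151_904, "gflops": 8.7}
human = {"layers": 225, "params": 4_008_539, "gflops": 12.0}

extra_layers = human["layers"] - base["layers"]
extra_params = human["params"] - base["params"]
param_increase_pct = 100 * extra_params / base["params"]
gflops_increase_pct = 100 * (human["gflops"] - base["gflops"]) / base["gflops"]

print(f"extra layers:  {extra_layers}")
print(f"extra params:  {extra_params} (+{param_increase_pct:.1f}%)")
print(f"extra compute: +{gflops_increase_pct:.1f}% GFLOPs")
```

On the nano model the attribute head therefore adds roughly 27% more parameters and roughly 38% more GFLOPs.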

glenn-jocher and others added 4 commits June 1, 2024 14:20
Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
@Laughing-q
Member Author

Laughing-q commented Jun 1, 2024

@glenn-jocher yes, I think you're right! In fact I've already made this part lighter once. Perhaps we can use group convs, i.e. set 3 groups, one for each regression part (weight/height/age). BTW I was intentionally using regular convs for them because I think these three attributes are connected in real life (at least height and weight are), and figured that not splitting them into groups might give better results.
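As a rough illustration of what grouping would save, the standard parameter count for a 2D convolution is (c_in / groups) × c_out × k × k weights plus c_out biases. The channel sizes below are hypothetical and are not taken from the actual YOLOv8-human head; `conv2d_params` is a helper written for this sketch.

```python
def conv2d_params(c_in, c_out, kernel_size, groups=1, bias=True):
    """Parameter count of a 2D convolution with square kernels."""
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    weights = (c_in // groups) * c_out * kernel_size * kernel_size
    return weights + (c_out if bias else 0)


# Hypothetical layer: 192 -> 192 channels, 3x3 kernel.
C_IN, C_OUT, K = 192, 192, 3

regular = conv2d_params(C_IN, C_OUT, K)            # all channels mixed together
grouped = conv2d_params(C_IN, C_OUT, K, groups=3)  # one group per attribute

print(f"regular conv: {regular} params")
print(f"3-group conv: {grouped} params")
print(f"weight reduction: {regular / grouped:.2f}x")
```

The grouped layer has a third of the weights, but each group sees only its own input channels, so any link between attributes such as height and weight would have to be learned before the grouped layers. That is the trade-off raised in the comment above.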

I also noticed the overfitting in my past experiments, but it's hard to tell whether the issue comes from a lack of data or from the model side. Since we kept generating more data, I haven't modified the head again yet.

Meanwhile I just realized that I didn't set stronger augmentation for the large models; we could retrain L/X with stronger augmentation.

@glenn-jocher Sorry, I'm not at my computer right now and won't be able to log in this weekend. Please feel free to modify the head and retrain our large models with stronger augmentation if you think it's necessary. :)
FYI I trained each model with its corresponding detection weights as pretrained weights for 100 epochs with batch=128; other settings remained default.
Training with pretrained weights gave better results than training from scratch in my experiments, so I kept it.

@glenn-jocher
Member

@Laughing-q ah ok got it! Yes maybe I'll work on some updates. We have our new L40S server too this week as part of our NVIDIA launchpad trial so I can try training there.

@Laughing-q
Member Author

@glenn-jocher oh right, before any updates please make sure that modifications to the model structure won't break our current deployment pipeline on the iOS side, or it might take extra time to develop the iOS side. (Reminding as the deadline is close. :))

Labels
enhancement New feature or request TODO Items that needs completing

7 participants