Add bounding box converter #30852

qubvel · 2024-05-16T10:58:34Z

What does this PR do?

Introduces the output_bbox_format parameter for post_process_object_detection.
Adds the convert_boxes function to convert boxes from one format to another, e.g. from YOLO to Pascal VOC.

boxes = convert_boxes(
    boxes, input_format="relative_xcycwh", output_format="absolute_xyxy", image_size=target_sizes
)

- Works with np.array, torch.tensor, list, and tuple inputs (could easily be extended to jax and tf).
- Works with a single box of shape (4,), the boxes of one image with shape (N, 4), and the boxes of multiple images as List[(N, 4)].
- Supports boxes with extra metadata, e.g. box=[x_min, y_min, x_max, y_max, color, class, ...].
- Supports format aliases, e.g. yolo, coco.
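
To make the shape handling and metadata pass-through described above concrete, here is a minimal NumPy sketch for a single conversion (absolute xywh to absolute xyxy). The helper name `xywh_to_xyxy` is hypothetical and this is not the PR's implementation; in particular, it always returns NumPy arrays, which may differ from what `convert_boxes` returns.

```python
import numpy as np


def xywh_to_xyxy(boxes):
    """Convert [x_min, y_min, width, height, *extra] to [x_min, y_min, x_max, y_max, *extra].

    Hypothetical helper for illustration only, not the PR's convert_boxes.
    """
    # A list/tuple of per-image 2D arrays: convert each image separately.
    if isinstance(boxes, (list, tuple)) and len(boxes) > 0 and np.ndim(boxes[0]) == 2:
        return [xywh_to_xyxy(image_boxes) for image_boxes in boxes]

    array = np.asarray(boxes, dtype=float)
    single_box = array.ndim == 1         # a single box of shape (4 + k,)
    array = np.atleast_2d(array).copy()  # work on shape (N, 4 + k) without mutating the input

    # Only the first four columns are coordinates; extra columns pass through untouched.
    array[:, 2] = array[:, 0] + array[:, 2]  # x_max = x_min + width
    array[:, 3] = array[:, 1] + array[:, 3]  # y_max = y_min + height

    return array[0] if single_box else array


single = xywh_to_xyxy([10, 20, 30, 40])                     # single box, shape (4,)
with_meta = xywh_to_xyxy(np.array([[10, 20, 30, 40, 7]]))   # one extra metadata column
per_image = xywh_to_xyxy([np.array([[0, 0, 5, 5]]), np.array([[1, 1, 2, 2]])])
```

The extra column (here `7`) is carried over unchanged because only columns 0 to 3 are rewritten.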

Supported input/output bounding box formats:
- absolute_xyxy (aliases: pascal_voc, xyxy): [x_min, y_min, x_max, y_max]
- absolute_xywh (aliases: coco, xywh): [x_min, y_min, width, height]
- absolute_xcycwh: [center_x, center_y, width, height]
- relative_xyxy (aliases: albumentations): [x_min, y_min, x_max, y_max] normalized to [0, 1] by image size
- relative_xywh: [x_min, y_min, width, height] normalized to [0, 1] by image size
- relative_xcycwh (aliases: yolo, xcycwh): [center_x, center_y, width, height] normalized to [0, 1] by image size
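
As an illustration of what the relative_xcycwh to absolute_xyxy conversion involves, here is a minimal NumPy sketch. The function name `relative_xcycwh_to_absolute_xyxy` is hypothetical, and the `(height, width)` ordering of `image_size` is an assumption; the PR's `convert_boxes` may handle both differently.

```python
import numpy as np


def relative_xcycwh_to_absolute_xyxy(boxes, image_size):
    """Hypothetical helper, not the PR's convert_boxes.

    boxes: (N, 4) array of [center_x, center_y, width, height], normalized to [0, 1].
    image_size: (height, width) in pixels (this ordering is an assumption).
    """
    boxes = np.asarray(boxes, dtype=float)
    height, width = image_size

    center_x, center_y, box_w, box_h = boxes.T
    # Move from center/size to corners, then scale from [0, 1] to pixels.
    x_min = (center_x - box_w / 2) * width
    y_min = (center_y - box_h / 2) * height
    x_max = (center_x + box_w / 2) * width
    y_max = (center_y + box_h / 2) * height
    return np.stack([x_min, y_min, x_max, y_max], axis=-1)


yolo_boxes = np.array([[0.5, 0.5, 0.5, 0.25]])
print(relative_xcycwh_to_absolute_xyxy(yolo_boxes, image_size=(480, 640)))
# [[160. 180. 480. 300.]]
```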

Before submitting

- Did you read the contributor guideline, Pull Request section?
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- Did you write any new necessary tests?

Who can review?

@amyeroberts could you please look at this draft? It looks a bit overcomplicated; we should probably drop some functionality in favor of simplicity.

HuggingFaceDocBuilderDev · 2024-05-16T11:46:25Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

amyeroberts

Thanks for working on this!

I haven't done a full review (e.g. checking boring things like docstring formats); I've just looked at the overall structure.

This is really nicely structured and handled. I don't have any major comments on the overall logic - it's incredibly clean. Thanks for taking the time to write out comprehensive tests too. My only major comment is to not have aliases for the different formats, in particular dataset names like coco: they add unnecessary complexity. I'm less sure about e.g. xyxy -> absolute_xyxy.

amyeroberts · 2024-05-16T19:18:08Z

src/transformers/image_box_utils.py

@@ -0,0 +1,523 @@
+import logging


Missing copyright header

amyeroberts · 2024-05-20T18:07:54Z

src/transformers/models/grounding_dino/image_processing_grounding_dino.py

-    return norm_annotation
+    """Convert annotation boxes from absolute xyxy (pascal voc) to relative xcycwh (yolo) format."""
+    if "boxes" in annotation:
+        annotation = annotation.copy()  # shallow copy


What's the purpose of the copy here? Only annotation["boxes"] is passed to convert_boxes, and the result is assigned directly back to annotation["boxes"].
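
For context on the question, here is a small sketch of what a shallow copy changes in this pattern. This is one possible reading, not something the author confirms in this thread: without the copy, assigning to `annotation["boxes"]` updates the dict object the caller passed in. The helpers `scale_boxes`, `convert_without_copy`, and `convert_with_copy` are hypothetical stand-ins.

```python
def scale_boxes(boxes, factor):
    # Stand-in for a real box conversion; returns a new list.
    return [[value * factor for value in box] for box in boxes]


def convert_without_copy(annotation):
    # Assigning to the key mutates the dict object the caller passed in.
    annotation["boxes"] = scale_boxes(annotation["boxes"], 0.5)
    return annotation


def convert_with_copy(annotation):
    annotation = annotation.copy()  # shallow copy: new dict, same value objects
    annotation["boxes"] = scale_boxes(annotation["boxes"], 0.5)
    return annotation


original = {"boxes": [[10, 20, 30, 40]], "class_labels": [1]}

converted = convert_with_copy(original)
print(original["boxes"])   # [[10, 20, 30, 40]]
print(converted["boxes"])  # [[5.0, 10.0, 15.0, 20.0]]

convert_without_copy(original)
print(original["boxes"])   # [[5.0, 10.0, 15.0, 20.0]]
```

Whether callers rely on their input staying unchanged is a design question for the PR; the sketch only shows the behavioral difference.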

amyeroberts · 2024-05-20T18:08:32Z

src/transformers/image_box_utils.py

+ArrayType = Union["torch.Tensor", np.ndarray]
+
+
+SUPPORTED_BOX_FORMATS = [


I'd have this as an enum
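
A minimal sketch of what an enum for the formats might look like, using the six format names from the PR description. The class name `BoxFormat` and the helper `parse_box_format` are hypothetical and are not part of the PR.

```python
from enum import Enum


class BoxFormat(str, Enum):
    """Hypothetical enum of the supported formats (member names are illustrative)."""

    ABSOLUTE_XYXY = "absolute_xyxy"
    ABSOLUTE_XYWH = "absolute_xywh"
    ABSOLUTE_XCYCWH = "absolute_xcycwh"
    RELATIVE_XYXY = "relative_xyxy"
    RELATIVE_XYWH = "relative_xywh"
    RELATIVE_XCYCWH = "relative_xcycwh"


def parse_box_format(value):
    """Accept either a BoxFormat member or its string value."""
    try:
        return BoxFormat(value)
    except ValueError:
        supported = ", ".join(fmt.value for fmt in BoxFormat)
        raise ValueError(
            f"Unsupported box format {value!r}. Supported formats: {supported}"
        ) from None


print(parse_box_format("relative_xcycwh"))
```

Subclassing `str` keeps plain string comparisons working (`BoxFormat.ABSOLUTE_XYXY == "absolute_xyxy"`), so existing string arguments would not need to change.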

amyeroberts · 2024-05-20T18:12:29Z

src/transformers/image_box_utils.py

+        or most likely as a list of 2D arrays/tensors.
+
+    Supported input/output bounding box formats:
+        - `absolute_xyxy` (aliases: `pascal_voc`, `xyxy`): [x_min, y_min, x_max, y_max]


I'd rather we didn't try to handle different aliases and just kept it simple, e.g. xyxy.

amyeroberts · 2024-05-20T18:14:46Z

src/transformers/image_box_utils.py

+    input_format: str,
+    output_format: str,
+    image_size: Optional[Union[ArrayType, List, Tuple]] = None,
+    check: Optional[str] = "warn",


I can see "raise" never being used. I like the logic, but I think we might be engineering something we'll never need.

amyeroberts · 2024-05-20T18:21:41Z

src/transformers/image_box_utils.py

+        return _convert_boxes_arrays_2d(boxes, input_format, output_format, image_size, check)
+
+    # Recursive approach.
+    elif depth == 1 and isinstance(boxes, (list, tuple) or is_array_type(boxes)):


Regarding the isinstance(boxes, (list, tuple) or is_array_type(boxes)) check - what are the other types boxes could be here?
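
Separately from the question, the placement of the closing parenthesis in that condition appears to matter. A small sketch of how Python evaluates the expression as written, with `is_array_type` replaced by a hypothetical NumPy-only stand-in for the PR's helper:

```python
import numpy as np


def is_array_type(value):
    # Hypothetical stand-in for the PR's helper of the same name.
    return isinstance(value, np.ndarray)


boxes = np.zeros((2, 4))

# As written: `(list, tuple) or ...` evaluates to `(list, tuple)` because a
# non-empty tuple is truthy, so is_array_type is never called.
as_written = isinstance(boxes, (list, tuple) or is_array_type(boxes))

# Possibly intended: the `or` sits outside the isinstance call.
as_intended = isinstance(boxes, (list, tuple)) or is_array_type(boxes)

print(as_written, as_intended)  # False True
```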

qubvel added 8 commits May 16, 2024 09:59

Remove unused boxes and code from panoptic

dddc468

Convert boxes with tests

dcd7efc

Add convert_boxes to post_process_object_detection

8dadabf

Remove corners_to_center_format function

3b00531

Update docs

e9d230a

Fix typing

9f034ee

Adjust test owlvit

764117e

Fixup

1339939

qubvel marked this pull request as draft May 16, 2024 10:58

qubvel added 2 commits May 16, 2024 11:18

Add module to init

044b383

Fix style

85caa15

amyeroberts reviewed May 20, 2024
