
fo.core.video.make_frames_dataset is sneakily considered a frame view #4397

Closed
1 of 3 tasks
evatt-harvey-salinger opened this issue May 18, 2024 · 9 comments · Fixed by #4416
Labels
bug Bug fixes

Comments

@evatt-harvey-salinger

evatt-harvey-salinger commented May 18, 2024

Describe the problem

fo.core.video.make_frames_dataset seems like it should create a basic Dataset, as opposed to Dataset.to_frames(), which should make a FramesView. However, this line:

dataset = fod.Dataset(name=name, _frames=True)

...results in Dataset._is_frames being True.

So even though the object is actually of type Dataset, functions like annotate() that operate on its view() treat it as a FramesView.

Is this expected behavior? I would assume that make_frames_dataset was specifically intended to create something distinct from a frames view, and that it would be treated as a normal dataset. (I'll note that even dataset.clone() returns a dataset where _is_frames is False.)

Code to reproduce issue

import fiftyone as fo

vid_dataset = ...  # contains video samples
new_dataset = fo.core.video.make_frames_dataset(vid_dataset)
print(new_dataset._is_frames)  # True
new_dataset.annotate(...)
# ValueError: Annotating frames views is not supported
clone = new_dataset.clone()
print(clone._is_frames)  # False

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 22.04): Ubuntu
  • Python version (python --version): 3.12.2
  • FiftyOne version (fiftyone --version): 0.23.8
  • FiftyOne installed from (pip or source): pip

Other info/logs

Willingness to contribute

  • Yes. I can contribute a fix for this bug independently
  • Yes. I would be willing to contribute a fix for this bug with guidance
    from the FiftyOne community
  • No. I cannot contribute a bug fix at this time
@evatt-harvey-salinger evatt-harvey-salinger added the bug Bug fixes label May 18, 2024
@benjaminpkane
Contributor

benjaminpkane commented May 22, 2024

Hi @evatt-harvey-salinger. Under the hood, fo.core.video.make_frames_dataset is the function used to create a Dataset.to_frames() view. The term "dataset" is likely overloaded in this context, but the other to_* stages use the same nomenclature, e.g., fo.core.patches.make_patches_dataset and Dataset.to_patches().

Zooming out a bit, perhaps adding support (or a best practice) for annotating frame collections is the main goal?

@brimoor
Contributor

brimoor commented May 23, 2024

@evatt-harvey-salinger thanks for calling this out!

I think it is a valid use case to directly call methods like make_frames_dataset() and that, indeed, you should get a "regular" dataset when you do that. This will be supported as of #4416.

In the meantime, it is slightly less efficient, but you can achieve the same end result via clone() like this:

patches_dataset = sample_collection.to_patches(...).clone()
frames_dataset = sample_collection.to_frames(...).clone()
clips_dataset = sample_collection.to_clips(...).clone()
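For example, the clone behaves like a regular image dataset (a minimal sketch; sample_frames=True is assumed here as the frame-extraction setting):

frames_dataset = vid_dataset.to_frames(sample_frames=True).clone()
print(frames_dataset._is_frames)  # False: a regular dataset, so annotate() works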

@evatt-harvey-salinger
Author

evatt-harvey-salinger commented May 23, 2024

Thanks @brimoor and @benjaminpkane!

Great, looks like #4416 will address the suggestion that make_frames_dataset() should return a "regular" dataset.

In general, I agree that adding support for annotating a FrameView of a video dataset would be an amazing feature. I can envision a few good use cases...

  • It would allow partial annotation runs on videos while maintaining the integrity of the source video (e.g., labeling a 30 fps video at 1 fps, then later adding annotations at 3 fps).
  • My initial idea was to downsample and annotate frames at ~4 fps, then train a model to auto-label the rest of the video.

Currently, it seems that the workflow would be to use to_frames(...).clone() to sample and annotate a subset of the video, and then maintain the video dataset alongside the "to_frames.clone" dataset. I could either (1) store the annotations in the "to_frames.clone" dataset, progressively sampling more frames of the video and merging and labeling them into the "to_frames.clone" dataset in batches (sketched below), or (2) store the annotations in the video dataset, by annotating the "to_frames.clone" dataset and then merging the annotations into the video frames by associating the frame_numbers.

This is certainly doable. But if FrameViews could be annotated directly, and the annotations could be imported straight into the video dataset, it would remove the need to flow back and forth between two datasets (and mitigate the risk of accidentally tweaking one dataset out of alignment with the other).
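For illustration, here's a minimal sketch of option (1). The fps values and dataset name are hypothetical, and it assumes to_frames(sample_frames=True, fps=...) extracts frames to deterministic filepaths so that merge_samples() can key on filepath:

# first pass: sample the videos at 1 fps and clone into a regular dataset
frames_clone = video_dataset.to_frames(sample_frames=True, fps=1).clone("frames-pass-1")

# later pass: sample at 3 fps and merge the new frames in by filepath
more_frames = video_dataset.to_frames(sample_frames=True, fps=3)
frames_clone.merge_samples(more_frames, key_field="filepath")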

@evatt-harvey-salinger
Author

I'll close the issue, since #4416 addresses the original request. But I'd love to hear what you think about the more general idea of annotating FrameViews directly, so I'll stay tuned on this thread!

@brimoor
Contributor

brimoor commented May 24, 2024

Out of curiosity, is there a reason you specifically want to annotate your videos as individual frames rather than directly calling annotate() on your media_type == "video" dataset?

@evatt-harvey-salinger
Author

Hi Brian,

I've tried to answer this a few different times, but then I get new ideas and try to hack together a solution. I haven't really found one yet.

Basically, I have many hours' worth of 15 fps videos to label. Each video sample has way too many frames to label all at once. I'd like to be able to downsample and iteratively label portions of video datasets, while retaining the integrity of the video samples as videos (rather than just converting them into image datasets). That would enable me to annotate the videos at 1 fps, then come back and annotate at 4 fps, or use the 1 fps frames to train a model that can help me auto-label a portion of the unlabeled frames.

For example, I have a workflow with image datasets that looks like this:

  1. request annotation for a view first_pass
  2. retrieve annotations, and use the anno_results.frame_id_map to select the frame_ids to reconstruct the first_pass view (a capability we should add btw :) )
  3. programmatically exchange label_requested tags for labeled tags (see the sketch after this list)
  4. train a model on the labeled samples
  5. run inference on the unlabeled samples and label them as auto-labeled
  6. form a new view second_pass with a portion of the auto-labeled samples, where I correct the auto-labeled predictions
  7. retrieve those annotations, and iterate
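A minimal sketch of step 3, assuming the samples carry a label_requested tag (the tag names are from my workflow, not FiftyOne built-ins):

# swap "label_requested" tags for "labeled" on the annotated samples
labeled_view = dataset.match_tags("label_requested")
labeled_view.untag_samples("label_requested")
labeled_view.tag_samples("labeled")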

I would like to develop an analogous workflow for video datasets. Sending FrameViews for annotation, then retrieving the annotations and pulling them directly into my video dataset, would be the cleanest way to enable this kind of workflow.

As I said, I've been trying to find a workaround, but I haven't yet found a solution that isn't terribly convoluted. I know that I can abandon the video datasets altogether and convert everything to image datasets, but it would be a shame not to make use of the other video dataset capabilities. I would also like to keep the source files as videos, which are cleaner to store, version, view, etc.

I've gotten close to a solution where I maintain a video dataset and a corresponding image dataset as a pair. I can use the workflow above to add annotations to the images, then use the frame_numbers (a field automatically populated by make_frames_dataset()) to merge the annotations back into the video dataset, roughly as sketched below. However, this has proven to be quite tricky.
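The merge-back step looks roughly like this (a sketch only; it assumes the image dataset retains the sample_id and frame_number fields that make_frames_dataset() populates, and that the labels live in a ground_truth field):

# copy labels from the image dataset back into the source video dataset,
# matching frames by (sample_id, frame_number)
for image_sample in frames_dataset.exists("ground_truth"):
    video_sample = video_dataset[image_sample.sample_id]
    video_sample.frames[image_sample.frame_number]["ground_truth"] = image_sample.ground_truth
    video_sample.save()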

@evatt-harvey-salinger
Author

I know that I can use the frame_step parameter in annotate with the CVAT backend. But if I use the tracks feature in CVAT, then the detections actually get interpolated once they are imported into the FO dataset anyway. For example, if I use frame_step=8 for a 32-frame video, I would only label ~4 frames in CVAT, but after importing back into FO, all 32 frames are labeled.

frame_step also can't be used for datasets that already have tracks.

Because of these two things, I'm going to just live with labeling full-fps videos (with whatever downsampling I want on the front end) and achieve "partial" annotation by sending different clips within the video at a time, roughly like this:
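(A sketch only; the frame range and anno_key are illustrative, and I'm selecting a single video at a time to sidestep overlapping frame numbers across videos.)

# send the first 30 seconds of one 15 fps video to CVAT
single_video = video_dataset.select(video_sample_id)
single_video.annotate(
    "clip_pass_1",
    label_field="frames.detections",
    frame_start=0,
    frame_stop=450,  # 30 s * 15 fps
)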

@evatt-harvey-salinger
Author

Anyway, I hope this description gives you an idea of the workflow I was trying to achieve by annotating FrameViews directly!

@evatt-harvey-salinger
Author

evatt-harvey-salinger commented May 31, 2024

One additional wrinkle is that dataset.annotate(frame_start=..., frame_stop=...) doesn't actually work for datasets with multiple videos. Since video samples can have overlapping frame numbers, only the last video's frames get sent. I'll add a separate issue about that: #4447
