
About some details of paper #70

Closed

YahooKID opened this issue May 11, 2024 · 5 comments

@YahooKID commented May 11, 2024

Thanks for open-sourcing such a great model. I have a couple of questions about some details of the paper:

  1. In the training stage of the transition video generation model, did you freeze the Motion Modeling Module taken from AnimateDiff, or fine-tune (SFT) it together with the Semantic Space Motion Predictor (the Transformer block part)?
  2. Regarding the WebVid-10M training dataset: as far as I know, almost all videos in it carry similar watermarks in similar positions, and to my limited knowledge such consistently placed watermarks could hurt the model's capability. If you applied any preprocessing, could you share it?

Cheers

@brentjohnston

Crickets for some reason; I'd like to know as well.

@zhoudaquan (Collaborator)


Hi,

Thank you for your interest in the work.

  1. We train the motion predictor together with the motion module taken from AnimateDiff; both modules are trainable.
  2. Please refer to this script for watermark removal on the WebVid dataset: https://github.com/RoundofThree/python-scripts/blob/1f9455ce9f5832883e1002e73934afa4099a097e/watermark_removal/watermark_remover.py#L188
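For anyone reproducing point 1, here is a minimal PyTorch sketch of that training setup. The module classes and sizes are placeholders, not the actual repo code:

```python
import torch
import torch.nn as nn

# Placeholder stand-ins for the AnimateDiff motion module and the
# semantic space motion predictor; the real classes and sizes differ.
motion_module = nn.TransformerEncoderLayer(d_model=320, nhead=8, batch_first=True)
motion_predictor = nn.TransformerEncoderLayer(d_model=1024, nhead=8, batch_first=True)

# Per the answer above, neither module is frozen: both are trained.
for module in (motion_module, motion_predictor):
    for p in module.parameters():
        p.requires_grad = True

# One optimizer over both parameter groups.
optimizer = torch.optim.AdamW(
    list(motion_module.parameters()) + list(motion_predictor.parameters()),
    lr=1e-4,
)
```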

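And for point 2, a hedged sketch of one common way to strip fixed-position watermarks with OpenCV inpainting; this may differ from what the linked script actually does, and the mask coordinates below are illustrative only:

```python
import cv2
import numpy as np

def remove_watermark(frame: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Inpaint the masked watermark region of an 8-bit BGR frame."""
    return cv2.inpaint(frame, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)

# WebVid watermarks sit at a roughly fixed position, so a single mask
# can be reused for every frame of every clip. The rectangle below is
# illustrative, not taken from the linked script.
frame = np.zeros((336, 596, 3), dtype=np.uint8)  # dummy frame for the example
mask = np.zeros(frame.shape[:2], dtype=np.uint8)
mask[150:190, 200:400] = 255  # nonzero where the watermark sits

clean = remove_watermark(frame, mask)
```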
Regards,
Zhou Daquan

@YahooKID (Author)


Thanks.

@armored-guitar commented May 22, 2024

@zhoudaquan Hi, thank you for your great work! I am trying to reproduce your code. Could you please help clarify a few details:

  1. Do you use consistent self-attention for video training?
  2. On page 6 there is a figure of the architecture, which says you compress an image pair (2×H×W×3) into a semantic space of shape 2×N×C. What is N: 257 (the CLIP output tokens) or 1 (a linear projection)?
  3. What is the sequence length for the motion transformer? If it is F×N, what is N?

Looking forward to your answer.

@Z-YuPeng (Collaborator)

We encode a single image into N token vectors that represent different aspects of its semantic information, and then perform the prediction. Thus, each intermediate frame corresponds to N tokens in the semantic space.
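To make the shapes concrete, a minimal shape-level sketch of this setup. All names and sizes here are assumptions for illustration (N = 257 is used purely as an example value, e.g. CLIP ViT image tokens); each image becomes N semantic tokens of width C, and the motion transformer sees F frames, i.e. a sequence of F × N tokens:

```python
import torch
import torch.nn as nn

# Shape-level sketch only; all names and sizes are illustrative.
N, C, F = 257, 1024, 16

start_tokens = torch.randn(1, N, C)  # first keyframe -> N semantic tokens
end_tokens = torch.randn(1, N, C)    # last keyframe  -> N semantic tokens

# Stand-in motion transformer over the flattened (F * N)-token sequence.
motion_transformer = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=C, nhead=8, batch_first=True),
    num_layers=2,
)

# Initialize the F intermediate frames by interpolating the endpoints,
# then refine with the transformer; each frame keeps its N tokens.
w = torch.linspace(0, 1, F).view(F, 1, 1)
frames = (1 - w) * start_tokens + w * end_tokens   # (F, N, C)
seq = frames.reshape(1, F * N, C)                  # sequence length F * N
pred = motion_transformer(seq).reshape(F, N, C)
print(pred.shape)  # torch.Size([16, 257, 1024])
```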
