GitHub - Anshumaan-Chauhan02/Guided-Flow-Matching: Utilized attention incorporated UNet model for conditional image generation using Flow Matching with Conditional Optimal Transport Objective

Guided Conditional Image Generation with Conditional Flow Matching

Project Description

This project trains an attention-based UNet with flow matching under a Conditional Optimal Transport objective, covering both conditional and unconditional image generation. Classifier Free Guidance (CFG) lets a single model handle both tasks. Because CIFAR10 provides class labels rather than descriptive text, the BLIP2 FLAN T5 model is used to caption the images, which gives the conditioning process richer input. Self and cross attention layers, which take the timestep and the tokenized text as inputs, carry out the conditioning. After extensive experiments, the selected architecture reaches an FID of 105.54 for unconditional generation, and a CLIPScore of 22.19 with an FID of 305.42 for conditional generation. These results leave room for improvement through architectural refinements and longer training.
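
For reference, the conditional optimal transport objective and the Classifier Free Guidance rule are commonly written as below. This is the standard textbook form, not a transcription of the notebooks: the symbols (source sample x_0, data sample x_1, caption c, empty caption ∅, guidance weight w, and the small constant σ_min) are notation assumed here.

```latex
% Straight-line (optimal transport) path between a source sample x_0 and a data sample x_1
x_t = \bigl(1 - (1 - \sigma_{\min})\,t\bigr)\,x_0 + t\,x_1, \qquad t \in [0, 1]

% Regression target for the UNet velocity field v_\theta
\mathcal{L}_{\mathrm{CFM}}(\theta)
  = \mathbb{E}_{t,\,x_0,\,x_1}
    \bigl\| v_\theta(x_t, t, c) - \bigl(x_1 - (1 - \sigma_{\min})\,x_0\bigr) \bigr\|^2

% Classifier Free Guidance at sampling time
\tilde{v}_\theta(x_t, t, c)
  = v_\theta(x_t, t, \varnothing)
  + w\,\bigl(v_\theta(x_t, t, c) - v_\theta(x_t, t, \varnothing)\bigr)
```

With w = 0 the guided field reduces to the unconditional model, and with w = 1 it reduces to the conditional one.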

Dependencies

Transformers

  !pip install transformers

PyTorch (Check CPU/GPU Compatibility)

  https://pytorch.org/get-started/locally/

Pandas

  !pip install pandas

NumPy

  !pip install numpy

Matplotlib

  !pip install matplotlib

TorchDiffEq

  !pip install torchdiffeq

Torchmetrics

  !pip install torchmetrics

Torchviz

  !pip install torchviz

Torch Fidelity

  !pip install torch-fidelity
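
After installing, a quick check such as the following can confirm that every package is importable. It is a minimal sketch using only the standard library; the helper name `missing_modules` is made up for this example, and the list uses import names, which differ from the pip name in one case (`torch-fidelity` is imported as `torch_fidelity`).

```python
import importlib.util

# Import names for the packages listed above.
REQUIRED_MODULES = [
    "transformers", "torch", "pandas", "numpy", "matplotlib",
    "torchdiffeq", "torchmetrics", "torchviz", "torch_fidelity",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

The check only locates the packages; it does not verify that the installed PyTorch build matches the local CPU/GPU setup.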

Dataset Information

CIFAR-10
- Publicly available at: https://www.cs.toronto.edu/~kriz/cifar.html
- For caption generation, see Caption_Generation.ipynb

File Content

Caption_Generation.ipynb:
- Uses the BLIP2 model to generate descriptive captions for images in the CIFAR-10 dataset and stores the resulting dataset as a pickle file.
Cross_Validation.ipynb:
- Runs cross-validation over a list of candidate learning rates.
Flow_Matching_Training.ipynb:
- Contains the full training process: flow matching with a conditional optimal transport objective, applied to the proposed UNet model.
Flow_Inference.ipynb:
- Generates images from uniformly sampled inputs and evaluates the FID and CLIPScore metrics for the trained models.
Text_Encoding.ipynb:
- Uses the BLIP2 tokenizer to convert captions into tokens for use in the conditioning process.
UNet_Attn.ipynb:
- Defines the proposed UNet model used for both conditional and unconditional image generation.
Docs
- Project Report: Documents the project, covering the Problem Statement, Data Augmentation, Methodology, UNet Model, and Results.
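
Training one model for both conditional and unconditional generation with Classifier Free Guidance is commonly done by dropping the caption for a random fraction of training examples. The sketch below illustrates that idea only; whether Flow_Matching_Training.ipynb does exactly this is an assumption, and the names `drop_captions`, `p_uncond`, and `null_caption` are hypothetical.

```python
import random


def drop_captions(captions, p_uncond=0.1, null_caption="", rng=random):
    """Replace each caption with `null_caption` with probability `p_uncond`.

    Examples that receive the null caption train the unconditional branch;
    the remaining examples train the conditional branch.
    """
    return [null_caption if rng.random() < p_uncond else caption
            for caption in captions]


if __name__ == "__main__":
    batch = ["a small airplane in the sky", "a brown horse in a field"]
    print(drop_captions(batch, p_uncond=0.5, rng=random.Random(0)))
```

Passing a seeded `random.Random` instance as `rng` makes the dropout pattern reproducible across runs.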

How to run

Dependency Installation:
- Install the packages listed under Dependencies using the pip commands given there.
Repository Cloning:
- Clone the project repository to the local machine using the command:
```
git clone https://github.com/Anshumaan-Chauhan02/Guided-Flow-Matching
```
Caption Generation:
- Run the Caption_Generation.ipynb notebook to generate the captioned dataset with the BLIP2 model.
Flow Matching Training:
- Run the Flow_Matching_Training.ipynb notebook to train the unconditional/conditional generation model.
Model Evaluation and Inference:
- Run the Flow_Inference.ipynb notebook to generate images and evaluate the trained model.
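
Conceptually, the inference step integrates the learned velocity field from t = 0 to t = 1, blending conditional and unconditional predictions with Classifier Free Guidance. The following is a framework-free sketch of that loop, not the notebook's code: plain Python lists stand in for image tensors, a fixed-step Euler loop stands in for the ODE solver, and the function names and the guidance form are assumptions.

```python
def guided_velocity(v_cond, v_uncond, w):
    """Classifier Free Guidance: v_uncond + w * (v_cond - v_uncond), elementwise."""
    return [u + w * (c - u) for c, u in zip(v_cond, v_uncond)]


def euler_sample(velocity_fn, x0, n_steps=100):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # t stays strictly below 1.0
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    # Toy velocity fields that pull every coordinate toward a fixed target.
    def toy_field(target):
        return lambda x, t: [(target - xi) / (1.0 - t) for xi in x]

    cond, uncond = toy_field(2.0), toy_field(0.0)
    field = lambda x, t: guided_velocity(cond(x, t), uncond(x, t), w=1.0)
    print(euler_sample(field, [0.5, -0.5], n_steps=10))
```

In the toy example the guidance weight w = 1.0 selects the conditional field, so the samples end up at the conditional target; larger weights push further in the conditional direction.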

Note:

Update the file paths specified in the notebooks so that they point to your local copy of the repository.
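
One way to keep this manageable is to define the repository root once and derive every other path from it. The snippet below is only a suggestion: the clone location, the helper `repo_path`, and the file and folder names are hypothetical and do not come from the notebooks.

```python
from pathlib import Path

# Hypothetical clone location: change this to your local repository path.
REPO_ROOT = Path.home() / "Guided-Flow-Matching"


def repo_path(*parts, root=REPO_ROOT):
    """Build a path inside the repository from its components."""
    return Path(root).joinpath(*parts)


# Hypothetical file and folder names, for illustration only.
CAPTIONS_FILE = repo_path("data", "cifar10_captions.pkl")
CHECKPOINT_DIR = repo_path("checkpoints")

if __name__ == "__main__":
    print(CAPTIONS_FILE)
    print(CHECKPOINT_DIR)
```

With this pattern, moving the repository only requires changing `REPO_ROOT` in one place.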
