Add image-text-to-text and edit image-to-text task pages #553

merveenoyan · 2024-03-14T15:01:29Z

No description provided.

osanseviero

Looking nice!

packages/tasks/src/tasks/image-text-to-text/data.ts

osanseviero · 2024-03-18T09:12:24Z

packages/tasks/src/tasks/image-text-to-text/about.md

@@ -0,0 +1,32 @@
+## Use Cases
+
+### Visual Question Answering


I would avoid including examples of things that are already covered in other task pages, to prevent confusion. The three current examples are already covered, e.g.:

VQA https://huggingface.co/tasks/visual-question-answering

Doc QA https://huggingface.co/tasks/document-question-answering

Captioning https://huggingface.co/tasks/image-to-text

I think I disagree here. One could segment humans with both image segmentation and mask detection, for instance (except for the zero-shot part). Some Hugging Face tasks are similar. Some models are so general that they can handle these tasks altogether, and the paradigm is shifting in that direction anyway.

These models are similar either way. If I were an MLE who had to do captioning, I'd like to try both VLMs (which, by the way, perform better in my opinion) and direct captioning models (image-to-text), whose sole purpose is captioning. I wouldn't want to keep this information from the user.

Maybe we can briefly explain this just after the Use Cases heading. Also, I think we haven't defined VLMs yet. For example:

These models are commonly called vision-language models, or VLMs. They can typically generalize to various types of tasks for which specialist models may also exist. For example, you can use a VLM to caption an image, or you can use specific captioning models as described in the image-to-text task page.

Edit: VLMs are indeed defined in data.ts, but I think it doesn't hurt to also mention it here.

I think it would be too repetitive. See here, for instance: the definition in data.ts will be at the top of the page, and then it would be repeated right after the Use Cases heading. I'd rather not add it there.

My main concern is that now we have two tasks that cover the same thing, so it could end up confusing users. As an example, imagine if https://huggingface.co/tasks/text-generation use cases were Translation and Summarization, which are also their own separate tasks.

I wonder if we can list specific applications here rather than the sub-tasks currently covered (thinking of https://huggingface.co/tasks/text-generation#use-cases as a nice example).

@osanseviero I don't think there are many specific, solid use cases: they'd either be far too specific or eventually fall under visual question answering/retrieval.

An organic way for me to infer use cases for a task has been to check the example inputs of many Spaces built for that task, and for VLMs these are mostly visual question answering.

I added a bunch of other things.

packages/tasks/src/tasks/image-text-to-text/about.md

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

pcuenca

Fixed a couple of formatting issues. Agree with Omar's comments :)

packages/tasks/src/tasks/image-text-to-text/about.md

packages/tasks/src/tasks/image-text-to-text/data.ts

packages/tasks/src/tasks/index.ts

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

packages/tasks/src/tasks/image-text-to-text/about.md

pcuenca · 2024-03-22T12:58:50Z

packages/tasks/src/tasks/image-text-to-text/data.ts

+const taskData: TaskDataCustom = {
+	datasets: [
+		{
+			// TODO write proper description


packages/tasks/src/tasks/image-text-to-text/data.ts

packages/tasks/src/tasks/image-to-text/about.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

merveenoyan · 2024-03-26T11:10:09Z

FYI @pcuenca, this PR doesn't have to be merged until Niels' pipeline PR is merged, imo.

Add it2t and edit i2t

4770702

merveenoyan requested review from osanseviero, SBrandeis, gary149, Wauplin and julien-c as code owners March 14, 2024 15:01

merveenoyan changed the title ~~Add image-text-to-text and edit image-to-text~~ Add image-text-to-text and edit image-to-text task pages Mar 14, 2024

Merge branch 'main' into add-image-text-to-image

c81cb9b

osanseviero requested review from pcuenca and removed request for julien-c, gary149, Wauplin and SBrandeis March 16, 2024 09:57

osanseviero requested changes Mar 18, 2024

Update packages/tasks/src/tasks/image-text-to-text/about.md

95bf09d

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

pcuenca reviewed Mar 18, 2024

packages/tasks/src/tasks/image-text-to-text/about.md (outdated)

packages/tasks/src/tasks/image-text-to-text/data.ts (outdated)

packages/tasks/src/tasks/index.ts (outdated)

merveenoyan and others added 7 commits March 19, 2024 12:00

Update packages/tasks/src/tasks/image-text-to-text/about.md

3394d8a

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update packages/tasks/src/tasks/index.ts

2aca458

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update packages/tasks/src/tasks/image-text-to-text/data.ts

d8cab7c

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

address comments

80f6593

Merge branch 'main' into add-image-text-to-image

d55db78

format

c5196d3

remove device

92b9abc

merveenoyan requested review from pcuenca and osanseviero March 19, 2024 10:16

pcuenca reviewed Mar 22, 2024

merveenoyan and others added 4 commits March 22, 2024 17:11

Update packages/tasks/src/tasks/image-text-to-text/about.md

f797fe2

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update packages/tasks/src/tasks/image-text-to-text/about.md

80daa53

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update packages/tasks/src/tasks/image-text-to-text/data.ts

454dc62

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update packages/tasks/src/tasks/image-text-to-text/data.ts

309f872

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

merveenoyan and others added 4 commits March 25, 2024 22:44

Update packages/tasks/src/tasks/image-text-to-text/about.md

9b0a190

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update packages/tasks/src/tasks/image-text-to-text/about.md

d42cbc4

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update data.ts

9b1f696

Merge branch 'main' into add-image-text-to-image

d66ba45

merveenoyan added 4 commits March 26, 2024 16:48

Update about.md

a171baa

Merge branch 'main' into add-image-text-to-image

828eb4d

Update about.md

f786f53

Merge branch 'main' into add-image-text-to-image

54dda80

merveenoyan commented Mar 14, 2024

osanseviero left a comment

osanseviero Mar 18, 2024

merveenoyan Mar 19, 2024

pcuenca Mar 22, 2024 (edited)

merveenoyan Mar 25, 2024

osanseviero Mar 25, 2024

merveenoyan Apr 9, 2024 (edited)

merveenoyan Apr 9, 2024

merveenoyan Apr 9, 2024

pcuenca left a comment

pcuenca Mar 22, 2024

merveenoyan commented Mar 26, 2024
