Spec for audio to audio #502
Should generate TS type with the helper command (
Done in f80bf58.
nice
Is the pipeline generating several audios at once?
If not, I would recommend not wrapping the output in an array, for consistency with other tasks (see ImageToText for example):
https://github.com/huggingface/huggingface.js/blob/af447753043472d8ba65ab0591be7487433f7261/packages/tasks/src/tasks/image-to-text/spec/output.json#L1-L15
Yes it does. You can try it on the widget at https://huggingface.co/tasks/audio-to-audio.
  "type": "string",
  "description": "The label of the audio file."
},
"content-type": {
Maybe?
Can you link to where those parameters are documented please?
- "content-type": {
+ "content_type": {
It's not documented AFAIK. I only know it from the call made by the widget on https://huggingface.co/tasks/audio-to-audio 😕
For content-type vs content_type, I've added a normalizer rule in Python to accept both, since the Python attribute can only be content_type (and is generated like this).
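To make the normalizer idea concrete, here is a minimal sketch of how a Python client could accept both spellings. It is not the actual rule added in the Python client: the helper names `normalize_key`, `parse_element` and `parse_output` are hypothetical, the dataclass only models the two fields visible in this thread (`label` and `content-type`), and the sample values are made up for illustration.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List


@dataclass
class AudioToAudioOutputElement:
    # Hypothetical mirror of the spec's output element. Only the fields
    # discussed in this thread are modelled; the real spec may have more.
    label: str
    content_type: str


def normalize_key(key: str) -> str:
    # "content-type" is not a valid Python attribute name, so map it
    # (and any other hyphenated key) to its underscore form.
    return key.replace("-", "_")


def parse_element(raw: Dict[str, Any]) -> AudioToAudioOutputElement:
    # Accept both "content-type" and "content_type" from the server.
    normalized = {normalize_key(k): v for k, v in raw.items()}
    # Ignore keys this sketch does not model instead of failing on them.
    known = {f.name for f in fields(AudioToAudioOutputElement)}
    kept = {k: v for k, v in normalized.items() if k in known}
    return AudioToAudioOutputElement(**kept)


def parse_output(raw_items: List[Dict[str, Any]]) -> List[AudioToAudioOutputElement]:
    # The task returns a list, since one input can yield several audios.
    return [parse_element(item) for item in raw_items]
```

With this approach the JSON spec can keep whichever spelling the server actually sends, while the Python attribute stays `content_type`.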
You can see part of the specification, along with potential outputs, in https://github.com/huggingface/api-inference-community/blob/main/docker_images/speechbrain/app/pipelines/audio_to_audio.py
 *
 * A generated audio file with its label.
 */
export interface AudioToAudioOutputElement {
FYI, here are the values returned by the community API: https://github.com/huggingface/api-inference-community/blob/main/docker_images/speechbrain/app/pipelines/audio_to_audio.py#L37-L44
JSON schema for the audio-to-audio task. This is the "expected type" in the Python client, but I would prefer to double-check it's correct before continuing. Ping @Vaibhavs10 @osanseviero, could you have a look please?
(useful for huggingface/huggingface_hub#2036)