Idea - NNSVS Support #69

Open
SeleDreams opened this issue Jan 31, 2021 · 10 comments


@SeleDreams
Contributor

SeleDreams commented Jan 31, 2021

This is something I'm thinking about working on, but I'm not sure how much interest there is.
My idea is to integrate a second engine into UTSU: NNSVS (https://github.com/r9y9/nnsvs), an open source engine for AI-based vocal synthesis. It doesn't support the same type of voicebanks, since its voicebanks have to be AI-trained, but I feel that having one editor for multiple types of voicebanks would be great, especially for higher-quality AI-trained voicebanks.

NNSVS currently doesn't have a compatible editor, so I feel this would also help bring more people to both UTSU and NNSVS, which would help the open source community.

@SeleDreams
Contributor Author

SeleDreams commented Jan 31, 2021

It would allow UTSU to become a kind of "AI UTAU" for the community, where anyone could create AI vocals and make music with them.

But before working on it, I first wanted to know whether the creator of UTSU and the other contributors want to see voicebank types other than UTAU supported, since it's possible you only want UTAU voicebanks.

@LucasCTN
Contributor

LucasCTN commented Feb 5, 2021

I think this is a great idea! Would it generate a voicebank for UTSU's resampler, or does it also work as a whole resampler?

@adlez27

adlez27 commented Feb 5, 2021

This might be helpful: https://note.com/crazy_utau/n/n45db22b33d2c

@SeleDreams
Contributor Author

> This might be helpful: https://note.com/crazy_utau/n/n45db22b33d2c

ENUNU is separate because it relies on the UTAU plugin API. From what I remember, it doesn't actually use the same system as UTAU at all and relies on the data it can access through plugins. That's why its Git repository lists the things it couldn't do because of the plugin API's limitations.

> I think this is a great idea! Would it generate a voicebank for UTSU's resample, or it also works as a whole resampler?

It works as its own thing, because AI synthesis is so different.

@LucasCTN
Contributor

LucasCTN commented Feb 5, 2021

Reading about the software, it looks like the task would then be to make UTSU able to export to MusicXML so that NNSVS can interpret it, right? And maybe bundle it in if the license permits.
This could be the open source version of Synthesizer V AI, which makes me very excited!

@SeleDreams
Contributor Author

> Reading about the software, it looks like the task would be then to make UTSU be able to export to MusicXML so NNSVS can interpret it, right? And maybe bundle it in if the license permits.
> This could be the open source version of Synthesizer V AI, which makes me very excited about!

It technically could, but I think some small edits would be enough for it to get the data directly from UST files. All it needs is the data; the files it gets the data from aren't as important as long as there are deserializers.

And yes, this is one of the goals.
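
To make the "deserializer" idea above concrete, here is a minimal Python sketch of reading note data straight from the text of a UST file. It only assumes the common INI-like UST layout (numbered `[#0000]` sections with `Length`, `Lyric` and `NoteNum` keys, and a `Tempo` key under `[#SETTING]`). The `Note` class and `parse_ust` function are hypothetical names for illustration; they are not part of UTSU or NNSVS, and the sketch ignores every other UST field.

```python
from dataclasses import dataclass


@dataclass
class Note:
    lyric: str
    note_num: int  # MIDI note number
    length: int    # duration in ticks (480 ticks per quarter note in UST)


def parse_ust(text: str) -> tuple[float, list[Note]]:
    """Return (tempo, notes) parsed from the text of a UST file."""
    tempo = 120.0  # fallback if no Tempo key is found (an assumption)
    notes: list[Note] = []
    section = None
    fields: dict[str, str] = {}

    def flush() -> None:
        # Only numbered sections such as [#0000] describe notes.
        if section is not None and section[1:].isdigit() and "Length" in fields:
            notes.append(Note(
                lyric=fields.get("Lyric", ""),
                note_num=int(fields.get("NoteNum", "60")),
                length=int(fields["Length"]),
            ))

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[#") and line.endswith("]"):
            flush()                # finish the previous section
            section = line[1:-1]   # e.g. "#0000" or "#SETTING"
            fields = {}
        elif "=" in line and section is not None:
            key, value = line.split("=", 1)
            fields[key] = value
            if section == "#SETTING" and key == "Tempo":
                tempo = float(value)
    flush()
    return tempo, notes
```

Rests show up as ordinary notes (typically with the lyric `R`), so whatever consumes the result has to decide how to treat them. The function takes already-decoded text; real UST files are often not UTF-8, so the caller would need to pick the right encoding when reading the file.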

@SeleDreams
Contributor Author

SeleDreams commented Feb 6, 2021

License-wise, NNSVS is MIT, so there are no problems with bundling it, since from what I can see UTSU is under an MIT-compatible license.

@titinko
Owner

titinko commented Feb 7, 2021

Times like this remind me how much UTSU needs its own plugin framework.

If I understand correctly, integrating NNSVS with UTSU would have three parts to it.

1. Rendering songs in NNSVS: The easiest way is to write code converting the internal Song object into an NNSVS-readable file, then run NNSVS on that in the background.

2. Using NNSVS voicebanks: I could see UTSU's song editor being tweaked so that it pretends that NNSVS voicebanks are regular UTAU voicebanks on the frontend, but in the backend only renders them with NNSVS.

3. Creating NNSVS voicebanks: Since the format is completely different from UTAU's voicebanks, you'd have to write an entirely new voicebank editor UI.
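
The second part above (same frontend, different backend per voicebank) could look roughly like the Python sketch below. It is only an illustration of the dispatch idea, not UTSU code: `detect_engine`, `render_song` and the `backends` mapping are hypothetical names, and `NNSVS_MARKER` is a made-up file name, since this thread does not define how an NNSVS voicebank would be recognized on disk. The only factual assumption is that UTAU voicebanks carry an `oto.ini` file.

```python
from pathlib import Path
from typing import Any, Callable

# Hypothetical marker file name, chosen only for this sketch.
NNSVS_MARKER = "nnsvs_model.yaml"

# A backend takes (song, voicebank_dir) and returns rendered audio bytes.
Backend = Callable[[Any, Path], bytes]


def detect_engine(voicebank_dir: Path) -> str:
    """Decide which engine should render songs for this voicebank."""
    if (voicebank_dir / NNSVS_MARKER).is_file():
        return "nnsvs"
    # UTAU voicebanks keep their sample configuration in oto.ini files.
    if any(voicebank_dir.rglob("oto.ini")):
        return "utau"
    raise ValueError(f"Unrecognized voicebank: {voicebank_dir}")


def render_song(song: Any, voicebank_dir: Path,
                backends: dict[str, Backend]) -> bytes:
    """Route the song to the backend matching the voicebank type."""
    engine = detect_engine(voicebank_dir)
    return backends[engine](song, voicebank_dir)
```

With this shape the song editor never needs to know which engine is in use; only the function registered under `"nnsvs"` would have to convert the Song object and run NNSVS.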

@SeleDreams
Contributor Author

> Times like this remind me how much UTSU needs its own plugin framework.
>
> If I understand correctly, integrating NNSVS with UTSU would have three parts to it.
>
> Rendering songs in NNSVS: The easiest way is to write code converting the internal Song object into an NNSVS-readable file, then run NNSVS on that in the background.
>
> Using NNSVS voicebanks: I could see UTSU's song editor being tweaked so that it pretends that NNSVS voicebanks are regular UTAU voicebanks on the frontend, but in the backend only renders them with NNSVS.
>
> Creating NNSVS voicebanks: Since the format is completely different from UTAU's voicebanks, you'd have to write an entirely new voicebank editor UI.

Yes, I imagine the voicebank creation side would come last since it's not the priority: usage would come first and voicebank creation second (creating AI voicebanks is much more complex to begin with because of the AI training phase).

@LucasCTN
Contributor

LucasCTN commented Mar 11, 2021

I want to try making a serializer from the Song object to NNSVS. It looks like the song needs to be converted to HTS full-context label files (using Sinsy), and NNSVS includes a MusicXML-to-label-file step for this.

If I'm correct, does anyone know of a document specifying the data in HTS label files?

(From the little research I did, it looks like it's easier to use MusicXML as a middleman and use pysinsy to do the final conversion to HTS.)
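
As a rough illustration of the "MusicXML as a middleman" route, here is a minimal Python sketch that serializes a list of notes into a bare-bones MusicXML string using only the standard library. The input format (tuples of lyric, MIDI note number or `None` for a rest, and length in ticks) and the names `midi_to_pitch` and `notes_to_musicxml` are assumptions made for this sketch, not UTSU's Song API. For brevity it puts every note into a single measure and writes no tempo or time signature, which a real export would most likely need before Sinsy could use the file. The MusicXML-to-HTS step itself is left out.

```python
import xml.etree.ElementTree as ET

# Pitch class -> (step, alter), spelling black keys as sharps.
STEPS = [("C", 0), ("C", 1), ("D", 0), ("D", 1), ("E", 0), ("F", 0),
         ("F", 1), ("G", 0), ("G", 1), ("A", 0), ("A", 1), ("B", 0)]


def midi_to_pitch(note_num: int) -> tuple[str, int, int]:
    """Convert a MIDI note number to (step, alter, octave); 60 is C4."""
    step, alter = STEPS[note_num % 12]
    return step, alter, note_num // 12 - 1


def notes_to_musicxml(notes, divisions: int = 480) -> str:
    """notes: iterable of (lyric, midi_note_or_None, length_in_ticks)."""
    score = ET.Element("score-partwise", version="3.1")
    part_list = ET.SubElement(score, "part-list")
    score_part = ET.SubElement(part_list, "score-part", id="P1")
    ET.SubElement(score_part, "part-name").text = "Vocal"
    part = ET.SubElement(score, "part", id="P1")
    # Simplification: one measure holds everything.
    measure = ET.SubElement(part, "measure", number="1")
    attributes = ET.SubElement(measure, "attributes")
    # Divisions per quarter note; 480 lets tick lengths be used directly.
    ET.SubElement(attributes, "divisions").text = str(divisions)
    for lyric, note_num, length in notes:
        note = ET.SubElement(measure, "note")
        if note_num is None:
            ET.SubElement(note, "rest")
        else:
            step, alter, octave = midi_to_pitch(note_num)
            pitch = ET.SubElement(note, "pitch")
            ET.SubElement(pitch, "step").text = step
            if alter:
                ET.SubElement(pitch, "alter").text = str(alter)
            ET.SubElement(pitch, "octave").text = str(octave)
        ET.SubElement(note, "duration").text = str(length)
        if note_num is not None:
            lyric_el = ET.SubElement(note, "lyric")
            ET.SubElement(lyric_el, "text").text = lyric
    return ET.tostring(score, encoding="unicode")
```

For example, `notes_to_musicxml([("la", 60, 480), ("", None, 240)])` produces one quarter-note C4 with the lyric "la" followed by an eighth-note rest.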
