Compare to VoiceFlow TTS #66

Liujingxiu23 · 2024-03-20T14:58:00Z

Thank you for your work and sharing!
It seems Matcha-TTS and VoiceFlow-TTS (https://github.com/X-LANCE/VoiceFlow-TTS) are very similar?
What are the main differences between these two methods?
And how do they compare on voice quality (for example, prosody) and inference speed?

shivammehta25 · 2024-04-25T09:43:33Z

They are! I met the author @cantabile-kwok just last week (super nice guy). It is interesting that we both made certain decisions to improve speed relative to plain conditional flow matching. Their way to speed things up was to "rectify" the paths learned by flow matching, which is a two-step approach and quite effective. We felt that the same speedup could be achieved by improving the architecture instead, so we improved the U-Net architecture and got a similar speedup.
Both are trying to solve a similar problem in different ways. You can surely "rectify" the paths with Matcha-TTS's architecture for an even greater speedup :)
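
To make the "rectifying" point a bit more concrete, here is a toy sketch of why straighter paths let you sample with fewer solver steps. It is not code from either repository. It assumes sampling integrates an ODE from t=0 to t=1 with a fixed-step Euler solver, and the two vector fields are hand-written stand-ins for a learned model; `euler_sample`, `straight_velocity`, `curved_velocity`, and `endpoint_error` are hypothetical names used only for illustration.

```python
import numpy as np


def euler_sample(velocity, x_start, n_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = np.asarray(x_start, dtype=float)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        x = x + dt * velocity(x, i * dt)
    return x


# Both toy fields transport START to END over unit time.
START = np.array([1.0, 0.0])
END = np.array([0.0, 1.0])
OMEGA = np.pi / 2  # quarter turn over t in [0, 1]


def straight_velocity(x, t):
    # Constant velocity: the trajectory is a straight line (the "rectified" ideal).
    return END - START


def curved_velocity(x, t):
    # Rotation about the origin: the trajectory is a circular arc.
    return OMEGA * np.array([-x[1], x[0]])


def endpoint_error(velocity, n_steps):
    """Distance between the Euler endpoint and the true endpoint END."""
    return float(np.linalg.norm(euler_sample(velocity, START, n_steps) - END))


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(n, endpoint_error(straight_velocity, n), endpoint_error(curved_velocity, n))
```

On the straight field a single Euler step already lands on the endpoint, while the curved field needs many steps to get close. Roughly speaking, straightening the learned paths aims for the first situation; a better network is a different route to good samples at a low step count.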

Hope that helps.

Shivam
