Wobble artifact produced by text-to-speech software


Post by wetduck »

Hello everyone!

I hope you're having a great day. The other day my boss handed us a new piece of software to learn. It uses AI to convert text into speech, and the results are deceptively good. The only problem is that some parts have a "wobble" in them, probably because the AI has to change the tone or intonation of the words.

I've tried Audition's "Auto Heal", but instead of just removing the wobble it silences that part of the audio, so it doesn't fix it at all. I'm also not sure what the correct term for this problem is.

I was wondering if anyone has a solution to this problem. This is what I see in the waveform: [Image]. The word spoken is "world", but the "r" in the word vibrates.

Here is the link to the audio: shorturl.at/fqLQX (Google drive link)

Thank you!
Last edited by Hugh Robjohns on Fri Sep 18, 2020 10:18 am, edited 1 time in total.
wetduck
Posts: 2 Joined: Fri Sep 18, 2020 8:35 am

Re: Wobble artifact produced by text-to-speech software

Post by Tomás Mulcahy »

The word "world" sounds fine. A much bigger problem is that the whole delivery is not convincing at all: it lacks emotion and is spoken too quickly. It's an impressive algorithm for sure, but it needs work. Perhaps slowing down the overall speed would make the "r" sound more agreeable to you?
Last edited by Tomás Mulcahy on Fri Sep 18, 2020 10:00 am, edited 1 time in total.
Tomás Mulcahy
Frequent Poster
Posts: 3007 Joined: Wed Apr 25, 2001 12:00 am Location: Cork, Ireland.

Re: Wobble artifact produced by text-to-speech software

Post by Hugh Robjohns »

The 'wobbles' or vibrations, as you call them, sound like editing artefacts to me: points where the different sound elements are being stitched together.
Hugh Robjohns
Moderator
Posts: 43693 Joined: Fri Jul 25, 2003 12:00 am Location: Worcestershire, UK
Technical Editor, Sound On Sound...
(But generally posting my own personal views and not necessarily those of SOS, the company or the magazine!)
In my world, things get less strange when I read the manual... 

Re: Wobble artifact produced by text-to-speech software

Post by BJG145 »

The whole thing sounds wobbly to me: "world", "Frodo", "quiet Hobbit". I don't think that's something you can fix in the mix; either something went wrong during the creation/edit or the algorithm needs work. Maybe you could try getting the system to repeat this section a couple of times and see if the glitches are identical. (In ye olden days you sometimes had to feed in different words to get the best result, e.g. using "whirled" instead of "world".)
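
For anyone who would rather check this by numbers than by ear, here is a rough Python sketch of the "render it twice and compare" idea. It assumes both renders were exported as 16-bit PCM WAV files with the same sample rate and channel count; the function names, the window size and the threshold are my own illustrative choices, not anything from the text-to-speech software. Bear in mind that some speech engines may not be deterministic at all, so two renders differing everywhere would not by itself prove anything about the wobble.

```python
import array
import sys
import wave


def read_16bit_samples(path):
    """Read a 16-bit PCM WAV file and return its samples as a list of ints."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM files")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        # WAV data is little-endian; swap on big-endian machines.
        samples.byteswap()
    return list(samples)


def differing_regions(a, b, window=1024, threshold=0):
    """Return (start, end) sample ranges of the windows where a and b differ.

    A window is reported when the largest absolute sample difference
    inside it exceeds `threshold`. Only the overlapping length is compared.
    """
    n = min(len(a), len(b))
    regions = []
    for start in range(0, n, window):
        end = min(start + window, n)
        peak = max(abs(x - y) for x, y in zip(a[start:end], b[start:end]))
        if peak > threshold:
            regions.append((start, end))
    return regions
```

Typical use would be to call `read_16bit_samples` on each exported render (the file names are up to you) and pass the two lists to `differing_regions`. An empty result means the overlapping parts of the two renders are sample-identical, so the glitch is baked into how the engine renders that text rather than being a one-off.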
Last edited by BJG145 on Fri Sep 18, 2020 10:55 am, edited 5 times in total.
BJG145
Longtime Poster
Posts: 8088 Joined: Sat Aug 06, 2005 12:00 am Location: UK

Re: Wobble artifact produced by text-to-speech software

Post by wetduck »

Hello everyone!

Thank you for taking time out of your day to check my post. My initial thought was that these wobbles can't be fixed once the audio has been exported from the text-to-speech software (because there are a lot of them).

Surprisingly, a coworker found out that typing the problematic words in ALL CAPS fixes them for some reason.
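
In case it helps anyone with a long script, the workaround can be applied in bulk before pasting the text into the software. This is only a minimal Python sketch: the function name and the word list are hypothetical, and whether ALL CAPS helps will depend on the particular text-to-speech engine.

```python
import re


def uppercase_words(text, problem_words):
    """Return text with every whole-word match of problem_words in ALL CAPS."""
    if not problem_words:
        return text
    # \b keeps "world" from matching inside longer words such as "worldly".
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in problem_words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: match.group(0).upper(), text)
```

For example, `uppercase_words("Hello world, said Frodo.", ["world", "Frodo"])` gives `"Hello WORLD, said FRODO."`, which can then be pasted into the software in place of the original text.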

Thank you all and have a good day!
wetduck
Posts: 2 Joined: Fri Sep 18, 2020 8:35 am