How can I recreate my grandfather’s voice?

boojumliussnark@lemmy.world · 3 months ago

How can I recreate my grandfather’s voice?

Successful_Try543@feddit.org · edit-2 3 months ago

Maybe the term you are searching for is “AI voice cloning”. The engine of https://elevenlabs.io/voice-cloning claims to be able to understand and reproduce even Danish.

Edit: They seem to require some voice verification to make sure the voice is yours. Which is odd in your case.

https://speechify.com/da should allow you to recreate the voice of “your beloved one”; at least they mention it on their German page.

boojumliussnark@lemmy.world · edit-2 3 months ago

I did sign up for ElevenLabs. Unfortunately, they cannot allow me to clone a dead person’s voice, as per their FAQ:

You may only clone your own voice or a voice you have the rights to clone. For added security, when creating a Professional Voice Clone we require users to complete a Voice Captcha mechanism by reading a text prompt within a specific time to confirm your voice matches the training samples you upload for training. If there’s a match, your request is sent for fine-tuning. If not, you’ll have to reach out via our help center to have your voice verified manually.

Now I’m sure it wouldn’t be an issue to get the legal rights, but when I spoke to their support, they did not have any way to verify beyond the captcha.

Successful_Try543@feddit.org · edit-2 3 months ago

Maybe https://speechify.com/da/ works. At least they mention the recreation of the voice of “your beloved one” on their German page.

boojumliussnark@lemmy.world · 3 months ago

I can’t find this. Where is it on the German page?

Boozilla@lemmy.world · 3 months ago

I’ve been able to generate very good results with this open source project. You need a pretty good NVIDIA GPU, and it takes some time and tedious work to get it working the way you want it to:

https://github.com/neonbjb/tortoise-tts

Some voices sound exactly right. Others sound like a broken robot. The main reason I like it is that I can run it locally without having to sign up for some stupid cloud service.

boojumliussnark@lemmy.world · 3 months ago

Looks very cool. I was unable to see anything regarding languages. Is it completely language independent somehow, or is it English only?

Grimy@lemmy.world · edit-2 3 months ago

ElevenLabs is currently the best, but you can get some very good results by running XTTS first and then RVC as a second pass. It involves fine-tuning models and running things with Python and notebooks, so it requires some know-how.

You can explore more models on the Hugging Face page: https://huggingface.co/models?pipeline_tag=text-to-speech&sort=trending

Most have a Hugging Face space dedicated to them where you can try them. Here is the XTTS space, for example: https://huggingface.co/spaces/coqui/xtts

The language adds another layer of difficulty. I would try their demo first to see if it gives anything workable, but it isn’t a language that current TTS software caters to, and it doesn’t seem to be an available option on XTTS, sadly.
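For anyone who wants to try the XTTS first pass locally, here is a minimal sketch, assuming the Coqui `TTS` Python package is installed. The function name `clone_with_xtts`, the file names, and the hard-coded language set are my own illustration; the set reflects the XTTS-v2 model card as I remember it and may be out of date, so check the model card before relying on it. It rejects unsupported language codes (such as `da` for Danish, mentioned earlier in the thread) before loading the model. The RVC second pass is not shown.

```python
# Language codes listed for XTTS-v2 (assumption: copied from the model card
# from memory; the list may have changed, so verify it yourself).
XTTS_V2_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}


def clone_with_xtts(text, reference_wav, language, out_path="xtts_out.wav"):
    """Synthesize `text` in the voice heard in `reference_wav` (first pass only)."""
    if language not in XTTS_V2_LANGUAGES:
        raise ValueError(
            f"XTTS-v2 has no built-in support for language code {language!r}"
        )

    # Imported lazily so the language check above works even when the
    # Coqui package (pip install TTS) is not installed.
    from TTS.api import TTS

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wav,  # short, clean clip of the target voice
        language=language,
        file_path=out_path,
    )
    return out_path
```

Calling `clone_with_xtts("Hello there.", "grandfather.wav", "en")` would write `xtts_out.wav`, which could then be fed to RVC as the second pass; `grandfather.wav` is a hypothetical file name.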

boojumliussnark@lemmy.world · 3 months ago

Thank you for the tips. As I see it currently, I expect the language to be the biggest hurdle. It doesn’t appear to be something I can add myself, even if I had the data for a model. So as far as I can tell it involves two currently more or less impossible steps: get model data, and teach the language to the model.

poleslav@lemmy.world · 3 months ago

If you can get them into a digital format, I’ve personally used ElevenLabs to clone voices and make narrations for missions I created for a video game. I tried using different open source projects and getting them to run on my own, to no avail, but ElevenLabs has been solid (it is unfortunately paid software, at something like $5 or $10 a month, though).
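Once recordings are digitized, it can help to check what you actually have before uploading or training on it. Below is a rough sketch using only Python's standard `wave` module; it assumes the clips are uncompressed PCM WAV files, and the function names and any file names are hypothetical.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate,
        }


def total_duration(paths):
    """Sum the duration, in seconds, of several WAV files."""
    return sum(describe_wav(p)["duration_s"] for p in paths)
```

For example, `total_duration(["tape1.wav", "tape2.wav"])` tells you how many seconds of source audio you have in total, which you can compare against whatever minimum your chosen tool asks for.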

boojumliussnark@lemmy.world · 3 months ago

Was this with the “Instant voice clone”?