As we enter awards season, one of the most crucial ways to enter your work is with a buttoned-up case study. But sometimes the budget for your case study doesn’t exist, and you’re not exactly excited about your nasal voice being the VO for some otherwise exceptional work. I wanted to explore how we can use AI to make our case studies the best they can be, knowing that not every account has the time, budget, and talent to follow the preferred process.
Enter ElevenLabs, a voice AI research and deployment company with a mission to make content universally available in any language and voice. Let’s see how ElevenLabs can help us get a professional voice-over for an upcoming case study.
There is a free version of ElevenLabs, which gives you access to a few voices from their voice library. But to get what I am looking for, I am going to subscribe to their Creator package, which at the moment is $11 a month and gives me full access to the voice library, the ability to create custom voices, and even the ability to clone voices I have the rights to (like my own).
After subscribing, I go to ‘Voices’ in the left sidebar and select ‘Explore’ to find the type of voice I am looking for. A search bar and filters let you home in on the voice you want. I needed a middle-aged male who sounded tough, confident, and strong. After listening to the samples, I settled on Leif, a “husky male.”
Selecting a voice brings you to a window where you can start entering the script/copy. There are two options at the top: “Text to Speech” and “Speech to Speech.” For now, we are doing Text to Speech. Once you select your option, you’ll notice several settings:
- Stability – This adds or softens the expression in the voice. The more variable the setting, the more variation in the voice. The more stable, the flatter and less expressive it sounds.
- Clarity – This deals mostly with pronunciation and how closely the output matches the sample voice. A lower setting softens things a bit, making the voice sound more human, with some imperfections and softer plosives and glottals. A higher setting does the opposite, with very exact enunciation.
- Style Exaggeration – This adds variation in speed. ‘None’ reads the script as quickly as possible, which is great for legal language. ‘Exaggerated’ varies the pace and adds more pronounced pauses at commas, periods, and paragraph breaks.
Let’s begin with the default settings. Simply enter your text in the text field, then click ‘Generate’ at the bottom.
A playbar at the bottom will play your work. With longer scripts, give it time to catch up, since it generates the voice in real time. To the right of the playbar, you can download the track if you like it.
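If you would rather script this step than click through the web app, here is a minimal Python sketch of what the same text-to-speech request might look like. It is not the workflow I used. It assumes the public ElevenLabs REST endpoint `/v1/text-to-speech/{voice_id}` with an `xi-api-key` header, and it assumes the three sliders correspond to the `stability`, `similarity_boost`, and `style` fields on a 0 to 1 scale. The function names, the voice ID, and the numeric values are hypothetical placeholders, not ElevenLabs defaults.

```python
import json
import urllib.request

# Assumed base URL of the public ElevenLabs REST API.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(api_key, voice_id, text,
                      stability=0.5, similarity_boost=0.75, style=0.0):
    """Build (but do not send) a text-to-speech request.

    The three numeric arguments stand in for the Stability, Clarity, and
    Style Exaggeration sliders. The mapping is an assumption, and the
    default numbers here are placeholders only.
    """
    for name, value in (("stability", stability),
                        ("similarity_boost", similarity_boost),
                        ("style", style)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    payload = {
        "text": text,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
            "style": style,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send_and_save(request, out_path):
    """Send the request and write the returned audio bytes to a file.

    This makes a real network call and needs a valid API key.
    """
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

To try it, you would call something like `send_and_save(build_tts_request(my_key, leif_voice_id, script_text), "case_study_vo.mp3")`, where `my_key` and `leif_voice_id` come from your own account. Keeping the request building separate from the sending makes it easy to check the settings before spending any credits.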
Here is what the first recording sounds like:
That’s good, but let’s adjust the settings to make it more variable, higher in clarity, and highly exaggerated. You’ll notice some pink “danger zone” sections on the sliders, which mark levels beyond what the AI recommends. Going into those sections makes the audio output less reliable.
Here are the updated settings and recording:
OK, that’s better, but it still sounds like AI. I want it to sound the way a person would read it. So, we are going to go back to the top of that window and select “Speech to Speech.” I am going to record myself reading the script the way I want it read. Then, we will mix the style of my reading with the Leif voice.
You can record right within the software simply by clicking the “Record Audio” button that shows up where the text field used to be. You can also record on your phone or computer and upload the audio:
Here is my voice in the style I want it recorded:
ElevenLabs will keep your slider settings from before. It will take a bit longer to generate, but in the end you get the nice professional voice of Leif, mixed with the style of my reading:
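The speech-to-speech step could be scripted in a similar way. The sketch below is an illustration under stated assumptions, not the process I followed: it assumes an endpoint at `/v1/speech-to-speech/{voice_id}` that accepts a multipart upload with the recording in a form field named `audio`. The function name, file name, and voice ID are hypothetical, so check the current API reference before relying on any of them.

```python
import uuid
import urllib.request

# Assumed base URL of the public ElevenLabs REST API.
API_BASE = "https://api.elevenlabs.io/v1"


def build_sts_request(api_key, voice_id, audio_bytes, filename="my_reading.mp3"):
    """Build (but do not send) a speech-to-speech request.

    The recording is packed by hand into a multipart/form-data body.
    The form field name "audio" is an assumption.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode("utf-8"),
        (f'Content-Disposition: form-data; name="audio"; '
         f'filename="{filename}"\r\n').encode("utf-8"),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode("utf-8"),
    ])
    return urllib.request.Request(
        url=f"{API_BASE}/speech-to-speech/{voice_id}",
        data=body,
        headers={
            "xi-api-key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

You would read your own recording with `open("my_reading.mp3", "rb").read()`, pass the bytes in, and then send the request with `urllib.request.urlopen` and save the response bytes as the finished track.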
This is just the tip of the iceberg of what ElevenLabs is capable of. You can dub your audio into other languages, and you can even clone your own voice, add it to their database, and get paid if someone uses it for their project.
All in all, what we have learned is that getting decent VO for your needs is possible, even if it’s only a scratch track for what you are ultimately looking for.
Dane Rahlf