With the advent of more complex applications of artificial intelligence, what you say might not always be up to you. Voice cloning, a process in which software makes a given voice say a desired input using only a sample of that voice, has become more widespread and realistic. One company right here in Montreal, LyreBird AI, is on the cutting edge of this technology. Using only a small audio sample, their software can create an ultra-realistic synthetic voice that will say anything typed into the input box. However, the ethics of voice cloning technology remain ambiguous: will your voice be free for anyone to use?
Ethics aside, voice cloning technology is astounding. LyreBird’s website allows you to type words for one of their synthetic voices to say, and what it produces is almost indistinguishable from a real person. Like the controversial deepfake technology, which is similarly shrouded in ethical concerns, the software’s output, taken in isolation, is admirable.
Thankfully, LyreBird is well aware of the moral issues linked to their product. In fact, an entire page of their site is dedicated to their ethics, where they state that they ensure users can only synthesize their own voice. Furthermore, they affirm their commitment to remaining in talks with “leading machine learning researchers, ethics professors, and the broader public” as they seek a wider release of their product in the future.
While LyreBird may not be a source of ethical concern, they do acknowledge the vast potential for other threats. On the same ethics page, LyreBird states that there is “no reason” to believe that other voice cloning companies and technologies will hold themselves to the same standards or implement similar constraints on their software. Essentially, voice cloning technology is already available, will continue to improve, and will inevitably propagate. These developments will have dramatic effects on society as a whole.
The danger of voice cloning is arguably more imminent than that of deepfake video technology. While more shocking and prominent in media and film, deepfake videos require large amounts of source imagery to produce a somewhat believable result. In contrast, artificial voice cloning requires far smaller samples to create a believable product, as demonstrated by LyreBird. The lack of public awareness around voice cloning technology, combined with its relative ease of creation, makes it a dangerous advancement in artificial intelligence.
This danger is best exemplified in the recent case of a British energy company executive who transferred over 200 thousand dollars to a foreign bank account, believing that his boss was giving him orders to do so over the phone. In reality, the perpetrators had used a voice clone to trick the victim. Although reported to be the first of its kind, such scams will likely become more common. To avoid this kind of theft, corporations and the general public will need to remain vigilant and skeptical of recordings and phone calls.
The increasing complexity and quality of such technology and scams open up a broader question: will we be able to believe what we see and hear? The thought of such a dystopian reality is disturbing, although it is important to acknowledge that we’re not there yet. Top-tier video and voice cloning is not yet available to anyone and everyone, and there is significant discussion surrounding the regulation of such technology. Even so, companies like LyreBird give a well-needed reality check on the ethical dilemmas that technology may soon pose to society.