[Music] You all know the voices of the people in your lives. The feeling of hearing someone's voice after a long time of not having heard it can be wonderful. Our voices are among the most unique aspects of us as individuals, so it may surprise you to find out that today, voices are eminently fakeable. To build a replica of someone's voice, you start with the raw material: recordings. Given a large enough collection of recordings and corresponding transcripts, a machine learning algorithm can learn to associate the text of the transcripts with the audio. We call this process training, or learning, because as more data is processed, the algorithm tends to become more accurate. Once this training process is complete, you can type in more or less anything you want and get back an audio clip of the person in question saying those things, even if that person has never said those things before.

"Hello everybody, Oprah Winfrey here. I hope you're all enjoying yourselves at TEDxToronto. Though I could not be there with you today, I would like to introduce you to Joe. He's here to tell you about why machines have gained a voice."
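The pairing of transcripts with recordings described above can be sketched minimally. This is a toy illustration, not any real production pipeline: the filenames and the character-level encoding are assumptions, and an actual text-to-speech model would go on to learn a mapping from these encoded sequences to audio spectrograms.

```python
# Minimal sketch of the data preparation behind voice cloning, assuming
# a hypothetical corpus of (transcript, recording) pairs. A real system
# would train a neural network on these pairs; here we only illustrate
# how text gets paired with audio and encoded for a model to consume.

corpus = [
    ("hello everybody", "clip_001.wav"),
    ("i hope you're all enjoying yourselves", "clip_002.wav"),
]

# Build a character vocabulary from the transcripts.
chars = sorted({c for text, _ in corpus for c in text})
char_to_id = {c: i for i, c in enumerate(chars)}

def encode(text):
    """Turn a transcript into the integer sequence a model would consume."""
    return [char_to_id[c] for c in text]

# Each training example: (encoded text, path to the matching recording).
pairs = [(encode(text), wav) for text, wav in corpus]
print(len(pairs), len(pairs[0][0]))
```

With more data, the learned association between character sequences and audio becomes more accurate, which is the "training" described above.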
The recordings used to train that algorithm actually came from a course that Oprah taught. So now know that whenever you speak publicly, you're making yourself vulnerable to future impersonation of this sort. This is part of a more general phenomenon known as deep fakes, in which all manner of media (audio, video, and written text) can be realistically faked by algorithms. This also goes by the name of synthetic media. Currently it's not that hard to tell what is a deep fake and what is authentic. However, the underlying technology, machine learning, will continue to improve, and eventually it will be impossible to distinguish deep fakes from real content.

I work at a Toronto-based AI startup called Dessa, and Dessa has this culture of wildly ambitious side projects. About a year and a half ago, a bunch of us were sitting around after work thinking about what we wanted to do, and we hit upon this idea to create simulations of famous people. The idea was to create a chatbot that you could talk to, and that would talk back to you with the same voice and personality as anyone, at least anyone with enough recordings and conversations available online from which we could extract the data we needed to train our algorithms. As we started working on this, we realized that perhaps the most visceral and shocking aspect of such an enterprise is the creation of voice, because voices are so particular to individuals; to recreate a well-known voice is to access something deep in the mind of the listener. Six months ago, with my two brilliant colleagues Hashiam Assem and Rayhane Mama, I released a short video containing highly realistic generated clips in the voice of the popular American podcaster Joe Rogan.

"I can pronounce tongue twisters now. Check this out: Peter Piper picked a peck of pickled peppers. How many pickled peppers did Peter Piper pick, Pete?" "Joe Rogan, it's me, Joe Rogan! Please come save me, man. These artificial intelligence guys have trapped me in a machine."

So hopefully it's obvious that Joe Rogan never actually said those things. Some of you may already be thinking about what you wouldn't want to hear a deep fake of you saying. Imagine you get a phone call from, or rather, even better, imagine somebody very close to you gets a phone call from someone, or some thing, that sounds exactly like you. I'll leave it to your imagination what might happen next. Large-scale political manipulation will be attempted with this technology. A fake hidden-camera video of a candidate saying damning things that they never actually said could be posted to social networks and shared millions of times before a response is mustered. The converse of this is that it gives plausible deniability to people who have actually said or done terrible things, because they can always claim that it was faked.

Perhaps the greatest risk, collectively, is to our shared sense of truth. Video and audio have been, for nearly a century, our gold standard for proving the authenticity of an event. We're on the verge of losing the assurance that the mere existence of a recording can be considered proof that an event took place. There is a lot at stake here, and to some of you this may seem like the beginning of a terrifying new world order. We have a lot of work to do collectively to mitigate the problems that deep fakes will cause, but I want you to know that we already have helpful tools at our disposal, and we will continue to develop more.

Let's consider the case of our impending loss of audio and video as a universal source of truth. Remember that articles and books can be deliberately misleading or contain outright lies, and we still utilize and trust those. How is it that we trust the infinitely malleable medium that is text? The answer, of course, is that there exist many cues which help us discern what is truthful and what is not. What's changing now is that we'll increasingly need to apply the same mindset to the world of audio and video. One
thing that could be done, at least in the interim before deep fakes become indistinguishable from real content, is to apply machine learning to detect deep fakes. Just as you can train an algorithm to produce deep fakes, you can train one to detect them. This might make it possible to detect deep fakes that unaided humans would find extremely difficult or impossible to discern. Over the long run, however, this cat-and-mouse game between detection and generation is a losing proposition, and we may come to rely upon more radical solutions.
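The train-then-detect idea can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: real detectors are neural networks trained on subtle artifacts of generated audio, whereas this toy uses made-up one-dimensional "artifact scores" and the simplest possible decision rule, just to show the label-examples-then-classify principle.

```python
import random

# Toy sketch of deep-fake detection as binary classification.
# The 1-D "artifact scores" below stand in for real audio features.
random.seed(0)
real = [random.gauss(0.3, 0.1) for _ in range(100)]   # authentic clips
fake = [random.gauss(0.7, 0.1) for _ in range(100)]   # generated clips

# "Training": place the decision threshold midway between the class means.
threshold = (sum(real) / len(real) + sum(fake) / len(fake)) / 2

def is_fake(score):
    """Classify a clip as fake if its artifact score exceeds the threshold."""
    return score > threshold

# How often the toy detector gets it right on its own training data.
accuracy = (sum(not is_fake(s) for s in real)
            + sum(is_fake(s) for s in fake)) / 200
print(round(accuracy, 2))
```

The cat-and-mouse dynamic enters because a generator can in turn be trained to minimize exactly the artifacts any fixed detector relies on.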
One possibility is to make use of specially designed cameras that ensure a signature of a recording is unalterably encoded into the public record at the moment of creation, proving that it could not have been tampered with afterward. The idea would be to have a public blockchain that anyone could use to authenticate their content; perhaps one day all of us will have easy access to such a tool. My point here is that human civilization is both resourceful and flexible, and we will develop solutions to address this new problem.
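The camera-signature idea above can be sketched with nothing more than a cryptographic hash. Everything here other than SHA-256 itself is an assumption: a Python set stands in for the public blockchain, and short byte strings stand in for real recordings; the ledger's consensus machinery is out of scope.

```python
import hashlib

# Sketch of recording authentication: fingerprint a recording at capture
# time by hashing its bytes, then publish that fingerprint. Later, anyone
# can re-hash the file; a missing fingerprint suggests tampering.

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a recording's raw bytes."""
    return hashlib.sha256(data).hexdigest()

public_ledger = set()  # stand-in for a public, append-only blockchain

original = b"raw audio bytes straight from the camera sensor"
public_ledger.add(fingerprint(original))  # published at the moment of creation

tampered = b"raw audio bytes, subtly edited later"

print(fingerprint(original) in public_ledger)   # True: clip checks out
print(fingerprint(tampered) in public_ledger)   # False: no record of this clip
```

Because any edit to the bytes changes the hash, an unaltered published fingerprint is evidence the recording existed in exactly that form at publication time.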
One thing that makes this problem harder to address today is the nature of social media. Sensational content tends to make it to the front page of our news feeds because it's naturally intriguing and, frankly, entertaining, rather like a salacious political deep fake would be. In May of this year, a video of Nancy Pelosi surfaced in which she was shown to be slurring her words. This video wasn't even a deep fake; it was a so-called cheap fake, in which the creator had simply slowed down a real video and modified the pitch, giving the impression of a drunk speaker. The video spread virally on social media and was viewed millions of times.
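The "cheap fake" edit just described requires no machine learning at all, which a few lines make plain. This is a toy sketch under assumed inputs: real edits are done in a video editor, but naively stretching a sampled signal, as below, is the kind of operation involved; played back at the original sample rate, the stretched signal is slower and lower in pitch.

```python
def slow_down(samples, factor=2):
    """Stretch a sampled signal by repeating each sample `factor` times.

    At a fixed playback rate the result lasts `factor` times longer and
    sounds lower-pitched: the crude effect behind the slowed-down video.
    """
    return [s for s in samples for _ in range(factor)]

clip = [0.0, 0.5, 1.0, 0.5]      # stand-in for real audio samples
slowed = slow_down(clip)

print(len(clip), len(slowed))    # the slowed clip has twice the samples
```

The Pelosi video's creator additionally shifted the pitch back up to partially mask the slowdown, but the core manipulation is this simple.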
Now, social media companies like Facebook have resisted addressing this problem. I believe these companies have a responsibility not to amplify what amounts to harmful, defamatory identity theft.

Some of you may be thinking: hold on, Joe, aren't you one of the people that creates this stuff? Well, I have made deep fakes. I take very seriously the responsibility to do no harm, and there's a world of difference between what bad actors will do and what my colleagues and I do. Our aim is to showcase the technology, to raise awareness about it, not to actually deceive people. As a machine-learning engineer, I love to build things. This quote from Nikola Tesla sums it up nicely: "I do not think there is any thrill that can go through the human heart like that felt by the inventor as they see some creation of the brain unfolding to success. Such emotions make one forget food, sleep, friends, love, everything." Most of what I work on day to day has nothing to do with deep fakes, but it just so happens that I work in a truly universal field. Machine learning is the science of pattern recognition and recreation, patterns in anything and everything. Machine learning will just as well process credit card transaction data to identify fraud as it will generate synthetic voice to create new fraud. The truth is, I got into this voice stuff because I thought it'd be a fun project to work on with my colleagues. It was only after we started working on it that we fully recognized the broader implications, and I believe this will be a recurring trend for the field of AI: ambitious young creators will work on the coolest applications they can think of, and may not always give enough forethought to the implications of their creations. It's incumbent on those of us who work in this field to be mindful about what we're creating and what we're releasing. For instance, in doing this work, we were careful not to release anything that would make it easier for others to reproduce our results; making it more accessible would also make it easier for bad actors to use.

There's also a special responsibility that the creators of deep fakes have. It's a strange power to suddenly have: to be the puppeteer of a real person. On the one hand it's incredibly fun, and on the other hand it's a tremendous responsibility. It would be far too easy at this moment to tarnish a person's reputation, or even destabilize a democracy. As a machine-learning engineer, this weighs on me, but I'm also mindful of the potential for good this technology provides. It will enable us to bring great personalities back into the world; I, for one, would love to see a science lecture delivered by a revived Carl Sagan as these technologies mature. Over the long run, it will enable us to paint flexibly in the medium of video. This will unleash a new wave of creativity in film that we can scarcely imagine today. One day, a teenager will recreate Game of Thrones on their laptop without hiring any actors or filming anything at all.

The transition to this new world is going to be difficult, and we're not being proactive enough. What tends to happen in the development of dangerous new technology is that people wait until there's been a disaster before inventing the solutions that eventually mitigate our problems. I don't believe the problems of deep fakes are insoluble, far from it. I believe it's well within the capacity of our civilization to absorb them. We've undergone changes far greater than this before, and when we do finally come to grips with deep fakes, we'll have a whole new world of creativity to look forward to. Thank you. [Applause] [Music]