When pop culture talks about AI, they generally have something like this in mind: some Terminator robot that might take over and kill us all. But what I want to talk to you about today are the invisible algorithms that are all over our lives. They are not as glamorous as this robot, but they are, I think, more important. AI right now is being used to manage our electric systems, power systems, driving systems, and trading (sometimes you see the market going up, down, and up again), and to figure out who to hire and who not to hire. If you are interested in learning how pervasive these data-driven algorithms are in our lives, and some of the unintended consequences that might come from that pervasiveness, I recommend that you read Weapons of Math Destruction or Automating Inequality. AI is also used in our government. How many of you have heard of the 2016 ProPublica article about software that purports to predict someone's likelihood of committing a crime again? Judges use this prediction as one of the inputs to decide how many years to sentence someone to prison for.
I recently signed a letter against something called the Extreme Vetting Initiative. This was an initiative proposed by ICE, Immigration and Customs Enforcement, which has been in the news lately. What they were proposing was to partner with tech companies to analyze someone's social network activity and help them determine whether this person would be a good immigrant or not, whether this person is likely to commit a terrorist act or not, whether this person would be a good citizen. Now, I want to stop here and say there are two questions we can ask. The first is whether or not we should have tools that are doing this in the first place. But the second question, the one I want to address today, is that the current AI tools, the automated tools we have to do this task, are not robust enough to be used in such high-stakes scenarios. Let me give you an example. A young Palestinian man woke up one day and wrote "good morning" in Arabic on Facebook, and it got translated to "attack them" in English. Israeli authorities looked at the English translation, they didn't try to see what he wrote in Arabic, and they arrested him. They later let him go, but they arrested him. This idea of people not applying their critical thinking skills to algorithms the way they would to you and me is called automation bias, and for me, the combination of automation bias with the systemic errors we have in our tools is very dangerous.
Another tool that one might imagine being applied to the analysis of social networks is word embeddings. Think of word embeddings as a space where words that are deemed to be similar to each other are closer together than words that are deemed to be different. You can use these word embeddings to train analogy detectors, and my colleague Adam Kalai and his collaborators recently showed that these analogies can encode societal biases. I'll give you an example. You have an analogy: man is to brother as woman is to sister. Man is to king as woman is to queen. Man is to doctor as woman is to nurse. Why? See, that's what the embedding gives you. But here is a societal bias: man is to computer programmer as woman is to... homemaker. So you might ask: okay, there are these biases in this thing called word embeddings, why should I care? To make the mechanics concrete, here is a little sketch of how these analogy queries are computed, and then I'll give you an example of where it matters.
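This is a toy sketch, not the system from the paper: the vectors below are made up for illustration (real systems use embeddings such as word2vec or GloVe, trained on large text corpora), but the "b minus a plus c" vector arithmetic is the standard way these analogies are answered.

```python
# Toy sketch of analogy detection in a word-embedding space.
# The vectors are invented; real embeddings are learned from text.
import numpy as np

embeddings = {
    "man":    np.array([ 1.0, 0.2, 0.0]),
    "woman":  np.array([-1.0, 0.2, 0.0]),
    "king":   np.array([ 1.0, 0.9, 0.8]),
    "queen":  np.array([-1.0, 0.9, 0.8]),
    "doctor": np.array([ 0.6, 0.1, 0.9]),
    "nurse":  np.array([-0.8, 0.1, 0.9]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def analogy(a, b, c):
    """Solve 'a is to b as c is to ?' via vector arithmetic: b - a + c."""
    target = embeddings[b] - embeddings[a] + embeddings[c]
    # Pick the closest word, excluding the query words themselves.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))

print(analogy("man", "king", "woman"))    # queen
print(analogy("man", "doctor", "woman"))  # nurse: the bias rides along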
Well, you can imagine an automated tool being used to analyze resumes. Let's say I'm a recruiter trying to find someone whose skills are close to a computer programmer's skills. I might see things like Java, Python, and C++ in the resume, but I'm also going to see that this person was a quarterback at the University of Vermont, which tells me this person is most likely a white male. And I might have a similar resume, with the exact same qualifications, but this person was a softball team captain at Spelman College, which is an HBCU, a historically black college for women, so this person is most likely a black female. Now, if I have a tool telling me that the distance from the black female to computer programming is much higher than the distance from the white male, then I'm propagating these biases. We are propagating biases in ways that we don't even understand right now; the sketch below shows how that can happen.
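This is purely hypothetical: the vectors, the resume words, and the gendered "tilt" are all invented, and no real screening product is being described. The sketch only shows the mechanism, which is that two resumes with identical technical skills can land at different distances from "programmer" in a biased embedding space.

```python
# Hypothetical sketch of a resume screener built on biased embeddings.
import numpy as np

# Imagine "programmer" sits closer to male-coded words because of
# biases in the text the embeddings were trained on.
embeddings = {
    "java":        np.array([0.9, 0.8, 0.0]),
    "python":      np.array([0.8, 0.9, 0.0]),
    "quarterback": np.array([0.3, 0.1, 0.9]),   # male-coded in the data
    "softball":    np.array([0.3, 0.1, -0.9]),  # female-coded in the data
    "programmer":  np.array([0.9, 0.9, 0.3]),   # note the tilt toward +z
}

def score(resume_words):
    """Rank a resume by cosine similarity of its mean vector to 'programmer'."""
    v = np.mean([embeddings[w] for w in resume_words], axis=0)
    t = embeddings["programmer"]
    return float(v @ t / (np.linalg.norm(v) * np.linalg.norm(t)))

# Identical technical skills; only the extracurricular word differs.
print(score(["java", "python", "quarterback"]))  # ~0.99, ranked higher
print(score(["java", "python", "softball"]))     # ~0.85, ranked lower
```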
One other tool I want to talk about, once again heavily used in government and law enforcement, is face recognition. I encourage all of you to read the Perpetual Line-Up report, which talks about the fact that one in two American adults is in some face recognition database. One in two. And once again we have two questions here. One is whether we should even have such a surveillance AI system in the first place. But the second thing I want to talk about is whether or not the face recognition tools we currently have are robust enough to be conducting such surveillance in such high-stakes scenarios, and my answer is no. You all might have heard about the highly publicized case of Google Photos classifying some Black people as gorillas, and my colleague Joy Buolamwini and I had a paper describing the fact that commercial gender classification systems have very high error rates, particularly for women of darker skin tones. What is a gender classification system? Think of it as a system where I show it an image of a face and it gives me a binary label, male or female. It doesn't handle any other types of identities at all. And even in this very, very constrained scenario, here you see the error rates rise as you go darker and darker in skin tone. Let's look at IBM, for example: for the lighter-skinned group you have an error rate of 5.1%, and then you get to 46.8%, which is almost a random-chance error rate for a binary label. So our tools are not robust enough to be used in the manner we're using them right now, and we don't know what kinds of biases we're propagating, because all of these little AI tools are being strung together and used in very high-stakes scenarios. The sketch below shows why reporting only an aggregate number hides exactly this kind of failure.
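Here is a minimal sketch of disaggregated evaluation. The counts are invented (they are not the figures from our paper); the point is that a single aggregate error rate can look tolerable while one subgroup sits near coin-flip accuracy.

```python
# Why disaggregated evaluation matters: invented audit records of the
# form (subgroup, true_label, predicted_label).
from collections import defaultdict

records = (
    [("lighter_male", "male", "male")] * 95
    + [("lighter_male", "male", "female")] * 5
    + [("darker_female", "female", "female")] * 53
    + [("darker_female", "female", "male")] * 47
)

errors = defaultdict(lambda: [0, 0])  # subgroup -> [wrong, total]
for group, truth, pred in records:
    errors[group][0] += truth != pred
    errors[group][1] += 1

total_wrong = sum(w for w, _ in errors.values())
total = sum(n for _, n in errors.values())
print(f"aggregate error rate: {total_wrong / total:.1%}")  # 26.0% overall
for group, (wrong, n) in errors.items():
    print(f"{group}: {wrong / n:.1%}")  # 5.0% vs 47.0%, near coin-flip
```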
So what are the solutions? Do I have all the answers? I don't have any of the answers, but I have some ideas for some parts of the solution. In my community, my research community, there are a lot of people, including me, who research technical solutions, baking some mathematical notion of fairness into these tools. But I want to say that before we even get there, we can't ignore social and structural problems. We can't ignore the fact that there are certain groups of people who are heavily policed. Who are the groups of people who might be adversely affected by these biases, by these surveillance practices? And who are the groups of people who are involved in creating this technology and distributing it? They look like this. They don't look like the people who are being surveilled, heavily policed, and adversely impacted by this technology. Currently we don't have any rules or regulations or standards to figure out what kind of tool you can use for which purpose, and I'm advocating for standards, laws, and documentation. I want to close by saying that a bunch of other industries have been here; AI is not the first one. For example, I come from a hardware background. I used to design circuits, and in circuits we have something called a datasheet.
Every component, from something as simple as a resistor to something as complicated as a CPU, comes with a datasheet that tells you standard recommended operating characteristics, disclaimers, and so on. Anybody heard of a datasheet? No? Okay. I'm advocating that we need to have datasheets for datasets, and for some of these automated tools that are being used in different places. We need to have information about recommended usage; for example, if you're going to use face recognition algorithms in high-stakes scenarios, they need to have very different characteristics than for some toy application, and we need standards that describe this. My colleagues and I have written a paper fleshing out this idea; if you're interested, it's called Datasheets for Datasets, and you can check it out. To give you a flavor, here is a very simplified sketch of the kind of information a datasheet might record.
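A heavily hedged sketch: the actual Datasheets for Datasets proposal is a set of questions answered in prose, not code, and every field and value below is a hypothetical illustration of what a machine-readable version might capture.

```python
# Hypothetical, much-simplified "datasheet for a dataset", by analogy
# with hardware datasheets. All fields and values are illustrative.
from dataclasses import dataclass, field

@dataclass
class Datasheet:
    name: str
    motivation: str                 # why was the dataset created?
    collection_process: str         # how, and from whom, was it collected?
    demographic_breakdown: dict     # composition by relevant subgroups
    recommended_uses: list = field(default_factory=list)
    out_of_scope_uses: list = field(default_factory=list)  # the disclaimers

sheet = Datasheet(
    name="face-images-v1",  # hypothetical dataset
    motivation="benchmarking gender classification systems",
    collection_process="photos of public figures scraped in 2016",
    demographic_breakdown={"lighter-skinned": 0.79, "darker-skinned": 0.21},
    recommended_uses=["academic benchmarking"],
    out_of_scope_uses=["surveillance", "law-enforcement identification"],
)
print(sheet.out_of_scope_uses)
```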
The second industry I want to talk about is one we all know: the automobile industry. When cars came on the road in the U.S., there were no stop signs, or seatbelts, or driver's licenses, or drunk driving laws, or anything like that, and there were so many accidents. Even when seatbelts were legislated in 1967, people didn't want to use them, so there had to be massive social campaigns to convince people to use seatbelts. And there was this issue of bias, and there still exists this issue of bias, in automobiles too. Some of you might know that car manufacturers were only testing their cars using crash dummies with prototypical male characteristics, so you ended up with cars that were disproportionately killing women and children. It was very, very recently that the government mandated that they have to test their cars with crash dummies with prototypical female characteristics as well; I'm talking about the late '90s when I say very recently. In fact, there was a court opinion in the 1920s asking whether or not the automobile was inherently evil, and this reminds me of the current conversations happening about AI: whether or not it is inherently evil, are the robots going to kill us all, and so on. So I find it very interesting and important to go back in history, look at other industries, and draw some parallels. Finally, I'll close by talking about a very important industry: the healthcare industry.
Some of you might know that there was a time when clinical trials were not regulated in the United States, and there was also a lot of unethical drug experimentation being done on marginalized communities: vulnerable people, people of color, soldiers, prisoners, pregnant women, and so on. And even once there was a procedure for clinical trials, guess what: once again, there was no law mandating that in your clinical trial you needed to test your drug on people of different groups before you put it on the market, and that you needed to disaggregate all the results by these groups. Again, it was very, very recently that the United States government mandated that this had to be done for clinical trials. So now you have studies showing that 8 out of the 10 drugs pulled out of circulation between 1997 and 2001 disproportionately affected women. It took many years for standards to be put in place in these industries, and we're still suffering from the biases that were propagated. We should learn from these other industries and not do the exact same thing with AI. We should have guardrails in place, and we should make sure that the group of people involved in creating the technology resembles the people who are using the technology. Thank you.