Ethics of Algorithms in IoT - Analyzing data in the Internet of Things

CHAPTER 9 How Are Your Morals?

Ethics in Algorithms and IoT

Majken Sander and Joerg Blumtritt

Editor’s Note: At Strata + Hadoop World in Singapore, in December 2015, Majken Sander (Business Analyst at BusinessAnalyst.dk) and Joerg Blumtritt (CEO at Datarella) examined important questions about the transparency of algorithms, including our ability to change or affect the way an algorithm views us.

The code that turns things into smart things is not objective. Algorithms carry value judgments: decisions about methods, or about the presets of a program’s parameters. These choices reflect how to handle tasks according to social, cultural, and legal rules, or personal persuasion. The underlying value judgments imposed on users are not visible in most contexts. How can we direct the moral choices in the algorithms that affect the way we live, work, and play?

As data scientists, we know that behind any software that processes data is the raw data, which you can see in one way or another. What we usually do not see are the hidden value judgments that drive the decisions about what data to show and how. These are judgments that someone made on our behalf.

Here’s an example of the kind of value judgment and algorithm that we will be facing within months, rather than years: self-driving cars. Say you are in a self-driving car, and you are about to be in an accident. You have a choice: will you be hit head-on by a huge truck, or from the side? You would choose the side impact, because you think that gives you the best chance of surviving, right? But what if your child is in the car, sitting next to you? How do you tell an algorithm to change the choice because of your values? We might be able to figure that out.

One variation in algorithms already being taken into account is that cars will obey the laws in the country in which they’re driving. For example, if you buy a self-driving car and bring it to the United Kingdom, it will obey the laws in the United Kingdom, but that same car should adhere to different laws when driving in Germany. That sounds fairly easy to put in an algorithm, but what about differences in culture and style? How do we put those in the algorithms? How aggressively would you expect a car to merge into the flow of the traffic? Well, that’s very different from one country to the next. In fact, it could even be different from the northern part of a country to the southern, so how would you map that?

Beta Representations of Values

Moving beyond liabilities and other obvious topics with the self-driving car, we would like to suggest some solutions that involve taking beta representations of our values, and using those to teach the machines we deal with who we are and what we want. Actually, that’s not too exotic. We have that already. For example, we have ad preferences from Google. Google assigns properties to us so it can target ads (hygiene, toiletry, tools, etc.), but what if I’m a middle-aged man working for an ad agency that has a lingerie company as a client, with a line especially targeted at little girls? Google would see me as weird. Google forces views on us based on the way it looks at us.

What about a female journalist who is writing a story about the use of IoT, and her fridge tells her that, because she’s pregnant, she may not drink beer, because pregnant people are not allowed to drink beer? And her grocery store, which delivers groceries to her a couple of times each week, suddenly adds orange juice, because everyone knows that pregnant women like orange juice. And her smart TV starts showing her ads for diapers. The problem is, the journalist is not pregnant, but there’s nowhere she can go to say, “Hey, you’ve got it wrong. I’m not pregnant! Give me my beer and drop the orange juice!”

And that’s the thing that we want to propose—we need these kinds of interfaces to deal with these algorithmic judgments.

Choosing How We Want to Be Represented

There are three ingredients for doing this. First, we have training data, which is the most important. We have to collect data on how we act, so the machine can learn who we are. Next, we have the algorithms, usually some kind of classification or regression algorithm, such as decision trees, neural networks, or nearest neighbor. You could just see who is like me, then see how people who are like me would have acted, and then extrapolate from that.
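The "see who is like me" idea can be sketched as a tiny nearest-neighbor classifier. This is only an illustration under stated assumptions: the profiles, the two made-up scores (how much a person values speed and scenery, each on a 0 to 10 scale), and the function name `predict_route` are all hypothetical and do not come from the talk.

```python
import math
from collections import Counter

# Hypothetical training data: ((speed_score, scenery_score), route the person chose).
# All values are invented for illustration.
PEOPLE = [
    ((9, 2), "fast"),
    ((8, 3), "fast"),
    ((7, 1), "fast"),
    ((2, 9), "scenic"),
    ((3, 8), "scenic"),
    ((1, 7), "scenic"),
]


def predict_route(me, people=PEOPLE, k=3):
    """Guess my choice from the k people whose scores are closest to mine."""
    # Sort everyone by Euclidean distance to my profile and keep the k nearest.
    nearest = sorted(people, key=lambda person: math.dist(person[0], me))[:k]
    # Extrapolate: take the most common choice among those neighbors.
    votes = Counter(choice for _, choice in nearest)
    return votes.most_common(1)[0][0]
```

Calling `predict_route((4, 7))` looks up the three most similar profiles and returns their majority choice. Everything the prediction "knows" about me comes from people who merely resemble me, which is exactly why the quality of the training data matters so much.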

And then there’s the third ingredient: the boundary conditions, meaning the initial distribution that you would expect. This is the really tricky part, because it’s always built into these kinds of probabilistic machines, and it’s the least obvious of the three.
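To see why the initial distribution matters, consider a minimal sketch that assumes a Beta prior over a yes/no property of a user, such as the "pregnant" label in the fridge example. The talk does not name a specific model, so the Beta-Bernoulli choice, the function name `posterior_mean`, and all numbers are assumptions made for illustration.

```python
def posterior_mean(prior_yes, prior_no, seen_yes, seen_no):
    """Posterior mean of a Beta(prior_yes, prior_no) prior after yes/no observations."""
    return (prior_yes + seen_yes) / (prior_yes + prior_no + seen_yes + seen_no)


# Identical evidence in both cases: two observations that point to "yes", none to "no".
open_minded = posterior_mean(1, 1, seen_yes=2, seen_no=0)  # uniform prior
skeptical = posterior_mean(1, 9, seen_yes=2, seen_no=0)    # prior expects "no"
```

With the uniform prior the machine ends up 75 percent sure, while the skeptical prior leaves it at 25 percent, although both saw the same data. The prior is a value judgment made by whoever built the system, and the user never sees it.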

For example, take the self-driving car (again). One of the challenges could be judging your state of mind when you get in the car. One day, you’re going to work and just want to get there fast. Is there an algorithm for it? Then there are days when you want to put the kids in the car and drive around and see trees and buildings and fun places. If you only do that on Sundays, that’s easy for an algorithm to understand, but maybe it’s Thursday and you want this feeling of Sunday.

Some software companies would solve it by having an assistant pop up when you get in your car. You would have to click through 20 questions in a survey interface before your car would start. How are you feeling today? Of course, no one wants to do that. Still, there must be some easy way to suggest different routes and abstractions for that.

There is. It’s already there. It’s just not controllable by you. You can’t teach Google Maps. It’s the guys in Mountain View that make these decisions for you, and you really don’t see how they did it. But it should not be a nerd thing: it should be easy to train these algorithms ourselves.

Some companies are already experimenting with this idea. Netflix is a very good example. It’s the poster child for recommendation engines. There’s a very open discussion about how Netflix does it, and the interesting thing is that Netflix is really aware of how important social context is for your decisions. After all, we don’t make decisions on our own. Those who are near to us, like family, friends, or neighbors, influence us. Also, society influences our decisions.

If you type “target group” into Google’s image search, you get images that show an anonymous mass of people, and actually, that is how marketing teams tend to see human beings: as a target that you shoot at. The idea of representation is closely tied to a target group, because it gives you a meaningful aggregate, a set of people who could be seen as homogeneous enough to be represented by one specimen. You could do that by saying, well, as a market researcher, I take a sample of 2,000 women, and then I take 200 of them that might be women 20 to 39 years old. That’s how we do market research; that’s how we do marketing.

We would take these 200 women, build the mean of their properties, and all other women would be generalized as being like them. But this is not really the world we live in. If a recommendation engine, a search engine, or a targeting engine is done well, we don’t see people represented as aggregates. We see each one represented as an individual. And we could use that for democracy also.

We have these kinds of aggregates in democracies too, in the constituencies. It’s a one-size-fits-all Conservative Party program. It’s a one-size-fits-all Labor Party program, or Green Party program. Maybe 150 years ago this would define who you are in terms of the policies you would support. That made sense. But after the 1980s, that changed. We can see it now. We can see that this no longer fits. And our algorithmic representation might be a solution to scale, because you can’t scale grassroots democracy.

A grassroots democracy is very demanding. Everybody always has to decide on every policy that’s on the table. That’s not feasible. You can’t even do that in small villages like we have in Switzerland. Some things have to be decided by a city council, so if we want a nonrepresentative way of doing policies, of doing politics, we could try using algorithmic representation to bulk suggest policies that we would support.

That need not be party programs. It could be very granular: single decisions that we would tick off one by one. There are some downsides, some problems that we have to solve.

First, these algorithms tend to have a snowball effect for some decisions. We made agent-based simulation models, and that was one of the outcomes. And in general, democracies and societies, even nondemocratic societies, don’t work by just doing majority representation. We know that. We need some kind of minority protection. We need to represent a multitude of opinions in every social system.

Second, there are positive feedback loops. I might see the effect of my voting together with others, and that’s like jumping on the bandwagon. That’s also seen in simulations. It’s very strong. It’s the conforming trap.
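The bandwagon effect described above can be reproduced in a very small agent-based model. The following sketch is not the authors’ simulation, which the talk does not describe in detail. It assumes a deliberately simple, hypothetical rule: each agent has a "conviction" threshold and switches to the majority side once the visible majority share exceeds that threshold.

```python
def simulate_bandwagon(votes, convictions, rounds):
    """Return the number of 'yes' votes at the start and after each round.

    votes: list of booleans (True = yes), one per agent.
    convictions: per-agent threshold in [0, 1]; an agent in the minority
        switches sides once the majority share exceeds this value.
    """
    votes = list(votes)
    history = [sum(votes)]
    for _ in range(rounds):
        yes_share = sum(votes) / len(votes)
        if yes_share != 0.5:  # an exact tie exerts no pressure
            majority = yes_share > 0.5
            majority_share = max(yes_share, 1 - yes_share)
            # Everyone sees the same published share and reacts to it.
            votes = [
                majority if (v != majority and majority_share > c) else v
                for v, c in zip(votes, convictions)
            ]
        history.append(sum(votes))
    return history


# Hypothetical population: 6 firm supporters, 4 opponents of varying conviction.
start = [True] * 6 + [False] * 4
conviction = [1.0] * 6 + [0.55, 0.65, 0.75, 0.95]
trace = simulate_bandwagon(start, conviction, rounds=4)
```

In this made-up population a 6-to-4 split grows round by round as each defection makes the majority look larger, which in turn tips the next agent. Only the most convinced opponent holds out, which illustrates why some form of minority protection has to be designed in.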

And third, your data is always lagging behind. Your data is your past self. How could it represent changes of your opinion? You might think, well, last election, I don’t know, I was an angry, disappointed employee, but now, I’m self-employed, and really self-confident. I might change my views. That would not necessarily be mapped into the data. So these are three things we should be careful about.

The fourth one is that we have to take care of the possibility that the algorithm of me is slightly off. It could be in a trivial way, like what I buy for groceries. It could be my movie preferences. So I have to actually give my “algorithmic me” feedback. I have to adjust it, maybe just a little bit, but I have to be able to deliver the feedback that, in the earlier examples, we were lacking the skills to do.
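What "adjusting it just a little bit" could look like is sketched below. This assumes a hypothetical profile that stores one preference score between 0 and 1 per item, and a simple rule that moves the score a fraction of the way toward the user’s feedback. The function name `apply_feedback`, the `rate` parameter, and the scores are illustrative assumptions, not an existing API.

```python
def apply_feedback(profile, item, liked, rate=0.2):
    """Return a copy of the profile with one score nudged toward the feedback.

    profile: dict mapping item name to a preference score in [0, 1].
    liked: True pulls the score toward 1.0, False pulls it toward 0.0.
    rate: how far to move per correction (0 = ignore, 1 = overwrite).
    """
    current = profile.get(item, 0.5)  # unknown items start out neutral
    target = 1.0 if liked else 0.0
    updated = dict(profile)
    updated[item] = current + rate * (target - current)
    return updated


# The system wrongly believes I love orange juice; I tell it otherwise.
algorithmic_me = {"orange juice": 0.9, "beer": 0.4}
corrected = apply_feedback(algorithmic_me, "orange juice", liked=False)
```

One correction lowers the orange juice score from 0.9 to about 0.72, and repeated corrections keep pulling it down. A small `rate` keeps a single click from overwriting the whole history, while still letting a changed opinion catch up with the lagging data.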

As users, we actually need to ask questions. Instead of just accepting that Google gives me the wrong product ads compared to my temperament, my fridge orders orange juice that I dislike, or my self-driving car drives in a way that I find annoying, we need to say, hey, the data is there. It’s my data. Ask me for it and I will deliver this, so you can paint the picture of who I really am.

About the Author

Alice LaPlante is an award-winning writer who has been writing about technology, and the business of technology, for more than 20 years. The former news editor of InfoWorld, and a contributing editor to ComputerWorld, InformationWeek, and other national publications, Alice was a Wallace Stegner Fellow at Stanford University and taught writing at Stanford for more than two decades. She is the author of six books, including Playing for Profit: How Digital Enter‐
