Get to Know Berkman Klein Fellow Amy Zhang
a spotlight on one of our 2018-2019 BKC Fellows
by Adam Nagy and Mai ElSherief
Amy Zhang is a fifth-year Computer Science PhD candidate at MIT CSAIL. Her research focuses on human-computer interaction, social computing, and the development of tools to improve discussions online. Her recent projects include Squadbox, a tool that allows recipients of email harassment to crowdsource moderation, blocking, and other forms of support, and the co-development of interoperable standards to define and annotate the credibility of online news.
Mai ElSherief, a fifth-year Computer Science PhD candidate at UC Santa Barbara and a Berkman Klein Center summer intern, and Adam Nagy, a former research assistant at the Cyberlaw Clinic, sat down with Amy to discuss her research and her goals as a Fellow!
Read more stories from our Interns and Fellows!
Mai: So, the first question is, please describe who you are and your research in a Tweet length or less.
Oh no! I have to tweet on the fly! Okay, I’m a fifth-year PhD student in computer science at MIT. I do HCI research, human-computer interaction, and I basically focus on building tools for everyday users to have better discussions online.
Mai: Okay, nice, this is the new tweet length: 280 characters.
It was kind of like a couple tweets.
Adam: It’s a thread. Right off the bat I’m going to go off-script and ask if you could describe human computer interaction.
So basically, it’s an area of computer science that looks at how to build systems and actually put them into practice, including how humans actually interact with technology. It incorporates elements of design, psychology, and sociology, and uses a wide variety of methods to answer those questions: anything from interviews, to surveys, to quantitative data analysis, to actually building prototype systems and running iterations of user studies with people.
Adam: How do you envision the next generation of communication tools? And do you think it's important to improve what we already have, or to create new tools entirely?
Yeah, that’s an interesting question. I think the status quo is probably not a great place to be right now, because of a bunch of factors that have only gotten worse in the last couple of years, one of those being information overload. I think we need more tools to get better at managing and handling information overload. Everyone has had that experience of being inundated with emails, chats, articles, and tweets to read, and of having difficulty understanding and filtering it all, knowing whether something is true or whether to trust it, and even just keeping up with all the things they have to do.
Mai: When I was going through your research, I found that a lot of the online harassment solutions you propose leverage collective support and an existing community of people to help those who experience harassment. So the question is: why is it important to you for solutions to include these communities, and how do you motivate the people around you to be more involved in the process of standing up to harassers?
I think there are a number of different strategies that can help with online harassment, and they come at all sorts of different levels: all the way from platforms stepping in and hiring moderators, to building better tools for individuals to get help from their friends, and anywhere in between. The reason I’ve been focusing on the latter is mostly that I don’t think there is enough work out there on that particular aspect, especially in terms of tools that actually help people leverage their communities and the people they trust. That’s why I focused on it; I don’t think it’s the only solution. I think there actually need to be solutions that include many different groups.
And also, from talking to and interviewing people who have experienced harassment, many talked about not believing that platforms would be able to solve the problem by themselves. We’ve also interviewed people whose friends were harassed, and the overwhelming response we got was, “Oh my gosh, my friend is in this terrible situation. I really want to help them, but I don’t really know how. They seem to be taking on this big burden by themselves and there is nothing I can do about it.” So I think just giving people the opportunity to help would be great.
Mai, Narrating: We then asked Amy about Squadbox, a tool she created to help victims of harassment, and about her goals for improving the tool over the course of her fellowship.
Yeah. Squadbox is a tool we built that lets people recruit their friends to help them moderate and deal with harassment. The tool lets a friend do things like check incoming messages from strangers and decide what to do with them, or manage the user’s whitelist or blacklist for them, and you can imagine other things they could do, such as helping report people, or even responding to the harasser on behalf of the person being harassed. We’re thinking of different ways for people who are getting harassed not to feel like everything is coming directly onto them. We built and released the tool really recently, and we’ve gotten some users, but I’d really love to do a longer study with people using the tool and see the longitudinal effects: how can we help moderators deal with the effects of having to look at this material? How can we support and encourage each other to do this kind of work? That’s my goal there.
Mai: So, more design, putting this into the wild, getting feedback from users and incorporating it... that’s very interesting. As you mentioned just now, someone’s friend might want to respond to the harasser, right? That gets us to our next question, which is about crowdsourcing counter-harassment: where do you see the balance between providing the person being harassed with support, and when do you see the need to fight back?
I think it really depends on the person who is being harassed. From talking to people, they have very different ideas about what they want to happen. Some people just want the harassment blocked so they never see it again. Some want to engage with the people harassing them and write back, or go on to write a blog post or talk about the harassment they’ve received, and some don’t want that at all. So I think it’s really up to that person, and then for that person to communicate to their friends, or whoever is trying to help them, “This is the way I want to be helped,” as opposed to others deciding for them.
Mai: So do you think Squadbox will have this option?
Yeah, so right now on the tool, the person being harassed can specify certain things, like “Please don’t talk back to the harasser for me,” and you can even obfuscate who is actually doing the harassing, like their email address. It’s basically a set of controls over what the moderator can and cannot see, and what they can and cannot do.
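The flow Amy describes could be sketched roughly like this. This is a hypothetical Python model, not Squadbox’s actual code; all names, fields, and addresses are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class ModeratorPermissions:
    """Owner-specified limits on what a moderator can see and do."""
    may_reply_to_sender: bool = False    # "please don't talk back to the harasser for me"
    may_see_sender_address: bool = True  # can be disabled to obfuscate the harasser's email

@dataclass
class Inbox:
    owner: str
    whitelist: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)
    moderation_queue: list = field(default_factory=list)
    permissions: ModeratorPermissions = field(default_factory=ModeratorPermissions)

    def route(self, sender: str, message: str) -> str:
        """Deliver trusted mail, drop blocked mail, queue the rest for a friend to review."""
        if sender in self.blacklist:
            return "dropped"
        if sender in self.whitelist:
            return "delivered"
        shown = sender if self.permissions.may_see_sender_address else "<hidden>"
        self.moderation_queue.append((shown, message))
        return "queued"
```

The point of the sketch is the last branch: mail from strangers goes to a friend’s queue rather than to the owner, and the permission flags control what that friend sees.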
Adam, Narrating: In April, Amy and a group of interdisciplinary researchers known as the Credibility Coalition released a paper with the goal of creating a scientific, systematic, and scalable way to assess the reliability of information online. We wanted to hear more about this research and its potential applications.
I’ll just tell the backstory of how this came to be. It started out of a series of conferences, kind of hackathons, around misinformation called MisinfoCon, and I was at one of these at the MIT Media Lab a year ago. There were all these different projects around misinformation: building algorithms to detect misinformation was a popular one; another was giving users tools to determine whether something was true or false, or letting people annotate different news articles as true or false and crowdsourcing this.
And one thing that several of us at the conference noticed was: it would be really great if the tool that has people annotate misinformation could be useful for the tool that’s learning what misinformation is. Wouldn’t it be great if this data could be shared and interoperable? Out of that came the current work, which asks: okay, maybe we can come up with a shared vocabulary for the different things people actually consider when they think about what makes something credible.
So we hosted a series of workshops at a series of conferences; I helped host one at MozFest in London last year. Basically, we gave people articles and asked: is this credible or not credible, and why do you think so? What are the things you are noticing? From people’s suggestions we came up with a short list of types of indicators: things from the tone of the article, to the citations it’s referencing, which are content-based, within the article itself, versus something contextual, such as what the ads on the page are like, for instance. From that, we decided to work on the definitions of those indicators, collect some data around them, and release that data. And then the second question was...
Adam: If any of these findings came as a surprise?
Oh yes! There were some interesting things. At first we were kind of hesitant about one of the indicators, clickbait, because we thought, well, everyone does clickbait now, the mainstream sources and the non-mainstream sources alike. But clickbait actually turned out to be a pretty good indicator, partially, I think, because we designed the question so that it wasn’t binary; it was more like “how clickbaity is this title?” The more clickbaity it was, the less credible the article turned out to be. In the future, I think it would be really interesting to enrich that a bit and think about the different techniques people use in their headlines. Another one was ads: it turns out that the mere use or quantity of ads was not a great indicator, but the aggressiveness of ad placement was. If the ads were really harming the reading experience, that was a strong signal, but not if there were simply ads on the page.
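The idea of graded rather than binary indicators could be sketched like this. This is not the Credibility Coalition’s actual scoring scheme; the indicator names and weights are invented for illustration:

```python
def credibility_signal(indicators: dict) -> float:
    """Combine annotated indicators into a rough credibility signal in [0, 1].

    `clickbait` and `ad_aggressiveness` are graded 0-1 rather than yes/no,
    matching the finding that degree matters more than mere presence.
    The weights are made up for this sketch.
    """
    penalty = 0.5 * indicators.get("clickbait", 0.0) \
            + 0.3 * indicators.get("ad_aggressiveness", 0.0)
    bonus = 0.2 if indicators.get("has_citations") else 0.0
    # Clamp to [0, 1] so extreme combinations stay in range.
    return max(0.0, min(1.0, 1.0 - penalty + bonus))
```

Under this toy scheme a slightly clickbaity title costs a little, a very clickbaity one costs a lot, and a page whose ads wreck the reading experience is penalized while ordinary ads are not.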
Adam: I was wondering whether you have thought about avoiding the Snopes/PolitiFact problem, where the people you are trying to convince, or steer away from misinformation, accuse the fact-checkers themselves of being biased or part of the misinformation. Have you thought about that in the course of building these tools?
Yeah, absolutely! I think that’s the problem with building a centralized system, right? When you have one thing, people can just attack that thing. Hopefully what we are doing is thinking about this in a more decentralized fashion, where any person or organization can contribute annotations, and then people can decide which annotations they actually want to see. If they trust one organization over another, they can specify that they only want to look at that organization’s annotations, for instance, since there are many fact-checking organizations out there and they are all somewhat different.
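That reader-side choice could look something like the following sketch. The source names and fields are hypothetical, not part of any real annotation format:

```python
def visible_annotations(annotations: list, trusted_sources: set) -> list:
    """Show only credibility annotations published by sources the reader trusts."""
    return [a for a in annotations if a["source"] in trusted_sources]

# Two organizations annotate the same article with conflicting verdicts.
annotations = [
    {"source": "FactCheckerA", "article": "example.com/story", "verdict": "credible"},
    {"source": "FactCheckerB", "article": "example.com/story", "verdict": "not credible"},
]
# A reader who trusts only FactCheckerA sees only that verdict.
```

The key property is that no single gatekeeper decides which verdict wins; each reader picks whose annotations to display.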
Adam: The last question in this vein is about the transparency-versus-gaming tradeoff with bad actors. The more decentralized and open you are, the more someone trying to spread misinformation can learn, for example, that clickbaitiness matters, and conclude, “Okay, I am going to write something that people do not find to be clickbaity.” I am wondering about that potential for gaming that comes along with being transparent.
Yeah, there are definitely some interesting tradeoffs. I think transparency is key, because an algorithm that detects misinformation in a way no one can understand is problematic and hard to verify. On the other hand, you have this issue of gaming. For some indicators, that is actually not a problem. Take clickbait: if people stop writing clickbait headlines, I think that’s a good thing, because it means we have less sensationalist content. Those headlines are kind of psychological hacks, right? If we take that toolkit away from people trying to game someone’s feed, that is overall a good thing. Of course, there are other indicators where someone could think, “Alright, if I check off these boxes, then my content will be deemed credible,” even if it is not.
Mai, Narrating: We asked Amy what kind of responsibility she believes would then fall on the end users.
Yeah, I think on the whole it’s good to think about how we can encourage people to be better information consumers. Maybe this involves trying to get people to be closer readers, to think about something before they share it, to get them in the mindset of “true versus not true” as opposed to “agree versus not agree.” So design-wise, anything we could use to nudge people in that direction would be great. I think something like just a score, or, even more opaquely, the algorithm’s filtering silently changing based on these calculations, would not have as much benefit as something that provides more context to the user.
Mai: Because I also work on hate speech and other kind of depressing issues [Amy laughs], I was wondering: what is the most surprising thing you have encountered while working on harassment and misinformation? And, since this research is often depressing, what are some victories that you would like to share?
Yeah, I think one thing I have been pleasantly surprised by is the positive feedback we have been getting about Squadbox; that feels really good! It’s like, okay, we are doing something. I think it’s partially because harassment is finally getting some visibility in the world and in the mainstream news, so people realize it is a problem and want to do something about it, which is very encouraging, because that’s the first step. Now we know it’s a problem, let’s start thinking about what we can do, and I think for a long time we were not yet at even that first step. A lot of times you will still hear, though it is less common now, things like “Just log off,” “It’s not real,” “It’s online,” “It’s all fake,” “It’s not real life,” and I think people are finally coming around to the idea that it really does affect your life and is a real problem we should think about.
Mai: Let’s hope for the best!
Adam: My question is: who are some people or organizations in this space whom you admire, or whose work has influenced your own?
Amy: Yeah, lots of groups. In the anti-harassment space, we were very inspired by the group “Hollaback” and their tool “HeartMob,” of course; they are also a sort of community support group for anti-harassment. A really great non-profit that has been thinking about how to help people deal with harassment is “Online SOS,” which got started a few years ago. And in the misinformation space, there are a lot of things popping up. The group I have been working with, that I did the research study with, is the “Credibility Coalition,” and I have just been really impressed with the people who came together on that. It came out of small conversations that grew and grew and grew, and I am impressed with how thoughtfully people have put together and led this effort.
Read more about our Open 2019 - '20 Fellows Call for applications!
Mai ElSherief is a fifth-year PhD candidate in the Computer Science department at UC Santa Barbara. Her research interests lie at the intersection of computer science and computational social science, specifically causes for social good. Find Mai on Twitter.
Adam Nagy was an algorithms and justice research assistant at the BKC Cyberlaw Clinic during the summer of 2018. He is currently a Project Coordinator at the Cyberlaw Clinic within the Algorithms and Justice track of the Ethics and Governance of Artificial Intelligence Initiative.