Q&A with William Li, PhD student in computer science
by Achyuth Samudrala
Can you tell us something about your experience with CSAIL, MIT and research work building up to Berkman?
I am a PhD student in computer science at MIT focused on analyzing large, text-based open government datasets. Along the way, I was a master’s student in the Technology and Policy Program at MIT, and I’ve developed a strong interest in the interplay of technology and law, politics, and public policy.
I’ve actually had a chance to collaborate with past Berkman fellows, staff, and community members on a number of projects that have turned into academic publications, and I’m looking forward to working more closely with folks at Berkman in the coming year.
Your CV mentions a lot of work in interdisciplinary fields of computer science. How do you balance the statistical aspects with the more speculative aspects of such research?
I thoroughly enjoy interdisciplinary research collaborations; learning how to speak the languages of different fields and finding areas of mutual interest are fun and rewarding. I think that the way to balance the different dimensions of a project is to remain focused on the specific question or problem of interest, and to address it using the skills and background I have.
What projects do you plan to work on as a Berkman fellow?
I’m excited about projects that focus on large collections of text documents generated by government or the processes of citizen participation. These include studying similarities and differences in legislation introduced in all 50 states, public comments on the FCC’s net neutrality rules, and other documents from the different branches of government. I’m also developing ways to algorithmically describe, summarize, or interpret large text collections. Most recently, I’ve been interested in statistical models of text reuse, i.e. how text is repeated across large sets of documents.
What do you think is going to be the biggest challenge in your research on government data sets? Can you tell us more about the datasets you would be mining?
There is a huge amount of open government data. However, while the Internet has made it easier to access these government datasets, computational techniques for journalists, outside organizations, or individual citizens to process and make sense of them have lagged considerably behind. This is a challenge I’m interested in tackling as a Berkman fellow. More generally, I think it’s important to carefully consider whether the data (text, in this case) truly models the phenomena being studied, and this will always be a challenge for data scientists.
In many situations as a data scientist, one may have to stare hard at the results to make sense out of the data. How should one try to make sure that he has authentic results instead of the results he wants to see?
In many cases, correctly applying tests of statistical significance and predictive accuracy is the right approach -- there really is an exact answer to whether the results are valid. However, there are other tasks that don’t fit nicely into these frameworks; the challenge of making sense of a dataset comes to mind. In these situations, I think that a test of “practical significance” is the answer: does the result give us something novel, insightful, or useful? What this means is certainly harder to generalize -- in my mind, it testifies to the need to engage with people across disciplines throughout the data science process.
Mary Gray, Senior Researcher at Microsoft Research
by Alyssa Smith
The last time Mary Gray got lost (she thinks) was last month, while wandering around Redmond, Washington. In her experience, asking people for directions when lost can be really interesting; people may not know exactly where to point someone, but they will want to say something regardless. Since she’s an adventurous person (her sense of direction is much better out in the wilderness), she finds this process of making sense of a city very exciting. Much of her work is in fact concerned with the ways people make sense of and navigate the world they live in. In her work, she focuses on having conversations where she can "bring her humanity to bear," engaging people and learning by talking.
Gray did her undergraduate and early graduate work in anthropology, and it remains one of the "sharp instruments" she relies on in her research. She values observing people’s relationships with technology “in situ”: people, she says, will do what they do because of the context they live in and how they are perceived or want to be perceived. She also employs queer theory to explore and critique the production of norms -- the way in which historical and political forces position certain things as typical. Queer theory, she says, challenges how we think things are "imagined to be most natural" -- it's a corrective to the idea of inherent stability.
Gray’s current work on digital labor grew out of the work she’s doing on LGBT young people making sense of and expressing their identities online through mobile technology (itself an extension of the work she did from 2001-2007 on rural LGBT youth’s use of media that culminated in her 2009 book Out in the Country). Somewhere along the way, she became “consumed by” the (largely invisible) crowdwork that makes the Internet run.
As algorithms advance, they run into new limits that require human labor to run seamlessly, but the platforms that rely on these algorithms want them to feel entirely automated. Crowdwork, Gray says, provides us with that illusion of automation, but our current paradigm doesn’t value what crowdworkers do and, in fact, renders them invisible. We don’t own what we are asking workers to do.
At Berkman, she will be holding summits that bring together researchers and people who participate in digital work to come up with concrete policy recommendations and a better model of crowdwork and the platform economies that organize it. She hopes to come up with a working model of the platform economy that recognizes workers as the most valuable piece of the platform, which includes giving them a greater share of the value they generate.
She’s really excited to be at Berkman because the room (metaphorically) is full of smart people who will challenge her on technical specifications for a better crowdworking platform. These people will “take her to task productively,” helping her to become ready to take on public naysayers and pessimists. She hopes that she can, in turn, bring her expertise in queer theory and subjectivity to the table and promote the sort of thoughtful reflection and pragmatic optimism toward technology that is such an integral part of her work. She has a “goofball” sense of humor that, combined with her predilection for adventuring and bushwhacking, will make her a wonderful addition to the Berkman community.
Mary Gray was interviewed on July 20th by Alyssa Smith, a 2015 Berktern who most recently got lost trying to find her lunch in the Berkman fridge.