Crowdsourcing: Background and Working Definitions
Definition: Although crowdsourcing can have many meanings, we define it here to mean breaking down large tasks into small ones that can be performed asynchronously.
- The Best Practices entry for crowdwork, developed last year and reposted on Class 3, classifies crowdwork three ways:
First, a large group of workers may do microtasks to complete a whole project; the best-known platform in this arena is Amazon Mechanical Turk. Second, companies may use cloudwork platforms to connect with individual workers, or a small group of workers, who then complete larger jobs (e.g., Elance and oDesk). Finally, a company may run “contests,” where numerous workers complete a task and only the speediest or best worker is paid (e.g., InnoCentive and Worth1000). In some contests, the company commits to picking at least one winner; in others, there is no such guarantee.
General Information on Crowdsourcing
- For a quick overview by Jeff Howe, author of Crowdsourcing, take a look at this YouTube clip.
- Northwestern University Professor Kris Hammond also explains crowdsourcing, but argues that its downsides lie in worker rewards and quality.
- Our very own Jonathan Zittrain discusses crowdsourcing in his talk, Minds for Sale.
- Several individuals gathered to discuss crowdsourcing in a panel moderated by New York Times correspondent Brad Stone.
- In the News.
Although the idea of crowdsourcing has been around for many years, the Internet has made it much easier, cheaper, and more efficient to harness the power of crowds. The power of crowds was popularized in 2005 when James Surowiecki published a book entitled The Wisdom of Crowds. This book purported to show how large groups of people can, in many cases, be more effective at solving problems than specialists. The following year, journalist Jeff Howe coined the term "crowdsourcing" to refer to work that was performed by the "masses" online. Since Howe's article was published in 2006, numerous authors have written books on crowdsourcing, each choosing to focus on different aspects of the topic. Howe himself took up the topic again in 2008, proclaiming crowdsourcing to be a panacea--a place where a perfect meritocracy could thrive. Howe examined crowdsourcing from a variety of perspectives: what benefits it can provide, what kinds of tasks it can accomplish, and the potential changes it may bring about. Howe's diagnosis of crowdsourcing was positive--in it he saw many potential solutions and few potential problems. Others have followed Howe's lead in describing the benefits of crowdsourced work. Clay Shirky has published two books--Here Comes Everybody (2008) and Cognitive Surplus (2010)--in which he describes how technology does more than enable new tools; it also enables consumers to become collaborators and producers. Although Shirky's books are not expressly about crowdsourcing, they mirror the optimism Howe expresses, both in terms of collaborative enterprises and the Internet's power to enable them.
These books have provoked an academic interest in finding out who the crowd is and why it moves the way it does. Some have looked at scientific crowdsourcing, asking what characteristics make someone a successful crowdworker or problem-solver. Part of answering that question, it turns out, is asking why people attempt to be part of the innovating crowd in the first place. The authors of this study found that the crowd was highly educated. They also found heterogeneity in scientific interests, as well as monetary and intrinsic motivations, to be important drivers of "good" problem-solvers. Others have examined non-scientific endeavors and asked similar questions. This report also found that most crowdworkers developing photographs were highly educated and motivated primarily by money.
While some focused on the potential consumer revolution or the composition of the crowd, others examined the business-related aspects of crowdsourcing. In Groundswell (2008), Charlene Li and Josh Bernoff focus on how businesses can use crowdsourcing most effectively to their own benefit. The authors highlight how a product's user base can undermine the product or its brand. As a result, the authors propose that businesses use the "groundswell" to their advantage, fostering communities that can provide valuable feedback and economic payoffs. Marion K. Poetz and Martin Schreier have also taken a business perspective on crowdsourcing. They argue that the crowd is capable of producing valuable (but not always viable) business ideas at a low cost. They suggest future research to better understand their findings.
Other authors have pointed out some of the problems with crowdsourcing. Dr. Mathieu O'Neil has argued that, despite its benefits, crowdsourcing is inconsistent in quality, can lack diversity, and can contain many irresponsible actors. Miriam Cherry has argued that some crowdwork can be exploitative, sometimes forcing people to work for absurdly low wages. She argues that we need a legal framework for addressing low wages, proposing that we apply the Fair Labor Standards Act (FLSA) to crowdsourced work like that found on Mechanical Turk. In a forthcoming article, she takes a more systematic (but still legal) approach to different kinds of virtual work. Cherry seems to be the only law professor to have written on crowdsourcing from a doctrinal perspective.
Much of the other literature on the subject concerns the problem of quality. Cheat detection--the ability to filter out individuals who complete tasks without actually reading them, seeking only money--has recently drawn attention. Indeed, a possible crowdsourced solution to cheaters has been proposed. Others have attempted to increase the quality of the traditionally automated mechanism used to translate words by crowdsourcing translation tasks. In addition to simple crowdsourcing, one set of authors suggests combining human crowdwork with machine work. In this process, according to the authors, the system can specify a "speed-cost-quality tradeoff," which is based on an allocation of tasks among computers and humans. John J. Horton, David Rand, and Richard Zeckhauser have addressed using the crowd for quality experimental research.
The literature on crowdsourcing tends to discuss either very broad or very specific issues. Books tend to have an overall argument about the value of crowdsourcing, its core attributes, and how it needs to be structured. Articles, conversely, tend to describe specific studies or problems within a particular community. This leaves little room for systematically addressing thematic problems endemic to nearly all kinds of crowdsourcing. Instead, the problems are dealt with by the platforms themselves. 99designs--a website that allows people to solicit creative logo designs--has several policies regulating the behavior of those who request and perform work. Most crowdsourcing services have similar policies or recommendations. In January 2010, a small group of students from Harvard Law School and Stanford Law School gathered in Palo Alto for three weeks to talk about these more general problems. They produced a document of Best Practices (Class 3), which sought to identify problems endemic to crowdsourced work and propose a framework to address them. That document identified six major issues that needed to be addressed in crowdwork:
1. Disclosure: workers want to know the identity of the employer; so disclosure should be the default preference.
2. Fairness: employers sometimes underpay, pay late, or don't pay at all; so employers should pay fair and just wages on time.
3. Feedback and Monitoring: judging the worker, task, or company is difficult for each player; so platforms should work to enable better feedback and monitoring systems.
4. Healthy Work Environment: workers face the risks of stress from repetition, alienation and isolation, and addiction; so platforms should explain risks and companies should implement strategies to reduce risks.
5. Reputation and Portability: workers who do good (or bad) work cannot capitalize on (and employers cannot avoid) their work; so platforms and companies should work to keep records of worker information and use it to track performance and confirm identities.
6. Privacy Protection: workers are concerned with employers sharing their (potentially sensitive) information; so platforms should protect information and not release it.
The Best Practices document provides a nice starting point because it identifies several major issues common to all kinds of crowdsourcing. It does not, however, capture all potential problems. It also tends to focus only on the concerns of workers, even though platforms and companies face similar problems. Finally, because the document is meant as a general framework, it is hard to get a sense of whether it could be effectively implemented. There is room, then, to explore problems that are broad enough to have implications for a variety of actors, but specific enough to merit a context-specific solution.
Our Addition: Identifying Areas, Exploring the Problems
Given the body of literature and the Best Practices document, we found the idea of addressing systemic problems both attractive and difficult. Instead of replicating the Best Practices, or simply writing an overview of crowdsourcing, we decided to take a different angle. Unlike the Best Practices document, which classified problems generally and then worked downward to devise specific solutions by applying them to different types of crowdwork, we worked from the bottom up. We started from three types of crowdwork that suggested a variety of important but context-specific problems. At the beginning stages, we had only our intuition to guide our "sense" of the problems. As we delved further into them, however, they crystallized. Through our discussions we identified the specific problems that arise in each type of crowdwork, some of which are systemic problems with crowdsourcing that the Best Practices document does not address. Nevertheless, we wanted to draw on the Best Practices document to determine whether some of its strategies seemed workable or needed to be expanded, refined, or discarded. To accomplish this goal, we attempted to integrate the Best Practices approaches into our framing of both the problems and the solutions we discussed.
An Introduction to Our Approach
Our discussion of various crowdsourcing environments suggested a variety of ways to slice the pie. In the end, we settled on three areas of crowdsourcing, reaching a rough classification based on the type of work performed. In that sense, our division followed the Best Practices division of work into microtasks, connective tasks, and contest tasks. But there was an important difference: our classification of work depended also upon the purpose to which the work was being put, focusing on a specific case study for each. In other words, it mattered to us that one task was framed as a "game" versus a "survey." We cared not just about the framing, but also about the motives of the employer and the worker. We asked questions like, "For what purpose is the employer requesting this task?" and "Why does the worker choose to perform the task?" To be clear, our questions were not systematic and categorical. We didn't analyze every kind of crowdwork using motive and purpose. Rather, these questions provided a general framing for dividing crowdwork into analytical categories--places where we could identify specific problems that may differ depending on the answers to these questions. After significant discussion, we settled on three types of tasks, choosing a case study to explore each one:
1. Microtasks: Amazon's Mechanical Turk;
2. Tasks requiring "professional" skills: 99designs; and
3. "Game" tasks: Gwap.
Given that we based our topics on a "sense" of latent problems, we had to determine whether these intuited problems actually existed. Thus, for each of these tasks we attempted to identify salient "problems": issues that cause concern for workers, employers, platforms, businesses, or society generally. In identifying problems, we had two goals. The first was to provide a set of new issues for others to build upon in future work. The second was to explore a small number of issues and propose our own context-specific solutions. In this sense, it was an exercise in both applying the Best Practices and inventing new solutions to problems that either context or framing prevented the Best Practices from solving. In what follows, we explain each topic, the problems it presents, and specific solutions to selected problems. Although we think the solutions we propose have some teeth, they are not meant to be final. Indeed, our goal in presenting these solutions and problems is to provide a base from which others can build.