Difference between revisions of "Crowdsourcing"

From Identifying Difficult Problems in Cyberlaw
Revision as of 18:18, 24 November 2010

Crowdsourcing: Background and Working Definitions

Definition: Although crowdsourcing can have many meanings, we define it here to mean breaking down large tasks into small ones that can be performed asynchronously.

  • The Best Practices entry for crowdwork, developed last year and reposted on Class 3, classifies crowdwork three ways:

First, a large group of workers may do microtasks to complete a whole project; the best-known platform in this arena is Amazon Mechanical Turk. Second, companies may use cloudwork platforms to connect with individual workers, or a small group of workers, who then complete larger jobs (e.g., Elance and oDesk). Finally, a company may run “contests,” where numerous workers complete a task and only the speediest or best worker is paid (e.g., InnoCentive and Worth1000). In some contests, the company commits to picking at least one winner; in others, there is no such guarantee.

  • General Information on Crowdsourcing.
    • For a quick overview by Jeff Howe, author of Crowdsourcing,[1] take a look at this YouTube clip.[2]
    • Northwestern University Professor Kris Hammond also explains crowdsourcing, but argues its downsides are worker rewards and quality.[3]
    • Our very own Jonathan Zittrain discusses crowdsourcing in his talk, Minds for Sale.[4]
    • Several individuals gathered to discuss crowdsourcing in a panel moderated by New York Times correspondent Brad Stone.[5]
  • In the News.
    • The New York Times recently ran an article on crowdsourcing featuring two crowdsourcing companies:[6] Microtask[7] and CloudCrowd.[8]
    • It's interesting to note that these companies are attempting to monetize crowdsourcing in exactly the way in which Howe says it cannot be monetized successfully.

Crowdsourcing Literature

GENERAL OVERVIEW

Although the idea of crowdsourcing has been around for many years, the Internet has made it much easier, cheaper, and more efficient to harness the power of crowds. The power of crowds was popularized in 2005, when James Surowiecki published a book entitled The Wisdom of Crowds, which purported to show how large groups of people can, in many cases, be more effective at solving problems than specialists.[9] The following year, journalist Jeff Howe coined the term "crowdsourcing" to refer to work that was performed by the "masses" online.[10] Since Howe's article was published in 2006, numerous authors have written books on crowdsourcing, each choosing to focus on different aspects of the topic. Howe himself took up the topic in 2008, proclaiming crowdsourcing to be a panacea--a place where a perfect meritocracy could thrive.[11] Howe examined crowdsourcing from a variety of perspectives: what benefits it can provide, what kinds of tasks it can accomplish, and the potential changes it may bring about. Howe's diagnosis of crowdsourcing was positive--in it he saw many potential solutions and few potential problems. Others have followed Howe's lead in describing the benefits of crowdsourced work. Clay Shirky has published two books--Here Comes Everybody (2008)[12] and Cognitive Surplus (2010)[13]--in which he describes how technology does more than enable new tools; it also enables consumers to become collaborators and producers. Although Shirky's books are not expressly about crowdsourcing, they mirror the optimism Howe expresses, both in terms of collaborative enterprises and the Internet's power to enable them.


While some focused on the potential consumer revolution, others examined the business-related aspects of crowdsourcing. In Groundswell (2008), Charlene Li and Josh Bernoff focus on how businesses can use crowdsourcing most effectively to their benefit. The authors highlight how the user base of a product can undermine that product or its brand.[14] As a result, the authors propose that businesses use the "groundswell" to their advantage, fostering communities that can provide valuable feedback and economic payoffs. Marion K. Poetz and Martin Schreier have also taken a business perspective on crowdsourcing.[15] They argue that the crowd is capable of producing valuable (but not always viable) business ideas at a low cost. They suggest future research to better understand their findings.


Other authors have pointed out some of the problems with crowdsourcing. Dr. Mathieu O'Neil has argued that, despite its benefits, crowdsourcing is inconsistent in quality, can lack diversity, and can contain many irresponsible actors.[16] Miriam Cherry has argued that some crowdwork can be exploitative, sometimes forcing people to work for absurdly low wages.[17] She argues that we need a legal framework for addressing low wages, proposing we apply the Fair Labor Standards Act (FLSA) to crowdsourced work like that found on Mechanical Turk. In a forthcoming article, she takes a more systematic (but still legal) approach to different kinds of virtual work.[18] Cherry seems to be the only law professor to have written on crowdsourcing from a doctrinal perspective.


Much of the other literature on the subject concerns the problem of quality. Cheat detection--the ability to filter out individuals who complete tasks without actually reading them, seeking only money--has recently drawn attention. Indeed, a possible crowdsourced solution to cheaters has been proposed.[19] Others have attempted to increase the quality of the traditionally automated mechanism used to translate words by crowdsourcing translation tasks.[20] In addition to simple crowdsourcing, one set of authors suggests combining human crowdwork with machine work. With this process, according to the authors, the system can specify a particular "speed-cost-quality tradeoff," which is based on an allocation of tasks among computers and humans.[21]
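
To make the idea of allocating tasks among computers and humans concrete, the following is a minimal sketch in Python. It is not the cited authors' system: the confidence score, the threshold rule, the per-task prices, and every name in it (Task, allocate, estimated_cost) are assumptions made purely for illustration. It assumes each task carries a machine confidence score and that tasks below a chosen threshold are sent to human workers, so raising the threshold trades cost and speed for (presumed) quality.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    machine_confidence: float  # hypothetical score in [0, 1], not from the source


def allocate(tasks, threshold):
    """Split tasks into (machine, human) lists of task ids.

    Assumption: a task stays with the machine when its confidence
    meets the threshold; otherwise it goes to human workers.
    """
    machine = [t.task_id for t in tasks if t.machine_confidence >= threshold]
    human = [t.task_id for t in tasks if t.machine_confidence < threshold]
    return machine, human


def estimated_cost(tasks, threshold, human_price=0.05, machine_price=0.0):
    """Estimate total cost under illustrative (made-up) per-task prices."""
    machine, human = allocate(tasks, threshold)
    return len(human) * human_price + len(machine) * machine_price


if __name__ == "__main__":
    tasks = [Task("a", 0.9), Task("b", 0.4), Task("c", 0.7)]
    for threshold in (0.5, 0.8):
        machine, human = allocate(tasks, threshold)
        print(threshold, machine, human, estimated_cost(tasks, threshold))
```

In this sketch a higher threshold sends more tasks to people, which would presumably raise cost and slow completion while improving quality; a real system would need measured accuracy, price, and latency figures rather than the placeholder values used here.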


PROBLEMS NOT IDENTIFIED

The literature on crowdsourcing often discusses very broad or very specific issues. Books tend to have an overall argument about the value of crowdsourcing and how it needs to be structured. Articles, conversely, tend to describe specific studies or problems within a particular community. There is little room for systematically addressing thematic problems endemic to nearly all kinds of crowdsourcing. Instead, the problems are dealt with by the platforms themselves. 99designs--a website that allows people to solicit creative logo designs--has several policies regulating the behavior of those who request and perform work. Most crowdsourcing services have similar policies or recommendations. In January 2010, a small group of students from Harvard Law School and Stanford Law School gathered in Palo Alto for three weeks to talk about these more general problems. They produced a document of Best Practices (Class 3), which sought to identify problems endemic to crowdsourced work and to propose a framework for addressing them. That document identified six major issues that needed to be addressed in crowdwork:

1. Disclosure: workers want to know the identity of the employer; so disclosure should be the default preference.

2. Fairness: employers sometimes underpay, pay late, or don't pay at all; so employers should pay fair and just wages on time.

3. Feedback and Monitoring: judging the worker, task, or company is difficult for each player; so platforms should work to enable better feedback and monitoring systems.

4. Healthy Work Environment: workers face the risks of stress from repetition, alienation and isolation, and addiction; so platforms should explain risks and companies should implement strategies to reduce risks.

5. Reputation and Portability: workers who do good (or bad) work cannot capitalize on (and employers cannot avoid) their work; so platforms and companies should work to keep records of worker information and use it to track performance and confirm identities.

6. Privacy Protection: workers are concerned with employers sharing their (potentially sensitive) information; so platforms should protect information and not release it.

The Best Practices document provides a useful starting point because it identifies several major issues common to all crowdsourcing problems. It does not, however, capture all potential problems. It also tends to focus only on the concerns of workers, when there are problems for the platforms and companies as well. Finally, because the document is meant as a general framework, it is hard to get a sense of whether it could be implemented with any effect.

Our Addition: Identifying Problem Areas, Exploring the Problems

Given the body of literature and the Best Practices document, we found the idea of addressing systemic problems both attractive and difficult. Rather than replicating the Best Practices or simply writing an overview of crowdsourcing, we decided to take a different angle. Instead of classifying problems generally and then working downward, applying them to different types of crowdwork, we worked from the bottom up. We identified three types of crowdwork that suggested a variety of important problems. At the beginning stages, we had only our intuition to guide our "sense" of the problems. As we delved further into them, however, they crystallized. Specific problems arise in each of the three types of crowdwork, and some of them are systemic problems with crowdsourcing that the Best Practices document does not address. Nevertheless, we wanted to draw on the Best Practices document to determine whether some of its strategies seemed workable or needed to be expanded, refined, or discarded.


THE CROWDSOURCING SUBJECTS AND PROBLEMS


A Framework For Analyzing Issues in Crowdsourcing

1. How do concerns of reputation and identity play into crowdsourced work quality?

  • tasks where want rep known, others not known
  • phone card/coupon system
  • Verification of workers is becoming a problem (can access the linked article through Harvard Library).[22]

2. Can we ensure work quality using (semi)automated mechanisms?

  • Some have attempted to use crowdsourcing to ensure quality on crowdsourced tasks using cheat detection mechanisms.[23] This can be done for both routine and complex tasks.
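
One simple way a (semi)automated quality check might work is sketched below in Python. It is an illustrative assumption, not the mechanism described in the cited work: it supposes that each task is given to several workers, treats the most common answer as the reference, and flags workers who rarely agree with it. The function names and the min_agreement cutoff are hypothetical.

```python
from collections import Counter, defaultdict


def majority_answers(responses):
    """Return {task_id: most common answer} from (worker, task, answer) tuples."""
    by_task = defaultdict(list)
    for _worker, task, answer in responses:
        by_task[task].append(answer)
    return {task: Counter(answers).most_common(1)[0][0]
            for task, answers in by_task.items()}


def flag_suspected_cheaters(responses, min_agreement=0.5):
    """List workers whose share of majority-matching answers is below the cutoff.

    Assumption: low agreement with the crowd suggests careless or
    dishonest work; it can also penalize an honest minority view.
    """
    majority = majority_answers(responses)
    agree = defaultdict(int)
    total = defaultdict(int)
    for worker, task, answer in responses:
        total[worker] += 1
        if answer == majority[task]:
            agree[worker] += 1
    return sorted(w for w in total if agree[w] / total[w] < min_agreement)


if __name__ == "__main__":
    responses = [
        ("w1", "t1", "cat"), ("w2", "t1", "cat"), ("w3", "t1", "x"),
        ("w1", "t2", "dog"), ("w2", "t2", "dog"), ("w3", "t2", "y"),
    ]
    print(flag_suspected_cheaters(responses))
```

A check like this only suits tasks with a single expected answer; complex or creative tasks would likely need human review of a sample instead.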

3. Can we enhance work quality using a targeting system?

  • Amazon rec, eBay style, MT?, differentiate tasks?