Difference between revisions of "Future of Wikipedia"

From Cyberlaw: Difficult Issues Winter 2010

Revision as of 00:16, 19 December 2009


Wikipedia was formally launched on January 15, 2001, by Jimmy Wales and Larry Sanger.[1] It represented a new development in the collaborative, web-based creation of bodies of knowledge. Initially it was a complement to the expert-written encyclopedia project “Nupedia,”[2] in order to provide an additional source of articles. Wikipedia soon outpaced Nupedia and grew to be arguably the most successful example of collaborative content creation. Today Wikipedia boasts that it contains several million articles and pages in hundreds of languages worldwide contributed by millions of users.

Wikipedia is arguably the most successful online collaboration, but it was not the first. One early predecessor was Interpedia, initiated in 1993,[3] although the project never fully left the planning stages.[4] The Free Software Foundation’s Richard Stallman described the need for a free universal encyclopedia in 1999, although the Free Software Foundation didn’t launch its GNUPedia to compete with Nupedia until January 17, 2001, two days after the start of Wikipedia.[5] Wikipedia itself grew out of Nupedia, an online collaborative encyclopedia. On January 10, 2001, Wales and Sanger created the first Nupedia wiki, but reputedly Nupedia’s expert volunteers did not want to participate, so Wikipedia was established as a separate site.[1] Wikipedia’s vision is stated as: “Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment.”[6]

Growth of Wikipedia

The growth of Wikipedia depended on contributions from numerous lay users, a departure from the Nupedia tradition of using expert contributors. Nupedia was founded upon highly qualified expert contributors and a multi-step peer review process, but despite its interested editors, the process was slow, and only 12 articles were written in the first year.[7] Wikipedia, in contrast, generated over 1,000 articles in its first month of operation and over 20,000 articles in its first year, a rate of more than 1,500 articles per month.[1] In September 2001, Wikipedia expanded into multilingual sites, beginning the development of Wikipedias for all major languages.


Initially, Wikipedia was managed by Bomis, an organization headed by Jimmy Wales. In March 2002, during the dot-com bust, Bomis withdrew funding for Wikipedia.[8] At that time, Larry Sanger left both Nupedia and Wikipedia. He returned briefly to academia, then joined the Digital Universe Foundation and founded Citizendium, an alternative open encyclopedia that requires contributors to use real names, to discourage vandalism, and relies on expert guidance to ensure the accuracy of information.[9]

Meanwhile, Jimmy Wales created the Wikimedia Foundation, a non-profit charitable organization headquartered in San Francisco, CA.[6] Wikimedia was announced on June 20, 2003. It serves as an umbrella body that includes several other types of collaborative, wiki-based information-sharing sites.

The foundation's by-laws state its purpose as collecting and developing educational content and disseminating it effectively and globally.[10] Wikimedia is managed by a Board of Trustees. The Foundation’s board also organizes Wikimania every year, a conference for users of the Wikimedia Foundation projects.


Academic studies of Wikipedia have mainly used it as a tool for analyzing other phenomena. Researchers treat Wikipedia's users either as a large pool of subjects on which to test hypotheses or as a social network that can be manipulated and observed. The majority of studies focus on either semantic relatedness[11][12][13] or online coordination and conflict resolution techniques.[14][15][16]


Following Wikimania 2009, the Wikimedia Foundation created a strategy page that identifies major concerns for the future of Wikimedia and permits users to contribute and comment on proposed solutions. The problems presented below have been highlighted as the most significant and challenging problems facing Wikimedia.

Identity & Growth of the Contributing Community

There are three main concerns relating to the contributing community that sustains Wikipedia:

  1. Size of the contributing community – is it sustainable and is it sufficient?
  2. Identity of the contributing community – does population bias create content bias?
  3. Inequality within contributing community – does Wikipedia really represent contributions of the many, or is it moving towards an elite system?


Several studies and articles have suggested that the growth of Wikipedia's contributing community has slowed, stopped, or even reversed (see "Battle for Wikipedia's Soul"; "Slowing Growth of Wikipedia"; or "Volunteers Log Off as Wikipedia Ages" for a sample). Others, such as Oded Nov, have looked at "What Motivates Wikipedians" and concluded that the majority are motivated by fun. But is Wikipedia "fun" enough to maintain its contributing community? A study by the Palo Alto Research Center found that the number of new articles added per month flatlined at 60,000 in 2006 and has since declined by a third. Wikimedia Australia's Vice-President, Liam Wyatt, explains this: "Because the project is much more filled out and more complete, it's increasingly harder for new users to be able to add something without some level of expertise."

Identity & Bias:

Studies of Wikipedia's contributing population find that the majority are white males (Oded Nov reports 92.7% male; another study reports 87%). If this is the case, does Wikipedia truly represent an unbiased cross-section of global (or even American) knowledge? How does the identity of the contributing community bias Wikipedia regarding politics? Consider claims that Wikipedia needs to be further censored or is being manipulated by Nazis seeking to control the flow of information in Germany, or that its editors are far more liberal than the American public.


Within the Wikipedia contributing community, a divide has rapidly opened between "contributors" and "editors", with editors determining much of the style, tone and occasionally content of articles. One study found that “elite users” were pushing out new contributors: 25% of occasional wiki editors’ changes were erased or reverted by established editors, up from 10% in 2003.[17]

Quality Control - Perceived and Actual

It is important to distinguish between concerns about the actual quality of Wikipedia articles and concerns about their perceived quality. The former should be approached as a contributor and technical problem; the latter should be addressed as a publicity problem. The concept of quality is also intentionally broad: it includes everything from the accuracy of information to the degree of citation provided to the quality of images and prose.

Actual Quality of Wikipedia

On October 24, 2005, The Guardian published an article entitled "Can you trust Wikipedia?" in which a panel of experts was asked to critically review seven entries related to their fields.[18] One article was deemed to have made "every value judgement... wrong"; the others received marks from 5 to 8 out of a notional ten. Across the other six articles, the most common criticisms were:

  1. Poor prose, or ease-of-reading issues (3 mentions)
  2. Omissions or inaccuracies, often small but including key omissions in some articles (3 mentions)
  3. Poor balance, with less important areas being given more attention and vice versa (1 mention)

The most common praises were:

  1. Factually sound and correct, no glaring inaccuracies (4 mentions)
  2. Much useful information, including well selected links, making it possible to "access much information quickly" (3 mentions)

Nature reported in 2005 that science articles in Wikipedia were comparable in accuracy to those on Encyclopedia Britannica's web site. Out of 42 articles, only 4 serious errors were found in Wikipedia, and 4 in Encyclopedia Britannica, although more than a hundred lesser errors and omissions were found in each and Wikipedia's articles were often "poorly structured."[19]

On March 24, 2006, Britannica provided a rebuttal of this article, labeling it "fatally flawed",[20] to which Nature responded.[21]

Among Britannica's criticisms were that excerpts rather than the full texts of some of their articles were used, that Nature composited parts of different Britannica texts to make a text for review in one case, that Nature did not check the factual assertions of its reviewers, and that many points which the reviewers labeled as errors were differences of editorial opinion. Nature responded that any errors on the part of its reviewers were not biased in favor of either encyclopedia, that in some cases it used excerpts of articles from both encyclopedias, and that Britannica did not share particular concerns with Nature before publishing its "open letter" rebuttal.

Three other studies (a 2006 web-based survey,[22] a 2004 comparison of Brockhaus Multimedial, Microsoft Encarta, and the German Wikipedia[23] that was repeated in 2007,[24] and a 2007 review by Australian magazine PC Authority[25]) concluded that Wikipedia was generally as reliable as traditional encyclopedias.

However, Wikipedia may not be as reliable in technical or specialized fields. A peer-reviewed 2008 study[26] examined 80 Wikipedia drug entries. The research team found few factual errors but determined that these articles were often missing important information, such as contraindications and drug interactions. One of the researchers noted that "If people went and used this as a sole or authoritative source without contacting a health professional...those are the types of negative impacts that can occur." The researchers also compared Wikipedia to Medscape Drug Reference (MDR) by looking for answers to 80 different questions covering eight categories of drug information, including adverse drug events, dosages, and mechanism of action. They determined that MDR provided answers to 82.5 percent of the questions, while Wikipedia could answer only 40 percent, and that Wikipedia's answers were also less likely to be complete. None of the answers from Wikipedia were determined to be factually inaccurate, while four inaccurate answers were found in MDR. But the researchers found 48 errors of omission in the Wikipedia entries, compared to 14 for MDR. The lead investigator concluded: "I think that these errors of omission can be just as dangerous [as inaccuracies]", and he pointed out that drug company representatives have been caught deleting information from Wikipedia entries that makes their drugs look unsafe.
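The percentages reported for this study can be turned into counts as a quick sanity check. The sketch below assumes that both percentages are shares of the same 80 questions, which the text implies but does not state outright; the variable names are illustrative only.

```python
# Figures reported in the text for the 2008 drug-information study.
QUESTIONS = 80
ANSWERED_PCT = {"MDR": 82.5, "Wikipedia": 40.0}
OMISSIONS = {"MDR": 14, "Wikipedia": 48}

# Assumption: each percentage is a share of the same 80 questions.
answered = {src: round(pct * QUESTIONS / 100) for src, pct in ANSWERED_PCT.items()}

# How many times more errors of omission Wikipedia had than MDR.
omission_ratio = OMISSIONS["Wikipedia"] / OMISSIONS["MDR"]

print(answered)
print(f"Wikipedia had {omission_ratio:.1f}x as many errors of omission as MDR")
```

Under that assumption, MDR answered 66 of the 80 questions and Wikipedia 32, and Wikipedia had roughly 3.4 times as many errors of omission.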

In addition to these potential omissions (or purposeful deletions), the structure of Wikipedia lends itself to several potential vulnerabilities:

  1. Information citation loops
  2. Vandalism
  3. Anonymity of authors lending to false information (see e.g. the Essjay controversy)

Perceived Quality of Wikipedia

Whether or not Wikipedia actually is accurate, its reception as a trusted source has been plagued by doubts regarding the trustworthiness of its content as the product of mass collaboration by anonymous authors.

Among the general public, Wikipedia is regarded relatively well. In a web-based survey conducted in spring 2006, fifty participants rated Wikipedia articles: 76% agreed that the article was accurate, and 46% agreed it was complete. The same survey compared Wikipedia to Encyclopedia Britannica: of 18 respondents, 6 favored Britannica and 7 favored Wikipedia on accuracy, and 11 found Wikipedia more complete.[27]

However, Wikipedia's reception by academia has been less than stellar.

Even if Wikipedia itself doesn’t intend to be used as a source for academic works, it is often used by students and researchers as a starting point. However, the open, collaborative and anonymous efforts that produce Wikipedia have led to widespread skepticism about its accuracy. Most of the angry responses targeted at Wikipedia have been aimed at its claim to be an encyclopedia. Such a claim is thought to establish greater expectations of accuracy than are, or possibly can be, achieved by non-expert collaboration. Academics have also criticized Wikipedia for its perceived failure as a reliable source, and because Wikipedia editors may not have degrees or other credentials generally recognized in academia.

Robert McHenry, a former editor-in-chief for the Encyclopedia Britannica, describes Wikipedia as the “Faith-Based Encyclopedia.” He describes the “crucial and entirely faith-based step” in the Wikipedia process: “Some unspecified quasi-Darwinian process will assure that those writings and editings by contributors of greatest expertise will survive; articles will eventually reach a steady state that corresponds to the highest degree of accuracy.” This step, he argues, is a completely unwarranted leap of faith. Rather, “Contrary to the faith, the article has, in fact, been edited into mediocrity.”[28]

Andrew Orlowski accuses Wikipedia of being a “vanity exercise” for calling itself an encyclopedia, and writes that the use of the term "encyclopedia" to describe Wikipedia may lead users into believing it is more reliable than it may be. He points out (describing a libel case against Wikipedia) that “If what we today know as "Wikipedia" had started life as something called, let's say - "Jimbo's Big Bag O'Trivia" - we doubt if it would be the problem it has become.” The public begins to expect trustworthy information from Wikipedia and instead gets a “king-sized cocktail” of bureaucracy and “spontaneous graffiti.” [29]

Middlebury College went so far as to ban the citation of Wikipedia in papers in its history department. Note, however, that Wikipedia's own guidelines state that it is not suitable for academic citation because Wikipedia, like any encyclopedia, is a tertiary source. Many schools and universities do not accept Wikipedia as a source in formal papers; some educational institutions have banned it as a primary source, while others have limited its use to a pointer to external sources.[30][31]

Improving Wikipedia's Perceived Accuracy

One study presented at the 2008 ACM Conference on Computer Supported Cooperative Work explored whether a visualization system could improve readers’ perceptions of trustworthiness in a wiki by exposing hidden article information.[32] The results suggest that surfacing information relevant to the stability of the article and to patterns of editor behavior can have a significant impact on users’ trust. This should be considered in conjunction with proposals to color-code articles by age, editing contribution, and similar measures, which are being considered as ways to improve article accuracy.

Other suggestions include:

  • Reputation-based text coloring. Each article could display a button labeled "check text reputation": upon clicking the button, a user would be led to a copy of the page, where the text background color reflects the reputation of the author of each portion of text, as well as the reputation of authors who vetted the text, editing the page while leaving the text in place. The appeal of this method is that reputation is displayed in an anonymous way, associated to the article text. This avoids placing blame or praise directly on the authors: the impersonal character of this feedback could be well-suited to a collaborative forum such as the Wikipedia.[33]
  • Restricting edits. Highly controversial articles could be protected, so that only authors with sufficiently high reputation are able to edit them. This is currently employed by Wikipedia as part of its Protection Policy but it could be expanded.
  • Reputation-based alert system. Wikipedia editors keep a watchful eye on most controversial articles, and in fact on a large portion of Wikipedia, improving content and undoing poor-quality revisions. A reputation system could be used to alert them whenever a crucial or controversial article is modified by a low-reputation author. It could also give authors an additional incentive to provide high-quality contributions to Wikipedia.[34]
  • Content-driven reputation system. A study by Adler & de Alfaro proposes a content-driven reputation system for Wikipedia that would allow readers to judge the reliability of an article based on the reputation of its contributors and editors. The reputation of authors would be based on how their contributions to Wikipedia fare: the longer an article or edit remains unaltered, the better the author’s reputation. This can, however, be much less accurate than a user-driven reputation system, because author contributions can be deleted for a variety of reasons, including reorganizations and thorough rewrites of articles. Adler & de Alfaro address this by lowering the reputation only of authors whose edits are reverted to the original text; the reputation of authors whose edits are further refined later on does not suffer.[35]
  • Zeng et al. also propose a mechanism wherein the revision history of the Wikipedia article is used to compute a trust value for the article.[36]
  • It could also prove interesting to explore combinations of user and content reputation devices.
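The content-driven idea in the list above can be illustrated with a small sketch. This is not Adler & de Alfaro's actual algorithm: the Edit record, the linear scoring rule, and the gain, penalty and threshold parameters are all hypothetical, chosen only to show reputation rising as text survives later revisions, falling when an edit is reverted, and feeding a simple low-reputation alert.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Edit:
    """Hypothetical record of one contribution (not a real Wikipedia API)."""
    author: str
    words_added: int         # size of the contribution
    revisions_survived: int  # later revisions that left the text in place
    reverted: bool           # True if a later edit restored the earlier text


def score_authors(edits, gain=1, penalty=1):
    """Toy scoring rule: reward text that survives, penalize reverted edits."""
    reputation = defaultdict(int)
    for edit in edits:
        if edit.reverted:
            # Reverted edits cost the author reputation.
            reputation[edit.author] -= penalty * edit.words_added
        else:
            # Text that is kept or merely refined earns reputation
            # in proportion to how long it has survived.
            reputation[edit.author] += gain * edit.words_added * edit.revisions_survived
    return dict(reputation)


def needs_alert(author, reputation, threshold=0):
    """Flag edits by authors whose reputation is below the threshold."""
    return reputation.get(author, 0) < threshold


history = [
    Edit("alice", words_added=100, revisions_survived=5, reverted=False),
    Edit("bob", words_added=40, revisions_survived=0, reverted=True),
    Edit("carol", words_added=20, revisions_survived=0, reverted=False),
]
scores = score_authors(history)
print(scores)
```

In this toy run, an author whose text was refined but not reverted keeps a neutral score, while a reverted author drops below zero and would trigger an alert. A real system would need to handle reorganizations and rewrites, which the text notes can remove contributions for reasons unrelated to quality.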

Sustainability of Wikimedia Model

The strategy discussions at Wikimania 2009 raised the question of whether Wikimedia, as it stands today, is sustainable from both a technological and an organizational standpoint.

  • Is a platform that both supports numerous users and serves less tech-savvy contributors possible?
  • How can Wikimedia ensure its financial stability?
  • How can Wikimedia restructure its institutional organization to allow oversight without creating so many levels of hierarchy that the bureaucracy becomes ungainly?

Emerging Strategic Priorities in this area include:

    • Optimize Wikimedia’s operations
    • Identify roles volunteers are best suited to perform and what are the most effective uses of paid staff
    • Create alliances and partnerships with other institutions and organizations to advance the mission: also, what are the necessary preconditions to such alliances? How can Wikimedia support similar projects?

Expansion & Questions of Scope

Since the founding of Wikipedia in 2001, there has been substantial growth in user-generated online content.[37][38] According to one Nielsen report, user-generated content drives half of the top 10 fastest-growing U.S. web brands.[39] Consider just the popularity of collaborative sites such as YouTube, Flickr, or Slashdot.org. Traditional media outlets such as BBC News have also added areas for collaboration.[40] User-generated content appears to be the way forward, but is Wikipedia a good model upon which to base that progress? Can the system used for Wikipedia be applied in other scenarios?


  1. [1], History of Wikipedia.
  2. [2], Wikipedia Entry on Nupedia.
  3. [3], Wikipedia Entry on Interpedia
  4. [4], Joseph Reagle Article on Interpedia & Wikipedia Background.
  5. [5], The Free Universal Encyclopedia and Learning Resource.
  6. [6], Wikimedia Foundation.
  7. [7], "The Early History of Nupedia and Wikipedia: A Memoir - Part I" and "Part II", Slashdot, April 2005.
  8. [8], Schiff, Stacy (July 31, 2006). "Know It All". The New Yorker.
  9. [9], Anderson, Nate (February 25, 2007). "Citizendium: building a better Wikipedia". Ars Technica.
  10. [10], Wikimedia Foundation bylaws. Archived from the original on 2007-04-20.
  11. M Strube et al, WikiRelate! Computing Semantic Relatedness Using Wikipedia, Proceedings of the National Conference on Artificial Intelligence (2006).
  12. E Gabrilovich et al, Computing Semantic Relatedness Using Wikipedia-Based Explicit Semantic Analysis (2007).
  13. Zesch et al, Analyzing and Accessing Wikipedia as a Lexical Semantic Resource, Data Structures for Linguistic Resources (2007).
  14. Viegas et al, Talk Before You Type: Coordination in Wikipedia, Hawaii International Conference on System Sciences (2007).
  15. Kittur et al, He Says, She Says: Conflict and Coordination in Wikipedia, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2007).
  16. D Wilkinson & B Huberman, Assessing the Value of Cooperation in Wikipedia, Computers and Society, arXiv:cs/0702140v1 [cs.DL] (2007).
  17. Editors as Elite Users
  18. [11], Can you trust Wikipedia?, The Guardian, 2008.
  19. P.D. Magnus, On Trusting Wikipedia, Britannica.
  20. [12], Journal Nature study "fatally flawed" says Britannica, March 24, 2006, Wikinews.
  21. [13], Encyclopedia Britannica and Nature: A Response, March 23, 2006.
  22. [14], Larry Press, Survey of Wikipedia accuracy and completeness, Professor of Computer Information Systems, California State University (2006)
  23. Michael Kurzidim: Wissenswettstreit. Die kostenlose Wikipedia tritt gegen die Marktführer Encarta und Brockhaus an, in: c't 21/2004, October 4, 2004, S. 132-139.
  24. Dorothee Wiegand: "Entdeckungsreise. Digitale Enzyklopädien erklären die Welt." c't 6/2007, March 5, 2007, p. 136-145.
  25. [15], PC Authority:'Wikipedia Uncovered'.
  26. [16], KA Clauson et al., Scope, completeness, and accuracy of drug information in Wikipedia, 42 Annals of Pharmacotherapy 1814 (2008).
  27. Larry Press, "Survey of Wikipedia accuracy and completeness," Professor of Computer Information Systems, California State University (2006).
  28. [17], Robert McHenry, The Faith-Based Encyclopedia Blinks, Dec 14, 2005 (2008).
  29. [18], Andrew Orlowski, "Who's responsible for Wikipedia?" The Register, Dec 12, 2005.
  30. [19], Lysa Chen, "Several colleges push to ban Wikipedia as resource," Duke Chronicle, March 28, 2007
  31. "A Stand Against Wikipedia", Inside Higher Ed (January 26, 2007). Retrieved on January 27, 2007.
  32. [20], Kittur, Suh & Chi, Can You Ever Trust a Wiki?: Impacting Perceived Trustworthiness in Wikipedia, in Proceedings of the ACM 2008 Conference on Computer Supported Cooperative Work (2008) 477-480.
  33. T. Cross. Puppy smoothies: Improving the reliability of open, collaborative wikis. First Monday, 11(9), September 2006.
  34. P. Resnick, R. Zeckhauser, E. Friedman, and K. Kuwabara. Reputation systems. Comm. ACM, 43(12):45-48, 2000. C. Dellarocas. The digitization of word-of-mouth: Promises and challenges of online reputation systems. Management Science, October 2003.
  35. B. Thomas Adler & Luca de Alfaro, A Content-Driven Reputation System for the Wikipedia.
  36. H. Zeng, M.A. Alhoussaini, L. Ding, R. Fikes, and D.L. McGuinness. Computing trust from revision history. In Intl. Conf. on Privacy, Security and Trust, 2006.
  37. Geist, M. Mapping the digital future. OECD: Organisation for Economic Cooperation and Development 254 (2006), 36–37.
  38. Dunn, J., Byrd, D., Notess, M., Riley, J., and Scherle, R. Variations2: Retrieving and using music in an academic setting. Commun. ACM 49, 8 (Aug. 2006) 53–58.
  39. Nielsen NetRatings. User-generated content drives half of U.S. top 10 fastest growing Web brands, www.nielsen-netratings.com/pr/PR_060810.PDF (Aug. 10, 2006).
  40. Eltringham, M. Citizen journalists challenge BBC, BBC NewsWatch (2006).