White, Western and Male: How Wikipedia fails to deliver on the promise of knowledge by all, for all

 

Wikipedia is the keystone project of the Wikimedia Foundation, a non-profit organisation whose mission is to “share the sum of all knowledge with every person in the world” and to “help bring new knowledge online, lower barriers to access, and make it easier for everyone to share what they know” (1). While the Foundation actively runs a dozen free knowledge projects, Wikipedia is by far the most prominent and well known of them. It consistently ranks as one of the most popular websites in the world (2), and is a remarkable anomaly as a non-profit in the company of multi-billion dollar technology giants. Much of this status comes from Wikipedia’s unchallenged dominance in the online encyclopaedia realm, with traditionally published encyclopaedias unable to match its free proposition. And while it does solicit and receive considerable donations, the key to its ability to provide such a service free of charge is an estimated 100 million hours of unpaid work done by volunteers to create, edit and maintain articles. With 35 million articles in 290 languages, the project would simply be economically unfeasible without volunteer labour.

Click here for a visualisation of Wikipedia’s monthly page views over time

However, this system of crowd-sourced labour in its current state is a barrier to Wikipedia’s stated goal of building free encyclopaedias of neutral, cited information in all languages of the world, even though it is also the only way to achieve it. According to Wikipedia’s own 2011 survey of its volunteer editors, they are overwhelmingly male and concentrated in North America and Europe, and 76 percent of them edit in the English Wikipedia (3). While this survey is now several years old, the percentage of female editors is still commonly estimated to be around 10-12 percent, although some studies put the number closer to 15-20 percent (4). Of members with more than 500 edits, however, only six percent are female (5). Furthermore, while the number of non-English articles has grown, the breakdown of the 290 languages represented is still weighted heavily in favour of the English and European markets (6):

[Figure: screenshot (15 February 2016) of the breakdown of Wikipedia articles by language]

This breakdown of editors clearly illustrates that Wikipedia’s workforce is overwhelmingly Western and male, and while no data exist on editor ethnicities, it is not a stretch to add that they are most likely overwhelmingly white as well. And in the pursuit of a collection of the world’s cumulative knowledge, the absence of voices from outside this narrow slice of the global population is significant.

The counterargument to this would be that Wikipedia mandates neutral, cited articles, in which case the profile of the majority of editors should not be important. The ten rules for editing and the five pillars of the project repeatedly emphasise neutrality in writing and that the site is not a platform for opinion or promotion. What is more, a large portion of editing is conducted by bots (7), including ClueBot NG, which detects vandalism with up to 90 percent accuracy, and others that perform useful, if more mundane, tasks like automatically tidying up categories, fixing links or correcting common misspellings. There is also COIBot, which reports potential conflicts of interest where account names or usernames overlap with the subject of the article being edited. Considering the difficulty of ensuring impartiality and objectivity in an environment where anyone can edit a page, such a bot is important. Other attempts to monitor conflicts of interest have also appeared, such as @congressedits, a Twitter bot that tweets when anonymous edits are made from IP addresses in the US Congress. Many of these edits turn out to be corrections of minor errors or misspellings (8), but the oversight acts as a preventative measure. However, active conflicts of interest are only the most obvious way in which objectivity can be violated, and bots can only do so much. With the production of content still entirely reliant on active participants, what pages are created and developed depends very much on the interests and expertise of the editors.

Here, a participant base weighted heavily towards a certain segment of the population begins to have an impact. The first of these traits – Western – is in some ways not difficult to address given time. The pattern of distribution of editors globally largely matches the distribution of internet access and uptake. The result is clear in the numbers, with English (including North America and the Commonwealth) and European articles making up the vast majority of the total, and, by language, representing nine of the thirteen Wikipedias with over a million articles each. This means significantly more articles about Western issues and events, more articles written from a Western perspective, and more influence on Wikipedia policy and the community from Western editors. A 2011 study from the University of Oxford found that 84 percent of entries tagged with a location were about Europe or North America, and that Antarctica had more entries than any nation in Africa or South America (9). However, the remaining four one-million-plus Wikipedias show the growth in uptake in Asia, with Japanese, Vietnamese and two Filipino languages represented. And while the English Wikipedia remains nearly three times as big as the next largest (German), the percentage of all articles that are in English and in the ten largest Wikipedias has dropped steadily over time.

[Figure: percentage of all Wikipedia articles that are in English and in the ten largest Wikipedias over time]

While this data does not go beyond 2008, it has also been noted that new language Wikipedias seem to follow the same pattern of growth that English did (CITE), suggesting that the trend may continue as more languages are added and grow. While the English/Western dominance looks sure to continue for some time, it is feasible that Wikipedia’s geographic diversity will improve in step with internet access as it spreads and improves globally. The Foundation is even engaging directly in the effort to expand access globally with its Wikipedia Zero initiative, which allows access to Wikipedia via mobile data without incurring any charges in 64 countries, mainly in the Global South.

Click here for an animated graph of Wikipedia’s growth

However, what this fails to address is the silo effect that is created when a group writes only for its own members. While the individual sections may grow, the information each language group can access will be limited. Many browsers do offer increasingly accurate web page translation, but discovery is severely limited for a user who does not speak the language in question. In this one can see a flaw in the Wikipedia mission, as simply collating the world’s knowledge is not the same as making it accessible to everyone. It could be argued that the English Wikipedia’s size counters this somewhat, as English is a common second language globally, making it the default version, accessible to most, if not all. Indeed, the infographic below demonstrates how much more popular the English version is than any other. However, the Western bias, and even a North American bias, is clear in the selection of articles and likely in the content, so it remains an inaccurate depiction of the world’s knowledge to whoever reads it. What is more, there are considerable drawbacks for native English speakers, who are far less likely than others to learn another language, as there is a very real likelihood that they would not even recognise that the information they were getting could be biased or incomplete. A collection of over five million articles (and counting) gives a convincing impression of being comprehensive, and there is considerable trust placed in an encyclopaedia, online or otherwise.

[Infographic: relative popularity of the different language versions of Wikipedia]

The effect of gender representation is in many ways more subtle, and more difficult to address. The 80-90 percent male editing force has had some notable effects on the kinds of subjects that are added and developed, as well as creating more obvious controversies. Generally, the male bias has led to articles on typically male ventures like Pokémon or WWE (the fourth most contested article on the English site) being given an enormous amount of attention, and articles about women, of specific interest to women or about aspects of culture traditionally regarded as feminine being absent or neglected (10). Founder Jimmy Wales has used the example of Kate Middleton’s wedding dress to show how the subject of an article can highlight the gender divide. In a speech to Wikimania he pointed out that while there are over a hundred articles on different Linux distributions, indicative of the influence of a male, tech-heavy community, a new article about Middleton’s dress was immediately flagged for deletion with responses ridiculing it as trivial (11). Research has also shown that articles worked on by predominantly female editors, which were presumably of interest to female readers, were significantly shorter than those edited by mostly male or an equal mix of editors (12). It is also important to consider that studies such as this only use article length as an indicator, as analysing how much these trends are replicated in article content is difficult. As the recent OED controversy illustrated, however, biases can be insidious and go unnoticed for long periods of time, so it is likely that there are many instances of embedded gender prejudices throughout Wikipedia’s millions of articles, being read by an audience who look to the source as an authority.
This issue is discussed by philosopher Martin Cohen who says that “all the prejudices and ignorance of its creators are imposed” on the content and that at the time of writing (2008) the articles that had earned a ‘bronze star’ for being accurate, neutral and complete only made up 0.01% of the total. The number has not grown in the eight years since.

There have also been several much less subtle demonstrations of the gender divide, few less subtle than the GamerGate episode. In short, a female video game developer, Zoe Quinn, released an interactive fiction game based on her experiences with depression, and was immediately met with threats and harassment from the gaming community, including doxing that put her phone number and address online. The attacks escalated when an ex-boyfriend wrote a blog post claiming that Quinn had cheated on him with several people, including a journalist who had written about the game. That journalist was quick to point out that he had not reviewed the game, merely reported that it existed, but the story evolved into one purportedly about ethics in gaming journalism, while being vitriolic in its treatment of Quinn and women in gaming generally.

While there were layers of controversy throughout the story as it spilled into social media and other characters became involved, Wikipedia was a battleground between the two sides from early on. After a lengthy edit war over the GamerGate page, the issue went to Wikipedia’s highest arbitration committee, ArbCom, where eleven male and three female members ruled to ban five prominent feminist editors from editing either the GamerGate page or any other article about “gender or sexuality, broadly construed” (13). The breadth of the sanctions was widely criticised for leaving not only the original page but those of the people involved, Quinn and others, open to editing by their critics. The only accounts suspended on the opposing side of the issue were throwaways. Regardless of whether the ArbCom decision was justified or not, Gawker stated that at the very least “the episode punches a neat hole in the idea that Wikipedia is a neutral and democratic platform” but beyond that, “that the world’s seventh-most popular website would look at Gamergate and decide that what’s needed is a silencing of feminist perspectives is depressing, but it’s hardly surprising.”

This criticism is not the first the Foundation has faced around the gender gap and its impact, but the senior executives acknowledge it freely, as evidenced by Wales’ speech, and are making efforts to address it. Sue Gardner, a former executive director of the Wikimedia Foundation, set a goal in 2011 to raise the proportion of female editors to 25 percent by 2015, and numerous projects have been introduced to try to help. On the English Wikipedia these included a gender gap taskforce to help recruit and retain female editors, the Inspire Campaign grant funding and projects like WikiProject Feminism, WikiProject Women’s History, WikiProject Women scientists, and WikiProject Women’s sport, designed to expand the article entries in under-developed areas. Efforts have also been made to redesign the user interface to make it more approachable, but this was met with stiff resistance from the established community when it was made the default, so it was eventually retained only as an opt-in option that is difficult for newcomers to find (14). Other external efforts include edit-a-thons where both experienced and novice female editors arrange meet-ups and edit together, offering tutorial sessions, research support and guidance. Unfortunately, these have faced trouble from the community as well: one event that used Smithsonian archives to create new articles on unknown female historical figures had two of the pages it created quickly flagged for deletion and subjected to debate (15). There is no evidence that these efforts have pushed female editor representation even close to the 25 percent mark, and the initiatives meet with resistance from the male core of editors at every turn. Jimmy Wales admitted in 2014 that the effort had ‘completely failed’. Representation remains around 10-15 percent according to most sources and the environment remains toxic for many of the female editors who do participate, with stories of harassment commonplace.

What has been presented here is only an overview of the deeply entrenched issues of representation in the Wikipedia editing ecosystem. The Western bias is overwhelming in sheer numbers and undoubtedly affects content. Women remain isolated and excluded by hostility, harassment and a system designed by and for a community that does not want to include them (16). While ethnicity has not been explored here, many of the same issues of underrepresentation exist for minority groups. Entries on Eric Garner, who was choked to death by an NYPD officer in 2014, and other victims of police violence have been edited from an NYPD IP address to appear less inflammatory (17), and articles on black history and culture are absent or underdeveloped (18). The small efforts that are made to improve the situation, while valuable, face an enormous uphill struggle and are actively resisted by members of dominant groups. The problem threatens to completely undermine Wikipedia’s goal of collecting and sharing the world’s knowledge, and it already undermines its credibility as a neutral source of information. And yet the illusion of neutrality remains convincing, as internal conflicts remain invisible to the general public, and its authority only grows as more and more people and programs draw their information from within its pages. Ultimately, it is clear that while theoretically comprehensive and open to everyone, Wikipedia is an online microcosm of the restricted access, participation and representation that has existed throughout history. It does not, in its current form, look likely to meet the goal it has set for itself of knowledge for all, by all.

3 Replies to “White, Western and Male: How Wikipedia fails to deliver on the promise of knowledge by all, for all”

  1. Zoe, this essay is very insightful. I particularly enjoyed your graphics, links to relevant articles, and the many helpful examples you have provided throughout. Wikipedia’s editing/authoring gender and geographical biases are something I had not known about, but are useful to know especially considering that I am an avid reader/consulter of Wikipedia (though not an editor) myself. Your essay provides a compelling overview of the issues and your stance on them.

    I have noted, though, that some of your examples could have been made stronger. For example, the unfair deletion of the Wikipedia entry on Kate Middleton’s wedding dress could have used more support to drive the point home. In the source that you cite regarding the dress, it explains that her dress is significant beyond the event of the wedding, as it could have a lasting impact on the fashion industry as a whole, and therefore merits a Wikipedia entry. Another issue you raise is about the Gamergate episode and the ensuing Wikipedia edit wars. This paragraph could have been introduced better (I have noted a sentence with hypothes.is that could be used for introduction) as I felt lost reading as to how the issue linked to Wikipedia until well into the paragraph. Other comments are minor and have been made throughout with hypothes.is, such as your one unattributed parenthetical citation.

    Since Wikipedia lives online and hence is perpetually a work in progress, there is no reason why the issues you have raised in your essay cannot be corrected or at least ameliorated. You note that Wikipedia is aware of their current gender gap in terms of editors, as evidenced by their many initiatives to improve this and recruit women editors. You also mention Wikipedia’s mission to get more engagement in other countries as well, in order to be more representative of knowledge on a global scale. If Wikipedia is not successful with these plans, at the very least they have made the first necessary step to fixing these issues, which is acknowledging them.

    In response to your essay, I’d like to take the devil’s advocate argument: that people who want to edit Wikipedia, do or will. A heavy male skewing in terms of editorial ratio could be because of the coding knowledge needed that you mention, but the technology/computer science fields where coding skills can be developed are male-dominated as well. Thus the onus of Wikipedia’s gender gap cannot be placed on Wikipedia alone, as it is a symptom of a larger issue. A step in the right direction could be to make more of an effort to get women involved in STEM subjects. This could help balance these fields and would have reverberations for coding-based pursuits and beyond.

    My personal conclusions to the problems presented in this essay are twofold. First, Wikipedians could be held more accountable for entries they write/edit by having to use their real names instead of random usernames. This could help with the issues you raise concerning editing wars. Second, Wikipedia could adapt its entry editing to a WYSIWYG interface so that anyone who does not have an understanding of code could help add/correct entries, which would encourage participation and move closer to their mission of democratizing knowledge. You may enjoy reading David’s essay, “Participating On The Internet: The Most Unfair Playground In The World,” as it links up well with yours. David’s essay can be found at: http://tkbr.publishing.sfu.ca/pub802/2016/02/participating-on-the-internet-the-most-unfair-playground-in-the-world/

    1. Nice to see links between student essays! I had the same thought about the Gamergate controversy description, although my suggestion was to make it shorter. Overall, a fair critique, and one that helps the reader keep reading on the subject. Thanks!

  2. This is an excellent summary of some of the biases that exist on Wikipedia. Unfortunately, these issues on Wikipedia have been around for a long time, and have yet to be adequately addressed. It is incredibly frustrating. Aaron Swartz wrote a piece back in 2006 that touches on similar points.
