4 Wikidata and Gender Equity in Publishing

Jere Odell; Mairelys Lemus-Rojas; and Lucille Brys

Inequities in Scholarly Publishing

Inequities in scholarly publishing mirror, and in some cases exaggerate, broader social and systemic injustices. While scholarly communication librarians often think about paywalls and other barriers to knowledge access, these problems are preceded by inequities that exclude or disadvantage people who have been marginalized by sexist, racist, elitist, and colonialist histories. These inequities result in a global corpus of scholarly literature that is disproportionately authored by white men from wealthy countries (Alperin, 2018; Hathcock, 2018; Roh, 2018).

These systemic inequities are compounded by bias in authorship and citation practices. Women are less likely to be named as co-authors of team-based research (Larivière et al., 2013). At the same time, even when published in highly ranked journals, works authored by women are less likely to be cited (Larivière & Sugimoto, 2017). Like the biases documented against job applicants for positions in research labs and against contributors to code repositories, these differences are evident when an author’s first name merely appears to be one that is conventionally given to women (Moss-Racusin et al., 2012; Terrell et al., 2017). When women are excluded from authorship and from the reference lists of subsequent works, their contributions to scholarly communication become less visible and less likely to be recognized and rewarded.

Gender Inequities in Wikipedia

The social effects of systemic bias against women authors carry over into secondary and tertiary knowledge systems, including Wikipedia. When the primary and secondary literature under-addresses the work of women scholars, Wikipedians may fail to recognize women as meeting the criteria for “notability” (Wikipedia:Notability – Wikipedia, 2022). In one well-publicized case, the scientist Donna Strickland did not have an English Wikipedia entry until after she won the Nobel Prize in Physics (Koren, 2018). When implicit or direct biases like these are combined with the harassment of women and transgender editors, Wikipedia loses editors who would have an interest in contributing content that would reduce gender inequities in the encyclopedia (Baltz, 2021; Jacobs, 2019).

Bridging the Gender Gap in Wikidata

One way to begin to address the gender gap is direct action that encourages participation by women, transgender, and nonbinary people in Wikimedia while supporting projects that focus on content of interest to these participants. In 2014, only 15.5% of biographies in English Wikipedia were about women; by 2022, that number had increased to 19.2%, largely due to the dedicated work of WikiProject Women in Red and similar projects (Wikipedia:WikiProject Women in Red – Wikipedia, n.d.). Even though only about one in five biographies is about a woman, this work demonstrates that progress can be made.

Wikidata has similar inequities in content. As of a data snapshot from May 9, 2022, close to 82% of Wikidata entries for humans with a sex or gender (P21) statement are labeled “male.” In contrast, only 18% are labeled “female” and less than 1% are labeled with other genders (Humaniki | Wikimedia Diversity Dashboard Tool, n.d.). Those working on library scholarly communication initiatives can begin to address this gap by contributing entries for women authors. Doing so would not only narrow the gender gap on Wikidata but also make the identifiers (such as ORCID) and the linked works of women authors more discoverable. Contributing referenced, public data about women and other authors who are not cisgender men may, in turn, make it more likely that these authors will be read and cited. Nonetheless, librarians should take a conscientious approach to making gender-based contributions to Wikidata.

Conscientious Gender Statements

The description of humans (Q5) in Wikidata should be done with care. Although much can be discovered about a person by using search engines, adding that information to an open, linked data site like Wikidata can expose personal information to unexpected downstream use. Use of the property for sex or gender (P21), particularly for living persons, should be accompanied by careful consideration of the related ethical issues. Readers of this text may in fact decide that P21 has no place in a project that seeks to respect a person’s autonomy. Readers who come to that conclusion are not alone.

The Wikidata community is divided on the scope and the responsible use of sex or gender (P21). The property has been the subject of many debates over the years among both seasoned Wikidata users and newer contributors to the project. Despite much discussion, the community has not reached a consensus on how best to use this property or on what is ethically acceptable. Some argue that the property, which as currently used mostly documents gender identity, should be separated into two (https://archive.ph/Q1DDS). In 2019, a proposal was made to split the property, change its label to gender, and create a new property to record sex. This proposal drew both support and opposition. For some users, this approach made sense semantically, since sex and gender are two distinct concepts and should therefore be treated differently. Others raised concerns about the proposed split, including the fear that it could be used as an invitation to harass trans and nonbinary individuals and the fact that some languages do not make a distinction between sex and gender (https://archive.ph/zwTux).

However, as the members of the Wikidata:WikiProject LGBT have suggested, “modeling gender matters.” Refusing to model a responsible use of gender in Wikidata may seem like a reasonable way to avoid misgendering a person, but taking that approach will minimize the record of gendered persons, including those who dedicated their lives to the right to be known as a person with a gendered identity. Similarly, efforts to work toward gender equity in scholarly communication and in Wikimedia projects rely, in part, on the ability to record and track the genders of authors. Thus, as contributors consider these factors, they may weigh the sometimes conflicting principles of personal autonomy and general beneficence. In respect for personal autonomy, “Everybody should feel comfortable with the way their gender is modeled.” Meanwhile, contributors may also aspire to the common good (general beneficence). In other words, these contributors may work from “a position of influence” to develop Wikidata’s gender model in a way that will “advocate” and “promote” efforts related to gender equity and inclusion (Wikidata:WikiProject LGBT/gender – Wikidata, n.d.).

A Model for sex or gender (P21) Statements

Until the community reaches a consensus on how best to document sex and gender data, continuing to use the existing sex or gender property would facilitate future analyses of the data. While this approach might not always yield accurate results, it would nonetheless be a step forward in the representation of this data point. When adding P21 statements for authors affiliated with a particular institution, Wikidata contributors should make an effort to provide this information in a way that respects how the authors have identified themselves on their university profile pages. Careful P21 statements will, in turn, facilitate the analysis and, to the extent possible, the mitigation of gender inequities in the scholarly record.

One model that attempts to honor the self-expressed, publicly stated gender identities of persons is described here. In this model (see Figure 7), the reference for a P21 statement includes the following properties:

  • reference URL: the source of the information
  • quotation: a snippet of text from the faculty member’s or researcher’s page in which they use pronouns to identify themselves
  • retrieved: the date the data was accessed
  • archive URL: the archived link of the source website
  • archive date: the date on which the source website was archived
  • based on heuristic: the value inferred from grammatical gender used in text and, if applicable, the value inferred from given name

[Image: screenshot of the "sex or gender" reference model]
Figure 7. Example of the reference model in use for Lauren Berlant’s sex or gender (P21) statement.
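
The reference model described above can also be sketched as a small data structure. The Python sketch below is illustrative only and is not a tool used by the authors. The property IDs (P854, P1683, P813, P1065, P2960, P887) are assumptions about the identifiers these properties carry in Wikidata and should be verified there before use. The function name, the profile and archive URLs, the quotation, and the dates are hypothetical placeholders, not data from a real profile page.

```python
from datetime import date

# Assumed Wikidata property IDs for the reference model; verify before use.
REFERENCE_PROPERTIES = {
    "reference URL": "P854",
    "quotation": "P1683",
    "retrieved": "P813",
    "archive URL": "P1065",
    "archive date": "P2960",
    "based on heuristic": "P887",
}


def build_p21_reference(reference_url, quotation, retrieved,
                        archive_url, archive_date, heuristics):
    """Return one reference for a P21 statement as {property ID: value}.

    Hypothetical helper: it only assembles and checks the fields named in
    the model; it does not write anything to Wikidata.
    """
    if not quotation.strip():
        raise ValueError("quotation must contain the self-identifying text")
    if not heuristics:
        raise ValueError("at least one heuristic is required")
    fields = {
        "reference URL": reference_url,
        "quotation": quotation.strip(),
        "retrieved": retrieved.isoformat(),
        "archive URL": archive_url,
        "archive date": archive_date.isoformat(),
        "based on heuristic": list(heuristics),
    }
    return {REFERENCE_PROPERTIES[label]: value for label, value in fields.items()}


# Hypothetical example values (not a real person or page).
example = build_p21_reference(
    reference_url="https://example.edu/profiles/jane-doe",
    quotation="Dr. Doe (she/her) teaches courses in library science.",
    retrieved=date(2022, 5, 16),
    archive_url="https://archive.example.org/jane-doe",
    archive_date=date(2022, 5, 16),
    heuristics=["inferred from grammatical gender used in text"],
)
```

Keeping the quotation and the archived copy together with the heuristic makes it possible for a later reader to see exactly which public, self-authored text the statement rests on, and to revise the statement if that text changes.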

Activities

  • Watch Os Keyes’s keynote from WikidataCon 2019 and review the related notes. In what ways could Wikidata minimize potential harm to persons?
  • Find an author item in Wikidata, perhaps one for a faculty member or researcher at your institution. Does this item include a value for the property sex or gender (P21)? If so, is it referenced? Based on the model presented in this section, add or improve a reference for the P21 statement.
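
For readers who would rather check an item programmatically than in the web interface, the second activity can be approximated with a short script. The sketch below is a minimal illustration, assuming the item has already been obtained in Wikidata's entity JSON format (for example, from the Special:EntityData page for that item) and that the inner entity object, the one containing "claims", is passed in. The function name and the sample item are hypothetical; the sample is hand-built for illustration and does not describe a real person.

```python
def audit_p21(entity):
    """List the sex or gender (P21) statements on one entity.

    Returns a list of (value_id, reference_count) tuples, one per statement.
    value_id is None when the statement has no specific item as its value.
    """
    results = []
    for statement in entity.get("claims", {}).get("P21", []):
        mainsnak = statement.get("mainsnak", {})
        # "unknown value" and "no value" statements carry no datavalue.
        value = mainsnak.get("datavalue", {}).get("value", {})
        value_id = value.get("id") if isinstance(value, dict) else None
        reference_count = len(statement.get("references", []))
        results.append((value_id, reference_count))
    return results


# Hand-built sample in the shape of Wikidata's entity JSON (hypothetical item).
sample_entity = {
    "claims": {
        "P21": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P21",
                    "datavalue": {
                        "type": "wikibase-entityid",
                        "value": {"entity-type": "item", "id": "Q6581072"},
                    },
                },
                # No "references" key: this statement is unreferenced.
            }
        ]
    }
}

for value_id, reference_count in audit_p21(sample_entity):
    status = "referenced" if reference_count else "unreferenced"
    print(value_id, status)
```

An item with no P21 statement yields an empty list, which is itself a useful result: it may reflect a deliberate choice not to record gender rather than an omission to fix.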

Additional Resources

To continue learning about inequities in scholarly publishing, consider the following works: “Intersectional inequalities in science,” “The Magnification of Inequities During COVID-19 and Why it Matters for Science,” and “The rise of citational justice: How scholars are making references fairer.” In addition, a talk given during WikidataCon 2021, “Non-binary gender identities in Wikidata,” offers an insightful perspective on this topic.

References

Alperin, J. P. (2018, August 9). World Scaled by Number of Documents with Authors from Each Country in Web of Science: 2016 [Map]. https://doi.org/10.6084/m9.figshare.7064771.v1

Baltz, S. (2021, February 24). Wikipedia’s political science coverage is biased. I tried to fix it. Washington Post. https://www.washingtonpost.com/politics/2021/02/24/wikipedias-political-science-coverage-is-biased-i-tried-fix-it/

Hathcock, A. (2018). Racing to the Crossroads of Scholarly Communication and Democracy: But Who Are We Leaving Behind? In the Library with the Lead Pipe. http://www.inthelibrarywiththeleadpipe.org/2018/racing-to-the-crossroads-of-scholarly-communication-and-democracy-but-who-are-we-leaving-behind/

Humaniki | Wikimedia Diversity Dashboard Tool. (n.d.). Humaniki. Retrieved May 16, 2022, from https://humaniki.wmcloud.org, archived at https://archive.ph/ddsnn

Jacobs, J. (2019, April 8). Wikipedia Isn’t Officially a Social Network. But the Harassment Can Get Ugly. The New York Times. https://www.nytimes.com/2019/04/08/us/wikipedia-harassment-wikimedia-foundation.html

Koren, M. (2018, October 2). One Wikipedia Page Is a Metaphor for the Nobel Prize’s Record With Women. The Atlantic. https://www.theatlantic.com/science/archive/2018/10/nobel-prize-physics-donna-strickland-gerard-mourou-arthur-ashkin/571909/

Larivière, V., Ni, C., Gingras, Y., Cronin, B., & Sugimoto, C. R. (2013). Bibliometrics: Global gender disparities in science. Nature, 504(7479), 211–213. https://doi.org/10.1038/504211a

Larivière, V., & Sugimoto, C. R. (2017, March 27). The end of gender disparities in science? If only it were true… CWTS Meaningful metrics. https://www.cwts.nl:443/blog?article=n-q2z294

Moss-Racusin, C. A., Dovidio, J. F., Brescoll, V. L., Graham, M. J., & Handelsman, J. (2012). Science faculty’s subtle gender biases favor male students. Proceedings of the National Academy of Sciences, 109(41), 16474–16479. https://doi.org/10.1073/pnas.1211286109

Roh, C. (2018, April 14). Scholarly Communication in a Time of Change: Considering the Impact of Bias, Diversity, and Traditional Publishing Structures as Scholarly Communication Moves to New Platforms and Systems (Talk Transcript). California Academic Research Libraries. https://works.bepress.com/charlotteroh/43/

Terrell, J., Kofink, A., Middleton, J., Rainear, C., Murphy-Hill, E., Parnin, C., & Stallings, J. (2017). Gender differences and bias in open source: pull request acceptance of women versus men. PeerJ Computer Science, 3, e111. https://doi.org/10.7717/peerj-cs.111

Wikidata:WikiProject LGBT/gender – Wikidata. (n.d.). Wikidata. Retrieved April 18, 2022, from https://www.wikidata.org/wiki/Wikidata:WikiProject_LGBT/gender

Wikipedia:Notability – Wikipedia. (2022). Wikipedia. Retrieved June 1, 2022, from https://en.wikipedia.org/wiki/Wikipedia:Notability

Wikipedia:WikiProject Women in Red – Wikipedia. (n.d.). Wikipedia. Retrieved May 29, 2022, from https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Women_in_Red

License

Wikidata for Scholarly Communication Librarianship Copyright © 2022 by Jere Odell; Mairelys Lemus-Rojas; and Lucille Brys is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.

Digital Object Identifier (DOI)

https://doi.org/10.7912/8spa-6s68
