Wiki Workshop 2022

A forum bringing together researchers exploring all aspects of Wikimedia projects. Held virtually at The Web Conference 2022, April 25, 2022.

  • April 12, 2022: Mishi Choudhary (SFLC) has joined us as a panelist.
  • April 5, 2022: Cory Doctorow (Electronic Frontier Foundation) and Tiffiniy Cheng (Fight for the Future) have joined us as panelists.
  • March 16, 2022: Registration for Wiki Workshop 2022 is now open. Register!
  • March 13, 2022: We received a total of 31 paper submissions for this year’s workshop. Reviews for the archival submissions have concluded. Reviews for the non-archival submissions are in progress.
  • March 8, 2022: Erik Moeller (Freedom of the Press Foundation) is confirmed as our panel moderator.
  • Jan. 18, 2022: Lawrence Lessig is confirmed as our Keynote Speaker.
  • Dec. 16, 2021: Call for contributions and submission instructions are published.
  • Dec. 14, 2021: Wiki Workshop 2022 will be fully remote.
  • Dec. 14, 2021: Wiki Workshop 2022 webpage online.

The times in the table below are in UTC. 12:00 UTC is 5:00 in San Francisco, 8:00 in New York City, 15:00 in Nairobi, and 20:00 in Beijing.

12:00 - 12:30 Welcome and icebreaking (Video)
12:30 - 13:20 Paper Session I (Video)
13:20 - 13:25 Break
13:25 - 13:35 Music Break: Ugnė Danielė Reikalaitė
13:35 - 14:30 Paper Session II (Video)
14:30 - 14:40 Break
14:40 - 15:40 Poster Session
15:40 - 15:55 Break and Live Music
15:55 - 16:50 Panel discussion moderated by Erik Moeller, featuring Tiffiniy Cheng, Mishi Choudhary, and Cory Doctorow - "10 Years After the SOPA/PIPA Blackout: The Past and Future of Online Protest" (Video)
16:50 - 17:00 Music Break: Ugnė Danielė Reikalaitė
17:00 - 17:15 Wikimedia Foundation Research Award of the Year (Video)
17:15 - 17:20 Break
17:20 - 18:20 Keynote by Lawrence Lessig - "How can the Internet be so bad and so good: The lessons we must draw and that Wiki must teach" (Video)
18:20 - 18:30 Closing (Video)

Lawrence Lessig (Harvard Law School)

Photo by Jessica Scranton

Lawrence Lessig is the Roy L. Furman Professor of Law and Leadership at Harvard Law School. Prior to returning to Harvard, he taught at Stanford Law School, where he founded the Center for Internet and Society, and at the University of Chicago. He clerked for Judge Richard Posner on the 7th Circuit Court of Appeals and for Justice Antonin Scalia on the United States Supreme Court. Lessig is the founder of Equal Citizens and a founding board member of Creative Commons, and serves on the Scientific Board of AXA Research Fund. A member of the American Academy of Arts and Sciences and the American Philosophical Society, he has received numerous awards including a Webby, the Free Software Foundation's Freedom Award, Scientific American 50 Award, and Fastcase 50 Award. Cited by The New Yorker as “the most important thinker on intellectual property in the Internet era,” Lessig has turned his focus from law and technology to “institutional corruption”—relationships which, while legal, weaken public trust in an institution—especially as that affects democracy. His books are: They Don't Represent Us: Reclaiming Our Democracy (2019), Fidelity & Constraint: How the Supreme Court Has Read the American Constitution (2019), America, Compromised (2018), Republic, Lost v2 (2015), The USA is Lesterland (2014), One Way Forward (2012), Republic, Lost: How Money Corrupts Congress—and a Plan to Stop It (2011), Remix: Making Art and Commerce Thrive in the Hybrid Economy (2008), Code v2 (2006), Free Culture (2004), The Future of Ideas (2001), and Code and Other Laws of Cyberspace (1999). Lessig holds a BA in economics and a BS in management from the University of Pennsylvania, an MA in philosophy from Cambridge University, and a JD from Yale.

Erik Moeller

Erik is VP of Engineering at Freedom of the Press Foundation. He was Deputy Director of the Wikimedia Foundation from January 2008 to April 2015. In that role, he organized the technical implementation of the SOPA/PIPA blackout of Wikipedia in 2012.
At Freedom of the Press Foundation, Erik manages the development of the SecureDrop software, which is used by 70+ news organizations to communicate with anonymous whistleblowers.
Erik has also worked as a journalist and author, project manager, public speaker, and software engineer.

Tiffiniy Cheng (Fight for the Future)

Tiffiniy Cheng is a co-founder and board member of Fight for the Future (FFTF), an organization known for its mass campaigns that have changed Internet history. FFTF created the campaigns behind the 2016 efforts to block backdoors into encryption and the iPhone, the landmark 2015 net neutrality decision, and the Internet Blackout, the largest and most visible online protest in history, in which 24 million people took action. Tiffiniy’s work focuses on upending the arbitrary powers and laws that seek to limit openness, privacy, and freedom of expression. She has spent over 15 years building software applications and internet strategies to this end: she created some of the earliest viral online protests; started Bitcoin Black Friday; built Open Congress, the most popular government transparency site; and initiated post-2008 national protests against too-big-to-fail. She is currently running political and internet strategy for A-teams, an effort to build small tech activism teams like FFTF that can use the Internet to capture the debate and win policy change for the public interest. She was a Shuttleworth and Ashoka fellow for her work fighting for privacy and an open society.

Mishi Choudhary (Software Freedom Law Center)

Photo by Ot (CC BY-SA 4.0)

Mishi Choudhary is a technology lawyer and an online civil liberties activist with a law practice in New York and New Delhi. The Open magazine calls her an emerging legal guardian of the free and open internet. She is the Legal Director of the New York-based Software Freedom Law Center and a Partner at Moglen & Associates. At SFLC, Mishi has served as the primary legal representative of many of the world's most significant free software developers and non-profit distributors, including Debian, the Apache Software Foundation, and OpenSSL. She advises technology startups and established businesses around the world on intellectual property matters, in particular on open source software licensing and strategy, export control compliance, diversity and inclusion, data protection, and content moderation.

Cory Doctorow

Photo by Jonathan Worth (CC BY-SA 2.0)

Cory Doctorow is a science fiction author, activist, and journalist. He is the author of many books, most recently RADICALIZED and WALKAWAY, science fiction for adults; HOW TO DESTROY SURVEILLANCE CAPITALISM, nonfiction about monopoly and conspiracy; IN REAL LIFE, a graphic novel; and the picture book POESY THE MONSTER SLAYER. His latest book is ATTACK SURFACE, a standalone adult sequel to LITTLE BROTHER; his next nonfiction book is CHOKEPOINT CAPITALISM, with Rebecca Giblin, about monopoly and fairness in the creative arts labor market (Beacon Press, 2022). In 2020, he was inducted into the Canadian Science Fiction and Fantasy Hall of Fame.

Bart Magnus
Public Domain Tool: automating the calculation of works in the public domain while contributing to and using Wikidata [PDF]
Carina Negreanu, Alperen Karaoglu, Jack Williams, Shuang Chen, Daniel Fabian, Andrew Gordon and Chin-Yew Lin
Rows from Many Sources: How to enrich row completions from Wikidata with a pre-trained Language Model [PDF]
Anass Sedrati and Reda Benkhadra
Are Democratic User Groups More Inclusive? [PDF]
Puyu Yang and Giovanni Colavizza
A Map of Science in Wikipedia [PDF]
Karthic Madanagopal and James Caverlee
Improving Linguistic Bias Detection in Wikipedia using Cross-Domain Adaptive Pre-Training [PDF]
Jean Dupuy, Adrien Guille and Julien Jacques
Anchor Prediction: A Topic Modeling Approach [PDF]
Núria Ferran-Ferrer, Marc Miquel-Ribé, Julio Meneses and Julià Minguillón
The gender perspective in Wikipedia: A content and participation challenge [PDF]
Hiba Arnaout, Trung-Kien Tran, Daria Stepanova, Mohamed H. Gad-Elrab, Simon Razniewski and Gerhard Weikum
Utilizing LM Probes for KG Repair [PDF]
Tiziano Piccardi, Martin Gerlach and Robert West
Going down the Rabbit Hole: Characterizing the Long Tail of Wikipedia Reading Sessions [PDF]
Subhashish Panigrahi
Building a Public Domain Voice Database for Odia [PDF]
Włodzimierz Lewoniewski, Krzysztof Węcel and Witold Abramowicz
Reliability in Time: Evaluating the Web Sources of Information on COVID-19 in Wikipedia across Various Language Editions from the Beginning of the Pandemic [PDF]
Nicole Schwitter
Offline Meetups of German Wikipedians: Boosting or braking activity? [PDF]
Yonas Mitike Kassa
Wikipedia Knowledge Graphs as Job Interview Kits [PDF]
Nidia Hernandez, Gimena del Rio and Diego de la Hera
Insights on the references of Wikipedia’s featured articles in English, French, Portuguese and Spanish [PDF]
Manoj Niverthi, Gaurav Verma and Srijan Kumar
Characterizing, Detecting, and Predicting Online Ban Evasion [PDF]
Patrick Healy, Klaus Ackerman, Simon Angus, Paul Raschky, Weijia Li, Nathan Lane, Satya Borgohain and Cynthia Huang
Editing the truth [PDF]
Mykola Trokhymovych and Diego Saez-Trumper
WikiFactFind: Semi-automated fact-checking based on Wikipedia [PDF]
Nathan TeBlunthuis
Measuring Wikipedia Article Quality in One Dimension by Extending ORES with Ordinal Regression [PDF]
Kai Zhu and Xin Yue Zhou
Can Machine Translation Narrow Knowledge Gap across Languages? A Large-scale Multilingual Analysis of the Partnership between Google Translate and Wikipedia [PDF]
Marc Miquel-Ribé, Cristian Consonni and David Laniado
Wikipedia, Elder or Teen? [PDF]
Yashashwani Srinivas
The Digital Gender Disparity [PDF]
David Ramírez-Ordóñez, Núria Ferran-Ferrer and Julio Meneses
Wikipedia and gender: The deleted, the marked, and the unpolluted biographies [PDF]
Oktie Hassanzadeh
Building a Knowledge Graph of Events and Consequences Using Wikipedia and Wikidata [PDF]
Anis Elebiary and Giovanni Luca Ciampaglia
The role of online attention in the supply of disinformation in Wikipedia [PDF]
C. Maria Keet, Langa Khumalo and Zola Mahlaza
Considerations for a model for Niger-Congo B (‘Bantu’) noun classes in Wikidata [PDF]

Workshop date: April 25, 2022. This year’s workshop will be a virtual event.

If authors want their paper to appear in the proceedings:

  • Submission deadline: February 3, 2022
  • Author feedback: March 3, 2022
  • Camera ready version due: March 10, 2022

If authors do not want their paper to appear in the proceedings:

  • Submission deadline: March 10, 2022
  • Author feedback: April 1, 2022

We invite contributions to Wiki Workshop 2022, which will take place virtually as part of The Web Conference 2022. Wiki Workshop, now in its 9th edition, is an annual research event aimed at bringing together researchers who explore all aspects of the Wikimedia projects, including Wikipedia (in more than 160 actively edited languages), Wikidata, Wikimedia Commons, Wikisource, Wiktionary, and beyond. With members of the Wikimedia Foundation's Research team on the organizing committee and the experience of successful workshops in 2015, 2016, 2017, 2018, 2019, 2020, and 2021, we aim to continue facilitating a direct pathway for exchanging ideas between the organization that serves Wikimedia projects and the researchers interested in studying them.

Topics of interest include, but are not limited to:

  • new technologies and initiatives to grow content, quality, equity, diversity, and participation across Wikimedia projects
  • use of bots, algorithms, and crowdsourcing strategies to curate, source, or verify content and structured data
  • bias in content and gaps of knowledge
  • diversity of the Wikimedia editors and users
  • detection of low-quality, promotional, or fake content (misinformation or disinformation), as well as fake accounts (e.g., sock puppets)
  • questions related to community health (e.g., sentiment analysis, harassment detection)
  • understanding editor motivations, engagement models, and incentives
  • Wikimedia consumer motivations and their needs: readers, researchers, tool/API developers
  • innovative uses of Wikipedia and other Wikimedia projects for AI and NLP applications
  • consensus-finding and conflict resolution on editorial issues
  • participation in discussions and their dynamics
  • dynamics of content reuse across projects and the impact of policies and community norms on reuse
  • privacy, security, and trust
  • collaborative content creation (unstructured, semi-structured, or structured)
  • innovative uses of Wikimedia projects' content and consumption patterns as sensors for real-world events, culture, etc.
  • open-source research code, datasets, and tools to support research on Wikimedia contents and communities

Papers should be 1 to 12 pages long (maximum 8 pages for the main paper content + maximum 2 pages for appendixes + maximum 2 pages for references). We explicitly encourage the submission of preliminary work in the form of extended abstracts (1 or 2 pages). The review process will be single-blind (as opposed to double-blind), i.e., authors should include their names and affiliations in their submissions.

Papers will be published on the workshop webpage and optionally (depending on the authors' choice) in the proceedings of the Web Conference 2022. Authors whose papers are accepted to the workshop will have the opportunity to present their work in an oral presentation and/or poster session.

Please review our privacy statement prior to submitting your paper. For submission dates, see above.

Srijan Kumar (Georgia Tech)
Emily Lescak (Wikimedia Foundation)
Miriam Redi (Wikimedia Foundation)
Robert West (EPFL)
Leila Zia (Wikimedia Foundation)

