Wiki Workshop 2023

A forum bringing together researchers exploring all aspects of Wikimedia projects. Held virtually as a standalone event, May 11, 2023.

The times in the table below are in UTC. 12:00 UTC is 5:00 in San Francisco, 8:00 in New York City, 15:00 in Nairobi, and 20:00 in Beijing.

12:00 - 12:15 Welcome and Orientation
12:15 - 12:25 Getting to Know Each Other
12:25 - 13:45 Research and Developer tracks (parallel sessions)
13:45 - 13:50 Break
13:50 - 14:00 Live Music
14:00 - 15:20 Research and Developer tracks (parallel sessions)
15:20 - 15:30 Live Music and Break
15:30 - 16:30 Panel - AI and the Future of the Wikimedia Projects
16:30 - 16:40 Break
16:40 - 17:55 Research track (parallel sessions)
17:55 - 18:00 Break
18:00 - 18:05 Live Music
18:05 - 18:20 Wikimedia Foundation Research Award of the Year ceremony
18:20 - 18:45 Town hall and open conversation
18:45 - 18:50 Closing

Isaac Johnson (Wikimedia Foundation)

Isaac Johnson is a Senior Research Scientist at the Wikimedia Foundation. He conducts foundational research and develops new AI technologies to support contributors in addressing knowledge gaps on the Wikimedia projects. He focuses on doing this in an ethical and sustainable manner. Examples of projects in this space have included developing topic classification models that cover all 300+ languages of Wikipedia, guiding the roll-out of generative models within Wikimedia edit recommender systems, and participating in the data governance working group for BigScience's BLOOM large language model.

Elena Simperl (King’s College London and Open Data Institute)

Elena Simperl is a Professor of Computer Science at King’s College London and the ODI’s Director of Research. She is also a Fellow of the British Computer Society, a Fellow of the Royal Society of Arts, a senior member of the Society for the Study of AI and Simulation of Behaviour, and a former Fellow of the Alan Turing Institute. Elena’s research is in human-centric AI, exploring socio-technical questions around the management, use, and governance of data in AI applications. According to AMiner, she is in the top 100 most influential scholars in knowledge engineering of the last decade. She also features in the Women in AI 2000 ranking. In her 15-year career, she has led 14 national and international research projects and contributed to another 26. She is the scientific and technical lead of the European programme MediaFutures, researching how arts, entrepreneurship and AI can help tackle online harms. She has served as programme and general chair of conferences in artificial intelligence, social computing, and data innovation. She is the president of the Semantic Web Science Association.

Thomas Wolf (Hugging Face)

Thomas is a co-founder of Hugging Face, where he oversees the open-source team and the science teams. He enjoys creating open-source software that makes complex research accessible and is most proud of creating the Transformers and Datasets libraries as well as the Magic Sand tool. When not building OSS, he pushes for open science in AI/ML research, aiming to narrow the gap between academia and industrial labs by imagining projects like the BigScience Workshop on Large Language Models. His current research interests are centered around overcoming the current limitations of Large Language Models with multi-modalities and complementary approaches. Thomas enjoys writing and filming educational content on ML and NLP, including writing the reference book Natural Language Processing with Transformers, published by O'Reilly and written with co-authors Lewis Tunstall and Leandro von Werra, writing on his Medium blog, and recording out-of-the-ordinary videos like The Future of Natural Language Processing.


Neechalkaran

Photo by Sumanth69 (CC BY-SA 4.0)

Neechalkaran is a Wikipedian and computational linguist from Tamil Nadu. He is the developer of ChatWiki (a Wikidata-based chatbot), VaaniNLP (a Tamil NLP library), and various language tools such as a Tamil spell checker and an Indic transliteration tool. He has also developed various user scripts and bots for Indic languages. He has created more than 23,000 articles on Tamil Wikipedia and 35,000 pages on Tamil Wiktionary. Neechalkaran received the Information Technology in Tamil Award from the Canada Tamil Literary Garden and the Tamil Computing Award from the Government of Tamil Nadu. He has organized various technology workshops and hackathons related to Wikimedia, life science, and language computing in South India.

Diego de la Hera

Photo by Diegodlh (CC0 1.0)

Diego de la Hera is a scientist and Wikimedian from Argentina. He contributes to technical projects, including the development of tools such as Cita, a Wikidata addon for Zotero, and Web2Cit, a tool to collaboratively improve automatic citations in Wikipedia. He is also one of the founders and a member of Wikimedistas Calamuchita, a non-recognized user group in the Calamuchita Valley in Córdoba, Argentina, and of Wikitécnica, a community of Spanish-speaking technical Wikimedians.

Nidia Hernández

Nidia Hernández is a linguist (University of Buenos Aires) and a specialist in natural language processing (Paris 3/INALCO). She is a member of CAICYT-CONICET (Buenos Aires, Argentina), where she develops resources for Digital Humanities projects, and she collaborates on the documentation, processing, and analysis of endangered languages of South America.

Sohom Datta

Sohom Datta is a student and is one of the volunteer developers working on the ProofreadPage extension at Wikimedia. The ProofreadPage extension provides "proofreading" capabilities to MediaWiki, which is essential for Wikisource, a project that aims to be a free library that anyone can improve. Sohom started his journey in 2020 as a Google Summer of Code student and worked on various features related to Wikisource, such as the inclusion of the Pagelist widget, an interface aimed at simplifying the act of labelling the page number for a book. More recently, in 2021 and 2022, he was involved in efforts to streamline the editing interface for Wikisource, including the addition of IIIF support and the improvement of zooming and panning mechanisms.

Leah Ajmani, Nicholas Vincent and Stevie Chancellor
Peer-Produced Moderation: The Tradeoffs of Page Protection on Wikipedia [PDF]
Reham Al Tamime and Ingmar Weber
Addressing Wikipedia’s Gender Gaps Through Linkedin Ads [PDF]
Vahid Ashrafimoghari and Jordan Suchow
Detecting Cross-Lingual Information Gaps in Wikipedia [PDF]
Rona Aviram and Omer Benjakob
Wikipedia as a tool for contemporary history of science: A case study on CRISPR [PDF]
Aitolkyn Baigutanova, Jaehyeon Myung, Diego Saez-Trumper, Ai-Jou Chou, Miriam Redi, Changwook Jung and Meeyoung Cha
Longitudinal Assessment of Reference Quality on Wikipedia [PDF]
Francesco Bailo, Rohit Ram and Marian-Andrei Rizoiu
Ideology and the collective construction of knowledge of unfolding events on Wikipedia [PDF]
Ava Bartolome, Stevie Chancellor and Loren Terveen
Policy Deliberations in Wikipedia Talk Pages: How Editors Construct Knowledge About Mental Health [PDF]
Dylan Baumgartner, Cristina Sarasua and Pablo Aragon
Deliberative Quality in Wikimedia Projects: Comparing RfCs on Meta, Wikipedia and Wikidata [PDF]
Bettina Berendt, Özgür Karadeniz, Sercan Kıyak, Stefan Mertens and Leen d'Haenens
Diversity and bias in DBpedia and Wikidata as challenges for downstream processing [PDF]
Andrea Burns, Krishna Srinivasan, Joshua Ainslie, Geoff Brown, Kate Saenko, Bryan A. Plummer, Jianmo Ni and Mandy Guo
WikiWeb2M: A Page-Level Multimodal Wikipedia Dataset [PDF]
Kaylea Champion and Benjamin Mako Hill
Taboo and Otherwise: Epistemic Life Histories of Knowledge Resources [PDF]
Zied Dammak and Florian Lemmerich
Effects of the Russo-Ukrainian War on the Editor Activity of the Ukrainian, Russian, and English Wikipedias [PDF]
Laura Fernández and Núria Ferran-Ferrer
Addressing the Wikipedia’s Gender Gap: Towards a Full Inclusion of Intersex and Trans-Non-Binary Gender Identities [PDF]
Patrick Gildersleve
Depths of wikipedia: Understanding cross-platform online attention, content creation, and success [PDF]
Florian Grisel and Giovanni De Gregorio
Codifying Digital Behavior Around the World: A Socio-Legal Study of the Wikimedia Universal Code of Conduct
Sneh Gupta and Kulveen Trehan
Feminist Activism and Power Dynamics in the Digital Sphere: A Case study of #VisibleWikiWomen campaign [PDF]
Sohyeon Hwang and Aaron Shaw
Variation and overlap in the peer production of community rules: the case of five Wikipedias
Steve Jankowski, Claudio Celis Bueno, Jakko Kemper and Ouejdane Sabbah
Global Platform Governance: Multilingual Policy Development on Wikipedia [PDF]
Lucas Jarnac and Pierre Monnin
Wikidata to Bootstrap an Enterprise Knowledge Graph: How to Stay on Topic? [PDF]
Erwan Joud, Nicolas Jullien and Marine Le Gall
Regulation of algorithmic errors in digital commons platforms: managing the conformity-creativity conundrum [PDF]
Zarine Kharazian, Kate Starbird and Benjamin Mako Hill
Governance Capture in a Self-Governing Community: A Qualitative Analysis of Serbo-Croatian Wikipedias [PDF]
Taehee Kim, David Garcia and Pablo Aragón
Controversies over Historical Revisionism in Wikipedia [PDF]
Neeraja Kirtane, Anuraag Shankar, Chelsi Jain, Ganesh Katrapati, Senthamizhan V, Raji Baskaran and Balaraman Ravindran
Hidden Voices: Reducing gender data gap, one Wikipedia article at a time [PDF]
Piotr Konieczny and Włodzimierz Lewoniewski
Measuring Americanization: A Global Quantitative Study of Interest in American Topics on Wikipedia [PDF]
Elisa Kreiss, Krishna Srinivasan, Tiziano Piccardi, Jesus Adolfo Hermosillo, Cynthia Bennett, Michael S. Bernstein, Meredith Ringel Morris and Christopher Potts
Characterizing Image Accessibility on Wikipedia across Languages [PDF]
Hermann Kroll and Wolf-Tilo Balke
Are Qualifiers Enough? Context-Compatible Information Fusion for Wikimedia Data [PDF]
Zhaozhi Li, Julia Wagner, Weijun Yuan, Benjamin Mako Hill and Seth Frey
One path or many? Policy development and diffusion across Wikipedia language editions [PDF]
Malinda Lu and Eni Mustafaraj
Identifying the Gaps in the Coverage of Web Domains in Wikipedia and Wikidata for Credibility Assessment Purposes [PDF]
Daniele Metilli, Beatrice Melis, Chiara Paolini and Marta Fioravanti
How does Wikidata shape gender identities? Initial findings and developments from the WiGeDi project [PDF]
Skaiste Mielinyte and Björn Ross
Detecting Sockpuppet Accounts in Wikipedia: A Quantitative Evaluation of Different Models [PDF]
Finn Nielsen
Synia: Displaying data from Wikibases [PDF]
Tracy Perkins, Sophia Hussein, Lundyn Davis and Mariam Trent
Wikipedia and the Outsider Within: Black Feminism and Racialized, Gendered Knowledge Sharing [PDF]
Tiziano Piccardi, Martin Gerlach and Robert West
Temporal Rhythms of Information Consumption on Wikipedia [PDF]
David Ramírez-Ordóñez and Núria Ferran-Ferrer
Endurance against oblivion: The case of the Articles for Deletion with gender perspective in Wikipedia [PDF]
Riina Reinsalu
Using Wikipedia for educational purposes in Estonia [PDF]
Mostofa Najmus Sakib and Francesca Spezzano
Automated Detection of Sockpuppet Accounts in Wikipedia [PDF]
Bruno Scarone, Ricardo Baeza-Yates and Erik Bernhardson
Understanding Search Behavior Bias in Wikipedia [PDF]
Nicole Schwitter
Data Brief: Twenty years of offline meeting data of the German-language Wikipedia [PDF]
Anubhav Sharma, Ankita Maity, Tushar Abhishek, Rudra Dhar, Radhika Mamidi, Manish Gupta and Vasudeva Varma
Multilingual Bias Detection and Mitigation for Low Resource Languages
Bhavyajeet Singh, Aditya Hari, Rahul Mehta, Tushar Abhishek, Manish Gupta and Vasudeva Varma
Cross-lingual Multi-Sentence Fact-to-Text Generation: Generating factually grounded Wikipedia Articles using Wikidata [PDF]
Manoj Sirvi, Zishan Kazi and Vikram Pudi
Wikipedia Real-time Updates Recommendation System [PDF]
Ivan Smirnov, Camelia Oprea and Markus Strohmaier
Toxic comments reduce activity of volunteer editors on Wikipedia
Prasanna Sridhar, Horace Lee, Abhishek Dutta and Andrew Zisserman
WISE Image Search Engine (WISE) [PDF]
Shivansh Subramanian, Dhaval Taunk, Manish Gupta and Vasudeva Varma
XOutlineGen: Cross-lingual Outline Generation for Encyclopedic Text in Low Resource Languages [PDF]
Tomoki Tada, Katsuhiko Hayashi, Hidetaka Kamigaito and Yuya Taguchi
WPUA-search: A Method to Discover Wikipedia Unusual Articles [PDF]
Dhaval Taunk, Shivprasad Sagare, Anupam Patil, Shivansh Subramanian, Manish Gupta and Vasudeva Varma
XWikiGen: Cross-lingual Summarization for Encyclopedic Text Generation in Low Resource Languages
Gokul Thota, Rahul Khandelwal and Vasudeva Varma
A Web-centric entity-salience based system for determining Notability of entities for Wikipedia [PDF]
Michele Tizzani, Violeta Muñoz-Gómez, Marco De Nardi, Daniela Paolotti, Olga Muños, Piera Ceschi, Arvo Viltrop and Ilaria Capua
Integrating digital and field surveillance as complementary efforts to manage epidemic diseases of livestock: African swine fever as a case study [PDF]
Carla Toro Fernández
La Tercera and Wikipedia: the relationship between the news and the editions in the encyclopedia during the Social Outbreak of 2019 [PDF]
Mykola Trokhymovych, Muniza Aslam, Ai-Jou Chou, Diego Saez-Trumper and Ricardo Baeza-Yates
Towards a fair vandalism detection system for Wikipedia [PDF]
Houcemeddine Turki, Montasser Akermi, Amina Amara, Mohamed Ali Hadj Taieb, Khalil Chebil, Daniel Mietchen and Mohamed Ben Aouicha
Developing a Wikimedia-related research structure in a developing country [PDF]
Ziko van Dijk
Knowledge gap or added value? Self-related content in two lesser resourced languages [PDF]
Amit Arjun Verma, S.R.S Iyengar, Neeru Dubey and Simran Setia
An Efficient Approach to Store and Access Wikipedia's Revision History [PDF]
Veniamin Veselovsky, Akhil Arora, Tiziano Piccardi, Ashton Anderson and Robert West
The Webonization of Wikipedia: Characterizing Wikipedia Linking Across the Web [PDF]
Linda Wang
Social and Language Influence in Wikipedia Articles for Deletion Debates [PDF]
Yurong Wang, Claudia Wagner and Ana L. C. Bazzan
Gender Asymmetries in the depiction of Historical Figures: A Comparison of Bede’s Historia Ecclesiastica and Wikipedia [PDF]
Jheng-Hong Yang, Carlos Lassance, Stéphane Clinchant, Rafael Sampaio De Rezende, Miriam Redi, Krishna Srinivasan and Jimmy Lin
Building Authoring Tools for Multimedia Content with Human-in-the-loop Relevance Annotations [PDF]

  • Submission deadline: March 23, 2023 23:59 AOE
  • Author notification: April 17, 2023
  • Final version due: May 1, 2023 23:59 AOE
  • Workshop date: May 11, 2023 (tentatively scheduled from 12:00-19:00 UTC)

We invite contributions to the 10th edition (!) of Wiki Workshop, which will take place virtually on May 11, 2023 (tentatively 12:00-19:00 UTC). Wiki Workshop is the largest Wikimedia research event of the year, aimed at bringing together researchers who study all aspects of Wikimedia projects (including, but not limited to, Wikipedia, Wikidata, Wikimedia Commons, Wikisource, and Wiktionary) as well as Wikimedia developers, affiliate organizations, and volunteer editors. Co-organized by the Wikimedia Foundation’s Research team and members of the Wikimedia research community, the workshop facilitates a direct pathway for exchanging ideas between the organizations that serve Wikimedia projects and the researchers actively studying them.

New this year: Building on the successful experiences of organizing Wiki Workshop in 2015, 2016, 2017, 2018, 2019, 2020, 2021, and 2022 and based on feedback from authors and participants over the years, we are introducing a few updates to the research track of the workshop for 2023:

  • This 10th edition will take place as a standalone event (rather than in co-location with a conference, as in previous years).
  • We have changed the format of submissions and will only accept 2-page extended abstracts (following the successful IC2S2 model).
  • Submissions are non-archival, so we welcome ongoing, completed, and already published work. Non-archival means that Wiki Workshop is not a publication venue and that having a paper submitted, accepted, and presented at the workshop does not constitute publication.
  • We are excited to share that the authors of Wiki Workshop 2023 will have the opportunity to receive feedback, improve their work, and submit the extended version of their research paper to a special issue of the ACM Transactions on the Web, which will have a dedicated open call for papers later in 2023.

Topics include, but are not limited to:

  • new technologies and initiatives to grow content, quality, equity, diversity, and participation across Wikimedia projects
  • use of bots, algorithms, and crowdsourcing strategies to curate, source, or verify content and structured data
  • bias in content and gaps of knowledge on Wikimedia projects
  • relation between Wikimedia projects and the broader (open) knowledge ecosystem
  • exploration of what constitutes a source and how/if the incorporation of other kinds of sources is possible (e.g., oral histories, video)
  • detection of low-quality, promotional, or fake content (misinformation or disinformation), as well as fake accounts (e.g., sock puppets)
  • questions related to community health (e.g., sentiment analysis, harassment detection, tools that could increase harmony)
  • motivations, engagement models, incentives, and needs of editors, readers, and/or developers of Wikimedia projects
  • innovative uses of Wikipedia and other Wikimedia projects for AI and NLP applications and vice versa
  • consensus-finding and conflict resolution on editorial issues
  • dynamics of content reuse across projects and the impact of policies and community norms on reuse
  • privacy, security, and trust
  • collaborative content creation
  • innovative uses of Wikimedia projects' content and consumption patterns as sensors for real-world events, culture, etc.
  • open-source research code, datasets, and tools to support research on Wikimedia contents and communities
  • connections between Wikimedia projects and the Semantic Web
  • strategies for how to incorporate Wikimedia projects into media literacy interventions

This year’s Wiki Workshop solicits extended abstracts (PDF format, maximum 2 pages, including references). Submissions that exceed the 2-page limit will be automatically rejected. Authors may include 1 additional page containing only figures and/or tables (including captions). Initial submissions require the names and affiliations of the authors, 5 keywords, a title, an abstract, and a main text outlining the contribution, methods, findings, and impact of the work, as relevant. Submissions will be non-archival and as a result may describe work that is already published, under review, or ongoing.

All submissions will be reviewed by multiple members of the Wiki Workshop Program Committee. The names of the authors will be revealed to the reviewers, whereas reviewers will remain anonymous to authors.

Authors of accepted abstracts will be invited to present their research in a pre-recorded oral presentation with dedicated time for live Q&A on May 11, 2023. Accepted abstracts may be shared on the website prior to the event. The template for abstracts can be found here. Please review our Privacy Statement before submitting your abstract on EasyChair.

  • Pushkal Agarwal (King's College London)
  • Reem Al-Kashif (Ain Shams University)
  • Pablo Aragón (Wikimedia Foundation)
  • Akhil Arora (Ecole Polytechnique Fédérale de Lausanne)
  • Sumit Asthana (University of Michigan)
  • Bunty Avieson (University of Sydney)
  • Alice Battiston (University of Turin)
  • Pablo Beytía (Catholic University of Chile)
  • Amy Bruckman (Georgia Institute of Technology)
  • Hannah Bruckner (NYU Abu Dhabi)
  • Jonathan Chang (Cornell University)
  • Djellel Difallah (NYU Abu Dhabi)
  • Heather Ford (University of Technology Sydney)
  • Patrick Gildersleve (London School of Economics and Political Science)
  • Kristina Gligorić (Stanford University)
  • Isaac Johnson (Wikimedia Foundation)
  • Lucie-Aimée Kaffee (University of Copenhagen)
  • Kiriaki Kalimeri (ISI Foundation)
  • Os Keyes (University of Washington)
  • Isabelle Langrock (University of Pennsylvania)
  • Florian Lemmerich (University of Passau)
  • Włodzimierz Lewoniewski (Poznań University of Economics and Business)
  • Daniele Metilli (University College London)
  • Tiziano Piccardi (Stanford University)
  • Daniele Rama (University of Turin)
  • Miriam Redi (Wikimedia Foundation)
  • Riina Reinsalu (University of Tartu)
  • Diego Saez-Trumper (Wikimedia Foundation)
  • Marija Sakota (Ecole Polytechnique Fédérale de Lausanne)
  • Rossano Schifanella (University of Turin)
  • Indira Sen (GESIS)
  • Claudia Șerbănuță (Făgăraș Research Institute)
  • Francesca Spezzano (Boise State University)
  • Andreas Spitz (University of Konstanz)
  • Krishna Srinivasan (Google)
  • Loren Terveen (University of Minnesota)
  • Michele Tizzoni (University of Trento)
  • Mykola Trokhymovych (University Pompeu Fabra)
  • Houcemeddine Turki (University of Sfax)
  • Nick Vincent (UC Davis)
  • Morten Warncke-Wang (Wikimedia Foundation)
  • Dale Zhou (University of Pennsylvania)

Pablo Aragón (Wikimedia Foundation)
Martin Gerlach (Wikimedia Foundation)
Evelin Heidel (Wikimedistas de Uruguay)
Karen Hernandez (Wikimedia Foundation)
Francesca Tripodi (University of North Carolina)
Robert West (EPFL)
Leila Zia (Wikimedia Foundation)

