Wiki Workshop 2019

May 14, 2019: Workshop happened -- amazing!
Apr. 29, 2019: PDFs of accepted papers available.
Apr. 17, 2019: List of accepted papers announced.
Apr. 4, 2019: Workshop schedule announced.
Feb. 6, 2019: Denny Vrandečić confirmed as invited speaker.
Feb. 6, 2019: Erica Kochi confirmed as invited speaker.
Jan. 23, 2019: Jure Leskovec confirmed as invited speaker.
Jan. 23, 2019: Neil Thompson confirmed as invited speaker.
Jan. 23, 2019: Timnit Gebru confirmed as invited speaker.
Jan. 20, 2019: Workshop date announced: Tuesday, May 14, 2019.
Dec. 6, 2018: Wiki Workshop 2019 webpage online.

9:00 - 9:20	Welcome and icebreaking
9:20 - 10:05	Invited talk: Denny Vrandečić
10:05 - 10:17	Paper presentation: Mohsen Sayyadiharikandeh, Jonathan Gordon, Jose-Luis Ambite and Kristina Lerman: Finding Prerequisite Relations Using the Wikipedia Clickstream
10:17 - 10:30	Paper presentation: Xiaoxi Chelsy Xie, Isaac Johnson and Anne Gomez: Detecting and Gauging Impact on Wikipedia Page Views
10:30 - 11:00	Coffee break
11:00 - 11:45	Invited talk: Timnit Gebru
11:45 - 12:25	Poster spotlight presentations
12:25 - 12:30	Poster setup
12:30 - 14:00	Lunch and poster session
14:00 - 14:12	Paper presentation: Swati Goel, Ashton Anderson and Leila Zia: Thanks for Stopping By: A Study of “Thanks” Usage on Wikimedia
14:12 - 14:25	Paper presentation: Ali Javanmardi and Lu Xiao: What’s in the Content of Wikipedia’s Article for Deletion Discussions? Towards a Visual Analytic Approach
14:25 - 15:10	Invited talk: Erica Kochi
15:10 - 15:30	Open discussion
15:30 - 16:00	Coffee break
16:00 - 16:40	Invited talk: Jure Leskovec
16:40 - 17:25	Invited talk: Neil Thompson
17:25 - 17:30	Closing remarks

Denny Vrandečić (Google)

Beyond Wikidata

Wikidata has quickly become a major wiki project. By becoming so, it has stretched what wikis can be successfully used for. We will take a look at the state of Wikidata and how it can help the Wikipedias (and other projects), and we will discuss whether the wiki approach can be taken further to even more complex projects, such as an Abstract Wikipedia.

Bio

Denny works at the Google Knowledge Graph. He previously worked at the Karlsruhe Institute of Technology (2004-2012), the University of Southern California (2010), and as the project director of Wikidata at Wikimedia Deutschland (2012/13). His research interests are massive collaborative systems, knowledge bases, and the Semantic Web.

Erica Kochi (UNICEF)

Bio

Erica co-founded and co-leads UNICEF’s Innovation Unit, a group tasked with identifying, prototyping and scaling technologies and practices that improve UNICEF’s work on the ground. Erica also serves as Innovation Advisor to UNICEF’s Executive Director. Erica co-taught ‘Design for UNICEF’ at NYU’s ITP and has lectured at the Yale School of Management, Harvard University, The Art Center, Stanford University School of Engineering, and Columbia School of International and Public Affairs on technology, innovation, design, and international development.

Jure Leskovec (Stanford University)

Making Wikipedia Safer

Bio

Jure is an associate professor of Computer Science at Stanford University. His research focuses on mining and modeling large social and information networks, their evolution, and the diffusion of information and influence over them. The problems he investigates are motivated by large-scale data, the Web, and online media.

Timnit Gebru (Google)

Understanding the Limitations of AI: When Algorithms Fail

Automated decision-making tools are currently used in high-stakes scenarios. From natural language processing tools used to automatically determine one’s suitability for a job, to health diagnostic systems trained to determine a patient’s outcome, machine learning models are used to make decisions that can have serious consequences for people’s lives. In spite of the consequential nature of these use cases, vendors of such models are not required to perform specific tests showing the suitability of their models for a given task. Nor are they required to provide documentation describing the characteristics of their models, or disclose the results of algorithmic audits to ensure that certain groups are not unfairly treated. I will show some examples to examine the dire consequences of basing decisions entirely on machine-learning-based systems, and discuss recent work on auditing and exposing the gender and skin tone bias found in commercial gender classification systems. I will end with the concept of an AI datasheet to standardize information for datasets and pre-trained models, in order to push the field as a whole towards transparency and accountability.

Bio

Timnit is a research scientist in the Ethical AI team at Google. Prior to that, she was a postdoc at Microsoft Research, New York, and a PhD student in the Stanford Artificial Intelligence Laboratory. She is currently studying the ethical considerations underlying any data mining project, and methods of auditing and mitigating bias in sociotechnical systems. The New York Times, MIT Tech Review and others have recently covered her work. As a cofounder of the group Black in AI, she works to both increase diversity in the field and reduce the negative impacts of racial bias in training data used for human-centric machine learning models.

Neil Thompson (MIT)

Science is Shaped by Wikipedia: Evidence from a Randomized Control Trial

“I sometimes think that general and popular treatises are almost as important for the progress of science as original work.” — Charles Darwin
As the largest encyclopedia in the world, it is not surprising that Wikipedia reflects the state of scientific knowledge. However, Wikipedia is also one of the most accessed websites in the world, including by scientists, which suggests that it also has the potential to shape science. This paper shows that it does. Incorporating ideas into Wikipedia leads to those ideas being used more in the scientific literature. We provide correlational evidence of this across thousands of Wikipedia articles and causal evidence of it through a randomized control trial where we add new scientific content to Wikipedia. In the months after uploading it, an average new Wikipedia article on Chemistry is read tens of thousands of times and causes changes to hundreds of related scientific journal articles. Adding references to Wikipedia also has an effect, causing important scientific articles to get more citations. Our findings speak not only to the influence of Wikipedia, but more broadly to the influence of repositories of knowledge and the role that they play in Science.

Bio

Neil is a Research Scientist at MIT’s Computer Science and Artificial Intelligence Lab and a Visiting Professor at the Lab for Innovation Science at Harvard. He is also an Associate Member of the Broad Institute, and was previously an Assistant Professor of Innovation and Strategy at the MIT Sloan School of Management, where he co-directed the Experimental Innovation Lab (X-Lab). Neil did his PhD in Business and Public Policy at Berkeley. Prior to academia, he worked at organizations such as Lawrence Livermore National Laboratories, Bain and Company, the United Nations, the World Bank, and the Canadian Parliament.

Preeti Bhargava, Nemanja Spasojevic, Sarah Ellinger, Adithya Rao, Abhinand Menon, Saul Fuhrmann and Guoning Hu

Learning to Map Wikidata Entities to Predefined Topics [PDF]

Ali Javanmardi and Lu Xiao

What’s in the Content of Wikipedia’s Article for Deletion Discussions? Towards a Visual Analytic Approach [PDF]

Xiaoxi Chelsy Xie, Isaac Johnson and Anne Gomez

Detecting and Gauging Impact on Wikipedia Page Views [PDF]

Shaunak Mishra, Aasish Pappu and Narayan Bhamidipati

Inferring Advertiser Sentiment in Online Articles using Wikipedia Footnotes [PDF]

Mohsen Sayyadiharikandeh, Jonathan Gordon, Jose-Luis Ambite and Kristina Lerman

Finding Prerequisite Relations Using the Wikipedia Clickstream [PDF]

James Ashford, Liam Turner, Roger Whitaker, Alun Preece, Diane Felmlee and Don Towsley

Understanding the Signature of Controversial Wikipedia Articles through Motifs in Editor Revision Networks [PDF]

Chuankai An and Daniel Rockmore

Open Personalized Navigation on the Sandbox of Wiki Pages [PDF]

Swati Goel, Ashton Anderson and Leila Zia

Thanks for Stopping By: A Study of “Thanks” Usage on Wikimedia [PDF]

Nicolas Aspert, Volodymyr Miz, Benjamin Ricaud and Pierre Vandergheynst

A Graph-Structured Dataset for Wikipedia Research [PDF]

Gil Domingues and Carla Teixeira Lopes

Characterizing and Comparing Portuguese and English Wikipedia Medicine-Related Articles

Chander Iyer and Srinath Ravindran

Understanding Travel from Web Queries Using Domain Knowledge from Wikipedia [PDF]

Khonzodakhon Umarova and Eni Mustafaraj

How Partisanship and Perceived Political Bias Affect Wikipedia Entries of News Sources [PDF]

Charlotte Rudnik, Thibault Ehrhart, Olivier Ferret, Denis Teyssou, Raphaël Troncy and Xavier Tannier

Searching News Articles Using an Event Knowledge Graph Leveraged by Wikidata [PDF]

Iris Qu, Nithum Thain and Yiqing Hua

WikiDetox Visualization [PDF]

Olga Slivko

Online “Brain Gain”: Do Immigrants Return Knowledge Home? [PDF]

Lei Zheng, Christopher M. Albano and Jeffrey V. Nickerson

Steps toward Understanding the Design and Evaluation Spaces of Bot and Human Knowledge Production Systems [PDF]

Cristian Consonni, David Laniado and Alberto Montresor

Discovering Topical Contexts from Links in Wikipedia

Workshop date: Tuesday, May 14, 2019

If authors want their paper to appear in the proceedings:

Submission deadline: January 31, 2019
Author feedback: February 21, 2019
Camera-ready version due: March 3, 2019

If authors do not want their paper to appear in the proceedings:

Submission deadline: March 14, 2019
Author feedback: March 28, 2019

Note: If you need a visa to travel to the U.S. and your visa application depends on your workshop paper being accepted, we advise you to submit your paper by the January 31 deadline.

Wikipedia is one of the most popular sites on the Web, a main source of knowledge for a large fraction of Internet users, and one of the very few projects that make not only their content but also many activity logs available to the public. Furthermore, other Wikimedia projects, such as Wikidata and Wikimedia Commons, have been created to share other types of knowledge with the world for free. For a variety of reasons (quality and quantity of content, reach in many languages, process of content production, availability of data, etc.) such projects have become important objects of study for researchers across many subfields of the computational and social sciences, such as social network analysis, artificial intelligence, linguistics, natural language processing, social psychology, education, anthropology, political science, human–computer interaction, and cognitive science.

The goal of this workshop is to bring together researchers exploring all aspects of Wikimedia projects such as Wikipedia, Wikidata, and Commons. With members of the Wikimedia Foundation's Research team on the organizing committee and with the experience of successful workshops in 2015, 2016, 2017, and 2018, we aim to continue facilitating a direct pathway for exchanging ideas between the organization that coordinates Wikimedia projects and the researchers interested in studying them.

Topics of interest include, but are not limited to:

new technologies and initiatives to grow content, quality, diversity, and participation across Wikimedia projects
use of bots, algorithms, and crowdsourcing strategies to curate, source, or verify content and structured data
bias in content and gaps of knowledge
diversity of Wikimedia editors and users
detection of low-quality, promotional, or fake content, as well as fake accounts (e.g., sock puppets)
questions related to community health (e.g., sentiment analysis, harassment detection)
understanding editor motivations, engagement models, and incentives
Wikimedia consumer motivations and their needs: readers, researchers, tool/API developers
innovative uses of Wikipedia and other Wikimedia projects for AI and NLP applications
consensus-finding and conflict resolution on editorial issues
participation in discussions and their dynamics
dynamics of content reuse across projects and the impact of policies and community norms on reuse
privacy
collaborative content creation (unstructured, semi-structured, or structured)
innovative uses of Wikimedia projects' content and consumption patterns as sensors for real-world events, culture, etc.
open-source research code, datasets, and tools to support research on Wikimedia contents and communities

Papers should be 1 to 8 pages long and will be published on the workshop webpage and optionally (depending on the authors' choice) in the workshop proceedings. The review process will be single-blind (as opposed to double-blind), i.e., authors should include their names and affiliations in their submissions. Authors whose papers are accepted to the workshop will have the opportunity to participate in a poster session.

We explicitly encourage the submission of preliminary work in the form of extended abstracts (1 or 2 pages).

Papers should be 1 to 8 pages long. We explicitly encourage the submission of preliminary work in the form of extended abstracts (1 or 2 pages). No need to anonymize your submissions.

For submission dates, see above.

Format: ACM SIG conference proceedings template (use sample-sigconf.pdf as the template)
Submission site: https://easychair.org/conferences/?conf=wikiworkshop2019

Michele Catasta, Stanford University
Lucas Dixon, Jigsaw
Besnik Fetahu, L3S Hannover
Andrea Forte, Drexel University
Gary Hsieh, University of Washington
Yiqing Hua, Cornell University
Isaac Johnson, Wikimedia Foundation
Os Keyes, University of Washington
Markus Kroetzsch, University of Dresden
Florian Lemmerich, RWTH Aachen University
Lauren Maggio, Uniformed Services University
David McDonald, University of Washington
Jonathan Morgan, Wikimedia Foundation
André Panisson, ISI Foundation
Daniela Paolotti, ISI Foundation
Tiziano Piccardi, EPFL
Dario Rossi, Huawei
Diego Saez-Trumper, Wikimedia Foundation
Markus Strohmaier, RWTH Aachen University
Nithum Thain, Jigsaw
Michele Tizzoni, ISI Foundation
Morten Warncke-Wang, Wikimedia Foundation
Joe Wass, Crossref
Ramtin Yazdanian, EPFL
Amy Zhang, MIT

Robert West

Bob is an assistant professor of Computer Science at EPFL, where he heads the Data Science Lab. His research aims to understand, predict, and enhance human behavior in social and information networks by developing techniques in data science, data mining, network analysis, machine learning, and natural language processing. He holds a PhD in computer science from Stanford University.

Miriam Redi

Miriam is a Research Scientist at the Wikimedia Foundation and Visiting Research Fellow at King's College London. Formerly, she worked as a Research Scientist at Yahoo! Labs in Barcelona and Nokia Bell Labs in Cambridge. She received her PhD from EURECOM, Sophia Antipolis. She conducts research in social multimedia computing, working on fair, interpretable, multimodal machine learning solutions to improve knowledge equity.

Dario Taraborelli

Dario is a social computing researcher and the Wikimedia Foundation's Head of Research. His current interests focus on online collaboration, open science, and the measurement and discoverability of scientific knowledge. He holds a PhD in cognitive science from the École des Hautes Études en Sciences Sociales.

Please direct your questions to wikiworkshop@googlegroups.com.
