How to Use Crowdsourcing Effectively for Social Media & Web Science Research [ICWSM 2016 & Web Science 2016]

Since the term crowdsourcing was coined in 2006, we have witnessed a surge in the adoption of the crowdsourcing paradigm. Crowdsourcing solutions are highly sought after to solve problems that require human intelligence at scale. Over the last decade, crowdsourcing has found numerous applications, in both research and practice, across disciplines from sociology to computer science. In research practice, crowdsourcing has unmistakably broken the barrier between qualitative and quantitative studies by providing a means to scale up previously constrained laboratory studies and controlled experiments. Today, one can easily build ground truths for evaluation and reach potential participants with diverse demographics around the clock, all within an unprecedentedly short amount of time. These benefits, however, come with challenges related to the limited control over research subjects and to data quality.

In this tutorial, we will introduce the crowdsourcing paradigm in its entirety. We will discuss altruistic and reward-based crowdsourcing, covering the needs of task requesters as well as the behavior of crowd workers. The tutorial will focus on paid microtask crowdsourcing, and reflect on the challenges and opportunities that confront us. In an interactive demonstration session, we will walk the audience through the entire lifecycle of creating and deploying microtasks on an established crowdsourcing platform, optimizing task settings to meet task needs, and aggregating the results thereafter. We will present a selection of state-of-the-art methods to ensure high-quality results and inhibit malicious activity. The tutorial will be framed within the context of Web Science. The interdisciplinary nature of Web Science breeds rich ground for crowdsourcing, and we aim to spread the virtues of this growing field.

Tutorial Organizers

Ujwal Gadiraju

Affiliation : L3S Research Center, Leibniz Universität Hannover, Germany
Webpage :
Email :


Dr. Gianluca Demartini

Affiliation : Information School, University of Sheffield, United Kingdom
Webpage :
Email :


Dr. Djellel Eddine Difallah

Affiliation : Exascale Infolab, University of Fribourg, Switzerland
Webpage :
Email :


Dr. Michele Catasta

Affiliation : EPFL, Switzerland
Webpage :
Email :

Tutorial Schedule

I. Introduction to Crowdsourcing

Tutors : Ujwal Gadiraju and Gianluca Demartini
Duration : 45 mins

After a short overview of the tutorial learning objectives and schedule, we will introduce the area of crowdsourcing and discuss the following introductory topics.

1. Worker Participation: Altruistic vs Incentive-based Crowdsourcing
2. Paid Microtask Crowdsourcing: Challenges and Opportunities
3. Crowdsourcing for Web Science Research
4. Transition from the Lab to the Crowd

II. Quality Control Mechanisms

Tutors : Djellel Eddine Difallah and Ujwal Gadiraju
Duration : 45 mins

1. Aspects that Affect the Quality of Results
2. Typical Quality Control Measures
3. Influence of Task Types
4. Optimization
5. Selecting the Crowd We Need
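As a flavor of the typical quality control measures covered in this session, the sketch below shows one widely used technique: screening workers by their accuracy on "gold" questions with known answers. The data and threshold are illustrative assumptions, not taken from any particular platform or study.

```python
# Illustrative sketch: filtering crowd workers by accuracy on gold (test)
# questions -- a typical quality control measure in paid microtask crowdsourcing.
# All names and data here are hypothetical.

def gold_accuracy(answers, gold):
    """Fraction of a worker's answers that match the known gold labels."""
    judged = [q for q in gold if q in answers]
    if not judged:
        return 0.0
    correct = sum(1 for q in judged if answers[q] == gold[q])
    return correct / len(judged)

def filter_workers(worker_answers, gold, threshold=0.8):
    """Keep only workers whose accuracy on gold questions meets the threshold."""
    return {w: a for w, a in worker_answers.items()
            if gold_accuracy(a, gold) >= threshold}

# Two gold questions with known labels, plus one real task question (q3).
gold = {"q1": "cat", "q2": "dog"}
worker_answers = {
    "worker_a": {"q1": "cat", "q2": "dog", "q3": "bird"},  # 2/2 on gold
    "worker_b": {"q1": "cat", "q2": "cat", "q3": "fish"},  # 1/2 on gold
}
reliable = filter_workers(worker_answers, gold, threshold=0.8)
print(sorted(reliable))  # ['worker_a'] -- worker_b falls below the threshold
```

In practice the threshold trades off data quality against cost, since rejecting workers means paying for more redundant judgments; the session discusses how task type influences this choice.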

III. Crowdsourcing: The A to Z of How

Tutors : Gianluca Demartini and Michele Catasta
Duration : 45 mins

1. Leveraging the Workforce
2. Aggregating Results
3. Considerations
4. Lifecycle of a Crowdsourcing Experiment
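To give a concrete sense of the result-aggregation step in this session, here is a minimal sketch of majority voting, the most common baseline for combining redundant crowd judgments into one label per item. The item names and labels are hypothetical.

```python
# Illustrative sketch: aggregating redundant crowd judgments by majority vote.
from collections import Counter

def majority_vote(judgments):
    """Pick the most frequent label among the judgments for one item."""
    return Counter(judgments).most_common(1)[0][0]

def aggregate(item_judgments):
    """Reduce each item's list of crowd labels to a single aggregated label."""
    return {item: majority_vote(labels)
            for item, labels in item_judgments.items()}

# Hypothetical example: three workers labeled each document's relevance.
item_judgments = {
    "doc1": ["relevant", "relevant", "not_relevant"],
    "doc2": ["not_relevant", "not_relevant", "relevant"],
}
print(aggregate(item_judgments))
# {'doc1': 'relevant', 'doc2': 'not_relevant'}
```

Majority voting treats all workers as equally reliable; the session also covers when that assumption breaks down and more refined aggregation is needed.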

IV. Crowdsourcing Opportunities in Web Science Research

Tutors : Michele Catasta and Gianluca Demartini
Duration : 90 mins

1. Hybrid Human-Machine Systems
2. A Holistic Human-Computation Perspective
3. Future of Crowdsourcing in Web Science