“Are Crowdsourcing Platforms Reliable for Video Game-related Research?” A Case Study on Amazon Mechanical Turk

Eisele, L., Apruzzese, G., Annual Symposium on Computer-Human Interaction in Play (CHI PLAY), Work-in-Progress track, 2024
One-liner: Game-related user studies should validate the responses collected via AMT.

Abstract. Video games are becoming increasingly popular in research, and abundant prior work has investigated this domain by means of user studies. However, carrying out user studies whose population encompasses a large and diverse set of participants is challenging. Crowdsourcing platforms, such as Amazon Mechanical Turk (AMT), represent a cost-effective solution to this problem. Yet, prior efforts scrutinizing the quality of data (unrelated to gaming) collected via AMT raise a concern: is AMT reliable for game studies?

In this paper, we are the first to tackle this question. We carry out three user studies (n=302) through which we evaluate the overall validity of the responses—pertaining to 14 popular video games—that we received via AMT. We adopt strict verification mechanisms that are trivial for real gamers to “bypass” but costly for non-gamers. We found that the percentage of valid responses ranges from 5% (for WoW) to 28% (for PUBG). We hence advocate that future research carefully scrutinize the validity of responses collected via AMT.
