📊 The Data Quality Test - Paper Presentation

Data Quality is integral to what you do. But how can you be sure you’ve made the right choice of data-collection platform?

So, we decided to put them to the test! We compared MTurk, Prolific, and panel companies on metrics related to attention, comprehension, honesty, and reliability. One platform outperformed the rest by an average margin of 40%.

Guess which one? :wink:

Join our host David Rothschild, PhD, as we present the findings of our paper. Pre-print here.

This webinar will be streamed directly to the forum below. Scroll down for the video! :slight_smile:


If you’ve got questions, post them below, and we’ll answer them after the talk!


Presentation Stream


Hey David, great talk! The floor is now open for questions. We’ve had some submitted already, so I’ll post those in a bit.

Here is a question we’ve had submitted. Let us know what you think:

How do you mitigate conflict of interest here?

We view this entire project as a kick-off point for discussion, innovation, and replication! We are excited by the feedback we have already received, and by conversations with other teams who are considering new projects to push the work forward. And we, along with another team, have already started replicating the work under similar, but different, conditions.

If we had to put Comprehension, Honesty, Accuracy, and Reliability in a ranked order of importance for data quality, would that be possible? And if so, in what order?


@Ekaterina_Damer I saw your Twitter exchange with the CEO of CloudResearch - may I ask what Prolific’s response is to their reply to your paper?


Hey Steve, we should be able to check the % of researchers who rated each aspect as being important from our initial survey. From memory, I believe Attention was most commonly rated as important, followed by Comprehension, then Reliability, then Honesty. But I can double-check this!


My memory has not served me well here! Looking at the % of researchers who rated each aspect as important, the actual results were Comprehension (87%), Attention (84%), Honesty (83%), and Reliability (72%).


I have a few questions:

  • Do you think including questions that monitor attention should become standard practice for market researchers conducting online surveys?
  • Is there a threshold for survey length beyond which you would consider it essential to include measures of attention, e.g., 5 minutes or 10 minutes?
  • Since the measure of comprehension required interpretation of open-ended responses, it feels like it would take a lot of time and be incompatible with projects requiring a quick turnaround. Is there an alternative approach that measures comprehension with closed-ended questions?

Hi Alex,

I believe in their preprint/blog post CloudResearch are asking us to apply their data quality filters to enable a “fair comparison”.

However, we have no ability to do this for the other platforms we compare. Hence we would be comparing an extremely polished version of CloudResearch with the most basic versions of MTurk, Prolific, Qualtrics Panels, and Dynata. This doesn’t seem sensible and probably wouldn’t pass academic peer review.

Our goal was to compare the underlying participant pools of the different platforms, so that’s what we’ve done. We didn’t know that turning off the CloudResearch defaults meant that we were more or less sampling from MTurk itself. We now know that, and we are planning a follow-up study to address CloudResearch’s concerns.

Hope this helps? :slightly_smiling_face:


Hi @louis.stevens thanks for the questions!

  1. I think that anyone conducting online research will generally benefit from including an attention check measure, so I believe this should become standard practice. However, there are certainly potential drawbacks - for instance, if attention checks become standard and closely mirror each other, then participants will quickly become aware of them and learn to pass them (take a look at this paper for an overview of this issue).

  2. There is no generally suggested threshold that I am aware of. Personally, I would advise that any online research include this type of quality measure regardless of length.

  3. There are a few other options. You could provide participants with a range of options that summarize the instructions and ask them to select the accurate one. However, this measure will likely be confounded by the fact that you are giving the participant a reminder of the content (i.e., they may not remember what they were asked to do until they see it among the answer options). Another approach is to ask participants about a very specific part of the task and allow only a single word or number as the answer - for example, if a question asks the participant to look at 5 images, ask them “how many images do you need to look at?” This makes the results much easier to quantify.
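The single-word-or-number variant in point 3 also lends itself to automatic scoring. Here is a minimal sketch of what that could look like; the question wording, expected answer, and word-to-digit mapping are illustrative assumptions, not anything from the study itself:

```python
# Hypothetical scorer for a closed-ended comprehension check where the
# task instructed participants to look at 5 images and the check asks
# "How many images do you need to look at?"

def normalize(answer: str) -> str:
    """Lowercase, trim whitespace, and map common word answers to digits."""
    word_forms = {"five": "5"}  # extend as needed for your question
    token = answer.strip().lower()
    return word_forms.get(token, token)

def passes_check(response: str, expected: str = "5") -> bool:
    """True if the participant's single-token answer matches the expected one."""
    return normalize(response) == expected

# Example responses as they might arrive from a survey export
responses = ["5", " Five ", "3", "5 images"]
passed = [r for r in responses if passes_check(r)]
print(passed)  # only the exact single-token matches survive
```

Because the scoring is exact-match on a normalized token, multi-word answers like “5 images” fail by design; whether to accept them is a judgment call you would make before data collection.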

I hope this helps?


Really helpful, thanks Andrew.


Thanks for the answer. Yes, that makes sense to me. I’m looking forward to the findings of the follow-up study.
