Press/Studies Are you participating in "Crowdturfing"?

Serial Mom

I’m smiling…that alone should scare you.
Contributor
Joined
Aug 18, 2016
Messages
4,228
Reaction score
9,451
Points
988
Gender
Female
With my morning coffee I stumbled upon this article today:

WPI Research Detects When Online Reviews and News Are a Paid-for Pack of Lies

After reading it, I have a few thoughts/questions.....

“My goal is to reveal a whole ecosystem of crowdturfing. Who are the workers performing these tasks? What websites are they targeting? What are they falsely promoting?”
Are they looking to identify turkers who complete these tasks? If so, how do you think they would do that?

The algorithm can identify the malicious organizations posting the tasks, the websites the crowdturfers are told to target, and even the workers who are signing up to complete the tasks.

This also makes me wonder about what they intend to do with the info they gather in identifying workers. Will this lead to a worker ID blacklist?

Lee added that he expects to make his algorithms openly available to companies and organizations, which could tailor them to their specific needs. “I expect to share the data set so people can come up with a better algorithm, adapted for their specific organization,” he said.

So does this mean he intends to share the data he collects about workers who perform these types of tasks?

Per the Acceptable Use Policy examples of the prohibited activities are:

  • Collecting personally identifiable information (e.g., don’t ask Workers for their email address or phone number)
  • Using the MTurk website to try to generate "referred" site visits or click-through traffic
  • Disrupting, manipulating, or degrading the operation of any website, product, or service
  • Phishing, spamming, or pharming
  • Unsolicited contacting of users or third parties, or other abusive behavior
  • Advertising or marketing activities, including HITs requiring registration at another website or group
  • HITs intended to promote a site, service or opinion (e.g., don't ask Workers to write fake news or reviews)
  • Infringing or misappropriating the rights of others, including HITs that require violating the terms of any website, product, or service
  • Posting or transmitting any content that is illegal, fraudulent or otherwise objectionable, including content that constitutes child pornography, relates to bestiality, or depicts non-consensual sex acts
  • Disrupting, manipulating, or impairing the operation of the MTurk website (e.g., don’t try to "game" search results for HITs, or knowingly publish HITs that Workers will be required to return after accepting them)
  • Scraping data or content from the MTurk website (except in connection with permitted Worker scripts and automation tools, see FAQ above)
  • Creating a security risk for MTurk, any MTurk user, or any third party, including posting HITs that require downloading software that contains harmful content
  • Using bots, scripts, or other automated methods to complete HITs
  • Performing or requesting HITs through venues other than the MTurk website unless permitted via MTurk templates or applications that we may make available to you
  • Performing your own HITs in the marketplace (however, it's ok to perform a small portion of your own HITs for testing purposes)
  • Posting HITs on behalf of third parties without our prior written consent
  • Posting a HIT that may contain adult content without using the Adult Content Qualification

That being said, my opinion is that it's both wrong and against MTurk policies to post those HITs, as well as to complete them. I also wonder if this researcher's work is actually violating this policy:
Scraping data or content from the MTurk website (except in connection with permitted Worker scripts and automation tools, see FAQ above)

Because the article stated:

Using machine learning and predictive modeling, Lee builds algorithms that sift through the posted tasks looking for patterns that his research has shown are associated with these illegitimate tasks: for example, higher hourly wages or jobs that involve manipulating or posting information on particular websites or clicking on certain kinds of links.
Does this mean he is scraping data/content from the MTurk website?


What are your thoughts my fellow turkers?
 

Cara

ad space now available
Contributor
Joined
Jan 12, 2016
Messages
6,685
Reaction score
12,921
Points
1,163
Location
Mass.
Gender
Female
that college is where i live ;D

i guess we should just be vigilant and report HITs that are suspicious.
 
Reactions: Serial Mom

Serial Mom

that college is where i live ;D

i guess we should just be vigilant and report HITs that are suspicious.
Yes we should. I've reported over a dozen or so just this week alone.
 

Achilles2357

Active Member
Joined
May 24, 2017
Messages
966
Reaction score
1,459
Points
393
Gender
Male
I have a few thoughts:
1) This guy is using "higher hourly wages" as one sign that a task involves "crowdturfing". That seems utterly wrong to me, at least when it comes to mturk. It seems absurdly wrong. I haven't seen all that many of these tasks, but they don't seem to pay that much, especially for the work involved.
2) The WPI article says: "In Lee’s research, the algorithms have detected fake likes with 90 percent accuracy and fake followers with 99 percent accuracy." First question: Does this mean that when the algorithm calls something a fake like, it has a 90% chance of actually being a fake like? Or does it mean that the algorithm calls 90% of actual fake likes fake likes? These are two quite different issues. And how does he know what the actual fake likes and fake followers are? Obviously he doesn't. It seems the only way he could really know if something was a fake like or a fake follower (apart from actual research by others) would be if he created the fakes himself. So maybe he is himself posting HITs on mturk to create fake likes, etc., and then having his algorithm scrape mturk. He might then be claiming that his algorithm is correctly identifying 90% of the fake likes and 99% of the fake followers that he himself created via his HITs. Of course this is speculation, but it seems to violate mturk policies in multiple ways, and it violates IRB policies too, if workers unwittingly do his "research" HITs. But, really, if he didn't create the fake likes, etc., how does he know they are fake?
3) It sounds like he is obviously scraping mturk for his own research purposes. But I wonder if mturk policies on this can be enforced. If he has an mturk account, then they probably can. But if he doesn't, and is not overwhelming the mturk server, then he could fight Amazon on First Amendment grounds, saying that this is publicly available info and he has the right to analyze it and publish the analyses.
4) So far as I can tell, only requesters have access to the info on which workers do which HITs. If he is claiming to be able to determine this, this suggests he is a requester. Or it suggests some major technical flaw in mturk. The bigger problem for workers is that your mturk worker ID is your Amazon ID for all Amazon purposes: customer, seller, affiliate, etc. So if this guy is claiming to identify workers who do bad HITs, then he is possibly exposing them on many levels. I don't know how readily viewable the ID is on other parts of Amazon today. Maybe they are wising up and hiding that, or maybe not.
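To make the distinction in point 2 concrete, here is a rough sketch with made-up numbers. Nothing below comes from the WPI article or from Lee's data; the counts and the function name `confusion_metrics` are purely hypothetical. It just shows that "90 percent accuracy", "90% of flagged likes are really fake" (precision), and "90% of fake likes get flagged" (recall) can be very different things for the same classifier.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp: fake likes flagged as fake      fp: real likes flagged as fake
    fn: fake likes missed               tn: real likes left alone
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total   # share of all calls that were right
    precision = tp / (tp + fp)     # if it says "fake", how often is it fake?
    recall = tp / (tp + fn)        # of the truly fake ones, how many are caught?
    return accuracy, precision, recall


# Hypothetical sample: 1,000 likes, of which 100 are fake (assumed numbers).
# The classifier catches 90 of the 100 fakes but also flags 90 of the 900 real likes.
acc, prec, rec = confusion_metrics(tp=90, fp=90, fn=10, tn=810)

print(f"accuracy:  {acc:.0%}")
print(f"precision: {prec:.0%}")
print(f"recall:    {rec:.0%}")
```

With these assumed counts, accuracy and recall both come out at 90%, yet only half of the likes the classifier flags are actually fake. So a single "90 percent accuracy" figure does not say how often a flagged worker or like is a false alarm, and any of these numbers still depends on having trustworthy ground-truth labels in the first place.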
 

<Gucci>

Francis Manancis ...fuck you...
Threaderator
Crowd Pleaser
HIT Poster
Joined
Feb 6, 2016
Messages
13,614
Reaction score
29,767
Points
1,739
Age
39
Location
Detroit
Gender
Female
I like the shit out of just about everything...
I don’t run any automation tools or scripts or anything. I actually read all the posts on the forum and on Facebook that I do like....
I’m not a bot and I don’t consider myself malicious. I like to post gifs in general and like stuff and that’s just my personality.
I get captchas from google on my cell phone because I search multiple topics too quickly.
Mind you I use safari with zero scripts.
Even google doesn’t like how I search and like things. Their own algorithm detects me as a bot. I think it might be worth mentioning to this researcher. This guy might have things all wrong. The algorithms may be much less reliable than he thinks.
I think I might contact the researcher.