Mturk Submission is too brittle; fails too often; relies on deprecated spec #436

Open
jacob-lee opened this issue Oct 5, 2020 · 6 comments

Comments

@jacob-lee
Contributor

jacob-lee commented Oct 5, 2020

The current method used to submit to MTurk is unreliable for reasons largely outside of our control.

We use a line of JavaScript embedded in the HTML to force the ad page to reload so that a submit button appears. When the user clicks that button, the page submits a simple POST through mturk's External HIT API.

  • For reasons that are not entirely clear, requests to this API fail a not-insignificant fraction of the time
  • Failed requests cannot be retried
  • The JavaScript that tries to force the ad page to reload, window.opener.location.reload(true);, passes a flag that is deprecated (not part of the spec), and some browsers no longer support it (see: https://stackoverflow.com/questions/55127650/location-reloadtrue-is-deprecated)

I want to propose switching over to Mturk's Internal HIT API, in the manner of most Qualtrics HITs on the mturk site: namely, providing a textbox on the mturk site where subjects enter a completion code that we give them (e.g. generated by python's uuid.uuid4() on the server).
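A minimal sketch of the completion-code side of this proposal, assuming the server stores the code it issued and later compares it with what the worker typed into the HIT. The helper names `generate_completion_code` and `codes_match` are hypothetical and not part of psiTurk.

```python
import hmac
import uuid


def generate_completion_code() -> str:
    """Return a random 32-character hex completion code (hypothetical helper)."""
    return uuid.uuid4().hex


def codes_match(expected: str, entered: str) -> bool:
    """Compare two codes, ignoring surrounding whitespace and letter case."""
    normalized_expected = expected.strip().lower().encode("utf-8")
    normalized_entered = entered.strip().lower().encode("utf-8")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(normalized_expected, normalized_entered)
```

Normalizing whitespace and case before comparing is a design choice meant to forgive copy-and-paste slips; a stricter comparison would also be reasonable.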

This would require several components:

1. Construction of the Internal HIT.

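    import boto3

    # `args` is assumed to come from an argparse parser that is not shown here.
    # Region and credentials are read from the local AWS configuration.
    client = boto3.client('mturk')

    with open(args.question_xml, mode='r') as fh:
        questions = fh.read()
    hit = client.create_hit(
        MaxAssignments=args.num_assignments,
        AutoApprovalDelayInSeconds=250000,  # a little under three days
        LifetimeInSeconds=args.lifetime,
        AssignmentDurationInSeconds=args.assignment_duration,
        Reward=args.reward,  # boto3 expects a string, e.g. '1.50'
        Title=args.title,
        Keywords=args.keywords,
        Description=args.description,
        Question=questions,
        UniqueRequestToken=args.unique_request_token,
    )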

For this, we would have a question.xml file; there exists a standardized XML schema for this, e.g.

<QuestionForm xmlns='http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd'>  
  <Question>
    <QuestionIdentifier>compensation_code</QuestionIdentifier>
    <DisplayName>Compensation Code</DisplayName>
    <IsRequired>true</IsRequired>
    <QuestionContent>
      <Text>Enter in the compensation code we sent you.</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <Constraints>
          <Length minLength="6" maxLength="6"/>
        </Constraints>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
</QuestionForm>

(note: a FreeTextAnswer probably isn't what we are looking for, but it is what I had on hand to use as an example)
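Instead of keeping a static question.xml file, the same QuestionForm could be generated on the server. The following is only a sketch under that assumption: `build_question_form` and its parameters are hypothetical names, and the element layout simply mirrors the example above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

QUESTION_FORM_NS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd"
)


def build_question_form(identifier, display_name, prompt, code_length):
    """Return QuestionForm XML for one required free-text question."""
    xml = (
        f'<QuestionForm xmlns="{QUESTION_FORM_NS}">'
        "<Question>"
        f"<QuestionIdentifier>{escape(identifier)}</QuestionIdentifier>"
        f"<DisplayName>{escape(display_name)}</DisplayName>"
        "<IsRequired>true</IsRequired>"
        f"<QuestionContent><Text>{escape(prompt)}</Text></QuestionContent>"
        "<AnswerSpecification><FreeTextAnswer><Constraints>"
        f'<Length minLength="{int(code_length)}" maxLength="{int(code_length)}"/>'
        "</Constraints></FreeTextAnswer></AnswerSpecification>"
        "</Question>"
        "</QuestionForm>"
    )
    ET.fromstring(xml)  # raises ParseError if the result is not well-formed
    return xml
```

The returned string would be passed as the Question argument of create_hit. If codes come from uuid.uuid4().hex, code_length would need to be 32 rather than the 6 used in the example.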

2. Getting submitted completion codes

Mturk will return an XML string with the information we want; we just have to parse it. For the above example, mturk would return the following as part of its JSON response:

'Answer': '<?xml version="1.0" encoding="UTF-8" '
                            'standalone="no"?>\n'
                            '<QuestionFormAnswers '
                            'xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionFormAnswers.xsd">\n'
                            '<Answer>\n'
                            '<QuestionIdentifier>compensation_code</QuestionIdentifier>\n'
                            '<FreeText>abcdef</FreeText>\n'
                            '</Answer>\n'
                            '</QuestionFormAnswers>\n'

which we could retrieve like this:

import xml.etree.ElementTree as ET

response = client.list_assignments_for_hit(
    HITId=args.hitid,
    MaxResults=100,
    AssignmentStatuses=['Submitted'],
)
# Only the first submitted assignment is inspected in this example.
xmlstr = response['Assignments'][0]['Answer']
root = ET.fromstring(xmlstr)
entered = root.find(
    './/{http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionFormAnswers.xsd}FreeText'
).text
print(f'participant entered code is: {entered}')

(note: we'd probably want to use the defusedxml library, which is less vulnerable to XML exploits).
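To go beyond the first assignment, the retrieval could loop over every page of results and key the entered codes by assignment. This is a sketch, not tested against live mturk: `parse_free_text_answers` and `collect_submitted_codes` are hypothetical helpers, `client` is assumed to be a boto3 mturk client, and the standard-library parser is used here only to keep the example dependency-free (defusedxml would be the safer choice for real worker input).

```python
import xml.etree.ElementTree as ET

ANSWERS_NS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2005-10-01/QuestionFormAnswers.xsd"
)


def parse_free_text_answers(answer_xml):
    """Map each QuestionIdentifier to the FreeText the worker entered."""
    ns = {"a": ANSWERS_NS}
    root = ET.fromstring(answer_xml)
    answers = {}
    for answer in root.findall("a:Answer", ns):
        identifier = answer.findtext("a:QuestionIdentifier", default="", namespaces=ns)
        free_text = answer.findtext("a:FreeText", default="", namespaces=ns)
        answers[identifier] = free_text.strip()
    return answers


def collect_submitted_codes(client, hit_id, question_id="compensation_code"):
    """Return {AssignmentId: entered code} for all submitted assignments."""
    codes = {}
    kwargs = {
        "HITId": hit_id,
        "MaxResults": 100,
        "AssignmentStatuses": ["Submitted"],
    }
    while True:
        response = client.list_assignments_for_hit(**kwargs)
        for assignment in response.get("Assignments", []):
            answers = parse_free_text_answers(assignment["Answer"])
            codes[assignment["AssignmentId"]] = answers.get(question_id, "")
        token = response.get("NextToken")
        if not token:
            return codes
        # Ask for the next page on the following iteration.
        kwargs["NextToken"] = token
```

Each returned code could then be compared against the code the server issued for that assignment.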

So, I'm thinking that we would have the following statuses:

UnconfirmedSubmit
ConfirmedSubmit
DisconfirmedSubmit (they gave the wrong completion code; we would want a manual override of this in case of user error)
Approved
Bonused
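These statuses could be modeled roughly as follows. This is a sketch: the enum values come from the list above, while the `check_submission` helper and its `manual_override` parameter are hypothetical illustrations of the override mentioned for DisconfirmedSubmit.

```python
from enum import Enum


class SubmitStatus(Enum):
    UNCONFIRMED_SUBMIT = "UnconfirmedSubmit"
    CONFIRMED_SUBMIT = "ConfirmedSubmit"
    DISCONFIRMED_SUBMIT = "DisconfirmedSubmit"
    APPROVED = "Approved"
    BONUSED = "Bonused"


def check_submission(expected_code, entered_code, manual_override=False):
    """Decide the status of a submission once mturk reports the entered code."""
    codes_agree = expected_code.strip().lower() == entered_code.strip().lower()
    if codes_agree or manual_override:
        return SubmitStatus.CONFIRMED_SUBMIT
    # Wrong code: left for a human to review and possibly override.
    return SubmitStatus.DISCONFIRMED_SUBMIT
```

A submission would sit in UnconfirmedSubmit until the entered code has been fetched from mturk, and only confirmed submissions would move on to Approved and Bonused.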

@deargle
Collaborator

deargle commented Oct 5, 2020

Thanks for writing this up! The erroring external hit submit problem bothers me, although interestingly, on reflection, I don't think I've ever experienced it since I moved away from using the psiturk ad server -- are you using the psiturk ad server? Regardless, I'm in favor of switching.

One band-aid though about the reload being deprecated -- we could simply close the experiment task popup, and have a note on the ad that says "if you have completed the task, refresh this page to see a submit button." Tacky, but would be easy to add.

@jacob-lee
Contributor Author

I've looked into this a bit more. Mturk provides three question types for creating a HIT:

QuestionForm -- this is for internal HITs
ExternalQuestion -- this is what we've been using
HTMLQuestion -- like a QuestionForm but you can use your own HTML and Javascript

The problem with QuestionForm is that it doesn't seem to make WorkerId, AssignmentId, or HITId available; that is, there is no way to dynamically construct a link with those parameters to send to an external server. This isn't going to work well.

HTMLQuestion is more promising. I think it is what is being used by some of the HITs that ask for completion codes. The three parameters are made available and can be manipulated by embedded Javascript. These HITs can also make use of Crowd HTML, which sets the correct submission endpoint and inserts a submit button at the end, e.g.,

<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <!DOCTYPE html>
    <html>
      <body>
        <script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
        <crowd-form>
          <crowd-classifier
            name="sentiment"
            categories="['Positive', 'Negative', 'Neutral', 'N/A']"
            header="What sentiment does this text convey?"
          >
            <classification-target>
            Everything is wonderful.
            </classification-target>
            
            <full-instructions header="Sentiment Analysis Instructions">
            <p><strong>Positive</strong> 
              sentiment include: joy, excitement, delight</p>
            <p><strong>Negative</strong> sentiment include: 
              anger, sarcasm, anxiety</p>
            <p><strong>Neutral</strong>: neither positive or 
              negative, such as stating a fact</p>
            <p><strong>N/A</strong>: when the text cannot be 
              understood</p>
            <p>When the sentiment is mixed, such as both joy and sadness,
              use your judgment to choose the stronger emotion.</p>
            </full-instructions>
         
            <short-instructions>
             Choose the primary sentiment that is expressed by the text. 
            </short-instructions>
          </crowd-classifier>
        </crowd-form>
      </body>
    </html>
  ]]></HTMLContent>
  <FrameHeight>0</FrameHeight>
</HTMLQuestion>

and this might have a pretty good likelihood of working as expected.
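For the completion-code case, the HTMLQuestion could be assembled on the server and passed as the Question argument of create_hit. The sketch below is untested against mturk and rests on assumptions: `build_html_question` and `COMPLETION_CODE_FORM` are hypothetical names, and the form assumes that the Crowd HTML `crowd-input` element with `name`, `label`, and `required` attributes behaves as a plain text field.

```python
import xml.etree.ElementTree as ET

HTML_QUESTION_NS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd"
)

# Assumed form: a single text field for the completion code.
COMPLETION_CODE_FORM = """<!DOCTYPE html>
<html>
  <body>
    <script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
    <crowd-form>
      <p>Enter the completion code shown at the end of the task.</p>
      <crowd-input name="completion_code" label="Completion code" required></crowd-input>
    </crowd-form>
  </body>
</html>"""


def build_html_question(html, frame_height=0):
    """Wrap an HTML document in HTMLQuestion XML."""
    if "]]>" in html:
        # The HTML is embedded in a CDATA section, which this sequence would end.
        raise ValueError("HTML must not contain the CDATA terminator ']]>'")
    question = (
        f'<HTMLQuestion xmlns="{HTML_QUESTION_NS}">'
        f"<HTMLContent><![CDATA[{html}]]></HTMLContent>"
        f"<FrameHeight>{int(frame_height)}</FrameHeight>"
        "</HTMLQuestion>"
    )
    ET.fromstring(question)  # raises ParseError if the wrapper is malformed
    return question
```

Calling build_html_question(COMPLETION_CODE_FORM) yields a string with the same outer structure as the example above.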

The thing is that HTMLQuestion uses the ExternalQuestion endpoint, mturk/externalSubmit. At least some of the trouble we've experienced seems to come from random failures of single-shot submits.

However, because the HTMLQuestion is not hosted externally, it may be the case that the same-origin policy is no longer a barrier to using XHR to make sure it submits correctly. I think I'm going to look at that more closely.

@jacob-lee
Contributor Author

I sort of like the idea of dropping the window.opener.location.reload(true); in closepopup.html and using the cross-origin message passing interface instead.

The child window can send the parent a message like this:

window.opener.postMessage(message);

and the parent can set up a listener like this:

        <script type="text/javascript">
          var popup;
          function openwindow() {
              popup = window.open('{{ server_location }}/consent?hitId={{ hitid }}&assignmentId={{ assignmentid }}&workerId={{ workerid }}', 'Popup');
              // Listen for messages posted by the child window.
              // In production, check e.origin before trusting e.data.
              window.addEventListener('message', function(e) {
                  processChildMessage(e.data);
              }, false);
          }
          function processChildMessage(message) {
              console.log(message);
              var div = document.getElementById('receiver');
              var content = document.createTextNode(message);
              div.appendChild(content);
              // this div content can be html to submit to mturk.
          }
        </script>

I tested the message passing between my server and a sandboxed HIT, so this seems quite feasible as an alternative.

@jacob-lee
Contributor Author

I have pushed a commit here:

https://github.com/jacob-lee/psiTurk/tree/mturk-submit-refactor

This involves

  1. a refactor of two routes in experiment.py
  2. changes to the template files closepopup.html and ad.html

and is based on the do_debug fix PR still pending.

This does not include a change away from the external HIT API. At the moment I'm concerned with reports that the submit to mturk button was not showing up for people. The code switches to using postMessage to send messages back and forth between the parent and child windows, with Javascript inducing a reload when the message is confirmed.

@dhalpern
Member

We recently switched off the ad server which I was hoping would solve this problem (as @deargle mentioned) but it unfortunately doesn't seem to have done that. Are there any plans to add this submit refactor in an upcoming release? If it's not ready yet, what still needs to be done?

@jacob-lee
Contributor Author

jacob-lee commented Jul 19, 2022

Observations:

  • My refactor works (two years of use) and explicitly documents (in a big table) the underlying logic, which I think is an improvement, because the code in the main branch's experiment.py is gnarly and scary to touch.
  • It's possible that it works for me but not for others, maybe because I use psiturk differently than other people do. For as big a refactor as this, I'd suggest that interested parties test it in real-world experiments to see if any problems come up before it's merged into master.
  • It doesn't make a big difference as far as failed submissions go. Those are all about the external HIT API. Because we're doing longitudinal mturk studies, I get a lot of feedback on it.
  • Incorporating the refactor is something I (very much) support, but the code has diverged since then, so it will be a bit more difficult to merge now than it was then.
  • Also, the change to closepopup.html and ad.html is not backwards compatible with older tasks (because psiturk code and experiment code are not as separate as they should be). The Javascript should probably be injected into the template instead of being hardcoded in it.


3 participants