Display random sample of texts from a corpus of ~10K texts

1 month 2 weeks ago #199648 by syssecsurvey
Dear LimeSurvey community,

We would like to create a survey that presents participants with a few randomly drawn "scenarios" (texts), about which they will have to answer a couple of questions. The total number of possible texts will be around 10K, so it is not feasible to create a question group for each text and randomly display a small subset. We would like to store the drawn / generated texts with each survey response. We will not use the participant database because recruitment will be handled by an external party that distributes the survey link.

We are currently discussing two approaches to generating the scenarios, and we are wondering if / how they could be implemented with LimeSurvey:

1) Pre-generate all possible texts (~10K) and randomly draw a subset.
2) Generate the text at runtime by combining a text template with randomly drawn string values. This could involve a more complex logic that, e.g., rules out some combinations of values.

The main issues I'm aware of are a) where to store the pre-generated texts / snippets, b) how to include them in a running survey (JavaScript?), and c) how to associate the generated text with the survey response.

Any help would be greatly appreciated!

1 month 2 weeks ago - 1 month 2 weeks ago #199652 by Joffm
Hi,
an easy solution for your first approach is:
Use one or more questions of type huge text.
"Huge text" is stored in the MySQL database as type "text", which holds about 65 kB. So you might have to use two or three of them.


Enter your texts as "default text".
Each text starts with a unique number like "#00001", "#00002",... "#09999"
And at the very end you should add a "#" as well.

If you create a random number between 1 and 9999, you can easily find and extract the text.

Let's say
The random number is: 3561
So you have to grab text number "#03561" out of text bucket "3".

With the built-in functions "strpos", "substr", "str_repeat", and "strlen" you can do that.
Example:
1. Get the bucket: eqBucket: {intval(random / 1000)}
2. Create your text marker: eqPart: {"#"+str_repeat("0",5-strlen(random))+random}
3. Get the text from the marker onwards: eqStr1: {if(eqBucket==0,substr(T0,strpos(T0,eqPart)),if(eqBucket==1,substr(T1,strpos(T1,eqPart)),""))}
4. Get the text length up to the point where the next text starts: eqlen: {strpos(eqStr1,"#",2)-6}
5. Get the final text: eqStr2: {substr(eqStr1,6,eqlen)}

I put this together just now, so it is rather quick and dirty.
With some more thought, you or I will probably find ways to simplify it.
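
For readers more comfortable with JavaScript, the same five steps can be sketched outside the Expression Manager as follows. This is only an illustration: `extractText` and the `buckets` array are hypothetical names, and the sketch assumes 1000 texts per bucket (so that number 3561 lands in bucket 3, as in the example), five-digit markers, and that no text contains a "#" itself.

```javascript
// Hypothetical helper mirroring the five expression steps in plain JavaScript.
// buckets: array of strings such as "#03560first text#03561second text#"
// random:  the number of the text to extract (1..9999)
function extractText(buckets, random) {
  // Step 1: pick the bucket (assumes 1000 texts per bucket)
  const bucketIndex = Math.floor(random / 1000);
  // Step 2: build the marker, e.g. 3561 -> "#03561"
  const marker = "#" + String(random).padStart(5, "0");
  const bucket = buckets[bucketIndex] || "";
  // Step 3: cut the bucket so that it starts at the marker
  const start = bucket.indexOf(marker);
  if (start === -1) return ""; // marker not present in this bucket
  const rest = bucket.slice(start);
  // Step 4: find where the next text (or the closing "#") begins
  const next = rest.indexOf("#", 1);
  const end = next === -1 ? rest.length : next;
  // Step 5: drop the marker and keep only the text itself
  return rest.slice(marker.length, end);
}

const sampleBuckets = [];
sampleBuckets[3] = "#03560foo#03561bar baz#03562qux#";
console.log(extractText(sampleBuckets, 3561));
```

The closing "#" at the very end of each bucket is what lets step 4 find an end position for the last text as well.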


OR

In a question of type "short text", start an AJAX call to a remote PHP script.
<script type="text/javascript" charset="utf-8">
  $(document).on('ready pjax:scriptcomplete', function() {
    // Request one randomly chosen text from the remote script
    $.post('https://www.myServer.com/myScript.php', function(data) {
      // Write the returned text into this question's input field
      $('#question{QID} input[type="text"]').val(data);
    });
  });
</script>

myScript.php only needs to contain an array with your texts; it picks one by a random number and sends it back.

Joffm


Volunteers are not paid.
Not because they are worthless, but because they are priceless
Last edit: 1 month 2 weeks ago by Joffm.

1 month 2 weeks ago #199672 by holch
So there are about 10.000 possible text snippets, if I got you right? And from those you'll randomly draw a small subset.
Joffm already delivered some starting points on how you could approach this.

But what I am wondering is the analysis. How big is your sample size that you think you will be able to draw conclusions for the different text snippets and the different combinations?

I answer at the LimeSurvey forum in my spare time, I'm not a LimeSurvey GmbH employee.
No support via private message.

1 month 1 week ago #200058 by syssecsurvey
@Joffm: Thanks for your help! I'll give your suggestions a shot later today when I set up a test survey.

@holch: Thanks for voicing your concerns! This is a vignette study - each text is basically a text template with different variables which we fill with values randomly drawn from a finite set. The method we will be using for data analysis allows us to compute the influence of those values without the need to display multiple participants exactly the same vignette / text - all we need is for each value to be displayed a sufficiently large number of times.
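
To make the template idea concrete, here is a minimal sketch of how such a vignette could be drawn. Everything in it is made up for illustration: the function name `drawVignette`, the template wording, the factor values, and the exclusion rule are assumptions, not our actual study material. Forbidden combinations are handled by simply redrawing (rejection sampling).

```javascript
// Hypothetical vignette generator: all names and sample values are illustrative.
// template:  text with {placeholders}
// factors:   map from placeholder name to the finite set of possible values
// isAllowed: predicate that rules out unwanted combinations
// rng:       random source in [0, 1), injectable for reproducible draws
function drawVignette(template, factors, isAllowed, rng = Math.random, maxTries = 1000) {
  const names = Object.keys(factors);
  for (let attempt = 0; attempt < maxTries; attempt++) {
    const values = {};
    for (const name of names) {
      const options = factors[name];
      values[name] = options[Math.floor(rng() * options.length)];
    }
    if (!isAllowed(values)) continue; // forbidden combination: draw again
    // Replace each {name} with its drawn value; unknown placeholders stay as-is
    const text = template.replace(/\{(\w+)\}/g, (match, name) =>
      name in values ? values[name] : match);
    return { text, values };
  }
  throw new Error("No allowed combination found");
}

const template = "A {role} receives an email from {sender} asking for {item}.";
const factors = {
  role: ["manager", "intern"],
  sender: ["a colleague", "an unknown address"],
  item: ["a password", "a meeting slot"],
};
// Example rule: interns are never asked for a password
const isAllowed = (v) => !(v.role === "intern" && v.item === "a password");

const vignette = drawVignette(template, factors, isAllowed);
console.log(vignette.text, vignette.values);
```

Returning the drawn `values` alongside the text matters for the analysis, since the values (not the full text) are what get stored and modeled.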

1 month 1 week ago - 1 month 1 week ago #200072 by holch

The method we will be using for data analysis allows us to compute the influence of those values without the need to display multiple participants exactly the same vignette / text - all we need is for each value to be displayed a sufficiently large number of times.

Understood.

But now I am wondering if all those 10.000+ texts are actually necessary.

Is the text always the same, just the variables/values change?

If so, I would rather pipe the values into the text than to save 10.000+ texts. This might not be feasible, depending on your study design of course, but as long as we haven't seen some examples that allow us to understand your design better, this is all speculation.

We will not use the participant database because recruitment will be handled by an external party that distributes the survey link.


This does not necessarily mean that you can't use tokens. If the external party is able to send out individual links to the participants (any half-decent panel provider should be able to do so), you can still provide them with a list of token links to send out.

Then you could generate the values beforehand, write them as attributes into the token table and pipe them into the text with {token:attribute_1} etc.

But as I said, since I don't know your survey, this is just an idea that might work. This way you have good control over the variables shown, because they are already saved in the token table, and you can also better control how often each value is shown.
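
As a rough illustration of "generating the values beforehand", the sketch below builds one row per participant so that every value of every attribute appears equally often (give or take one). It is only a sketch under assumptions: the function names, the sample values, and the placeholder token strings are invented, and the rows would still have to be brought into the token table in whatever format your import expects.

```javascript
// Fisher-Yates shuffle; rng is injectable so results can be reproduced.
function shuffle(items, rng = Math.random) {
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Returns `count` values in random order; each option appears equally often (+/- 1).
function balancedColumn(options, count, rng = Math.random) {
  const column = [];
  for (let i = 0; i < count; i++) column.push(options[i % options.length]);
  return shuffle(column, rng);
}

// Builds one row per participant with a placeholder token and one value per attribute.
function buildTokenRows(count, attributes, rng = Math.random) {
  const columns = {};
  for (const [name, options] of Object.entries(attributes)) {
    columns[name] = balancedColumn(options, count, rng);
  }
  const rows = [];
  for (let i = 0; i < count; i++) {
    // Placeholder tokens only; real tokens should not be guessable
    const row = { token: "tok" + String(i + 1).padStart(4, "0") };
    for (const name of Object.keys(columns)) row[name] = columns[name][i];
    rows.push(row);
  }
  return rows;
}

const rows = buildTokenRows(6, {
  attribute_1: ["low", "high"],
  attribute_2: ["email", "phone", "letter"],
});
console.log(rows);
```

Each column is shuffled independently, so the attributes are balanced one by one but their combinations are not; if certain combinations must be balanced or excluded, that logic would have to be added.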

Last edit: 1 month 1 week ago by holch.

1 month 1 week ago #200274 by syssecsurvey
So I tried Joffm's solution with the external PHP script, which worked just fine ... until I hid the short text question from participants. Since there is no question to display, the call to the PHP server is not executed. If I make the question visible again, everything works. Is there a way to execute this call and store the returned text somewhere such that participants don't get to see it?

Also, I need multiple texts retrieved this way. First I tried to do this with a "multiple short texts" question, but if I try the code posted above, all of the text fields are filled with the same text, presumably because in this question type the text fields have a different structure. How would I have to execute the call to the external PHP file and fill in the text fields if I have multiple short texts in one question?

@holch: We discussed generating texts on the fly, but opted against it because we would like to outsource the complex decisions involved in generating the texts. Currently my colleagues' favorite approach is to use pre-generated texts assembled into sets, with each participant being shown a randomly selected set. Right now I think that the easiest way to solve the whole problem would be to 1) create one PHP array for each position in the survey where we would like to insert one of the scenarios (i.e., for each question group), 2) let LimeSurvey generate a random number for each participant at the beginning of the survey and store it somewhere, and 3) whenever one of the question groups is loaded, call the remote PHP file to retrieve $textX[random_number] and insert the result where needed, which is the description of a question group. Is it possible to insert text into the description of question groups this way, and how would I have to do that?

Thanks a lot in advance!

1 month 1 week ago - 1 month 1 week ago #200275 by tpartner

So I tried Joffm's solution with the external PHP script, which worked just fine ... until I hid the short text question from participants. Since there is no question to display, the call to the PHP server is not executed. If I make the question visible again, everything works. Is there a way to execute this call and store the returned text somewhere such that participants don't get to see it?

Hide the question with a CSS class "hidden".

Also, I need multiple texts retrieved this way. First I tried to do this with a "multiple short texts" question, but if I try the code posted above, all of the text fields are filled with the same text, presumably because in this question type the text fields have a different structure. How would I have to execute the call to the external PHP file and fill in the text fields if I have multiple short texts in one question?

Would you be calling different PHP scripts for each sub-question or the same PHP script, passing it a variable to indicate which sub-question is to be loaded?
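
If it is the same script for all sub-questions, one possible pattern is to send the sub-question index along with each request and write each answer into its own field. The sketch below is an illustration under assumptions and has not been run against a live survey: `fillEachInput`, `requestText`, and the `part` parameter are made-up names, and myScript.php would have to read that parameter and return the matching text.

```javascript
// Hypothetical helper: fills each input with its own text.
// inputs:      array-like of objects with a `value` property (e.g. text inputs)
// requestText: function (index, onSuccess) that fetches the text for one
//              sub-question and passes it to onSuccess
function fillEachInput(inputs, requestText) {
  Array.from(inputs).forEach((input, index) => {
    requestText(index, (text) => {
      input.value = text; // each field gets the text requested for its own index
    });
  });
}

// Browser wiring: only runs where jQuery is loaded, i.e. inside the survey page.
if (typeof $ !== "undefined") {
  $(document).on("ready pjax:scriptcomplete", function () {
    const inputs = $('#question{QID} input[type="text"]').toArray();
    fillEachInput(inputs, function (index, onSuccess) {
      // "part" is an assumed parameter name that myScript.php would have to read
      $.post("https://www.myServer.com/myScript.php", { part: index }, onSuccess);
    });
  });
}
```

The original snippet filled every field with the same text because `.val(data)` was applied to all matched inputs at once; here each request is tied to exactly one input.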

Cheers,
Tony Partner

Solutions, code and workarounds presented in these forums are given without any warranty, implied or otherwise.
Official LimeSurvey Partner - partnersurveys.com
Last edit: 1 month 1 week ago by tpartner.

1 month 1 week ago #200342 by syssecsurvey
Thanks! In the meantime I managed to avoid the "hidden subquestions" issue altogether by pursuing approach 3) described above:
1) I created n remotely stored PHP files, with n being the number of texts shown to each participant. Each file holds t texts.
2) At the beginning of the survey, I have a hidden equation-type question that generates a random number in the range 0 to t - 1.
3) At the location where each text is needed (i.e., in the description of each question group), I issue a jQuery call to the PHP file corresponding to this question and send the random number. The PHP file returns the text stored in the PHP array at position [random_number].
4) If the jQuery call fails, I insert a fallback text hardcoded into each question group description.

The only thing that does not work right now is storing information that a participant has seen a fallback text at any point in the survey. I tried to create a hidden short text question at the beginning of the survey that is prefilled with the value "nofallback" (also tried this without the default value) and, in the .fail() part of the jQuery call, try to change the answer to this question to "fallback". After excessively searching both this forum and the LimeSurvey documentation, I just cannot manage to change the answer to this question. I tried all of the suggestions commented out at the end of the code below (and more). What am I doing wrong? I also tried just adding a new text question with the value "fallback" as suggested in the first answer in this thread, but this also doesn't work, presumably because I'm operating from within a question group description and not a question. I would strongly prefer to have a single question that records whether a fallback text was displayed at any time in the survey and not record this information in each question group separately.

I'm also open to any other method to store somewhere in the survey response information that the .fail() part of any jQuery call has been triggered.
<p><strong>Question Group Heading</strong></p>
<div id="text1_container"> </div>
<script type="text/javascript" charset="utf-8">
  $(document).on('ready pjax:scriptcomplete', function() {
    $.post('https://www.remote-server.com/questionX.php?key={randomnumber}', function(data) {
      // Success: show the text returned by the remote script
      $('#text1_container').html(data);
    })
    .fail(function() {
      // Request failed: show the hardcoded fallback text instead
      $('#text1_container').html("FALLBACK TEXT");
      // None of the lines below work (printing {fallback} on the next page always shows the default value "nofallback")
      //$("#answer{138631X276X2469}").val("fallback").trigger('keyup');
      //$('#answer138631X276X2469').val('fallback');
      //$('#fallback') = 'True';
    });
  });
</script>

1 month 1 week ago - 1 month 1 week ago #200344 by tpartner
The hidden text question for "fallback" must be in the same group as your JavaScript and must be hidden with a CSS class "hidden".

JavaScript is client-side so can only manipulate questions that exist in the DOM when the script is called.

This is also the reason that JavaScript cannot manipulate questions hidden via the "Hide this question" question setting. Those questions are not rendered in the screen (DOM).

Then, this should work (assuming that the survey/group/question ID values are correct):

$('#answer138631X276X2469').val('fallback').trigger('keyup');

Cheers,
Tony Partner

Last edit: 1 month 1 week ago by tpartner.
The following user(s) said Thank You: syssecsurvey

2 weeks 2 days ago #201705 by syssecsurvey
Sorry for the late response! You were completely right, this was just plain stupidity on my part, I'm rather new to JavaScript / web programming.
Anyway, I managed to get the whole thing to work using the syntax suggested in the previous post, and I was pleasantly surprised to see LimeSurvey automatically adjust the question (group) IDs when I copied the survey to create a version for beta testing. Thanks for your help, I have certainly learned a lot about the capabilities of LimeSurvey during this project. :-)
