Welcome to the LimeSurvey Community Forum

Ask the community, share ideas, and connect with other LimeSurvey users!

Converting string results to numeric while keeping value labels

1 year 11 months ago #227437 by Omti90
Please help us help you and fill where relevant:
Your LimeSurvey version: Version 3.27.19+210928
Own server or LimeSurvey hosting: Own Survey
Survey theme/template: Irrelevant
==================
Hello,
I've finished my survey(s) and now want to turn them into an SPSS dataset. My problem is that the results for item batteries are exported as strings, something like A1, A2, A3, A4 for the answers. The SPSS syntax file automatically adds value labels to these, using the information from the questions in the survey.

To analyse the answers properly I need to convert the string results into numeric results: A1 -> 1; A2 -> 2...
I'm not sure how to do that while keeping the value labels. Relabeling every answer by hand is possible, but I would rather avoid it. I have already tried a script that turns the value labels into strings and then converts those string variables back into numeric variables, but that solution is pretty clunky. It also has the problem that my survey is international and uses a different survey language depending on the location.

What approaches are there to this problem, and is there an easy solution?

Thank you,
Omti90

1 year 11 months ago #227443 by Joffm
You do this in SPSS.
Either by recoding "into same variables" (Transform > Recode into Same Variables),
or directly by adding a RECODE statement to your syntax file.

Then you have to change the type from "string" to "numeric".
That is also easy: do it once, then copy and paste the type.

And the VALUE LABELS?
It should be easy to adapt the VALUE LABELS section in the syntax file with "Search & Replace".
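
For anyone who would rather script the "Search & Replace" step, here is a minimal sketch in plain Python (run outside SPSS, on the text of the syntax file). It assumes the VALUE LABELS section puts each quoted code such as "A1" at the start of its own line, followed by the quoted label; that layout, the function name numeric_value_labels and the variable name Q1_SQ001 are assumptions for illustration, so check them against your own export.

```python
import re

# A quoted answer code such as "A1" or 'A03' at the start of a line,
# after optional indentation. The file layout is an assumption.
CODE_AT_LINE_START = re.compile(
    r'^([ \t]*)(["\'])[A-Za-z]+0*(\d+)\2', re.MULTILINE
)

def numeric_value_labels(syntax):
    """Replace quoted codes like "A1" at the start of a line with bare numbers."""
    # \1 keeps the indentation, \3 is the number without leading zeros.
    return CODE_AT_LINE_START.sub(r'\1\3', syntax)

example = '''VALUE LABELS Q1_SQ001
 "A1" "Strongly disagree"
 "A2" "Disagree"
 "A10" "Strongly agree".'''

print(numeric_value_labels(example))
```

Only the code at the start of each line is touched, so label texts that happen to contain something like "A1" are left alone. The rewritten labels only make sense once the variable itself has been recoded and converted to numeric.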

This is really a small price to pay for not running a full test of your survey beforehand.

Joffm

Volunteers are not paid.
Not because they are worthless, but because they are priceless

1 year 11 months ago #227446 by jelo
You would usually set up suitable item codes before you start the survey.
The default A1, A2 coding is not suitable for analysing the answers numerically as a scale.

So the topic is now how to use SPSS, which is outside the scope of this forum. A simple approach via the GUI would be to recode into a new variable, recoding the string A1 to the numeric 1 and so on.
Take a look at e.g.
cscar.github.io/workshop-spss/working-wi...html#manual-recoding
for a short introduction.
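
The same "recode into a new variable" idea can also be written as syntax. Below is a rough sketch in plain Python that only builds the RECODE ... INTO text for a list of codes; the helper name recode_into_syntax and the variable names Q1_SQ001 / Q1_SQ001_num are made up for illustration, and it assumes every code contains a single number.

```python
import re

def recode_into_syntax(source, target, codes):
    """Build RECODE ... INTO syntax mapping string codes like 'A1' to numbers."""
    pairs = []
    for code in codes:
        match = re.search(r'\d+', code)
        if match is None:
            continue  # no number in this code (e.g. '-oth-'), so leave it out
        pairs.append("('{}'={})".format(code, int(match.group())))
    return "RECODE {} {} INTO {}.\nEXECUTE.".format(source, " ".join(pairs), target)

print(recode_into_syntax("Q1_SQ001", "Q1_SQ001_num", ["A1", "A2", "A3"]))
```

Because the target is a new numeric variable, the original string variable stays untouched and no type change is needed. Codes that are left out of the RECODE should end up as system-missing in the new variable, so check the frequencies afterwards.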

The meaning of the word "stable" for users
www.limesurvey.org/forum/development/117...ord-stable-for-users

1 year 11 months ago #227447 by jelo
I see that Joffm showed the syntax approach.
@Joffm: You need to announce your holidays in advance. Too many LimeSurvey projects are at risk when you don't visit the forum ;-)

1 year 11 months ago #227449 by holch

@Joffm: You need to announce your holidays in advance.


Hahahhaa, yes, whole companies are in panic mode, master's and bachelor's theses are in jeopardy, panic all around. ;-)

I answer at the LimeSurvey forum in my spare time, I'm not a LimeSurvey GmbH employee.
No support via private message.

1 year 11 months ago #227454 by Joffm
Oh, guys,

Well, next planned holidays: About Nov 20 - Dec 10

Joffm

1 year 11 months ago #227459 by Omti90
Thanks Joffm, this really helps. I didn't realise you could just search and replace the value labels... That basically just solved the problem lol.

So just as a syntax fanatic, is there some way to do this sort of search and replace via syntax?

Also, to pay my LimeSurvey tax: are there some LimeSurvey settings where I can make sure it uses plain numbers instead of "A" + number as the default? It's too late for this project, but it'd be good to know for the future.

1 year 11 months ago #227462 by holch
I don't think there is a setting for the default codes for answers and subquestions.

1 year 11 months ago #227481 by Mia_white
thanks for this

1 year 11 months ago - 1 year 11 months ago #227673 by Omti90
Since the search/replace option was a bit too fiddly and prone to typos for me, I wrote an SPSS/Python script to automatically convert standard LimeSurvey scales to a numeric format in SPSS. It looks for any string variable with attached value labels, automatically recodes all unique values containing a number to that number (while also recoding the "other" answer to a number) and then changes the labels accordingly. It also adjusts value labels whose value does not occur in the dataset.

In case someone wants to reuse this, please keep in mind that it is for unaltered standard settings of LimeSurvey batteries. If you have unusual keys/values like "A1Lime45" corresponding to your value labels/questions, this script won't work. It also won't work properly if a variable has several values containing the same number, like "A1" (Car - Mercedes) and "B1" (Bike - Mercedes), because both would be recoded to 1. If you have customised your variables that way, this will not work.
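
To see up front whether a variable runs into the "same number" problem, a small check like the following could be run on its unique values first. This is a sketch in plain Python, not part of the script; the function name find_collisions is made up, and it uses the same "first number in the value" rule as the script.

```python
import re
from collections import defaultdict

def find_collisions(values):
    """Return {number: [codes]} for every number shared by more than one code."""
    by_number = defaultdict(list)
    for value in sorted(set(values)):
        match = re.search(r'\d+', value)  # first number in the value
        if match:
            by_number[int(match.group())].append(value)
    return {num: codes for num, codes in by_number.items() if len(codes) > 1}

print(find_collisions(["A1", "A2", "B1", "-oth-"]))  # {1: ['A1', 'B1']}
```

An empty result means no two codes of that variable would be recoded to the same number.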

For some reason it doesn't want to show the code in the proper code blocks for me, so I've added it in spoilers below. I hope that works...
Code:
BEGIN PROGRAM PYTHON3.
import spss, spssaux, re, spssdata
 
print("Warning: This program is meant for recoding limesurvey default item batteries to a numeric format.\
 Eventual customisations may not be covered by this code. Please make sure to check the results for errors.\
  By default variables remain in string format and need to be manually converted")
 
sDict = spssaux.VariableDict() #Get a copy of the Variable Dictionary

infotext = "Variables processed: " # begin infotext

#Iterate through all variables (in the dictionary - all variables in this case)
for var in sDict:
    #Only address variables which are of type string and have value labels.
    if spss.GetVariableType(var.index) > 0 and var.ValueLabels != {} :
        varname = spss.GetVariableName(var.index) #Get variable Name
        infotext += varname + ', ' #Add processed variable to infotext
        
        #Begin Value Labels and Recode commands for active variable
        commandlab = 'Value Labels ' + varname + ' '
        commandrecode = 'Recode ' + varname + ' '
        commandalter = 'Alter Type ' +  varname + '(f2).'
        
        #Get unique values of variable (close the data cursor before submitting syntax)
        cursor = spssdata.Spssdata(varname, names=False)
        varvalues = set(row[0].rstrip() for row in cursor.fetchall()) #drop trailing padding blanks
        cursor.CClose()
 
        #Write the recode code for the variable
        for item in varvalues:
            if any(char.isdigit() for char in item): #make sure only values containing digits are processed
                newkey = re.findall(r'\d+', item)[0] #Get first number in the value
                newkey = newkey.lstrip('0') or '0' #Cut leading zeros only (keeps e.g. "10" and "100" intact)
                
                #Add entry to recode command
                commandrecode = commandrecode + '("' + item + '"="' + newkey + '") '
            if "-" in item: #in case "other" answer
                newkey = "-9" #set value for "other" answer
                
                #Add entry for Value labels and recode commands
                commandrecode = commandrecode + '("' + item + '"="' + newkey + '") '
                commandlab = commandlab + newkey + ' ' + '"' + 'Other answer' + '"' + ' '
 
        
        
        #Open entry for each value of valuelabels
        for key,val in var.ValueLabels.items():
            numbers = re.findall(r'\d+', key)
            if not numbers: #Skip labels whose value contains no number (e.g. the "other" code)
                continue
            newkey = numbers[0].lstrip('0') or '0' #First number in key, leading zeros cut
            
            #Add entry to Value Labels command (double any quotes inside the label text)
            commandlab = commandlab + newkey + ' ' + '"' + val.replace('"', '""') + '"' + ' '
        
        #Complete commands for SPSS:
        commandlab = commandlab + '.'   
        commandrecode = commandrecode + '.\n Execute.'
        
        #Check codes
        #print(commandlab) #Syntax for labeling
        #print(commandrecode) #Syntax for recoding
        #print(varvalues) #Unique values for variable
        
        #Execute recode and relabel (and alter type).
        spss.Submit(commandlab) #relabel
        spss.Submit(commandrecode) #recode
        #spss.Submit(commandalter) #alter type to numeric
        
        #Show frequencies of altered variables to look for errors
        spss.Submit("frequencies " + varname)
    
#Print Infos
print(infotext)
print("Warning: Only Values containing numbers have been recoded.\
 Please check your dataset to make sure all entries have been affected.")
end program.


By default this script uses -9 as the value for the option "Other". You can change this by editing the number in the script. I would not recommend a number with more than one digit, since the string variables are usually only two characters wide.
I have disabled the automatic conversion of the string variables to numeric, so that you can use the frequencies to check the data for anomalies and conversion errors. These should not occur, but I have only tested this script on my own data, so it might not work properly in your specific case. You can reactivate the conversion by removing the first "#" from the line "#spss.Submit(commandalter) #alter type to numeric".

Anyway, I hope this will help people who have to deal with the same issue in the future.
Last edit: 1 year 11 months ago by Omti90.
The following user(s) said Thank You: holch
