There are a lot of statistical software packages out there. SAS, SPSS, R, Stata, Mplus, LISREL, and the list goes on. The first statistical tool I learned was SPSS, so I have a certain fondness for it, similar to how one might feel about a first car. After years of doing data management with SPSS, I know its ins and outs, its idiosyncrasies, its capabilities and its limits. And even though there are flashier and more powerful programs out there, I don’t want to give up the program I know best. I feel very comfortable with SPSS. And I want to drop some knowledge on you about how to use SPSS more effectively.
SPSS, unlike many other statistical programs, features drop-down menus. Whereas other programs force you to write syntax, SPSS allows you to point-and-click your way through analyses. For beginning statisticians, drop-down menus are an excellent tool because you don’t have to learn a new programming language before diving in. But they’re not the only way, or the best way, to do analyses.
Like other statistical packages, SPSS also lets you write syntax. I have found SPSS syntax to be a necessity for data management and running analyses, for a few reasons. First, if you are working with a large number of variables, it is often easier to type the names of the variables you need than to hunt for them all in a dialog box’s variable list. Second, if you need to repeat the same command with different variables or in different datasets, it is easier to copy and paste syntax, and change the variables you need, than to click through the menu a gajillion times. Third, syntax gives you more ways to manipulate your data than drop-down menus do. For example, DO IF statements let me compute new variables based on a complex series of conditions.
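To make the DO IF point concrete, here is a minimal sketch of the kind of thing I mean. The variable names (age, employed, work_group) and the cutoffs are made up for illustration, so swap in your own:

```spss
* Hypothetical variables: age, employed, and work_group are placeholders.
* Sort each case into one of three groups based on a chain of conditions.
DO IF (age < 18).
COMPUTE work_group = 1.
ELSE IF (employed = 1).
COMPUTE work_group = 2.
ELSE.
COMPUTE work_group = 3.
END IF.
EXECUTE.

* Check the new variable before trusting it.
FREQUENCIES VARIABLES=work_group.
```

One thing to watch for: if age is missing for a case, SPSS skips the whole DO IF structure for that case, so work_group stays missing instead of falling through to the ELSE branch. Running frequencies right after the computation is a cheap way to catch that.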
Fourth, and most important, using SPSS syntax allows you to go back and see what you did to your data. This has saved me countless hours of retracing my steps. When an analysis produces wonky results, syntax tells me exactly what analyses I ran with which dataset and which variables, and I can see what went wrong. It’s like being able to talk to myself in the past! Sometimes past me is full of wisdom:
Me: Why are these regression results different than the ones I ran yesterday?
Past Rose: Because you’re using the wrong data set. You’re welcome, you beautiful and statistically gifted creature.
Sometimes past me was a doofus:
Me: Why does this variable I computed not look like I want it to?
Past Rose: Because I forgot to turn off the filter before I ran the compute command. Sorry, girl. I’m a doofus.
Whether Past Rose was brilliant (which is most of the time) or not on her game (which is usually the twenty minutes just before lunch), I know what happened and how to fix it. Without syntax, I would be bumbling around trying to recreate the results I had in order to find out what I did. And it’s not just an issue of saving time. What happens if you lose your nice, organized dataset and have to start cleaning your data from scratch? What happens if, six months after you submit a manuscript, a reviewer wants to know details about how analyses were conducted? In that case, being able to locate your syntax might be a matter of life or death (for your paper, not you).
If you already use syntax, you’re amazing and I hope you felt real smart reading the last few paragraphs. If you haven’t used syntax before, I hope I converted you. It’s as easy as clicking “Paste” instead of “OK” in the dialog box that a drop-down menu opens, and tutorials exist that make it easy to learn how to do more complex commands. Whether you are a syntax pro or a novice, here are some tips about how to make the most of it:
3a. Make notes on your syntax. In SPSS, typing a * at the beginning of a line makes a comment, and a period at the end tells SPSS where the comment stops. Future you will be so grateful if you comment on your syntax with descriptions of what you’re doing to your data and why. Notes are also useful for times when you need to do something in the data editor window instead of in syntax. For example, I once needed to change the value of a variable for a single case, and it didn’t make sense to do it in syntax. So I changed it manually in the data editor window and wrote a note about what I did in the syntax window.
3b. Put your syntax in an order that makes sense to you. Say that I computed a variable, ran frequencies on it, computed another variable, ran frequencies on it, and did a regression. I could organize my syntax chronologically, according to the sequence of the commands I ran. Another option is to organize it by command, with all the computations together, then all the frequencies, then the regression. If I did these commands with multiple data sets, I could also organize my syntax by data set. I don’t care which of these options you choose, if any, but I do care that you have a system that lets you navigate your syntax files easily.
3c. Store your files in an appropriate location. Everyone needs a good file organization system, and that’s its own topic, but what good is saving syntax if you can’t find it when you need it?
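Here is a rough sketch of what a commented syntax file could look like, including a note about a manual edit. Everything specific in it (the variable names item1 and age, the case ID, the values) is invented for illustration and is not from a real dataset:

```spss
* Hypothetical example: variable names, case ID, and values are made up.

* ----- Data cleaning -----.
* 99 was the missing-value code on item1, so set it to system-missing.
RECODE item1 (99 = SYSMIS).
EXECUTE.

* MANUAL EDIT made in the Data Editor, not in syntax:
  changed age from 220 to 22 for case ID 1047 (data-entry typo).

* ----- Descriptives -----.
FREQUENCIES VARIABLES=item1 age.
```

Each comment ends with a period. If you leave it off, SPSS treats the following lines as part of the comment, and the command you meant to run quietly never runs. A comment can also span several lines, as the manual-edit note does, as long as the continuation lines are indented and only the last one ends with a period.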
A good system for writing and storing syntax is as important as being competent at running analyses. Your statistical brilliance is useless if you can’t explain to others (fellow scientists, supervisors, readers, journal reviewers, students) how you got your results. If you use SPSS, utilizing its syntax feature is the best way to ensure that you can retrace your steps when you need to. I have plenty more to say about SPSS. I also have plenty to say about sex, my research area of expertise. But syntax is so important, and I’ve been wanting to tell the world for so long, that it just had to come first.
TL;DR Save your syntax, save your time, save your sanity, save your papers.