1.3 Variables

We can divide up variables in a number of ways. First is the distinction between categorical variables (sometimes also called qualitative variables) and quantitative variables (also called numeric or numerical variables). There is also an intermediate class called ordinal variables.

1.3.1 Categorical variables

A categorical variable takes one of a finite set of possible values that cannot be ordered from smaller to larger. For example, the variable marital status could take on the values {single, married, separated, divorced, widowed}. There is no sense in which widowed is greater than divorced, which is greater than separated. You could put the values in any order. Another example of a categorical variable would be, in a vaccine trial, whether you received the actual vaccine or a saline solution (in other words, whether you were in the ‘treatment group’ or the ‘control group’). We sometimes speak of variables having a number of levels, which means possible values. So, in the vaccine trial, group would be a variable with two levels: treatment and control.
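To make the idea of levels concrete, here is a minimal Python sketch. The list of responses is invented for illustration, and the names `POSSIBLE_LEVELS` and `observed` are our own. The levels are stored as a set, which has no order, and the only sensible summary is a count per level.

```python
from collections import Counter

# The levels of the marital status variable, taken from the example above.
# A set is unordered, which matches the idea that no level is "greater".
POSSIBLE_LEVELS = {"single", "married", "separated", "divorced", "widowed"}

# Hypothetical responses from six participants (invented for illustration).
observed = ["single", "married", "married", "widowed", "single", "divorced"]

# Every observed value should be one of the possible levels.
all_valid = all(value in POSSIBLE_LEVELS for value in observed)

# Counting how often each level occurs is meaningful; ranking the levels is not.
counts = Counter(observed)

# Listing the levels in a different order gives the same variable.
same_levels = {"widowed", "divorced", "separated", "married", "single"} == POSSIBLE_LEVELS

print(all_valid, same_levels, dict(counts))
```

Note that a level can be possible without being observed: nobody in this made-up sample is separated, but separated is still a level of the variable.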

1.3.2 Quantitative variables

Quantitative variables are variables for which: (i) the values can be ordered from less to more; and (ii) the gap between successive values on the scale has a constant meaning. For example, number of children is a quantitative variable. Two children is more than one child. Moreover, the difference between two children and three children means the same thing as the difference between eight children and nine children: one additional child in both cases. Likewise, distance is a quantitative variable. 114 km is further than 113 km, and the difference between 113 km and 114 km means the same thing as the difference between 1 km and 2 km: an extra kilometre.

Some quantitative variables are discrete. This means that they can only take on integer values. Counts of things are generally discrete. I can either have 2 children or 3 children, but not 2.17 children.

Continuous quantitative variables can take on an in-principle infinite number of different values. Examples of continuous variables are distances, speeds, latencies, and weights. The number of possible values is only infinite in principle, not in practice, because our equipment will always have a precision limit. Perhaps our clock only captures latencies to the nearest second. Nonetheless, the variable is continuous because there is still a large number of recordable values that the variable can take on. If our clock were so crude that we could only distinguish, say, events that had taken less than a minute from those that had taken more than a minute, then we would perhaps choose to treat latency as a categorical variable with the possible values {slow, fast}, rather than a continuous one.
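The effect of measurement precision can be sketched in a few lines of Python. The latencies below are invented, and the function name `crude_clock` is our own. The source example does not say what happens at exactly one minute, so treating 60 seconds as slow is an assumption.

```python
# Hypothetical true latencies in seconds (invented for illustration).
latencies_s = [12.4, 47.9, 61.2, 130.0, 59.6]

# A clock with one-second precision: many distinct values remain recordable,
# so we would still treat the variable as continuous.
recorded_to_second = [round(x) for x in latencies_s]


def crude_clock(latency_s):
    """Return 'fast' for under a minute, 'slow' otherwise.

    Assumption: exactly 60 seconds counts as 'slow'.
    """
    return "fast" if latency_s < 60 else "slow"


# A very crude clock: only two values are recordable, so the variable
# is better treated as categorical with the levels {slow, fast}.
recorded_crudely = [crude_clock(x) for x in latencies_s]

print(recorded_to_second)
print(recorded_crudely)
```

Notice that 59.6 seconds is recorded as 60 by the first clock but as fast by the second. The underlying quantity is the same in both cases; what changes is how many distinct values we are able to record.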

1.3.3 Ordinal variables

Ordinal variables are variables that have some features of qualitative variables and some features of quantitative ones. An example is the following: highest education qualification, out of the possible set {high school, 2 year college, undergraduate degree, masters, PhD}. This is like a quantitative variable in that we can place the possible values in an order from less education to more education. The order of the levels is therefore not arbitrary. On the other hand, the difference between the levels does not have a constant meaning. The difference between high school and 2 year college might not be the same, in terms of amount of extra education, as the difference between masters and PhD.
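One way to picture an ordinal variable in code is as a list of levels whose position carries the order. The sketch below is illustrative only, and the names `EDUCATION_LEVELS`, `rank` and `has_more_education` are our own. Comparing ranks is legitimate. Subtracting them is not, because the gaps between levels do not have a constant meaning.

```python
# Levels in order from less education to more, as in the example above.
EDUCATION_LEVELS = [
    "high school",
    "2 year college",
    "undergraduate degree",
    "masters",
    "PhD",
]


def rank(level):
    """Position of a level in the ordering (0 = least education)."""
    return EDUCATION_LEVELS.index(level)


def has_more_education(a, b):
    """True if level a comes after level b in the ordering."""
    return rank(a) > rank(b)


# Ordering and sorting are meaningful for an ordinal variable.
people = ["PhD", "high school", "masters"]
people_sorted = sorted(people, key=rank)

# Caution: rank("PhD") - rank("masters") equals
# rank("2 year college") - rank("high school"), but that does not show
# that the two steps represent the same amount of extra education.
print(people_sorted, has_more_education("PhD", "masters"))
```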

There is a class of statistical methods for ordinal variables. We will not deal with them in this book, but it is possible you will need to investigate them for your research. In practice, ordinal variables are often either converted into two-category ones (like Agree versus Disagree or Depressed versus Not depressed), or else treated as continuous variables. In particular, you will often meet measures using a so-called Likert scale, named after the psychologist Rensis Likert. In a Likert scale, there are multiple questions or items measuring the same construct. Each item produces an ordinal variable, with discrete levels Strongly disagree/Disagree/Neutral/Agree/Strongly agree or similar. However, when you average together the many items to produce a single overall score for the scale, the resulting variable can take on many values, and is treated as quantitative and continuous.
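Here is a rough sketch of how the items of a Likert scale might be averaged. Coding the levels as the numbers 1 to 5 is a common convention but is an assumption here, as are the four invented responses and the names `CODES` and `scale_score`.

```python
# Assumed numeric coding of the five levels (a convention, not a given).
CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

# One hypothetical participant's answers to four items of the same scale.
responses = ["Agree", "Strongly agree", "Neutral", "Strongly agree"]

# Each item on its own is ordinal with five levels...
item_codes = [CODES[r] for r in responses]

# ...but the average over items can fall between the whole numbers.
scale_score = sum(item_codes) / len(item_codes)

print(item_codes, scale_score)
```

With four items, the average can already take on 17 different values between 1 and 5 (steps of 0.25), and with more items the number grows further. This is why the overall score is usually treated as continuous.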

1.3.4 Measured versus manipulated variables

Another key distinction, orthogonal to qualitative versus quantitative, is between measured and manipulated variables. With measured variables, the researcher observes the value that is there. With manipulated variables, the researcher intervenes in the world to make the variable have the value that it has.

Suppose we are studying physical activity and depression. We could recruit some participants, and ask them how many times per week they do physical activity, and how many depressive symptoms they have. Physical activity in this study would be a measured variable; the people were exercising a certain amount anyway, and we measured how much that was.

On the other hand, we could take volunteers and ask half of them to exercise four times a week, and the other half never to exercise. Thus, it would be our doing, not theirs, that some of them end up doing more physical activity than others. In this case, amount of exercise would be a manipulated variable, not a measured variable. Our study would be an experimental study, as we will see in a later section.

Whenever you encounter or design a research study, it is a useful exercise to list all the variables in the study, and classify each one as qualitative, ordinal or quantitative; and measured or manipulated.
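As an illustration of this exercise, the sketch below lists the variables of the first physical activity study and records both classifications for each one. The `Variable` class and its field names are hypothetical, invented for this example. It refuses any classification outside the two sets of options given above.

```python
from dataclasses import dataclass

# The two classifications discussed in this section.
KINDS = {"qualitative", "ordinal", "quantitative"}
ORIGINS = {"measured", "manipulated"}


@dataclass(frozen=True)
class Variable:
    """A study variable with its two classifications (hypothetical helper)."""

    name: str
    kind: str    # qualitative, ordinal or quantitative
    origin: str  # measured or manipulated

    def __post_init__(self):
        # Reject anything that is not one of the recognised options.
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind}")
        if self.origin not in ORIGINS:
            raise ValueError(f"unknown origin: {self.origin}")


# The first study described above: both variables are simply observed.
observational_study = [
    Variable("physical activity (times per week)", "quantitative", "measured"),
    Variable("depressive symptoms (number)", "quantitative", "measured"),
]

for v in observational_study:
    print(f"{v.name}: {v.kind}, {v.origin}")
```

In the experimental version of the study, the entry for physical activity would instead have the origin manipulated, because the researcher decides who exercises.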