Variables: Definition, Examples, Types of Variables in Research

What is a Variable?

Within the context of a research investigation, concepts are generally referred to as variables. A variable is, as the name implies, something that varies.

Examples of Variable

The following are all examples of variables, because each of these properties varies or differs from one individual to another:

  • Age
  • Sex
  • Exports
  • Income and expenses
  • Family size
  • Country of birth
  • Capital expenditure
  • Class grades
  • Blood pressure readings
  • Preoperative anxiety levels
  • Eye color
  • Vehicle type

What is a Variable in Research?

A variable is any property, characteristic, number, or quantity that increases or decreases over time or takes on different values in different situations (as opposed to a constant, which does not vary).

When conducting research, experimenters often manipulate variables. For example, an experimenter might compare the effectiveness of four types of fertilizers.

In this case, the variable is the ‘type of fertilizer.’ A social scientist may examine the possible effect of early marriage on divorce. Here, early marriage is the variable.

A business researcher may find it useful to include the dividend in determining the share prices. Here, the dividend is the variable.

Effectiveness, divorce, and share prices are also variables because they vary in response to the type of fertilizer, early marriage, and the dividend, respectively.

Qualitative Variables

An important distinction is between qualitative and quantitative variables.

Qualitative variables are those that express a qualitative attribute, such as hair color, religion, race, gender, social status, method of payment, and so on. The values of a qualitative variable do not imply a meaningful numerical ordering.

The values of the variable ‘religion’ (Muslim, Hindu, etc.) differ qualitatively; no ordering of religions is implied. Qualitative variables are sometimes referred to as categorical variables.

For example, the variable sex has two distinct categories: ‘male’ and ‘female.’ Since the values of this variable are expressed in categories, we refer to this as a categorical variable.

Similarly, the place of residence may be categorized as urban and rural and thus is a categorical variable.

Categorical variables may again be described as nominal and ordinal.

Ordinal variables can be logically ordered or ranked higher or lower than one another but do not necessarily establish a numeric difference between categories, such as examination grades (A+, A, B+, etc.) and clothing sizes (extra large, large, medium, small).

Nominal variables are those that can neither be ranked nor logically ordered, such as religion, sex, etc.

A qualitative variable is a characteristic that cannot be measured numerically but can be categorized as possessing or not possessing some characteristic.
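
When such variables are coded for analysis, the nominal/ordinal distinction can be made explicit. The following is a minimal sketch using the pandas library (the category labels are illustrative, not taken from any dataset): religion is stored as an unordered categorical, while clothing size is stored as an ordered one.

```python
import pandas as pd

# Nominal variable: categories carry no ranking.
religion = pd.Categorical(["Muslim", "Hindu", "Christian", "Muslim"])

# Ordinal variable: categories have a logical order, but no numeric distance.
size = pd.Categorical(
    ["small", "large", "medium", "extra large"],
    categories=["small", "medium", "large", "extra large"],
    ordered=True,
)

print(religion.ordered)        # False -- no ordering is implied
print(size.min(), size.max())  # 'small' 'extra large' -- ranking is meaningful
```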

Quantitative Variables

Quantitative variables, also called numeric variables, are those variables that are measured in terms of numbers. A simple example of a quantitative variable is a person’s age.

Age can take on different values because a person can be 20 years old, 35 years old, and so on. Likewise, family size is a quantitative variable because a family might consist of one, two, or three members, and so on.

Each of these properties or characteristics varies from one individual to another. Note that these variables are expressed in numbers, which is why we call them quantitative, or sometimes numeric, variables.

A quantitative variable is one for which the resulting observations are numeric and thus possess a natural ordering or ranking.

Discrete and Continuous Variables

Quantitative variables are again of two types: discrete and continuous.

Variables such as the number of children in a household or the number of defective items in a box are discrete variables, since the possible values are discrete points on the scale.

For example, a household could have three or five children, but not 4.52 children.

Other variables, such as ‘time required to complete an MCQ test’ and ‘waiting time in a queue in front of a bank counter,’ are continuous variables.

The time required in the above examples is a continuous variable, which could be, for example, 1.65 minutes or 1.6584795214 minutes.

Of course, the practicalities of measurement preclude most measured variables from being continuous.

Discrete Variable

A discrete variable, restricted to certain values, usually (but not necessarily) consists of whole numbers, such as family size or the number of defective items in a box. Discrete variables are often the result of enumeration or counting.

A few more examples are:

  • The number of accidents in the last twelve months.
  • The number of mobile cards sold in a store within seven days.
  • The number of patients admitted to a hospital over a specified period.
  • The number of new branches of a bank opened annually during 2001–2007.
  • The number of weekly visits made by health personnel in the last 12 months.

Continuous Variable

A continuous variable may take on an infinite number of intermediate values along a specified interval. Examples are:

  • The sugar level in the human body;
  • Blood pressure reading;
  • Temperature;
  • Height or weight of the human body;
  • Rate of bank interest;
  • Internal rate of return (IRR);
  • Earning ratio (ER);
  • Current ratio (CR).

No matter how close two observations might be, if the instrument of measurement is precise enough, a third observation can be found, falling between the first two.

A continuous variable generally results from measurement and can assume countless values in the specified range.
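
To make the contrast concrete, here is a small sketch using numpy with simulated values (the distributions and numbers are illustrative assumptions, not data from the text): counts land on whole numbers, while waiting times can fall anywhere in an interval.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Discrete: number of defective items per box (counts -> whole numbers only).
defects_per_box = rng.poisson(lam=2.0, size=5)
print(defects_per_box)           # e.g. [1 3 2 2 0]

# Continuous: waiting time in minutes (any value in an interval is possible).
waiting_minutes = rng.exponential(scale=1.5, size=5)
print(waiting_minutes.round(4))  # e.g. [1.6584 0.2391 ...]
```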

Dependent and Independent Variables

In many research settings, two specific classes of variables need to be distinguished from one another: independent variable and dependent variable.

Many research studies aim to reveal and understand the causes of underlying phenomena or problems with the ultimate goal of establishing a causal relationship between them.

Look at the following statements:

  • Low intake of food causes underweight.
  • Smoking enhances the risk of lung cancer.
  • Level of education influences job satisfaction.
  • Advertisement helps in sales promotion.
  • The drug causes improvement of health problems.
  • Nursing intervention causes more rapid recovery.
  • Previous job experiences determine the initial salary.
  • Blueberries slow down aging.
  • The dividend per share determines share prices.

In each of the above statements, we have two variables: one independent and one dependent. In the first example, ‘low intake of food’ is believed to have caused the ‘problem of being underweight.’

It is thus the so-called independent variable. Underweight is the dependent variable because we believe this ‘problem’ (the problem of being underweight) has been caused by ‘the low intake of food’ (the factor).

Similarly, smoking, dividend, and advertisement are all independent variables, and lung cancer, job satisfaction, and sales are dependent variables.

In general, an independent variable is manipulated by the experimenter or researcher, and its effects on the dependent variable are measured.
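
As a minimal sketch of this manipulation-and-measurement idea (using numpy with made-up dividend and share-price figures, not real data), the independent variable is varied and the response of the dependent variable is summarized by a fitted line:

```python
import numpy as np

# Hypothetical data: dividend per share (independent) and share price (dependent).
dividend = np.array([1.0, 1.5, 2.0, 2.5, 3.0])          # manipulated / explanatory
share_price = np.array([22.0, 28.5, 35.0, 40.0, 47.5])  # measured outcome

# Fit share_price = a * dividend + b by least squares.
a, b = np.polyfit(dividend, share_price, deg=1)
print(f"slope={a:.2f}, intercept={b:.2f}")
```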

Independent Variable

The variable that is used to describe or measure the factor that is assumed to cause or at least to influence the problem or outcome is called an independent variable.

The definition implies that the experimenter uses the independent variable to describe or explain its influence or effect on the dependent variable.

Variability in the dependent variable is presumed to depend on variability in the independent variable.

Depending on the context, an independent variable is sometimes called a predictor variable, regressor, controlled variable, manipulated variable, explanatory variable, exposure variable (as used in reliability theory), risk factor (as used in medical statistics), feature (as used in machine learning and pattern recognition) or input variable.

The explanatory variable is preferred by some authors over the independent variable when the quantities treated as independent variables may not be statistically independent or independently manipulable by the researcher.

If the independent variable is referred to as an explanatory variable, then the term response variable is preferred by some authors for the dependent variable.

Dependent Variable

The variable used to describe or measure the problem or outcome under study is called a dependent variable.

In a causal relationship, the cause is the independent variable, and the effect is the dependent variable. If we hypothesize that smoking causes lung cancer, ‘smoking’ is the independent variable and cancer the dependent variable.

A business researcher may find it useful to include the dividend in determining the share prices. Here dividend is the independent variable, while the share price is the dependent variable.

The dependent variable usually is the variable the researcher is interested in understanding, explaining, or predicting.

In lung cancer research, the carcinoma is of real interest to the researcher, not smoking behavior per se. The independent variable is the presumed cause of, antecedent to, or influence on the dependent variable.

Depending on the context, a dependent variable is sometimes called a response variable, regressand, predicted variable, measured variable, explained variable, experimental variable, responding variable, outcome variable, output variable, or label.

An explained variable is preferred by some authors over the dependent variable when the quantities treated as dependent variables may not be statistically dependent.

If the dependent variable is referred to as an explained variable, then the term predictor variable is preferred by some authors for the independent variable.

Levels of an Independent Variable

If an experimenter compares an experimental treatment with a control treatment, then the independent variable (a type of treatment) has two levels: experimental and control.

If an experiment were to compare five types of diets, then the independent variable (type of diet) would have five levels.

In general, the number of levels of an independent variable is the number of experimental conditions.
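
A small illustration, under the assumption of a hypothetical diet experiment coded with pandas (all numbers invented): the independent variable ‘diet’ has five levels, and the dependent variable is summarized per level.

```python
import pandas as pd

# Hypothetical experiment: 'diet' is the independent variable with five levels.
data = pd.DataFrame({
    "diet": ["A", "B", "C", "D", "E"] * 4,           # five experimental conditions
    "weight_loss": [2.1, 1.4, 3.0, 0.9, 2.5] * 4,    # dependent variable (made up)
})

print(data["diet"].nunique())                      # 5 levels
print(data.groupby("diet")["weight_loss"].mean())  # mean outcome per level
```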

Background Variable

In almost every study, we collect information such as age, sex, educational attainment, socioeconomic status, marital status, religion, place of birth, and the like. These variables are referred to as background variables.

These variables are often related to many independent variables, so they indirectly influence the problem. Hence they are called background variables.

The background variables should be measured if they are important to the study. However, we should try to keep the number of background variables as small as possible in the interest of economy.

Moderating Variable

In any statement of relationships between variables, it is normally hypothesized that in some way the independent variable ‘causes’ the dependent variable to occur.

In simple relationships, all other variables are extraneous and are ignored.

In actual study situations, such a simple one-to-one relationship needs to be revised to take other variables into account to explain the relationship better.

This emphasizes the need to consider a second independent variable that is expected to have a significant contributory or contingent effect on the originally stated dependent-independent relationship.

Such a variable is termed a moderating variable.

Suppose you are studying the impact of field-based and classroom-based training on the work performance of health and family planning workers, with the type of training as the independent variable. If you suspect that the effect of training on performance depends on the age of the trainees, age acts as a moderating variable.

Conversely, if you are focusing on the relationship between the age of the trainees and work performance, you might use ‘type of training’ as the moderating variable.

Extraneous Variable

Most studies are concerned with identifying a single independent variable and measuring its effect on the dependent variable.

But still, several variables might conceivably affect our hypothesized independent-dependent variable relationship, thereby distorting the study. These variables are referred to as extraneous variables.

Extraneous variables are not necessarily part of the study. They exert a confounding effect on the dependent-independent relationship and thus need to be eliminated or controlled for.

An example may illustrate the concept of extraneous variables. Suppose we are interested in examining the relationship between the work status of mothers and breastfeeding duration.

It is not unreasonable in this instance to presume that the level of education of mothers as it influences work status might have an impact on breastfeeding duration too.

Education is treated here as an extraneous variable. When we attempt to eliminate or control its effect, we treat it as a confounding variable.

An appropriate way of dealing with a confounding variable is the stratification procedure, which involves a separate analysis for each level of the confounding variable.

For this purpose, one can construct two cross-tables: one for illiterate mothers and the other for literate mothers.

Suppose we find a similar association between work status and duration of breastfeeding in both groups of mothers. In that case, we conclude that the mothers’ educational level is not a confounding variable.
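
A minimal sketch of this stratification procedure, assuming a hypothetical pandas data frame with columns for work status, breastfeeding duration, and literacy (all values invented for illustration): one cross-table is produced per level of the suspected confounder.

```python
import pandas as pd

# Hypothetical survey data (invented for illustration only).
df = pd.DataFrame({
    "work_status": ["working", "not working", "working", "not working"] * 3,
    "breastfeeding": ["short", "long", "short", "long", "long", "short"] * 2,
    "literacy": ["literate", "literate", "illiterate", "illiterate"] * 3,
})

# Stratify: one cross-table per level of the suspected confounder (literacy).
for level, stratum in df.groupby("literacy"):
    print(f"\n--- {level} mothers ---")
    print(pd.crosstab(stratum["work_status"], stratum["breastfeeding"]))
```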

Intervening Variable

Often an apparent relationship between two variables is caused by a third variable.

For example, variables X and Y may be highly correlated, but only because X causes the third variable, Z, which in turn causes Y. In this case, Z is the intervening variable.

An intervening variable theoretically affects the observed phenomena but cannot be seen, measured, or manipulated directly; its effects can only be inferred from the effects of the independent and moderating variables on the observed phenomena.

We might view motivation or counseling as the intervening variable in the work-status and breastfeeding relationship.

Thus, motive, job satisfaction, responsibility, behavior, and justice are some examples of intervening variables.

Suppressor Variable

In many cases, we have good reasons to believe that the variables of interest have a relationship, but our data fail to establish any such relationship. Some hidden factors may suppress the true relationship between the two original variables.

Such a factor is referred to as a suppressor variable because it suppresses the relationship between the other two variables.

The suppressor variable suppresses the relationship by being positively correlated with one of the variables in the relationship and negatively correlated with the other. The true relationship between the two variables will reappear when the suppressor variable is controlled for.

Thus, for example, younger age may pull education up but income down, while older age may pull income up but education down, effectively canceling out the relationship between education and income unless age is controlled for.
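
The following sketch illustrates this suppression effect with invented, standardized data and numpy; the variable names and coefficients are assumptions chosen so that age masks a genuinely positive education-income relationship, which reappears once age is controlled for (here via residuals, a simple form of partial correlation).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Invented, standardized data in which age acts as a suppressor:
# younger respondents have more education but lower income, and vice versa.
age = rng.normal(0.0, 1.0, n)                              # age (standardized score)
education = -age + rng.normal(0.0, 1.0, n)                 # age pulls education down
income = education + 2.0 * age + rng.normal(0.0, 1.0, n)   # age pulls income up

def residuals(y, x):
    """Residuals of y after removing the linear effect of x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

raw_r = np.corrcoef(education, income)[0, 1]
partial_r = np.corrcoef(residuals(education, age),
                        residuals(income, age))[0, 1]
print(f"raw r = {raw_r:.2f}")                      # close to 0: relationship is suppressed
print(f"r controlling for age = {partial_r:.2f}")  # clearly positive
```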

4 Relationships Between Variables

In dealing with relationships between variables in research, we observe a variety of dimensions in these relationships.

  1. Positive and Negative Relationship
  2. Symmetrical Relationship
  3. Causal Relationship
  4. Linear and Non-linear Relationship

Positive and Negative Relationship

Two or more variables may have a positive, negative, or no relationship. In the case of two variables, a positive relationship is one in which both variables vary in the same direction.

However, they are said to have a negative relationship when they vary in opposite directions.

When a change in one variable is not accompanied by a change in the other, we say that the variables in question are unrelated.

For example, if the wage rate increases with job experience, the relationship between job experience and the wage rate is positive.

If an increase in an individual’s education level decreases his desire for additional children, the relationship is negative or inverse.

If the level of education does not have any bearing on that desire, we say that the variables ‘desire for additional children’ and ‘education’ are unrelated.

Strength of Relationship

Once it has been established that two variables are related, we want to ascertain how strongly they are related.

A common statistic for measuring the strength of a relationship is the correlation coefficient, symbolized by r. It is a unit-free measure lying between -1 and +1 inclusive, with zero signifying no linear relationship.

As far as predicting one variable from knowledge of the other is concerned, a value of r = +1 indicates a perfect positive relationship that allows 100% accuracy in prediction, while a value of r = -1 indicates a perfect negative relationship that likewise allows 100% accuracy in prediction.
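
As a brief sketch, r can be computed with numpy; the paired observations below (years of job experience and hourly wage, continuing the earlier example) are invented for illustration.

```python
import numpy as np

# Hypothetical paired observations: years of job experience and hourly wage.
experience = np.array([1, 3, 5, 7, 9, 11])
wage = np.array([12.0, 14.5, 17.0, 18.5, 22.0, 24.5])

# Pearson correlation coefficient r: unit-free, between -1 and +1.
r = np.corrcoef(experience, wage)[0, 1]
print(f"r = {r:.3f}")   # close to +1 -> strong positive linear relationship
```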

Symmetrical Relationship

So far, we have discussed only symmetrical relationships, in which a change in either variable is accompanied by a change in the other.

This relationship does not indicate which variable is the independent variable and which variable is the dependent variable.

In other words, you can label either of the variables as the independent variable.

Such a relationship is a symmetrical relationship. In an asymmetrical relationship, a change in variable X (say) is accompanied by a change in variable Y, but not vice versa.

The amount of rainfall, for example, will increase productivity, but productivity will not affect the rainfall. This is an asymmetrical relationship.

Similarly, the relationship between smoking and lung cancer would be asymmetrical because smoking could cause cancer, but lung cancer could not cause smoking.

Causal Relationship

The existence of a relationship between two variables does not automatically mean that changes in one variable cause changes in the other.

It is, however, very difficult to establish the existence of causality between variables. While no one can ever be certain that variable A causes variable B, one can gather some evidence that increases our belief that A leads to B.

In an attempt to do so, we seek the following evidence:

  1. Is there a relationship between A and B? When such evidence exists, it indicates a possible causal link between the variables.
  2. Is the relationship asymmetrical, so that a change in A results in a change in B but not vice versa? In other words, does A occur before B? If we find that B occurs before A, we can have little confidence that A causes B.
  3. Does a change in A result in a change in B regardless of the actions of other factors? Or, is it possible to eliminate other possible causes of B? Can one determine that C, D, and E (say) do not co-vary with B in a way that suggests possible causal connections?

Linear and Non-linear Relationship

A linear relationship is a straight-line relationship between two variables, where the variables vary at the same rate regardless of whether the values are low, high, or intermediate.

This is in contrast with the non-linear (or curvilinear) relationships, where the rate at which one variable changes in value may differ for different values of the second variable.

Whether a variable is linearly related to another variable can be ascertained simply by plotting the Y values against the X values.

If the values, when plotted, appear to lie on a straight line, the existence of a linear relationship between X and Y is suggested.

Height and weight almost always have an approximately linear relationship, while age and fertility rates have a non-linear relationship.
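
A quick sketch of the plotting check described above, assuming matplotlib and invented height-weight data (purely illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)

# Invented data: height (cm) and weight (kg), roughly linearly related.
height = rng.uniform(150, 190, 50)
weight = 0.9 * height - 90 + rng.normal(0, 4, 50)

# Scatter plot of Y against X: points near a straight line suggest linearity.
plt.scatter(height, weight)
plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.title("Checking for a linear relationship")
plt.show()
```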

Frequently Asked Questions about Variable

What is a variable within the context of a research investigation?

A variable, within the context of a research investigation, refers to concepts that vary. It can be any property, characteristic, number, or quantity that can increase or decrease over time or take on different values.

How is a variable used in research?

In research, a variable is any property or characteristic that can take on different values. Experiments often manipulate variables to compare outcomes. For instance, an experimenter might compare the effectiveness of different types of fertilizers, where the variable is the ‘type of fertilizers.’

What distinguishes qualitative variables from quantitative variables?

Qualitative variables express a qualitative attribute, such as hair color or religion, and do not imply a meaningful numerical ordering. Quantitative variables, on the other hand, are measured in terms of numbers, like a person’s age or family size.

How do discrete and continuous variables differ in terms of quantitative variables?

Discrete variables are restricted to certain values, often whole numbers, resulting from enumeration or counting, like the number of children in a household. Continuous variables can take on an infinite number of intermediate values along a specified interval, such as the time required to complete a test.

What are the roles of independent and dependent variables in research?

In research, the independent variable is manipulated by the researcher to observe its effects on the dependent variable. The independent variable is the presumed cause or influence, while the dependent variable is the outcome or effect that is being measured.

What is a background variable in a study?

Background variables are information collected in a study, such as age, sex, or educational attainment. These variables are often related to many independent variables and indirectly influence the main problem or outcome, hence they are termed background variables.

How does a suppressor variable affect the relationship between two other variables?

A suppressor variable can suppress or hide the true relationship between two other variables. It does this by being positively correlated with one of the variables and negatively correlated with the other. When the suppressor variable is controlled for, the true relationship between the two original variables can be observed.