Single-Item Measures Are Better Than You Think


In 1992 I wrote a “Little Green Sage Series” book, Summated Rating Scale Construction, as an accessible guide to creating multi-item rating scales. These scales are popular throughout the social sciences for assessing people’s attitudes, behavior, beliefs, perceptions, personality, and values, to name a few. In many of the social sciences, including industrial-organizational psychology and management, such measures are all but required for publication in the best outlets because of the belief that single-item measures should be avoided. A recent paper by Russell Matthews, Laura Pineault, and Yeong-Hyun Hong, published in Journal of Business and Psychology, questions that widely held belief. They show that single-item measures can be a reasonable alternative to multi-item measures, providing valuable information in many cases. Their results show that single-item measures are better than you think.

The Multi-Item Advantage

There are several advantages to multi-item measures that make them a better choice in many situations.

  • More resistant to errors. With a single item, if a respondent makes a mistake, it can have a large impact on their score. Say we have a 7-point scale with the single item, “I dislike my job”. If a respondent reads quickly and doesn’t see the “dis” before “like”, they might rate 1 instead of 7 and their score is completely reversed. If you have 5 items, an error this large on one of the items will have far less impact. For example, a total score of 35 (5 items rated 7) will become a 29 if someone checks 1 instead of 7 on one item.
  • Higher reliability. The more items you have, the more stable and consistent the score will be over time, assuming the underlying thing you are measuring doesn’t change. In part this is due to the lesser impact of mistakes, and in part it is due to how small fluctuations in ratings average out across many items.
  • Higher correlations with other variables. Reliability sets an upper limit on how strongly a variable can correlate with another variable. Multi-item scales have the potential to yield stronger results, assuming the items all measure the same thing.
  • Ability to tap into complex constructs. Simple constructs, such as whether or not people like something (their job, their occupation, or steamed broccoli), can be reflected well in a single item. Other constructs, such as personality characteristics, are complex and difficult to assess well with one item. For example, the single item, “I often go to parties” is not a complete measure of whether someone is extroverted or introverted, as there is more to extroversion than just attending parties.
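
The first three advantages can be put into numbers with a short sketch. It is only an illustration under stated assumptions: the function names are made up for this example, the Spearman-Brown formula assumes the items are parallel (equally good measures of the same thing), and the reliabilities of 0.5 and 0.8 are hypothetical values, not results from any study.

```python
import math


def total_after_misread(n_items, intended=7, misread=1):
    """Total score when one of n_items identical ratings is misread."""
    return (n_items - 1) * intended + misread


def spearman_brown(r_single, k):
    """Predicted reliability of a k-item scale from a single item's
    reliability, assuming parallel items (Spearman-Brown formula)."""
    return k * r_single / (1 + (k - 1) * r_single)


def max_observed_correlation(rel_x, rel_y):
    """Upper limit on the observed correlation between two measures,
    given their reliabilities (classical test theory)."""
    return math.sqrt(rel_x * rel_y)


# One misread item: a 5-item total drops from 35 to 29,
# while a single-item score flips from 7 to 1.
print(total_after_misread(5))  # 29
print(total_after_misread(1))  # 1

# Hypothetical single-item reliability of 0.5, lengthened to 5 items.
print(round(spearman_brown(0.5, 5), 3))  # 0.833

# Ceiling on the correlation between measures with reliabilities 0.5 and 0.8.
print(round(max_observed_correlation(0.5, 0.8), 3))  # 0.632
```

The ceiling in the last line is why a less reliable single item can yield weaker correlations than a multi-item scale, even when both measure the same construct.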

Single-Item Measures Are Better Than You Think

The Matthews team conducted a series of studies to investigate how well 91 single-item measures performed. They didn’t all perform equally well, but the majority showed good psychometric properties, supporting their use. Some highlights:

  • Most of the single-item measures were judged as reflecting the intended construct, for example, commitment to the organization or feeling fatigued.
  • Most of the single-item measures demonstrated good reliability (consistency in scores for each respondent) over time.
  • Most of the measures related to other measures they were expected to relate to; they demonstrated criterion-related validity.
  • Single-item measures did well in a head-to-head comparison with corresponding multi-item measures. When both single- and multi-item measures were related to the same other measures, the single-item results were in many cases not much different from the multi-item results. In a few cases the single-item measures even did a little better.

When To Use Single-Item Measures

The biggest limitation of multi-item measures is their length. This limits the number of variables you can include in a survey, as long surveys can lead to survey fatigue. If respondents get tired of answering, they can lose focus and make mistakes, thus eliminating some of the advantages of having multiple items. There is therefore a trade-off between a more precise assessment of a few variables and a less precise assessment of more variables. In many cases, the trade-off would favor single-item measures.

Single-item measures can be particularly useful in daily diary studies where you ask people to complete surveys several times per day over one or two weeks. Such surveys, typically completed on a smartphone, must be as short as possible. Single-item measures would allow assessment of a dozen variables in less than a minute.

The Matthews team showed that single-item measures are better than you think. I admit that I was skeptical at first, but they convinced me. In fields like mine, the organizational sciences, where single-item scales are disrespected, it is time to recognize that they can be valuable tools.

Photo by Sebastian Voortman from Pexels

6 Replies to “Single-Item Measures Are Better Than You Think”

  1. My colleagues and I recently had a paper on a sensitive topic accepted for publication. The paper described a study of the psychometric and structural properties of our Pandemic Anxiety Inventory. In studying the PAI’s criterion validity, and mindful of the demands we placed on respondents, we used a number of single-item scales. Your commentary makes a lot of sense to me.

    –Irvin

  2. Thank you very much, Paul, for bringing these interesting findings to attention. They remind me of the work of Matthias Bürisch (1984, 1997), who examined the construct validity of long, short, and one-item scales in single-trait-multi-rater designs, employing other-ratings as a criterion. I think his findings provide important information about the validity of very short measures, in addition to the results of the extensive studies of Matthews and his colleagues. As you mentioned elsewhere, convergence of self-reports and non-self-reports contributes to the confidence that it wasn’t biases within the self-reports that accounted for the results.
    Bürisch found that one-, two-, and four-item scales could ‘outperform’ much longer ones, even in cross-validation samples, depending on the content saturation of the original item pool. The findings of Bürisch and of Matthews and colleagues are counterintuitive and at odds with customary theorizing. However, they appear to be in line with the reasoning of Gulliksen (1950, p. 382), cited by Bürisch (1984), that adding an item to a scale will only help to increase the validity of the scale when the correlation between the new item and the criterion is larger than the correlation between the new item and the ‘old’ set of items. When the reverse is true, the ultimate consequence of adding this kind of item would be that validity is reduced. And that is what I found, a long time ago, in a small study on depression in entrepreneurs participating in a small-business school, validating the Zung 20-item self-rating depression scale (De Jong, 1987).
    The importance of the findings on the validity of short scales possibly reaches further than the convenience of relatively short surveys. They give food for thought about concepts like latent traits and reliability and their use in structural equation modelling and meta-analyses.
    Rendel
    Bürisch, M. (1984). You don’t always get what you pay for: Measuring depression with short and simple versus long and sophisticated scales. Journal of Research in Personality, 18, 81-98.
    Bürisch, M. (1997). Test length and validity revisited. European Journal of Personality, 11, 303-315.
    De Jong (1987). Sociale Ondersteuning, Spanning en Stemming/Social Support, Stress and Mood. Dissertation, Utrecht University.
    Gulliksen, H. (1950). Theory of mental tests. New York: John Wiley and Sons, Inc.

  3. Thanks so much for your thoughts on this, Dr. Spector. Do you think that, with the results from Matthews et al. (2022), I-O/Management journals might become more accepting of single-item measures? I worry that despite these results, papers may still be rejected if they resort to single-item measures.

    Best,
    Wiston

    1. All you can do is make your case for what you do, citing Matthews, and hope you can convince your reviewers. Reviews are subjective and no matter what you do, some reviewer will complain you should have done something else.
