When I was a graduate student in the 1970s, articles focused on the problems of organizations, and statistical methods were simple. As the computer came into widespread use in the 1980s, the statistical methods used became increasingly complex. One class of computationally complex methods, not feasible to conduct by hand, is statistical modeling. These methods allow for confirmatory tests of theoretical models. Over time the use of modeling increased, as papers in top journals in industrial-organizational psychology and management focused more on fancy statistical modeling to test theories than on new insights that could be useful to practitioners. Thus, today we have the often-noted academic-practice divide. I get the allure of statistical modeling. Academics need to publish, and modeling holds the keys to the kingdom. But it occurs to me that statistical modeling methods are the real Trojan horse: they allow authors to sneak their studies into top journals.

## How Prevalent Is Modeling?

Statistical modeling has become ubiquitous in top journals in industrial-organizational psychology, management, and other fields. Rather than being just one tool in the tool kit, it has become the whole kit. I did a quick content analysis of a couple of recent issues of the top-tier *Journal of Applied Psychology* to see what statistical methods were used. Of the papers reporting on primary studies with data collected on people, every one used some form of statistical modeling. Most used structural equation modeling, which enables the testing of a process model suggesting a chain of events unfolding over time. For example, we might suggest that abusive leader behavior leads to anger and anxiety in direct reports, which in turn lead to burnout and poor job performance. Structural equation modeling might be used on survey data to see if the correlations among these variables, as rated by a sample of direct reports, are consistent with the model. Other studies use multilevel modeling, which applies when individual data are nested within groups. For example, we might conduct our leadership study by collecting data on all direct reports for all the managers in a company. Ratings of leadership for each manager will vary among their direct reports, and the average ratings will vary from manager to manager.
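To make the mediation idea concrete, here is a minimal sketch in Python. It uses simulated data and simple least-squares slopes rather than a full SEM package; the variable names and effect sizes are hypothetical, chosen only to illustrate the abuse-to-anger-to-burnout chain described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulate the hypothesized chain: abusive supervision -> anger -> burnout.
abuse = rng.normal(size=n)
anger = 0.5 * abuse + rng.normal(size=n)      # path a (true value 0.5)
burnout = 0.4 * anger + rng.normal(size=n)    # path b (true value 0.4)

def slope(x, y):
    """OLS slope of y regressed on a single predictor x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

a = slope(abuse, anger)        # abuse -> anger
b = slope(anger, burnout)      # anger -> burnout
indirect = a * b               # estimated indirect effect of abuse on burnout
total = slope(abuse, burnout)  # with full mediation, total ~= indirect

print(f"a={a:.2f}, b={b:.2f}, indirect={indirect:.2f}, total={total:.2f}")
```

Because the data were generated with full mediation (abuse affects burnout only through anger), the total effect of abuse on burnout is approximately the product of the two paths.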

## The Logic of Statistical Modeling Methods

At the end of the day, modeling is just a complicated way of investigating relationships (i.e., correlations) among variables. As I have explained in detail in the past, the deductive logic of modeling is quite weak. What we do is specify a theoretical model: A leads to B leads to C in the simplest mediator case. If that model is correct, we would expect a particular pattern of correlations among the variables. But merely showing that the model fits the data doesn’t tell us that the model is correct; we just couldn’t show it was wrong. There are many models that will fit equally well, and no way to know from the fit alone which is the correct one. In terms of logic, this is the fallacy of affirming the consequent: “if the model is true, the data will fit” does not license “the data fit, therefore the model is true.” This is the logical weakness in modeling. There are times when we might want to show that a particular model is plausible (or not), but we cannot rely on the pattern of relationships alone to tell us that we have the correct model.
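The equivalent-models problem can be demonstrated in a few lines of Python with simulated data. The testable implication of a three-variable chain A → B → C is that the A-C correlation equals the product of the A-B and B-C correlations, and the reversed chain C → B → A implies exactly the same constraint, so fit cannot distinguish the two. The effect sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Generate data from the "true" chain A -> B -> C.
A = rng.normal(size=n)
B = 0.6 * A + rng.normal(size=n)
C = 0.6 * B + rng.normal(size=n)

r = np.corrcoef([A, B, C])
r_ab, r_bc, r_ac = r[0, 1], r[1, 2], r[0, 2]

# Both A -> B -> C and the reversed C -> B -> A imply the same constraint:
# the A-C correlation equals the product of the other two correlations.
# So data consistent with one chain are equally consistent with the other.
print(f"r_ac = {r_ac:.3f}, r_ab * r_bc = {r_ab * r_bc:.3f}")
```

The constraint holds in the simulated data, yet nothing in the correlations tells us which end of the chain is the cause.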

## Statistical Modeling Methods Are the Real Trojan Horse

If statistical modeling is so weak, why is it used so much? The simple answer is Publish or Perish. Researchers are under tremendous pressure to publish and need to find ways to increase the odds of getting their articles into top journals. This is where the Trojan horse comes in. By using impressive statistical analysis, people can sneak their paper past peer reviewers and into the journal. Researchers wanting to publish look carefully at what is in the top journals, and they emulate what they see. Since so many articles in top journals use modeling, it is a safe bet that using modeling in a submitted paper will increase your chances of success. In other words, statistical modeling methods are the real Trojan horse. It is the current trend, and people will continue using it until some other trend comes along to replace it.

## What Is the Value Added in Modeling Articles?

I generally ignore the modeling portions of research articles, but that doesn’t mean I see no value in those articles. Most use nonexperimental research designs like surveys of employees. Although they might not be up to the task of showing causal connections, they can be very helpful in showing what relates to what. Those connections among variables can provide useful information for the academic field and for practitioners, including:

**Building Blocks of Theory**: Serious theories are able to describe and explain important phenomena. They are typically built upon large bodies of exploratory research that show how a given variable, say burnout, relates to other variables, as well as conditions under which those relationships might differ. A theory might take all these correlations and put them together into a cohesive explanation that can be put to further test.

**Hints about Where to Intervene**: Nonexperimental studies can’t tell us with any degree of certainty that if we change something at work with an intervention, we will achieve a desired outcome. They can tell us, however, where to start looking. They can suggest the possibility that we have identified something useful. If we find that direct report ratings of abusive supervision relate to their job performance, we can look deeper to see if abuse leads to poor performance, or if poor performance is the trigger for abuse.

**Raw Material for Meta-Analysis**: Meta-analysis is the quantitative combination of results across multiple studies of the same thing. If we have 50 studies that report the correlation between employee job satisfaction and motivation, we can compute the average correlation across those studies. We can go further to see if there are differences in the methods or populations in these studies that affect those correlations. Are the correlations greater for supervisors than individual contributors, in the United States than China, or for engineers than electricians? Although scientific fields value novelty, where each study does something new, there is a need for studies that look at the same phenomenon in different ways and with different populations.
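The meta-analytic averaging described above can be sketched in a few lines of Python. One common approach, assumed here for illustration, is to convert each correlation to Fisher's z, weight by sample size, and convert back; the correlations and sample sizes below are made up, not real study results.

```python
import math

# Hypothetical (correlation, sample size) pairs from studies of
# job satisfaction and motivation; illustrative numbers, not real data.
studies = [(0.30, 120), (0.42, 250), (0.25, 80), (0.38, 300), (0.33, 150)]

# Fisher z transform each r, weight by n - 3 (the inverse of z's
# sampling variance), then back-transform the weighted mean.
num = sum((n - 3) * math.atanh(r) for r, n in studies)
den = sum(n - 3 for r, n in studies)
mean_r = math.tanh(num / den)

print(f"Weighted mean correlation: {mean_r:.3f}")
```

A moderator analysis would then compare these weighted means across subgroups of studies (e.g., supervisors versus individual contributors).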

Statistical modeling is an important tool for research, but at the current time it is being overused, applied to data that do not warrant such complex methods. It can be hard to avoid when all you have is nonexperimental data and the publication winds are blowing in that direction. Thus, statistical modeling methods are often the real Trojan horse. This is an old tune, but the field is badly in need of research that is more relevant to practice and research that can do a better job of testing causal processes. Field experimentation and intervention studies come to mind. They can be difficult to pull off, and more difficult to publish in a field that today is obsessed with theory and complex statistics. But the potential impact is far greater, as such studies can bridge the gap between the academic and practitioner worlds while providing strong evidence for theory testing.



But what about all the data that do warrant more complex statistics to tackle endogeneity problems and satisfy underlying assumptions, and the myriad of previous studies that applied a statistical method whilst violating some of these assumptions?

Personally, I do not put a lot of faith in models with nonexperimental data to shed much light on causal processes. If you want to know if X is likely to lead to Y, you have to manipulate X under controlled conditions. I know not everyone agrees.