The idea of pre-registering studies with journals before conducting them is spreading. Many journals encourage researchers to submit study plans (introduction and method) for peer review prior to data collection (or with the hybrid version, prior to submitting the entire paper). Reviewers can render opinions about the value of the proposed paper and the journal can commit to publishing the paper, no matter how the results come out. This procedure has the potential to reduce some questionable research practices.
Advantages of Pre-Registering
There are advantages to pre-registering for individual researchers and their scientific fields.
- Encourages the researcher to more thoroughly think through their purpose, hypotheses, methods and analysis before conducting the study. This can improve the quality of the study because the purpose is more clearly developed and the methods are more closely matched to the purpose. It is less likely that the researcher will realize at the time of write-up that something would have been better done a different way.
- Discourages HARKing (hypothesizing after results are known). If hypotheses are pre-registered before data are collected, then hypotheses will have been chosen a priori.
- Discourages p-hacking (conducting a series of analyses on the same data until results are significant and in the desired direction). If data analysis is chosen before data are collected, there is less room for playing with the data to get desired results.
- Reduces the bias toward confirmatory findings in the literature (publication bias). Studies that fail to find significant results will be just as likely to be published as studies that find them.
Pre-Registering Is Not Enough
The idea of pre-registering has arisen during a time of increasing realization that what appears in scientific journals is often a byproduct of questionable research practices, such as HARKing and p-hacking. What is not clear, however, is whether people who have engaged in questionable practices will continue to game the system even with pre-registering. These questionable practices occur because of the reward systems for researchers, and those have not changed. What is needed is not just pre-registration, but a change in how researchers conduct their research. In other words, you can pre-plan even if you don’t pre-register.
Pre-Plan Even If You Don’t Pre-Register
The idea of pre-registering is to get researchers to publicly commit to the methods and analyses they will use prior to collecting data. But the key is not just the public commitment; it is that researchers honestly pre-plan and then execute those plans. With statistical tests, this is the best way to ensure that the probabilities on which they are based are not distorted. In technical terms, pre-planning keeps the actual Type 1 error rate at the level the researchers claim. Without it, that rate can be inflated, in some cases tremendously, as demonstrated by Joseph Simmons, Leif Nelson and Uri Simonsohn.
Pre-planning means that you decide in advance the hypotheses you will test, the procedures for data collection, and the analyses that you will run. The entire data handling process should be planned, from cleaning the data to running the analyses. The basis for dropping cases (e.g., because of failing an attention check with surveys or a manipulation check with experiments) should be decided in advance, as should whether or not transformations will be used. What should be avoided is an iterative process where different data manipulations are tried until results look “better”. As Simmons et al. showed, each additional thing you try with the data raises the chances of getting bogus results. Do enough things and you are almost guaranteed to find results, but at the risk of publishing a study that is not going to be replicable because what you found was purely by chance.
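To make the inflation concrete, here is a minimal simulation sketch. It is not a reproduction of the Simmons et al. simulations; it is a simplified illustration under stated assumptions: two groups with no true difference, several independent outcome measures, a z-test with known variance, and a researcher who reports a finding if any outcome reaches p < .05. The function names and default values are hypothetical. Under these assumptions, the chance of at least one false positive across k independent tests is 1 - (1 - .05)^k, which is about .23 for k = 5.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05
STD_NORMAL = NormalDist()  # standard normal, used to turn z into a p-value


def null_p_value(n_per_group, rng):
    """Two-sided z-test p-value when both groups come from the same N(0, 1)."""
    group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
    group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
    # Known variance of 1 per observation, so SE of the mean difference is sqrt(2/n).
    z = (fmean(group_a) - fmean(group_b)) / (2 / n_per_group) ** 0.5
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def false_positive_rate(n_outcomes, n_per_group=20, n_studies=2000, seed=1):
    """Share of simulated null studies where at least one outcome is 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = [null_p_value(n_per_group, rng) for _ in range(n_outcomes)]
        if min(p_values) < ALPHA:  # report whichever outcome "worked"
            hits += 1
    return hits / n_studies


def expected_rate(n_outcomes):
    """Analytic rate for independent tests: 1 - (1 - alpha)^k."""
    return 1 - (1 - ALPHA) ** n_outcomes


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, round(false_positive_rate(k), 3), round(expected_rate(k), 3))
```

With a single pre-planned outcome, the simulated rate should stay near the nominal .05. With each additional outcome that could be reported instead, the rate climbs, even though nothing real is there to find.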
What If I Want To Do Exploratory Research?
Exploratory research is an approach where the researcher has no hypotheses to test. He or she conducts research to find interesting patterns in data. The use of data mining techniques can yield valuable insights, and should be encouraged. There are three things to keep in mind.
- There is a difference between data mining and p-hacking. Data mining is about looking at relationships among a large number of variables to determine which are statistically significant. P-hacking is about trying a variety of ways to look at a specific relationship. Data mining might ask which of a large number of variables can predict a particular behavior (e.g., purchasing a particular brand of product). P-hacking is about trying different ways to test if a specific variable can predict a behavior.
- Data mining can still be pre-planned. The researcher can decide in advance the series of analyses that will be conducted.
- Results of data mining should be cross-validated. Depending upon size, a dataset can be divided into two or more subsamples. Data mining can be conducted on one subsample, and the results can be tested for replication on the other “hold-out” sample(s). As more and more statistical tests are conducted, it becomes increasingly important to establish that results can replicate.
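The hold-out idea in the last point can be sketched as follows. This is only an illustration on made-up data: the dataset, the effect size, the 50/50 split, and the function names are all assumptions, and the p-value uses the Fisher z approximation for a correlation rather than an exact test. Forty candidate predictors are screened in an exploration half, and only the ones that pass are re-tested in the hold-out half.

```python
import math
import random
from statistics import NormalDist, fmean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Approximate two-sided p-value for r using the Fisher z-transformation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def make_data(n_cases, n_predictors, rng):
    """Hypothetical data: only predictor 0 is truly related to the outcome."""
    rows = []
    for _ in range(n_cases):
        predictors = [rng.gauss(0, 1) for _ in range(n_predictors)]
        outcome = 0.5 * predictors[0] + rng.gauss(0, 1)
        rows.append((predictors, outcome))
    return rows


def significant_predictors(rows, candidates, alpha=0.05):
    """Return the candidate predictor indices with p < alpha in these rows."""
    outcome = [y for _, y in rows]
    found = []
    for j in candidates:
        column = [xs[j] for xs, _ in rows]
        if approx_p_value(pearson_r(column, outcome), len(rows)) < alpha:
            found.append(j)
    return found


rng = random.Random(7)
rows = make_data(n_cases=400, n_predictors=40, rng=rng)
rng.shuffle(rows)                        # random assignment to subsamples
explore, holdout = rows[:200], rows[200:]

discovered = significant_predictors(explore, range(40))  # data mining step
replicated = significant_predictors(holdout, discovered)  # replication step

if __name__ == "__main__":
    print("found in exploration half:", discovered)
    print("replicated in hold-out half:", replicated)
```

With 40 tests at the .05 level, a couple of pure-noise predictors would be expected to pass the exploration step by chance alone. Few of them should pass a second time in the hold-out half, while the one real relationship should pass both.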
Changing More Than Submission Rules
Pre-registration is intended to reduce questionable research practices and improve the integrity of science. It has caught on to varying degrees among different scientific fields, with perhaps the most use in fields that rely on experimental methods. But we need to do more than just change how people submit papers. We need to change how research is done so that pre-planning becomes the norm, even when papers are not pre-registered.