Growing Pains?

The future of methodology in psychological science

Richard D. Morey (Twitter @richarddmorey)
Association for Psychological Science // May 2016

Why are we still doing this?

Much of what I'm about to say has been said.

  • Meehl, P. (1978) Theoretical Risks and Tabular Asterisks: Sir Karl, Sir Ronald, and the Slow Progress of Soft Psychology.
  • Meehl, P. (1985) Why Summaries of Research on Psychological Theories Are Often Uninterpretable.

I was born in 1978. We are still struggling with these issues.

The "crisis"

A majority of scientists polled believe science has a "reproducibility crisis"

(Monya Baker, Nature News, 25 May 2016)

Some suggested fixes

  • Pre-registration
  • More replication
  • Open data/materials
  • Better methods education
  • Effect sizes/CIs
  • Meta-analysis
  • Bayesian statistics
  • etc, etc

Things that will not fix the crisis

  • Pre-registration
  • More replication
  • Open data/materials
  • Better methods education
  • Effect sizes/CIs
  • Meta-analysis
  • Bayesian statistics
  • etc, etc

Chess vs. Chmess

Dan Dennett (2013: "Intuition Pumps and Other Tools for Thinking")

Chess

Chmess

"Psmychology Scmience"

Signs that Psychological Science is a game

  • Opportunistic focus on "rules"
  • Lack of understanding of basic statistical methods
  • Ineffectiveness of psychological theory
  • Theoretical frameworks as fashions, rhetorical games
  • Citations as long-standing "urban myths"
  • Proliferation of proxy variables/auxiliary hypotheses
  • Unreasonable importance of incentives

The deeper problem

Psychology* has a lack of empirical efficacy and a lack of epistemic accountability.

*and to varying extents, other sciences

Opportunistic focus on "rules"

Rules are very important for any game!

  • With \(p<0.05\), no one can question my result (statistics as fairness)
    •  ...but when \(p=0.09\)...
    • this will be replaced by \(B_{10}>X\)
    • "Authors/editors need standards!"
  • Transactional model of authorship (authorship as fairness)
    • In discussion on data sharing: "What do you offer in exchange for the data [if not authorship]?"

Importance of incentives

Are bad incentive structures at fault?

  • Incentive to publish
  • Incentive to obtain funding
  • Incentive to push a narrative

Ultimate incentive: empirical efficacy

  • Require empirical efficacy \(\rightarrow\) better behaviour follows: Good science works.
  • Prime importance of other incentives indicates a basic failure.

Without empirical efficacy, all incentives will be gamed.

What about the QRP-busters?

Do QRP-busting tests suggest psychology is self-correcting?

  • Based on untenable assumptions about scientific practice (Morey, 2013)
  • Applied post hoc to "suspicious" sets of experiments (Simonsohn, 2013)
  • Developed to show what is already suspected (e.g. Klaassen, 2015)
  • Based on uninterpretable statistical philosophy (e.g. Klaassen, 2015)

"[N]ew statistical tools, perhaps especially those that provide potential critics with access to easy publications, can be misused." (Simonsohn, 2013)

QRP-busters are players in the same game.

The solution

Demand empirical efficacy

  • Focus on useful application of knowledge
  • Be skeptical of correlational "tests" of theories

Describe phenomena; ignore theory

  • Understand: theories are easy to "confirm" (proxy variables, moderators, etc)
  • Aim to deeply understand a phenomenon
    • Can I parametrically manipulate effect size?
    • Under what conditions is it apparent?
    • Under what conditions does it disappear?

Unconscious thought theory

Dijksterhuis (2004, JPSP) described the "unconscious thought advantage"

  • Experiment 1: distracted decisions are better (p=.012)
  • "Experiment 2 served various purposes. One goal was to replicate the effects of Experiment 1. However, rather than asking participants to evaluate each apartment separately, they were now asked to choose one of the apartments."
  • "Experiment 3 was designed to replicate the finding of superior unconscious thought with different stimulus materials."
  • "However, it is not yet clear what exactly happens during the unconscious thought period. Experiments 4 and 5 were designed to shed more light on this process."

Strick et al. (2011) meta-analysis

"Despite its scientific acclaim and its appeal to a broad audience, UTT has been heavily criticized. One recurring point of criticism is that the UTE is difficult to replicate."

Strick et al. identify "moderators" in a meta-analysis

"In summary, the results of this analysis show that the UTE is a real phenomenon... However, the results also clearly highlight that the occurrence of the UTE depends on various moderators. Without taking notice of the boundary conditions there is little guarantee that researchers will replicate the UTE in their future studies. That being said, we would like to encourage researchers to make adjustments to the initial paradigm according to their own situation and their experimental intuition."

Meta-analysis (badly) does the job that should have been done initially!
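The moderator logic can be sketched as a toy inverse-variance meta-analysis: pooled over all studies the effect looks modest, but splitting on a binary "moderator" yields an effect in one subgroup and none in the other. This is a minimal illustration only — the study values, variances, and subgroup split below are invented, not Strick et al.'s data or method.

```python
# Toy sketch (invented numbers, not Strick et al.'s data): fixed-effect,
# inverse-variance meta-analysis with a binary "moderator" subgroup split.

def pooled_effect(effects, variances):
    """Inverse-variance weighted mean effect size."""
    weights = [1.0 / v for v in variances]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)

# Hypothetical studies: (effect size d, variance, moderator present?)
studies = [
    (0.60, 0.04, True),   # original-style paradigm
    (0.45, 0.05, True),
    (0.05, 0.04, False),  # paradigm varied
    (-0.10, 0.06, False),
]

overall  = pooled_effect([d for d, v, m in studies],       [v for d, v, m in studies])
with_mod = pooled_effect([d for d, v, m in studies if m],  [v for d, v, m in studies if m])
without  = pooled_effect([d for d, v, m in studies if not m],
                         [v for d, v, m in studies if not m])

print(f"overall={overall:.2f}, with moderator={with_mod:.2f}, without={without:.2f}")
```

The point of the slide survives the sketch: a pooled "real phenomenon" can coexist with near-zero effects outside the moderator's boundary conditions — which is exactly the mapping of conditions that should have been done in the original studies.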

Still doesn't replicate (Nieuwenstein & van Rijn, 2012; Nieuwenstein et al., 2015)

Conclusion

Psychological science is full of (mere) game playing

  • Lack of empirical efficacy and epistemic accountability

Solution

  • Demand focus on phenomena, not "theory"
  • Demand efficacy in useful situations

Better science, and theory, will follow.