Sunday, March 27, 2016

Baby Steps

If you hadn’t noticed by now, I can be indecently peppy. Over at PsychMAP, when my (brilliant and thoughtful) co-moderator gloomily surveys the imperfections of our field and threatens to drown his sorrows in drink, I find myself invariably springing in like Tigger with a positive reframing. We’ve found flaws in our practices? What a chance to improve! An effect didn’t replicate? Science is sciencing! Eeyore’s tail goes missing? Think of all the opportunities!

Every now and then, though, I stare at this field that I love without recognizing it, and I want to sit down in the middle of the floor and cry. Or swear. Or quit and open a very small but somehow solvent restaurant that serves delicious food with fancy wine pairings to a grand total of six people per night.

The thing that gets me to land mid-floor with a heavy and heart-sinking whomp is not discovering imperfections about our past or noting the lurching, baby-giraffe-like gait of our field’s uneven but promising steps toward progress—it’s when the yelling gets so loud, so polarized, so hands-over-my-ears-I-can’t-hear-you-la-la-la that it drowns out everything else. It’s when Chicken Little and the Ostrich have a screaming match. It’s when people stop being able to listen.

I’ve written before about some of the unintended and damaging consequences that this kind of tone can have, and here’s another: Add these loud debates to the shifting standards and policies in our field right now, and the average researcher’s instinct might be, quite reasonably, to freeze. If it’s unclear which way things are going and each side thinks the other is completely and hopelessly wrong about everything, then maybe the best course of action is to keep your head down, continue with business as usual, and wait to see how things shake out. What’s the point of paying attention yet if nobody can agree on anything? Why start trying to change if the optimal endpoint is still up for debate?

If you find yourself thinking something along these lines from time to time, I am here to say (passionately and melodramatically from where I sit contemplating my future prospects as a miniature restaurateur, plopped in the middle of the floor):

Here’s the thing: We don’t have to agree on exactly where we’re going, and we certainly don’t have to agree on the exact percentage of imperfection in our field, to agree that our current practices aren’t perfect. Of course they’re not perfect—nothing is perfect. We can argue about exactly how imperfect they are, and how to measure imperfection, and those discussions can sometimes be very productive. But they’re not a necessary precondition for positive change.

In fact, all that’s needed for positive change is for each of us to acknowledge that our research practices aren’t perfect, and to identify a step—even a very small, incremental, baby step—that would make them a little better. And then to take the step. Even if it’s the tiniest baby step imaginable in the history of the world. One step. And then to look around for another one.

So for example, a few years ago, my lab took a baby step. We had a lab meeting, which was typical. We talked about a recent set of articles on research practices, which was also typical. But this time, we asked ourselves a new question: In light of what we know now about issues like power, p-hacking, meta-analysis, and new or newly rediscovered tools like sequential analyses, what are some strategies that we could adopt right now to help distinguish between findings that we trust a lot and findings that are more tentative and exploratory?

We made a list. We talked about distinguishing between exploratory and confirmatory analyses. We talked about power and what to do when a power analysis wasn’t possible. We generated some arbitrary heuristics about target sample sizes. We talked about how arbitrary they were. We scratched some things off the list. We added a few more.

We titled our list “Lab Guidelines for Best Practices,” although in retrospect, the right word would have been “Better” rather than “Best.” We put a date at the top. We figured it would evolve (and it did). We decided we didn’t care if it was perfect. It was a step in the right direction. We decided that, starting that day, we would follow these guidelines for all new projects.

We created a new form that we called an Experiment Archive Form to guide us. (It evolved, and continues to evolve, along with our guidelines for better practices. Latest version available here.) 

And starting with these initial steps, our research got better. We now get to trust our findings more—to learn more from the work that we do. We go on fewer wild goose chases. We discover cool new moderators. We know when we can count on an effect and when we should be more tentative.

But is there still room for improvement? Surely there is always room for improvement.

So we look around.

You look around, too.

What’s one feasible, positive next step?

Some places to look:

        Braver, Thoemmes, & Rosenthal (2014 PPS): Conducting small-scale, cumulative meta-analyses to get a continually updating sense of your results.
        Gelman & Loken (2014 American Scientist): Clear, concise discussion of the “garden of forking paths” and the importance of acknowledging when analyses are data-dependent.
        Judd, Westfall, & Kenny (2012 JPSP): Treating stimuli as a random factor, boosting power in studies that use samples of stimuli.
        Lakens & Evers (2014 PPS): Practical recommendations to increase the informational value of studies.
        Ledgerwood, Soderberg, & Sparks (in press chapter): Strategies for calculating and boosting power, distinguishing between exploratory and confirmatory analyses, pros and cons of online samples, when and why to conduct direct, systematic, and conceptual replications.
        Maner (2014 PPS): Positive steps you can take as a reviewer or editor.
        PsychMAP: A Facebook group for constructive, open-minded, and nuanced conversations about Psychological Methods and Practices.