Just as there have been high profile cases around data management issues in the past few years, there have been similar high profile cases around ethical issues in responsible reporting. Although we spend a fair bit of time in class reviewing different types of ethical issues and discussing specific case studies, I am only going to mention the pitfalls briefly here, to focus instead on best practices. Particular ethical issues in publishing, some of which I discussed in that 2015 post, include (many of these points draw from readings that I cite in my syllabus):
- Plagiarism
- Self-plagiarism
- p-hacking/fishing for significance
- HARKing: Hypothesizing after the results are known
- Garden of forking paths
- Non-transparency in analyses
- Not correcting for Type I error
- Claiming that the difference between significant and not significant is meaningful
- Overstating meaningfulness, particularly with large sample sizes (ignoring practical significance)
- Ignoring meaningful results that may not be significant
- Viewing p < .05 as a magical cutoff
- Causal conclusions without causal evidence
- Not revealing conflict of interest/funding sources
So, here are some suggestions for best practices:
Don’t plagiarize or self-plagiarize: Yes, it seems obvious. I’ve talked about these issues in more detail in previous posts. But the main point is that accidental plagiarism and self-plagiarism are relatively common, and you should follow best practices to avoid them.
Write hypotheses before running analyses: It is very common for people (not just students) to say, “I’m really interested in how [broad construct A] relates to [broad construct B],” then run a bunch of correlations to see whether indicators of A relate to indicators of B, then drop the variables/analyses that don’t work very well, keep the ones that do, and create a story around these findings. If instead you formulate hypotheses first, you can commit to running specific analyses and keeping all of them in your paper, without HARKing, that is, without inventing post-hoc hypotheses for why A would relate to B in that way. You don’t have to report every unsuccessful analysis in a table – you can, for instance, say that you expected interactions with gender, but Step 3, in which you added those interactions, was never significant, so you do not report those analyses in the table. And you can still run follow-up analyses if you find something you can’t quite explain – just explain clearly that you ran those analyses to follow up on the unexpected finding, rather than pretending you had planned to run them all along.
Statistically test the difference between two analyses: If two variables are significantly correlated but two others are not, don’t describe them as meaningfully different findings. And if two variables are correlated for one group but not another, don’t describe it as “X matters for group A but not group B.” For instance, if you’re interested in whether body image has similar associations with girls’ and boys’ sexual behavior, you can run interactions with gender rather than separate analyses. If you run the analyses separately and one correlation is significant and the other is not, it could be meaningful, or it could be noise that pushed one p-value slightly above .05 and the other slightly below it, or it could be that the correlations are identical but one group was slightly larger. With separate analyses, you cannot conclude that the associations are meaningfully different, even if one reaches statistical significance and the other does not. One simple way to test the difference directly is sketched below.
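To make this concrete, here is a minimal sketch in Python, with hypothetical correlations and sample sizes, of Fisher’s r-to-z test, one common way to check whether two independent correlations actually differ (an interaction term in a single regression accomplishes the same goal within one model):

```python
# A minimal sketch (hypothetical numbers) of Fisher's r-to-z test for
# whether two independent correlations differ, e.g., the body image /
# sexual behavior correlation for boys vs. girls.
import numpy as np
from scipy import stats

def fisher_z_test(r1, n1, r2, n2):
    """Two-tailed test of whether two independent correlations differ."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)    # Fisher r-to-z transform
    se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    p = 2 * stats.norm.sf(abs(z))              # two-tailed p-value
    return z, p

# r = .25 (n = 80) is significant on its own; r = .15 (n = 80) is not.
# Yet the two correlations do not differ significantly from each other,
# so "X matters for group A but not group B" would be unjustified.
z, p = fisher_z_test(0.25, 80, 0.15, 80)
print(f"z = {z:.2f}, p = {p:.3f}")  # z = 0.65, p = .52
```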
Describe correlational results without causal language: It is so easy to write that X predicted Y – that’s even the way we talk about variables statistically in regression. But unless you have manipulated something, avoid causal language. Explain associations, but don’t say that one variable led to the other, even if you have longitudinal data. Helpful ways to write about non-causal associations in the results section:
- Body image was positively associated with number of sexual partners
- Young men who had more positive body image tended to have more sexual partners
- Having better body image may lead men to feel more confident sexually, which in turn leads to opportunities for more sexual partners. However, it is important to note that these results are correlational. It is also possible that having more sexual partners leads men to feel better about their bodies, or that something else explains this association. For instance, it may be that men with better self-esteem both feel better about their bodies and have more sexual partners.
Consider effect size: If you have an enormous sample, it is easy to get a significant correlation, even at r = .10. But a correlation of .10 means that you explained only 1% of the variance (r² = .01), which is not particularly meaningful. Be aware of the practical significance of your results.
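A quick simulation, a sketch with made-up data and an assumed n of 10,000, illustrates the point: the correlation is highly “significant” but explains almost none of the variance.

```python
# A minimal sketch (simulated data, assumed n) showing that with a huge
# sample even a trivial correlation is "significant": r of about .10 is
# p < .001 at n = 10,000, yet explains only about 1% of the variance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
y = 0.10 * x + rng.normal(size=n)  # true correlation is roughly .10

r, p = stats.pearsonr(x, y)
print(f"r = {r:.3f}, p = {p:.2g}, variance explained = {r**2:.1%}")
```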
Reveal all potential conflicts of interest and funding sources. Enough said.
Consider pre-registration: There are arguments for and against, as I discussed in my 2015 post. But do know that pre-registration is becoming increasingly common, and some journals now offer registered reports, in which they review your paper before you run analyses and, if they accept it, agree to publish it whether or not you find significant results.
Somehow this list does not feel very comprehensive, perhaps because some of the best practices in data management also apply here, and because I wrote about the topic four years ago. But if you add “avoid the bullet points at the top of this post,” you have a pretty good list of good practices to follow.
Best practices in responsible reporting first appeared on Eva Lefkowitz’s blog on June 13, 2019.