The Developmental Aspects of Sexual Health Laboratory

Ethical data management

1/30/2015

In Professional Development, we discussed violations of ethical data management, using some high-profile examples as case studies (e.g., Hauser, Stapel, Hwang). Instead of having all students read about every case, I assigned each student an article about one case (see the syllabus for exact readings). I think it led to good discussion: each student presented their case to their classmates, rather than my lecturing about each case or everyone coming in with the same knowledge base. Many of the cases we discussed were glaring, with entire datasets fabricated or manipulated. But we also talked about fuzzier cases and where one might draw the line.

Then we talked about best practices in data management. The goal was two-fold. First, recognizing how to avoid violating ethical data management principles. Second, even when you are doing everything ethically, making sure you do it in a way that leaves no room for anyone to suspect you of unethical practices. I should be clear that we didn’t try to tackle IRB/human subjects ethical issues this week; we focused on the data management end of things.

I suggested the following best practices:

Know your collaborators well. This point is important whether you’re talking about mentors, mentees, or collaborators on the same level. In some of the cases we discussed, there were collaborators who were getting or hearing about great data that turned out to be problematic. I’m not saying that the collaborators/students should have recognized the situation sooner; many people have been tricked in similar ways. But the more closely you work with someone, look at data together, and know the person you work with, the easier it might be to recognize problems. In many of these cases, people who had nothing to do with the fabricated or massaged data had publications that were retracted and thus disappeared from their CVs. That’s a huge deal for someone junior, and you do not want that to happen to you.

Know your own data.
Before you run analyses, look at your data: check your means and standard deviations, and get a sense of what you have. I don’t mean running analyses to test hypotheses at the start, but running descriptives and identifying problems with scales or measures. Doing these things early can prevent problems later.

Clean data before analyses. No data are perfect. Data have outliers. Data have inconsistent responses. But when do you address these issues? Don’t wait until your results are not significant to poke around and look for data to eliminate. Instead, before any analyses, clean your data. Sometimes participants report they’ve had 20,000 sex partners, or that they’ve had sex 9,000 times in the past 3 months. Other times participants answer “1” to every item on a 7-point scale even though some items are reverse scored. It’s acceptable to clean data, or even to throw out improbable data, as long as your decision rules are logical, consistent, and predetermined, that is, set before you test your hypotheses.
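
To make "predetermined decision rules" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the cutoff of 500 partners, the variable names, and the four-item scale are hypothetical, not rules from any real study. The point is only that the rules are written down as code before any hypothesis test is run, and that they flag cases rather than silently deleting them.

```python
# Hypothetical decision rules, written down BEFORE any hypothesis tests are run.
MAX_PLAUSIBLE_PARTNERS = 500  # assumed cutoff, for illustration only
SCALE_ITEMS = ["item1", "item2", "item3_rev", "item4"]  # hypothetical 7-point scale
SCALE_MIDPOINT = 4  # identical answers at the midpoint survive reverse scoring


def flag_case(row):
    """Return the names of the predetermined rules this participant's row violates."""
    flags = []
    if row["lifetime_partners"] > MAX_PLAUSIBLE_PARTNERS:
        flags.append("implausible_partner_count")
    answers = {row[item] for item in SCALE_ITEMS}
    # The same answer to every item, on a scale that includes a reverse-scored
    # item, is internally inconsistent unless that answer is the midpoint.
    if len(answers) == 1 and SCALE_MIDPOINT not in answers:
        flags.append("straight_lined_scale")
    return flags


raw = [
    {"id": 1, "lifetime_partners": 3, "item1": 2, "item2": 3, "item3_rev": 6, "item4": 2},
    {"id": 2, "lifetime_partners": 20000, "item1": 5, "item2": 4, "item3_rev": 3, "item4": 5},
    {"id": 3, "lifetime_partners": 1, "item1": 1, "item2": 1, "item3_rev": 1, "item4": 1},
]

# Map each participant id to the rules they violate (an empty list means no flags).
flags = {row["id"]: flag_case(row) for row in raw}
```

Because the rules live in one function, the same rules apply to every participant, and anyone can check later exactly what they were.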

Make data cleaning decisions openly.
Be open about the data cleaning process. Don’t make these decisions on your own, and then lose all of the raw data (pesky fires!). Make the decision process public and based on group consensus about how to handle these issues.

Document data cleaning decisions.
Document decision rules, and any cases that were changed from the raw data. Be ready to show someone your decision rules, raw data, and cleaned data, if asked.
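
One way this documentation might look in practice is sketched below in Python. It is an illustration under stated assumptions: the function name `clean_with_log`, the variable `sex_past_3_months`, and the cutoff of 900 are all hypothetical. The sketch leaves the raw data untouched, writes changes only to a copy, and records every changed case along with the rule that triggered the change.

```python
import copy


def clean_with_log(raw_rows, rule_name, variable, is_invalid):
    """Set invalid values to None in a COPY of the data and log every change."""
    cleaned = copy.deepcopy(raw_rows)  # the raw data are never modified
    log = []
    for row in cleaned:
        value = row[variable]
        if is_invalid(value):
            log.append({
                "id": row["id"],
                "variable": variable,
                "raw_value": value,
                "cleaned_value": None,
                "rule": rule_name,
            })
            row[variable] = None
    return cleaned, log


raw_rows = [
    {"id": 1, "sex_past_3_months": 12},
    {"id": 2, "sex_past_3_months": 9000},
]

# Hypothetical rule: more than 900 times in 3 months (about 10 per day) is set to missing.
cleaned_rows, change_log = clean_with_log(
    raw_rows,
    rule_name="more than 900 times in 3 months set to missing",
    variable="sex_past_3_months",
    is_invalid=lambda v: v > 900,
)
```

If someone asks, you can then hand over three things: `raw_rows`, `cleaned_rows`, and `change_log`.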

Save syntax. All syntax you ever write. One of my students, Rose Wesche, wrote a whole blog post on this point recently, so you can just read what she said.

Analyzing partial datasets.
This is a tough one. There are times when analyzing a partial dataset is highly useful. You want to submit for a conference but your data aren’t all in. I did my job talk on the first half of my dissertation data. If I hadn’t, I wouldn’t have had a job talk. But a risk here is analyzing partial data repeatedly until the results duck under the magical p < .05, and then ending data collection. If you really must analyze a partial dataset, be sure you know what your final N will be, and don’t deviate from it.
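
To see why repeated peeking is risky, here is a small simulation sketch in Python. It rests on simplifying assumptions that are mine, not from any real study: the null hypothesis is true in every simulated study, the test is a two-sided z-test with a known SD of 1, the final N is 100, and the researcher peeks after every 10 participants and stops at the first significant result.

```python
import math
import random


def z_is_significant(values):
    """Two-sided z-test of mean = 0 at alpha = .05, assuming a known SD of 1."""
    z = sum(values) / math.sqrt(len(values))
    return abs(z) > 1.96


def simulate(n_studies=2000, final_n=100, peek_every=10, seed=1):
    """Return false-positive rates for (testing once at final N, peeking repeatedly)."""
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_studies):
        data = [rng.gauss(0, 1) for _ in range(final_n)]  # the null is true
        # Strategy 1: test once, at the planned final N.
        if z_is_significant(data):
            fixed_hits += 1
        # Strategy 2: test after every `peek_every` participants and stop
        # collecting data the first time the result is significant.
        if any(z_is_significant(data[:n])
               for n in range(peek_every, final_n + 1, peek_every)):
            peeking_hits += 1
    return fixed_hits / n_studies, peeking_hits / n_studies


fixed_rate, peeking_rate = simulate()
print(f"False-positive rate, fixed N: {fixed_rate:.3f}")
print(f"False-positive rate, peeking: {peeking_rate:.3f}")
```

With a fixed N, the false-positive rate should land near the nominal .05. With peeking it can only be the same or higher, because every extra look is another chance to cross the threshold when there is no true effect.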

Archive data post-publication.
APA says to keep data for 5 years after publication. Because many of us publish from the same dataset for many years after collecting it, it’s important to archive data for many years.

Students generated the following additional ideas:

Write a clear methods section.
Others should be able to replicate your methods from it.

Imagine your worst enemy looking over your shoulder.
That is, when making data cleaning and analysis decisions, make sure they’re justifiable. Apparently my husband shared this point last year in the students’ methods class; nice to know they listened and retained it.

General transparency.

Change the publish or perish culture.
Students were concerned that many ethical violations occurred because of intense pressure to publish in order to succeed and/or obtain tenure. They thought a culture shift would decrease the prevalence of such incidents.

Stress management. Related to the prior point: as individuals, we might not be able to change the culture, but we can work on managing the pressures of academia ourselves, so that we make wise, ethical decisions.

What did we miss? What’s important to teach students about being future scientists/researchers?

“The post Ethical issues in data management first appeared on Eva Lefkowitz’s blog on January 30, 2015.”

    Eva S. Lefkowitz

    I write about professional development issues (in HDFS and other areas), and occasionally sexuality research or other work-related topics. 
