November 11, 2016 // WSN, Monterey, California

Trying to do reproducible science



data_final_v2.xls



Re: FWD: data question





Sorry guys, this probably isn't reproducible science

Actually doing reproducible science


We struggled to reproduce and repeat our own work.

Data science principles and tools have changed how we do science:

  • reproducibility
  • collaboration
  • communication



Upcoming paper: how we now do reproducible science by drawing on data science philosophy and tools

Lowndes et al., in prep:

Data science


Science and data science

Science:

  • when is X most abundant?
  • what are Y's habitat preferences?


    This is scientific discovery; these questions don't yet have answers.









Data science:

  • how do I import my data?
  • how do I subset the years I want?



    This is data science; there are existing solutions and tools for these questions.
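    As a minimal sketch of the two data science questions above (importing data, then subsetting years), here is one way it could look. The talk recommends R; Python's standard library is used here only for illustration. The column names, the helper functions `import_data` and `subset_years`, and every value in the table are made up for this example and are not OHI data.

```python
import csv
import io

# Hypothetical table: yearly counts for species "X".
# All values are invented for illustration only.
RAW_CSV = """year,species,count
2010,X,12
2011,X,30
2012,X,25
2013,X,41
"""


def import_data(text):
    """Read CSV text into a list of dict rows, converting numeric columns to int."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["year"] = int(row["year"])
        row["count"] = int(row["count"])
        rows.append(row)
    return rows


def subset_years(rows, first_year, last_year):
    """Keep only the rows whose year lies in [first_year, last_year]."""
    return [r for r in rows if first_year <= r["year"] <= last_year]


data = import_data(RAW_CSV)             # "how do I import my data?"
recent = subset_years(data, 2011, 2012)  # "how do I subset the years I want?"
print([r["year"] for r in recent])
```

    Because both steps live in a script, anyone can rerun them on the same file and get the same subset, which is not true of manual edits in a spreadsheet.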

Ocean Health Index

a method to score benefits that oceans provide to people

  • we struggled to reproduce our own work
  • we had focused only on scientific methods, not on data prep (= data science)

Ocean Health Index

we now work reproducibly, and support others building from our work

Data science tools

we use these free, open-source tools with growing, inclusive communities

Doing reproducible science

coding + version control are the keystone

Doing reproducible science

but must also collaborate and communicate effectively

Doing reproducible science

You too can do better science in less time

1. Learn to code
    - in R
    - with RStudio

2. Use version control
    - git
    - with GitHub
    - through RStudio







3. Learn in an intentional way

  • feel empowered (not in a panic)
  • think ahead (not for a single purpose)
  • with a community (not in isolation)

Great resources

Thank you

HUGE thanks to the OHI team, colleagues, and the #rstats community