By the end of this lab, you will be able to:
You’ve all worked on projects where, six months later, you couldn’t remember where the data came from, or where a collaborator couldn’t run your code because the paths were broken. Personally, I have worked with folks who had gold mines of data, but the data were so disorganized or messy that it was hard to tell original data from mistakes from manipulation that maybe shouldn’t have happened in the first place. Think of this as setting up scaffolding before building: it feels like overhead now, but it will save you hours of frustration later when you’re writing your dissertation methods section or responding to reviewer comments.
We have been talking about reproducibility in science, and in ecology, for several years now. We won’t delve too much into it in this lab, but here are some resources:
In RStudio:
Create a new project named jsdm-labs-yourname (or any other name).
Run this code in your new project console (you can also add these folders manually in the ‘Files’ pane):
# Create project folders
dir.create("data/raw", recursive = TRUE)
dir.create("data/processed", recursive = TRUE)
dir.create("scripts") # some people use 'R' instead, or 'notebooks'
dir.create("figures")
dir.create("outputs")
What each folder is for:
data/raw/ - Original downloaded data (NEVER EDIT THESE)
data/processed/ - Your cleaned data
scripts/ - Your R scripts or qmd files for each lab
figures/ - Saved plots
outputs/ - Reports, results, etc.
The Golden Rule: Raw data is read-only. If you need to change something, save a new version in data/processed/ and document what you changed and why.
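One way to follow this rule in practice is sketched below. It runs in a temporary folder so it does not touch a real project, and the file names (sites_raw.csv, sites_clean.csv) and columns are made up for illustration. Marking the raw file read-only with Sys.chmod() is an optional extra, not something this lab requires.

```r
# Minimal sketch of the raw -> processed workflow.
# Everything lives in a fresh temporary folder; file and column names are hypothetical.
project <- tempfile("golden-rule-demo-")
dir.create(file.path(project, "data", "raw"), recursive = TRUE)
dir.create(file.path(project, "data", "processed"), recursive = TRUE)

# Stand-in for a downloaded raw file
raw_file <- file.path(project, "data", "raw", "sites_raw.csv")
write.csv(
  data.frame(site = c("A", "B", "C"), depth_m = c(2.5, NA, 4.1)),
  raw_file,
  row.names = FALSE
)

# Optional: mark the raw file read-only so it cannot be overwritten by accident
Sys.chmod(raw_file, mode = "0444")

# Read the raw data, clean it in memory, and leave the raw file untouched
raw <- read.csv(raw_file)
cleaned <- raw[!is.na(raw$depth_m), ]  # change: drop rows with missing depth

# Save the cleaned version under data/processed/ with a new name
processed_file <- file.path(project, "data", "processed", "sites_clean.csv")
write.csv(cleaned, processed_file, row.names = FALSE)
```

The comment next to the cleaning step doubles as a record of what changed and why. In a real project the same note would also go in your script or in data/README.md.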
Create a file called README.md in your project root (File -> New File -> Text File, save as README.md):
# JSDM Course Labs - [Your Name]
Course work for Advanced Community Ecology (Spring 2025)
## Project Structure
- `data/` - All datasets (see data/README.md for sources)
- `scripts/` - Analysis scripts for each lab
- `figures/` - Generated visualizations
- `outputs/` - Reports and results
## Labs Completed
- [ ] Lab 0: Project Setup
- [ ] Lab 1: Community Data EDA
- [ ] Lab 2: TBD
When using NLA data in publications or reports, follow the recommended citation and acknowledgements. Cite as:
Citation: U.S. Environmental Protection Agency. [insert the year the survey report was published]. National Aquatic Resource Surveys. [insert the survey name and survey year] (data and metadata files). Available from U.S. EPA web page: https://www.epa.gov/national-aquatic-resource-surveys/data-national-aquatic-resource-surveys. Date accessed: YYYY-MM-DD.
Create data/README.md:
# Data Sources
This file documents all datasets used in this project.
## Template for Each Dataset:
**Dataset Name:**
**Download Date:**
**Source URL:**
**Files Downloaded:**
**Citation:**
**Notes:**
---
[Add your datasets below as you download them]
Your project should now look like this:
jsdm-labs-yourname/
├── data/
│ ├── raw/
│ ├── processed/
│ └── README.md
├── scripts/
├── figures/
├── outputs/
├── README.md
└── jsdm-labs-yourname.Rproj
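If you want a quick programmatic check of this layout, the base R sketch below compares the tree above against what is actually on disk. The helper name missing_paths() is made up for this example. It assumes you run it from the project root, and it skips the .Rproj file because that file's name depends on what you called your project.

```r
# Expected layout, taken from the tree above
expected_dirs <- c("data/raw", "data/processed", "scripts", "figures", "outputs")
expected_files <- c("README.md", "data/README.md")

# Hypothetical helper: returns the expected paths that are missing under `root`
missing_paths <- function(root = ".") {
  dirs_missing <- expected_dirs[!dir.exists(file.path(root, expected_dirs))]
  files_missing <- expected_files[!file.exists(file.path(root, expected_files))]
  c(dirs_missing, files_missing)
}

# Run from the project root; an empty result means nothing is missing
missing_paths(".")
```

Anything the helper returns is a folder or file you still need to create.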
There is a neat way to check this and build these ‘trees’ with the fs package. After you run the code below, take a screenshot or list your files to confirm:
fs::dir_tree(path = ".", recurse = TRUE)
Can you: