Commit d962f13

Merge pull request #41 from fhdsl/ah/tweaks_part2
Review of "Using DaSEH materials"
2 parents ff9410b + dfb525d commit d962f13

5 files changed

Lines changed: 89 additions & 67 deletions

01-intro.Rmd

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,6 @@
33
ottrpal::set_knitr_image_path()
44
```
55

6-
76
# Introduction <img src="https://raw.githubusercontent.com/fhdsl/daseh_instructor_guide/1afbe6783430718ae6a63c607da6e457a73a90ff/assets/leaf.png" alt="Leaf" loading="lazy" style="width:10%; height:auto; vertical-align: middle; display: inline;">
87

98
***

03-daseh_use.Rmd

Lines changed: 85 additions & 65 deletions
@@ -7,35 +7,38 @@ ottrpal::set_knitr_image_path()
77

88
## Learning Objectives
99

10-
1110
This chapter will provide guidance on how to use DaSEH resources for instruction including how to:
1211

1312
- Determine prerequisite knowledge and skills
14-
- Identify what material is appropriate for beginner, intermediate, or advanced learners.
15-
- Use the full set of DaSEH resources, some of the modules, or just the data for different kinds of instruction.
16-
- Extend the materials to serve as a template for homework assignments or independent student exploration.
13+
- Identify what material is appropriate for beginner, intermediate, or advanced learners
14+
- Use the full set of DaSEH resources, some of the modules, or just the data for different kinds of instruction
15+
- Extend the materials to serve as a template for homework assignments or independent student exploration
16+
17+
The examples presented in this chapter are merely suggestions. Modifications to the material to fit student needs are expected and encouraged! If you find a different way to use our resources, please [let us know](https://daseh.org/contact.html) so that other educators may be inspired by your creativity.
1718

18-
The examples presented in this chapter are merely suggestions - modifications to the material to fit student needs are expected and encouraged! If you come up with a different way to use our resources, please [let us know](https://daseh.org/contact.html) what you come up with so that other educators may be inspired by your creativity.
19+
::: {.notice}
20+
Under our [CC BY-NC-SA license](introduction.html#reuse-and-licensing), you should indicate that you are using our resources or a modified version of our resources.
21+
:::
1922

2023
### Prerequisites
2124

25+
The following are suggested for students using DaSEH materials.
2226

2327
#### Environmental Health Subject Matter
2428

25-
The materials in DaSEH use data related to environmental health. There is no requirement for any prior knowledge on environmental health. The resources are also applicable for those interested in data science for other uses.
29+
DaSEH uses data related to environmental health. There is no requirement for any prior knowledge on environmental health. DaSEH resources are also applicable for those interested in data science for other uses.
2630

2731
#### Statistics
2832

29-
The DaSEH materials do expect some familiarity with statistics and focuses mostly on the application of R for analysis, rather than the theory of statistics. We recommend additional resources for statistics if you are teaching a statistics course.
33+
DaSEH materials expect some familiarity with statistics and focus mostly on the application of R for analysis, rather than the theory of statistics. We recommend additional resources for statistics if you are teaching a statistics course. Alternatively, you may choose to omit the [Statistics module](https://daseh.org/modules/Statistics/Statistics.html).
3034

3135
#### Coding/Data Science
3236

33-
All materials for DaSEH use the R statistical programming language for data analysis. No familiarity with R basics is expected for learners.
37+
All DaSEH materials use the R statistical programming language for data analysis. No familiarity with R basics is expected for learners.
3438

3539
#### Software
3640

37-
All case studies use the R statistical programming language for data analysis. While there is no specific R version requirement for the case studies, the `OCSdata` package, which can be used to get and load the data, does require R 3.5. Furthermore, R packages used to run specific analyses in each case study may have their own R version requirements. R version requirements may be checked in the `sessionInfo()` section in each case study.
38-
41+
DaSEH uses R and RStudio. The most recent versions tested with DaSEH can be found [here](https://daseh.org/docs/module_details/day0.html).
3942

4043
### Experience Level Descriptions
4144

@@ -53,7 +56,7 @@ Typically, most middle/high school and first year undergraduate students will fi
5356

5457
The DaSEH materials are structured in a modular manner to support both partial and full use of our materials. Educators are also free to use the DaSEH data by itself.
5558

56-
### Teaching the full set of materials
59+
### Teaching DaSEH - Full Set of Materials
5760

5861
The DaSEH materials are written to provide a comprehensive introduction to environmental health data science. Our materials provide students with experience in all the standard aspects of a data science workflow as well as best practices regarding reproducibility. The following list provides a few examples of how educators could use the materials:
5962

@@ -79,59 +82,67 @@ ottrpal::include_slide('https://docs.google.com/presentation/d/1vCiMPvvsdwQjiMWj
7982

8083
[See the slide directly.](https://docs.google.com/presentation/d/1vCiMPvvsdwQjiMWjf0YuSpTkG0DGXsy1614cRiFc7ns/edit?slide=id.g3d507fbfd91_0_4#slide=id.g3d507fbfd91_0_4)
8184

82-
### Teaching Part of the DaSEH Materials
85+
### Teaching DaSEH - Part of the Materials
8386

84-
Some educators may find that only certain modules are relevant to their course learning objectives. Each provides information about how to access the appropriate data. Note that you may have to add some introduction to explain any functions that were explained in a previous module.
87+
Depending on the course objectives, instructors might choose a subset of modules. Note that some introduction/explanation might be needed for any functions that were explained in a previous module. The following are a few examples of how our modules could be used:
8588

89+
#### Data Visualization Focus
8690

87-
* For a data visualization course, the following modules could be useful:
91+
The following modules could be useful:
8892

89-
- Basic R (only if students don't have familiarity with R)
90-
- RStudio (only if students don't have familiarity with R)
91-
- Manipulating Data in R (to convert data from wide to long format to facilitate data visualization)
92-
- Intro to Data Visualization
93-
- Data Visualization
94-
- Factors
93+
- Basic R (only if students don't have familiarity with R)
94+
- RStudio (only if students don't have familiarity with R)
95+
- Manipulating Data in R (to convert data from wide to long format to facilitate data visualization)
96+
- Intro to Data Visualization
97+
- Data Visualization
98+
- Factors
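
As a rough illustration of why the Manipulating Data in R module is listed for a visualization focus, the sketch below reshapes a small wide table into long format. The data frame and its column names are invented for illustration and are not a DaSEH dataset; the example assumes the `tidyr` package is installed.

```r
library(tidyr)

# Toy wide-format data (invented for illustration; not a DaSEH dataset)
wide <- data.frame(
  county = c("A", "B"),
  visits_2020 = c(10, 20),
  visits_2021 = c(12, 18)
)

# Gather the yearly columns into one "year" column and one "visits" column
long <- pivot_longer(
  wide,
  cols = starts_with("visits_"),
  names_to = "year",
  names_prefix = "visits_",
  values_to = "visits"
)

long
```

Long format gives one row per county and year, which is the shape plotting functions that map a column to an axis or color typically expect.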
9599

96-
* For a data wrangling course, the following modules could be useful:
100+
#### Data Wrangling Focus
97101

98-
- Basic R
99-
- RStudio (only if students don't have familiarity with R)
100-
- Subsetting Data in R
101-
- Data Classes
102-
- Data Cleaning
103-
- Manipulating Data in R
104-
- Factors
102+
The following modules could be useful:
105103

106-
* For a reproducibility course the following modules could be useful:
104+
- Basic R
105+
- RStudio (only if students don't have familiarity with R)
106+
- Subsetting Data in R
107+
- Data Classes
108+
- Data Cleaning
109+
- Manipulating Data in R
110+
- Factors
107111

108-
- Reproducibility
109-
- Data Input
110-
- Data Output
111-
- Functions
112+
#### Reproducibility Focus
112113

113-
* For a data ethics course the following materials could be useful:
114+
The following modules could be useful:
114115

115-
- Reproducibility
116-
- Version Control (from the codeathon materials)
117-
- Data ethics (from the codeathon materials)
118-
119-
<br>
116+
- Reproducibility
117+
- Data Input
118+
- Data Output
119+
- Functions
120120

121+
#### Data Ethics Focus
121122

123+
The following materials could be useful:
122124

125+
- Reproducibility
126+
- Version Control (from the codeathon materials)
127+
- Data Ethics (from the codeathon materials)
123128

129+
### Teaching DaSEH - Data Only
124130

125-
### Teaching With DaSEH Data Only
131+
Educators can use DaSEH's data without using the DaSEH materials as a whole. The data is available on GitHub in the [data directory](https://github.com/fhdsl/DaSEH/tree/main/data) and the [data page](https://daseh.org/data.html) of the website. See the [data section](https://hutchdatascience.org/daseh_instructor_guide/daseh-infrastructure.html#data) of the infrastructure chapter for more information about how to access the data.
126132

127-
Educators can use the data available with the DaSEH without using the DaSEH materials as a whole. The data is available on GitHub in the [data directory](https://github.com/fhdsl/DaSEH/tree/main/data). See the [data section](https://hutchdatascience.org/daseh_instructor_guide/daseh-infrastructure.html#data) of the infrastructure chapter for more information about how to access the data.
133+
The data can also be accessed directly in R via URL, replacing `filename.csv` with the name of the data file in the following pattern:
134+
135+
```
136+
"https://daseh.org/data/filename.csv"
137+
```
128138

129-
The data can also be accessed in R by updating the path of daseh.org/data/ with the name of the data file in the URL like the following examples:
139+
For example:
130140

131141
```{r, eval = FALSE}
142+
# readr package is required for read_csv()
132143
library(readr)
133-
er <-
134-
read_csv("https://daseh.org/data/CO_ER_heat_visits.csv")
144+
145+
er <- read_csv("https://daseh.org/data/CO_ER_heat_visits.csv")
135146
136147
er_visits_age <- read_csv("https://daseh.org/data/CO_ER_heat_visits_by_age.csv")
137148
@@ -144,13 +155,11 @@ Be careful to make sure that the name of the file matches exactly include the ca
144155
ottrpal::include_slide('https://docs.google.com/presentation/d/1vCiMPvvsdwQjiMWjf0YuSpTkG0DGXsy1614cRiFc7ns/edit?slide=id.g3b78c177085_0_0#slide=id.g3b78c177085_0_0')
145156
```
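
The URL pattern described above could also be wrapped in a small helper so that students only supply the file name. This is a minimal sketch: `daseh_data_url()` and `read_daseh()` are hypothetical names that do not exist in any DaSEH package, and reading the data assumes the `readr` package and an internet connection.

```r
# Hypothetical helper: build the URL for a DaSEH data file from its file name.
# The file name must match the name on the data page exactly.
daseh_data_url <- function(filename) {
  paste0("https://daseh.org/data/", filename)
}

# Hypothetical helper: read a DaSEH data file straight from the website.
# Requires the readr package and an internet connection.
read_daseh <- function(filename) {
  readr::read_csv(daseh_data_url(filename))
}

er_url <- daseh_data_url("CO_ER_heat_visits.csv")
er_url
```

Calling `read_daseh("CO_ER_heat_visits.csv")` would then be equivalent to passing the full URL to `read_csv()`.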
146157

158+
A table showing which module(s) each dataset is used in is available here: https://daseh.org/data.html
147159

148-
A table of which data is used in which materials is available here: https://daseh.org/data.html
149-
160+
The paper [How to be “Choosy”: Wrangling big datasets for the classroom](https://onlinelibrary.wiley.com/doi/abs/10.1111/test.70022) may also be useful when considering what data to use for teaching (a PDF can be found [here](https://github.com/fhdsl/daseh_instructor_guide/blob/3e8e8afa94912866fa62989165f7b1d89295b607/resources/Wilkerson2025.pdf)).
150161

151162

152-
A paper about how to consider what data to use for teaching may also be useful to read called:[How to be “Choosy”: Wrangling big datasets for the classroom](https://onlinelibrary.wiley.com/doi/abs/10.1111/test.70022).
153-
154163
## DaSEH Level Recommendations
155164

156165
Overall the DaSEH materials are intended for anyone with zero to minimal familiarity with R, although we have had learners with more intermediate levels of experience who have reported getting a lot out of the material.
@@ -167,10 +176,10 @@ Here, we are using the following interpretations of "beginner", "intermediate",
167176
| Intermediate | Some experience with importing common data formats (e.g. CSVs) into R or significant experience in another programming language. Some experience wrangling or cleaning raw data in common formats (e.g. numerical data) in R or significant experience in another programming language. Some experience with common visualization packages in R (e.g. ggplot) or significant experience in another programming language. Some familiarity with common statistical concepts (e.g. summary statistics, hypothesis testing) and techniques (e.g. t-test). |
168177
| Advanced | Experience with importing uncommon data types (e.g. PDFs or web-scraping) and comfort with troubleshooting import challenges. Experience cleaning and wrangling raw data in uncommon formats (e.g. regular expressions) in R and comfort with troubleshooting wrangling challenges. Experience with creating complex data visualizations in R and comfort with visualization challenges. Good understanding of foundational statistical concepts and comfort with applying foundational statistical techniques. |
169178

179+
<br>
170180

171-
172-
The following table lists a few example case studies that would be suitable for each experience level.
173-
181+
The following table lists a few example case studies that would be suitable for each experience level.
182+
<br>
174183

175184
| Module | Skill Level|
176185
| ------- | --------- |
@@ -196,46 +205,57 @@ The following table lists a few example case studies that would be suitable for
196205
| Data Ethics | All levels |
197206
| Mapping mini-module | Intermediate |
198207

208+
199209
## Troubleshooting
200210

201-
You may encounter errors trying to render our materials.
211+
You may encounter errors trying to render our materials from the `.Rmd` files.
202212

203213
New versions of R packages can change arguments and function names, which can cause code to work differently or break.
204214

205-
If you encounter an error, this is likely the reason. We try to update our materials when we can, but updates to packages may happen in the meantime. You can either use the error message from trying to knit the Rmd file to determine what function may have been updated or deprecated (we recommend this option to help you or your students learn the most up-to-date information).
215+
If you encounter an error, this is likely the reason. We try to update our materials when we can, but updates to packages may happen in the meantime. We recommend using the error messages produced when knitting the `.Rmd` file to determine which function(s) may have been updated or deprecated. This helps you or your students learn the most up-to-date information.
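
When tracking down a rendering error, it may help to capture the error message and check which package versions are installed. The sketch below is one possible approach, not part of the DaSEH materials: `report_versions()` and `knit_error()` are hypothetical helper names, and `knit_error()` assumes the `rmarkdown` package is installed.

```r
# Hypothetical helper: report installed versions of the packages a lesson uses.
# Returns NA for any package that is not installed.
report_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

# Hypothetical helper: knit a file and return the error message instead of
# stopping. Returns NULL if rendering succeeds.
knit_error <- function(rmd_path) {
  tryCatch({
    rmarkdown::render(rmd_path, quiet = TRUE)
    NULL
  }, error = function(e) conditionMessage(e))
}

# Base packages are used here only so the example runs anywhere
versions <- report_versions(c("stats", "utils"))
versions
```

The message returned by `knit_error()` usually names the function or argument that failed, which can then be looked up in that package's changelog.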
216+
206217

207218
## Additional Use Cases
208219

209-
Our materials can be used in a variety of ways that cater to the learner's goals, experience, and interests. Below, we provide a few examples of how they could be used . If you use DaSEH resources in a new way, we would love to [hear](https://daseh.org/contact.html) about it!
220+
Our materials can be used in a variety of ways that cater to the learner's goals, experience, and interests. Below, we provide a few examples of how they could be used. If you use DaSEH resources in a new way, please [let us know](https://daseh.org/contact.html) about it!
221+
222+
### Using Materials for Assignments
223+
224+
Assignments could include:
225+
226+
- Scientific writing like [writing scientific journal sections](https://github.com/advdatasci/homework9) (e.g. Introduction, Methods, Results, Discussion) based on the data and analysis
227+
228+
- [Extending analyses](https://github.com/advdatasci/homework11) based on results presented in the lectures and labs
229+
230+
- Using data in the [Predictability, Computability, and Stability (PCS) framework](https://yu-group.github.io/vdocs/PCSDoc-Template.html) to think critically about real-world data challenges
210231

211-
1. Using materials for assignments
232+
- Additional data visualization
212233

213-
Assignments could include:
214-
- Scientific writing like [writing scientific journal sections](https://github.com/advdatasci/homework9) (e.g. Introduction, Methods, Results, Discussion) based on the the data
215-
- [Extending analyses](https://github.com/advdatasci/homework11) based on results presented in the lectures and labs.
216-
- Additional data visualization
217-
- Presentations
234+
- Presentations
218235

219236
Our final project guidelines could also be expanded to create a more involved project.
220237

221-
222-
2. Independent Study
238+
### Independent Study
223239

224-
Our materials and recordings be used for learners to gain experience in statistics and data science independently. We strongly recommend that independent learners aim to actively engage with the recordings by running the analyses independently, and exploring additional data to investigate their own hypotheses. Furthermore, creating a finished product, such as a blog post or a presentation, can be an excellent demonstration of the skills learned.
240+
Our materials and recordings can be used to help learners gain experience in statistics and data science independently. We strongly recommend that independent learners actively engage with the recordings by running the analyses themselves and exploring additional data to investigate their own hypotheses. Furthermore, creating a finished product, such as a blog post or a presentation, can be an excellent demonstration of the skills learned.
225241

226242

227243
## Additional Resources
228244

245+
### Resources for Data Science and Writing
246+
229247
- Considerations for effective and ethical [data visualization](http://jtleek.com/ads2020/week-5.html)
230248
- [Advanced Data Science Course](http://jtleek.com/ads2020/) taught by [Jeff Leek](https://jtleek.com/) and [Roger Peng](https://rdpeng.org/) at [Johns Hopkins Bloomberg School of Public Health](https://publichealth.jhu.edu/).
249+
231250
- Resource on [writing and evaluating scientific writing](https://ocw.mit.edu/courses/20-109-laboratory-fundamentals-in-biological-engineering-spring-2010/pages/assignments/guidelines-for-writing-up-your-research/).
232251

233-
Resources for GitHub, Code Review, and Reproducibility:
252+
### Resources for GitHub, Code Review, and Reproducibility
234253

235254
- Guide to [code review](https://hutchdatascience.org/code_review/)
255+
236256
- [Introduction to reproducibility course](https://jhudatascience.org/Reproducibility_in_Cancer_Informatics/)
237-
- [Advanced Reproducibility course](https://jhudatascience.org/Adv_Reproducibility_in_Cancer_Informatics/) (this includes more information about how to create a pull request and how to do code review)
238257

258+
- [Advanced Reproducibility course](https://jhudatascience.org/Adv_Reproducibility_in_Cancer_Informatics/) (this includes more information about how to create a pull request and how to do code review)
239259

240260

241261
## Session info

config_automation.yml

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ check-quizzes: false
88
quiz_error_min: 0
99
# Check that urls in the content are not broken
1010
url-checker: true
11-
url_error_min: 0
11+
url_error_min: 1
1212
# Spell check Rmds and quizzes
1313
spell-check: true
1414
spell_error_min: 0

resources/dictionary.txt

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ cheatsheet
1010
cheatsheets
1111
codeathon
1212
codesmall
13+
Computability
1314
Coursera
1415
creativecommons
1516
css

resources/ignore-urls.txt

Lines changed: 2 additions & 0 deletions
@@ -3,3 +3,5 @@ https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].
33
https://www.contributor-covenant.org/faq][FAQ].
44
https://www.contributor-covenant.org/translations][translations].
55
https://github.com/ottrproject/OTTR_Template/issues/new/choose)!
6+
https://daseh.org/data/filename.csv
7+
