Based on the Software Carpentry strategy of collaborative development of hands-on, interactive lessons for workshops, we facilitate and develop the lessons for Data Carpentry workshops.
These lessons are distributed under the CC-BY license and are free for re-use or adaptation, with attribution. People have used the lessons in courses, as a basis for new lessons, and for self-guided learning.
Data Carpentry workshops are domain-specific, so that we teach researchers the skills most relevant to their domain, using examples from their type of work. We therefore have several types of workshops, and lessons are organized by topic.
- Ecology materials
- Genomics materials
- Geospatial data materials
- Social science materials
- Biology semester-long materials
Ecology Workshop
This workshop uses a tabular ecology dataset from the Portal Project Teaching Database and teaches data cleaning, management, analysis, and visualization. There are no prerequisites, and the materials assume no prior knowledge of the tools. We use a single dataset throughout the workshop to model the data management and analysis workflow that a researcher would use.
The workshop can be taught using R or Python as the base language.
Genomics Workshop
This workshop focuses on data management and analysis for genomics research. It covers metadata organization in spreadsheets, data organization, connecting to and using cloud computing, the command line for sequence quality control and bioinformatics workflows, and R for data analysis and visualization. The workshop does not teach any particular bioinformatics tools; instead, it teaches the foundational skills that will allow you to conduct any analysis and to analyze the output of a genomics pipeline.
Geospatial Data Workshop
This workshop is co-developed with the National Ecological Observatory Network (NEON). It focuses on working with geospatial data: managing and understanding spatial data formats, understanding coordinate reference systems, and working with raster and vector data in R for analysis and visualization.
|Working with vector data in R||Leah Wasser, Joseph Stachelek|
|Working with raster data in R||Leah Wasser, Joseph Stachelek|
|Introduction to Geospatial data||Leah Wasser, Joseph Stachelek|
Social Science Materials
This is not yet a full workshop, but we have a lesson focused on text mining in R.
|Social sciences text mining||Ben Marwick|
Biology Semester-long Course
The Biology Semester-long Course was developed and piloted at the University of Florida in Fall 2015. Course materials include readings, lectures, exercises, and assignments that expand on the material presented at workshops focusing on SQL and R. The course is accessible to: