OpenRefine for Ecology

Part of the data workflow is preparing the data for analysis. Some of this involves data cleaning, where errors in the data are identified and corrected or formatting is made consistent. This step must be taken with the same care and attention to reproducibility as the analysis.

OpenRefine (formerly Google Refine) is a powerful free and open source tool for working with messy data: cleaning it and transforming it from one format into another.

This lesson will teach you to use OpenRefine to effectively clean and format data and automatically track any changes that you make. Many people comment that this tool saves them months of work they would otherwise spend making these edits by hand.

Getting Started

Data Carpentry’s teaching is hands-on, so participants are encouraged to use their own computers to ensure the proper setup of tools for an efficient workflow.
These lessons assume no prior knowledge of the skills or tools.

To get started, follow the directions in the “Setup” tab to download data to your computer and follow any installation instructions.


This lesson requires a working copy of OpenRefine (formerly called Google Refine).

To most effectively use these materials, please make sure to install everything before working through this lesson.

For Instructors

If you are teaching this lesson in a workshop, please see the Instructor notes.


Setup: Download files required for the lesson
00:00 1. Introduction: Motivation for using OpenRefine
00:10 2. Working with OpenRefine: Getting started working with OpenRefine
00:20 3. Filtering and Sorting with OpenRefine: Filtering and sorting data
00:30 4. Examining Numbers in OpenRefine: Examining numerical data
00:40 5. Scripts from OpenRefine: Code generation from OpenRefine
01:00 6. Exporting and Saving Data from OpenRefine: How to save and export data from OpenRefine
01:20 7. Other Resources in OpenRefine: Other resources available for working with OpenRefine
01:40 Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.