OpenRefine Lessons for Librarians

This lesson is a short introduction to OpenRefine, a tool for exploring and “cleaning up” structured data. The example used in this lesson is an excerpt from a database of research articles. Although the lesson was developed with librarians in mind, the concepts it introduces should apply to other fields as well.

The lesson materials are based on the Library Carpentry OpenRefine for librarians lesson (work in progress).

Prerequisites

No particular skills are required to follow this lesson, although general familiarity with structured data (e.g. spreadsheets) is helpful.

Schedule

13:00  Introduction to OpenRefine
       What is OpenRefine?
       What can it do?
13:10  Importing data into OpenRefine
       How do I get data into OpenRefine?
13:55  Basic OpenRefine functions I: Working with columns, sorting, faceting, filtering and clustering
       How do I move, rename or remove columns in OpenRefine?
       How do I sort data in OpenRefine?
       How can I work with a subset of my full data set in OpenRefine?
       How can I easily correct common data issues in my data with OpenRefine?
14:25  Basic OpenRefine functions II
       How do I edit my data based on filters and facets?
       How do I use transformations to programmatically edit my data?
       How do I use GREL, the General Refine Expression Language?
       How do I save and reuse a set of operations for use in subsequent projects?
       What are the data formats supported by OpenRefine and why should I care?
15:05  Advanced OpenRefine functions
       How do I fetch data from an Application Programming Interface (API) to be used in OpenRefine?
       How do I reconcile my data by comparing it to authoritative datasets?
       How do I install extensions for OpenRefine?
15:35  Finish