R: Reading & Filtering Weather Data from the Global Historical Climatology Network (GHCN)

1. Introduction
The GHCN weather dataset is a great starting point for exploring trends in global or regional weather patterns. It is a publicly available observational dataset with daily and monthly summaries of weather variables, including high temperature, low temperature, and precipitation. The data are available for thousands of stations worldwide, and for many of these stations the records stretch back over a century. In this blog post, we describe how to:

  • read this fixed-width dataset into R;
  • use the metadata to create a subset of weather stations in the US with data from 1936 to 2015;
  • determine the percentage of missing data for each station;

thus creating a list of weather stations in the US with 98% coverage of the weather variables TMAX, TMIN, and PRCP for the 80-year period from 1936 to 2015.
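
As a rough preview, the three steps can be sketched in base R. This is a minimal sketch, not the code developed in the rest of the post. It assumes the column layout of the GHCN-Daily station inventory file (ID, latitude, longitude, element, first year, last year), which should be checked against the readme shipped with the data, and it assumes that missing daily values are coded as -9999. The station records, the column names, and the helper pct_missing are made up for illustration.

```r
# Build a tiny in-memory stand-in for the station inventory file.
# These IDs and year ranges are illustrative, not real inventory records.
inv_lines <- sprintf(
  "%-11s %8.4f %9.4f %-4s %4d %4d",
  c("USW00000001", "USW00000001", "USW00000001",
    "USC00000002", "USC00000002", "USC00000002", "CA000000003"),
  c(40.0, 40.0, 40.0, 35.5, 35.5, 35.5, 50.0),
  c(-75.0, -75.0, -75.0, -100.25, -100.25, -100.25, -110.0),
  c("TMAX", "TMIN", "PRCP", "TMAX", "TMIN", "PRCP", "TMAX"),
  c(1900L, 1900L, 1900L, 1950L, 1930L, 1930L, 1900L),
  c(2020L, 2020L, 2020L, 2015L, 2016L, 2016L, 2020L)
)

# Step 1: read the fixed-width records. Negative widths skip the
# single-space separators between fields (assumed layout, see above).
con <- textConnection(inv_lines)
inventory <- read.fwf(
  con,
  widths = c(11, -1, 8, -1, 9, -1, 4, -1, 4, -1, 4),
  col.names = c("id", "lat", "lon", "element", "first_year", "last_year"),
  stringsAsFactors = FALSE
)
close(con)

# Step 2: keep US stations whose record for each wanted element
# spans the whole 1936 to 2015 period.
wanted <- c("TMAX", "TMIN", "PRCP")
covers <- with(
  inventory,
  startsWith(id, "US") & element %in% wanted &
    first_year <= 1936 & last_year >= 2015
)
candidates <- inventory[covers, ]

# A station qualifies only if all three elements passed the filter.
n_elements <- tapply(candidates$element, candidates$id,
                     function(e) length(unique(e)))
us_stations <- names(n_elements)[n_elements == length(wanted)]

# Step 3: percentage of missing values in a vector of observations.
# Both NA and the assumed missing-value code count as missing.
pct_missing <- function(values, missing_code = -9999) {
  100 * mean(is.na(values) | values == missing_code)
}
```

Here only the first station survives the filter: the second has a TMAX record that starts after 1936, and the third is not in the US. A station would then make the final list if pct_missing stays at or below 2 for each of TMAX, TMIN, and PRCP over the full period.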
