Cloudiness Trends for the 2017 Solar Eclipse

Planning an excursion to see the upcoming solar eclipse? NASA can help with that! They provide two sets of data which can point you to good viewing:

A little bit of R scripting lets us combine these and put them onto a Leaflet map of the US.

Downloads

Before coding, there are two things to download:

R: Analyzing weather data from the Global Historical Climatology Network (GHCN)

1. Introduction
The R code below extracts weather metrics from the GHCN weather data set, such as maximum daily temperature, minimum daily temperature, average or total daily rainfall, and other annual metrics. The code assumes that you have already created a dataframe of the GHCN stations of interest to you. In this exercise, that set consists of the 520 stations within the US that have data for the 80 years from 1936–2015, with less than 2% missing data (see “R: Reading & Filtering GHCN weather data” for how this set was created). The dataframe of stations (stn80 in our case) should include, at a minimum, the station ID, LAT, and LON; the latitude and longitude are used for mapping the metrics.

The GHCN weather data has one data file for each station. The station data files were converted from the original fixed-width format to a comma-separated format before being read into R, and each one then has the following layout:

head(USC00010252)
  X.1  X          ID year month element Val1 Val2 Val3 Val4 Val5 Val6 Val7 Val8 Val9 Val10 Val11 Val12 Val13 Val14 Val15 Val16 Val17 Val18 Val19 Val20 Val21 Val22 Val23 Val24 Val25 Val26 Val27 Val28 Val29 Val30 Val31
1  42 42 USC00010252 1938     1    TMAX   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA   244   256   239   233   222   194   189   233   228   239   239   239   250   250   200    67    67    94   178   233   183
2  43 43 USC00010252 1938     1    TMIN   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    67    94   111   111   117   144   144   167   183   183   139   122   150   139    44   -72   -67   -17   -17    78    56
3  45 45 USC00010252 1938     1    PRCP    0   64   25    0    0   89  191    0    0     0     0    15     0     5     0     0     0    23     0     0     0     0     0   203     0     0     0     0     0     0    41
4  48 48 USC00010252 1938     2    TMAX  172  161  211  233  261  256  256  256  250   261   256   239   256   233   261   233   233   239   233   128   261   261   178   128   117   211   233   206    NA    NA    NA
5  49 49 USC00010252 1938     2    TMIN  -28   33   83   50   78  156   89   94  106    83    94   100    83   122    89   133    78   128    56     6    78   133   111    33     6     0    44    78    NA    NA    NA
6  51 51 USC00010252 1938     2    PRCP    0    0    0    0    0    0    0    0    0     0     0     0     0     0     0     0     0     3   686     0     0     0   114     0     0     0     0     0    NA    NA    NA

Note: TMAX and TMIN are in tenths of a degree Celsius, so 172 is 17.2 °C.
We will manipulate these station data files in R to create several different metrics and write them to their own output files.
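As a rough illustration of the kind of manipulation involved, here is a minimal sketch that averages the daily values in each row and converts the result from tenths of a degree to degrees Celsius. The data frame is a small excerpt of the February 1938 rows shown above (first three day columns only), and the helper name monthly_mean_c is hypothetical; it is not the code from the full post.

```r
# Excerpt of the February 1938 rows shown above (first three day columns only).
stn <- data.frame(
  ID      = "USC00010252",
  year    = 1938,
  month   = 2,
  element = c("TMAX", "TMIN", "PRCP"),
  Val1    = c(172, -28, 0),
  Val2    = c(161,  33, 0),
  Val3    = c(211,  83, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean of the daily values for one element, per
# year/month row, converted from tenths of a degree to degrees Celsius.
monthly_mean_c <- function(df, element) {
  # Day columns are named Val1, Val2, ...; find them by pattern.
  val_cols <- grep("^Val[0-9]+$", names(df), value = TRUE)
  rows <- df[df$element == element, ]
  data.frame(
    year   = rows$year,
    month  = rows$month,
    # na.rm = TRUE skips missing days (and the NA padding in short months).
    mean_c = rowMeans(as.matrix(rows[, val_cols]), na.rm = TRUE) / 10
  )
}

tmax_monthly <- monthly_mean_c(stn, "TMAX")
tmin_monthly <- monthly_mean_c(stn, "TMIN")
print(tmax_monthly)
```

The same pattern works on a full station data frame with all 31 day columns. Annual metrics could then be built from the monthly rows with aggregate().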

R: Reading & Filtering weather data from the Global Historical Climatology Network (GHCN)

1. Introduction
The GHCN weather data set is a great starting point for exploring trends in global or regional weather patterns. It is a publicly available observational dataset that has daily and monthly summaries of weather variables, including high temperature, low temperature, and precipitation. The weather data is available for thousands of stations worldwide, and for many of these stations the weather records stretch back over a century. In this blog post, we describe how to:

  • read this fixed-width dataset into R;
  • use the metadata information to create a subset of weather stations in the US with data from 1936–2015;
  • determine the percentage of missing data for each station;

thus creating a list of weather stations in the US with 98% coverage of the weather variables TMAX, TMIN, and PRCP for the 80-year period from 1936 to 2015.
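The first and third steps can be sketched roughly as follows. This is a minimal illustration, not the code from the full post. It assumes the GHCN-Daily .dly record layout: an 11-character station ID, a 4-character year, a 2-character month, a 4-character element, then 31 repeats of a 5-character value followed by three 1-character flags, with -9999 marking a missing value. The sample record is rebuilt from the January 1938 TMAX row of station USC00010252 and written to a temporary file, so no download is needed.

```r
# Rebuild one fixed-width record from the January 1938 TMAX values of
# USC00010252 (days 1-10 missing), using blank flag characters.
vals <- c(rep(-9999L, 10),
          244L, 256L, 239L, 233L, 222L, 194L, 189L, 233L, 228L, 239L, 239L,
          239L, 250L, 250L, 200L, 67L, 67L, 94L, 178L, 233L, 183L)
record <- paste0("USC00010252", "1938", "01", "TMAX",
                 paste0(sprintf("%5d", vals), "   ", collapse = ""))

dly_file <- tempfile(fileext = ".dly")
writeLines(record, dly_file)

# Widths: ID, year, month, element, then 31 x (5-char value, 3 skipped flag
# characters). Negative widths tell read.fwf to skip those columns.
widths <- c(11, 4, 2, 4, rep(c(5, -3), 31))
dly <- read.fwf(dly_file, widths = widths,
                col.names = c("ID", "year", "month", "element",
                              paste0("Val", 1:31)),
                na.strings = "-9999", stringsAsFactors = FALSE)

# Percentage of missing daily values across all rows read.
day_vals <- as.matrix(dly[, paste0("Val", 1:31)])
pct_missing <- 100 * mean(is.na(day_vals))
print(pct_missing)
```

Note that this naive percentage also counts the padding for days that do not exist (for example, 30 and 31 February) as missing, so a real coverage calculation would need to restrict the count to valid calendar days.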
