Monday, December 13, 2010

Canadian Heat Waves Part 2

What do we do with missing records?

A proper analysis needs a complete record set, but that does not exist in the EC data. I have downloaded 570 stations that have data; about as many again returned no data from EC.

I scanned all the stations and counted, for each station and each year, the records that actually contain data, collecting the results in one table. This is the actual count of records per year:

Of the 570 stations across all of Canada, only 4 have a 100% complete dataset for 1900-2009.
The spreadsheet of all 570 stations and their stats can be downloaded here.
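
The per-station, per-year count described above could be sketched roughly as follows. This is only an illustration, not the script actually used: the record layout (station id, date, value) and the name `count_records_per_year` are assumptions, and a value of `None` stands in for a missing reading.

```python
from collections import Counter
from datetime import date


def count_records_per_year(records):
    """Count non-missing records per (station, year).

    `records` is an iterable of (station_id, day, value) tuples;
    this layout is hypothetical, with value None meaning "no data".
    """
    counts = Counter()
    for station_id, day, value in records:
        if value is not None:
            counts[(station_id, day.year)] += 1
    return counts


# Tiny made-up sample, for illustration only.
sample = [
    ("A", date(1900, 7, 1), 25.0),
    ("A", date(1900, 7, 2), None),   # missing reading, not counted
    ("A", date(1901, 7, 1), 27.5),
    ("B", date(1900, 7, 1), 30.1),
]
counts = count_records_per_year(sample)
```

A year with a full daily record would show 365 (or 366) for that station; anything less marks a gap.
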
Even for stations with narrow start dates, not all have complete records between their start and end years.
This plot shows each station's start date against the percentage of the expected records it actually has. Each dot is a station. Notice how few have complete records.

Expanded view of the above plot.
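
The percentage plotted for each station can be thought of as records found divided by records expected. A minimal sketch, assuming one record per calendar day from January 1 of the start year to December 31 of the end year (the function names here are made up for illustration, and the actual calculation may have differed):

```python
from datetime import date


def expected_days(start_year, end_year):
    """Number of calendar days from Jan 1 of start_year to Dec 31 of end_year."""
    return (date(end_year, 12, 31) - date(start_year, 1, 1)).days + 1


def percent_complete(record_count, start_year, end_year):
    """Percentage of the expected daily records that are actually present."""
    return 100.0 * record_count / expected_days(start_year, end_year)


# A station spanning 1900-2009 should have this many daily records.
full_span = expected_days(1900, 2009)

# A hypothetical station with half of them would plot at 50%.
half = percent_complete(full_span / 2, 1900, 2009)
```

Leap years are handled by the date arithmetic, so a station is only scored 100% if every calendar day in its span has a record.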

The claim is made that one can fill in gaps in one station with records from a nearby station, combining the two into one record set. Even if doing that could produce anything meaningful, this data suggests it cannot be done until well into the 20th century, which means losing the beginning years.
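
To make the gap-filling idea concrete, here is a naive sketch of what combining two stations would look like. It is an assumption-laden illustration, not anyone's published method: `fill_gaps` is a hypothetical name, each station is a dict mapping a date to a value (or `None` when missing), and no adjustment is made for differences between the two sites.

```python
from datetime import date


def fill_gaps(primary, neighbour):
    """Combine two stations: keep primary values, fall back to the neighbour.

    Both arguments are dicts of date -> value, with None for a missing
    reading. Days missing from both stations stay None in the result.
    """
    combined = {}
    for day in sorted(set(primary) | set(neighbour)):
        value = primary.get(day)
        if value is None:
            value = neighbour.get(day)
        combined[day] = value
    return combined


# Made-up values for illustration only.
d1, d2, d3 = date(1900, 7, 1), date(1900, 7, 2), date(1900, 7, 3)
primary = {d1: 25.0, d2: None}
neighbour = {d2: 24.0, d3: None}

combined = fill_gaps(primary, neighbour)
still_missing = [day for day, value in combined.items() if value is None]
```

Any day missing from both stations remains a gap, which is the situation the early years of the record present.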

So how does one combine stations into a long-term trend of anomalies with such scant data? There isn't any way to do it, period. This is why one can only look at specific stations.

