Robert Rohde1, Judith Curry2, Donald Groom3, Robert Jacobsen3,4, Richard A. Muller1,3,4, Saul Perlmutter3,4, Arthur Rosenfeld3,4, Charlotte Wickham5, ...
For comparison, the sampling statistical uncertainty is also shown (black), though it does not contribute to the total. From 1900 to 1950, the spatial uncertainty is dominated by the complete lack of any stations on the Antarctic continent. From 1960 to the present, the statistical uncertainty is largely dominated by fluctuations in the small number of Antarctic temperature stations. For comparison, the land-only 95% uncertainties for HadCRU and NOAA are presented. As discussed in the text, in addition to spatial and statistical considerations, the HadCRU and NOAA curves include additional estimates of “bias error” associated with urbanization and station instrumentation changes that we do not currently consider. These added “bias error” contributions are small to negligible during the post-1950 era, but they are a large component of the previously reported uncertainties circa 1900.
The two types of uncertainty tend to co-vary. This reflects the way station networks historically developed: increases in station density (which reduce statistical uncertainty) tended to occur at the same times as increases in spatial coverage (which reduce spatial uncertainty). Overall, we estimate that the total uncertainty in the 12-month land-surface average from these factors has declined from about 0.7 C in 1800 to about 0.06 C in the present day.
The step change in spatial uncertainty in the early 1950s is driven by the introduction of the first weather stations to Antarctica during this time. Though the introduction of weather stations to Antarctica eliminated the largest source of spatial uncertainty, it coincidentally increased the statistical uncertainty during the post-1950 period. The Antarctic continent represents slightly less than 10% of the Earth’s land area and yet at times has been monitored by only about a dozen weather stations. To the extent that these records disagree with each other, they serve as a large source of statistical noise. An example of this occurred in 1979 (see Figure 9), when an uncertainty of a couple of degrees in the mean temperature of Antarctica led to an uncertainty of ~0.2 C in the whole land-surface average.
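The arithmetic behind the 1979 example can be made explicit. The sketch below uses illustrative numbers only (not the actual station data), and the helper names `regional_contribution` and `combined_sigma` are hypothetical; it simply propagates one region's uncertainty into an area-weighted global mean.

```python
# How an uncertainty in one region's mean propagates into the area-weighted
# global land average. Numbers are illustrative, not the paper's data.
import math

def regional_contribution(area_fraction, regional_sigma):
    """Uncertainty contributed to the area-weighted mean by a single region."""
    return area_fraction * regional_sigma

def combined_sigma(contributions):
    """Combine independent regional uncertainties in quadrature."""
    return math.sqrt(sum(c ** 2 for c in contributions))

# Antarctica: ~10% of land area, mean uncertain by ~2 C (as in the 1979 case)
antarctic = regional_contribution(0.10, 2.0)
print(round(antarctic, 2))  # -> 0.2, i.e. ~0.2 C on the land average
```

A region covering a tenth of the land thus maps a 2 C regional uncertainty directly into a ~0.2 C global contribution.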
Since the 1950s, the GHCN has maintained a diverse and extensive spatial coverage, and as a result the inferred spatial uncertainty is low. However, we do note that GHCN station counts have decreased precipitously from a high of 5883 in 1969 to about 2500 at the present day. This decrease has primarily affected the density of overlapping stations while maintaining broad spatial coverage. As a result, the statistical uncertainty has increased somewhat. We note again that the decrease in station counts is essentially an artifact of the way the GHCN monthly data set has been constructed. In fact, the true density of weather monitoring stations has remained nearly constant since the 1960s, which should allow the “excess” statistical uncertainties shown here to be eliminated once a larger set of stations is considered in a future paper.
A comparison of our uncertainties to those reported by HadCRU and NOAA (Figure 9) is warranted (comparable figures for GISS are not available). Over much of the record, we find that our uncertainty calculation yields a value 50-75% lower than those of these other groups. As the sampling curves demonstrate (Figure 6), the reproducibility of our temperature time series on independent data is extremely high, which justifies our conclusion that the statistical uncertainty is very low. This should be sufficient to estimate the uncertainty associated with any unbiased sources of random noise affecting the data. Similarly, the concordance of the analytical and empirical spatial uncertainties gives us confidence in those estimates as well.
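The jackknife style of error analysis invoked here can be illustrated with a toy computation. This is a minimal sketch on synthetic data, not the paper's actual pipeline: the uncertainty of a simple station average is estimated by recomputing the mean with each station deleted in turn.

```python
# Minimal jackknife sketch on synthetic station data (illustrative only):
# estimate the uncertainty of an unweighted station average via leave-one-out.
import math
import random

def jackknife_sigma(values):
    """Jackknife standard error of the mean of `values`."""
    n = len(values)
    full_mean = sum(values) / n
    # leave-one-out means
    loo = [(sum(values) - v) / (n - 1) for v in values]
    # jackknife variance of the mean
    var = (n - 1) / n * sum((m - full_mean) ** 2 for m in loo)
    return math.sqrt(var)

random.seed(0)
stations = [random.gauss(10.0, 1.0) for _ in range(100)]
print(jackknife_sigma(stations))  # comparable to 1.0 / sqrt(100) = 0.1
```

For independent, identically distributed values the jackknife standard error coincides with the familiar s/√n, which is why it serves as a useful internal-consistency check.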
In comparing the results, we must note that curves by prior groups in Figure 9 include an extra factor they refer to as “bias error” by which they add extra uncertainty associated with urban heat islands and systematic changes in instrumentation (Brohan et al. 2006; Smith and Reynolds 2005). As we do not include comparable factors, this could explain some of the difference. However, the “bias” corrections being used cannot explain the bulk of the difference.
HadCRU reports that the inclusion of “bias error” in their land average contributes a negligible portion of the total error during the period 1950-2010. This increases to about 50% of the total error circa 1900, and then declines again to about 25% of the total error around 1850 (Brohan et al. 2006). These amounts, though appreciable, are still substantially less than the difference between our uncertainty estimates and the prior estimates. We therefore conclude that our techniques can estimate the global land-based temperature with considerably less spatial and statistical uncertainty than prior efforts.
The assessment of bias and structural uncertainties may ultimately increase our total uncertainty, though such effects will not be quantified here. As mentioned previously, in one of our other submitted papers (Wickham et al.) we conclude that the residual effect of urbanization on our temperature reconstruction is probably close to zero nearly everywhere. In addition, the scalpel technique, baseline adjustments, and reliability measures should be effective at reducing the impact of a variety of biases. As such, we believe that any residual bias in our analysis will also be less than previous estimates. However, further analysis of our approach is needed before we can determine how effective our techniques are at eliminating the full range of biases.
We should also comment on the relatively large uncertainties in Figure 9 compared to those in Figure 1. These imply that the other groups believe past ocean temperatures have been much more accurately constrained than land-based temperatures. This conclusion is stated more explicitly in Smith and Reynolds (2005) and Brohan et al. (2006).
In considering the very earliest portions of our reconstruction, we should note that our uncertainty analysis may appreciably understate the actual uncertainty. This can occur for two principal reasons. First, the uncertainty attributed to spatial undersampling is based primarily on the variability and spatial structure of climate observed during the latter half of the twentieth century. For example, our approach assumes that the difference between temperatures in the Southern Hemisphere and temperatures in Europe was similar in magnitude and range of variation in the past to what it is today. The plausibility of this assumption is encouraged by the relative uniformity of climate change during the 20th century, as shown in Figure 7. However, this assumption could prove overly optimistic and result in an under- (or over-) estimation of the natural climate variation in other parts of the world. Second, as the number of stations becomes low, the potential for additional systematic biases increases. The statistical error measurement technique essentially tests the internal consistency of the data: the more the records disagree with one another, the larger the estimated statistical error. This is adequate if older measurement technology is simply more prone to large random errors. However, the technique cannot generally capture biases that arise when a large fraction of the records erroneously move in the same direction at the same time. As the number of available records becomes small, the odds of this occurring increase, and it becomes more likely every time there is a systematic shift in the measurement technology being employed.
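The limitation described above can be seen in a small simulation. The sketch below (synthetic data, not the paper's records) shows that a spread-based uncertainty estimate is blind to a bias shared by every record, such as a simultaneous instrumentation change.

```python
# Scatter-based error estimates measure disagreement among records, so a
# bias common to every record is invisible to them. Synthetic illustration.
import random
import statistics

random.seed(1)
true_temp = 5.0
stations = [true_temp + random.gauss(0.0, 0.5) for _ in range(20)]
# the same records after a shared +1.0 C shift (a common-mode bias)
biased = [t + 1.0 for t in stations]

scatter = statistics.stdev(stations) / len(stations) ** 0.5
scatter_biased = statistics.stdev(biased) / len(biased) ** 0.5

print(round(scatter_biased - scatter, 6))  # ~0: spread unchanged by the bias
print(round(statistics.mean(biased) - statistics.mean(stations), 6))  # ~1.0 C shift
```

The internal-consistency estimate is identical before and after the shift, even though the reconstructed mean has moved by a full degree.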
where θ(t) is the global average temperature plotted in Figure 5 and W(x, t) is the “weather field” that we estimated using equation 12. The remaining term C(x) is the approximately time-invariant long-term mean temperature at a given location, often referred to as the climatology.
In our construction we treat the climatology as a function of latitude, altitude, and a smoothed local average, as described above. As mentioned earlier, the latitude and altitude components account for about 95% of the structure. A map of the climatology C(x) is shown in Figure 10. We found the global land average from 1900 to 2000 to be about 8.90 ± 0.48 C, which is broadly consistent with the estimate of 8.5 C provided by Peterson et al. (2011). The Berkeley Average analysis is unusual in that it produces a global climatology and an estimate of the global mean temperature as part of its natural operation, rather than discarding this information as the three other groups generally do.
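The structure of such a climatology fit can be sketched with a toy regression. The data and coefficients below are made up for illustration (they are not the Berkeley Earth fit): station long-term means are regressed on latitude and altitude, and the fraction of variance explained is computed.

```python
# Toy climatology fit: regress synthetic station climatologies on latitude
# and altitude, then measure the variance explained. Coefficients are
# invented for illustration; this is not the paper's actual model.
import numpy as np

rng = np.random.default_rng(2)
n = 200
lat = rng.uniform(-60.0, 70.0, n)
alt_km = rng.uniform(0.0, 3.0, n)
# invented dependence: cooling toward the poles and with altitude, plus noise
clim = 25.0 - 0.3 * np.abs(lat) - 6.5 * alt_km + rng.normal(0.0, 1.0, n)

# ordinary least squares: intercept, |latitude|, altitude
X = np.column_stack([np.ones(n), np.abs(lat), alt_km])
coef, *_ = np.linalg.lstsq(X, clim, rcond=None)
resid = clim - X @ coef
r2 = 1.0 - resid.var() / clim.var()
print(round(r2, 3))  # close to 1: latitude and altitude explain most variance
```

In this synthetic setup the two geographic predictors capture nearly all the variance, mirroring the ~95% figure quoted for the real climatology.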
Figure 10. A map of the derived climatology term C(x). Altitude and latitude account for 95% of the variation; departures from this are evident in Europe and in parts of Antarctica.
14. Discussion

In this paper we have described a new approach to global temperature reconstruction. We use spatially and temporally diverse data exhibiting varying levels of quality to construct a global index series that yields an estimate of the mean surface temperature of the Earth. We employ an iteratively reweighted method that simultaneously determines the history of global mean land-surface temperatures and the baseline condition for each station, while making adjustments based on internal estimates of the reliability of each record. The approach uses variants of a large number of well-established statistical techniques, including a generalized fitting procedure, Kriging, and the jackknife method of error analysis. Rather than simply excluding all short records, as prior Earth temperature analysis groups have done, we designed a system that allows short records to be used with appropriate – but non-zero – weighting whenever it is practical to do so. The method also allows us to exploit discontinuous and inhomogeneous station records without prior “adjustment”, by breaking them into shorter segments at the points of discontinuity.
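The segmenting step mentioned above can be sketched as a simple splitting operation. The function `scalpel` and the example record below are illustrative only: rather than adjusting a record with a known discontinuity (say, a station move), the record is cut into independent segments at the breakpoint, each of which can later receive its own baseline.

```python
# Sketch of the "scalpel" idea: cut a record at known discontinuities
# instead of adjusting it. The function name and data are illustrative.
def scalpel(record, breakpoints):
    """Split a time-ordered record [(time, value), ...] at the given times."""
    segments, current = [], []
    cuts = sorted(breakpoints)
    for time, value in record:
        # close the current segment whenever we pass a breakpoint
        while cuts and time >= cuts[0]:
            if current:
                segments.append(current)
            current = []
            cuts.pop(0)
        current.append((time, value))
    if current:
        segments.append(current)
    return segments

# a record with a 0.8 C step in 1955 (e.g. an undocumented station move)
record = [(y, 10.0 + (0.8 if y >= 1955 else 0.0)) for y in range(1950, 1960)]
print([len(s) for s in scalpel(record, [1955])])  # -> [5, 5]
```

Each resulting segment is internally homogeneous, so the step itself never has to be "corrected", only absorbed into the per-segment baselines.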
It is an important feature of this method that the entire discussion of spatial interpolation has been conducted with no reference to gridded data sets at all. Because our approach can, in principle, avoid gridding, it avoids a variety of noise and bias that gridding can introduce. That said, the required spatial integrals will in general need to be computed numerically, and the fitting procedure requires the solution of a large number of matrix inverse problems. In the current paper, the numerical integrals were computed on a 15,984-element equal-area array. Note that using such an array for numerical integration is qualitatively different from the gridding used by other groups: there are no sudden discontinuities depending on whether a station falls on one side of a grid point or another, and no trade-offs between grid resolution and statistical precision. We estimate that the blurring effects of the gridding methods used by HadCRU and GISS each introduce an unaccounted-for uncertainty of approximately 0.02 C in the computation of annual mean temperature. Such a gridding error is smaller than the total ~0.05 C uncertainties these groups report during the modern era, but not so small as to be negligible. Because the resolution of our calculation can be expanded without excess smoothing or trade-offs for bias correction, we avoid this problem and reduce overall uncertainties. In addition, our approach could be extended in a natural way to accommodate variations in station density; for example, high-density regions (such as the United States) could be mapped at higher resolution without introducing artifacts into the overall solution.
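The convenience of an equal-area array for numerical integration can be shown in one dimension. The sketch below is a simplified analogue of the 15,984-element array (bands equally spaced in sin(latitude), so each has equal area on the sphere); the function name is hypothetical.

```python
# Why an equal-area decomposition simplifies the integration: when every
# cell covers the same area, the spherical integral reduces to an
# unweighted mean of the cell values. One-dimensional analogue only.
import math

def integrate_equal_area(field, n_cells=180):
    """Spherical mean of field(latitude_in_degrees) over equal-area bands."""
    total = 0.0
    for i in range(n_cells):
        s = -1.0 + 2.0 * (i + 0.5) / n_cells  # cell center in sin(latitude)
        total += field(math.degrees(math.asin(s)))
    return total / n_cells  # equal weights: no cos(lat) area factors needed

# sanity check: the spherical mean of sin^2(latitude) is exactly 1/3
mean = integrate_equal_area(lambda lat: math.sin(math.radians(lat)) ** 2)
print(round(mean, 4))  # -> 0.3333
```

Because every cell carries the same weight, refining the resolution changes only the accuracy of the quadrature, never the weighting scheme, which is the property contrasted with latitude-longitude gridding above.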