Posted

I apologise in advance for the long post, but full explanation is needed.

 

The Clear Climate Code project is a great concept to port the old FORTRAN GISS code into Python and make it more useable.

 

The project has succeeded and only found minor code problems in the GISS code. IOW, it does what it says it does. I have no problem with this.

 

The question to me is "Is what the code does correct?" and "How could it be tested?"

 

One of the functions of the GISS code that I've always been concerned about is the extraction of "data" using stations from up to 1200 km away. Frankly, it's so counter-intuitive that it appears just plain silly. The problem is, of course, that being counter-intuitive doesn't make it wrong.

 

I believe that I have come up with a way that the GISS code performance can be tested against real world values.

 

A recent chiefio article gave me the idea. It concerned what he called the "Bolivia Effect". Basically, he noticed that the GHCN contains no data on Bolivia since 1990.

 

This means that the "Land Only" data produced by GISS is from the interpolation of data from outside Bolivia. It was also noted that the interpolated region showed quite a marked warming trend.

 

"Yada, yada, yada...rant at GISS, blah, blah, blah."

 

My thought was that chiefio didn't go far enough in his experiment.

 

We have the GISS maps for January 2010 showing the temperature anomaly compared to the 1961-1990 baseline. We also know that stations in Bolivia are not being used because if we plot with 250 km smoothing the nation turns grey showing no input.

 

Compare

attachment.php?attachmentid=2426&stc=1&d=1267257002

 

with

attachment.php?attachmentid=2427&stc=1&d=1267257024

 

As can be seen, Bolivia is missing from the 250km map, but the values have been "filled in" in the 1200 km map. Now I state again, I have no problem with this, if it is done right.

 

So is it right?

 

Rather than bitching about GISS filling in and being wrong, blah, blah, I thought "What if I looked at the actual Bolivia data and compared it to the GISS map?"

 

So I looked. Just because the GHCN can't find Bolivia for 20 years and NASA GISS can't find it for 20 years doesn't mean I shouldn't try.;)

 

Strangely enough I found that they have quite a complete record available from their SERVICIO NACIONAL DE METEOROLOGIA E HIDROLOGIA.

 

The individual station records are available here, and the baseline averages are available here.

 

So, I could get the daily station data for January 2010 and I could get the baseline averages for each station.

 

I adopted the following methodology for what can only be called a "rough and dirty" test. I wish it to be noted that the methodology was decided on before I downloaded any data. I want that to be very clear.

 

Bolivia has 39 stations listed.

 

1. Stations will be dropped off if there is no data.

2. If the average for a station for the 1961-1990 baseline is not available or cannot be identified, the station will be dropped.

3. Stations with baselines other than 1961-1990 will be dropped.

 

This will give me the stations that can be compared to a 1961-1990 baseline.

 

Infilling methods.

 

1. If a station is missing data for a single day, the max or min will be filled in with the value halfway between the previous day and the following day.

2. If a station is missing data for 2 consecutive days, the gap will be filled with values 1/3 and 2/3 of the way between the day before and the day after the break.

3. Stations missing 3 days or more will be dropped.

 

This allows for the infilling of data if there is a day or two missing.
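
For anyone who wants to repeat the infilling step outside a spreadsheet, here is a minimal Python sketch of the three rules above. It is only an illustration: the function name `infill` is made up, missing days are assumed to be marked as `None`, "3 days or more" is read as three or more consecutive days, and a gap at the very start or end of the month is treated as a reason to drop the station (the rules above don't cover that case).

```python
def infill(values):
    """Fill gaps of one or two consecutive missing days (None) by linear
    interpolation between the neighbouring days.

    Returns the filled list, or None if the station should be dropped:
    a gap of 3+ consecutive days, or (assumption) a gap at either end of
    the month where there is no neighbour to interpolate from.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        gap = i - start
        if gap > 2 or start == 0 or i == n:
            return None  # drop the station
        before, after = filled[start - 1], filled[i]
        step = (after - before) / (gap + 1)
        for k in range(gap):
            # 1 missing day: halfway; 2 missing days: 1/3 and 2/3 of the way
            filled[start + k] = before + step * (k + 1)
    return filled


one_day = infill([10.0, None, 12.0])
two_days = infill([9.0, None, None, 12.0])
three_days = infill([1.0, None, None, None, 5.0])
```

The same function would be run once on the daily maxima and once on the daily minima for each station.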

Next is arriving at an "average" temperature.

Procedure:

 

1. The min and max for each day is then added together.

2. The result is divided by 2 to get the daily average.

3. The daily averages are then added together for each station and then divided by 31. This gives the monthly average for each station.

4. I then subtract the baseline average for each station from the 2010 average temp to find the anomaly value for that station.

5. The anomaly values are then added together and divided by the number of stations (28) to get a national anomaly value.

 

(As far as I can tell, this is pretty much how it is done by the big boys.)
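
The five steps above can be sketched in a few lines of Python. This is a rough illustration of the arithmetic only, not anybody's official code; the function names are invented and the sample numbers are made up (a two-day "month" just to keep it short).

```python
def monthly_mean(daily_min, daily_max):
    """Steps 1-3: average each day's min and max, then average the days."""
    daily_avg = [(lo + hi) / 2 for lo, hi in zip(daily_min, daily_max)]
    return sum(daily_avg) / len(daily_avg)


def station_anomaly(daily_min, daily_max, baseline):
    """Step 4: monthly mean minus the station's 1961-1990 baseline mean."""
    return monthly_mean(daily_min, daily_max) - baseline


def national_anomaly(station_anomalies):
    """Step 5: plain (unweighted) average of the station anomalies."""
    return sum(station_anomalies) / len(station_anomalies)


# Made-up values for illustration only.
example_mean = monthly_mean([10.0, 12.0], [20.0, 22.0])
example_anomaly = station_anomaly([10.0, 12.0], [20.0, 22.0], baseline=15.5)
```

Note that step 5 gives every station equal weight regardless of where it sits in the country, which is the main difference from a gridded average.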

 

I'm not going to defend this exercise as in any way "rigorous" and I hope for criticism to improve things. My first concern is obviously that the method is flawed. Obviously there are no corrections for UHI or station siting moves.

 

Concerning UHI: As this would only add to the warming at a station, and the question was about "too much" warming, it would only act in the favour of the GISS projection.

 

Concerning station moves, etc.: Since the GISS map shows between 1 and 2 degrees of warming, any minor changes in siting could be ignored. Major changes would show up when comparing the month's data with the baseline. No such major discrepancies were noted in the data.

 

I also have no way of telling if this is how the Bolivian service arrives at their monthly averages. However, this method does yield figures that are very close to the Bolivian ones, so it shouldn't be too far out. (Hopefully)

 

I'm operating on the idea that since the GISS map shows at least a 1 degree anomaly, then an occasional .1 degree inaccuracy won't kill the main concept.

 

I have done the figures and they will follow in the next post.

 

This sort of thing is very new to me, so I welcome any constructive criticism.

 

Please make comments on the methods before reading the results.

GHCN_GISS_1200km_Anom01_2010_2010_1961_1990.gif

GHCN_GISS_250km_Anom01_2010_2010_1961_1990.gif

Posted (edited)

Okay, what happened, when I did the figures?

 

Result:

 

Stations originally = 39

 

Stations dropped due to no daily data for January.

 

Apolo

Charana

 

Stations left = 37

 

Stations dropped due to no average identified.

 

Bermejo

La Paz - Zona Sur

Monteagudo

Potosi - Aeropuerto

Reyes

Santa Cruz - Trompillo

 

 

Stations left = 31

 

Stations dropped due to wrong baseline.

 

San Matias

San Ramon

Santa Cruz - Viru Viru

 

Stations left = 28

 

Stations dropped due to lack of daily data.

 

Nil.

 

So out of 39 stations there was sufficient daily data that could be compared to the 1961-1990 baseline for only 28.

 

Station = January 2010 average; Baseline average; Anomaly

Oruro = 13.35; 12.9; 0.5
ASCENCION DE GUARAYOS = 26.6; 25.1; 1.5
Camiri = 25.5; 25.9; -0.4
Cobija = 26.3; 26.1; 0.2
COCHABAMBA = 19.7; 19; 0.7
CONCEPCION = 25; 25.4; -0.4
GUAYARAMERIN = 26.4; 27; -0.6
LA PAZ - CENTRO = 14.1; 11.9; 2.2
LA PAZ - EL ALTO = 9.8; 8.5; 1.3
MAGDALENA = 27.4; 27.1; 0.3
POTOSI CIUDAD = 11.6; 10.5; 1.1
PUERTO SUAREZ = 27.6; 28; -0.4
RIBERALTA = 27.3; 26.8; 0.5
ROBORE = 27.2; 28.4; -1.2
RURRENABAQUE = 26.9; 26.8; 0.1
SAN BORJA = 27.1; 27; 0.1
SAN IGNACIO DE MOXOS = 26.7; 26.7; 0
SAN IGNACIO DE VELASCO = 25.6; 26.4; -0.8
SAN JAVIER = 24.4; 24.5; -0.1
SAN JOAQUIN = 27.7; 27; 0.7
SAN JOSE = 26.9; 27.1; -0.2
SANTA ANA = 27.5; 27.5; 0
SUCRE = 16; 16.1; -0.1
TARIJA = 21.4; 21.6; -0.2
TRINIDAD = 27; 27.1; -0.1
VALLEGRANDE = 18.8; 18.8; 0
VILLA MONTES = 28; 27.5; 0.5
YACUIBA = 25.9; 26.9; -1.0

 

I don't know what happened to the figures, but in the quote they don't come out separated like I formatted them. So values are separated by semicolons.

 

This gives a total anomaly for Bolivia (all stations) of +4.2 degrees compared to the 1961-1990 baseline.

 

When we divide by the number of stations to get the average anomaly, it comes out as +0.15° compared to the baseline.
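
As a cross-check on the arithmetic, the 28 anomaly values from the list above can simply be summed and averaged. The short Python snippet below does only that; the values are typed in from the list in the same order, and the variable names are arbitrary. On these figures it reproduces the total of about +4.2 and the mean of about 0.15.

```python
# The 28 station anomalies, in the same order as the list above.
anomalies = [
    0.5, 1.5, -0.4, 0.2, 0.7, -0.4, -0.6, 2.2, 1.3, 0.3,
    1.1, -0.4, 0.5, -1.2, 0.1, 0.1, 0.0, -0.8, -0.1, 0.7,
    -0.2, 0.0, -0.1, -0.2, -0.1, 0.0, 0.5, -1.0,
]

total = sum(anomalies)
mean = total / len(anomalies)

print(f"{len(anomalies)} stations, total {total:+.1f}, mean {mean:+.3f}")
```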

 

This does not look good for the GISS interpolation method.

 

So the bottom line is that if my methods are even remotely correct, then the GISS methods for "smoothing" are very wrong.

 

I have great difficulty believing that I could show GISS "wrong" with one day's work. I think it's far more likely that I'm wrong. But I can't see how I can be so far out.

 

Hence, I will rework the figures based on criticisms of the methods.

 

If anybody wants to check the data, etc it's in Excel format and I'll happily give to anyone who wants it. It also means that nobody else has to go to the trouble of getting the station data. It's not in directly downloadable format, so it was a copy/paste for each station. Very time consuming.

Edited by JohnB
Posted

I wonder what would happen if you loaded this temperature data into the model. Would the results be accurate, or would there still be an anomaly? (Perhaps the "hole" in the data also affects the results of nearby areas in the model.)

 

If you happen to know how long the Clear Climate Code version takes to run a simulation like this, and if custom data can be loaded in, I might be able to try this.

Posted

There will always be an anomaly compared to the baseline.

 

I don't know if extra data can be loaded into the CCC as, AFAIK, it draws its data directly from the GHCN database. You would need to d/load the GHCN and add the new data to the file and then change the pointers in the CCC to get the data from the new file. It might be possible.

 

It would be an interesting exercise to compare two maps, one with and one without the Bolivia data in the initial file. It would be another way of checking the interpolation code.

 

I mentioned the CCC project because it is a worthwhile endeavour in its own right for models to have clear and consistent code.

 

It does, however, have one basic flaw. If I were to write a ballistics model in Fortran with the value g = 8.9 m/s² and then port it to Python, the model would still be wrong as the initial values are wrong. In this respect, the CCC project is not a confirmation of the accuracy of the GISS code. Mistaken assumptions simply get ported with the code.

 

The purpose of this exercise was to test the accuracy of what the GISS code actually does in one particular area.

 

Where there is no data, as shown in grey in the 250 km map, the code infills the areas as demonstrated in the 1200 km map. There is no new data, the extra colour is generated by the code.

 

What I have attempted to find out by this exercise is whether or not the interpolation is accurate when compared to the real world values. I think it is reasonable to call the GISS interpolation a "prediction". In this case the prediction is that the average temps in Bolivia were between 1 and 2 degrees warmer than the baseline in January 2010.

 

While it is admittedly rough and dirty, my exercise came up with the answer of .15 degrees compared to the baseline. This is an order of magnitude lower than the prediction. Either I am way off or the GISS code is. (I'm reasonably sure it should be me, but I can't see how I could be that wrong. That's why I would like criticism.)

Posted

Yeah, I looked into the possibility of adding to the dataset earlier today. I think it is possible, but it may take some time -- the file format is rather odd. It would at least be possible to fudge it a bit. I'm reasonably good with Python, and the code looks fairly easy to understand, so I might be able to pull it off. If I have some spare time I might try it.

 

The other interesting question is "Why wasn't this temperature data included in the dataset?" Perhaps there's some other reason besides them not knowing it existed.

Posted
If anybody wants to check the data, etc it's in Excel format and I'll happily give to anyone who wants it. It also means that nobody else has to go to the trouble of getting the station data. It's not in directly downloadable format, so it was a copy/paste for each station. Very time consuming.

 

You might look into automating the screen scraping process. As a bit of reverse skepticism about data errors, are you sure you didn't introduce one in the process of manually copying and pasting the station data?

 

In Ruby, I'd use a tool like Mechanize to do this:

 

http://mechanize.rubyforge.org/mechanize/
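
A rough Python counterpart to the parsing half of that job, using only the standard library's `html.parser`. This is a sketch under stated assumptions: the table layout and column names in `SAMPLE` are invented for illustration, the class name `TableRows` is made up, and the real site's markup may well differ, so the column unpacking would need adjusting. Fetching the pages is left out entirely.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical page fragment, NOT copied from the actual station pages.
SAMPLE = """
<table>
<tr><th>Fecha</th><th>Max</th><th>Min</th></tr>
<tr><td>01/01/2010</td><td>20.5</td><td>8.0</td></tr>
<tr><td>02/01/2010</td><td>21.0</td><td>7.5</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE)
header, *records = parser.rows
daily = [(date, float(tmax), float(tmin)) for date, tmax, tmin in records]
```

Doing it this way also gives a repeatable check on the hand-copied spreadsheet: scrape once, then compare the two sets of numbers.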

Posted
As a bit of reverse skepticism about data errors, are you sure you didn't introduce one in the process of manually copying and pasting the station data?

 

Virtually certain. I was able to highlight the dates, mins and maxes as a block. When pasted into excel the format stayed the same. IOW, everything lined up into the cells unexpectedly well.

 

I then checked the pasted format against the webpage and the values matched in each case. I then repeated the exercise for each station. Since there were no apparent irregularities, I expect them all to be correct.

 

I'll check though, just to be sure.

 

Thanks for the Mechanize suggestion.

 

Any comments on the methods?

Posted

Just an update.

 

I'm going to try to create a gridded data set for the nation and see how that compares. It would appear that standard practice is to average each grid cell and then average the grid cells.

 

I'll give this a go first and will advise the papers used for the process.

 

Bascule, I won't ask NASA just yet as it would be comparing gridded to non gridded data. (Which I think is a flaw in the previous method)

 

I would also like to make it clear that nobody should expect GHCN or anybody else to datascrape websites for the station data. It's just too time consuming.

 

My personal opinion is that international standards for data storage, modification and reporting should be agreed on. Thor only knows how you would do it, but ATM you can't guarantee that a station's baseline average is worked out the same way in two different nations. This is an appalling state of affairs and only serves to make the raw data all the more valuable for future reference.

  • 2 months later...
Posted (edited)

Firstly, sorry for the delay, but RL can get in the way. ;)

 

I was also following a methodological path that I now realise is unnecessary.

 

I was working on the basis that to provide fully gridded data I would have to locate extra stations from the surrounding nations to include them in the grids. This would mean individually locating stations in Peru, Brazil, Paraguay, Argentina and Chile. I realised recently that for the purposes of this exercise this data is not needed.

 

As can be seen from the picture, Bolivia encompasses either fully or almost fully 3 5x5 Grid cells, designated A, B and C on the map.

 

A has 7 stations, B has 6 stations and C has 6 stations. To raise the average anomaly of grid A, the small area of Peru included in the grid would have to warm enough to raise the average anomaly of all stations by at least 1°, which would mean that if there were two Peruvian stations they would each have to warm by a whopping 8° to bring the grid square anywhere near the GISS projection. I view this to be "highly unlikely". :)

 

So for the purposes of this exercise, if we concentrate on the Grids A, B and C, we can ignore data from the neighbouring nations. If we were to expand the study to any of the other grid squares, then the data from the neighbouring nations would have to be included.

 

boliviamodified.jpg

 


 

Gridded Data.

 

Okay, so we have the grids with the stations mapped out, how do we turn that into "Gridded Data"? This turns out to be surprisingly easy.

 

The method used by the CRU is outlined in Brohan et al 2005 which references Jones and Moberg 2003.

 

From Brohan et al;

Each grid-box value is the mean of all available station anomaly values, except that station outliers in excess of five standard deviations are omitted.

 

Jones and Moberg;

In this study we will use the CAM approach, which requires reducing all the station temperature data to anomalies, from a common period such as 1961–90 on a monthly basis. Gridbox anomaly values will then be produced by a simple averaging of the individual station anomaly values within each grid box.

 

This methodology appears to be commonly used. GHCN says;

Anomalies were calculated on a monthly basis for all adjusted stations having at least 25 years of data in the 1961-1990 base period. Station anomalies were then averaged within each 5 X 5 degree grid box to obtain the gridded anomalies.
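
To make the "simple averaging within each grid box" concrete, here is a minimal Python sketch of that approach. It is an illustration only: the function names are invented, the three stations and their coordinates are made up, and the five-standard-deviation outlier check mentioned by Brohan et al is left out.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, size=5.0):
    """South-west corner of the size x size degree cell containing the point.

    South latitudes and west longitudes are negative.
    """
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


def gridded_anomalies(stations, size=5.0):
    """Average the station anomalies that fall inside each grid cell.

    stations: iterable of (lat, lon, anomaly) tuples.
    """
    cells = defaultdict(list)
    for lat, lon, anomaly in stations:
        cells[grid_cell(lat, lon, size)].append(anomaly)
    return {cell: sum(vals) / len(vals) for cell, vals in cells.items()}


# Made-up stations for illustration only (not real coordinates or data).
stations = [
    (-11.0, -68.5, 0.2),   # falls in the 10-15 S, 65-70 W cell
    (-13.5, -66.0, -0.6),  # same cell
    (-17.5, -66.5, 0.7),   # falls in the 15-20 S, 65-70 W cell
]
result = gridded_anomalies(stations)
```

Each cell is keyed by its south-west corner, so `(-15.0, -70.0)` means the box running from 15° S to 10° S and from 70° W to 65° W.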

 

GISS, unfortunately, use a different methodology. They use the "Reference Station Method" as outlined in http://pubs.giss.nasa.gov/docs/1987/1987_Hansen_Lebedeff.pdf They appear to have refined the method since the original paper. Figure 2 in Hansen and Lebedeff shows the world divided into 80 spatial zones, whereas the current GISS GISTEMP page says;

A grid of 8000 grid boxes of equal area is used. Time series are changed to series of anomalies. For each grid box, the stations within that grid box and also any station within 1200km of the center of that box are combined using the reference station method.

 

One could argue that we are comparing apples and oranges, since I'm using the "Anomaly method" of Dr. Jones and GISS are using the "Reference Station" method of Hansen and Lebedeff. The counter-argument is that I'm using the Anomaly method to check the calculations of the Reference Station method.

 

TBH, I'm not too sure on this, so any input and advice would be appreciated.

 

Anyway, back to the grids.

 

Grid A, stations and anomalies;

4. Cobija = 0.2

7. GUAYARAMERIN = -0.6

13. RIBERALTA = 0.5

15. RURRENABAQUE = 0.1

16. SAN BORJA = 0.1

17. SAN IGNACIO DE MOXOS = 0.0

22. SANTA ANA = 0.0

Total anomaly for Grid A = +0.3°

Divided by No of stations (7) = 0.043°

Anomaly of Grid A compared to 1961-1990 baseline is +0.043°

 

Grid B stations and anomalies;

1. Oruro = 0.5

5. COCHABAMBA = 0.7

8. LA PAZ - CENTRO = 2.2

9. LA PAZ - EL ALTO = 1.3

11. POTOSI CIUDAD = 1.1

23. SUCRE = -0.1

Total anomaly for Grid B = +5.7°

Divided by the No of stations (6) = 0.95°

Anomaly of Grid B compared to the 1961-1990 baseline is +0.95°

 

Grid C stations and anomalies;

2. ASCENCION DE GUARAYOS = 1.5

6. CONCEPCION = -0.4

18. SAN IGNACIO DE VELASCO = -0.8

19. SAN JAVIER = -0.1

21. SAN JOSE = -0.2

26. VALLEGRANDE = 0.0

Total anomaly for Grid C = 0.0°

(Now I'm stuffed, aren't I?;))

Divided by the No of stations (6) = 0.0°

Anomaly of Grid C compared to the 1961-1990 baseline is 0.0°
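
For anyone checking the sums, the three grid averages can be recomputed directly from the station anomalies as listed. The Python below is just that arithmetic; the dictionary layout and names are arbitrary. On the values as listed, Grid A sums to 0.3 (mean about 0.043), the six Grid B values sum to 5.7 (mean 0.95), and Grid C sums to zero.

```python
# Station anomalies per grid, typed in from the lists above.
grids = {
    "A": [0.2, -0.6, 0.5, 0.1, 0.1, 0.0, 0.0],
    "B": [0.5, 0.7, 2.2, 1.3, 1.1, -0.1],
    "C": [1.5, -0.4, -0.8, -0.1, -0.2, 0.0],
}

grid_totals = {name: sum(vals) for name, vals in grids.items()}
grid_means = {name: grid_totals[name] / len(vals) for name, vals in grids.items()}

for name, vals in grids.items():
    print(f"Grid {name}: stations={len(vals)}, "
          f"total={grid_totals[name]:.2f}, mean={grid_means[name]:.3f}")
```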

 

The first thing that really stands out is Grid B. Why is it so high compared to the other two?

 

Only 4 stations recorded anomalies above 1.0° compared to the baseline and Grid B has 3 of them. Two of those 3 are in La Paz, the largest city in Bolivia, while the third, Potosi, is a regional capital and major mining centre.

 

Metro La Paz, as it is now known, is the result of three cities expanding into one large city. Originally La Paz, El Alto and Viacha, the cities are now one and reside in a bowl-like depression surrounded by higher regions. I think it quite reasonable to assume that the very high (compared to the rest of the nation) anomalies could be the result of UHI.

 

I'll not defend this idea as fact, but it seems a reasonable hypothesis to explain the anomalous temperatures in those locations. Arguments, refutations and suggestions are welcome.

 

The other reason that I won't argue strongly for the idea is that my stance is well known here and I would accuse myself of a form of "confirmation bias" and arguing for the result I want to see. I can only show as an empirical fact that the Grid B anomaly is very different from the rest of the nation. The explanation for this must come from further debate.

 

So, back to the basic idea of this thread; "Are the GISS extrapolations falsifiable?" I think they are.

 

Referring back to the maps in the original post, it is quite apparent that GISS overstate warming by at least an order of magnitude.

 

Grid A falls into the area listed as a 1-2° anomaly and is shown by the data to be only 0.043°.

 

Grid C seems to fall into the area with a 0.5-1° anomaly and the data shows 0.0°.

 

I therefore submit that the "Reference Station" methodology used by GISS in the preparation of their Global Temperature Maps is falsified and is therefore wrong.

 

(At least from a first approximation using the Anomaly Method and the data as gleaned from the relevant Meteorological Department of Bolivia. Bearing in mind that no adjustments for UHI or other possible effects have been done.)

 

I still have trouble with the idea that I've found something that shows GISS wrong. I mean, come on, I'm an uneducated plebeian and they are NASA ffs. I guess I'll have to write this up in a more condensed form and ask GISS or the guys at CCC for their comments.

 

I must be wrong somewhere, but I can't see where.

Edited by JohnB
Posted
As can be seen from the picture, Bolivia encompasses either fully or almost fully 3 5x5 Grid cells, designated A, B and C on the map.

 

I'm a bit confused here. I worked mostly on a mesoscale model, and that's the knowledge I'm trying to apply here, but you're talking about this map:

 

attachment.php?attachmentid=2506&d=1273807725

 

What's the expected granularity of the grids? The model we worked on used a fine-grained mesh, and by making the grids increasingly fine grained the accuracy of the model could be improved, albeit at the cost of increasing the compute time, as the grids were also the unit of parallelism at which computation was dispatched to our cluster.

 

I don't really know any of the details of either CRU or GISS's models, so I don't know what "5x5" is supposed to pertain to. What are the units?

 

I'm sure the practices used at the mesoscale don't apply to models aimed at the global level, but just reading that alarm bells are going off in my head as to the size of your grids. Can you tell me how coarse a granularity these estimates are normally performed at compared to the level you're performing them at?

Posted

My apologies bascule for not making the 5 x 5 clear.

 

CRU uses a 5 x 5 degree grid for their global temperature sets. A 5 x 5 grid, so to speak.

 

In the case of Bolivia, Grid A encompasses from 65-70° West and 10-15° South. Grid B is from 65-70° West and 15-20° South and Grid C is from 60-65° West and 15-20° South.

 

Basically a 5° x 5° grid is the standard size when a reference is made to "gridded data". From the CRU site linked to in my last post;

 

Dataset Terminology

CRUTEM3 land air temperature anomalies on a 5° by 5° grid-box basis

CRUTEM3v variance adjusted version of CRUTEM3

HadCRUT3 combined land and marine [sea surface temperature (SST) anomalies from HadSST2, see Rayner et al., 2006] temperature anomalies on a 5° by 5° grid-box basis

 

Again, I apologise for not being clear.

  • 2 weeks later...
Posted

Okay, I missed this:

 

A grid of 8000 grid boxes of equal area is used. Time series are changed to series of anomalies. For each grid box, the stations within that grid box and also any station within 1200km of the center of that box are combined using the reference station method.

 

So if I understand correctly, their "grid boxes" are immensely more fine grained than your "grids"?

 

I'd also be curious about the relative uncertainties between your method and theirs.

Posted
So if I understand correctly, their "grid boxes" are immensely more fine grained than your "grids"?

Yes and no. I used the grid method as used by the Hadley Centre, so mine is a straight map grid.

 

The reference station method uses smaller boxes yes, but includes all stations within 1200 km of the centre of the box, so the figures used for each include a large number of stations from outside the box.

 

For example, if a "box" was centred on Chimore, roughly the dead centre of Bolivia, it would include all stations within 1200 km. IOW, almost to Lima, Peru. Consequently, since the data is missing from the GHCN, it gets infilled from the surrounding stations. Which means that the high, cold plains and mountains of Bolivia get their data from the Peruvian coast and the Brazilian jungle. I doubt either of these are truly representative of the actual Bolivian climate. However, once the box has its value decided, it will then be used for calculations concerning temps in every box within 1200 km.
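
To put a rough number on the 1200 km reach, a great-circle (haversine) distance can be worked out in a few lines of Python. This is a sketch only: the coordinates below are approximate values I have assumed for Chimore and Lima, rounded to a tenth of a degree, and the function names are made up. On these figures Lima comes out somewhere around 1400 km from Chimore, a little beyond the cut-off, which fits with "almost to Lima".

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def within_radius(centre, station, limit_km=1200.0):
    """True if the station lies within limit_km of the box centre."""
    return haversine_km(*centre, *station) <= limit_km


# Approximate, assumed coordinates (lat, lon); south and west are negative.
CHIMORE = (-17.0, -65.1)
LIMA = (-12.0, -77.0)

distance = haversine_km(*CHIMORE, *LIMA)
```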

 

Put bluntly, it relies on the concept of "teleconnection" to be actual and real. I've always thought it to be a crock myself. The idea that you can extrapolate temps from stations 1200 km away just doesn't seem to make sense. Can you really interpolate the temps in Washington based on the temps in St. Louis? "Teleconnection" is the only reason that the GISS maps have a wider coverage than Hadley's.

 

The grid method only uses stations physically within the selected grid box, but the Reference Station method extrapolates to a 1200 km radius from each station. So the GISS maps extrapolate further into the Arctic than Hadley does.

 

So the Reference method infills data where there is none according to the principles set down in the linked paper. The question is "Does it do this well?"

 

By comparing the actual data from Bolivia with the extrapolated data, I think we can say "No."

 

In many ways it is not surprising that this hasn't been noticed before. The method generally extrapolates into areas where there are no stations, like the Arctic.

 

Where do we expect to see the most warming, whether anthropogenic or not? The high latitudes. Where do we see the most warming in the GISS maps? The high latitudes. So we are seeing in the maps exactly what we expect to see. Why would we think that there is something amiss? The data is agreeing with the theory.

 

I doubt that there are many areas of the planet where the extrapolations can be checked against records. Bolivia is fortuitously one of them.

 

As to uncertainties, IIRC the Hadley Centre claims an accuracy of 0.04° on the raw data. I can only say that my data is as accurate as the thermometers taking the temps. There will have been extra, very small uncertainties introduced by rounding the averages to two decimal places.

 

I've done no smoothing, extrapolation, interpolation or any other process.

 

Every step has been shown above. The maths involved in Grid maps is about as basic as it gets. (Otherwise I would have had a lot more trouble.:D)

 

I suppose the bottom line is that the data does not match the extrapolations as used by the Reference method. Since this has been shown in Bolivia, is it also true in the Arctic? And if so, how far out is it?

 

BTW, here is the latest graph from Hadley.

anomaly.png

 

Notice the blanks in South America and elsewhere.

Posted

For example, if a "box" was centred on Chimore, roughly the dead centre of Bolivia, it would include all stations within 1200 km. IOW, almost to Lima, Peru. Consequently, since the data is missing from the GHCN, it gets infilled from the surrounding stations. Which means that the high, cold plains and mountains of Bolivia get their data from the Peruvian coast and the Brazilian jungle. I doubt either of these are truly representative of the actual Bolivian climate. However, once the box has its value decided, it will then be used for calculations concerning temps in every box within 1200 km.

 

Do the calculations compensate for elevation? Are the data weighted for distance?

 

Put bluntly, it relies on the concept of "teleconnection" to be actual and real. I've always thought it to be a crock myself. The idea that you can extrapolate temps from stations 1200 km away just doesn't seem to make sense. Can you really interpolate the temps in Washington based on the temps in St. Louis? "Teleconnection" is the only reason that the GISS maps have a wider coverage than Hadley's.

 

But to put that another way, you have to be taking the position that there is no correlation between temperatures in St. Louis and Washington. And if there is weighting for distance, that the correlation isn't progressively better if you look at Indianapolis, Cincinnati and Baltimore.

Posted

A few words on the boxes used in the Reference Station method. I screwed up a bit.:embarass: As I noted in an earlier post, Hansen and Lebedeff divide the world into 80 regions or boxes. However, each of these is further divided into a 10 x 10 grid, giving 8,000 sub-boxes. Each of these is around 200 km on a side and as bascule noted, is quite a fine grid.

 

I can see some pluses and minuses for this compared to the standard "Grid Box" method. Obviously the resolution is much finer at first glance, however in the more undeveloped areas or the more sparsely populated, there would be a larger number of sub-boxes that have no stations in them and therefore would require extrapolation.

 

It also means that while the sub-box is smaller than the 5° x 5° Grid Box, since it draws its data from all stations within 1200 km, it actually uses a larger area than the other method, which is constrained by latitudinal and longitudinal limits. So is it really a finer grid?

 

They really are two very different methods.

 

swansont, good questions. I'll try to provide meaningful answers.

 

Elevation.

 

No, the calculations used by GISS do not compensate for elevation. The way the calculations are done makes elevation differences between stations irrelevant as it is based on station means over time.

 

At the most basic level: suppose there are two stations, A and B, and we have records for A for 50 years and records for B for 30 years (the most recent 20 years of B being missing). The mean for both stations for the common period is calculated. The mean for B is then subtracted from the mean for A. (It can be the other way around, the lesser is subtracted from the higher, but it makes no practical difference which way you go.) This results in the "Bias" between the two stations. For the purpose of this exercise, we'll call the bias +0.2°. So for the reference period (30 years), station A read, on average, 0.2° higher than station B.

 

This bias is then applied to the missing period in station B's records, to give a complete record for station B for the entire period. Hansen and Lebedeff note that (obviously) the correlation drops off as the stations get further apart and drops to .5 at 1200 km. Hence the 1200 km limit.

 

As can be seen, because the process uses the bias between the means, then differences in elevation shouldn't matter. In the long term records, we are comparing and averaging the biases. Elevation differences would only matter if station A was moved to a different position at a different elevation, as that would change the bias.
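
The two-station case described above can be sketched in Python as follows. This follows the simplified description in this post only; it is not the GISS code, which combines many stations with weights. The function name and the sample temperatures are made up.

```python
def fill_from_reference(a, b):
    """Fill years missing from station b using station a.

    a, b: dicts mapping year -> temperature.
    The "bias" is mean(a) - mean(b) over the years both stations share;
    missing years in b are estimated as a's value minus that bias.
    Returns (bias, filled_b).
    """
    common = sorted(set(a) & set(b))
    if not common:
        raise ValueError("stations have no common period")
    mean_a = sum(a[y] for y in common) / len(common)
    mean_b = sum(b[y] for y in common) / len(common)
    bias = mean_a - mean_b
    filled = dict(b)
    for year, value in a.items():
        if year not in filled:
            filled[year] = value - bias
    return bias, filled


# Made-up records: station B stops reporting after 2001.
station_a = {2000: 10.2, 2001: 10.4, 2002: 10.9}
station_b = {2000: 10.0, 2001: 10.2}
bias, filled_b = fill_from_reference(station_a, station_b)
```

Because only the difference between the two means is used, a fixed offset between the stations (such as one caused by elevation) cancels out, which is the point made above.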

 

Even though I have trouble with the method, on a logical basis it does seem reasonable. The following makes it more so.

 

Weighting.

 

Yes, weighting is done. As the records are combined they are weighted. This is done on a sliding scale from 1 to 0. A weight of 1 is given to a station 0 km from the sub-box centre and a weight of 0 is given to a station 1200 km from the sub-box centre.
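
As described, the weight falls in a straight line from 1 at the sub-box centre to 0 at 1200 km. A minimal Python sketch of that, with invented function names and made-up sample values, and with the weighted mean shown only as one plausible way of applying the weights:

```python
def distance_weight(distance_km, limit_km=1200.0):
    """Linear weight: 1 at the sub-box centre, 0 at limit_km and beyond."""
    return max(0.0, 1.0 - distance_km / limit_km)


def weighted_mean(values_and_distances, limit_km=1200.0):
    """Distance-weighted mean of (value, distance_km) pairs.

    Returns None if every station is at or beyond the limit.
    """
    pairs = [(v, distance_weight(d, limit_km)) for v, d in values_and_distances]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        return None
    return sum(v * w for v, w in pairs) / total_weight


# Made-up example: one station at the centre, one 600 km away.
example = weighted_mean([(1.0, 0.0), (3.0, 600.0)])
```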

 

So as two stations get closer together, the correlation of the bias approaches 1. Entirely reasonable if constrained by latitudinal limits. This is the point of H & L: they do not try to find correlations between stations at largely different latitudes. A correlation between stations at, say, 40° North is reasonable; a correlation between stations at 40° North and 10° is not.

 

Which is a reason I'm concerned about the extrapolation being done in the Arctic: it covers too many degrees of latitude for the correlation to be good. It may all be "The Arctic", but I wouldn't want to extrapolate Australia from 10-12 stations, even if it is all "Australia".

 

On the whole, and I think anyone who reads Hansen and Lebedeff would agree, conceptually the "Reference Station" method should be the superior method. However, as I think I have reasonably shown in the above calculations, when compared to the actual temperatures, the infilling "Bias" method is not as accurate as it should be. I freely admit that I have no idea how to improve it.

 

An interesting check would be to take the CCC code and remove all data from the source file for the US States of Nebraska, Kansas, Oklahoma and Missouri, those 4 States being roughly the size of Bolivia and roughly in the centre of the US. Or perhaps Utah, Colorado and Wyoming to see the effects of mountains?

 

A comparison between the two results, one using the full dataset and the other using the truncated dataset would perhaps show if the extrapolation tends to artificially create a warming bias. It appears to do so in Bolivia, but one example does not a bias make.:D

 

While I'm also concerned about the GISS "Nightlight" method of Rural/Urban station weighting, that is a different question.

  • 1 month later...
Posted

Okay, this has been bugging me for some time now.

 

As I said above, I think the Reference Station method should be accurate and it's irritating that it apparently isn't.

 

I have come up with a possible explanation, but have no idea how to test it. It's not only "above my paygrade", but above my abilities.:D

 

The RSM uses the bias between stations to extrapolate into areas of station drop out. Because it uses the long term averages to calculate the bias, the bias during either a warming or cooling time period will move towards the centre of the range. It's an average, it doesn't care which way the temps go. Also, by being a constant, it to a great degree ignores altitude.

 

However, if the reference station warms faster than the one being extrapolated, while it will affect the bias, it will still result in a constant. The bias is a constant.

 

Anyone reading on climate change is well aware of "Arctic Amplification", that higher latitudes will warm faster than lower ones. I wondered if there is a detectable difference between the rates of warming at different altitudes.

 

Do higher altitudes warm faster or slower than lower altitudes? It struck me that there should be a difference.

 

Consequently I wonder if the bias, rather than being a constant derived from the difference between two averages would be better worked as an evolving value derived from the difference between the rates of change of the two averages.

 

So, depending on whether the difference between the rates of change of the averages converge or diverge the bias will increase or decrease. There would have to be a limit to the bias, but how to arrive at that figure I just don't know.

 

Given a good enough base period, I think it might go some way towards solving the problem.
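
To make the idea a bit more concrete, here is one possible way it could be written down in Python. This is purely a sketch of the suggestion in this post, not an established method: the function names are invented, the sample series are made up, it assumes both stations have a straight-line trend over the common period, and it has no cap on the bias because I don't know what that limit should be.

```python
def slope(years, values):
    """Ordinary least-squares slope of values against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


def evolving_bias(years, a, b, target_year):
    """Bias (a minus b) projected to target_year.

    Starts from the constant bias over the common period and lets it
    drift at the difference between the two stations' warming rates.
    Assumption: that difference in rates simply continues unchanged.
    """
    n = len(years)
    mean_bias = sum(a) / n - sum(b) / n
    mid_year = sum(years) / n
    drift = slope(years, a) - slope(years, b)
    return mean_bias + drift * (target_year - mid_year)


# Made-up common period: A warms 0.1 per year, B is flat.
years = [2000, 2001, 2002, 2003, 2004]
a = [10.0, 10.1, 10.2, 10.3, 10.4]
b = [10.0, 10.0, 10.0, 10.0, 10.0]
bias_2010 = evolving_bias(years, a, b, 2010)
```

In this made-up case the constant bias would be 0.2, while the evolving version gives about 1.0 by 2010, which shows how quickly the two approaches can part company when the rates differ.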

 

Thoughts?

  • 6 months later...
Posted
I therefore submit that the "Reference Station" methodology used by GISS in the preparation of their Global Temperature Maps is falsified and is therefore wrong.

 

After a quick chat with a gentleman from the CCC it is apparent that this statement is wrong.

 

I therefore totally and unreservedly retract it.

 

The best that this exercise can demonstrate is that the GISS code appears to introduce a warming bias when it infills data for 1200 km smoothing in this case.
