Pitfalls of gdal_calc
Not long ago I wrote about the raster calculator in GDAL. It is a useful tool and has enough functionality to solve most tasks. But as it turns out, it is not without its bugs.
I was playing with this calculator and looked at the code. Almost immediately I noticed that it reads data “as is”, which was something to worry about, as I had already stepped on that rake some time ago (/me is immersed in memories for a few minutes. It was a fun time: two parallel projects, both related to GDAL; lots of new stuff to learn…). A little test confirmed my fears. Here is the result of an NDVI calculation for a Byte raster, performed with RasterCalc and saved as Float32.
And this is the result of the same operation performed by gdal_calc
You can’t get this across with an image, but the white areas aren’t really white at all — they contain a NODATA value.
What can I say, the results are very interesting. But the explanation is simple: since the data is read “as is”, when performing arithmetic operations, you can run into things like integer overflow and rounding of the result (in Python, the result of dividing two integers is also an integer).
Obviously, this is a bug. So let’s write to the author, report the issue and propose a patch.
If you are using this tool, you may not want to wait for the next version of GDAL and fix the bug yourself. To do this, open gdal_calc.py
in your favourite text editor and lines 230–232
myval=BandReadAsArray(myFiles[i].GetRasterBand(myBands[i]), xoff=myX, yoff=myY, win_xsize=nXValid, win_ysize=nYValid)
change to this
myval=BandReadAsArray(myFiles[i].GetRasterBand(myBands[i]), xoff=myX, yoff=myY, win_xsize=nXValid, win_ysize=nYValid).astype(numpy.float32)