Sunday, 24 July 2011

Utilizing Numpy to perform complex GIS operation in ARCGIS 10

Today my country Nepal played with Jordan for securing  a place in world cup 2014 qualification. I was hoping for at least a draw but after 20 minutes or so, Nepal was already 3-0 down. I could not follow the live update anymore, therefore decided to grab a beer and  write something for my blog.This is how this post came into existence.

I wanted to write about array manipulation within a arcgis for a long time. The idea came from a problem that was faced by one of my close senior friend who  frequently uses  arcgis in his work. He had a raster (SRTM ) data and he wanted to calculated maximum value in a varying window ( like 5X5, 7X7, 9X9) for the data and marked that maximum value in the window as one. Focal statistics function in arcgis can calculated maximum value in a specified neighborhood but it assigns the maximum value in the centre cell. It does not tell where the maximum value lies within the window. So he asked me for suggestion, and I , being a MATLAB savvy person quickly wrote a function in MATLAB which would do the trick. I send the solution to the friend but the problem was, he was not familiar to MATLAB and he did not have access to MATLAB as well. He wanted to have a solution that works within the arcgis.

So, I have dived into NUMPY which is a fundamental package in Python for large, multi-dimensional array and matrices for solving the problem. It is already integrated into arcgis 10. In arcgis 10, there are two functions namely RasterToNumPyArray and NumPyArrayTORaster. First function converts a raster to matrix and second function converts a matrix to array. These two functions open a new opportunity for performing MATLAB like matrix manipulation within the arcgis.

So, in this post I will show you how problem mentioned before can be solved. First I will show you the code and will explain it concisely.

  1. import arcpy

  2.  from numpy import *

  3. myArray = arcpy.RasterToNumPyArray('basic_raster')

  4. [cols,rows]= myArray.shape

  5.  winsize= 3;

  6.  winhalf = 1

  7.  I=zeros((cols, rows))

  8.  for n in range(winhalf,rows-winhalf):

  9. for m in range (winhalf,cols-winhalf):

  10. data = myArray[m-winhalf:m+winhalf+1,n-winhalf+1:n+winhalf+1]

  11. maxdata = data.max()

  12. indices = where(data ==maxdata)

  13.  I[m-winhalf+indices[0],n-winhalf+indices[1]] = 1

Line 1 and Line 2 imports the arcpy and numpy module.

Line 3 stores raster ‘basic_raster’ in a variable myArray

Line 4 stores size of the matrix myArray

Line 5 defines a window size ( three in this case)

Line 6 creates a matrix which is same size of rater ‘basic_raster’ but all filled with zero

Line 8-13 does the main computation. It will iterate through 3X3 window and calculates the maximum value and assigns 1 to the location of cell which contains the maximum value in the window.

Now we could already exports the I matrix using the function NumPyArrayTORaster (I). But doing so has couple of disadvantage because by doing so we would have just exported the matrix as a raster without any information about the cell size, coordinate and projection information. Therefore we need to get all these information from ‘basic_raster’ which is the raster based on which we calculated matrix I. For this we can use a bit of arcpy.

14 descData=arcpy.Describe('basic_raster')
15 cellSize=descData.meanCellHeight
16 extent=descData.Extent
17 spatialReference=descData.spatialReference
18 pnt=arcpy.Point(extent.XMin,extent.YMin)
19 newRaster = arcpy.NumPyArrayToRaster(I,pnt, cellSize,cellSize)
Line 14 uses a function ‘Describe’ to get information about the raster ‘basic_raster’.

Line 15-18 stores different information in various variables

Line 159transforms matrix I into a raster named ‘newraster’ using information from the raster ‘basic_raster’

Now we have a new raster where one represents cells with a maximum values in a sliding window of 3x3 which perfectly aligns with the raster ‘basic_raster’ in the same projection system.

I hope you have gained some insights into enormous advantage of integrating numpy and arcpy in your GIS workflow. If you have some prior programming knowledge, you would find pretty easy to learn python. Even if you don’t have any programming experience, I suggest you to gets your hands dirty with python. It’s not that hard to learn python, believe me :).


  1. Here is my code to manipulate matrix in C

  2. This comment has been removed by the author.

  3. Could you explain how the winsize term is used? Its defined but never utilized in code 8-13. And the for n and for m is used to move the moving window by 1 from the intial winhalf value of 1? Have you considered using the stride method suggested on - not sure if it would work for this application though...?