Transcript 11042011Geoprocessing with Python and Numpy
Geoprocessing with GDAL and Numpy in Python
Delong Zhao 11-03-2011
Geoprocessing with GDAL and Numpy in Python
• • • GDAL - Geospatial Data Abstraction Library Numpy - the N-dimensional array package for scientific computing with Python. Both of them are open source software Read raster dataset using GDAL Do some calculation using Numpy Output to geospatial dataset using GDAL
GDAL
• Supports about 100 raster formats – ArcInfo grids, ArcSDE raster, Imagine, Idrisi, – ENVI, GRASS, GeoTIFF – HDF4, HDF5 – USGS DOQ, USGS DEM – ECW, MrSID – TIFF, JPEG, JPEG2000, PNG, GIF, BMP – See http://www.gdal.org/formats_list.html
Numpy
• • • • • • the fundamental package needed for scientific computing with Python. a powerful N-dimensional array object sophisticated (broadcasting) functions tools for integrating C/C++ and Fortran code useful linear algebra, Fourier transform, and random number capabilities.
http://numpy.scipy.org/
Installation
• • • 1. Enthought python scientific computing package, includes numpy – http://www.enthought.com/ 2. GDAL - Geospatial Data Abstraction Library – http://www.lfd.uci.edu/~gohlke/pythonlibs/ Or all of these has been installed on EOMF and Cybercommons servers
Tutorial
• • • • http://itmetr.net/itmetr.cgi/PyIntro http://www.gis.usu.edu/~chrisg/python/2009 / http://www.scipy.org/NumPy_for_Matlab_Us ers https://www.cfa.harvard.edu/~jbattat/comput er/python/science/idl-numpy.html
Sample 1
• • Read two tif files (red band and nir band) Calculate • • Output NDVI in same projection and georeference as the input file.
Numpy example
Algorithm development for global cropping intensity from 2000-2011 1-crop per year 2-crop per year 3-crop per year
Dynamics of winter wheat and paddy rice fields in Nanjing, Jiangsu, China
1.0
0.8
0.6
0.4
NDVI LSWI EVI Field site in Jiangsu 0.2
0.0
1/1/02 winter wheat 3/1/02 5/1/02 paddy rice 7/1/02 9/1/02 Time (8-day interval) 11/1/02 1/1/03 (b) 6/11/99 rice field preparation (c) 7/3/99 2-weeks after rice transplanting (d) 9/6/99 rice plant heading
MODIS 8-day composites of surface reflectance product (MOD09A1) NDSI NDVI, EVI, LSWI Snow mask Cloud mask Permanent water mask Evergreen vegetation mask
Temporal profile analysis of individual pixels
Cropping intensity ( # of crops per year) Crop calendar (planting & harvesting dates) Inundation and paddy rice
Global Mapping of Croplands Cropping Intensity map in 2004
1.0
0.8
0.6
0.4
0.2
NDVI LSWI EVI one MODIS pixel in Bangkok area 0.0
-0.2
1 2 3 4 Starting date 5 6 7 8 Month in 2004 Starting date 9 10 11 12 1.0
0.8
0.6
0.4
0.2
One MODIS pixel in Mekong basin NDVI LSWI EVI 0.0
-0.2
1 2 3 flooding & transplanting 4 5 6 7 flooding & transplanting 8 9 10 Month in 2004 flooding & transplanting 11 12
Python multiprocessing
• • • • http://docs.python.org/library/multiprocessin g.html
import multiprocessing pool = multiprocessing.Pool(processes=multiprocessing.cpu_count()) pool.map(doprocess,findfiles(root_dir))
Benchmark
• • • • I did some benchmark. By using all 8 cpu and 16G memory on one eomf server can finish 1 MODIS tile NDVI, EVI, CLOUD,SNOW, LANDWATER, FLOOD, Drought products in 15 minutes.
This means we can finish global 296 tiles 2000 2011 MODIS data in 786 hours (32 days) with one server. And we have 6 computation servers, we can improve it to 6 days if all the servers can do the job.