Transcript Document

DATA ANALYSIS USING
PYTHON-I
Baburao Kamble and Ayse Kilic
University of Nebraska-Lincoln
University of Nebraska-Lincoln
GIS in Water Resources Lecture, 2014
7/7/2015
Objectives
To be able to understand and write Python scripts
Numerical, Text (Weather) &Geospatial (Environmental)
data manipulation
7/7/2015
Agenda
Part I: Basic Python Programming
• Python Syntax, Strings, Array, Conditional and Control flow,
Functions
• File Input/Output and Text Data Processing
Part II: Geospatial Data Analysis
• Installation of GDAL bindings on windows operating system
• Reading Raster (Landsat and DEM) data
• Using GDAL function from command line (interpolation, translation)
• Raster Processing.
• NDVI calculation
• DN2Rad2Ref
Why use GDAL/NumPy instead of canned GIS software?
• Not advisable if what you want to do is easily handled within
ArcGIS/Imagine/etc. – there is a lot of programming overhead
• Well suited for process model applications where the logic at a cell
based is too complex
• Example:
• Grid algebra : grid1 + grid2 (probably use GIS)
• Finding NN in multidimensional space (maybe use GDAL/Numpy)
• Also useful if your spatial data is NOT standard GIS formats
(JPEG, PNG, etc.)
Geoprocessing with GDAL and Numpy
in Python
• GDAL - Geospatial Data Abstraction Library
• Numpy - the N-dimensional array package for scientific
computing with Python.
• Both of them are open source software
Read raster
dataset using
GDAL
Do some
calculation
using Numpy
Output to geospatial
dataset using GDAL
Python Libraries for
Geospatial Development
• There are two popular Python libraries for raster and vector data:
• GDAL (Geospatial Data Abstraction Library)
• Is for raster-based geospatial data; aailable for download at:
http://gdal.org
• Windows user use Frank Warmerdam’s “FWTools” open source
GIS binaries for Windows (32 bit); available at:
http://fwtools.maptools.org
# includes gdal and ogr
• OGR
• Is for vector-based geospatial data and is available for download
at: http://gdal.org/ogr
• NOTE: These are now merged together and are downloaded together
under the common name GDAL
GDAL
(Geospatial Data Abstraction Library)
• Presents an “abstract data model” for processing spatial
data
• Can be used directly from C/C++ and can be “wrapped”
for use with Python, Perl, VB, C#, R, Java …
• Early developers have chosen Python as their scripting
language and documentation is relatively good for this.
UGIC 2009
GDAL
• Geospatial Data Abstraction Library
• Raster data access
• Supports about 100 different formats
• ArcInfo grids, ArcSDE raster, Imagine, Idrisi, ENVI, GRASS,
GeoTIFF
• HDF4, HDF5
• USGS DOQ, USGS DEM
• ECW, MrSID
• TIFF, JPEG, JPEG2000, PNG, GIF, BMP
NumPy
(Numerical Python)
• An array/matrix package for Python
• Well suited for image processing – i.e. one function can
operate on the entire array
• Slicing by dimensions and applying functions to these
slices is concise and straightforward
• Nearly 400 methods defined for use with NumPy arrays
(e.g. type conversions, mathematical, logical, etc.)
GDAL and NumPy
• Since GDAL 1.3(?), GDAL has
implemented NG (New Generation)
Python bindings which includes NumPy
• Process:
Get raster band(s)
Convert the raster
band(s) to a NumPy
array using
ReadAsArray()
Open GDALDataset
Write out
GDALDataset
Convert the NumPy
array to GDAL raster
bands using
WriteAsArray()
Process the raster
band(s) using
NumPy functionality
7/7/2015
GDAL
• GDAL installation
• Make sure NumPy is installed
(http://sourceforge.net/projects/numpy/files/)
• Download gdalwin32exe160.zip from
http://download.osgeo.org/gdal/win32/1.6/
• Unzip to C:\gdalwin32-1.6
11
7/7/2015
Test Your GDAL in Python
12
7/7/2015
GDAL Supports more than 100 raster formats
http://www.gdal.org/formats_list.html
Finding available formats
gdalinfo --formats
13
GDAL Data Model
• GDAL’s data model includes:
• Dataset, which hold the raster data in a collection of raster
bands
• Raster Band: represents a band, channel, or layer within
•
•
•
•
•
an image (e.g., RGB image has red, green, and blue
components of the image)
Raster size: specifies the width of the image in pixels and
overall height of the image in lines
Georeferencing transform converts from x,y coordinates
into georeferenced coordinates (on the surface of Earth)
Affine transformation mathematical formula allowing
operations such as X offset, Y offset, X scale, Y scale,
horizontal shear, vertical shear
Coordinate system includes the projection and datum
Metadata provide additional information about the dataset
GDAL example Python code
•
Use GDAL to calculate the average of the height values contained in a DEM:
from osgeo import gdal, gdalconst
dataset = gdal.Open(”DEM.dat”)
band = dataset.GetRAsterBand(1)
fmt= “<“ + (“h” * band.XSize)
totHeight = 0
for y in range (band.Ysize):
scanline = band.ReadRaster (0, y, band.Xsize, 1, band.Xsize, 1, band.DataType)
values = struct.unpack(fmt, scanline)
for value in values:
totHeight = totHeight + value
average = totHeight / (band.xSize * band.YSize)
print “Average height = “, average
OGR
• Datasource: represents the file (e.g., a country), has
many:
• Layers: sets of related data, e.g., terrain, contour lines,
roads layers
• Spatial reference: specifies datum and projection
• Attributes: additional metadata about the feature
• Geometry: specifies the shape and location of the
feature. Are recursive, i.e., can have sub-geometries
OGR example Python code
import osgeo.ogr
from osgeo import ogr
shapefile = ogr.Open(“TM_WORLD_BORDERS-0.3.shp”)
layer = shapefile.GetLayer (0)
for i in range (layer.GetFeatureCount()):
feature = layer.GetFeature(i)
name = feature.GetField(“NAME”)
geometry = feature.GetGeometryRef()
print i, name, geometry.GetGeometryName()
Sample 1
• Read two tif files (red band and nir band)
• Calculate
• Output NDVI in same projection and georeference as the
input file.
• Numpy example
19
7/7/2015
Raster File input/output
read_raster.py
read_Write_raster_in_Block.py
NDVI.py
DN2rad2ref.py
GDAL Command Line Utilities
gdalinfo - report information about a file.
gdal_translate - Copy a raster file, with control of output format.
gdaladdo - Add overviews to a file…pyramids
gdalwarp - Warp an image into a new coordinate system.
gdal_contour - Contours from DEM.
gdaldem - Tools to analyze and visualize DEMs.
rgb2pct.py - Convert a 24bit RGB image to 8bit paletted.
pct2rgb.py - Convert an 8bit paletted image to 24bit RGB.
dal_merge.py - Build a quick mosaic from a set of images.
gdal_rasterize - Rasterize vectors into raster file.
gdaltransform - Transform coordinates.
nearblack - Convert nearly black/white borders to exact value.
gdal_grid - Create raster from the scattered data.
gdal_polygonize.py - Generate polygons from raster.
gdal_sieve.py - Raster Sieve filter.
gdal_fillnodata.py - Interpolate in nodata regions.
gdal-config - Get options required to build software using GDAL.