TerraLib: Open Source Tools for GIS Application Development


Developing open source GIS: what are the challenges?
Gilberto Câmara
INPE – Brasil
www.terralib.org
Institute for Geoinformation – TU Wien – 16 June 2004
The Promise of Open Source


When an OSS project reaches a “critical size” we obtain many benefits

Robustness
“Given enough eyeballs, all bugs are shallow.”

Cooperation
“Somebody finds the problem and somebody else understands it” (Linus Torvalds)

Continuous Improvement
“Treating your users as co-developers is your least-hassle route to rapid code improvement and effective debugging”
Naïve view of open source projects

Software
Product of an individual or small group (peer-pressure)
Based on a “kernel” with “plausible promise”

Development network
Large number of developers, single repository

Open source products
View as complex, innovative systems (Linux)

Incentives to participate
Operate at an individual level (“self-esteem”)
Wild-west libertarian (“John Waynes of the modern era”)

Idealized model of OS software
Networks of committed individuals
The Reality of Open Source

Previous existence of conceptual designs of similar
products (the potential for reverse engineering)


Design is the hardest part of software (Fred Brooks)
Problem granularity (the potential for distributed
development)

Effective peer-production requires high granularity
Potential for Reverse Engineering

Post-mature




A private company develops a software product.
Product becomes popular and it becomes part of the “public
commons”.
Others develop a public domain equivalent (e.g., OpenOffice)
Standards-led





Standards consolidate a technology
Allow compatible solutions to compete in the marketplace.
SQL database standard (e.g., MySQL and PostgreSQL).
POSIX standard (guidance to Linux)
OpenGIS specifications (e.g., deegree, MapServer, GeoServer)
Potential for Distributed Development

Parts of a software product
kernel and additional functions that use it (its periphery).

Operating systems (Linux)
well-defined kernel for process control
periphery consisting of programs such as device drivers, applications, compilers and network tools.
Database management systems


strong kernel of highly integrated functions (such as the parser,
scheduler, and optimizer)
much smaller periphery.
Potential for Distributed Development

Each type of software product has a periphery/kernel ratio
This ratio constrains the potential for distributed development

Kernel
a tightly-organized and highly-skilled programming team.

Periphery
More widespread programmers of various skills
Example

Out of more than 400 developers, the top 15 programmers of the Apache web server contribute 88% of added lines [Mockus, 2002].
Four Types of Open Source Software

High reverse engineering, high distribution potential

High reverse engineering, low distribution potential

Low reverse engineering, high distribution potential

Low reverse engineering, low distribution potential
Type 1 – High-High

High reverse engineering, high distribution potential:

Archetypical open source projects
The “Linux” model.

Developers
May have a separate job
Time allocated in agreement with their employer.

community-led projects.
Type 2 – High-Low

High reverse engineering, low distribution potential

Large number of projects


Databases, office automation tools, web services.
Large presence of private companies


products similar to market leaders.
reduced risk in reverse engineering.
main design decisions take place within the institution

Examples



MySQL and PostgreSQL DBMS
GNOME from Ximian
corporation-led projects.
Type 3 – Low-High

Low reverse engineering, high distribution potential

Stable kernel, innovative periphery
usually there is no commercial counterpart
share a relatively simple software kernel

Origin
academic environments

Examples
GRASS GIS software and the R suite of statistical tools.

collaborative projects
Type 4 – Low-Low

Low reverse engineering, low distribution potential

Innovative kernel, small periphery

Small teams under a public R&D contract
addressing specific requirements
aiming to demonstrate novel scientific work.

High mortality rate
most of them are restricted to the lifetime of a research grant.

innovative products.
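[Figure: open source products placed by potential for reverse engineering and potential for distributed development. Quadrants: High-Low, High-High, Low-Low, Low-High. Products shown: MySQL, Linux, PostgreSQL, OpenOffice, Perl, Apache, Postgres, GRASS, R, NCSA browser.]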
[Figure: the same quadrants labelled by project type. High-Low: corporate. High-High: community-led. Low-Low: innovative. Low-High: collaborative. Annotation: “Challenges?”]
Lessons from Open Source Projects

“It's fairly clear that one cannot code from the ground up
in bazaar style. One can test, debug and improve in
bazaar style, but it would be very hard to originate a
project in bazaar mode. Linus didn't try it. Your nascent
developer community needs to have something runnable
and testable to play with” (Eric Raymond)
Moving from the Low-Low Quadrant

Software in the “Low-Low” quadrant
Unsustainable in the long run

Moving from an innovative to a collaborative project
Sharing innovation
Transforming a crude prototype into a modular, well designed system

How do you build innovation into a modular design?
Moving from the Low-Low Quadrant

“Perfection in design is achieved not when there is
nothing more to add, but rather when there is nothing
more to take away.” (Saint-Exupéry)

How do you achieve perfection in information science?



Good scientific foundation
Usually, sound mathematical abstractions
What is the situation in GIS?
Do we have a solid foundation for GIS?
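[Figure: the relational model as a template for GIS]
Relational databases: relations (id, name, year); relational algebra (selection, projection, cartesian product, union, difference); SQL query language (SELECT name FROM faculty WHERE year > 1960)
GIS: spatio-temporal data types; spatial algebra (operations on ST types); GIS language. The GIS side is marked with “?”.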
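The relational side of this analogy can be made concrete. The following is a minimal sketch, assuming a relation is simply a list of rows; the sample `faculty` rows and the helper names `select` and `project` are illustrative and do not come from the talk. It composes two relational algebra operators to reproduce the SQL query above.

```python
# A relation modelled as a list of rows (dicts). The sample data is made up.
faculty = [
    {"id": 1, "name": "Ada", "year": 1955},
    {"id": 2, "name": "Bob", "year": 1962},
    {"id": 3, "name": "Eva", "year": 1971},
]


def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes of each row.

    Duplicates are not removed here, unlike strict set semantics.
    """
    return [{a: row[a] for a in attributes} for row in relation]


# SELECT name FROM faculty WHERE year > 1960
result = project(select(faculty, lambda row: row["year"] > 1960), ["name"])
```

The open question on the slide is what would play the role of `select` and `project` when the operands are spatio-temporal types rather than flat relations.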
Challenges for geoinformation
Source: Ghassem Asrar (NASA)
The Road Ahead: Smart Sensors
SMART DUST
Autonomous sensing and
communication in a cubic
millimeter
Source: Univ Berkeley, SmartDust project
Knowledge gap for spatial data
source: John McDonald (MDA)
What’s the Current Status of Open Source GIS?

High-Low products
Standards-based
Spatial DBMS: MySQL, PostgreSQL
OpenGIS + Web: MapServer, deegree

Low-High products
Stable kernel, innovation at the periphery
GRASS and R
What about GIScience challenges?

spatio-temporal data models, geographical ontologies, spatial statistics
and spatial econometrics, dynamic modelling and cellular automata,
environmental modelling, neural networks for spatial data
TerraLib: Open source GIS library

Data management
All data (spatial + attributes) is in the database

Functions
Spatial statistics, Image Processing, Map Algebra

Innovation
Based on state-of-the-art techniques
Same timing as similar commercial products

Web-based co-operative development
http://www.terralib.org
Operational Vision of TerraLib
[Figure: layered architecture. Labels: Geographic Application, TerraLib, API for Spatial Operations, Spatial Operations, DBMS. DBMSs shown: Access, Oracle Spatial, MySQL, PostgreSQL.]
TerraLib is roughly MapObjects + ArcSDE + cell spaces + spatio-temporal models
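The layering in this slide, with an application calling one spatial API regardless of the DBMS underneath, can be sketched as follows. Every name here (`SpatialBackend`, `InMemoryBackend`, `within`, `count_in_window`) is hypothetical and chosen for illustration; none of it is TerraLib's actual API, and the in-memory class only stands in for a real database driver.

```python
from abc import ABC, abstractmethod


class SpatialBackend(ABC):
    """Hypothetical driver interface; not TerraLib's actual API."""

    @abstractmethod
    def insert(self, key, x, y):
        """Store a point under the given key."""

    @abstractmethod
    def within(self, xmin, ymin, xmax, ymax):
        """Return the keys of points inside the rectangle (inclusive)."""


class InMemoryBackend(SpatialBackend):
    """Stand-in for a DBMS with no native spatial support:
    the spatial test runs in library code, not in the database."""

    def __init__(self):
        self._points = {}

    def insert(self, key, x, y):
        self._points[key] = (x, y)

    def within(self, xmin, ymin, xmax, ymax):
        return sorted(
            key
            for key, (x, y) in self._points.items()
            if xmin <= x <= xmax and ymin <= y <= ymax
        )


def count_in_window(backend, window):
    """Application code: depends only on the interface, not on the DBMS."""
    return len(backend.within(*window))


store = InMemoryBackend()
store.insert("a", 1.0, 1.0)
store.insert("b", 5.0, 5.0)
store.insert("c", 2.0, 3.0)
```

A backend for a spatially enabled DBMS could implement `within` by delegating the test to the database, while `count_in_window` would stay unchanged.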
TerraLib applications

Cadastral Mapping
Improving urban management of large Brazilian cities

Public Health
Spatial statistical tools for epidemiology and health services

Social Exclusion
Indicators of social exclusion in inner-city areas

Land-use change modelling
Spatio-temporal models of deforestation in Amazonia

Emergency action planning
Oil refineries and pipelines (Petrobras)
TerraCrime
Palm-top
Examples of Web Products
TerraLib Structure
[Figure: layered structure. Layers, top to bottom: interfaces (Java, COM, OGIS Services, C++); Functions; kernel (Visualization Controls, Spatio-Temporal Data Structures, File and DBMS Access); I/O Drivers; External Files and DBMS.]
Spatio-Temporal Data Types
[Figure: events plotted against x, y and time. Near in space, near in time?]
Dynamical Spatial Model
$f(I(t)) \xrightarrow{F} f(I(t+1)) \xrightarrow{F} f(I(t+2)) \cdots f(I(t_n))$
“A dynamical spatial model is a mathematical
representation of a real-world process when a location
changes in response to external forces” (Burrough)
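One way to read the diagram is that a transition function F is applied repeatedly to the state of a grid of cells. The sketch below follows that reading under stated assumptions: the spread rule in `step` is a toy rule invented for illustration, and the names `step` and `simulate` are not from the talk.

```python
def step(grid):
    """One application of F: a cell becomes 1 if it or any of its
    4-neighbours is 1 (a toy spread rule, assumed for illustration)."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbourhood = [(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            new[r][c] = int(any(
                0 <= i < rows and 0 <= j < cols and grid[i][j] == 1
                for i, j in neighbourhood
            ))
    return new


def simulate(grid, n_steps):
    """Return the list of states I(t), I(t+1), ..., I(t+n_steps)."""
    states = [grid]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


initial = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
states = simulate(initial, 2)
```

A real model would replace the rule in `step` with one driven by the external forces mentioned in the definition; the time loop in `simulate` would stay the same.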
Spatial Simulation
[Figure: simulations S2 and S3 shown with reality, Bauru in 1988]
Cell Spaces: Old Wine, New Bottle
Regression with Spatial Data: Understanding Deforestation in Amazonia
Future Deforestation Scenarios
Terra do Meio
South of Amazonas State
Hot-spots map for new deforestation
Modelling anisotropic space
Spatial relations in Amazonia are not isotropic!
Designing for Extensibility

Algorithms





basic core of most successful GIS
a large number of them do not depend on a particular implementation of a data structure
they depend only on a few fundamental semantic properties of the structure
such properties can be, for example, the ability to get from one element of the data structure to the next, and to compare two elements of the data structure.
Spatial analysis algorithms

can be abstracted away from a particular data structure and
described only in terms of their properties.
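The two properties named above, stepping to the next element and comparing two elements, are enough to write a useful algorithm. This is a minimal sketch; the function `smallest`, the comparison `west_of` and the sample points are assumptions made for illustration.

```python
def smallest(elements, less_than):
    """Return the smallest element, using only two properties of the
    input: it can be stepped through, and two elements can be compared."""
    iterator = iter(elements)
    best = next(iterator)  # raises StopIteration on empty input
    for candidate in iterator:
        if less_than(candidate, best):
            best = candidate
    return best


def west_of(p, q):
    """Compare two (x, y) points by their x coordinate."""
    return p[0] < q[0]


# The same function over three different containers (sample data is made up).
as_list = [(3.0, 1.0), (1.0, 4.0), (2.0, 2.0)]
as_set = {(3.0, 1.0), (1.0, 4.0), (2.0, 2.0)}
as_generator = ((float(i), float(i * i)) for i in range(1, 4))
```

Nothing in `smallest` refers to how the points are stored, so it works unchanged on a list, a set or a lazily generated sequence.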
Same Algorithm, Different Geometries
Generic GIS Programming

How to decouple algorithms from data structures?



Idea: Iterators (“intelligent pointers”)
Algorithms are not classes!
“Decide which algorithms you want; parametrize them so
they work for a variety of suitable types and data
structures”
[Figure: Algorithms, Iterators, Geometries]
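The idea of putting iterators between algorithms and geometries can be sketched with Python's iterator protocol. The classes `Line` and `Polygon` and the function `bounding_box` are hypothetical names invented for this illustration, and the two classes store their vertices differently on purpose.

```python
class Line:
    """Stores coordinates as two parallel sequences."""

    def __init__(self, xs, ys):
        self._xs, self._ys = list(xs), list(ys)

    def __iter__(self):
        return iter(zip(self._xs, self._ys))


class Polygon:
    """Stores a list of rings; iteration flattens them into one vertex stream."""

    def __init__(self, rings):
        self._rings = [list(ring) for ring in rings]

    def __iter__(self):
        for ring in self._rings:
            yield from ring


def bounding_box(geometry):
    """A free function, not a method: it needs only an iterator of (x, y)."""
    vertices = iter(geometry)
    x, y = next(vertices)
    xmin = xmax = x
    ymin = ymax = y
    for x, y in vertices:
        xmin, xmax = min(xmin, x), max(xmax, x)
        ymin, ymax = min(ymin, y), max(ymax, y)
    return (xmin, ymin, xmax, ymax)


line = Line([0, 2, 5], [1, 3, 0])
polygon = Polygon([
    [(0, 0), (4, 0), (4, 4), (0, 4)],  # outer ring
    [(1, 1), (2, 1), (2, 2)],          # hole
])
```

Adding a new geometry type then means writing one `__iter__` method, and every algorithm written against the iterator works on it without modification.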
Scientific Challenges for Innovation in GIS

How can we design an algebra for ST types?


How do we design a language for spatial modelling?



Requires a characterization of measurements
Cognitively meaningful interfaces
Representation of Space


What are the spatial-temporal data types?
How do we represent anisotropic space?
Extensibility of Models and Algorithms

How do we design for extensibility?
Why am I here today at TU Wien?

Innovation in GISystems


Requires addressing challenges in GIScience
Cooperation with Prof. Andrew Frank




Generic GIS Programming
Semantics of Geographical Measurements
Spatio-Temporal Types and Algebras
Methods for Representation of Anisotropic Space
Result of Sound Scientific Work
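[Figure: the quadrant diagram of potential for reverse engineering and potential for distributed development, now including TerraLib. Quadrants: High-Low, High-High, Low-Low, Low-High. Products shown: MySQL, Linux, PostgreSQL, OpenOffice, Perl, Apache, Postgres, GRASS, R, NCSA browser, TerraLib.]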
Conclusions

Open Source software model
The Linux example is not applicable to all situations
Moving from the individual level to the organization level

Geoinformation
Innovative open source GIS software has a large role
Sound research is needed to support innovation

Cooperation in GIScience is fundamental
The problem is enormous... requires a combination of R&D
We are few R&D groups
Cooperation is the only way to ensure a future for GIScience