TerraLib: Open Source Tools for GIS Application Development
Developing open source
GIS: what are the
challenges?
Gilberto Câmara
INPE – Brasil
www.terralib.org
Institute for Geoinformation – TU Wien – 16 June 2004
The Promise of Open Source
When an OSS project reaches a “critical size” we obtain
many benefits
Robustness
Cooperation
“Given enough eyeballs, all bugs are shallow.”
“Somebody finds the problem and somebody else understands
it” (Linus Torvalds)
Continuous Improvement
“Treating your users as co-developers is your least-hassle route
to rapid code improvement and effective debugging”
Naïve view of open source projects
Software
Development network
Large number of developers, single repository
Open source products
Product of an individual or small group (peer-pressure)
Based on a “kernel” with “plausible promise”
View as complex, innovative systems (Linux)
Incentives to participate
Operate at an individual level (“self-esteem”)
Wild-west libertarian (“John Waynes of the modern era”)
Idealized model of OS software
Networks of committed individuals
The Reality of Open Source
Previous existence of conceptual designs of similar
products (the potential for reverse engineering)
Design is the hardest part of software (Fred Brooks)
Problem granularity (the potential for distributed
development)
Effective peer-production requires high granularity
Potential for Reverse Engineering
Post-mature
A private company develops a software product.
Product becomes popular and it becomes part of the “public
commons”.
Others develop a public domain equivalent (e.g., OpenOffice)
Standards-led
Standards consolidate a technology
Allow compatible solutions to compete in the marketplace.
SQL database standard (e.g., MySQL and PostgreSQL).
POSIX standard (guidance to Linux)
OpenGIS specifications (e.g., deegree, MapServer, GeoServer)
Potential for Distributed Development
Parts of a software product
Operating systems (Linux)
kernel and additional functions that use it (its periphery).
well-defined kernel for process control
periphery consisting of programs such as device drivers,
applications, compilers and network tools.
Database management systems
strong kernel of highly integrated functions (such as the parser,
scheduler, and optimizer)
much smaller periphery.
Potential for Distributed Development
Each type of software product has a periphery/kernel ratio
that constrains the potential for distributed development
Kernel
a tightly-organized and highly-skilled programming team.
Periphery
More widespread programmers of various skills
Example
Out of more than 400 developers, the top 15 programmers of
the Apache web server contribute 88% of added lines [Mockus,
2002].
Four Types of Open Source Software
High reverse engineering, high distribution potential
High reverse engineering, low distribution potential
Low reverse engineering, high distribution potential
Low reverse engineering, low distribution potential
Type 1 – High-High
High reverse engineering, high distribution potential:
Archetypical open source projects
The “Linux” model.
Developers
May have a separate job
Time allocated in agreement with their employer.
Community-led projects.
Type 2 – High-Low
High reverse engineering, low distribution potential
Large number of projects
Databases, office automation tools, web services.
Large presence of private companies
products similar to market leaders.
reduced risk in reverse engineering.
main design decisions take place within the institution
Examples
mySQL and PostgreSQL DBMS,
GNOME from Ximian
corporation-led projects.
Type 3 – Low/High
Low reverse engineering, high distribution potential
Stable kernel, innovative periphery
Origin
academic environments
Examples
usually there is no commercial counterpart
share a relatively simple software kernel
GRASS GIS software and the R suite of statistical tools.
collaborative projects
Type 4 – Low/Low
Low reverse engineering, low distribution potential
Innovative kernel, small periphery
Small teams under a public R&D contract
High mortality rate
addressing specific requirements
aiming to demonstrate novel scientific work.
most of them are restricted to the lifetime of a research grant.
innovative products.
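[Figure: quadrant chart. Axes: potential for reverse engineering; potential for distributed development. Quadrants: High-High, High-Low, Low-High, Low-Low. Products plotted: mySQL, Linux, PostgreSQL, OpenOffice, perl, Apache, Postgres, GRASS, R, NCSA browser.]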
[Figure: the same quadrant chart labelled by project type. High-High: community-led; High-Low: corporate; Low-High: collaborative; Low-Low: innovative. Annotation: “Challenges?”]
Lessons from Open Source Projects
“It's fairly clear that one cannot code from the ground up
in bazaar style. One can test, debug and improve in
bazaar style, but it would be very hard to originate a
project in bazaar mode. Linus didn't try it. Your nascent
developer community needs to have something runnable
and testable to play with” (Eric Raymond)
Moving from the Low-Low Quadrant
Software in the “Low-Low” quadrant
Moving from an innovative to a collaborative project
Unsustainable in the long run
Sharing innovation
Transforming a crude prototype into a modular, well designed
system
How do you build innovation into a modular design?
Moving from the Low-Low Quadrant
“Perfection in design is achieved not when there is
nothing more to add, but rather when there is nothing
more to take away.” (Saint-Exupéry)
How do you achieve perfection in information science?
Good scientific foundation
Usually, sound mathematical abstractions
What is the situation in GIS?
Do we have a solid foundation for GIS?
[Figure: the relational model as a template for GIS]
Relations: e.g., a faculty table with columns id, name, year
Relational algebra: selection, projection, cartesian product, union, difference
SQL query language: SELECT name FROM faculty WHERE year > 1960
Spatio-temporal data types
Spatial algebra: operations on ST types
GIS language: ?
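The relational side of this analogy can be sketched in a few lines of C++. This is only an illustration of how the SQL query on the slide decomposes into a selection followed by a projection; the `Faculty` struct and the function names are assumptions made for the example and are not part of TerraLib or of any DBMS API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// One tuple of the illustrative "faculty" relation (id, name, year).
struct Faculty {
    int id;
    std::string name;
    int year;
};

using Relation = std::vector<Faculty>;

// Selection: keep only the tuples that satisfy a predicate.
Relation selection(const Relation& r,
                   const std::function<bool(const Faculty&)>& pred) {
    Relation out;
    for (const Faculty& t : r) {
        if (pred(t)) out.push_back(t);
    }
    return out;
}

// Projection onto the "name" attribute.
// Duplicates are kept, as SQL does without DISTINCT.
std::vector<std::string> projectName(const Relation& r) {
    std::vector<std::string> out;
    for (const Faculty& t : r) out.push_back(t.name);
    return out;
}

// Equivalent of: SELECT name FROM faculty WHERE year > 1960
std::vector<std::string> namesAfter1960(const Relation& faculty) {
    return projectName(
        selection(faculty, [](const Faculty& t) { return t.year > 1960; }));
}
```

The open question on the slide is what the counterparts of `selection` and `projectName` would be for spatio-temporal types.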
Challenges for geoinformation
Source: Gassem Asrar (NASA)
The Road Ahead: Smart Sensors
SMART DUST
Autonomous sensing and
communication in a cubic
millimeter
Source: Univ Berkeley, SmartDust project
Knowledge gap for spatial data
source: John McDonald (MDA)
What’s the Current Status of Open Source GIS?
High-Low products
Standards-based
Spatial DBMS: MySQL, PostgreSQL
OpenGIS + Web: MapServer, deegree
Low-High products
Stable kernel, innovation at the periphery
GRASS and R
What about GIScience challenges?
spatio-temporal data models, geographical ontologies, spatial statistics
and spatial econometrics, dynamic modelling and cellular automata,
environmental modelling, neural networks for spatial data
TerraLib: Open source GIS library
Data management
All data (spatial + attributes) is in the database
Functions
Spatial statistics, Image Processing, Map Algebra
Innovation
Based on state-of-the-art techniques
Same timing as similar commercial products
Web-based co-operative development
http://www.terralib.org
Operational Vision of TerraLib
[Figure: layered architecture. A geographic application calls TerraLib through an API for spatial operations; TerraLib performs spatial operations on top of a DBMS: Access, Oracle Spatial, MySQL, PostgreSQL.]
TerraLib MapObjects + ArcSDE + cell spaces + spatio-temporal models
TerraLib applications
Cadastral Mapping
Improving urban management of large Brazilian cities
Public Health
Spatial statistical tools for epidemiology and health services
Social Exclusion
Indicators of social exclusion in inner-city areas
Land-use change modelling
Spatio-temporal models of deforestation in Amazonia
Emergency action planning
Oil refineries and pipelines (Petrobras)
TerraCrime
Palm-top
Examples of Web Products
TerraLib Structure
[Figure: layered structure. Interfaces: Java Interface, COM Interface, OGIS Services, C++ Interface. Middle layer: Functions, Visualization Controls. Kernel: Spatio-Temporal Data Structures, File and DBMS Access, I/O Drivers. Storage: External Files, DBMS.]
Spatio-Temporal Data Types
Events
Near in space, near in time?
[Figure: events plotted in a space-time cube with axes x, y and time.]
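The slide only poses the question; one simple way to make it concrete is to count the pairs of events that are close in both space and time. The sketch below assumes a hypothetical `Event` struct and two analyst-chosen thresholds (`maxDist`, `maxLag`); none of these names come from TerraLib.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A point event: location (x, y) and time of occurrence t.
struct Event {
    double x, y, t;
};

// Counts the pairs of events that are within maxDist of each other in space
// and within maxLag of each other in time. Both thresholds are illustrative
// parameters chosen by the analyst.
std::size_t countSpaceTimePairs(const std::vector<Event>& events,
                                double maxDist, double maxLag) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < events.size(); ++i) {
        for (std::size_t j = i + 1; j < events.size(); ++j) {
            double dist = std::hypot(events[i].x - events[j].x,
                                     events[i].y - events[j].y);
            double lag = std::fabs(events[i].t - events[j].t);
            if (dist <= maxDist && lag <= maxLag) ++count;
        }
    }
    return count;
}
```

A count that is high relative to what unrelated events would produce suggests that events near in space also tend to be near in time.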
Dynamical Spatial Model
$f(I(t)) \xrightarrow{F} f(I(t+1)) \xrightarrow{F} f(I(t+2)) \xrightarrow{F} \cdots \xrightarrow{F} f(I(t_n))$
“A dynamical spatial model is a mathematical
representation of a real-world process when a location
changes in response to external forces” (Burrough)
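Reading the chain of states as repeated application of a transition F, a dynamical model over a cell space can be sketched as below. The `CellSpace` struct, the `simulate` function, and the toy `spread` rule are assumptions for illustration only; they are not TerraLib classes, and a real model would replace `spread` with a calibrated rule.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// A cell space: a rows x cols grid storing one state value per cell.
struct CellSpace {
    std::size_t rows, cols;
    std::vector<int> cells;  // row-major; 0 = unchanged, 1 = changed
    int at(std::size_t r, std::size_t c) const { return cells[r * cols + c]; }
};

using Transition = std::function<CellSpace(const CellSpace&)>;

// Applies F repeatedly, I(t+1) = F(I(t)), and returns the state
// reached after `steps` applications.
CellSpace simulate(CellSpace state, const Transition& F, int steps) {
    for (int i = 0; i < steps; ++i) state = F(state);
    return state;
}

// Toy transition rule (hypothetical): a cell is changed at t+1 if it,
// or one of its four edge neighbours, was already changed at t.
CellSpace spread(const CellSpace& in) {
    CellSpace out = in;
    for (std::size_t r = 0; r < in.rows; ++r) {
        for (std::size_t c = 0; c < in.cols; ++c) {
            bool changed = in.at(r, c) == 1
                || (r > 0 && in.at(r - 1, c) == 1)
                || (r + 1 < in.rows && in.at(r + 1, c) == 1)
                || (c > 0 && in.at(r, c - 1) == 1)
                || (c + 1 < in.cols && in.at(r, c + 1) == 1);
            out.cells[r * in.cols + c] = changed ? 1 : 0;
        }
    }
    return out;
}
```

Keeping `simulate` independent of the rule means the same driver can run any transition with the `Transition` signature.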
Spatial Simulation
S2
Reality - Bauru in 1988
S3
Cell Spaces: Old Wine, New Bottle
Regression with Spatial Data: Understanding
Deforestation in Amazonia
Future Deforestation Scenarios
Terra do Meio
South of Amazonas State
Hot-spots map for new deforestation
Modelling anisotropic space
Spatial relations in Amazonia are not isotropic!
Designing for Extensibility
Algorithms
basic core of most successful GIS
a large number of them do not depend on any particular
implementation of a data structure
they are based on a few fundamental semantic properties of the structure
such properties can be, for example, the ability to get from one
element of the data structure to the next, and to compare two
elements of the data structure.
Spatial analysis algorithms
can be abstracted away from a particular data structure and
described only in terms of their properties.
Same Algorithm, Different Geometries
Generic GIS Programming
How to decouple algorithms from data structures ?
Idea: Iterators (“intelligent pointers”)
Algorithms are not classes!
“Decide which algorithms you want; parametrize them so
they work for a variety of suitable types and data
structures”
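A minimal sketch of this idea in C++: the algorithm below is a function template parametrized on an iterator type, so it needs only the ability to step from one element to the next and to read a point. The names `Point`, `Box`, and `boundingBox` are hypothetical and chosen for the example; they are not TerraLib's actual classes.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <vector>

struct Point {
    double x, y;
};

struct Box {
    double xmin, ymin, xmax, ymax;
};

// Generic algorithm: computes the bounding box of any non-empty range of
// points. It relies only on iterator operations (!=, ++, ->), not on the
// container or geometry type that owns the points.
template <typename Iterator>
Box boundingBox(Iterator first, Iterator last) {
    assert(first != last && "range must contain at least one point");
    Box b{first->x, first->y, first->x, first->y};
    for (++first; first != last; ++first) {
        b.xmin = std::min(b.xmin, first->x);
        b.ymin = std::min(b.ymin, first->y);
        b.xmax = std::max(b.xmax, first->x);
        b.ymax = std::max(b.ymax, first->y);
    }
    return b;
}
```

The same call works for a line stored as `std::vector<Point>` and for a polygon ring stored as `std::list<Point>`, for example `boundingBox(line.begin(), line.end())`, because both containers expose iterators with the required operations.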
Algorithms
Iterators
Geometries
Scientific Challenges for Innovation in GIS
How can we design an algebra for ST types?
How do we design a language for spatial modelling?
Requires a characterization of measurements
Cognitively meaningful interfaces
Representation of Space
What are the spatial-temporal data types?
How do we represent anisotropic space?
Extensibility of Models and Algorithms
How do we design for extensibility?
Why am I here today at TU Wien?
Innovation in GISystems
Requires addressing challenges in GIScience
Cooperation with Prof. Andrew Frank
Generic GIS Programming
Semantics of Geographical Measurements
Spatio-Temporal Types and Algebras
Methods for Representation of Anisotropic Space
Result of Sound Scientific Work
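[Figure: quadrant chart. Axes: potential for reverse engineering; potential for distributed development. Quadrants: High-High, High-Low, Low-High, Low-Low. Products plotted: mySQL, Linux, PostgreSQL, OpenOffice, perl, Apache, Postgres, GRASS, R, NCSA browser, TerraLib.]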
Conclusions
Open Source software model
The Linux example is not applicable to all situations
Moving from the individual level to the organization level
Geoinformation
Innovative open source GIS software has a large role
Sound research is needed to support innovation
Cooperation in GIScience is fundamental
The problem is enormous and requires a combination of R&D efforts
There are few R&D groups
Cooperation is the only way to ensure a future for GIScience