GIS UPDATE?
• Review Lab 1: Scale
• Today’s Material:
Data Models:
• Vector Data
• Raster Data
• TIN Data
Today’s Lab:
• Data Models
http://www.colorado.edu/geography/gcraft/notes/datacon/datacon_f.html
VECTOR DATA
Feature classes are homogeneous
collections of common features,
each having the same spatial
representation, such as points, lines,
or polygons, and a common set of
attribute columns, for example, a
line feature class for representing
road centerlines. The four most
commonly used feature classes in
the geodatabase are points, lines,
polygons, and annotation (the
geodatabase name for map text).
In the illustration below, these are
used to represent four datasets for
the same area: (1) manhole cover
locations as points, (2) sewer lines,
(3) parcel polygons, and (4) street
name annotation.
Generally, feature classes are thematic collections of points, lines, or polygons, but
there are seven feature class types:
• Points—Features that are too small to represent as lines or polygons as well as
point locations (such as GPS observations).
• Lines—Represent the shape and location of geographic objects, such as street
centerlines and streams, too narrow to depict as areas. Lines are also used to
represent features that have length but no area such as contour lines and
boundaries.
• Polygons—A set of many-sided area features that represents the shape and
location of homogeneous feature types such as states, counties, parcels, soil types,
and land-use zones.
• Annotation—Map text including properties for how the text is rendered. For
example, in addition to the text string of each annotation, other properties are
included such as the shape points for placing the text, its font and point size, and
other display properties. Annotation can also be feature linked and can contain
subclasses.
• Dimensions—A special kind of
annotation that shows specific
lengths or distances, for
example, to indicate the length
of a side of a building, a land
parcel, or the distance between
two features. Dimensions are
heavily used in design,
engineering, and facilities
applications for GIS.
• Multipoints—Features that are
composed of more than one
point. Multipoints are often used
to manage arrays of very large
point collections, such as lidar
point clusters, which can contain
literally billions of points. Using a
single row for such point
geometry is not feasible.
Clustering these into multipoint
rows enables the geodatabase to
handle massive point sets.
Single-part and multipart lines and polygons
Line and polygon feature classes in the geodatabase can
be composed of single parts or multiple parts. For
example, a state can contain multiple parts (Hawaii's
islands) but is considered to be a single state feature.
Vertices, segments, elevation, and measurements
Feature geometry is primarily composed of coordinate
vertices. Segments in lines and polygon features span
vertices. Segments can be straight edges or parametrically
defined curves. Vertices in features can also include z-values
to represent elevation measures and m-values to represent
measurements along line features.
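As a minimal sketch of the ideas above (not the geodatabase's actual storage format), a line feature can be modeled as an ordered list of vertices, each optionally carrying a z-value and an m-value. The names `Vertex` and `segment_lengths` are hypothetical and only straight segments are handled.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional


@dataclass
class Vertex:
    x: float
    y: float
    z: Optional[float] = None  # elevation or another vertical measure
    m: Optional[float] = None  # measurement along the line feature


def segment_lengths(vertices: List[Vertex]) -> List[float]:
    """Return the 2D length of each straight segment between consecutive vertices."""
    return [hypot(b.x - a.x, b.y - a.y) for a, b in zip(vertices, vertices[1:])]


# Illustrative road centerline; the m-values are made-up distances from the start.
road = [
    Vertex(0, 0, z=100, m=0),
    Vertex(3, 4, z=105, m=5),
    Vertex(3, 10, z=103, m=11),
]
lengths = segment_lengths(road)
```

Three vertices give two segments, which is the "segments span vertices" relationship described above.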
Segment types in line and polygon features
Lines and polygons are defined by two key elements: (1) an ordered list
of vertices that define the shape of the line or polygon and (2) the
types of line segments used between each pair of vertices. Each line
and polygon can be thought of as an ordered set of vertices that can be
connected to form the geometric shape. Another way to express each
line and polygon is as an ordered series of connected segments where
each segment has a type: straight line, circular arc, elliptical arc, or
Bézier curve.
The default segment type is a straight
line between two vertices. However,
when you need to define curves or
parametric shapes, you have three
additional segment types that can be
defined: circular arcs, elliptical arcs,
and Bézier curves. These shapes are
often used for representing built
environments such as parcel
boundaries and roadways.
Vertical measurements using z-values
Feature coordinates can include x,y and x,y,z vertices. Z-values are
most commonly used to represent elevations, but they can
represent other measurements such as annual rainfall or air quality.
X,Y tolerance
When you create a new feature class, you will be asked to set the x,y tolerance.
The x,y tolerance is used to set the minimum distance between coordinates in
clustering operations, such as topology validation, buffer generation, and polygon
overlay, as well as in some editing operations.
Feature processing operations are influenced by the x,y tolerance, which
determines the minimum distance separating all feature coordinates (nodes and
vertices) during those operations. By definition, it also defines the distance a
coordinate can move in x or y (or both) during clustering operations.
The x,y tolerance is an extremely small
distance (the default is 0.001 meters in
on-the-ground units). It is used to resolve
inexact intersection locations of
coordinates during clustering operations.
When processing feature classes using
geometry operations, coordinates whose
x distance and y distance are within the
x,y tolerance of each other are considered
to be coincident (in other words, share
the same x,y location). Thus, the clustered
coordinates are moved to a common
location.
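The clustering rule can be illustrated with a short sketch. This is a simplified assumption, not the algorithm ArcGIS uses: each coordinate is compared with earlier "anchor" coordinates, and if both its x distance and y distance are within the tolerance it is moved to that anchor. The function name `snap_coordinates` and the sample points are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[float, float]


def snap_coordinates(coords: List[Coord], tolerance: float = 0.001) -> List[Coord]:
    """Move coordinates that fall within the x,y tolerance of an earlier one onto it."""
    anchors: List[Coord] = []
    snapped: List[Coord] = []
    for x, y in coords:
        for ax, ay in anchors:
            # Coincident when BOTH the x distance and y distance are within tolerance.
            if abs(x - ax) <= tolerance and abs(y - ay) <= tolerance:
                snapped.append((ax, ay))
                break
        else:
            # No nearby anchor: this coordinate becomes a new common location.
            anchors.append((x, y))
            snapped.append((x, y))
    return snapped


points = [(10.0, 20.0), (10.0004, 20.0003), (10.5, 20.0)]
result = snap_coordinates(points)
```

Here the second point is less than 0.001 m from the first in both x and y, so it is moved onto the first; the third point is far away and stays put.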
RASTER DATA
In its simplest form, a
raster consists of a matrix
of cells (or pixels)
organized into rows and
columns (or a grid) where
each cell contains a value
representing information,
such as temperature.
Rasters include digital aerial
photographs, imagery
from satellites, digital
pictures, or even scanned
maps.
Data stored in a raster format represents real-world
phenomena:
• Thematic data (also known as discrete) represents features
such as land-use or soils data.
• Continuous data represents phenomena such as
temperature, elevation, or spectral data such as satellite
images and aerial photographs.
• Pictures include scanned maps or drawings and building
photographs.
Thematic and continuous rasters may be displayed as data
layers along with other geographic data on your map but are
often used as the source data for spatial analysis with the
ArcGIS Spatial Analyst extension. Picture rasters are often
used as attributes in tables—they can be displayed with your
geographic data and are used to convey additional
information about map features.
Rasters as basemaps
A common use of raster data in a GIS is as a background
display for other feature layers. For example,
orthophotographs displayed underneath other layers provide
the map user with confidence that map layers are spatially
aligned and represent real objects, as well as additional
information. Three main sources of raster basemaps are
orthophotos from aerial photography, satellite imagery, and
scanned maps. Below is a raster used as a basemap for road
data.
Rasters as surface maps
Rasters are well suited for representing data that changes
continuously across a landscape (surface). They provide an
effective method of storing the continuity as a surface. They
also provide a regularly spaced representation of surfaces.
Elevation values measured from the earth's surface are the
most common application of surface maps, but other values,
such as rainfall, temperature, concentration, and population
density, can also define surfaces that can be spatially
analyzed. The raster below displays elevation—using green to
show lower elevation and red, pink, and white cells to show
higher elevations.
Rasters as thematic maps
Rasters representing thematic data can be derived from analyzing
other data. A common analysis application is classifying a satellite
image by land-cover categories. Basically, this activity groups the
values of multispectral data into classes (such as vegetation type)
and assigns a categorical value. Thematic maps can also result
from geoprocessing operations that combine data from various
sources, such as vector, raster, and terrain data. For example, you
can process data through a geoprocessing model to create a raster
dataset that maps suitability for a specific activity. Below is an
example of a classified raster dataset showing land use.
Rasters as attributes of a feature
Rasters used as attributes of a feature may be digital
photographs, scanned documents, or scanned drawings
related to a geographic object or location. A parcel layer may
have scanned legal documents identifying the latest
transaction for that parcel, or a layer representing cave
openings may have pictures of the actual cave openings
associated with the point features. Below is a digital picture of
a large, old tree that could be used as an attribute to a
landscape layer that a city may maintain.
Why store data as a raster?
The advantages of storing your data as a raster are as
follows:
• A simple data structure—A matrix of cells with
values representing a coordinate and sometimes
linked to an attribute table
• A powerful format for advanced spatial and
statistical analysis
• The ability to represent continuous surfaces and
perform surface analysis
• The ability to uniformly store points, lines, polygons,
and surfaces
• The ability to perform fast overlays with complex
datasets
There are other considerations for storing your data as a
raster that may convince you to use a vector-based storage
option. For example:
• There can be spatial inaccuracies due to the limits imposed
by the raster dataset cell dimensions.
• Raster datasets are potentially very large. Resolution
increases as the size of the cell decreases; however,
normally cost also increases in both disk space and
processing speeds. For a given area, changing cells to
one-half the current size requires as much as four times the
storage space, depending on the type of data and storage
techniques used.
• There is also a loss of precision that accompanies
restructuring data to a regularly spaced raster-cell
boundary.
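The "four times the storage" figure follows from simple arithmetic: halving the cell size doubles the number of cells along each axis. A minimal sketch, assuming an uncompressed raster and a hypothetical 10 km by 10 km study area:

```python
from math import ceil


def cell_count(width: float, height: float, cell_size: float) -> int:
    """Number of cells needed to cover a width x height area (same units as cell_size)."""
    return ceil(width / cell_size) * ceil(height / cell_size)


coarse = cell_count(10_000, 10_000, 100)  # 100 x 100 cells
fine = cell_count(10_000, 10_000, 50)     # 200 x 200 cells
ratio = fine / coarse
```

Compression or sparse storage can reduce the real cost, which is why the text says "as much as" four times.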
Rasters are stored as an
ordered list of cell values, for
example, 80, 74, 62, 45, 45,
34, and so on.
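A brief sketch of how an ordered list of cell values maps back to rows and columns. It assumes row-major order (all of row 0 first, then row 1, and so on) and a hypothetical 2-row by 3-column raster built from the six values quoted above; neither assumption comes from the slide.

```python
from typing import List


def cell_value(values: List[int], n_rows: int, n_cols: int, row: int, col: int) -> int:
    """Look up a cell in a flat, row-major list of raster values."""
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise IndexError("cell is outside the raster")
    return values[row * n_cols + col]


values = [80, 74, 62, 45, 45, 34]
top_left = cell_value(values, 2, 3, row=0, col=0)
bottom_right = cell_value(values, 2, 3, row=1, col=2)
```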
The area (or surface) represented by each cell
consists of the same width and height and is an equal
portion of the entire surface represented by the
raster. For example, a raster representing elevation
(that is, digital elevation model) may cover an area of
100 square kilometers. If there were 100 cells in this
raster, each cell would represent 1 square kilometer
of equal width and height (that is, 1 km x 1 km).
Discrete and continuous data
Discrete data, which is sometimes called thematic, categorical, or
discontinuous data, most often represents objects in both the feature
(vector) and raster data storage systems. A discrete object has known
and definable boundaries: it is easy to define precisely where the
object begins and where it ends. A lake is a discrete object within the
surrounding landscape. Where the water’s edge meets the land can be
definitively established. Other examples of discrete objects include
buildings, roads, and parcels. Discrete objects are usually nouns.
A continuous surface represents phenomena in which each location on
the surface is a measure of the concentration level or its relationship
from a fixed point in space or from an emitting source. Continuous data
is also referred to as field, nondiscrete, or surface data. One type of
continuous surface is derived from those characteristics that define a
surface, in which each location is measured from a fixed registration
point. These include elevation (the fixed point being sea level) and
aspect (the fixed point being direction: north, east, south, and west).
Another type of continuous surface includes phenomena
that progressively vary as they move across a surface from
a source. Illustrations of progressively varying continuous
data are fluid and air movement. These surfaces are
characterized by the type or manner in which the
phenomenon moves. The first type of movement is
through diffusion or any other locomotion in which the
phenomenon moves from areas with high concentration
to those with less concentration until the concentration
level evens out. Surface characteristics of this type of
movement include salt concentration moving through
either the ground or water, contamination level moving
away from a hazardous spill or a nuclear reactor, and heat
from a forest fire. In this type of continuous surface, there
has to be a source. The concentration is always greater
near the source and diminishes as a function of distance
and the medium the substance is moving through.
The dimension of the cells can be as large or as small as needed to
represent the surface conveyed by the raster dataset and the features
within the surface, such as a square kilometer, square foot, or even
square centimeter. The cell size determines how coarse or fine the
patterns or features in the raster will appear. The smaller the cell size,
the smoother or more detailed the raster will be. However, the greater
the number of cells, the longer it will take to process, and it will increase
the demand for storage space. If a cell size is too large, information may
be lost or subtle patterns may be obscured. For example, if the cell size
is larger than the width of a road, the road may not exist within the
raster dataset. In the diagram below, you can see how this simple
polygon feature will be represented by a raster dataset at various cell
sizes.
The location of each cell is
defined by the row and column
where it is located within the
raster matrix. Essentially, the
matrix is represented by a
Cartesian coordinate system, in
which the rows of the matrix
are parallel to the x-axis and the
columns to the y-axis of the
Cartesian plane. Row and
column values begin with 0. In
the example below, if the raster
is in a Universal Transverse
Mercator (UTM) projected
coordinate system and has a
cell size of 100, the cell location
at 5,1 would be 300,500 East,
5,900,600 North.
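The conversion from a row and column to map coordinates can be sketched as follows. The origin, the choice of the upper-left corner as origin, and rows counting downward are assumptions made for illustration; the slide's own figure defines its origin graphically, so this sketch does not try to reproduce the numbers quoted above.

```python
from typing import Tuple


def cell_center(origin_x: float, origin_y: float, cell_size: float,
                row: int, col: int) -> Tuple[float, float]:
    """Map coordinates of a cell center, with the origin at the raster's upper-left corner."""
    x = origin_x + (col + 0.5) * cell_size  # columns advance eastward
    y = origin_y - (row + 0.5) * cell_size  # rows advance southward
    return x, y


# Hypothetical UTM-style origin and a 100-unit cell size; indices start at 0.
center = cell_center(300_000.0, 5_901_000.0, 100.0, row=2, col=3)
```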
Often you need to specify the extent
of a raster. The extent is defined by the
top, bottom, left, and right coordinates
of the rectangular area covered by a
raster, as shown below.
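As a rough sketch, the extent can be derived from an origin, the cell size, and the number of rows and columns. The upper-left origin and the sample values are assumptions for illustration only.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    left: float
    bottom: float
    right: float
    top: float


def raster_extent(origin_x: float, origin_y: float, cell_size: float,
                  n_rows: int, n_cols: int) -> Extent:
    """Bounding rectangle of a raster whose origin is its upper-left corner."""
    return Extent(
        left=origin_x,
        bottom=origin_y - n_rows * cell_size,
        right=origin_x + n_cols * cell_size,
        top=origin_y,
    )


extent = raster_extent(300_000.0, 5_901_000.0, 100.0, n_rows=10, n_cols=20)
```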
The level of detail (of features/phenomena) represented by a raster is
often dependent on the cell (pixel) size, or spatial resolution, of the
raster. The cell must be small enough to capture the required detail but
large enough so computer storage and analysis can be performed
efficiently. More features, smaller features, or a greater detail in the
extents of features can be represented by a raster with a smaller cell
size. However, more is not always better. Smaller cell sizes result in larger
raster datasets to represent an entire surface; therefore, there is a need
for greater storage space, which often results in longer processing time.
Choosing an appropriate cell size is not always simple. You must balance
your application's need for spatial resolution with practical
requirements for quick display, processing time, and storage. Essentially,
in a GIS, your results will only be as accurate as your least accurate
dataset. If you're using a classified dataset derived from 30-meter
resolution Landsat imagery, then creating a digital elevation model
(DEM) or other ancillary data at a higher resolution, such as 10 meters,
may be unnecessary. The more homogeneous an area is for critical
variables, such as topography and land use, the larger the cell size can
be without affecting accuracy.
VECTOR AS RASTER
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/How_features_are_represented_in_a_raster/009t00000006000000/
Points
A point is represented by an explicit x,y coordinate in vector format,
but as a raster, it is represented as a single cell—the smallest unit of a
raster. By definition, a point has no area but is converted to a cell
representing area. Therefore, the smaller the cell size, the smaller the
area and, thus, the closer the representation of the point feature. For
example, it is assumed that a well, a telephone pole, or the location of
an endangered plant occupies the entire area covered by a cell.
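A small sketch of why cell size matters for point features. It assumes an upper-left raster origin and uses made-up coordinates for two wells; `point_to_cell` is a hypothetical helper, not an ArcGIS function.

```python
from math import floor
from typing import Tuple


def point_to_cell(x: float, y: float, origin_x: float, origin_y: float,
                  cell_size: float) -> Tuple[int, int]:
    """Return the (row, col) of the cell that contains the point."""
    col = floor((x - origin_x) / cell_size)
    row = floor((origin_y - y) / cell_size)
    return row, col


# Two nearby wells, rasterized at two different cell sizes.
well_a = (1012.0, 1988.0)
well_b = (1018.0, 1981.0)
coarse_a = point_to_cell(*well_a, 1000.0, 2000.0, 10.0)
coarse_b = point_to_cell(*well_b, 1000.0, 2000.0, 10.0)
fine_a = point_to_cell(*well_a, 1000.0, 2000.0, 5.0)
fine_b = point_to_cell(*well_b, 1000.0, 2000.0, 5.0)
```

With 10-unit cells both wells land in the same cell and become indistinguishable; with 5-unit cells they occupy separate cells.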
Lines
In vector format, a line is an ordered list of x,y coordinates, but in raster
format, it is represented as a chain of spatially connected cells with the
same value. When there is a break between the chain of same-valued
cells, it represents a break in the line feature, which could represent
different features such as two roads or two rivers that do not intersect.
Polygons
A vector polygon is an enclosed area defined by an ordered list of x,y
coordinates in which the first and last coordinates are the same, thereby
representing area. By contrast, a raster polygon is a group of contiguous
cells with the same value that most accurately portray the shape of the
area. Polygonal, or area, data is best represented by a series of
connected cells. Examples of polygonal features include buildings,
ponds, soils, forests, swamps, and fields. The accuracy of the raster
representation below is dependent on the scale of the data and the size
of the cell. The finer the cell resolution and the greater the number of
cells that represent small areas, the more accurate the representation.
RASTER AS VECTOR: Converting raster data to polygons
When you convert a raster dataset containing area features, each group
of contiguous cells with the same values converts to a polygon. Arcs are
created from cell borders in the raster. NoData cells in the input raster
do not become polygons in the output.
The following is a raster image (left) and what it looks like once
converted to polygons (right):
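The grouping step behind this conversion can be sketched with a flood fill. This is a simplified assumption rather than the ArcGIS implementation: cells count as contiguous only when they share an edge (4-connectivity), NoData is modeled as `None`, and the sketch stops at collecting each group's cells without tracing polygon borders.

```python
from typing import List, Optional, Set, Tuple

Cell = Tuple[int, int]


def group_cells(grid: List[List[Optional[int]]]) -> List[Tuple[int, Set[Cell]]]:
    """Group edge-connected cells that share a value; None (NoData) cells are skipped."""
    n_rows, n_cols = len(grid), len(grid[0])
    seen: Set[Cell] = set()
    groups: List[Tuple[int, Set[Cell]]] = []
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] is None or (r, c) in seen:
                continue
            value = grid[r][c]
            region: Set[Cell] = set()
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                cr, cc = stack.pop()
                region.add((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    inside = 0 <= nr < n_rows and 0 <= nc < n_cols
                    if inside and (nr, nc) not in seen and grid[nr][nc] == value:
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            groups.append((value, region))
    return groups


grid = [
    [1, 1, 2],
    [1, None, 2],
    [3, 3, 2],
]
groups = group_cells(grid)
```

Each returned group would become one polygon; the NoData cell in the middle belongs to no group.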
RASTER AS VECTOR: Converting raster data to polylines
When you convert a raster dataset containing linear features, a polyline
is created from each cell in the input raster dataset. The polyline is
positioned so that it passes through the center of each cell. NoData cells
in the input raster dataset do not become features in the output.
The following is a raster image (left) and what it looks like once
converted to polyline features (right):
RASTER AS VECTOR: Converting raster data to points
When you convert a raster dataset containing point features, each cell in
the input raster dataset converts to a point in the output. Each new
point is positioned at the center of the cell it represents. NoData cells do
not convert to points.
The following is a raster image (left) and what it looks like once
converted to point features (right):
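A minimal sketch of the raster-to-point rule, assuming an upper-left origin and `None` for NoData; the function name and sample grid are hypothetical.

```python
from typing import List, Optional, Tuple


def raster_to_points(grid: List[List[Optional[int]]], origin_x: float,
                     origin_y: float, cell_size: float) -> List[Tuple[float, float, int]]:
    """Emit one (x, y, value) point at the center of every cell that is not NoData."""
    points = []
    for row, row_values in enumerate(grid):
        for col, value in enumerate(row_values):
            if value is None:  # NoData cells do not convert to points
                continue
            x = origin_x + (col + 0.5) * cell_size
            y = origin_y - (row + 0.5) * cell_size
            points.append((x, y, value))
    return points


grid = [
    [None, 7],
    [4, None],
]
points = raster_to_points(grid, origin_x=0.0, origin_y=20.0, cell_size=10.0)
```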
Vector Data
Advantages:
• Data can be represented at its original resolution and form without generalization.
• Graphic output is usually more aesthetically pleasing (traditional cartographic
representation).
• Since most data, e.g. hard copy maps, is in vector form, no data conversion is required.
• Accurate geographic location of data is maintained.
• Allows for efficient encoding of topology, and as a result more efficient operations that
require topological information, e.g. proximity, network analysis.
Disadvantages:
• The location of each vertex needs to be stored explicitly.
• For effective analysis, vector data must be converted into a topological structure. This is
often processing intensive and usually requires extensive data cleaning. As well, topology is
static, and any updating or editing of the vector data requires re-building of the topology.
• Algorithms for manipulative and analysis functions are complex and may be processing
intensive. Often, this inherently limits the functionality for large data sets, e.g. a large
number of features.
• Continuous data, such as elevation data, is not effectively represented in vector form.
Usually substantial data generalization or interpolation is required for these data layers.
• Spatial analysis and filtering within polygons is impossible.
Raster Data
Advantages:
• The geographic location of each cell is implied by its position in the cell matrix. Accordingly,
other than an origin point, e.g. bottom left corner, no geographic coordinates are stored.
• Due to the nature of the data storage technique data analysis is usually easy to program and
quick to perform.
• The inherent nature of raster maps, e.g. one attribute maps, is ideally suited for
mathematical modeling and quantitative analysis.
• Discrete data, e.g. forestry stands, is accommodated equally well as continuous data, e.g.
elevation data, and facilitates the integrating of the two data types.
• Grid-cell systems are very compatible with raster-based output devices, e.g. electrostatic
plotters, graphic terminals.
Disadvantages:
• The cell size determines the resolution at which the data is represented.
• It is especially difficult to adequately represent linear features depending on the cell
resolution. Accordingly, network linkages are difficult to establish.
• Processing of associated attribute data may be cumbersome if large amounts of data exist.
Raster maps inherently reflect only one attribute or characteristic for an area.
• Since most input data is in vector form, data must undergo vector-to-raster conversion.
Besides increased processing requirements this may introduce data integrity concerns due to
generalization and choice of inappropriate cell size.
• Most output maps from grid-cell systems do not conform to high-quality cartographic needs.
http://bgis.sanbi.org/gis-primer/page_19.htm
Triangular irregular networks (TIN) have
been used by the GIS community for many
years and are a digital means to represent
surface morphology. TINs are a form of
vector-based digital geographic data and
are constructed by triangulating a set of
vertices (points). The vertices are
connected with a series of edges to form a
network of triangles. There are different
methods of interpolation to form these
triangles, such as Delaunay triangulation
or distance ordering.
The edges of TINs form contiguous,
nonoverlapping triangular facets and can
be used to capture the position of linear
features that play an important role in a
surface, such as ridgelines or stream
courses. The graphics on the right show
the nodes and edges of a TIN (left) and
the nodes, edges, and faces of a TIN
(right).
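One way to see how a TIN represents a surface is to estimate the elevation at a location inside a single triangular facet. The sketch below assumes planar facets and uses barycentric weights; it is an illustration of the idea, not the method of any particular GIS package, and the node coordinates are made up.

```python
from typing import Tuple

Node = Tuple[float, float, float]  # x, y, z


def interpolate_z(p: Tuple[float, float], a: Node, b: Node, c: Node) -> float:
    """Linearly interpolate z at point p inside the triangle with nodes a, b, c."""
    px, py = p
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle: nodes are collinear")
    # Barycentric weights: how strongly each node influences the location p.
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    wc = 1.0 - wa - wb
    return wa * az + wb * bz + wc * cz


# Hypothetical facet with nodes at elevations 100, 110, and 120.
facet = ((0.0, 0.0, 100.0), (10.0, 0.0, 110.0), (0.0, 10.0, 120.0))
z_mid = interpolate_z((5.0, 5.0), *facet)
```

At a node the result equals that node's elevation, and along an edge it varies linearly between the two end nodes.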
http://resources.arcgis.com/en/help/main/10.1/index.html#//006000000001000000
http://upload.wikimedia.org/wikipedia/commons/9/97/Digitales_Gel%C3%A4ndemodell.png
Figure labels: contours, edges, faces, nodes
http://resources.arcgis.com/en/help/main/10.1/0060/GUID-0684D3F4-B5BA-4C53-9613BBC48BBDFC21-web.png
Feature class storage in the geodatabase
In the geodatabase, each feature class is managed in a single table. A
Shape column in each row is used to hold the geometry or shape of
each feature.
In the feature class table, the following are true:
• Each feature class is a table.
• Individual features are held as rows.
• Feature attributes are recorded in columns.
• The Shape column holds each feature's geometry (point, line,
polygon, and so forth).
• The ObjectID column holds the unique identifier for each feature.
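The table structure listed above can be mimicked with a small in-memory sketch. It is an analogy only: the class names, the ObjectID numbering from 1, and the manhole attributes are assumptions, and a real geodatabase stores geometry in its own binary format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

Coord = Tuple[float, float]


@dataclass
class FeatureRow:
    object_id: int              # unique identifier for the feature
    shape: List[Coord]          # geometry held in the Shape column
    attributes: Dict[str, Any]  # remaining attribute columns


@dataclass
class FeatureClass:
    name: str
    geometry_type: str  # one type per feature class, e.g. "point"
    rows: List[FeatureRow] = field(default_factory=list)

    def add(self, shape: List[Coord], **attributes: Any) -> FeatureRow:
        """Append a feature as a new row and assign the next ObjectID."""
        row = FeatureRow(object_id=len(self.rows) + 1, shape=shape, attributes=attributes)
        self.rows.append(row)
        return row


manholes = FeatureClass("manholes", "point")
first = manholes.add([(100.0, 200.0)], diameter_cm=60)
second = manholes.add([(150.0, 220.0)], diameter_cm=75)
```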