Digitizing and Scanning
Primary Data Sources
Measurements
Remotely sensed data
Field
Lab
already secondary?
Creating geometries
Definitely in the realm of secondary data
Digitizing
Scanning
Why Do We Have To Digitize?
Existing data sets are general purpose, so if you want something specific you have to create it
In spite of 20+ years of GIS, most stuff is still in analog form
Chances are somebody else has digitized it before, but data sharing is not what it should be
Digitizer
Digitizing table
10” x 10” to 80” x 60”
$50 - $2,000
1/100th inch accuracy
Stylus or puck with control buttons
The Digitizing Procedure
Affixing the map to the digitizer
Registering the map
Actual digitizing
In point mode
In stream mode
Georeferencing
at least 3 control points
aka reference points or tics
easily identifiable on the map
exact coordinates need to be known
Entered:
Tic 1: 11° 15' N, 30° 30' E
Tic 2: 11° 15' N, 73° 30' E
[Figure: Tic points in digitizing table coordinates; origin at X = 4 in., Y = 5 in.; longitude axis 71° to 73° (East of Greenwich), latitude axis 11° to 12° (labeled South)]
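Registering the map amounts to finding a transformation from digitizing table coordinates (inches) to map coordinates (degrees). A minimal sketch follows, assuming an affine transformation fitted exactly through three control points. The table positions and degree values of the three tics below are hypothetical illustration values, not the ones from the slide, and the function names are made up for this example.

```python
def solve3(m, v):
    """Solve the 3x3 linear system m * x = v with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear; pick other tics")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]      # copy, then swap in column `col`
        for r in range(3):
            mc[r][col] = v[r]
        solution.append(det(mc) / d)
    return solution


def fit_affine(table_pts, map_pts):
    """Fit lon = a0*x + a1*y + a2 and lat = b0*x + b1*y + b2 from 3 tics."""
    m = [[x, y, 1.0] for x, y in table_pts]
    a = solve3(m, [lon for lon, _ in map_pts])
    b = solve3(m, [lat for _, lat in map_pts])
    return a, b


def to_map(coeffs, x, y):
    """Convert one digitized table position to (lon, lat)."""
    a, b = coeffs
    return (a[0] * x + a[1] * y + a[2], b[0] * x + b[1] * y + b[2])


# Hypothetical tics: table position in inches -> (lon E, lat) in degrees
table_tics = [(4.0, 5.0), (14.0, 5.0), (4.0, 10.0)]
map_tics = [(71.0, 11.0), (73.0, 11.0), (71.0, 12.0)]

coeffs = fit_affine(table_tics, map_tics)
print(to_map(coeffs, 9.0, 7.5))
```

With more than three tics, a least-squares fit would normally replace the exact solution, and the residuals would indicate how well the map was registered.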
Digitizing Modes
Point mode
most common
selective choice of points digitized
requires judgment
for man-made features
Stream mode
large number of (redundant) points
requires concentration
For natural (irregular) features
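Stream mode can be pictured as the digitizer recording the cursor continuously and keeping a position whenever the cursor has moved far enough. The sketch below assumes a simple distance threshold; real digitizer drivers may sample by time interval instead, and `stream_filter` is a hypothetical name. Coordinates are in hundredths of an inch.

```python
import math


def stream_filter(cursor_positions, min_dist):
    """Keep a position only if it lies at least min_dist from the last kept one."""
    kept = []
    for x, y in cursor_positions:
        if not kept or math.hypot(x - kept[-1][0], y - kept[-1][1]) >= min_dist:
            kept.append((x, y))
    return kept


# Cursor trace in 1/100 inch units (made-up values)
trace = [(0, 0), (1, 0), (2, 0), (5, 0), (6, 0), (11, 0)]
print(stream_filter(trace, min_dist=5))
```

A larger threshold gives fewer redundant points but samples curves more coarsely, which is the trade-off behind the choice of mode.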
Problems With Digitizing
Paper instability
Humidity-induced shrinking of 2%-3%
Cartographic distortion, aka displacement
Overshoots, gaps, and spikes
Curve sampling
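To put paper instability in perspective, the shrinkage can be compared with the 1/100th inch accuracy of the digitizing table. This is a rough sketch, assuming the shrinkage is uniform along one map edge; the 20 inch edge length is a hypothetical value.

```python
TABLE_ACCURACY_IN = 0.01  # 1/100th inch, as quoted for the digitizer


def shrinkage_error(map_length_in, shrink_percent):
    """Positional error (inches) at the far edge of a uniformly shrunken map."""
    return map_length_in * shrink_percent / 100.0


for pct in (2, 3):
    err = shrinkage_error(20.0, pct)  # hypothetical 20 inch map edge
    ratio = err / TABLE_ACCURACY_IN
    print(f"{pct}% shrinkage: {err:.2f} in, about {ratio:.0f}x the table accuracy")
```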
Errors From Digitizing
Fatigue
Map complexity
½ hour to 3 days for a single map sheet
Sliver polygons
Wrongly placed labels
Digitizing Costs
Rule of thumb: one boundary per minute
Ergo: approx. 65 lines = more than one hour
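The rule of thumb turns into a quick estimate. A minimal sketch, assuming every boundary takes the same time; `digitizing_time` is a hypothetical helper name.

```python
def digitizing_time(n_lines, minutes_per_line=1):
    """Estimate digitizing time as (hours, minutes) from a per-line rate."""
    total_minutes = n_lines * minutes_per_line
    return divmod(total_minutes, 60)


hours, minutes = digitizing_time(65)
print(f"{hours} h {minutes} min")  # prints "1 h 5 min"
```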
Automated Data Input (Scanning)
Work like a photocopier or fax machine
Three types:
Flatbed scanners
A4 or A3
600 to 2400 dpi optical resolution
$100 to $2,000
Drum scanner
practically unlimited paper size
$10k to $50k
Video line scanner
produces vector data
Requirements for Scanning
Data capture is fast but preparation is tedious
Computers cannot distinguish smudges
Lines should be at least 0.1 of a mm wide
Text and preferably color separation
AI techniques don’t work (yet?)
Map symbols are too variable for automatic detection and interpretation
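The minimum line width can be related to scanner resolution: one pixel covers 25.4 / dpi millimeters. The sketch below computes how many pixels a 0.1 mm line spans at a given resolution. It is a back-of-the-envelope check only, and `pixels_across` is a hypothetical name.

```python
MM_PER_INCH = 25.4


def pixels_across(line_width_mm, dpi):
    """Number of scanner pixels spanned by a line of the given width."""
    return line_width_mm * dpi / MM_PER_INCH


for dpi in (254, 600, 2400):
    print(f"{dpi} dpi: a 0.1 mm line spans {pixels_across(0.1, dpi):.2f} pixels")
```

At 254 dpi a 0.1 mm line is exactly one pixel wide, so lower resolutions risk dropping thin lines altogether.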
Semi-automatic Data Input (Heads-up Digitizing)
Reasonable compromise between traditional digitizing and scanning
Much less tedious
Incorporating your intelligence
Criteria for Choosing Input Mode
Images without easily detectable line work should be left in raster format
Really dense line work should be left as background image, unless it is really needed for automatic GIS analysis, in which case you would have to bite the bullet
Conversion from Other Databases
AutoCAD .dxf and dBASE .dbf are de facto standards for GIS data exchange
In the raster domain there is no equivalent; .tif comes closest to a “standard”
In any case: merging data that originate from different scales is problematic, even in the best of all worlds; there is no automatic generalization routine