Graphics Recognition – from
Re-engineering to Retrieval
Karl Tombre, Bart Lamiroy
LORIA, France
Document Analysis in the IR era
Information is at the core of industrial
strategies
A lot of digital or digitized information, but
often in very “poor” formats
The challenge: not necessarily re-engineering documents, but enriching
poorly structured information, adding a (limited)
amount of semantics, building indexes
Purposes: browsing, navigation, indexing
DAR methods and tools useful, but must
be adapted
Specific challenges of large-scale IR applications
Genericity: we cannot necessarily build a
complete and exhaustive a priori model of
contextual knowledge (ontology)
Adaptability: various input data – scanned
paper, PDF, DXF, HTML, GIF… – various
resolutions
Robustness: “back-office” applications
Efficiency: online searching in
heterogeneous data
Scaling: methods have to scale to an
increasing number of symbols/features
DAR and IR
Media without (or with very little)
contextual knowledge
Image-based indexing and retrieval,
indexing of video sequences
Documents do explicitly convey
information from one person to another
person
Much more structure, syntax and
semantics
DAR and IR – some examples
Indexing and/or searching scanned text
without OCR
Similarities, signatures
Query or index on layout structure
Table spotting
Keyword spotting
…
What about Graphics
Recognition?
Subfield of DAR, for graphics-rich
documents
Numerous methods for various analysis
and recognition problems
Raster-to-vector conversion
Text/graphics separation
Symbol recognition
Many specific technical areas: maps,
architectural drawings, engineering
drawings, diagrams and schematics, …
Graphics recognition methods
Text/graphics separation
Graphics recognition methods
Vectorization
Graphics recognition and IR
applications
Usual text-based indexing and retrieval
still useful
But need for access to other kinds of
information:
Symbols
Text-drawing connections
Description-illustration connections
Some contributions
Syeda-Mahmood – maintenance drawings
IEEE Trans. On PAMI 21(8):737-751, Aug. 1999
Some contributions
Arias et al., Najman et al. – use of information
contained in legend / title block
Proc. GREC’01, Kingston (Ontario, Canada), pp. 19-26, Sept. 2001
Some contributions
Samet & Soffer – symbols from legend
IEEE Trans. On PAMI 18(8):783-798, Aug. 1996
Some contributions
Müller & Rigoll – graphical retrieval in database
of engineering drawings
Proc. ICDAR’99, Bangalore (India), pp. 697-700, Sept. 1999
Some contributions
Boose et al. (Boeing) – Generation of Layered
Illustrated Parts Drawings (GREC’03)
Proc. GREC’03, Barcelona, pp. 139-144
Wishful thinking?
Symbol DB
Or even better…
Symbol recognition
Natural features for indexing and retrieval
Most methods work with known databases
of reference symbols – what about
interactive querying of arbitrary symbols?
Before we move on:
From segmentation followed by recognition,
to segmentation-free recognition, or
segmenting while recognizing
Scalability
Efficiency / complexity
Discrimination power
1st contest on symbol recognition held last week
See IAPR TC10 homepage for further details
Signatures
Image-based signatures
Compute invariant signatures on binary
document image
F-signatures (ICDAR’01)
Radon transform: R-signatures [Tabbone &
Wendling]
Ridgelets [Ramos Terrades & Valveny –
GREC’03] – aka wavelet transform of
Radon transform
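As a rough illustration of an image-based signature, the sketch below computes a discrete R-signature, assuming the usual definition in which, for each angle, the squared Radon projection is summed over all offsets. The pixel-set input, the unit-width binning and the function name `r_signature` are assumptions made for this example, not details taken from the cited work.

```python
import math
from collections import Counter

def r_signature(pixels, n_angles=180):
    """Discrete R-signature of a binary shape given as (x, y) foreground pixels."""
    if not pixels:
        return [0.0] * n_angles
    # Centre on the centroid so the projections do not depend on translation.
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    signature = []
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        c, s = math.cos(theta), math.sin(theta)
        # Radon projection at angle theta: count pixels per unit-width bin of rho.
        bins = Counter(
            math.floor((x - cx) * c + (y - cy) * s + 0.5) for x, y in pixels
        )
        # R-signature value: sum over rho of the squared projection.
        signature.append(float(sum(n * n for n in bins.values())))
    return signature

# A 10-pixel horizontal line: spread out when projected at 0 degrees,
# collapsed into a single bin when projected at 90 degrees.
line = [(x, 0) for x in range(10)]
sig = r_signature(line)
```

The signature peaks at the angle where the line collapses into one bin, which is the kind of directional cue a detector for straight strokes could exploit.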
R-signatures
Detection of arrowheads [Girardeau & Tabbone]
DEA degree thesis, INPL, Nancy, Jul. 2002
R-signatures
Another example [Girardeau & Tabbone]
Ridgelets
[Ramos Terrades & Valveny – GREC’03]
Proc. GREC’03, Barcelona,
pp. 202-211
Vector-based signatures
[Dosch & Lladós – GREC’03]
Based on set of basic graphical features:
Parallelism
Overlap
Collinearity
T- and V-junctions
Quality factor associated with the various
relations
Match signatures of reference symbols with
signatures of buckets
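The relations listed above can be sketched as pairwise tests on line segments. This is only a minimal illustration covering parallelism, collinearity and V-junctions; overlap, T-junctions and the quality factors are left out, and the tolerances `angle_tol` and `dist_tol` as well as all function names are hypothetical.

```python
import math
from itertools import combinations

def direction(seg):
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi  # undirected angle in [0, pi)

def angle_gap(a, b):
    d = abs(direction(a) - direction(b))
    return min(d, math.pi - d)

def point_line_distance(p, seg):
    # Distance from point p to the infinite line through seg (seg must not be degenerate).
    (x1, y1), (x2, y2) = seg
    length = math.hypot(x2 - x1, y2 - y1)
    return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / length

def relation(a, b, angle_tol=math.radians(5), dist_tol=1.0):
    """Classify a pair of segments; tolerances are illustrative assumptions."""
    if angle_gap(a, b) <= angle_tol:
        if point_line_distance(b[0], a) <= dist_tol:
            return "collinear"
        return "parallel"
    # Non-parallel segments that share an endpoint form a V-junction.
    if any(math.dist(p, q) <= dist_tol for p in a for q in b):
        return "v_junction"
    return None

def vector_signature(segments):
    """Count the relations found over all pairs of segments."""
    counts = {"parallel": 0, "collinear": 0, "v_junction": 0}
    for a, b in combinations(segments, 2):
        rel = relation(a, b)
        if rel:
            counts[rel] += 1
    return counts

square = [((0, 0), (10, 0)), ((10, 0), (10, 10)),
          ((10, 10), (0, 10)), ((0, 10), (0, 0))]
signature = vector_signature(square)
```

For the square, the signature counts two pairs of parallel sides and four corner junctions; a reference symbol and a candidate region could then be compared by comparing such counts.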
Vector-based signatures
Proc. GREC’03,
Barcelona,
pp. 159-169
Towards symbol spotting
Pre-compute – or compute on the spot – a
set of basic signatures
Can be sufficient for symbol spotting and
retrieval
Followed by classical symbol recognition if
more discrimination is needed
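The two-stage idea above can be sketched as a cheap signature filter followed by an optional, more discriminating recognizer. The Euclidean distance, the threshold `max_dist`, the region identifiers and the `recognizer` callback are all assumptions made for illustration.

```python
import math

def spot(query_sig, region_sigs, max_dist, recognizer=None):
    """Return ids of regions whose pre-computed signature is close to the query.

    region_sigs maps a region id to its signature (a tuple of floats).
    recognizer, if given, is a slower check applied only to the survivors.
    """
    candidates = [
        rid for rid, sig in region_sigs.items()
        if math.dist(query_sig, sig) <= max_dist
    ]
    if recognizer is not None:
        candidates = [rid for rid in candidates if recognizer(rid)]
    return candidates

# Hypothetical pre-computed signatures for three regions of a drawing.
regions = {"r1": (1.0, 0.0), "r2": (0.9, 0.1), "r3": (0.0, 1.0)}
coarse = spot((1.0, 0.0), regions, max_dist=0.2)
refined = spot((1.0, 0.0), regions, max_dist=0.2,
               recognizer=lambda rid: rid != "r2")
```

The filter alone may be enough for retrieval; the recognizer only runs on the few candidates that pass it.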
Symbol spotting
[Jabari & Tabbone]: graph matching through
probabilistic relaxation, with nodes = segments and
edges = relations
DEA degree thesis, INPL, Nancy, Jul. 2003
Symbol spotting
[Jabari & Tabbone] : another example
Combining Text and Graphics
Extracting Text/Graphics relationships within
document
Using Text matching for inter-document
relationships
Transitive inter-document Graphics matching
No need for complex graphics matching
Restricted to well known document types
Example: continuation of Wiring
Diagrams (Boeing)
[Baum et al. – GREC’03]
Proc. GREC’03, Barcelona, pp. 132-138
Scan2XML Example
Proc. GREC’01, Kingston (Ontario, Canada), pp. 312-325
Indexing and Semantics
Signature + metric
Semantics = measured distance to signature
Applies only to homogeneous contexts
Pre-segmented images
Pre-determined image classes
Implicit application of domain knowledge
...
Semantics = Syntax
Example
Signature type A
Metric M
Signature value l
M(l,1) < 1 ?
M(l,2) < 2 ?
Semantics1 = (1, 1)
Semantics2 = (2, 2)
semantics = measurement to reference value
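The example above reads as: a signature value carries a given meaning when its distance to a reference value, under the chosen metric, falls below a threshold. A minimal sketch follows; the reference values, thresholds and the absolute-difference metric are made up for illustration, since the symbols used on the slide did not survive.

```python
def interpret(value, references, metric):
    """Return the labels whose reference value lies within its threshold."""
    return [
        label
        for label, (reference, threshold) in references.items()
        if metric(value, reference) < threshold
    ]

def metric_m(a, b):
    # Hypothetical metric M: absolute difference between scalar signatures.
    return abs(a - b)

# Each semantic label is a (reference value, threshold) pair.
references = {"Semantics1": (10.0, 2.0), "Semantics2": (20.0, 3.0)}

labels = interpret(11.0, references, metric_m)
no_labels = interpret(15.0, references, metric_m)
```

In this reading the "semantics" is nothing more than the outcome of the measurement, which is why it only works when all documents share the same signature type and metric.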
Heterogeneous Document Bases
Semantics do not have a unique syntax
anymore
Syntax metrics may be context sensitive
Semantics = Syntax + Context
Context needs to be considered
Two different contexts from the
automobile industry
Example
Context 1:
Signature type A
Metric M
Context 2:
Signature type B
Metric N
(1, 1) = Semantics1 = (t1, n1)
(2, 2) = Semantics2 = (t2, n2)
Signature value l
What if
M(l,1) < 1 and
N(l,t2) < n2 ?
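The conflict raised by the "What if" question can be made concrete with a small sketch: the same signature value is tested in two contexts, each with its own metric and reference pairs, and ends up with a different meaning in each. All numbers and both metrics are hypothetical.

```python
def absolute_metric(a, b):
    # Stand-in for metric M of context 1.
    return abs(a - b)

def relative_metric(a, b):
    # Stand-in for metric N of context 2.
    return abs(a - b) / max(abs(b), 1e-9)

# Each context has its own metric and its own (reference, threshold) pairs.
contexts = {
    "context1": {
        "metric": absolute_metric,
        "references": {"Semantics1": (10.0, 1.0), "Semantics2": (20.0, 1.0)},
    },
    "context2": {
        "metric": relative_metric,
        "references": {"Semantics1": (30.0, 0.1), "Semantics2": (11.0, 0.1)},
    },
}

def interpret_all(value, contexts):
    """Interpret one signature value separately in every context."""
    result = {}
    for name, ctx in contexts.items():
        result[name] = [
            label
            for label, (reference, threshold) in ctx["references"].items()
            if ctx["metric"](value, reference) < threshold
        ]
    return result

meanings = interpret_all(10.5, contexts)
```

Here the value 10.5 maps to Semantics1 in the first context and to Semantics2 in the second, so the measurement alone cannot decide; the context has to be known.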
A step towards taking context into account
(while consolidating existing approaches)
Component Algebra :
Image Analysis
= Pipeline
Syntax + algorithm = semantics
Data (syntax) → Algorithm → Data (semantics) → Algorithm → Data (semantics)
Syntax and semantics need not be distinguished
Component Algebra
Components :
Known and implemented document analysis
algorithms, taking input data from one domain,
and producing data into another domain.
Application Context :
Set of all available Components.
Semantics :
Data sets needed by or produced by Components.
Component Algebra is a Graph
(Figure: Data nodes linked by Components)
Advantages
Each node is a semantic concept, semantic
relationships are explicitly expressed.
Structure may support automatic reasoning
and knowledge inference.
Context is embedded in components, different
contexts give different paths in the graph.
Highly scalable and open architecture.
Bridge between signal-level document
analysis and high-level document
representation.
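One way to picture the graph described above is to treat data domains as nodes and components as directed edges, so that a processing pipeline is simply a path. The sketch below is an assumption-laden illustration: the component and domain names are invented, and breadth-first search is just one possible way to pick a path.

```python
from collections import deque, namedtuple

# A component maps data from one domain (source) into another (target).
Component = namedtuple("Component", "name source target")

def find_pipeline(components, start, goal):
    """Breadth-first search for a shortest chain of components from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        domain, path = queue.popleft()
        if domain == goal:
            return path
        for comp in components:
            if comp.source == domain and comp.target not in seen:
                seen.add(comp.target)
                queue.append((comp.target, path + [comp.name]))
    return None  # no chain of available components reaches the goal

# Hypothetical application context: the set of all available components.
context = [
    Component("binarize", "grey image", "binary image"),
    Component("separate text/graphics", "binary image", "graphics layer"),
    Component("vectorize", "graphics layer", "vectors"),
    Component("compute vector signatures", "vectors", "signatures"),
    Component("compute R-signature", "binary image", "signatures"),
]

pipeline = find_pipeline(context, "grey image", "signatures")
unreachable = find_pipeline(context, "vectors", "grey image")
```

Adding or removing components changes which paths exist, which is one way to read the claim that different contexts give different paths in the graph.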
However ...
The formalism exists, the realization doesn't (yet)
What about parametrization?
How context-independent can you get?
What about "guessing" context
appropriateness?
How to design fully interoperable components?
Conclusion
A lot of DA methods – and more specifically
GR methods – can be of direct use in IR,
indexing and browsing applications
Specific challenges
Scaling and efficiency
Heterogeneous sets of documents
Incomplete domain knowledge
Symbol spotting
On-the-fly symbol searching
Sketch of open framework for including
document semantics when context can be
heterogeneous