Graphics Recognition – from
Re-engineering to Retrieval
Karl Tombre, Bart Lamiroy
LORIA, France
Document Analysis in the IR era
Information is at the core of industrial
strategies
- A lot of digital or digitized information, but often in very “poor” formats
- The challenge: not necessarily to re-engineer documents, but to enrich poorly structured information, add a (limited) amount of semantics, and build indexes
Purposes: browsing, navigation, indexing
DAR methods and tools useful, but must
be adapted

Specific challenges of large-scale IR applications





Genericity: we cannot necessarily build a
complete and exhaustive a priori model of
contextual knowledge (ontology)
Adaptability: various input data – scanned
paper, PDF, DXF, HTML, GIF… – various
resolutions
Robustness: “back-office” applications
Efficiency: online searching in
heterogeneous data
Scaling: methods have to scale to
increasing number of symbols/features
DAR and IR
Media without (or with very little)
contextual knowledge
Image-based indexing and retrieval,
indexing of video sequences
Documents do explicitly convey information from one person to another
Much more structure, syntax and
semantics

DAR and IR – some examples
Indexing and/or searching scanned text
without OCR
Similarities, signatures
- Query or index on layout structure
- Table spotting
- Keyword spotting
- …

What about Graphics
Recognition?


Subfield of DAR, for graphics-rich
documents
Numerous methods for various analysis
and recognition problems
- Raster-to-vector conversion
- Text/graphics separation
- Symbol recognition

Many specific technical areas: maps,
architectural drawings, engineering
drawings, diagrams and schematics, …
Graphics recognition methods

Text/graphics separation
Graphics recognition methods

Vectorization
Graphics recognition and IR
applications


Usual text-based indexing and retrieval
still useful
But need for access to other kinds of
information:
- Symbols
- Text-drawing connections
- Description-illustration connections
Some contributions

Syeda-Mahmood – maintenance drawings
IEEE Trans. on PAMI 21(8):737-751, Aug. 1999
Some contributions

Arias et al., Najman et al. – use of information
contained in legend / title block
Proc. GREC’01, Kingston (Ontario, Canada), pp. 19-26, Sept. 2001
Some contributions

Samet & Soffer – symbols from legend
IEEE Trans. on PAMI 18(8):783-798, Aug. 1996
Some contributions

Müller & Rigoll – graphical retrieval in database
of engineering drawings
Proc. ICDAR’99, Bangalore (India), pp. 697-700, Sept. 1999
Some contributions

Boose et al. (Boeing) – Generation of Layered Illustrated Parts Drawings
Proc. GREC’03, Barcelona, pp. 139-144
Wishful thinking?
Symbol DB
Or even better…
Symbol recognition




Natural features for indexing and retrieval
Most methods work with known databases
of reference symbols – what about
interactive
querying of arbitrary symbols?
From segmentation followed by recognition, to segmentation-free recognition, or segmenting while recognizing
Scalability
- Efficiency / complexity
- Discrimination power
Before we move on: 1st contest on symbol recognition held last week. See IAPR TC10 homepage for further details.
Signatures
Image-based signatures

Compute invariant signatures on binary
document image
- F-signatures (ICDAR’01)
- Radon transform: R-signatures [Tabbone & Wendling]
- Ridgelets [Ramos Terrades & Valveny – GREC’03], i.e. wavelet transform of the Radon transform
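The idea of a Radon-based signature can be sketched as follows. This is a minimal illustration, not the definition used by the cited authors: the function names, the unit-width binning and the choice of keeping the sum of squared projection values per direction are all assumptions made for the example. Centring on the centroid makes the result independent of translation.

```python
import math


def r_signature(pixels, n_angles=36):
    """Radon-style signature of a binary shape given as (x, y) foreground pixels.

    For each direction theta, the pixels are projected onto the axis
    (cos theta, sin theta) and counted in unit-width bins (a discrete Radon
    projection). The signature keeps the sum of squared bin counts per
    direction, normalised so that the values sum to 1.
    """
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n  # centroid: removes the effect
    cy = sum(y for _, y in pixels) / n  # of translating the shape
    signature = []
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        c, s = math.cos(theta), math.sin(theta)
        bins = {}
        for x, y in pixels:
            rho = (x - cx) * c + (y - cy) * s
            b = math.floor(rho)
            bins[b] = bins.get(b, 0) + 1
        signature.append(sum(v * v for v in bins.values()))
    total = sum(signature)
    return [v / total for v in signature]


def signature_distance(a, b):
    """L1 distance between two signatures of equal length."""
    return sum(abs(u - v) for u, v in zip(a, b))


# Hypothetical test shapes (assumptions for the example).
bar = [(x, y) for x in range(20) for y in range(4)]      # elongated bar
shifted_bar = [(x + 7, y + 3) for x, y in bar]           # same bar, moved
square = [(x, y) for x in range(9) for y in range(9)]    # different shape
```

Comparing `r_signature(bar)` with `r_signature(shifted_bar)` and with `r_signature(square)` through `signature_distance` shows how such a signature can serve as an index key: the moved bar stays close to the original, the square does not.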
R-signatures
Detection of arrowheads [Girardeau & Tabbone]
DEA degree thesis, INPL, Nancy, Jul. 2002
R-signatures
Another example [Girardeau & Tabbone]
Ridgelets
[Ramos Terrades & Valveny – GREC’03]
Proc. GREC’03, Barcelona,
pp. 202-211
Vector-based signatures
[Dosch & Lladós – GREC’03]
Based on a set of basic graphical features:
- Parallelism
- Overlap
- Collinearity
- T- and V-junctions
Quality factor associated with the various
relations
Match signatures of reference symbols with
signatures of buckets
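A rough sketch of what a vector-based signature could look like, assuming the drawing is already vectorized into straight segments. It is not the method of the cited paper: the function names and tolerances are hypothetical, the overlap feature is left out, and plain counts are used instead of the quality factors associated with the relations.

```python
import math
from itertools import combinations


def direction(seg):
    """Undirected angle of a segment, in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi


def angle_between(a, b):
    d = abs(direction(a) - direction(b))
    return min(d, math.pi - d)


def point_line_distance(p, seg):
    """Distance from point p to the infinite line carrying seg."""
    (x1, y1), (x2, y2) = seg
    length = math.hypot(x2 - x1, y2 - y1)
    return abs((x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)) / length


def on_interior(p, seg, tol):
    """True if p lies on seg, away from both endpoints."""
    (x1, y1), (x2, y2) = seg
    if point_line_distance(p, seg) > tol:
        return False
    length = math.hypot(x2 - x1, y2 - y1)
    t = ((p[0] - x1) * (x2 - x1) + (p[1] - y1) * (y2 - y1)) / length
    return tol < t < length - tol


def close(p, q, tol):
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= tol


def vector_signature(segments, angle_tol=math.radians(5), dist_tol=1.0):
    """Count pairwise relations between segments (crisp counts, no quality factor)."""
    sig = {"parallel": 0, "collinear": 0, "v_junction": 0, "t_junction": 0}
    for a, b in combinations(segments, 2):
        if angle_between(a, b) <= angle_tol:
            if point_line_distance(b[0], a) <= dist_tol:
                sig["collinear"] += 1
            else:
                sig["parallel"] += 1
        elif any(close(p, q, dist_tol) for p in a for q in b):
            sig["v_junction"] += 1          # two segments meeting at endpoints
        elif (any(on_interior(p, b, dist_tol) for p in a)
              or any(on_interior(p, a, dist_tol) for p in b)):
            sig["t_junction"] += 1          # an endpoint touching the other segment
    return sig


# Hypothetical example: a rectangle, then the same rectangle with a middle bar.
rectangle = [((0, 0), (10, 0)), ((10, 0), (10, 5)),
             ((10, 5), (0, 5)), ((0, 5), (0, 0))]
with_bar = rectangle + [((5, 0), (5, 5))]
```

The two example shapes differ only in their parallel and T-junction counts, which is the kind of cheap difference a signature match against reference symbols can exploit.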
Vector-based signatures
Proc. GREC’03,
Barcelona,
pp. 159-169
Towards symbol spotting



Pre-compute – or compute on the spot – a
set of basic signatures
Can be sufficient for symbol spotting and
retrieval
Followed by classical symbol recognition if
more discrimination is needed
Symbol spotting

[Jabari & Tabbone]: graph matching through probabilistic relaxation, with nodes = segments and edges = relations
DEA degree thesis, INPL, Nancy, Jul. 2003
Symbol spotting

[Jabari & Tabbone]: another example
Combining Text and Graphics





Extracting Text/Graphics relationships within
document
Using Text matching for inter-document
relationships
Transitive inter-document Graphics matching
No need for complex graphics matching
Restricted to well known document types
Example: continuation of Wiring
Diagrams (Boeing)

[Baum et al. – GREC’03]
Proc. GREC’03, Barcelona, pp. 132-138
Scan2XML Example
Proc. GREC’01, Kingston (Ontario, Canada), pp. 312-325
Indexing and Semantics



Signature + metric
Semantics = measured distance to signature
Applies only to homogeneous contexts
- Pre-segmented images
- Pre-determined image classes
- Implicit application of domain knowledge
- ...

Semantics = Syntax
Example
Signature type A
Metric M
Signature value l

M(l,1) < 1 ?
M(l,2) < 2 ?
Semantics1 = (1,
1)
Semantics2 = (2,
semantics = measurement to reference value
2)
Heterogeneous Document Bases



Semantics do not have a unique syntax
anymore
Syntax metrics may be context sensitive
Semantics = Syntax + Context
Context needs to be considered
Two different contexts from the
automobile industry
Example
Context 1:
Signature type A
Metric M
Context 2:
Signature type B
Metric N
(1, 1) = Semantics1 = (t1, n1)
(2, 2) = Semantics2 = (t2, n2)
Signature value l

What if
M(l,1) < 1 and
N(l,t2) < n2 ?
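The "what if" case above can be made concrete with a small sketch: the same signature value is interpreted in two contexts, each with its own metric and its own reference values, and the two interpretations disagree. The `Context` class, both metrics and all numbers are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Context:
    name: str
    metric: Callable[[float, float], float]
    semantics: Dict[str, Tuple[float, float]]  # label -> (reference, tolerance)

    def labels(self, value: float) -> List[str]:
        """Labels whose reference lies within tolerance of `value` in this context."""
        return [label
                for label, (reference, tolerance) in self.semantics.items()
                if self.metric(value, reference) < tolerance]


def absolute_difference(a, b):
    return abs(a - b)


def relative_difference(a, b):
    scale = max(abs(a), abs(b))
    return abs(a - b) / scale if scale else 0.0


def interpret(value, contexts):
    """Interpret one signature value in every available context."""
    return {ctx.name: ctx.labels(value) for ctx in contexts}


def is_ambiguous(interpretation):
    """True if the contexts assign different labels to the same value."""
    distinct = {label for labels in interpretation.values() for label in labels}
    return len(distinct) > 1


# Hypothetical contexts: same labels, different metrics and reference values.
context_1 = Context("context 1", absolute_difference,
                    {"Semantics1": (0.20, 0.05), "Semantics2": (0.80, 0.10)})
context_2 = Context("context 2", relative_difference,
                    {"Semantics1": (2.0, 0.1), "Semantics2": (0.25, 0.3)})
```

With these numbers, the value 0.22 is Semantics1 in context 1 and Semantics2 in context 2, so the measurement alone cannot decide: the context has to be part of the semantics.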
A step towards taking context into account
(while consolidating existing approaches)
Component Algebra:
- Image Analysis = Pipeline
- Syntax + algorithm = semantics
[Figure: pipeline alternating Data and Algorithm boxes, from Data (syntax) to Data (semantics)]
Syntax and semantics need not be distinguished
Component Algebra

Components:
Known and implemented document analysis algorithms, taking input data from one domain and producing data in another domain.

Application Context:
Set of all available Components.

Semantics:
Data sets needed by or produced by Components.
Component Algebra is a Graph
[Figure: graph in which Data nodes are linked through Component nodes]
Advantages





Each node is a semantic concept, semantic
relationships are explicitly expressed.
Structure may support automatic reasoning
and knowledge inference.
Context is embedded in components, different
contexts give different paths in the graph.
Highly scalable and open architecture.
Bridge between signal-level document
analysis and high-level document
representation.
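Since the realization of the formalism is still open, the following is only a toy sketch of the graph view: data domains are nodes, components are edges, and a path from one domain to another is a pipeline. The component names, the domains and the single-input restriction are assumptions made for the example.

```python
from collections import deque
from typing import Callable, List, NamedTuple, Optional


class Component(NamedTuple):
    name: str
    source: str      # data domain consumed
    target: str      # data domain produced
    run: Callable    # the algorithm itself


def find_pipeline(components: List[Component], start: str,
                  goal: str) -> Optional[List[Component]]:
    """Breadth-first search for a shortest chain of components from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        domain, path = queue.popleft()
        if domain == goal:
            return path
        for comp in components:
            if comp.source == domain and comp.target not in seen:
                seen.add(comp.target)
                queue.append((comp.target, path + [comp]))
    return None


def run_pipeline(pipeline, data):
    for comp in pipeline:
        data = comp.run(data)
    return data


def extract_runs(bits):
    """Toy 'vectorization' of a 1-D binary signal into (start, length) runs."""
    runs, start = [], None
    for i, bit in enumerate(bits):
        if bit and start is None:
            start = i
        elif not bit and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(bits) - start))
    return runs


# Hypothetical application context: the set of all available components.
application_context = [
    Component("binarize", "gray_image", "binary_image",
              lambda img: [v > 128 for v in img]),
    Component("extract_runs", "binary_image", "runs", extract_runs),
    Component("summarize", "runs", "signature",
              lambda runs: (len(runs), sum(length for _, length in runs))),
]
```

Calling `find_pipeline(application_context, "gray_image", "signature")` returns the chain of components to run. A different application context would yield a different path, or none, for the same request.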
However ...
The formalism exists, the realization doesn't (yet)




What about parametrization?
How context-independent can you get?
What about “guessing” context appropriateness?
How to design fully interoperable components?
Conclusion


A lot of DA methods – and more specifically
GR methods – can be of direct use in IR,
indexing and browsing applications
Specific challenges
- Scaling and efficiency
- Heterogeneous sets of documents
- Incomplete domain knowledge
- Symbol spotting
- On-the-fly symbol searching

Sketch of open framework for including
document semantics when context can be
heterogeneous