Towards automatic coin classification

L.J.P. van der Maaten
E.O. Postma
Introduction
• RICH project (Reading Images for the Cultural Heritage)
• Initiated by NWO-CATCH (grant 640.002.401)
• Institutions involved:
– MICC-IKAT (Maastricht University)
– ROB (Dutch State Service for Archaeology)
• People involved:
– E.O. Postma, A.G. Lange, H.H. Paijmans,
L.J.P. van der Maaten, P.J. Boon
Introduction
• Automatic coin classification based on
visual features
– May allow sorting heterogeneous coin
collections, both modern and historical
– For modern coins, applications for charity
organizations, financial institutions, and
change offices (MUSCLE CIS benchmark)
– For historical coins, applications in the cultural
heritage domain
Introduction
• Currently, some historical coin collections
are being disclosed on the Internet, e.g., in
NUMIS¹
• The NUMIS website shows information (and
sometimes photographs) on collected
coins
• However, the usefulness of such websites to
non-experts is limited
¹ A project of the Dutch Money Museum
Introduction
• Non-experts who find a coin would like to
know what sort of coin it is
– i.e. coin classification based on visual
features
• Non-experts would benefit from a system
for automatic coin classification
• Also beneficial to experts to speed up and
objectify the classification process
Introduction
• A modern and a historical coin photograph
Introduction
• This presentation
– Presents a number of features that can be
used for the classification of modern coins
– Shows promising results for these features
– Investigates the performance of the same
features on a medieval coin dataset
– Tries to provide some insight into why the
features fail on the medieval coin data
Features
• Contour features
– Edge distance distributions
– Edge angle distributions
– Edge angle-distance distributions
• Texture features
– Gabor histograms
– Daubechies D4 wavelet features
Contour features
• Measure statistical
distributions of edge
pixels
• Edge pixels computed
using Sobel filter
convolution (with non-maxima
suppression and dynamic thresholding)
• Coin borders are
removed
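The edge-detection step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sobel_edges` and the parameter `rel_threshold` are hypothetical, the "dynamic" threshold is assumed to be a fraction of the maximum gradient magnitude (the slides do not give the rule), and non-maxima suppression is left out for brevity.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_edges(image, rel_threshold=0.5):
    """Return (row, col) edge pixels of a 2-D grayscale image (list of lists).

    Assumption: the threshold is rel_threshold times the largest gradient
    magnitude in the image. Non-maxima suppression is omitted.
    """
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):          # skip the one-pixel image border
        for c in range(1, w - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    p = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * p
                    gy += SOBEL_Y[dr][dc] * p
            mag[r][c] = math.hypot(gx, gy)
    peak = max(max(row) for row in mag)
    if peak == 0:
        return []                      # flat image: no edges
    return [(r, c) for r in range(h) for c in range(w)
            if mag[r][c] >= rel_threshold * peak]

# Toy image: dark left half, bright right half, giving one vertical edge.
toy = [[0, 0, 0, 9, 9, 9] for _ in range(5)]
edges = sobel_edges(toy)
```

On the toy image the detected edge pixels lie in the two columns on either side of the dark-to-bright transition.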
Edge distance distributions
• Estimate the
distribution of the
distances of edge
pixels to the center of
the coin
• Rotation invariant
feature
• Can be measured on
coarse-to-fine scales
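A minimal sketch of an edge distance distribution, assuming the edge pixels, the coin center, and the coin radius are already known. The names `edge_distance_histogram` and `rotate` are hypothetical; `rotate` exists only to check that the feature does not change when the coin is rotated.

```python
import math

def edge_distance_histogram(edge_pixels, center, radius, n_bins=8):
    """Normalized histogram of edge-pixel distances to the coin center."""
    hist = [0] * n_bins
    for r, c in edge_pixels:
        d = math.hypot(r - center[0], c - center[1])
        if d >= radius:
            continue  # ignore pixels on or outside the coin border
        hist[int(d / radius * n_bins)] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist

def rotate(points, center, angle):
    """Rotate points about the center (used only to check invariance)."""
    ca, sa = math.cos(angle), math.sin(angle)
    return [(center[0] + ca * (r - center[0]) - sa * (c - center[1]),
             center[1] + sa * (r - center[0]) + ca * (c - center[1]))
            for r, c in points]

# Toy edge pixels at distances 1, 3, 5, 7, and 7 from the center.
pixels = [(10, 11), (10, 13), (15, 10), (3, 10), (10, 17)]
center, radius = (10, 10), 8
h1 = edge_distance_histogram(pixels, center, radius, n_bins=4)
h2 = edge_distance_histogram(rotate(pixels, center, 1.0), center, radius, n_bins=4)
```

Calling the function with several values of `n_bins` and concatenating the results gives one possible reading of the coarse-to-fine measurement.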
Edge angle distributions
• Measure distribution
of angles of edge
pixels w.r.t. the
baseline
• Not rotation invariant
by definition
(however, the
magnitude of the
Fourier transform is)
• Can be measured on
a number of fine scales
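The rotation argument can be illustrated with a short sketch. It assumes the angle of an edge pixel is measured between the horizontal baseline and the line from the coin center to the pixel; the slides do not define this precisely. Rotating the coin then circularly shifts the histogram, and the magnitude of the discrete Fourier transform is unchanged by a circular shift. The function names are hypothetical.

```python
import cmath
import math

def edge_angle_histogram(edge_pixels, center, n_bins=8):
    """Histogram of edge-pixel angles (0 to 2*pi) around the coin center."""
    hist = [0] * n_bins
    for r, c in edge_pixels:
        theta = math.atan2(r - center[0], c - center[1]) % (2 * math.pi)
        hist[int(theta / (2 * math.pi) * n_bins) % n_bins] += 1
    return hist

def dft_magnitude(hist):
    """Magnitude of the discrete Fourier transform of a histogram."""
    n = len(hist)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(hist)))
            for k in range(n)]

quadrants = edge_angle_histogram(
    [(1, 1), (1, -1), (-1, -1), (-1, 1), (2, 2)], center=(0, 0), n_bins=4)

hist = [5, 1, 0, 2, 0, 0, 3, 1]
shifted = hist[3:] + hist[:3]   # the same coin rotated by three angular bins
same = all(abs(a - b) < 1e-9
           for a, b in zip(dft_magnitude(hist), dft_magnitude(shifted)))
```

The invariance holds exactly only for rotations by a whole number of bins; other rotations change the histogram slightly through binning effects.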
Edge angle-distance distributions (EADD)
• Incorporate both
angular and distance
information in the coin
stamp
• We measure EADD
using 2, 4, 8, and 16
distance bins and 180
angular bins
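A minimal sketch of a multi-scale EADD as a joint distance-angle histogram. The distance bin counts (2, 4, 8, 16) and the 180 angular bins come from the slide; the assumption that the angular bins span the full circle, and the names `eadd` and `multiscale_eadd`, are illustrative.

```python
import math

def eadd(edge_pixels, center, radius, n_dist_bins, n_angle_bins=180):
    """Joint histogram over (distance bin, angle bin) of edge pixels."""
    hist = [[0] * n_angle_bins for _ in range(n_dist_bins)]
    for r, c in edge_pixels:
        dr, dc = r - center[0], c - center[1]
        d = math.hypot(dr, dc)
        if d >= radius:
            continue  # ignore pixels on or outside the coin border
        theta = math.atan2(dr, dc) % (2 * math.pi)
        i = int(d / radius * n_dist_bins)
        j = int(theta / (2 * math.pi) * n_angle_bins) % n_angle_bins
        hist[i][j] += 1
    return hist

def multiscale_eadd(edge_pixels, center, radius, scales=(2, 4, 8, 16)):
    """Concatenate the flattened EADDs for every number of distance bins."""
    feature = []
    for n in scales:
        for row in eadd(edge_pixels, center, radius, n):
            feature.extend(row)
    return feature

pixels = [(11, 11), (13, 7), (5, 5)]
feature = multiscale_eadd(pixels, center=(10, 10), radius=8)
```

With these settings the feature vector has (2 + 4 + 8 + 16) × 180 = 5,400 entries, and each edge pixel is counted once per scale.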
Gabor histograms
• Convolution of coin image
with Gabor filters of
various scales and
rotations
• Compute image
histograms of the
resulting convolution
images
• Apply PCA for
dimensionality reduction
(to 200 dimensions)
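The filtering and histogram steps can be sketched as below. All parameter values (kernel size, wavelengths, number of orientations, `sigma`, number of histogram bins) are assumptions chosen for a small example; the slides do not give them. Only the real part of the Gabor filter is used, and the PCA step is omitted.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a Gabor filter: a Gaussian-windowed cosine grating."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def filter_valid(image, kernel):
    """Slide the kernel over the image, keeping fully covered positions.

    The kernel above is point-symmetric, so this correlation equals
    a convolution.
    """
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(k) for j in range(k))
             for c in range(w - k + 1)]
            for r in range(h - k + 1)]

def histogram(values, n_bins, lo, hi):
    """Fixed-range histogram; out-of-range values go to the end bins."""
    hist = [0] * n_bins
    for v in values:
        idx = min(n_bins - 1, max(0, int((v - lo) / (hi - lo) * n_bins)))
        hist[idx] += 1
    return hist

def gabor_histograms(image, wavelengths=(4, 8), n_orient=4, n_bins=16):
    """Concatenate response histograms over all scales and orientations."""
    feature = []
    for lam in wavelengths:
        for o in range(n_orient):
            kern = gabor_kernel(5, lam, math.pi * o / n_orient, sigma=2.0)
            # Largest possible response for pixel values in 0..255.
            bound = sum(abs(v) for row in kern for v in row) * 255
            resp = [v for row in filter_valid(image, kern) for v in row]
            feature.extend(histogram(resp, n_bins, -bound, bound))
    return feature

# Synthetic 12x12 test pattern standing in for a coin image.
image = [[(37 * r + 11 * c) % 256 for c in range(12)] for r in range(12)]
feature = gabor_histograms(image)
```

Because the histograms discard the position of each response, the feature describes texture statistics rather than the layout of the stamp.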
Daubechies D4 wavelet
• Perform wavelet decomposition using
Daubechies D4 wavelet
• Computed wavelet coefficients are used
as features (2-, 3-, and 4-level; approximation, horizontal, vertical, and diagonal coefficients)
• Do this for 16 rotated versions of the coins
in the training set (for rotation invariance)
• Apply PCA for dimensionality reduction
(results in 200-dimensional feature vector)
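To show what a Daubechies D4 decomposition computes, here is a minimal one-dimensional sketch with periodic boundary handling. The slides describe a two-dimensional decomposition of coin images; a 2-D version would apply the same step along rows and then columns. The function names are hypothetical, and the rotation and PCA steps are omitted.

```python
import math

S3 = math.sqrt(3.0)
NORM = 4 * math.sqrt(2.0)
# Daubechies D4 scaling (low-pass) filter and its wavelet (high-pass) filter.
H = [(1 + S3) / NORM, (3 + S3) / NORM, (3 - S3) / NORM, (1 - S3) / NORM]
G = [H[3], -H[2], H[1], -H[0]]

def d4_step(signal):
    """One decomposition level; len(signal) must be even and at least 4."""
    n = len(signal)
    approx, detail = [], []
    for i in range(0, n, 2):
        window = [signal[(i + k) % n] for k in range(4)]  # periodic wrap
        approx.append(sum(h * x for h, x in zip(H, window)))
        detail.append(sum(g * x for g, x in zip(G, window)))
    return approx, detail

def d4_decompose(signal, levels):
    """Multi-level decomposition: [coarsest approx, coarse to fine details]."""
    coeffs = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = d4_step(approx)
        coeffs = detail + coeffs
    return approx + coeffs

signal = [float((7 * i) % 5) for i in range(16)]
coeffs = d4_decompose(signal, levels=3)
flat_approx, flat_detail = d4_step([2.0] * 8)
```

The transform is orthonormal, so the coefficients carry the same energy as the input, and a constant signal produces zero detail coefficients.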
Experiments
• Performed on the MUSCLE CIS
benchmark coin dataset¹
• The dataset contains 692 coin classes
with 2,270 coin faces
• Training set of 20,000 coins
• Test set of 5,000 coins
• Incorporate area measurements
¹ Newer experiments than the ones described in the paper
Experiments
• Classification performances (5-NN classifier)
Feature                              Performance
Edge distance distributions          68%
Edge angle distributions             62%
Edge angle-distance distributions    78%
Gabor histograms                     55%
Daubechies D4 wavelet features       46%
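A 5-nearest-neighbour classifier of the kind named above can be sketched in a few lines. Euclidean distance and majority voting are assumptions, since the slides do not state the distance measure; the toy feature vectors and class labels are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=5):
    """Label of the majority among the k nearest training vectors."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D feature vectors for two hypothetical coin classes.
train_x = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = ["A", "A", "A", "A", "B", "B", "B", "B"]
label_1 = knn_predict(train_x, train_y, (0.5, 0.4))
label_2 = knn_predict(train_x, train_y, (5.5, 5.6))
```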
Experiments
• Subsequently, we
performed experiments
on the Merovingen
dataset¹
• Contains 4,569 early-medieval coins
• Class distribution skewed
• Experiments using 10-fold cross-validation
¹ Dataset property of the Dutch Money Museum
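The cross-validation protocol can be sketched as a plain index split. Whether the original experiments shuffled or stratified the folds is not stated, so the shuffling with a fixed seed here is an assumption, and `k_fold_indices` is a hypothetical helper.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

# 4,569 is the number of coins in the dataset described above.
splits = list(k_fold_indices(4569, k=10))
```

With a skewed class distribution, some small classes may be missing from individual folds unless the split is stratified by class.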
Experiments
• Skewed class distributions
Class type     No. of classes    Mean class size    St. dev. of class size
City           18                53                 125
Mint master    19                69                 121
Currency       4                 859                1,469
Nation         12                199                438
Experiments
• Classification performances (naïve Bayes classifier)
Feature              City    Mint master    Currency    Nation
Area                 16%     10%            61%         17%
Edge dist. distr.    12%     8%             50%         20%
Edge angle distr.    8%      6%             34%         14%
Gabor histogram      5%      6%             25%         8%
D4 wavelet feat.     5%      25%            6%          8%
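For reference, a naïve Bayes classifier treats the features as independent given the class. The sketch below assumes Gaussian class-conditional densities, which the slides do not specify; the function names, the variance floor, and the toy data are illustrative only.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Per class: log prior, per-feature means, per-feature variances."""
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[label].append(x)
    model = {}
    for label, rows in groups.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # The small floor keeps variances strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(zip(*rows), means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Class with the highest log posterior under the independence assumption."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=log_posterior)

# Invented two-feature toy data for two nation labels.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [4.0, 5.0], [4.2, 4.9], [3.9, 5.3]]
y = ["Frankish"] * 3 + ["Frisian"] * 3
model = fit_gaussian_nb(X, y)
pred_1 = predict_gaussian_nb(model, [1.1, 2.0])
pred_2 = predict_gaussian_nb(model, [4.1, 5.1])
```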
Discussion
• Results on modern coin data are
promising
• Results on the Merovingen coin dataset are
disappointing
Discussion
• Reasons for results on medieval coins:
– Contour features rely heavily on the correct
estimation of the center of the stamp
– Texture features are more suitable for coins with
detailed artwork in stamps
– Errors and inconsistencies in this kind of
dataset
Discussion
• Coin classified as
Frankish
• Coin classified as
Frisian
Discussion
• Reasons for results on medieval coins:
– Medieval coins have larger within-class
variances due to quick deterioration of
medieval coin stamps
– Medieval coins have smaller between-class
variances (stamps often contain similar
pictures, such as a cross or the head of an
authority)
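The two variance terms can be made concrete with a small sketch. It computes the within-class and between-class variance of a feature set, summed over feature dimensions; the function name and the one-dimensional toy data are invented to contrast a well-separated case with an overlapping one.

```python
from collections import defaultdict

def class_scatter(X, y):
    """Return (within-class, between-class) variance, summed over features."""
    dims = len(X[0])
    overall = [sum(x[d] for x in X) / len(X) for d in range(dims)]
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[label].append(x)
    within = between = 0.0
    for rows in groups.values():
        mean = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
        within += sum((r[d] - mean[d]) ** 2
                      for r in rows for d in range(dims))
        between += len(rows) * sum((mean[d] - overall[d]) ** 2
                                   for d in range(dims))
    return within / len(X), between / len(X)

labels = ["a", "a", "b", "b"]
# Tight, well-separated classes versus spread-out, overlapping classes.
separated = class_scatter([[0.0], [0.2], [10.0], [10.2]], labels)
overlapping = class_scatter([[0.0], [4.0], [3.0], [7.0]], labels)
```

In the overlapping case the within-class variance (4.0) exceeds the between-class variance (2.25), which is the situation the slide describes for medieval coins; the two terms always add up to the total variance.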
Discussion
• Reasons for results on medieval coins:
– Experts indicate that classifications are based
on small details
– I.e. expert classifications are based on a large
number of small (undocumented) rules
– Experts (consciously or not) take extrinsic
information into account (such as the find
location)
Discussion
• How should a system for automatic
classification of medieval coins work?
– Text is highly discriminating; however, it cannot
be read by state-of-the-art character
recognition
– We foresee the development of a semi-automatic adaptive system in which the expert
indicates distinguishing features of the coin
– Over time, the system should be able to learn
the undocumented rules
Conclusions
• Contour and texture features perform well
in the classification of modern coins
• The results of these features on early-medieval coins are disappointing
• There are various reasons why the
features fail in the classification of
medieval coins
• Future work: semi-automatic approach
Questions?