Indus Script: Search for Grammar¹
Nisha Yadav
Tata Institute of Fundamental Research
Collaborators:
Mayank Vahia, Iravatham Mahadevan, Hrishikesh Joglekar
1 Lecture given at a two-day seminar on “The Indus Script: Problems and Prospects”, Chennai
Contents
1) Indus Script - An Overview
2) Various Approaches
3) Our Approach
4) Dataset
5) Preliminary Analysis
6) Analysis 1: Check against random order
7) Analysis 2: Positional analysis of Frequent Sign combinations
8) Text Beginners and Text Enders
9) Segmentation of Indus Texts
10) Summary
Note: In the lecture, unless specified otherwise, all text examples are from Mahadevan 1977 and all images are from Parpola’s UNESCO volumes of Indus seals.
Indus Script: Search for Grammar, Yadav et al. (2007)
1) Indus Script: An Overview
Indus Valley Civilization
From Mahadevan, 1977
Indus Script: Pointers to understand
- The Indus script is one of the few scripts that defy decipherment.
- Inscriptions are found only on small objects such as seals.
- The inscriptions are very brief: the average length is 4-5 signs.
- There are only 417 signs in the script as per Mahadevan’s Concordance (1977).
- The script is pictographic, with signs showing humans, fish etc.
- Signs are modified by joining or by strokes, and many signs appear as combinations of other simple signs.
- The direction of the script is variable (mostly right to left: 83% of the time).
- In general the seals are 1 to 2 square inches in size.
- There are no bilingual texts to aid decipherment.
Direction indicators of the script
- Cramping or overflow of signs at the left end
- Orientation of asymmetric signs
- Sequence of frequent combinations of signs
- Split sequences
Figure: A split sequence indicating direction
Scale of a typical seal
For the most part, seals are between 1 and 2 inches square.
From Professor John C. Huntington’s ppt
Figure: two seals (labelled SEAL), each shown with its SEAL IMPRESSION
From Professor John C. Huntington’s ppt
Specimens of Indus Texts on different objects
Text No.
Text
From Mahadevan, 1977
2) Various Approaches
Indus Script
Scientists from a variety of disciplines have attempted to read the Indus script, with no clear answer.
Various attempts so far include:
- I. Mahadevan’s analytical work: creation of the first published Concordance (1977)
- Gift Siromoney’s statistical work
- A. Parpola’s comparison with Dravidian
- Russian group’s comparison with Dravidian
- Subbarayappa’s interpretation as pure numerals
- S. R. Rao’s interpretation as Vedic literature
- Others (Ref. Possehl, 1996)
3) Our Approach
- We make no assumption about the script’s content or meaning.
- Our first emphasis is to attempt to WRITE IN THE SCRIPT RATHER THAN READ.
- We search for rules of writing without assigning meanings or interpretations.
- We ignore variation due to archaeological context of sites, stratigraphy and type of objects.
4) Dataset
Dataset
An unambiguous data subset (EBUDS) was created for analysis of the grammar of Indus writing, from the original electronic dataset of Mahadevan (1977), partially modified as M80.
EBUDS (Extended Basic Unique Dataset) excludes:
- All ambiguous lines
- All texts from sides having multiple lines
- All duplicates (keeping a single occurrence of each)
Thus, EBUDS consists of 1548 lines of text, with 7000 sign occurrences.
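The three exclusion rules can be illustrated with a minimal Python sketch. The record layout used here (`signs`, `ambiguous`, `lines_on_side`) is a hypothetical one chosen for the example; it is not the format of Mahadevan's electronic dataset, and the sample records are made up.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TextLine:
    """Hypothetical record for one line of text on one side of an object."""
    signs: Tuple[int, ...]   # sign numbers, in stored order
    ambiguous: bool          # True if any sign in the line is doubtful
    lines_on_side: int       # number of text lines on the same side


def build_unique_subset(lines: List[TextLine]) -> List[Tuple[int, ...]]:
    """Apply the three exclusion rules and return the surviving sign sequences."""
    seen = set()
    subset = []
    for line in lines:
        if line.ambiguous:            # rule 1: drop ambiguous lines
            continue
        if line.lines_on_side > 1:    # rule 2: drop texts from multi-line sides
            continue
        if line.signs in seen:        # rule 3: keep one copy of each duplicate
            continue
        seen.add(line.signs)
        subset.append(line.signs)
    return subset


# Made-up sample records, for illustration only.
sample = [
    TextLine((342, 162, 249), False, 1),
    TextLine((342, 162, 249), False, 1),   # duplicate of the first line
    TextLine((99, 267), True, 1),          # ambiguous
    TextLine((59, 87), False, 2),          # from a side with two lines
    TextLine((211, 89, 336), False, 1),
]
subset = build_unique_subset(sample)
print(len(subset), "lines,", sum(len(t) for t in subset), "sign occurrences")
```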
5) Preliminary Analysis
Frequency distribution of Indus Signs

                   In M77                                  Present Work (EBUDS)
Frequency range    No. of   Total sign    Total sign       No. of   Total sign    Total sign
in M77             signs    occurrences   occurrences (%)  signs    occurrences   occurrences (%)
>1000                 1        1395          10.43            1         715          10.21
999-500               1         649           4.85            1         377           5.39
499-100              31        6344          47.44           31        3230          46.14
99-50                34        2381          17.81           34        1243          17.76
49-10                86        1833          13.71           86         975          13.93
9-2                 152         658           4.92          152         388           5.54
1                   112         112           0.84           72          72           1.03
0                     0                                      40           -              -
Total               417       13372         100.00          417        7000         100.00

Only 67 signs (16% of the total number of signs) account for over 80% of the writing.
Conclusions from Preliminary Analysis
- The frequency distribution of the signs in EBUDS is consistent with M77.
- The manner of choosing the data set has not changed the pattern of occurrence of various signs and the results are consistent with the analysis of M77.
- Only 67 signs (16% of the total number of signs) account for over 80% of the writing.
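The 67-sign figure can be checked from the M77 columns of the frequency table with a few lines of Python. This is only an arithmetic check of the published counts; the variable and function names are chosen for this example.

```python
# (frequency range in M77, number of signs, total sign occurrences in M77),
# copied from the frequency distribution table.
M77_BINS = [
    (">1000", 1, 1395),
    ("999-500", 1, 649),
    ("499-100", 31, 6344),
    ("99-50", 34, 2381),
    ("49-10", 86, 1833),
    ("9-2", 152, 658),
    ("1", 112, 112),
]


def coverage(bins, top_bins):
    """Return (signs, share of signs, share of occurrences) for the first top_bins rows."""
    total_signs = sum(n for _, n, _ in bins)
    total_occurrences = sum(occ for _, _, occ in bins)
    signs = sum(n for _, n, _ in bins[:top_bins])
    occurrences = sum(occ for _, _, occ in bins[:top_bins])
    return signs, signs / total_signs, occurrences / total_occurrences


# The four most frequent bins are the signs occurring 50 times or more in M77.
signs, sign_share, occurrence_share = coverage(M77_BINS, 4)
print(f"{signs} signs = {sign_share:.1%} of signs, {occurrence_share:.1%} of occurrences")
```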
6) Analysis 1: Check against Random Order
Methodology
- We take the 1548 unique texts (7000 signs) present in EBUDS.
- We randomise the order of the signs, keeping the frequency of each sign as in EBUDS.
- We split this long random string (of 7000 signs) into texts of 1 to 14 signs, as in EBUDS.
- We create 10 such random databases.
- We then compare the frequencies of their sign pairs, triplets etc. with those of the genuine Indus database (EBUDS) to check whether Indus texts have any significant sequencing.
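A minimal Python sketch of this kind of shuffle control follows. It assumes that each random database reuses the exact text lengths of the original corpus, which is one reading of "texts of 1 to 14 signs as in EBUDS"; the function names and the toy corpus are illustrative and are not taken from the study.

```python
import random
from collections import Counter
from typing import List, Sequence


def shuffled_copy(texts: Sequence[Sequence[int]], rng: random.Random) -> List[List[int]]:
    """Shuffle all signs together, then cut the pool back into the original text lengths."""
    pool = [sign for text in texts for sign in text]
    rng.shuffle(pool)  # sign frequencies are unchanged, only the order is randomised
    result, start = [], 0
    for text in texts:
        result.append(pool[start:start + len(text)])
        start += len(text)
    return result


def top_ngram_count(texts: Sequence[Sequence[int]], n: int) -> int:
    """Frequency of the most frequent run of n consecutive signs within texts."""
    counts = Counter(
        tuple(text[i:i + n]) for text in texts for i in range(len(text) - n + 1)
    )
    return max(counts.values(), default=0)


# Toy corpus (made up) in which the pair (1, 2) recurs.
toy = [[1, 2, 3], [1, 2], [4, 1, 2], [1, 2, 5]]
rng = random.Random(0)
random_sets = [shuffled_copy(toy, rng) for _ in range(10)]  # ten random databases

print("genuine:", top_ngram_count(toy, 2))
print("random :", [top_ngram_count(r, 2) for r in random_sets])
```

With a real corpus, the comparison would be repeated for n = 2, 3, 4 and so on, as in the comparison table.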
Comparison of EBUDS with Random Datasets
Frequency of most frequent sign combination

No. of signs in    Random Data set                                      EBUDS
the sign
combination         1    2    3    4    5    6    7    8    9   10   Mean
2                  60   54   62   51   57   56   63   66   58   56   58.3    168
3                   5    3    3    4    3    5    7    5    5    3    4.3     34
4                   1    1    1    2    1    1    2    2    1    1    1.3     16
5                   1    1    1    1    1    1    1    1    1    1    1        4
6                   1    1    1    1    1    1    1    1    1    1    1        2
Result of Analysis 1
Figure: Most frequent sign combination frequency vs. no. of signs in the combination. Y-axis: frequency of most frequent combination (0 to 180). X-axis: no. of signs in the combination (1 to 7). Series: Random Datasets (Mean) and Genuine Indus Dataset.
Conclusions from Analysis 1
- Sign strings of length 2, 3 and 4 appear with frequencies far higher than expected by random chance.
- The signs are ordered in a specific manner.
- It is therefore justifiable to state that Indus texts followed certain rules and thereby conveyed something meaningful.
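The size of the gap between EBUDS and the random datasets can be made explicit from the comparison table. The sketch below recomputes the means and reports the ratio of the EBUDS value to the random mean, together with the sample standard deviation of the ten random runs. The ratio and the standard deviation are summary figures added here for illustration; they are not statistics reported in the lecture.

```python
from statistics import mean, stdev

# Frequency of the most frequent combination in each of the ten random datasets,
# copied from the comparison table (combination lengths 2, 3 and 4).
RANDOM_RUNS = {
    2: [60, 54, 62, 51, 57, 56, 63, 66, 58, 56],
    3: [5, 3, 3, 4, 3, 5, 7, 5, 5, 3],
    4: [1, 1, 1, 2, 1, 1, 2, 2, 1, 1],
}
# Corresponding values for the genuine dataset (EBUDS).
EBUDS_TOP = {2: 168, 3: 34, 4: 16}


def summarise(n: int):
    """Return (random mean, random sample std dev, EBUDS / random mean) for length n."""
    runs = RANDOM_RUNS[n]
    m = mean(runs)
    return m, stdev(runs), EBUDS_TOP[n] / m


for n in sorted(RANDOM_RUNS):
    m, s, ratio = summarise(n)
    print(f"length {n}: random mean {m:.1f} (sd {s:.2f}), EBUDS {EBUDS_TOP[n]}, ratio {ratio:.1f}")
```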
7) Analysis 2: Positional analysis of Frequent Sign Combinations
Positional Analysis of Frequent Two-sign Combinations
Two-sign Combination   Frequency   Solo (%)   Left (%)   Middle (%)   Right (%)
 99 267                   168        0.60       1.79       11.90        85.71
 89 336                    75        0.00       0.00       89.33        10.67
176 342                    59        0.00      96.61        3.39         0.00
342   8                    58        1.72      72.41       25.86         0.00
 99 391                    56        0.00       0.00        8.93        91.07
342 347                    56        0.00      89.29       10.71         0.00
  1 342                    48        0.00      89.58       10.42         0.00
123 293                    40        0.00       0.00        0.00       100.00
 59  87                    39        0.00       0.00       79.49        20.51
342  48                    38        2.63      52.63       28.95        15.79
 59 171                    36        0.00       0.00       80.56        19.44
162 249                    34        0.00       0.00       85.29        14.71
211  89                    34        0.00      91.18        8.82         0.00
245 245                    33        0.00      60.61       21.21        18.18
211  59                    31        0.00      90.32        9.68         0.00
 67  65                    27        0.00       0.00       74.07        25.93
130  51                    27        0.00       7.41       70.37        22.22
 67  99                    26        0.00       0.00      100.00         0.00
342 162                    25        4.00      84.00       12.00         0.00
343 123                    25        0.00       0.00      100.00         0.00
Positional Analysis of Frequent Three-sign Combinations
Three-sign Combination   Frequency   Solo (%)   Left (%)   Middle (%)   Right (%)
211  89 336                  34        2.94      88.24        5.88         2.94
343 123 293                  25        0.00       0.00        0.00       100.00
342 162 249                  24        4.17      83.33        8.33         4.17
342 169 249                  20        5.00      70.00       20.00         5.00
342   8 171                  19        5.26      73.68        5.26        15.79
149 130  51                  19        0.00       0.00       78.95        21.05
 59  87  99                  16        0.00       0.00      100.00         0.00
342  87 403                  16        6.25      81.25        6.25         6.25
342 149 130                  16        0.00      75.00       25.00         0.00
 67  99 267                  14        0.00       0.00        7.14        92.86
 87  99 267                  14        0.00       0.00       21.43        78.57
 89 336  72                  14        0.00       0.00       85.71        14.29
 65  99 267                  12        0.00       0.00        8.33        91.67
342 244  67                  12        8.33      66.67        8.33        16.67
 15 389 178                  11        9.09      72.73        0.00        18.18
 59 171  53                  10        0.00       0.00       60.00        40.00
245 245  25                  10       10.00      90.00        0.00         0.00
Positional Analysis of Frequent Four-sign Combinations
Four-sign Combination   Frequency   Solo (%)   Left (%)   Middle (%)   Right (%)
342 149 130  51             16        6.25      68.75        6.25        18.75
 59  87  99 267              9        0.00       0.00       33.33        66.67
 89 336  59 171              6        0.00       0.00       83.33        16.67
 15 389 178  98              5        0.00     100.00        0.00         0.00
342  53 230 175              5       20.00      80.00        0.00         0.00
342 169 249  65              5       20.00      20.00       20.00        40.00
211  89 336  72              5        0.00      80.00        0.00        20.00
Conclusions from Positional analysis
- The most frequent two-sign, three-sign and four-sign combinations appear at fixed positions.
- The exact location varies from combination to combination.
- However, frequently occurring two-sign, three-sign and four-sign combinations may be incomplete, except of course when they occur as solo texts.
- Two-sign, three-sign and four-sign combinations which are complete typically have one of the text-enders (mostly sign 342 or sign 211) at the end. This is confirmed by the solo occurrences of such texts.
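The Solo, Left, Middle and Right percentages in the positional tables can be computed along the following lines. This is a minimal sketch under stated assumptions: texts are stored left to right as they appear on the object, "solo" means the combination is the whole text, and "left" or "right" means it touches that edge of a longer text. The function name and the toy texts are illustrative only.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def classify_positions(
    texts: Sequence[Sequence[int]], combo: Tuple[int, ...]
) -> Dict[str, float]:
    """Percentage of occurrences of combo that are solo, left, middle or right."""
    n = len(combo)
    counts = Counter()
    for text in texts:
        for i in range(len(text) - n + 1):
            if tuple(text[i:i + n]) != combo:
                continue
            if len(text) == n:
                counts["solo"] += 1      # the combination is the entire text
            elif i == 0:
                counts["left"] += 1      # touches the left edge
            elif i + n == len(text):
                counts["right"] += 1     # touches the right edge
            else:
                counts["middle"] += 1    # signs on both sides
    total = sum(counts.values())
    return {
        key: (100.0 * counts[key] / total if total else 0.0)
        for key in ("solo", "left", "middle", "right")
    }


# Made-up texts, one occurrence of (99, 267) in each position.
toy_texts = [(99, 267), (5, 99, 267), (99, 267, 5), (1, 99, 267, 2)]
shares = classify_positions(toy_texts, (99, 267))
print(shares)
```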
8) Text Beginners and Text Enders
Indus Text Beginners and Enders
Figure: Enders and Beginners (EBUDS). Y-axis: Fractional Cumulative Frequency (0.00 to 1.20). X-axis: Number of Signs (0 to 200). Series: Enders and Beginners.
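A curve of this kind can be built by ranking the signs that end (or begin) texts by frequency and accumulating their share. The sketch below assumes each text is stored in reading order, so that the first sign is the beginner and the last sign is the ender; that storage convention, the function name and the toy corpus are assumptions of this example.

```python
from collections import Counter
from itertools import accumulate
from typing import Iterable, List


def fractional_cumulative(signs: Iterable[int]) -> List[float]:
    """Cumulative share of occurrences covered by the k most frequent signs, k = 1, 2, ..."""
    ranked = sorted(Counter(signs).values(), reverse=True)
    total = sum(ranked)
    return [running / total for running in accumulate(ranked)]


# Made-up corpus in reading order: first sign = beginner, last sign = ender.
toy_corpus = [(1, 2, 3), (4, 2, 3), (5, 3), (1, 6, 7), (1, 8, 3)]
ender_curve = fractional_cumulative(text[-1] for text in toy_corpus)
beginner_curve = fractional_cumulative(text[0] for text in toy_corpus)

print("enders   :", ender_curve)
print("beginners:", beginner_curve)
```

A curve that rises steeply means that a few signs account for most text endings, which is the sense in which enders are better defined than beginners.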
Consider an Indus Text with Signs G F E D C B A
Frequent Text Beginners
Frequent Text Enders
(In order of their statistical significance)
Specimens of Indus Texts illustrating syntactical patterns
From Mahadevan (1986)
Conclusions for Indus Script
- There are well-defined text-enders, though text-beginners are not as well defined.
- Sign distribution within the strings seems to be ordered as per some specific rules. The ordering is far stronger than would arise by chance.
- This indicates the existence of patterns and rules that remain to be uncovered.
9) Segmentation of Indus Texts
Segmentation Approach
Several methods can be used for segmenting an Indus text, namely:
- Comparing texts
- Using frequent combinations of signs
- Using Pair Frequencies
- Using Single Signs (Enders, Beginners, Auxiliary Enders)
These methods overlap, so we chose an approach that takes into consideration the effect of each of them. A cumulative method based on statistically significant units was thus formulated.
Segmentation Process
Percent of texts split is given for texts of 5 or more signs.
INDUS TEXT
1. Look for pair, triplet and quad texts successively: 55% split
2. Look for frequent 4, 3 and 2 sign combinations successively: 77% split
3. Look for Enders, Beginners and Auxiliary Enders successively: 88% split
TEXT SEGMENTS
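The idea of cutting a long text into known units can be sketched with a greedy longest-match scan. This is a simplified stand-in and not the three-stage procedure above: it uses a single, hypothetical set of known units (short complete texts and frequent combinations merged together), does not use Enders, Beginners or Auxiliary Enders, and leaves any unmatched sign as a one-sign segment.

```python
from typing import List, Sequence, Set, Tuple


def segment(
    text: Sequence[int], units: Set[Tuple[int, ...]], max_len: int = 4
) -> List[Tuple[int, ...]]:
    """Greedy left-to-right split of text into known units of 2 to max_len signs."""
    segments: List[Tuple[int, ...]] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, down to length 2.
        for n in range(min(max_len, len(text) - i), 1, -1):
            candidate = tuple(text[i:i + n])
            if candidate in units:
                segments.append(candidate)
                i += n
                break
        else:
            # No known unit starts here: keep the sign as its own segment.
            segments.append((text[i],))
            i += 1
    return segments


# Hypothetical unit inventory built from combinations listed in the positional tables.
known_units = {(342, 149, 130, 51), (59, 87), (99, 267)}
long_text = (342, 149, 130, 51, 59, 87, 99, 267)
parts = segment(long_text, known_units)
print(parts)
```

A text counts as split in this sketch when the scan returns more than one segment and every segment has at most four signs.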
Segment Length vs. Segment Frequency in EBUDS before and after segmentation
Figure: Length vs Number of Texts or Segments. Y-axis: Number of Texts or Segments (0 to 1800). X-axis: Text or Segment length (1 to 14). Series: EBUDS before segmentation and EBUDS after segmentation.
EBUDS before and after segmentation
Figure: two pie charts, "EBUDS before Segmentation" and "EBUDS after Segmentation", showing the percentage share of texts or segments of each length (1 to 14).
A Few Examples of Segmentation
Conclusions from segmentation
- It is possible to segment 88% of Indus texts of length 5 and above into segments of length 4 and below by using statistically significant signs and their combinations in addition to all the texts of length 2, 3 and 4.
- Many frequent sign combinations make their appearance as independent texts.
- The Indus texts after segmentation can be viewed as permutations of the identifiable units (segments) of 2, 3 or 4 signs.
- The identifiable units may or may not be standalone (or complete) pieces of information.
10) Summary
Summary
- The writing is highly ordered.
- The typical length of information-containing units is 2, 3 or at most 4 signs.
- However, they are not always complete enough to exist as standalone pieces of text.
- This suggests a more complex grammar in the writing, where information units need proper beginners or enders.
- The present study shows that Indus writing seems to have a specific ordering, as would be expected if sophisticated information is coded. This is consistent with the general level of sophistication associated with the Indus culture.
End