Genre-dependent interaction of coherence
and lexical cohesion in written discourse*
ILDIKÓ BERZLÁNOVICH and GISELA REDEKER
Abstract
We investigate the interaction between coherence and lexical cohesion in
expository and persuasive texts using seven encyclopedia texts and seven fundraising letters. We describe genre structure in terms of genre-specific moves and
coherence structure with Rhetorical Structure Theory. For lexical cohesion, we
identify repetitions, systematic semantic relations and collocations across discourse units, modeled as weighted multigraphs. By comparing the prominence
of discourse units in the coherence structure with the centrality in the lexical
cohesion structure in the two genres, we show that lexical cohesion is closely
aligned with coherence in the expository texts, but not in the persuasive texts.
Keywords: genre, move analysis, coherence, RST, lexical cohesion, encyclopedia texts, fundraising letters
1. Introduction
The research reported in this paper is part of a larger project on textual organization, which involves the construction of an annotated corpus with texts from
various expository and persuasive genres. The focus of the current paper is the
contribution of coherence and lexical cohesion to discourse organization,
which we hypothesize to vary across genres. In particular, we will show that
coherence and lexical cohesion are closely aligned in expository texts, but not
in persuasive texts. This implies that lexical cohesion information provides
valuable cues to discourse organization in expository texts, but should not be
relied on in other (in particular, persuasive) text types.
1.1 Genre structure
The notion of genre can be defined as the pragmatic knowledge shared by the
members of a discourse community about a more or less conventionalized
Corpus Linguistics and Linguistic Theory 8–1 (2012), 183 – 208
DOI 10.1515/cllt-2012-0008
1613-7027/12/0008–0183
© Walter de Gruyter
class of communicative events with common communicative purposes (Swales
1990). This shared knowledge fixes a standard structure with default elements
for the texts of a particular genre (also known as superstructure; van Dijk 1988)
but also introduces expectations about, for instance, subject matter and stylistic
choices. The texts of a given genre vary in the extent to which they conform to
a conventionalized superstructure.
This also holds for the expository and persuasive texts under consideration in
this study. For maximal distinction and comparability, we have chosen typical
expository and persuasive genres with conventionalized discourse structures,
viz., encyclopedia entries and fundraising letters, respectively. With our choice
of encyclopedia entries (EEs), we focus on strongly information-oriented
learned expository texts (Biber 1989). For a strongly persuasive genre, we
decided to analyze fundraising letters (FLs): they are part of promotional
discourse directed at a particular (though not necessarily specified) addressee.
Their strong persuasive force (aiming to convince the reader to financially
support the promoted organization) has been shown in previous studies (e.g.,
Abelen et al. 1993; Bhatia 1998; Upton 2002).
Our hypothesis is based on the comparison of the discourse organization in
expository and persuasive texts. Information-oriented expository texts present
facts. The larger discourse units (DUs) are formed around related concepts,
topics and their subtopics. Each DU of the expository texts aims to elaborate
on the active topic or to move on to a subtopic or a new topic in order to provide new information about the main topic for the reader (Britton 1994). The
reader-related (ideational or semantic) structure thus seems to be dominant in
expository texts. As the linear organization of text reflects this way of presenting information on a specific main topic, we assume that lexical cohesion built
upon semantic relations is closely aligned with this manner of presentation.
While the emphasis is on the content in expository texts, persuasive texts are
built around a central illocutionary force (i.e., persuade) to have an effect on
the reader. Hence, the illocutionary (intentional) structure of discourse dominates over the ideational structure in persuasive texts. The illocutionary structure already offers a clear structure for the text, which leads to a less close
relation between this structure and lexical cohesive resources. In sum, we
hypothesize that lexical cohesion (contributing to the ideational structure of
discourse) plays a pivotal role in the structuring of expository texts, whereas it
is less prominent in the case of persuasive texts.
1.2 Coherence and cohesion
Coherence structure and cohesion structure describe the organization of individual texts. Coherence refers to the underlying semantic and pragmatic relations between text parts which are interpretable against the background of
specific world knowledge. These relations are not necessarily signaled on the
surface of the text. Cohesion refers to the overt semantic relations between
grammatical and lexical items in the text.
1.2.1 Coherence structure
The structure of coherence can be captured by identifying the relations holding
between the DUs that keep the whole text together as one unit. All coherence
approaches classify these relations into semantic and pragmatic relations depending on the source of the relation (Taboada and Mann 2006). Other labels
in terminology referring to the same distinction are: ideational and interpersonal relations (Halliday and Matthiessen 2004; Redeker 2000); subject-matter
and presentational relations (Mann and Thompson 1988). Similarly, Grosz and
Sidner’s (1986) model of discourse structure contains informational and intentional structures.
Semantic relations result from the locutionary meanings of the DUs involved.
See Example (1), where the second DU elaborates on the information in the
first DU (DUs are marked with square brackets in our examples). The first DU
introduces the topic of Pluto’s orbit around the sun, and the second provides
further details.
(1) [Pluto maakt eens in de 248 jaar een omloop rond de zon.] [Het vlak
waarin Pluto rond de zon draait heeft een helling van 17 graden ten
opzichte van het vlak waarin de Aarde rond de zon draait.] (EE06)
[Pluto completes an orbit around the sun once every 248 years.] [The
plane in which Pluto revolves around the sun has an inclination of 17
degrees compared to the plane in which the Earth revolves around the
sun.]
In contrast, pragmatic relations arise from the illocutionary meanings of the
DUs.
(2) [Hiervoor vragen wij dan ook uw steun van slechts €2,50 per maand.]
[In de folder leest u hoe zieke kinderen genieten van een bezoekje van
de CliniClowns.] (FL02)
[For this we thus ask for your support of only €2.50 per month.] [In the
brochure you can read how much sick children enjoy a visit by the
CliniClowns.]
Here the relation holds between two speech acts. Drawing the reader’s attention to the brochure and its content makes the reader more ready to accept the
request for support, as the reading of the brochure justifies the writer’s right to
present the request.
1.2.2 The structure of lexical cohesion
There is no generally accepted method for cohesion analysis. Cohesion studies
vary according to the units of analysis, the selection of the items for analysis,
the classification of cohesive relations and the measuring of cohesive force in
the texts. This terminological, methodological, and conceptual diversity complicates comparisons between cohesion studies and generalizations about
lexical cohesion in texts. In general, three subsystems are investigated in terms of
overall cohesion: relational cohesion (connectives, discourse markers that signal coherence relations between DUs), referential cohesion (anaphoric chains,
spatial and temporal chaining, and ellipsis) and lexical cohesion (semantic
relations between lexical items) (Halliday and Hasan 1976; Halliday and Matthiessen 2004). Here we focus on lexical cohesion, and we propose a model for
the representation of its structure.
The aim of lexical cohesion analysis is to cover all the semantic relations
that hold among lexical items in the text. These relations emerge through preexisting connections in the lexicon. Lexical cohesion thus contributes to the
ideational (informational) structuring of discourse. The main types of lexical
cohesion links are repetition, systematic semantic relations (e.g., hyponymy,
meronymy, synonymy, and antonymy) and collocation. For a discussion of
previous approaches see Tanskanen (2006: Ch. 3).
There are two main views about what structure the semantic relations build
up. It might be interpreted as a chain (i.e., a sequence of semantically related
lexical items following the linearity of the text) or as a net (i.e., a network of
the lexical items with multiple relations among them). The notion of the cohesive
chain was introduced by Hasan (1984). She proposed to consider the interaction between two types of chains: the identity chain based on co-referentiality
and the similarity chain built upon semantic (i.e., lexical cohesive) relations that are not text-bound. Example (1) above contains one similarity chain (Pluto – orbit
– sun – Pluto – revolves – sun – Earth – revolves) and two identity chains
(Pluto – Pluto and sun – sun). The chains interact by sharing the lexical items
Pluto and sun (i.e., these lexical items occur in more than one chain). The
chaining approach is also employed in computational applications (e.g., Morris
and Hirst 1991; Silber and McCoy 2002).
In contrast to the linear, single-linkage model of the chaining approach,
Hoey (1991) argues that multiple relations hold among the lexical items and
build up a lexical network for the text. This network approach identifies all the
semantic relations among the lexical items. In Example (1), the second occurrence of Pluto thus forms direct cohesive relations with the first occurrence of
Pluto, orbit and sun. Graph-based computational methods calculate semantic
similarity with regard to this network of meanings (Erkan and Radev 2004).
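The difference between the two views can be made concrete with a small sketch. It takes the lexical items from the English gloss of Example (1) and counts the links each view would produce. Two assumptions are made purely for illustration: every pair of items is treated as semantically related (they all belong to the single similarity chain listed above), and links are drawn only across discourse units. The names `ITEMS`, `chain_links` and `network_links` are hypothetical and do not come from any tool used in this study.

```python
from itertools import combinations

# Lexical items from the English gloss of Example (1), each paired with
# the number of the discourse unit in which it occurs.
ITEMS = [
    ("Pluto", 1), ("orbit", 1), ("sun", 1),
    ("Pluto", 2), ("revolves", 2), ("sun", 2), ("Earth", 2), ("revolves", 2),
]


def chain_links(items):
    """Chain view: each item is linked only to the item that follows it."""
    return [(i, i + 1) for i in range(len(items) - 1)]


def network_links(items):
    """Network view: link every pair of items from different discourse units.

    Assumption for this sketch: all items are semantically related.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(items)), 2)
        if items[i][1] != items[j][1]
    ]


print(len(chain_links(ITEMS)))    # 7 links in the single chain
print(len(network_links(ITEMS)))  # 15 links in the network
```

In this sketch the second occurrence of Pluto (index 3) takes part in three network links, namely with Pluto, orbit and sun in the first unit, which matches the description given above.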
Modeling lexical cohesion with a graph structure obviously provides a much
richer representation than the lexical cohesive chains model. As we identify all
the semantic relations among the lexical items, each item may be involved in
more than one lexical cohesive relation. Such multiple relations are illustrated
in Figure 1. Note that we do not identify lexical relations within one EDU
(elementary discourse unit, see Section 2.2 below), as clause-internal relations
are governed by grammatical rules. The network of relations can be represented
as a graph with the lexical items as vertices and the lexical cohesive relations
as edges (see Figure 1; the indices in the graph show the number of the EDU
in which an item occurs).

Figure 1. Lexical cohesion analysis and graph structure for an example fragment (translated
from the Dutch original)
The graph structure very intuitively models the centrality of a lexical item
in the text at hand in terms of the number of edges it is involved in. In addition, the edges of this graph can be labeled with the type and strength of the
relations.
The strength of the cohesion relation between token lexical items depends
on their underlying semantic relatedness and on their distance in the text. Distance can be seen as the linear distance in the text or the time between the
occurrences of the two tokens involved. This might be further refined by taking
into account structural characteristics of the intervening material, but the necessity to do so is much less clear for lexical relations than, e.g., for anaphoric
ones.
Measuring the underlying (text-independent) semantic relatedness of two
lexical items is more difficult. With large amounts of suitable reference material, Latent Semantic Analysis (Landauer et al. 1998) could be used to estimate
word similarities. Comparisons across genres using LSA similarities are not
straightforward, however, as genres cover different semantic domains and lexical items used across genres may differ in their co-occurrence behavior and
thus in the estimated similarities.
Another common approach to measuring semantic similarity is the use of
large-scale lexicographic databases like WordNet (Fellbaum 1998), where the
hierarchy of the concepts is organized mainly by is-a and part-of relations.
Semantic similarity can be defined as the path length between the items: the
shorter the path from one node to another, the more closely related they are
semantically. The common problem with this definition is that the vertical
distance between the items cannot be measured simply by the number of the
semantic levels, as these are not equally distant from each other. The WordNet
database, for instance, is structured in a way that vertical distance is larger at
higher levels in the hierarchy. Another problem is that certain categories are
more developed than others. To compensate for this, there have been attempts
to adapt the basic edge counting with information on the depth of words and the
density of the branches in the hierarchy, and by using statistical word association
such as mutual information (for a detailed summary see Budanitsky and Hirst
2006). A further complication is that the databases do not contain all the lexical
items of the analyzed corpora. Finally, a serious drawback for our purposes is
that they are restricted to is-a and part-of relations and thus do not include collocation relations (Stokes 2004).
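The basic edge-counting idea can be illustrated with a minimal sketch. The is-a hierarchy below is a toy example invented for this illustration; it is not WordNet's actual structure, and the names `IS_A` and `path_length` are hypothetical. The sketch counts the edges on the shortest path between two items and returns no value when an item is missing from the hierarchy, which mirrors the coverage problem mentioned above.

```python
from collections import deque

# Toy is-a hierarchy (child -> parent), invented for illustration only.
IS_A = {
    "planet": "celestial body",
    "star": "celestial body",
    "Venus": "planet",
    "Mercury": "planet",
    "sun": "star",
}


def path_length(a, b, is_a=IS_A):
    """Number of edges on the shortest path between a and b, or None."""
    # Build an undirected adjacency map from the child -> parent pairs.
    neighbours = {}
    for child, parent in is_a.items():
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)

    # Breadth-first search from a until b is reached.
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # no path, e.g. because an item is not in the hierarchy


print(path_length("Venus", "Mercury"))  # 2 (co-hyponyms under "planet")
print(path_length("Venus", "sun"))      # 4
print(path_length("Venus", "light"))    # None ("light" is not covered)
```

Plain edge counting of this kind treats every edge as equally long, which is exactly the simplification criticized above.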
Neither the co-occurrence-based measures (i.e., Latent Semantic Analysis),
nor the dictionary-based measures of semantic relatedness thus seem appropriate for our analysis, as our study covers all lexical relations and two different
genres. We therefore refrained from attempting to measure semantic relatedness. We used only textual distance and a simple distinction between repetition
and all other relations in the calculation of weights for lexical cohesive relations (cf. Section 2.5).
2. Method
2.1 Corpus
For this study, we selected 14 texts from our corpus of expository and persuasive texts in Dutch, seven encyclopedia entries (EEs) and seven fundraising
letters (FLs) (see Table 1). The encyclopedia entries were collected from online
encyclopedias on astronomy (http://www.astro.uva.nl/encyclopedie/;
http://www.astronomie.nl/). The fundraising letters were collected from direct-mail
campaigns of charitable organizations. Some of the organizations operate in or
from the Netherlands, and some are international organizations with a local
office in the Netherlands. For example texts for both genres, see EE02 and FL02
in Appendices A and B.
Table 1. Texts selected for the present study

Texts                                          Words   EDUs
Encyclopedia entries
  EE01 De Zon (The sun)                          289     31
  EE02 Mercurius (Mercury)                       288     26
  EE03 Venus                                     326     29
  EE04 Saturnus (Saturn)                         331     33
  EE05 Jupiter                                   473     45
  EE06 Pluto                                     404     36
  EE07 Eris                                      227     23
Fundraising letters
  FL01 Dierenbescherming (Animal Welfare)        223     24
  FL02 CliniClowns                               217     23
  FL03 Rode Kruis (Red Cross)                    304     30
  FL04 Diabetes Fonds (Diabetes Fund)            276     31
  FL05 Oxfam Novib                               278     32
  FL06 Revalidatiefonds (Revalidation Fund)      350     35
  FL07 Vluchtelingenwerk (Refugee Support)       245     29

Before starting our analyses, we removed the illustrations and their captions
for both EEs and FLs (we do keep screenshots and scans for reference). We
also removed the so-called structural segments from the texts. Although these
segments are genre-specific, they are not related to the strategic decisions of
the writer while producing the text. For the analyses of FLs, we did not include
the structural elements Date Line, Address Information, Salutation, Complimentary Close, Signature, Signature Footer and Footnote Information (Upton
2002). When the structural segments were parts of a sentence, we included the
rest of the sentence for the analysis (see Example 3).
(3) [Met vriendelijke groet]_{structural segment} en alvast heel hartelijk dank,
(FL09)
[Kind regards]_{structural segment} and thank you in advance,
2.2 Segmentation
The texts were manually segmented into elementary discourse units (EDUs),
which function as the smallest units for both the coherence analysis and the
lexical cohesion analysis. To allow for semi-automatic parsing, our segmentation is strictly surface-oriented, relying on syntax and on punctuation. EDUs
are simple sentences, finite clauses, and fragments that function as complete
utterances. In particular, the following rules are applied:
– Non-restrictive relative clauses are separate EDUs.
(4) [Op het oppervlak zijn een aantal wolkenbanden te zien] [die rond de
hele planeet lopen.] (EE05)
[On the surface, there are a number of cloud banks] [which go around
the whole planet.]
– Coordinated clauses with nominal predicates are separate EDUs.
(5) [De structuur van de gasatmosfeer van Jupiter is zeer complex] [en ook
zeer veranderlijk.] (EE05)
[The structure of the gas atmosphere of Jupiter is very complex] [and
also very variable.]
– Past participle clauses are separate EDUs.
(6) [Vandaag de dag onderscheiden we zeven ringen,] [aangeduid met de
letters A t/m G.] (EE04)
[These days, we distinguish seven rings,] [indicated with the letters A
through G.]
– If a comparison is expressed with a clause, we take the clause as a separate
EDU.
(7) [De zon is zo dicht bij de aarde] [dat we het oppervlak in detail kunnen
bestuderen.] (EE01)
[The sun is so close to the Earth] [that we can study the surface in
detail.]
– Infinitival purpose clauses are separate EDUs.
(8) [Haar man was naar de stad] [om werk te zoeken.] (FL05)
[Her husband had gone to town] [to look for a job.]
– Finite clauses between hyphens, in parentheses, after colons and semicolons
are separate EDUs.
(9) [Daar knapt ze zichtbaar op;] [ze begint ook weer te praten!] (FL07)
[There she recovers visibly;] [she also begins to speak again!]
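Because the rules above rely on surface cues, the punctuation-based part lends itself to a simple sketch. The snippet below only illustrates the last rule for colons and semicolons; it is not the procedure used in this study (the segmentation was manual), and it does not check whether the material after the punctuation mark is a finite clause, which the rule requires. The function name `split_after_colon_or_semicolon` is hypothetical.

```python
import re


def split_after_colon_or_semicolon(sentence):
    """Split a sentence after each ';' or ':' that is followed by whitespace.

    Simplification: finiteness of the resulting segments is not checked.
    """
    parts = re.split(r"(?<=[;:])\s+", sentence)
    return [part for part in parts if part]


# Example (9) from the text.
segments = split_after_colon_or_semicolon(
    "Daar knapt ze zichtbaar op; ze begint ook weer te praten!"
)
print(segments)
# ['Daar knapt ze zichtbaar op;', 'ze begint ook weer te praten!']
```

A candidate boundary found in this way would still need to be confirmed by an analyst or a syntactic parser before it counts as an EDU boundary.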
2.3 Genre analysis
The genre-specific structure of a text can be described in terms of the moves
employed to reach the text’s global goal (Biber et al. 2007). For conventionalized genres, particular sets of moves can be identified in all or most texts,
yielding a prototypical or canonical move pattern for that genre. Except for
highly standardized genres, this genre-specific pattern allows some variability.
In any particular text, moves may be omitted or realized several times in different parts of texts. The order of moves may also vary (though there is often a
preferred or most common ordering).
For the move analysis of fundraising letters, we follow Upton (2002), who
identified seven canonical moves (Get Attention, Introduce the Cause and/or
Establish Credentials of Organization, Solicit Response, Offer Incentive, Reference Insert, Express Gratitude, and Conclude with Pleasantries). Two of
these moves, Introduce the Cause and/or Establish Credentials of Organization and Solicit Response, are considered to be obligatory, as they are essential to the main function of the letters. In our analysis, we decided to treat
Introduce Cause and Establish Credentials of Organization as two separate
moves. We share Upton’s observation that they are closely related thematically
and usually realized in adjacent discourse units, but we could still always
clearly distinguish them in our analysis. We did not find any Conclude with
Pleasantries moves in our Dutch fundraising letters (this is probably a cultural
difference).
For encyclopedia texts, we are not aware of any studies describing the genre-specific
move structure. We have therefore devised our own labels for the
recurring functional components in our texts: Name, Define, and Describe. The
move Name gives the topic of the EE, in our case the name of an astronomical
object (e.g., planet, star, or moon). Define classifies the topic entity under a
category, mentioning the key features that distinguish it from other members
of the category. There are usually three to six Describe moves in our texts,
addressing various attributes of the topic entity or specific details (e.g., surface,
atmosphere, moons of the planet). Given the communicative function of
encyclopedia texts, the moves Name and Define should be considered to be
obligatory.
To illustrate our analysis, Table 2 summarizes the move structure of EE07.
Name identifies the topic. Define classifies Eris as a dwarf star and uniquely
identifies it among the dwarf stars with the superlative “the largest”. Each
Describe move then discusses Eris from a different perspective (its name,
moon, rotation, and temperature). Note that moves can vary considerably in
length, ranging from a single word (functioning as an independent utterance
and thus as an EDU) to several paragraphs (forming a complex discourse unit).

Table 2. The move structure of EE07

Move       EDU     Text
Name       01      “Eris”
Define     02      “Eris is the largest dwarf star in our solar system.” [translated from Dutch]
Describe   03–11   describes the name of Eris (what previous names it had, and how it got the name Eris)
Describe   12–13   describes the moon of Eris
Describe   14–18   describes how Eris rotates around the Sun
Describe   19–23   describes how temperature changes on Eris

The fact that a move is obligatory does not mean that it has to be identifiable
at the discourse unit level in each text. In Example (10), the Define move is
expressed with a nominal apposition, which is not a separate EDU in our
analysis.
(10) De ring rond Saturnus, de zesde planeet in ons zonnestelsel, werd in
1610 voor het eerst wazig gezien door de Italiaanse
natuurwetenschapper Galileo Galilei. (EE04; italics added)
The ring around Saturn, the sixth planet in our solar system, was first
blurredly seen by the physicist Galileo Galilei in 1610.
The EDU in (10) was thus classified as contributing to a Describe move in
EE04. There are other occasions where an EDU might be seen as contributing
to more than one move. In Example (11), for instance, Reference Insert co-occurs
with Establish Credentials of Organization.
(11) Op de achterkant van deze brief leest u hoe effectief onze hulp in de
praktijk blijkt te zijn. (FL03)
On the back of this letter you can read how effective our help turns out
to be in practice.
In all these cases, we chose the most dominant function of the EDU. The
EDU in (11) was thus labeled Reference Insert, as its main function is to draw
the reader’s attention to material beyond the letter itself (although that material
is described as adding to the organization’s credentials).
2.4 Coherence analysis
We describe the coherence structure of a text in terms of non-binary labeled
trees using Rhetorical Structure Theory (RST; Mann and Thompson 1988),
which has been found useful for the study of coherence in various languages
and genres. RST aims to reconstruct what the writer might plausibly have
intended by producing a certain DU at the text position it occurs in. The analysts’ judgments are expressed in the assignment of RST relations and guided
by operational definitions (available from the RST website at
http://www.sfu.ca/rst/index.html). For creating the annotations, we use O’Donnell’s RSTTool
version 3.45 (available at http://www.wagsoft.com). All analyses are discussed
in detail by the project team, based on independent initial analyses from at least
two analysts. This is an important step in ensuring reliability. Examples showing the top level of the RST analysis will be discussed in Section 3.
2.5 Lexical cohesion analysis
Our analysis of lexical cohesion identified semantic links between lexical
items to describe lexical patterns in the text. The items entering into the lexical
cohesion analysis are content words (nouns, verbs, adjectives, place and time
adverbs, and adverbs of frequency) and proper names. The elements of multiword units (except for proper names) are treated as separate lexical items,
while compounds are taken as indecomposable single units. Table 3 lists the
relations we distinguish.

Table 3. Categories of lexical cohesion

Category                         Example
Repetition
  Full repetition                planet – planet
  Partial repetition             planet – planetary
Systematic semantic relations
  Hyponymy                       sun – star
  Hyperonymy                     gas – hydrogen
  Co-hyponymy                    Venus – Mercury
  Meronymy                       planet – solar system
  Holonymy                       solar system – sun
  Co-meronymy                    Earth – sun
  Synonymy                       life – existence
  Antonymy                       light – heavy
Collocation                      light – star
By repetition we mean word repetition. Full repetition is a relation between
two lexical items that are fully identical in their word form or differ only
in their inflectional suffix. Under the label partial repetition, we categorize
lexical items differing in the derivational suffix. As derivational suffixes often
drastically change the word form and also the meaning of the lexical item,
we decided to keep them separate from full repetition. Systematic semantic
relations cover the traditional lexical semantic relations. Collocation is formed
between two lexical items which tend to occur together, and at the same time
– belonging to the same lexical field – fit into knowledge structures such as
frames or scripts where concepts mutually evoke and depend on each other
(Tanskanen 2006).
We view the structure of lexical cohesion as a weighted multigraph where
the edges are the lexical cohesive relations and the vertices are the lexical items
(see Section 1.2.2). We allow multiple relations among the lexical items, and
we assign weights to each relation. The centrality of a token lexical item is then
determined by the weighted sum of the relations it is involved in. The centrality of an EDU is the sum of these scores over all its content words.
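The two centrality scores can be sketched in a few lines. The relations and weights below are invented for illustration, a token is identified by a (word, EDU number) pair (which assumes a word occurs at most once per EDU), and the function names `token_centrality` and `edu_centrality` are hypothetical. Storing the relations as a list allows several relations between the same two tokens, as in a multigraph.

```python
from collections import defaultdict

# Hypothetical weighted relations between token lexical items.
# Each entry is (token, token, weight); a token is (word, EDU number).
RELATIONS = [
    (("Pluto", 1), ("Pluto", 2), 2.0),
    (("sun", 1), ("Pluto", 2), 1.0),
    (("Pluto", 1), ("planet", 3), 0.5),
    (("Pluto", 2), ("planet", 3), 1.0),
    (("sun", 1), ("planet", 3), 0.5),
]


def token_centrality(relations):
    """Sum the weights of all relations each token is involved in."""
    scores = defaultdict(float)
    for token_a, token_b, weight in relations:
        scores[token_a] += weight
        scores[token_b] += weight
    return dict(scores)


def edu_centrality(relations):
    """Sum the token centrality scores over the tokens of each EDU."""
    totals = defaultdict(float)
    for (_word, edu), score in token_centrality(relations).items():
        totals[edu] += score
    return dict(totals)


print(token_centrality(RELATIONS)[("Pluto", 2)])  # 4.0
print(edu_centrality(RELATIONS))                  # {1: 4.0, 2: 4.0, 3: 2.0}
```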
To determine sensible weights for the relations, we need to measure the
strength of the underlying semantic relation and the textual distance between
the two lexical items. As we have argued above (in Section 1.2.2), lexicographic databases are too incomplete to be useful here. We therefore adopt a
minimal distinction between repetition relations on the one hand and systematic semantic relations and collocations on the other. Repetition relations are
stronger, as they are based not only on the semantic relatedness of the lexical
items, but also on the full or partial identity in the word form of the lexical
items. We assigned the following rather arbitrarily chosen scores: 2 to full repetition relations, 1.5 to partial repetition relations and 1 to all other relations. We calculated the weights for the lexical cohesive relations by dividing the strength
score by the textual distance measured as the number of the EDU boundaries
crossed (Figure 2).

\[
\text{Weight of the relation} = \frac{\text{Strength of the lexical cohesive relation}}{\text{Textual distance}}
\]

Figure 2. The weight assigned to the lexical cohesive relation in the structure of lexical cohesion
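As a minimal sketch, the weight calculation can be written as a single function. The strength scores and the distance measure follow the description above; the relation labels, the names `STRENGTH` and `relation_weight`, and the choice to raise an error for relations within one EDU (which are not identified in the analysis) are assumptions made for this illustration.

```python
# Strength scores as described in the text; the label strings are hypothetical.
STRENGTH = {"full_repetition": 2.0, "partial_repetition": 1.5}
DEFAULT_STRENGTH = 1.0  # systematic semantic relations and collocations


def relation_weight(relation_type, edu_a, edu_b):
    """Strength of the relation divided by the number of EDU boundaries crossed."""
    distance = abs(edu_b - edu_a)
    if distance == 0:
        # Relations within a single EDU are not part of the analysis.
        raise ValueError("both items occur in the same EDU")
    return STRENGTH.get(relation_type, DEFAULT_STRENGTH) / distance


print(relation_weight("full_repetition", 1, 2))     # 2.0
print(relation_weight("partial_repetition", 2, 5))  # 0.5
print(relation_weight("collocation", 6, 2))         # 0.25
```

With these scores, a full repetition three EDUs apart (about 0.67) already weighs less than a collocation between adjacent EDUs (1.0), so distance quickly outweighs the type of relation.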
We annotated the texts for lexical cohesion analysis with the MMAX2 tool
that stores the data in XML files (Müller and Strube 2006), then we calculated
the lexical density scores with XQuery.
3. Results and discussion
To assess the alignment between coherence and lexical cohesion, we compare
the centrality of the DUs. In the tree structure of coherence, the most central
nucleus represents the most central DU. We map the move structure onto the
tree structure to see which move is realized by this central DU. In the structure
of lexical cohesion, we distinguish the weighted relations according to whether they hold
across move boundaries (external relations) or within the DUs realizing the
moves (internal relations). Internal relations show how cohesive the DUs are
internally. To see which move is central in the structure of lexical cohesion, we
examine the external relations. They show to what extent the DUs (and the
moves they realize) are cohesive relative to each other.
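The distinction between internal and external relations can be sketched as a lookup of the move that each EDU belongs to. The move spans below are those of EE07 as given in Table 2; the names `EE07_MOVES`, `move_index` and `relation_scope` are hypothetical and only serve this illustration.

```python
# Move spans of EE07 as (move label, first EDU, last EDU), following Table 2.
EE07_MOVES = [
    ("Name", 1, 1),
    ("Define", 2, 2),
    ("Describe", 3, 11),
    ("Describe", 12, 13),
    ("Describe", 14, 18),
    ("Describe", 19, 23),
]


def move_index(edu, moves=EE07_MOVES):
    """Return the position of the move whose span contains the given EDU."""
    for index, (_label, first, last) in enumerate(moves):
        if first <= edu <= last:
            return index
    raise ValueError(f"EDU {edu} is not covered by any move")


def relation_scope(edu_a, edu_b, moves=EE07_MOVES):
    """Classify a relation as internal (same move) or external (across moves)."""
    same_move = move_index(edu_a, moves) == move_index(edu_b, moves)
    return "internal" if same_move else "external"


print(relation_scope(3, 10))   # internal: both EDUs are in the first Describe move
print(relation_scope(2, 5))    # external: Define versus Describe
print(relation_scope(11, 12))  # external: two different Describe moves
```

Moves are compared by position rather than by label, so a relation between two different Describe moves counts as external.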
3.1 Prominence in the coherence structure
For the investigation of the alignment hypothesis, we first mapped the move
structure onto the RST structure (see Figures 3 and 4 for examples).
For most of the EE texts we found that the Name and Define moves were
realized with the most central nuclei in the RST structure. In EE02, the Define
move is the nucleus for the Describe moves, and all these moves together form
a single Elaboration satellite of Name. The Describe moves vary in number
and in the way they are combined (serially or stacked or a combination of both,
as in EE02). Most Describe moves are Elaboration satellites, but occasionally
they are satellites of other relations (e.g., Non-Volitional Result or Interpretation in EE02). In EE07, two Describe moves are combined by a
causal relation before the combined DU attaches to the Define move.
Figure 3. Move structure and coherence structure for EE02 (top left), EE04 (top right), and EE07 (bottom)
The coherence structure of EE04 is irregular compared to the other texts
(Figure 3). The Describe moves neatly map onto spans in the RST structure
that are combined by a multinuclear Joint relation to form the Elaboration satellite of the span presenting the Name move. Remember that this was the only
text where no Define move could be identified at or above EDU level (see Section 2.3). The centrality of Name, even without Define, is clear in the coherence structure.
In the FLs, the most central nucleus realizes the move Solicit Response formulating the financial request (see Figure 4). All the other moves except the
move Get Attention can attach directly to this central Solicit Response or at
lower levels in the tree structure. Get Attention, which is often realized by a
catch phrase above the body of the letter, if it occurs, tends to attach at a high
level to the whole text with a Preparation relation. Introduce Cause usually
maps onto the ‘problem’ satellite of a Solutionhood relation, the ‘solution’
nucleus of which presents the move Establish Credentials of Organization.
This shows that these two moves are strongly related to each other. The moves
Offer Incentive and Express Gratitude often attach directly to Solicit Response
as the satellites of separate Motivation relations. Similarly, Reference Insert is
a typical Justify satellite of the central nucleus in the coherence structure.
3.2 Centrality in the structure of lexical cohesion
To get a first impression of whether there are differences in the lexical cohesive
structure between EEs and FLs, we looked at the total number of cohesive relations for both genres. As Table 4 shows, there are more than twice as many
relations in the EEs as in the FLs (2524 versus 1116), corresponding to
relative frequencies of 11.32 relations per EDU for the EEs and 5.47 relations
per EDU for the FLs. The lexical cohesion structure thus seems much denser
for the EEs than for the FLs.
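The relative frequencies can be retraced from the figures reported in the tables. The sketch below divides the relation totals from Table 4 by the summed EDU counts from Table 1; the variable and function names are hypothetical.

```python
# EDU counts per text (Table 1) and total numbers of relations (Table 4).
EDUS = {
    "EE": [31, 26, 29, 33, 45, 36, 23],
    "FL": [24, 23, 30, 31, 32, 35, 29],
}
RELATION_TOTALS = {"EE": 2524, "FL": 1116}


def relations_per_edu(genre):
    """Total number of lexical cohesive relations divided by the number of EDUs."""
    return RELATION_TOTALS[genre] / sum(EDUS[genre])


print(sum(EDUS["EE"]), round(relations_per_edu("EE"), 2))  # 223 11.32
print(sum(EDUS["FL"]), round(relations_per_edu("FL"), 2))  # 204 5.47
```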
For the FLs, 56.99% of the lexical cohesive links are collocation relations,
compared to only 39.18% for the EEs. FLs contain text parts with narrative and
procedural discourse types, where lexical items often evoke one or more other
lexical items on the basis of a shared frame that they all fit into. In FL02, for
instance, the lexical items doctor, research, treatment, hospital and sick form
collocation relations in this way.
Another striking difference between the two genres is the percentage of systematic
semantic relations, which make up almost half (47.82%) of the lexical
cohesive relations in the EEs, whereas they have the smallest share (19.62%)
in the FLs. The relative frequency (per EDU) is more than five times higher in
EE than in FL (5.41 versus 1.07). Closer inspection of the types of systematic
semantic relations involved (see Table 5) reveals that this difference is mainly
due to the very high numbers of meronymic relations in EE texts. This is not
surprising, as EEs define and describe entities. In the Define move, for instance,
the topic entity is categorized, giving rise to a holonym or a hyperonym, and
then distinguished from other instances of the same category, i.e., co-meronyms
or co-hyponyms of the topic with a meronymic or hyponymic relation to their
shared holonymic or hyperonymic concept.

Figure 4. Move structure and coherence structure for FL01 (top) and FL02 (bottom)

Table 4. Lexical cohesive relations in EEs and FLs

| Type of lexical cohesion | EEs: Count | EEs: Percentage | EEs: per EDU | FLs: Count | FLs: Percentage | FLs: per EDU |
|---|---|---|---|---|---|---|
| Repetition | 328 | 12.99 | 1.47 | 261 | 23.39 | 1.28 |
| Systematic semantic relations | 1207 | 47.82 | 5.41 | 219 | 19.62 | 1.07 |
| Collocation | 989 | 39.18 | 4.43 | 636 | 56.99 | 3.12 |
| Total | 2524 | 100 | 11.32 | 1116 | 100 | 5.47 |

Table 5. The systematic semantic relations in EEs and FLs

| Systematic semantic relations | EEs: Count | EEs: Percentage* | FLs: Count | FLs: Percentage* |
|---|---|---|---|---|
| Hyponymy | 145 | 5.74 | 31 | 2.77 |
| Hyperonymy | 101 | 4.00 | 45 | 4.03 |
| Co-hyponymy | 128 | 5.07 | 35 | 3.13 |
| Meronymy | 204 | 8.08 | 19 | 1.70 |
| Holonymy | 182 | 7.21 | 30 | 2.68 |
| Co-meronymy | 351 | 13.91 | 17 | 1.52 |
| Synonymy | 58 | 2.30 | 29 | 2.59 |
| Antonymy | 38 | 1.51 | 13 | 1.16 |
| Total systematic semantic relations | 1207 | 47.82 | 219 | 19.62 |

* The percentages are calculated with respect to the total number of all cohesion relations.

As described in Section 2.5, we computed weights for the lexical cohesive
relations and summed them over all links and all items in an EDU as a measure
of the EDU’s centrality in the cohesion structure. Combining this with the
move analysis, we distinguished internal links (to other EDUs in the same
move) and external links (across move boundaries). The resulting absolute
cohesion densities (sums) are shown in Figures 5 and 6 for EE02 and FL02,
respectively.
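The summation described above can be sketched as follows. This is a minimal illustration only: the weights per relation type are hypothetical placeholders (the actual weighting scheme is defined in Section 2.5 and is not reproduced here), the toy links are invented, and counting each link for both EDUs it connects is an assumption.

```python
from collections import defaultdict

# HYPOTHETICAL link weights per relation type; the paper's actual weights
# are defined in its Section 2.5 and may differ.
WEIGHTS = {"repetition": 1.0, "systematic": 0.75, "collocation": 0.5}


def edu_centrality(links, move_of):
    """Sum link weights per EDU, split into internal and external links.

    links   -- iterable of (edu_a, edu_b, relation_type); parallel links
               between the same pair of EDUs are allowed (multigraph).
    move_of -- dict mapping each EDU number to its move label.
    """
    scores = defaultdict(lambda: {"internal": 0.0, "external": 0.0})
    for a, b, rel in links:
        kind = "internal" if move_of[a] == move_of[b] else "external"
        for edu in (a, b):  # assumption: a link counts for both EDUs it connects
            scores[edu][kind] += WEIGHTS[rel]
    return dict(scores)


# Toy input, invented for illustration (not taken from the corpus).
move_of = {1: "Name", 2: "Define", 3: "Define", 4: "Describe 1"}
links = [
    (1, 2, "repetition"),
    (2, 3, "systematic"),
    (2, 4, "collocation"),
    (2, 4, "repetition"),  # second, parallel link between EDUs 2 and 4
]
scores = edu_centrality(links, move_of)
print(scores[2])
```

Keeping parallel links as separate entries mirrors the multigraph view of lexical cohesion, in which two discourse units can be connected by several relations at once.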
Figure 5. Moves and lexical cohesion for the EDUs in EE02
Figure 6. Moves and lexical cohesion for the EDUs in FL02

Figure 5 illustrates for EE02 that the EDUs in the initial parts of EEs play an
important role in lexical cohesion. They score high on external and internal
relations. These EDUs realize Define and the first Describe move. For FL02 in
Figure 6, lexical cohesion seems dominant in the move Establish Credentials
of Organization. The external relations (involving two entities from different
moves) measure how central the DUs, and by inference the moves they belong
to, are in the text. In EE02, the Define move and the first Describe move have
the highest external relation scores, making these the most central moves in
the cohesion structure. In FL02, the differences are smaller and more diffuse.
The move Establish Credentials of Organization contains many fairly
high-scoring EDUs and may thus be said to be (slightly) more central than, e.g., the
Solicit Response moves.

Table 6. Cohesion densities in EE02 and FL02

| Genre-specific moves | EDUs | Number of lexical items | Sum of weighted external lexical cohesive relations | Cohesion density (external links per lexical item) |
|---|---|---|---|---|
| EE02: Mercurius (Mercury) | | | | |
| Name | 1 | 1 | 5.00 | 5.00 |
| Define | 2–6 | 21 | 24.88 | 1.18 |
| Describe 1 | 7–13 | 35 | 27.26 | 0.78 |
| Describe 2 | 14–24 | 53 | 15.98 | 0.30 |
| Describe 3 | 25–26 | 7 | 4.12 | 0.59 |
| FL02: CliniClowns | | | | |
| Offer incentive 1 | 1 | 7 | 3.89 | 0.56 |
| Get attention | 2–7 | 7 | 5.78 | 0.83 |
| Credentials of org. | 8–11 | 21 | 12.24 | 0.58 |
| Introduce cause | 12–16 | 16 | 10.33 | 0.65 |
| Solicit response 1 | 17 | 3 | 0.91 | 0.30 |
| Reference insert | 18 | 7 | 4.85 | 0.69 |
| Solicit response 2 | 19–20 | 6 | 1.96 | 0.33 |
| Express gratitude | 21 | 6 | 6.43 | 1.07 |
| Offer incentive 2 | 22–23 | 8 | 4.75 | 0.59 |

For comparisons across texts, we need to aggregate the scores at the level of
moves. We start by computing the sums of the weighted external relations
across the EDUs of each move, which reflects that move’s connectedness with
the rest of the text. Note however that this measure is directly dependent on the
length of the DU that realizes the move, more precisely, the number of lexical
items involved. We therefore calculate a standardized cohesion density score
for a move by dividing the sum of the weighted external relations by the number of the lexical items in the move. Table 6 presents the results for EE02 and
FL02.
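The standardization step amounts to a single division per move. As a small worked example, the sketch below recomputes the density column for EE02 from the item counts and external sums given in Table 6; the function and variable names are illustrative only.

```python
# (move, number of lexical items, sum of weighted external relations),
# copied from the EE02 rows of Table 6.
EE02 = [
    ("Name", 1, 5.00),
    ("Define", 21, 24.88),
    ("Describe 1", 35, 27.26),
    ("Describe 2", 53, 15.98),
    ("Describe 3", 7, 4.12),
]


def cohesion_density(external_sum, n_items):
    """Weighted external relations per lexical item in a move."""
    if n_items == 0:
        raise ValueError("a move needs at least one lexical item")
    return external_sum / n_items


densities = {move: cohesion_density(s, n) for move, n, s in EE02}
for move, d in densities.items():
    print(f"{move:11s} {d:.2f}")
```

The normalization changes the ranking: Describe 1 has the largest absolute sum (27.26), but once length is factored out, Name and Define come out on top.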
In EE02, the sum of the weighted external cohesive relations is the highest
for Define and for Describe 1, while the standardized cohesion density scores
are highest for the obligatory moves Name and Define. In FL02, Introduce
Cause and Establish Credentials of Organization have the highest absolute
scores, but Get Attention and Reference Insert score higher on the standardized
cohesion density. Note that Solicit Response, the most prominent move in the
coherence structure, scores the lowest both for the absolute and the standardized cohesion density scores in this text.
Table 7. Move structure and lexical cohesion density for the EEs and the FLs

Cohesion density, EE texts:

| Moves | EE01 | EE02 | EE03 | EE06 | EE07 |
|---|---|---|---|---|---|
| Name | 3.00 | 5.00 | 4.00 | 3.00 | 2.00 |
| Define | 1.05 | 1.18 | 1.68 | 0.88 | 1.49 |
| Describe 1 | 0.88 | 0.78 | 3.35 | 1.58 | 0.27 |
| Describe 2 | 0.41 | 0.30 | 0.72 | 1.19 | 1.47 |
| Describe 3 | 0.08 | 0.59 | 0.64 | 0.42 | 0.73 |
| Describe 4 | | | 0.65 | | 0.54 |
| Describe 5 | | | 0.16 | | |
| Describe 6 | | | 0.26 | | |

EE04 and EE05: Name 5.00 and 4.00; other moves (cell assignment unclear): 0.58, 0.46, 0.79, 0.44, 0.13

Cohesion density, FL texts (FL01 to FL07; values grouped by move, cell assignment unclear):

Get attention, Introduce cause 1–3: 0.08, 0.28, 0.83, 0.65, 1.22, 0.34, 1.06, 0.34, 0.47, 0.61, 0.13, 0.32, 1.12, 1.03, 0.18, 0.68
Credentials of org. 1–3: 0.11, 0.58, 0.40, 1.45, 0.29, 0.52, 0.54, 0.44, 0.25, 0.28, 1.70, 0.20
Solicit response 1–4: 0.14, 0.30, 0.33, 0.98, 0.04, 0.80, 0.92, 0.11, 0.65, 0.73, 0.13, 0.00, 0.80, 0.89, 0.93
Offer incentives 1–2: 0.56, 0.59, 1.37, 0.83, 1.03, 0.49, 0.91, 0.85, 0.62
Reference insert 1–2, Express gratitude 1–2: 0.69, 0.61, 0.71, 0.59, 1.07, 1.16, 0.00, 1.10, 0.31, 0.00, 2.46, 0.50, 0.00; 0.48, 0.35, 0.76, 0.77, 0.90, 0.85

The standardized cohesion density scores for all fourteen texts in this study
are presented in Table 7. In the EE texts, the most prominent move is Name. It
has by far the highest density score in each of these texts. The other obligatory
move, Define, has the highest score in EE03 and the second highest in EE07.
In EE03 and EE05, one of the Describe moves scores higher than Define. In
EE06, Define is not central in the lexical cohesion structure, as two out of the
three Describe moves show higher cohesive density. This Define move is atypical: Instead of giving a categorization of the topic entity and its distinguishing
features, Define in EE06 discusses the difficulties of classifying Pluto (describing how and why its categorization was reconsidered and changed from planet
to dwarf planet).
For the FL texts, Table 7 shows no systematic association of higher cohesion
densities with particular moves. There is much variation across the texts for
each move type and within the texts across multiple instantiations of the moves.
3.3 The interaction of coherence, lexical cohesion, and genre
Our results so far have shown genre differences in coherence and in lexical
cohesion and indications that coherence structure and cohesion-based structure
are closely aligned in the EE texts, but not in the FL texts. In order to test this
interaction directly, we calculated standardized cohesion density scores for the
obligatory moves and for the optional moves of each genre, that is, we divided
the sum of all weighted external lexical cohesive relations across all obligatory/
optional moves by the total number of lexical items in that set of moves. For
EE, the obligatory moves are Name and Define; for FL, they are Introduce
Cause, Establish Credentials of Organization, and Solicit Response. All other moves are optional (see Section 2.3).
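To make the pooling concrete, the sketch below applies it to the two texts for which Table 6 gives item counts and external sums. It illustrates the calculation for single texts only; the names `OBLIGATORY` and `pooled_density` are hypothetical, and the group means reported for the full corpus cannot be reproduced from these two texts alone.

```python
# Obligatory move types per genre, as listed in the text; all others are optional.
OBLIGATORY = {
    "EE": {"Name", "Define"},
    "FL": {"Introduce cause", "Credentials of org.", "Solicit response"},
}

# (move type, number of lexical items, sum of weighted external relations),
# one tuple per move instance, copied from Table 6.
EE02 = [
    ("Name", 1, 5.00), ("Define", 21, 24.88), ("Describe", 35, 27.26),
    ("Describe", 53, 15.98), ("Describe", 7, 4.12),
]
FL02 = [
    ("Offer incentive", 7, 3.89), ("Get attention", 7, 5.78),
    ("Credentials of org.", 21, 12.24), ("Introduce cause", 16, 10.33),
    ("Solicit response", 3, 0.91), ("Reference insert", 7, 4.85),
    ("Solicit response", 6, 1.96), ("Express gratitude", 6, 6.43),
    ("Offer incentive", 8, 4.75),
]


def pooled_density(moves, obligatory):
    """Pool external sums and item counts per prominence class, then divide."""
    totals = {"obligatory": [0.0, 0], "optional": [0.0, 0]}
    for move, n_items, external_sum in moves:
        key = "obligatory" if move in obligatory else "optional"
        totals[key][0] += external_sum
        totals[key][1] += n_items
    return {key: s / n for key, (s, n) in totals.items() if n > 0}


ee = pooled_density(EE02, OBLIGATORY["EE"])
fl = pooled_density(FL02, OBLIGATORY["FL"])
print({k: round(v, 2) for k, v in ee.items()})
print({k: round(v, 2) for k, v in fl.items()})
```

For these two texts the pooled scores are roughly 1.36 (obligatory) versus 0.50 (optional) in EE02 and 0.55 versus 0.73 in FL02, so each text already shows the direction of difference found for its genre.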
The results are shown in Figure 7. The standardized cohesion density for the
EEs is indeed much higher for the obligatory than for the optional moves. For
the FLs, there is a smaller difference in the opposite direction.
Figure 7. Cohesion density for obligatory and optional moves in the two genres

A repeated measures ANOVA yielded a significant effect of Genre, with
average cohesion densities of 1.2 for EE and 0.5 for FL (F(1, 12) = 4.9,
p = .048, η² = .29), and a significant interaction of Prominence and Genre
(F(1, 12) = 7.6, p = .017, η² = .39). The main effect of Prominence fails to
reach significance. Even the large mean difference between obligatory and
optional moves in EE is not quite significant (1.8 versus 0.6; t(6) = 2.2, p = .067).
The unexpected reverse difference in FL, by contrast, is highly significant (0.4
versus 0.7; t(6) = 3.51, p = .013). We assume that this surprising result might
be caused by the often long and thus deeply structured (obligatory) Introduce
Cause and Establish Credentials of Organization moves, which contain many
EDUs that are low in the RST tree and possibly low in cohesion density. Contrasting only the most central move in FL, Solicit Response, with all other
moves, does indeed yield a much smaller and no longer significant difference
(0.5 versus 0.6).
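The reported effect sizes can be cross-checked against the F statistics. The sketch below assumes that the reported values are partial eta squared, which the text does not state explicitly. For an effect with the given degrees of freedom, partial eta squared equals F·df_effect / (F·df_effect + df_error).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the two significant effects.
genre = partial_eta_squared(4.9, 1, 12)
interaction = partial_eta_squared(7.6, 1, 12)
print(f"Genre: {genre:.2f}, Prominence x Genre: {interaction:.2f}")
```

Under that assumption, both values round to the reported .29 and .39.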
4. Conclusions
The analyses reported in this paper have shown that the contribution of lexical
cohesion to discourse organization varies with genre. In the encyclopedia
entries in this study, the most prominent discourse units (realizing obligatory
moves) have many lexical cohesive links to discourse units realizing other
moves, while less prominent units have a much lower density of such external
cohesion relations. In the fundraising letters, by contrast, the prominent moves
show extremely low cohesion density scores, as low as or lower than the
optional moves. Coherence and lexical cohesion are thus found to be closely
aligned in expository texts, but not in persuasive texts.
These results have implications for the use of lexical cohesion information
as cues to discourse organization. Lexical cohesion has been used in computational linguistics in various applications for segmenting texts (e.g., Stokes
2004) and for summarization (e.g., Silber and McCoy 2002; Erkan and Radev
2004). Our results suggest that these applications are unlikely to be successful
for persuasive texts (e.g., reviews and opinion blogs). For expository
texts, our analyses may suggest directions for refining the use of lexical cohesive information. In particular, the advantages of a network model (allowing
multiple relations between lexical items) over the chaining model should be
further explored.
Further research on the alignment between coherence structure and lexical
cohesion structure should not only involve more texts and more genres (to
become available as our project progresses), but should also go beyond the
binary distinction of obligatory versus optional moves. For a more complete
comparison of prominence in the coherence structure with centrality in the
lexical cohesion structure, we will use the hierarchical information in the RST
trees and derive a hierarchical clustering of discourse units from the lexical
cohesion information. The distance between the two trees will then serve as
a measure of (mis)alignment between those two components of discourse
organization.
Appendix A: Segmented text of EE02 Mercurius (Mercury)
(original text available at http://www.astronomie.nl/encyclopedie/90/mercurius.html)

| EDU | Dutch Text | English Translation |
|---|---|---|
| 1 | Mercurius | Mercury |
| 2 | Mercurius is de binnenste planeet in het zonnestelsel. | Mercury is the innermost planet in the solar system. |
| 3 | Haar gemiddelde afstand tot de zon is slechts 58 miljoen kilometer. | Its average distance to the sun is only 58 million kilometers. |
| 4 | De planeet draait in 58.6 dagen om haar as | The planet revolves around its axis in 58.6 days |
| 5 | en in 88.0 dagen rond de zon. | and around the sun in 88.0 days. |
| 6 | Met andere woorden, ze roteert precies drie maal rond haar as in twee Mercurius jaren. | In other words, it rotates exactly three times around its axis in two Mercurian years. |
| 7 | Gezien de nabijheid tot de zon kan de temperatuur op Mercurius hoog oplopen, tot wel 465 graden Celsius. | Given the closeness to the sun, the temperature on Mercury can rise high, to as much as 465 degrees Celsius. |
| 8 | Dit is, samen met de temperatuur op Venus, de hoogste oppervlaktetemperatuur in ons zonnestelsel. | Together with the temperature on Venus, this is the highest surface temperature in our solar system. |
| 9 | Echter gedurende de nacht, [... 10 ...] daalt de temperatuur tot zo’n −185 graden Celsius, | However, during the night, [... 10 ...] the temperature drops to about −185 Celsius degrees, |
| 10 | die op Mercurius aardse maanden lang kan duren, | that can last for Earth months on Mercury, |
| 11 | wat weer tot de laagste in ons zonnestelsel mag worden gerekend. | which again can be considered to be among the lowest in our solar system. |
| 12 | In kraters nabij de polen van Mercurius, [... 13 ...] bestaat misschien zelfs ijs. | In the craters near the poles of Mercury, [... 13 ...] there may even be ice. |
| 13 | waar nooit zonlicht komt, | where sunlight never comes, |
| 14 | Op het eerste gezicht lijkt het oppervlak van Mercurius erg veel op dat van de Maan. | At first sight, Mercury’s surface looks a lot like the Moon’s. |
| 15 | Er zijn grofweg twee typen landschap: hoogland en laagland. | There are roughly two types of landscape: highland and lowland. |
| 16 | In vergelijking met de hooglanden op de Maan zijn er relatief minder inslagkraters in het hoogland gebied van Mercurius. | In comparison with the highlands on the Moon, there are relatively fewer impact craters in the highland territory of Mercury. |
| 17 | Mogelijk komt dit doordat in de vroege geschiedenis het oppervlak eens vloeibaar is geweest, | Probably this is caused by the fact that the surface was liquid once in the early history, |
| 18 | waardoor veel van de toen al aanwezige kraters zijn weggevaagd. | causing many of the craters that were already present at that time to be swept away. |
| 19 | Het tweede type terrein, het laagland, telt relatief nog minder kraters dan het hoogland, | On the second type of territory, the lowland, there are even fewer craters than on the highland, |
| 20 | wat aangeeft dat het later gevormd moet zijn. | which indicates that it must have been formed later. |
| 21 | Ongeveer 3.85 miljoen jaar geleden werd Mercurius geraakt door een object dat ongeveer 150 km in doorsnede moet zijn geweest. | About 3.85 million years ago, Mercury was hit by an object that must have had a diameter of about 150 km. |
| 22 | De inslag, op het noordelijke halfrond, liet een krater achter met een diameter van 1340 km. | The impact on the northern hemisphere has left behind a crater with a diameter of 1340 km. |
| 23 | Deze wordt Caloris basin genoemd, | This is called the Caloris Basin, |
| 24 | en is de op één na grootste inslagkrater in ons zonnestelsel. | and it is the second biggest impact crater in our solar system. |
| 25 | De omstandigheden op Mercurius zijn zo extreem, | The circumstances on Mercury are so extreme, |
| 26 | dat op deze planeet hoogstwaarschijnlijk nooit leven heeft kunnen ontstaan. | that most likely no life has ever been able to develop on this planet. |

Appendix B: Segmented text of FL02 CliniClowns

| EDU | Dutch Text | English Translation |
|---|---|---|
| 1 | Voor maar €2,50 per maand helpt u mee zieke kinderen hun tranen even te laten vergeten. | For only €2.50 per month you contribute to helping sick children forget their tears for a moment. |
| 2 | Even geen dokters, | For a moment no doctors, |
| 3 | even geen onderzoek, | for a moment no examinations, |
| 4 | even geen behandeling | for a moment no treatment, |
| 5 | en even geen teleurstelling. | and for a moment no disappointment. |
| 6 | Kortom: even geen ziekenhuis | In short: for a moment no hospital |
| 7 | en even geen tranen. | and for a moment no tears. |
| 8 | Want mede dankzij dit kleine bedrag komen de CliniClowns op bezoek bij de vele zieke kinderen in de ziekenhuizen. | Because thanks in part to this small amount, the CliniClowns come to visit the many sick children in the hospitals. |
| 9 | Door de clowneske afleiding van de CliniClowns vergeten de kinderen even al het andere om zich heen. | Through the CliniClowns’ clownish distraction the children forget everything else around them for a moment. |
| 10 | De clowns laten de kinderen zelf het spel bepalen, | The clowns let the children decide on the game themselves, |
| 11 | zodat ze weer even onbezorgd kind kunnen zijn. | so that they can be carefree children again for a moment. |
| 12 | CliniClowns komen daar waar kinderen het moeilijk hebben. | CliniClowns go where children are having a hard time. |
| 13 | Maar helaas nog niet bij al deze kinderen. | But unfortunately, not yet to all these children. |
| 14 | Gewoon omdat we daarvoor niet genoeg geld hebben. | Simply because we do not have enough money for that. |
| 15 | Er is een behoefte aan nieuwe, goed opgeleide clowns, | There is a need for new, well-trained clowns, |
| 16 | zodat we nog meer kinderen even hun tranen kunnen laten vergeten. | so that we can let even more children forget their tears for a moment. |
| 17 | Hiervoor vragen wij dan ook uw steun van slechts €2,50 per maand. | For this we thus ask for your support of only €2.50 per month. |
| 18 | In de folder leest u hoe zieke kinderen genieten van een bezoekje van de CliniClowns. | In the brochure you can read how much sick children enjoy a visit by the CliniClowns. |
| 19 | Vul vandaag nog de acceptgiro in, | Fill in the payment form today, |
| 20 | en steun het werk van CliniClowns. | and support the work of CliniClowns. |
| 21 | Wij danken u hartelijk namens alle zieke kinderen die door u hun tranen even kunnen vergeten! | We thank you very much in the name of all sick children who because of you can for a moment forget their tears! |
| 22 | P.S. Voor slechts €2,50 per maand maakt u het verschil. | P.S. For only €2.50 per month you make the difference. |
| 23 | Zo zorgt u ervoor dat zieke kinderen hun tranen even kunnen vergeten. | With this, you let sick children forget their tears for a moment. |

Bionotes
Ildikó Berzlánovich is currently a PhD student at the University of Groningen,
The Netherlands. Her research focuses on discourse structure (genre, coherence,
and cohesion) in written discourse. Email: [email protected]
Gisela Redeker is professor of communication studies at the University of
Groningen, The Netherlands. Her main research area is corpus-based discourse
analysis (Rhetorical Structure Theory, Appraisal analysis, discourse markers).
She is currently leading the NWO Program Modelling Textual Organisation
(www.let.rug.nl/mto). Email: [email protected]
Note
* This research is supported by grant 360-70-282 of the Netherlands Organization for Scientific
Research (NWO) and is part of the NWO-funded program Modelling discourse organisation
(http://www.let.rug.nl/mto/). We wish to thank Gosse Bouma for his technical advice and
assistance with the XML (MMAX) tool for the lexical cohesion annotation and two anonymous
reviewers for their insightful comments on an earlier version of this paper.
References
Abelen, Eric, Gisela Redeker & Sandra A. Thompson. 1993. The rhetorical structure of US-American and Dutch fund-raising letters. Text 13(3). 323–350.
Bhatia, Vijay Kumar. 1998. Generic patterns in fundraising discourse. New Directions for Philanthropic Fundraising 22. 95–110.
Biber, Douglas. 1989. A typology of English texts. Linguistics 27. 3–43.
Biber, Douglas, Ulla Connor & Thomas A. Upton (eds.). 2007. Discourse on the move. Using corpus analysis to describe discourse structure. Amsterdam: Benjamins.
Britton, K. Bruce. 1994. Understanding expository text. Building mental structures to induce insights. In Morton Ann Gernsbacher (ed.), Handbook of psycholinguistics, 641–674. San Diego: Academic Press.
Budanitsky, Alexander & Graeme Hirst. 2006. Evaluating WordNet-based measures of lexical semantic relatedness. Computational Linguistics 32(1). 13–47.
Dijk, Teun van. 1988. News as discourse. Hillsdale, NJ: Erlbaum.
Erkan, Güneş & Dragomir R. Radev. 2004. LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research 22. 457–479.
Fellbaum, Christiane (ed.). 1998. WordNet. An electronic lexical database. Cambridge: MIT Press.
Grosz, Barbara J. & Candace L. Sidner. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics 12(3). 175–204.
Halliday, M. A. K. & Ruqaiya Hasan. 1976. Cohesion in English. London: Longman.
Halliday, M. A. K. & Christian M. I. M. Matthiessen. 2004. An introduction to functional grammar, 3rd edn. London: Arnold.
Hasan, Ruqaiya. 1984. Coherence and cohesive harmony. In James Flood (ed.), Understanding reading comprehension: Cognition, language and the structure of prose, 181–219. Newark, DE: International Reading Association.
Hoey, Michael. 1991. Patterns of lexis in text. Oxford: Oxford University Press.
Landauer, Thomas, Peter W. Foltz & Darrell Laham. 1998. An introduction to latent semantic analysis. Discourse Processes 25(2–3). 259–284.
Mann, William C. & Sandra A. Thompson. 1988. Rhetorical Structure Theory: Toward a functional theory of text organization. Text 8(3). 243–281.
Morris, Jane & Graeme Hirst. 1991. Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Computational Linguistics 17(1). 21–48.
Müller, Christoph & Michael Strube. 2006. Multi-level annotation of linguistic data with MMAX2. In Sabine Braun, Kurt Kohn & Joybrato Mukherjee (eds.), Corpus technology and language pedagogy. New resources, new tools, new methods, 197–214. Frankfurt: Peter Lang.
Redeker, Gisela. 2000. Coherence and structure in text and discourse. In Harry Bunt & William Black (eds.), Abduction, belief and context in dialogue. Studies in computational pragmatics, 233–263. Amsterdam: Benjamins.
Silber, H. Gregory & Kathleen F. McCoy. 2002. Efficiently computed lexical chains as an intermediate representation for automatic text summarization. Computational Linguistics 28(4). 487–496.
Stokes, Nicola. 2004. Applications of lexical cohesion analysis in the topic detection and tracking domain. Dublin: University College Dublin dissertation.
Swales, John. 1990. Genre analysis. English in academic and research settings. Cambridge: Cambridge University Press.
Taboada, Maite & William C. Mann. 2006. Rhetorical Structure Theory: looking back and moving ahead. Discourse Studies 8. 423–459.
Tanskanen, Sanna-Kaisa. 2006. Collaborating towards coherence: Lexical cohesion in English discourse. Amsterdam: Benjamins.
Upton, Thomas A. 2002. Understanding direct mail letters as a genre. International Journal of Corpus Linguistics 7(1). 65–85.