Synchronous Context-Free Grammars
and Optimal Parsing Strategies
Daniel Gildea∗
University of Rochester
Giorgio Satta∗∗
Universit`a di Padova
The complexity of parsing with synchronous context-free grammars is polynomial in the sentence
length for a fixed grammar, but the degree of the polynomial depends on the grammar. Specifi-
cally, the degree depends on the length of rules, the permutations represented by the rules, y
the parsing strategy adopted to decompose the recognition of a rule into smaller steps. We address
the problem of finding the best parsing strategy for a rule, in terms of space and time complexity.
We show that it is NP-hard to find the binary strategy with the lowest space complexity. Nosotros también
show that any algorithm for finding the strategy with the lowest time complexity would imply
improved approximation algorithms for finding the treewidth of general graphs.
1. Introducción
Synchronous context-free grammars (SCFGs) generalize context-free grammars (CFGs)
to generate two strings simultaneously. The formalism dates from the early days of
automata theory; it was developed under the name syntax-direct translation sche-
mata to model compilers for programming languages (Lewis and Stearns 1968; Aho
and Ullman 1969). SCFGs are widely used today to model the patterns of re-ordering
between natural languages, and they form the basis of many state-of-the-art statistical
machine translation systems (Chiang 2007).
Despite the fact that SCFGs are a very natural extension of CFGs, and that the
parsing problem for CFGs is rather well understood nowadays, our knowledge of the
parsing problem for SCFGs is quite limited, with many questions still left unanswered.
In this article we tackle one of these open problems.
Unlike CFGs, SCFGs do not admit any canonical form in which rules are bounded
in length (Aho and Ullman 1972), as for instance in the well-known Chomsky normal
form for CFGs. A consequence of this fact is that the computational complexity of
parsing with SCFG depends on the grammar. Más precisamente, for a fixed SCFG, analizando
is polynomial in the length of the string. The degree of the polynomial depends on the
∗ Computer Science Department, University of Rochester, Rochester NY 14627.
Correo electrónico: gildea@cs.rochester.edu.
∗∗ Dipartimento di Ingegneria dell’Informazione, Universit`a di Padova, Via Gradenigo 6/A, 35131 Padova,
Italia. Correo electrónico: satta@dei.unipd.it.
Envío recibido: 28 Julio 2015; versión revisada recibida: 4 Febrero 2016; accepted for publication:
8 Febrero 2016.
doi:10.1162/COLI a 00246
© 2016 Asociación de Lingüística Computacional
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
length of the grammar’s rules, the re-ordering patterns represented by these rules, como
well as the strategy used to parse each rule. The complexity of finding the best parsing
strategy for a fixed SCFG rule has remained open. This article shows that it is NP-hard to
find the binary parsing strategy having the lowest space complexity. We also consider
the problem of finding the parsing strategy with the lowest time complexity, and we
show that it would require progress on long-standing open problems in graph theory
either to find a polynomial algorithm or to show NP-hardness.
The parsing complexity of SCFG rules increases with the increase of the number of
nonterminals in the rule itself. Practical machine translation systems usually confine
themselves to binary rules, eso es, rules having no more than two right-hand side
nonterminals, because of the complexity issues and because binary rules seem to be
adequate empirically (Huang et al. 2009). Longer rules are of theoretical interest because
of the naturalness and generality of the SCFG formalism. Longer rules may also be of
practical interest as machine translation systems improve.
For a fixed SCFG, complexity can be reduced by factoring the parsing of a grammar
rule into a sequence of smaller steps, which we refer to as a parsing strategy. Each step
of a parsing strategy collects nonterminals from the right-hand side of an SCFG rule into
a subset, indicating that a portion of the SCFG rule has been matched to a subsequence
of the two input strings, as we explain precisely in Section 2. A parsing strategy is
binary if it combines two subsets of nonterminals at each step to create a new subset.
We consider the two problems of finding the binary parsing strategy with optimal time
and space complexity. For time complexity, there is no benefit to considering parsing
strategies that combine more than two subsets at a time. For space complexity, el
lowest complexity can be achieved with no factorization whatsoever; sin embargo, estafa-
sidering binary parsing strategies can provide an important trade-off between time and
space complexity. Our results generalize those of Crescenzi et al. (2015), who show NP-
completeness for decision problems related to both time and space complexity for linear
parsing strategies, which are defined to be strategies that add one nonterminal at a time.
Our approach constructs a graph from the permutation of nonterminals in a given
SCFG rule, and relates the parsing problem to properties of the graph. Our results for
space complexity are based on the graph theoretic concept of carving width, cuyo
decision problem is NP-complete for general graphs. Sección 3 relates the space com-
plexity of the SCFG parsing problem to the carving width of a graph constructed from
the SCFG rule. En la sección 4, we show that any polynomial time algorithm for optimizing
the space complexity of binary SCFG parsing strategies would imply a polynomial
time algorithm for carving width of general graphs. Our results for time complexity
are based on the graph theoretic concept of treewidth. En la sección 5, we show that any
polynomial time algorithm for the decision problem of the time complexity of binary
SCFG strategies would imply a polynomial time constant–factor approximation algo-
rithm for the treewidth of general graphs. In the other direction, NP-completeness of the
decision problem of the time complexity for SCFG would imply the NP-completeness of
treewidth of general graphs of degree six. These are both long-standing open problems
in graph theory.
2. Synchronous Context-Free Grammars and Parsing Strategies
In this section we informally introduce the notion of synchronous context-free grammar
and define the computational problem that we investigate in this article. We assume
the reader to be familiar with the notion of context-free grammar, and only briefly
208
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
summarize the adopted notation. Throughout this article, for any positive integer n we
write [norte] to denote the set {1, . . . , norte}.
2.1 Synchronous Context-Free Grammars
As usual, a CFG has a finite set of rules having the general form A → α, where A is
a nonterminal symbol to be rewritten and α is a string of nonterminal and terminal
symbols. A synchronous context-free grammar is a rewriting system based on a finite
set of so-called synchronous rules. Each synchronous rule has the general form [A1 →
α1, A2 → α2], where A1 → α1 and A2 → α2 are CFG rules. By convention, we refer to
A1 → α1 and A2 → α2 as the Chinese and English components of the synchronous rule,
respectivamente. Además, α1, α2 must be synchronous strings. This means that there
exists a bijection between the occurrences of nonterminals in α1 and the occurrences of
nonterminals in α2, and that this bijection is explicitly provided by the synchronous
regla. Nonterminal occurrences that correspond under the given bijection are called
linked nonterminal occurrences, or just linked nonterminals when no ambiguity arises.
We assume that linked nonterminals always have the same label.
As a simple example, consider the synchronous rule
[A → aA 1 bB 2 d, A → bB 2 cA 1 ]
where A, B are nonterminal symbols and a, b, C, d are terminal symbols. We have indi-
cated the bijection associated with the synchronous rule by annotating the nonterminal
occurrences with natural numbers, with the intended meaning that linked nonterminals
get the same number. We refer to this number as a nonterminal’s index.
The bijection associated with a synchronous rule plays a major role in the derivation
of a pair of strings by the SCFG. De hecho, in an SCFG we can only apply a synchronous
rule to linked nonterminals. Para ilustrar esto, we use our running example and consider
the synchronous strings [A 1 , A 1 ]. We can construct a derivation by applying our
synchronous rule to the linked nonterminals, written
[A 1 , A 1 ] ⇒ [aA 2 bB 3 d, bB 3 cA 2 ]
Note that we have renamed the indices in the synchronous rule to make them disjoint
from the indices already in use in the synchronous string to be rewritten. Although this
is unnecessary in this first derivation step, this strategy will always be adopted to avoid
conflicts in more complex derivations. We can move on with our derivation by applying
once more our synchronous rule to rewrite the linked A nonterminals, obtaining
[A 1 , A 1 ] ⇒ [aA 2 bB 3 d, bB 3 cA 2 ] ⇒ [aaA 4 bB 5 dbB 3 d, bB 3 cbB 5 cA 4 ]
In this derivation, the renaming of the indices is crucial to avoid conflicts.
Let S be a special nonterminal of our SCFG, which we call the starting nonterminal.
Using our notion of derivation, one can start from [S 1 , S 1 ] and attempt to derive a pair
of strings [tu, v] entirely composed of terminal symbols. Whenever this is possible, nosotros
say that [tu, v] is in the translation generated by the SCFG, meaning that v is one of the
possible translations of u.
209
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
Ejemplo 1
Consider the following list of synchronous rules, where symbols si are rule labels to be
used later as references
s1 : [S → A 1 B 2 , S → B 2 A 1 ]
s2 : [A → aA 1 b, A → bA 1 a]
s3 : [A → ab, A → ba] ,
s4 : [B → cB 1 d, B → dB 1 C]
s5 : [B → cd, B → dc]
What follows is a valid derivation of the SCFG, obtained by rewriting at each step the
linked nonterminals with the Chinese component occurring at the left-most position in
the sentential form
[S 1 , S 1 ] ⇒s1 [A 2 B 3 , B 3 A 2 ]
⇒s2 [aA 4 bB 3 , B 3 bA 4 a]
⇒s2 [aaA 5 bbB 3 , B 3 bbA 5 aa]
⇒s3 [aaabbbB 3 , B 3 bbbaaa]
⇒s4 [aaabbbcB 6 d, dB 6 cbbbaaa]
⇒s5 [aaabbbccdd, ddccbbbaaa]
It is not difficult to see that the given SCFG provides the translation {[apbpcqdq,
dqcqbpap] | pag, q ≥ 1}.
One can also associate SCFG derivations with parse trees, in a way very similar to
what we do with CFG derivations. Más precisamente, an SCFG derivation is represented
as a pair of parse trees, each generating one component in the derived string pair.
Además, linked nonterminals in the SCFG derivation are annotated in the parse
trees to indicate the place of application of some synchronous rule.
Ejemplo 2
The SCFG derivation for string pair [aaabbbccdd, ddccbbbaaa] in Example (1) is associated
with the following tree pair:
S 1
S 1
A 2
B 3
B 3
A 2
a A 4
a A 5
b
b
a
b
C
B 6
d
d B 6
C
b A 4
C
d
d
C
b A 5
a
a
d
C
210
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
One common view of SCFGs is that each synchronous rule implements a permuta-
tion of the (occurrences of the) nonterminal symbols in the two rule components. Más
precisamente, the nonterminals in the right-hand side of the Chinese rule component are
re-used in the right-hand side of the English component, but with a different ordering,
as defined by the bijection associated with the rule. This reordering is at the core of the
translation schema implemented by the SCFG. We will often use the permutation view
of a synchronous rule in later sections.
As already mentioned, we assume that linked nonterminals in a synchronous rule
have the same label. In natural language processing applications, this is by far the most
common assumption, but one might also drop this requirement. The results presented
in this article do not rest on such a restriction.
2.2 Parsing Strategies
The recognition and the parsing problems for SCFGs are natural generalizations of the
same problems defined for CFGs. Given an SCFG of interest, in the recognition problem
one has to decide whether an input pair [tu, v] can be generated. Además, para el
parsing problem one has to construct a parse forest, in some compact representation,
with all tree pairs that the SCFG produces when generating [tu, v]. When dynamic
programming techniques are used, it turns out that recognition and parsing algorithms
are closely related, since the elementary steps that are performed during recognition can
be memoized and later used to construct the desired parse forest.
Despite the similarity between CFGs and SCFGs, it turns out that there is a consid-
erable gap in the space and time complexity for recognition and parsing based on these
two classes. More specifically, for a fixed CFG, we can solve the recognition problem
in time cubic in the length of the input string, using standard dynamic programming
técnicas. A diferencia de, for a fixed SCFG we can still recognize the input pair [tu, v] en
polynomial time, but the degree of the polynomial depends on the specific structure of
the synchronous rules of the grammar, and can be much larger than in the CFG case. A
similar scenario also holds for the space complexity. The reason for this gap is informally
explained in what follows.
In the investigation of the recognition problem for CFGs and SCFGs, a crucial
notion is the one of parsing strategy. Recognition algorithms can be seen as sequences of
elementary computational steps in which two parse trees that generate non-overlapping
portions of the input are combined into a larger parse tree, according to the rules
of the grammar. We call parsing strategy any general prescription indicating the ex-
act order in which these elementary steps should be carried out. Generally speak-
En g, parsing strategies represent the order in which each parsing tree of the input is
constructed.
For the CFG case, let us consider the standard top–down chart parsing algorithm
of Kay (1980). The algorithm adopts a parsing strategy that visits the nonterminals in
the right-hand side of a rule one at a time and in a left-to-right order, combining the
associated parse trees accordingly. Parse trees are then compacted into so-called chart
bordes, which record the two endpoints of the generated substring along with other
grammar information. Chart parsing can be generalized to SCFGs. Sin embargo, en el
latter case parse trees no longer generate single substrings: They rather generate tuples
with several substrings. This can be ascribed to the added component in synchronous
rules and to the re-ordering of the nonterminals across the two components.
Let us call fan-out the number of substrings generated by a parse tree (this notion
will be formally defined later). It is well-known among parsing practitioners that the
211
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
fan-out affects the number of stored edges for a given input string, and is directly
connected to the space and time performance of the algorithm. A binary parsing strat-
egy of fan-out ϕ has space complexity O(n2ϕ) and time complexity at most O(n3ϕ)
where n is the sentence length (Seki et al. 1991; Gildea 2010). If we adopt the appropriate
parsing strategy, we can reduce the fan-out, resulting in asymptotic improvement in the
space and time complexity of our algorithm. To illustrate this claim we discuss a simple
ejemplo.
Ejemplo 3
Consider the synchronous rule
s : [A → B 1 B 2 B 3 B 4 B 5 B 6 B 7 B 8
A → B 5 B 7 B 3 B 1 B 8 B 6 B 2 B 4 ]
(1)
The permutation associated with this rule is schematically visualized in Figure 1.
A naive parsing strategy for rule s would be to collect the nonterminals B k one at
a time and in ascending order of k. Por ejemplo, at the first step we combine B 1 y
B 2 , constructing a parse tree with fan-out 3, as seen from Figure 1. The worst case is
attested when we construct the parse tree consisting of occurrences B 1 , . . . , B 5 , cual
has a fan-out of 4.
Alternativamente, we could explore more flexible parsing strategies. An example is
depicted in Figure 2, where each elementary step is represented as an internal node
of a tree. Esta vez, at the first step (node n1) we combine B 3 y B 4 , as depicted
in Figure 3a. At the second step (node n3), we combine the result of node n1 and B 2 ,
again constructing a parse tree with fan-out three, as depicted in Figure 3b, etcétera.
This strategy is non-linear, because it might combine parses containing more than one
nonterminal occurrence each; see Figure 3c. From Figure 1 it is not difficult to check that
at each node nk the constructed parse has fan-out bounded by 3. We therefore conclude
that our second strategy is more efficient than the left-to-right one.
B 8
B 7
B 6
B 5
B 4
B 3
B 2
B 1
A
Cifra 1
Graphical representation of the permutation associated with synchronous rule s of Equation (1).
212
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
n7
n5
n6
B 1
n3
B 5
n4
B 2
n1
B 6
n2
B 3 B 4
B 7 B 8
Cifra 2
A bidirectional parsing strategy for rule s of Equation (1).
This example shows that the maximum fan-out needed to parse a synchronous
rule depends on the adopted parsing strategy. In order to optimize the computational
resources of a parsing algorithm, we need to search for a parsing strategy that minimizes
the maximum fan-out of the intermediate parse trees. The problem that we investigate
in this article is therefore the one of finding optimal parsing strategies for synchronous
normas, where in this context optimal means with a value of the maximal fan-out as small
as possible.
2.3 Fan-out and Parsing Optimization
In this section we provide a mathematical definition for the concepts that we have
informally introduced in the previous section. We need to do so in order to be able
to precisely define the computational problem that is investigated in this article.
(a)
(b)
(C)
B 4
B 3
n1
B 2
n1
n3
n5
n6
A
Cifra 3
First step (a), second step (b), and last step (C) of the parsing strategy in Figure 2.
213
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
Assume a synchronous context-free rule s with r ≥ 2 linked nonterminals. necesitamos
to address occurrences of nonterminal symbols within s. We write (cid:104)1, i(cid:105) to represent
the ith occurrence (from left to right) in the right-hand side of the Chinese component
of s. Similarmente, (cid:104)2, i(cid:105) represents the ith occurrence in the right-hand side of the English
component of s. Each pair (cid:104)h, i(cid:105), h ∈ [2] and i ∈ [r], is called an occurrence.
We assume without loss of generality that the nonterminals of the Chinese compo-
nent are indexed sequentially. Eso es, the index of (cid:104)1, i(cid:105) is i. Let π be the permutation
over set [r] implemented by s, meaning that (cid:104)2, i(cid:105) is annotated with index π(i), y
therefore is co-indexed with (cid:104)1, Pi(i)(cid:105). Como ejemplo, in rule (1) we have r = 8 y
Pi(1) = 5, Pi(2) = 7, Pi(3) = 3, etc.. Under this convention, each pair ((cid:104)1, Pi(i)(cid:105), (cid:104)2, i(cid:105)) y,
equivalently, each pair ((cid:104)1, i(cid:105), (cid:104)2, π−1(i)(cid:105)), i ∈ [r] is called a linked pair.
A parsing strategy for s is a rooted, binary tree τs with r leaves, where each leaf is
a linked pair ((cid:104)1, Pi(i)(cid:105), (cid:104)2, i(cid:105)). As already explained, the intended meaning is that each
internal node of τs represents the operation of combining the linked pairs below the left
node with the linked pairs below the right node. These operations must be performed
in a bottom–up fashion.
Let n be an internal node of τs, and let τn be the subtree of τs rooted at n. We write
y(norte) to denote the set of all occurrences (cid:104)h, i(cid:105) appearing in the linked pair of some leaf
of τn. We say that occurrence (cid:104)h, i(cid:105) ∈ y(norte) is a right boundary of n if i = r or if i < r and
(cid:104)h, i + 1(cid:105) (cid:54)∈ y(n). Symmetrically, we say that (cid:104)h, i(cid:105) ∈ y(n) is a left boundary if i = 1 or
if i > 1 y (cid:104)h, i − 1(cid:105) (cid:54)∈ y(norte). Note that the occurrences (cid:104)1, 1(cid:105) y (cid:104)2, 1(cid:105) are always left
boundaries, y (cid:104)1, r(cid:105) y (cid:104)2, r(cid:105) are always right boundaries. We let bd(norte) be the total
number of right and left boundaries in y(norte).
Intuitivamente, the number of boundaries in y(norte) provides the number of endpoints of
the substrings in the rule components of s that are spanned by the occurrences in y(norte).
Dividing this total number by two provides the number of substrings spanned by the
occurrences in y(norte). We therefore define the fan-out at node n as
para(norte) = 1
2
bd(norte)
(2)
As discussed in Section 2.2, the largest fan-out over all internal nodes of a parsing
strategy provides space and time bounds for a dynamic programming parsing algo-
rithm adopting that strategy. Given an input synchronous rule s, we wish to find the
parsing strategy that minimizes quantity
mín.
t
máximo
norte
para(norte)
(3)
where τ ranges over all possible parsing strategies for s, and n ranges over all possible
nodes of τ. One of the two main results in this article is that this optimization problem
is NP-hard.
3. Cyclic Permutation Multigraphs and Carving Width
En esta sección, we relate the fan-out of parsing strategies for an SCFG rule to the
properties of a specific graph derived from the rule’s permutation.
We begin by introducing some basic terminology. A multiset is a set where there can
be several occurrences of a given element. Usamos (cid:93) as the merge operation defined for
multisets: This operation preserves the number of occurrences of the merged multisets.
As usual, we denote an undirected graph as a pair G = (V, mi) where V is a finite
214
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
set of vertices and E is a set of edges, with each edge consisting of an unordered
pair of vertices. A multigraph is a graph that uses a multiset of edges. This means
that a multigraph can have several occurrences of an edge impinging on a pair of
vertices.
A cyclic permutation multigraph is a multigraph G = (V, A (cid:93) B) such that both
PA = (V, A) and PB = (V, B) are Hamiltonian cycles, eso es, cycles that visit all the
vertices in V exactly once. En el siguiente, the edges in A will be called red and the
edges in B will be called green. Our definition is based on the permutation multigraphs
of Crescenzi et al. (2015), which differ in that they consist of two acyclic Hamiltonian
paths.
In this article, we use cyclic permutation multigraphs to encode synchronous rules.
Let s be a synchronous rule with r ≥ 2 linked pairs. Let also ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)) be a special
linked pair representing the left-hand side nonterminal of s. We construct the cyclic
permutation multigraph Ms, representing s, como sigue. The linked pairs ((cid:104)1, Pi(i)(cid:105), (cid:104)2, i(cid:105)),
i ∈ [r], junto con ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)), form the vertices of Ms. The red cycle of Ms begins
con ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)), then follows the order in which nonterminals occur in the Chinese
component of s, and finally returns to ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)). Similarmente, the green cycle of Ms
begins and ends with ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)), and follows the order in which nonterminals occur
in the English component.
In the other direction, given any cyclic permutation multigraph, we can derive
a corresponding SCFG rule by, primero, choosing an arbitrary vertex as representing the
left-hand side nonterminal, and then following the red cycle to obtain the sequence of
Chinese nonterminals, and finally following the green cycle to obtain the sequence of
English nonterminals.
Ejemplo 4
Consider the SCFG rule s of Equation (1) in Example (3). Cifra 4 shows the
cyclic permutation multigraph associated with s. The red path starts at ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)),
followed by the vertices in the order of the Chinese nonterminals of the rule, eso
es, ((cid:104)1, 1(cid:105), (cid:104)2, 4(cid:105)), ((cid:104)1, 2(cid:105), (cid:104)2, 7(cid:105)), ((cid:104)1, 3(cid:105), (cid:104)2, 3(cid:105)), . . ., ((cid:104)1, 8(cid:105), (cid:104)2, 5(cid:105)). The green path starts at
((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)), followed by the vertices in the order of the English nonterminals, eso es,
((cid:104)1, 5(cid:105), (cid:104)2, 1(cid:105)), ((cid:104)1, 7(cid:105), (cid:104)2, 2(cid:105)), ((cid:104)1, 3(cid:105), (cid:104)2, 3(cid:105)), . . ., ((cid:104)1, 4(cid:105), (cid:104)2, 8(cid:105)).
An important property of Ms is that every edge connects two linked pairs that
share a boundary in either the Chinese component of the rule (red edge) or in the
English component (green edge). Tenga en cuenta que, as a special case, we consider the linked
pair ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)) as sharing two left boundaries with the linked pairs ((cid:104)1, 1(cid:105), (cid:104)2, π−1(1)(cid:105))
((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105))
((cid:104)1, 1(cid:105), (cid:104)2, 4(cid:105))
(cid:104)1, 2(cid:105), (cid:104)2, 7(cid:105))
((cid:104)1, 3(cid:105), (cid:104)2, 3(cid:105))
((cid:104)1, 4(cid:105), (cid:104)2, 8(cid:105))
((cid:104)1, 5(cid:105), (cid:104)2, 1(cid:105))
((cid:104)1, 6(cid:105), (cid:104)2, 6(cid:105))
((cid:104)1, 7(cid:105), (cid:104)2, 2(cid:105))
((cid:104)1, 8(cid:105), (cid:104)2, 5(cid:105))
Cifra 4
The cyclic permutation multigraph corresponding to the SCFG rule s of Equation (1). En esto
figure and all subsequent figures, green edges are shown with dashed lines.
215
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
y ((cid:104)1, Pi(1)(cid:105), (cid:104)2, 1(cid:105)), and as sharing two right boundaries with the linked pairs
((cid:104)1, r(cid:105), (cid:104)2, π−1(r)(cid:105)) y ((cid:104)1, Pi(r)(cid:105), (cid:104)2, r(cid:105)).
Consider any set S of linked pairs from Ms such that ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)) (cid:54)∈ S. Let S be the
complement set of S, eso es, S is the set of linked pairs from Ms not in S. It is not difficult
to see that the multiset of edges connecting S and S corresponds to the set of boundaries
of any parse tree associated with the linked pairs in S. We can apply this property to
parsing strategies, which we have previously defined as rooted binary trees, in order to
count the boundaries that are open after completing some parsing step represented by
an internal node n, which we have defined as bd(norte). Consider for instance the parsing
strategy shown in Figure 2, and consider the internal node n3. At this node we have
collected nonterminals B 2 , B 3 , y B 4 , and we have bd(n3) = 6. If we consider the set
S with the corresponding linked pairs ((cid:104)1, 2(cid:105), (cid:104)2, 7(cid:105)), ((cid:104)1, 3(cid:105), (cid:104)2, 3(cid:105)), y ((cid:104)1, 4(cid:105), (cid:104)2, 8(cid:105)), nosotros
can see that the number of edges of the multigraph that connect S with S is six, eso
es, exactly the value of bd(n3). In order to express these observations in a mathematical
way, we need to introduce the notions of tree layout, width, and carving width, cual
we borrow from graph theory.
A tree layout T of a graph G is an undirected binary branching tree having one
vertex of G at each leaf. To avoid confusion, in what follows we use the term node when
referring to vertices of layout trees and we use the term arc when referring to edges of
layout trees. Edges of G are routed along the arcs of T. A simple example is provided in
Cifra 5, showing a graph G and a tree layout T of G.
Informalmente, the width of an arc of T is the number of edges from G that are routed
through that arc. We define this notion more precisely in what follows. We start with
some auxiliary notation. Let V be the set of vertices of G and let S be any subset of V.
The edge boundary of S in V, written ∂V(S), is the set of edges of G that connect vertices
in S and vertices in the complement set S = V \ S. Let a be an arc in a tree layout T of G,
and let T1 and T2 be the two components of the graph obtained by removing a from T.
Let also L(T1) be the subset of V appearing at the leaves of T1. The width of a in T is the
number of edges in G that cross between T1 and T2:
wdG,t(a) = |∂V(l(T1))|
(4)
The maximum width among all of the arcs of T is the carving width of T. The carving
width of T in Figure 5 es 3. The carving width of G is the minimum carving width over
all possible tree layouts of G:
wd(GRAMO) = min
t
máximo
a
wdG,t(a)
(5)
Cifra 5
(a) Graph G. (b) Tree layout T of G; G’s edges are routed along T’s arcs and are represented as
dashed lines.
216
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
Deciding whether the carving width of arbitrary graphs is less than or equal to a
given integer is an NP-complete problem (Seymour and Thomas 1994). Throughout this
artículo, we extend to multigraphs all of the discussed notions of tree layout, width, y
carving width, in the obvious way. Because graphs are a subset of multigraphs, carving
width of multigraphs is also an NP-complete problem.
We can now apply the notion of tree layout to parsing strategies for SCFG rules, y
show an important property that relates the notions of fan-out (equivalently, boundary
count) and carving width. Let s be an SCFG rule with r linked pairs. Recall that a parsing
strategy for s is a rooted binary tree where each leaf node is a linked pair representing
some nonterminal from the right-hand side of s. We attach to the root of our parsing
strategy an additional leaf node, representing the special linked pair ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)). El
resulting tree has r + 1 leaves, and can therefore be used as a tree layout for the cyclic
permutation multigraph Ms encoding s. Además, each internal node of the tree
layout is associated with a parsing step.
Ejemplo 5
Consider the SCFG rule s in Equation (1) and the associated cyclic permutation multi-
graph Ms shown in Figure 4. Consider also the parsing strategy τs for s shown in
Cifra 2. By attaching the linked pair ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)) to the root node of τs, we can derive a
tree layout T for Ms, which is shown in Figure 6. Observe that every linked pair (vertex)
of Ms, including the special linked pair ((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105)), is placed at some leaf node of T,
and the edges of Ms are routed along arcs of T. Let a be an arc of T and let n be the
node below a, with respect to the root node. Observe how edges that are routed along
a correspond to boundaries that are open at the associated step n of the parsing
strategy τs.
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
((cid:104)1, 0(cid:105), (cid:104)2, 0(cid:105))
n7
n5
n6
((cid:104)1, 1(cid:105), (cid:104)2, 4(cid:105))
n3
((cid:104)1, 5(cid:105), (cid:104)2, 1(cid:105))
n4
(cid:104)1, 2(cid:105), (cid:104)2, 7(cid:105))
n1
((cid:104)1, 6(cid:105), (cid:104)2, 6(cid:105))
n2
((cid:104)1, 3(cid:105), (cid:104)2, 3(cid:105))
((cid:104)1, 4(cid:105), (cid:104)2, 8(cid:105))
((cid:104)1, 7(cid:105), (cid:104)2, 2(cid:105))
((cid:104)1, 8(cid:105), (cid:104)2, 5(cid:105))
Cifra 6
A tree layout of the cyclic permutation multigraph of Figure 4 obtained from the parsing
strategy in Figure 2.
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
217
Ligüística computacional
Volumen 42, Número 2
The next lemma summarizes the connection between carving width and fan-out.
Lema 1
The minimum fan-out of any parsing strategy of an SCFG rule s is half of the carving
width of the cyclic permutation multigraph constructed from s.
Prueba
Let Ms be the cyclic permutation multigraph derived from s. Consider a parsing strategy
τs for s. We construct a tree layout TMs,τs of Ms from the strategy τs by attaching a new
leaf node for the left-hand side nonterminal to the root of τs. Each vertex of the cyclic
permutation multigraph is placed at the corresponding leaf of the tree layout. Cuando
the two nonterminals sharing a given boundary are combined, the edge representing
the boundary is routed along the two arcs below the node for this combination. El
edges representing the left-most and right-most boundaries of the rule’s Chinese and
English components are routed through the node for the parsing strategy’s root, porque
they must continue through the root to reach the leaf representing the left-hand side
nonterminal. At each arc a of TMs,τs , the right-hand side of Equation (4) corresponds to
the right-hand side of Equation (2), and thus half the carving width of TMs,τs is equal to
the fan-out of τs.
In the other direction, let TMs be a tree layout of Ms. A parsing strategy τs can be
constructed from TMs by removing the leaf node of TMs corresponding to the left-hand
side of s, and choosing the adjacent node as the parsing strategy’s root. As before, el
fan-out of τs is half the carving width of TMs . The minimum fan-out over all strategies
is also the minimum carving width over all tree layouts. (cid:4)
4. Carving Width of Cyclic Permutation Multigraphs
En esta sección, we reduce the problem of carving width of general graphs to the problem
of carving width of cyclic permutation multigraphs, defined in Section 3. This reduction
involves constructing a cyclic permutation multigraph from an input graph as outlined
en la sección 4.1. The details of the construction are given in Section 4.2, and the proof of
NP-completeness for carving width of the resulting multigraphs is given in Section 4.3.
This result is then used in Section 4.4 to show that finding a space-optimal binary
parsing strategy for an SCFG rule is an NP-hard problem.
4.1 General Idea
Suppose that we are given an instance of the general carving width problem, consisting
of a graph G and a positive integer k, where we have to decide whether the carving
width of G is less than or equal to k. We identify the vertices of G with positive
integers in [norte], where n ≥ 2 is the number of vertices of G. Our reduction consists in
the construction of a cyclic permutation multigraph M such that the carving width of M
is equal to 4k if and only if the carving width of G is equal to k.
The main idea underlying our construction is that each vertex i of G is associated
with a gadget in M consisting of two components. The first component is a grid-like
graph Xi of 2di rows and 2di columns, where di is the degree of vertex i in G. El segundo
component is a grid-like graph Gi of 4k rows and 4k columns. Besides the edges of the
grids Xi and Gi, M also includes some extra edges, called interconnection edges. Para
each i ∈ [norte], there are some interconnection edges connecting Gi and Xi. Además,
218
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
Cifra 7
A tree layout TM of cyclic permutation multigraph M (shown later in Figure 11) derived from the
tree layout T of Figure 5.
for each edge (i, j) of G, there are interconnection edges connecting components Xi
and Xj.
Interconnection edges serve two main purposes. Primero, they are used to connect the
red and green paths within M, so that M satisfies the definition of cyclic permutation
multigraph. Segundo, the connections between component pairs Xi and Xj mimic the
structure of the source graph G, as will be explained in more detail later. This second
condition guarantees that an optimal tree layout TM of M can be obtained from an
optimal tree layout T of G, by placing each Xi and Gi component under the leaf node
of T that is associated with vertex i of G. A simple example of the construction is
schematically depicted in Figure 7.
4.2 Construction
En esta sección, we define precisely the grid-like graphs Xi and Gi, for each vertex i of the
source graph G, and outline the overall structure of the cyclic permutation multigraph
M constructed from G.
4.2.1 Depth-First Traversal. To be used later in our construction, we need to define a
special ordering of the edges of G. We do this here by constructing a path through
G that starts at an arbitrary vertex and visits each edge exactly twice. We adapt the
standard procedure for depth-first traversal of an undirected graph. Más precisamente,
whenever we reach a vertex i, we continue our path by arbitrarily choosing an edge
Algoritmo 1 Procedure for depth-first traversal of G starting at i
1: procedure DFS(i)
2:
3:
4:
return
append i to path
if each edge (i, j) has already been visited then
5:
6:
7:
8:
for each edge (i, j) not already visited (in arbitrary order) hacer
DFS( j)
append i to path
return
219
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
Cifra 8
The path constructed by Algorithm 1 starting at vertex 1 of the displayed graph is
(cid:104)1, 2, 5, 2, 1, 3, 6, 3, 7, 4, 1, 4, 7, 3, 1(cid:105).
(i, j) that has not already been visited, and move to j. If all of the edges at i have already
been visited, we backtrack to previous vertices of our path, until we reach edges that
have not already been visited. The path construction stops when we reach the starting
vertex and all of the edges of G have been visited. Algoritmo 1 provides a recursive
procedure for the incremental construction of the path, assuming that the path is a
global variable initialized as the empty sequence of vertices. A simple example is also
como se muestra en la figura 8.
Let m be the number of edges of G. It is not difficult to show that the path produced
by Algorithm 1 is a sequence (cid:104)i1, . . . , i2m+1(cid:105) of vertices from G satisfying both of the
following properties.
(cid:114)
(cid:114)
For each edge (i, j) of G, there is a unique k ∈ [2metro] such that ik = i and
ik+1 = j, and there is a unique h ∈ [2metro] such that ih = j and ih+1 = i.
For each j ∈ [norte] tenemos |{k : k ≤ 2m, ik = j}| = dj. (Recall that dj is the
degree of vertex j in G.)
The first bullet says that each edge of G is traversed exactly twice, the first time in one
direction and the second time in the opposite direction. The second bullet is a direct
consequence of the first bullet. These properties will be used in the specification of cyclic
permutation multigraph M.
4.2.2 Component Xi. The graph component Xi, i ∈ [norte], embeds a square grid with 2di
rows and 2di columns. In addition to the edges of the grid, Xi also contains a set
of interconnection edges that connect Xi to component Gi (specified later) y eso
connect Xi to all components Xj such that (i, j) is an edge of G. This is schematically
depicted in Figure 9. (A complete example for components Xi and Gi will be presented
later.)
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Más precisamente, let x(i)
metro,n be the vertex in the mth row and nth column of Gi.
let g(i)
metro,n be the vertex in the mth row and nth column of Xi. Similarmente,
220
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
Cifra 9
Grid Xi corresponding to vertex i of G with di = 3. Interconnection edges from the left column
and top row reach grids Xj for vertices j that are neighbors of i in G.
(cid:114)
(cid:114)
For each p ∈ [2di − 1], Xi has an interconnection edge α(i)
= (X(i)
Además, Xi has an interconnection edge α(i)
2di
pag = (X(i)
pag,2di
, gramo(i)
2k,1).
2di,2di
, gramo(i)
pag,1).
Symmetrically, for each q ∈ [2di − 1], Xi has an interconnection edge
q = (X(i)
b(i)
= (X(i)
b(i)
2di
2di,q, gramo(i)
1,q). Además, Xi has an interconnection edge
, gramo(i)
1,2k).
2di,2di
En general, this specification provides 4di edges connecting Xi and Gi. As will be discussed
más tarde, 2di of these edges belong to the red path and 2di belong to the green path.
For the connections between gadget Xi and other gadgets Xj, our strategy is more
involved, because we need to reproduce the topology of G and, at the same time, nosotros
need to construct a red and a green path within M. For each vertex j that is a neighbor
of vertex i in G, we add two red edges between the left column of Xi and the left
column of Xj, and two green edges between the top row of Xi and the top row of
Xj; see again Figure 9. Crucial to our construction, the order in which we visit the
neighbors of i is induced from the depth-first traversal of G specified by Algorithm 1 en
Sección 4.2.1.
Let γ = (cid:104)i1, . . . , i2m+1(cid:105) be the path produced by Algorithm 1 when given vertex 1 como
aporte, so that i1 = i2m+1 = 1. We use a function dfγ(i, k), for i ∈ [norte] and k ∈ [2metro], eso
221
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
counts the number of interconnection edges already established for the left column
(equivalently, for the top row) of Xi, at step k of the depth-first traversal, eso es, bien
before the kth edge (I, ik+1) of γ is processed. In this way, dfγ(i, k) + 1 is the next
available position in the left column (equivalently, fila superior) of Xi when visiting the kth
edge of γ.
For technical reasons, grid X1 needs a special treatment: the last edge (i2m, i2m+1) in γ
is used to enter grid X1 for the first time, and is therefore placed at vertex x(1)
1,1. Por lo tanto,
all other edges of γ that are placed at X1 need to be shifted by one position. This means
that we have to increase our count dfγ(i, k) by one unit when i = 1 and k < 2m, and we
have to treat the case of i = 1 and k = 2m in a special way.
Formally, for i ∈ [n] and k ∈ [2m], we define
dfγ(i, k) =
|{k(cid:48) : k(cid:48) < k, i ∈ {ik(cid:48) , ik(cid:48)+1}}|,
|{k(cid:48) : k(cid:48) < k, i ∈ {ik(cid:48) , ik(cid:48)+1}}| + 1, if i = 1, k < 2m
if i = 1; k = 2m
1,
if i > 1
We are now ready to specify the interconnection edges for Xi. Recall from Sec-
ción 4.2.1 eso, for an edge (i, j) of G, there exists a unique k such that ik = i and ik+1 = j
in γ, and there exists a unique k(cid:48) such that ik(cid:48) = j and ik(cid:48)+1 = i.
(cid:114)
(cid:114)
dfγ( j,k)+1,1). Symmetrically, Xi has an interconnection edge
1,dfγ( j,k)+1).
For each edge (i, j) of G, let k be as above. Xi has an interconnection edge
dfγ(i,k)+1,1, X( j)
(X(i)
1,dfγ(i,k)+1, X( j)
(X(i)
For each edge (i, j) of G, let k(cid:48) be as above. Xi has an interconnection edge
dfγ(i,k(cid:48) )+1,1, X( j)
(X(i)
dfγ( j,k(cid:48) )+1,1). Symmetrically, Xi has an interconnection edge
1,dfγ(i,k(cid:48) )+1, X( j)
(X(i)
1,dfγ( j,k(cid:48) )+1).
Informalmente, this specification means that for each edge (i, j) of G we have four inter-
connection edges between Xi and Xj: two edges from when the depth-first traver-
sal first explores the edge from i to j, and two edges from when the depth-first
traversal travels back from j to i. This amounts to 4di
interconnection edges for
grid Xi.
p and β(i)
q
4.2.3 Component Gi. We now turn to (multi)graph Gi, i ∈ [norte], which embeds a square grid
with 4k rows and 4k columns; here k is the positive integer in the input instance of the
carving width problem in our reduction; mira la sección 4.1. We have already introduced
interconnection edges α(i)
for Gi in Section 4.2.2. In addition to these edges,
we need to double some occurrences of the edges internal to the grid Gi, in order to
connect the red and green paths within M. We remind the reader that M is defined as a
multigraph; therefore we can introduce multiple edges joining the same pair of vertices
of Gi. As will be discussed in detail later, our construction always assigns different colors
(red or green) to two edges joining the same pair of vertices. An overall picture of Gi is
schematically presented in Figure 10 for di = 3 y k = 5.
We first provide the specification of the red edges of Gi.
For each p ∈ [2k], we double edge (gramo(i)
2p−1,4k, gramo(i)
2pag,4k).
(cid:114)
222
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
Cifra 10
Grid gadget Gi corresponding to vertex i. We assume di = 3 y k = 5. The red path is drawn
using a heavy red line; the green path is drawn using a dashed, heavy green line. The red path
visits Gi one row at a time. The only exception is at row 2k: after reaching the middle of that
row, the red path switches to row (2k + 1). The path comes back to row 2k only after having
completed the lower half of Gi. The green path follows a symmetrical pattern in visiting Gi one
column at a time, with a switch in the middle of column 2k.
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
(cid:114)
(cid:114)
(cid:114)
For each p with di ≤ p ≤ k − 1, we double edge (gramo(i)
2pag,1, gramo(i)
For each p with k + 1 ≤ p ≤ 2k − 1, we double edge (gramo(i)
2p+1,1).
2pag,1, gramo(i)
2p+1,1).
We add edge (gramo(i)
2k+1,1, gramo(i)
4k,1).
This set of edges form the red extra edges of Gi. Symmetrically, for each red extra edge
(gramo(i)
q(cid:48),pag(cid:48) ) to Gi, eso es, the new edge
has row and column reversed for both endpoints.
pag(cid:48),q(cid:48) ) arriba, we also add a green extra edge (gramo(i)
pag,q, gramo(i)
q,pag, gramo(i)
223
Ligüística computacional
Volumen 42, Número 2
Ejemplo 6
We now provide an example that uses the specifications in this and the previous section.
Consider the graph G in Figure 5, and let k = 4. Assume the depth-first traversal of G
that produces the vertex path γ = (cid:104)1, 2, 3, 4, 3, 1, 3, 2, 1(cid:105). From G, k, and γ we construct
the cyclic permutation multigraph M shown in Figure 11.
Let us focus on vertex 3 in G, and the associated components X3 and G3 in M.
Because d3 = 3, X3 has size 6 × 6. Because vertex 3 is connected to vertices 1, 2, y 4 en
GRAMO, grid X3 has two red and two green interconnection edges to each of the components
X1, X2, and X4. Consider now all the edges of G impinging on vertex 3. These edges
are visited twice by γ, in the order (2, 3), (3, 4), (4, 3), (3, 1), (1, 3), (3, 2). Respectivamente,
when visiting the first column of X3 from top to bottom, we touch upon the red
interconnection edges for grids X2, X4, X4, X1, X1, and X2. Exactly the same order is
found for the green interconnection edges, when visiting the first row of X3 from left to
bien.
Grid G3 has size 4k × 4k, eso es, 16 × 16. This is also the size of any other grid
Gi in M. We have 2d3 = 6 red interconnection edges connecting G3 and X3; these are
the α(3)
pag , p ∈ [5], impinges on vertex
pag,1, and edge α(3)
gramo(3)
8,1. A symmetrical pattern is seen for the green
interconnection edges β(3)
p edges of Section 4.2.2, for p ∈ [6]. Each edge α(3)
impinges on vertex g(3)
q , q ∈ [6], connecting G3 and X3.
6
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Cifra 11
Cyclic permutation multigraph M constructed from the example graph G shown in Figure 5 con
k = 4 and depth-first traversal path γ = (cid:104)1, 2, 3, 4, 3, 1, 3, 2, 1(cid:105). Edges are routed according to the
high-level tree layout of Figure 7.
224
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
4.2.4 Cyclic Permutation Multigraph M. To summarize the previous sections, the cyclic
permutation multigraph M contains components Xi and Gi for each i ∈ [norte], where n
is the number of vertices of the source graph G. Besides the edges of the grid-like
components Xi and Gi, M also contains some interconnection edges. Más precisamente, para
each vertex i of G, M contains 2di red edges α(i)
q connecting Xi
and Gi. Además, for each edge (i, j) of G, M contains two red edges and two green
edges connecting Xi and Xj.
p and 2di green edges β(i)
1,1 to vertex g(i)
We still need to show that M is a cyclic permutation multigraph, eso es, all the edges
mentioned above form a red Hamiltonian cycle and a green Hamiltonian cycle over the
vertices of M. We start by observing that, within each component Gi, these two cycles
are symmetric, in the sense that the green Hamiltonian cycle can be obtained from the
red Hamiltonian cycle by switching the first and the second indices of each vertex in an
borde. En otras palabras, within Gi the green Hamiltonian cycle can be obtained from the
red Hamiltonian cycle by a rotation along the axis from vertex g(i)
4k,4k. Este
is also apparent from Figure 10. A similar observation holds for the components Xi, como
apparent from Figure 9, and for the interconnection edges. Por esta razón, we outline
in the following only the red Hamiltonian cycle of M.
The red cycle of M starts and ends at vertex g(1)
1,1 in G1. The cycle follows a depth-
first traversal of G specified by Algorithm 1 en la sección 4.2.1. Each time the depth-first
traversal visits node i of G, the red cycle travels across one row of Xi from left to right,
then passes through Gi, and then travels across the next row in Xi from right to left,
before proceeding to the gadget for the next vertex of G in the depth-first traversal.
Exactly how the red cycle travels through Gi differs for the first di − 1 times that Gi is
visited and the final time; ver figura 10. The first di − 1 times that Gi is visited, the red
cycle travels from left to right across one row, descends to the next row, and traverses it
from the right to left. On the final trip into Gi, the cycle visits all the remaining rows as
follows. It travels left to right across row 2(di − 1) + 1 of Gi, then zig-zags across the next
2k − 2(di − 1) − 1 rows in the upper half of Gi. Upon reaching vertex g(i)
2k,2k+1, próximo
from its right and proceeding to the left, the red cycle moves to g(i)
2k+1,2k+1 using an edge
internal to the grid. This choice is designed to ensure, for reasons that will become clear
más tarde, that none of the extra edges added to the grid underlying Gi crosses between the
grid’s four quadrants.
Próximo, the red cycle travels from vertex g(i)
2k+1,2k+1 toward the right to reach g(i)
4k,1, the red cycle jumps to vertex g(i)
2k+1,4k,
and then zig-zags across the next 2k − 1 rows of the lower half of Gi. Upon reaching
vertex g(i)
2k+1,1 using a single extra edge. It then travels
toward the right to reach g(i)
2k+1,2k, moves to g(i)
2k,2k using an edge internal to the grid,
and finally proceeds toward the left to reach g(i)
2k,1. This completes the last traversal of Gi
by the red cycle.
From g(i)
2k,1 the red cycle leaves Gi and enters Xi at vertex x(i)
. It then travels across
2di,2di
2di,1, and then proceeds to visit the gadget Xj for
the bottom row of Xi to reach vertex x(i)
the next vertex j of G in the depth-first traversal, as already mentioned.
4.3 NP-Completeness
The main result in this section shows that deciding whether the carving width of a
cyclic permutation multigraph is less than or equal to a given integer is an NP-complete
problema. We develop this result by means of several intermediate lemmas, reduciendo
225
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
from the carving width decision problem for a general graph. Throughout this section,
we assume that G, k, and M are defined as in the previous sections.
Lema 2
Within a tree layout TM of M, we can organize all vertices of Gi, for any i ∈ [norte], en
a (conectado) subtree TGi of TM in such a way that there is an upper bound of 4k for
the width of any arc internal to TGi and for the width of any arc connecting TGi to the
remaining nodes of TM.
Prueba
It has been shown by Kozawa, Otachi, and Yamazaki (2010) that the carving width of
an m × m grid with m even is m. We adapt here their construction to show the statement
of the lemma.
The subtree TGi is built by dividing the grid Gi into four quadrants of size 2k × 2k,
shown by black dotted lines in Figure 12. Consider first the upper left quadrant. Nosotros
build a linear subtree TUL by adding vertices of this quadrant one at a time in column
major order. More specifically, we add columns of the quadrant from left to right, y
we add vertices within each column from top to bottom, as shown Figure 12. Usamos
symmetrical constructions for the bottom left quadrant, the bottom right quadrant,
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Cifra 12
Subtree layout for grid Gi, assuming di = k = 3. The subtree layout is drawn using a thick black
line. A dotted thin black line shows the division of grid Gi into four quadrants.
226
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
and the upper right quadrant of Gi, resulting in the linear trees TBL, TBR, and TUR,
respectivamente. The top-most nodes of the linear trees TUL and TBL are connected to a new
node nL; similarmente, the top-most nodes of TUR and TBR are connected to a new node nR,
and nL and nR are connected together. This completes the specification of the tree layout
TGi . Finalmente, an extra edge is used to connect the bottom-most internal node of TUL to the
remaining nodes of the tree layout TM. See again Figure 12.
We now prove an upper bound of 4k for the width of any arc internal to TUL and for
the width of any arc connecting TUL to the remaining nodes of TM. The binary tree TUL
has 4k2 leaf nodes and 4k2 internal nodes. Let us name the internal nodes of TUL from
bottom to top, using integers in [4k2] in increasing order. Because the width of TUL does
not decrease at the increase of di, in what follows we consider the worst case of di = k.
Observe that the arc connecting internal node 1 to the remaining nodes of TM routes
4k edges of Gi, a saber, the interconnection edges of Gi that lead to Xi. When we move
to arc (1, 2) of TUL, we lose the two interconnection edges impinging on vertex g(i)
1,1 of Gi,
1,1, gramo(i)
but those edges are replaced by two new internal edges of Gi, a saber, bordes (gramo(i)
2,1)
y (gramo(i)
1,2). Por lo tanto, arc (1, 2) of TUL still routes 4k edges of Gi. Climbing up tree
TUL at arcs (2, 3), (3, 4), etcétera, shows exactly the same pattern, with two edges
impinging on the newly added vertex of Gi replaced by two new edges of Gi impinging
on the same vertex. This process goes on until we reach the arc that connects node 4k2
(the top-most node of TUL) to node nL. This arc routes the 4k edges of the upper left
quadrant that reach the upper right quadrant and the lower left quadrant.
1,1, gramo(i)
To conclude our proof, we observe that the upper left quadrant embeds any
of the three remaining quadrants of Gi. Because the trees TBL, TBR, and TUR are sym-
metrical to TUL, the former trees must also have an upper bound of 4k on the width
of their internal arcs as well as on the width of the arcs connecting these trees to the
remaining nodes of TM. Finalmente, we observe that the arc connecting nodes nL and nR also
routes 4k edges of Gi, namely the edges from the two left quadrants to the two right
quadrants. (cid:4)
In the case of di = k, the grid component Xi is the same as the top left quadrant of
Gi. Por lo tanto, the analysis provided in the proof of Lemma 2 can also be used to prove
the next lemma.
Lema 3
Within a tree layout TM of M, we can organize all vertices of Xi, for any i ∈ [norte], into a
subtree TXi of TM in such a way that there is an upper bound of 4k for the width of any
arc internal to TXi and for the width of any arc connecting TXi to the remaining nodes
of TM.
The tree layout for Xi is a long chain, isomorphic to the tree layout of the upper
left quadrant of Gi in Figure 12. In what follows, we refer to the node of the layout
immediately above x(i)
as bottom-
2di,2di
mayoría. We can now prove the correctness of our reduction for one direction.
1,1 as top-most, and the node immediately above x(i)
Lema 4
If G has a tree layout T of carving width k, then M has a tree layout TM of carving
width 4k.
Prueba
Each leaf node of T is associated with a vertex i of G. The main idea of this proof is
to construct a tree layout TM by using T as the top level of TM, and by attaching a
227
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
tree layout for each grid Xi and Gi under the node of T associated with vertex i. As an
ejemplo, for the cyclic permutation multigraph M of Figure 11, this will provide the tree
layout schematically depicted in Figure 7.
For each i ∈ [norte], we use the tree layout T(Gi) from Lemma 2 and connect it to the
tree layout T(Xi) from Lemma 3. The connection arc is created from the bottom-most
internal node of subtree TUL of T(Gi), to the top-most node of T(Xi). The tree layout
t(Xi) is in turn connected to the tree layout T for G. The connection is established by
merging the bottom-most internal node of T(Xi) and the leaf node of T associated with
vertex i of G.
From Lemma 2 and Lemma 3, we have that any of the trees T(Gi) and T(Xi) tener
width at most 4k. The edges of M that are routed through the arc connecting T(Gi) a
t(Xi) are exactly the 4di ≤ 4k interconnection edges between Gi and Xi. The edges of
M routed through the connection between T(Xi) and T are the 4di ≤ 4k interconnection
edges between Xi and all the Xj for j a neighbor of vertex i in G. Finalmente, the top-level
subtree T within TM also has carving width at most 4k. To see this, observe that for each
borde (i, j) routed by an arc a of the tree layout T of G, the same arc a in (the copy of) t
used as a subtree of the tree layout TM routes two red and two green edges connecting
components Xi and Xj of M. Because the carving width of the tree layout T of G is k, nosotros
conclude that the carving width of the tree layout T within TM is 4k. (cid:4)
We now deal with the other direction in the correctness of our reduction. We must
show that if M has a tree layout of carving width 4k, then G has a tree layout of carving
width k. This will be Lemma 8, for which we will need to develop a few intermediate
lemmas. We introduce our first intermediate lemma by means of a simple example.
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ejemplo 7
Consider the tree layout of grid Gi, i ∈ [norte], that has been depicted in Figure 12 as a
subtree of a tree layout TM of M. Let a be the arc that connects the bottom right quadrant
of the grid to the remainder of the layout. Using the notation in the proof of Lemma 2,
arc a can be written as (nR, nBR), with nBR the top-most node of linear tree TBR. Observe
that there are 4k edges of M routed through arc a (En figura 12 we have k = 3). These are
the 2k red edges that connect the bottom right quadrant with the bottom left quadrant
of Gi, and the 2k green edges that connect the bottom right quadrant with the top right
quadrant. These 4k edges of M are all internal to grid Gi, meaning that they connect
vertices within Gi. Finalmente, observe the subtree rooted at node nBR contains 4k2 vertices
from Gi.
The special properties of arc a that we have mentioned here are not dependent on
the choice of the tree layout TM. In fact the next lemma shows that, for every choice of
a tree layout of M having carving width 4k, and for every grid Gi in M, i ∈ [norte], allá
always exists an arc that satisfies the properties of Example 7. Intuitivamente, this happens
because the vertices in a grid have a relatively high degree of interconnections, y
when we pick up a sufficiently large set of vertices of a grid Gi, we have a large number
of edges connecting the vertices in our set to the remaining vertices of Gi. This in turn
means that, in any tree layout TM, any subtree Ti containing a sufficiently large set of
vertices of Gi must route a large number of edges from Gi through the arc a that connects
Ti to the rest of TM. If TM has bounded carving width, the edges routed through arc a
saturate the width of this arc, making it impossible for other vertices not from Gi to be
placed within Ti. En otras palabras, there is always some core set of vertices from Gi that
is placed in some subtree of a tree layout of M, and this subtree must exclude vertices
from other grids.
228
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
Lema 5
Let TM be a tree layout of M having carving width 4k. For each vertex i of G, there exists
an arc ai of TM satisfying all of the following properties:
(i)
(ii)
(iii)
(iv)
The width of ai is 4k.
The edges of M routed through ai are all internal to grid Gi.
One of the two subtrees obtained by removing ai from TM contains only
vertices internal to Gi; we call this the subtree below ai.
The subtree below ai contains at least 4k2 vertices from Gi.
Prueba
For a subtree T of TM, let Li(t) be the number of vertices of Gi that are at the leaves of
t. Consider an arc a of TM, and let T1 and T2 be the two subtrees of TM obtained by
removing a from TM; this is exemplified in Figure 13. We say that a is balanced for Gi
if the choice of a minimizes |li(T1) − Li(T2)|.
The following argument is a standard one, see for instance Kozawa, Otachi, y
Yamazaki (2010, Lema 3.1). Assume that a is a balanced arc in TM. Dejar |Gi| = (4k)2
be the number of vertices of Gi. Without loss of generality, we assume that Li(T1) ≥
li(T2). Porque |Gi| = Li(T1) + li(T2), this implies Li(T2) ≤ 1
2 |Gi|. Porque |Gi| ≥ 16, nosotros
must have Li(T1) > 1, otherwise a would not be balanced. Then the root of T1 must
have two children, rooting subtrees T(cid:48)
1 of T1; see again Figure 13. We must have
li(t(cid:48)
1 ) ≤ Li(T2), otherwise a would not be balanced. This in turn
implies Li(T2) ≥ 1
3 |Gi|. To summarize these inequalities, tenemos
1) ≤ Li(T2) and Li(t(cid:48)(cid:48)
1 and T(cid:48)(cid:48)
1
3
|Gi| ≤ Li(T2) ≤ 1
2
|Gi|
(6)
For each vertex i of G, we now choose ai in the statement of the lemma to be a balanced
arc for Gi. In the following discussion we focus on an arbitrary choice of i ∈ [norte], y
again we let T1 and T2 be the two subtrees of TM obtained by removing ai from TM, con
li(T1) ≥ Li(T2), so that we can use Equation (6).
a
T1
T2
t(cid:48)
1
t(cid:48)(cid:48)
1
Cifra 13
Definition of balanced arc a of TM for some grid Gi. Trees T1 and T2 are the two subtrees of TM
obtained by removing a from TM.
229
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
/
C
oh
yo
i
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
C
oh
yo
i
_
a
_
0
0
2
4
6
pag
d
.
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Ligüística computacional
Volumen 42, Número 2
Let S be a grid of size 4k × 4k, eso es, a square grid with |Gi| nodos. It is known
eso, for any choice of s nodes from S with 1
2 |Gi|, there are at least 4k edges
of S connecting the chosen nodes to nodes in the complement set. For this claim, ver
for instance Ahlswede and Bezrukov (1995) or Rolim, S ´ykora, and Vrt’o (1995). Este
claim must also be true for the edges internal to Gi, since Gi embeds the grid S. De
Ecuación (6) we then have that at least 4k edges internal to Gi are routed through ai.
Además, since TM has carving width 4k, the width of ai cannot exceed 4k. Nosotros entonces
conclude that conditions (i) y (ii) are both satisfied.
3 |Gi| ≤ s ≤ 1
Let Ni be the set of vertices of M that are not vertices of Gi. It is easy to see from the
specification in Section 4.2 that the sub-multigraph of M induced by Ni is connected.
(This is the very reason why we introduced the components Xi in our construction of
METRO.) This implies that for any pair of vertices in Ni, there exists a path in M connecting
the two vertices that is entirely composed of edges that are not internal to Gi. If both
T1 and T2 have some vertex from Ni, there would be an edge not internal to Gi routed
through ai, in contrast to condition (ii). We therefore conclude that one among T1 and T2
contains only vertices internal to Gi, satisfying condition (iii). This subtree is called the
subtree below ai.
From Equation (6) and by our choice of T1, tenemos 1
3 |Gi| ≤ Li(T2) ≤ Li(T1). Fur-
thermore, 4k2 < 1
3 |Gi|, and hence both T1 and T2 have at least 4k2 vertices from Gi.
Regardless of which of T1 or T2 is below ai, we conclude that condition (iv) is satisfied.
(cid:4)
Consider a square grid Q of size 4k × 4k, and let d be some fixed integer with d ∈ [k].
Assume that 2d vertices in the left-most column of Q and 2d vertices in the top-most row
of Q have been selected as connector vertices. Observe that the number of connector
vertices in Q is either 4d − 1 or 4d, depending on whether the vertex at the top-left
corner of Q is selected twice or not. Let S be any subset of the vertices of Q. Recall from
Section 3 that the edge boundary of S in Q, written ∂Q(S), is the set of edges of Q that
connect vertices in S and vertices in S, the complement set of S.
Lemma 6
Let Q be a square grid of size 4k × 4k and let d ∈ [k]. Assume 2d connector vertices in the
left-most column and 2d connector vertices in the top-most row of Q. Let also S be a set
of vertices of Q such that |S| ≥ 4k2 and S does not contain any connector vertex. Then
|∂Q(S)| ≥ 4d.
Proof
We distinguish three cases in the following, depending on whether the set S contains
any entire column and/or any entire row of Q.
Case 1: S contains at least one entire column of Q and at least one entire row
of Q. (This part of the proof is adapted from Kozawa, Otachi, and Yamazaki [2010,
Propositions 4.4, 4.5, and 4.7].) Let r be the number of rows of Q that have at least one
vertex in S. Within each such row there must be a vertex not in S, since at least one entire
column of Q is not in S. This means that at least one edge of this row is in the set ∂Q(S),
for a total of r edges in ∂Q(S). A similar argument applies to the number of columns c of
Q that have at least one vertex in S.
Because rows and columns have disjoint sets of edges, we have |∂Q(S)| ≥ r + c. It
is well known that the arithmetic mean (r + c)/2 is always larger than or equal to the
4k2 =
geometric mean
4k ≥ 4d.
rc. Because rc ≥ |S|, we can write |∂Q(S)| ≥ 2
(cid:112)|S| ≥ 2
rc ≥ 2
√
√
√
230
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
Case 2: S contains at least one entire column of Q but no entire row, or else S contains
at least one entire row of Q but no entire column. In the first case we have r = 4k and,
as explained in Case 1, 4k ≥ 4d edges in the set ∂Q(S). Symmetrically, in the second case
we have c = 4k and thus 4k ≥ 4d edges in ∂Q(S).
Case 3: S does not contain any entire column of Q, nor any entire row of Q. In this
case each row of Q has at least one vertex in S. Because the connector vertices are all
contained in the set S, each of the 2d rows of Q that have a connector vertex contributes
at least one edge to the set ∂Q(S). A symmetrical argument applies to the 2d columns
of Q that have a connector vertex. Again, because the rows and the columns of Q have
disjoint edges, we conclude that |∂Q(S)| ≥ 4d. (cid:4)
Let TM be any tree layout of M with carving width 4k. In Lemma 5 we associated
each vertex i of G with an arc ai of TM having some specific properties. Similarly, our
next lemma associates each edge (i, j) of G with a set of four paths in TM. These paths
are routed through ai and through aj, and do not share any of their edges. Using this
property, we will later be able (in Lemma 8) to derive the carving width of G from the
carving width of M. We use a simple example here to illustrate the special paths we are
looking for.
Example 8
Consider once more the grid Gi and its subtree layout depicted in Figure 12, where
we have di = k = 3. We report Gi in Figures 14 and 15, where for convenience we have
ignored the associated tree layout. As in Example (7), consider the arc in the linear
layout that connects the bottom right quadrant of Gi to the remainder of the layout.
Here we denote this arc as ai. There are di = 3 neighbors of i in G, and we need to
associate four paths with each neighbor, for a total of 12 paths routed through ai. In
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
231
Figure 14
Paths starting with the red edges routed through arc ai of TM. These paths and the paths in
Figure 15 do not share any of their edges.
Computational Linguistics
Volume 42, Number 2
Figure 15
Paths starting with the green edges routed through arc ai of TM. These paths and the paths in
Figure 14 do not share any of their edges.
what follows we focus our attention on the first segment of each of these 12 paths,
more precisely, the segment of these paths that starts at an edge of M routed through ai
and ends at some edge of M that leaves Gi to reach grid Xi.
In Figure 14 we outline the first segment of six paths of M routed through arc ai,
starting with the red edges that connect the bottom right quadrant with the bottom left
quadrant of Gi. These paths reach the six vertices in the left-most column of Gi that are
connected with the grid Xi and will eventually reach the grids Xj and Gj, for each vertex
j that is a neighbor of i in G. Similarly, in Figure 15 we outline six more paths of M routed
through arc ai, starting with the green edges that connect the bottom right quadrant with
the top right quadrant of Gi. These paths reach the six vertices in the top-most row of Gi
that will eventually reach the grids Xj and Gj for each neighbor j of i.
Finally, observe that these 12 paths routed through ai share some of their vertices in
the top left quadrant of Gi but do not share any of their edges. To see this, consider the
diagonal of the top left quadrant from the bottom left corner to the upper right corner.
Notice then that the paths in Figure 14 use the vertical edges of Gi below the diagonal,
and the paths in Figure 15 use the horizontal edges below the diagonal. Similarly,
the paths in Figure 14 use the horizontal edges above the diagonal, and the paths in
Figure 15 use the vertical edges above the diagonal.
The next lemma generalizes the previous example, showing that for any tree layout
TM we can associate each edge (i, j) of G with a set of four paths in M that are edge
disjoint. We need to introduce some auxiliary notation. Let TM and ai, i ∈ [n], be defined
as in Lemma 5. Consider an arc (i, j) in the source graph G. A path in M is called an
(i, j)-path for TM if it starts at a vertex inside the subtree below ai (see Lemma 5(iii) for
the definition of this subtree) and it ends at a vertex inside the subtree below aj. We also
say that two paths in M are edge disjoint if they do not share any edge (they might
however share some vertex).
232
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
Lemma 7
Let TM be a tree layout of M having carving width 4k. There exists a set P of paths in M
satisfying both of the following properties:
(i)
For every edge (i, j) in the source graph G, P contains four (i, j)-paths for
TM.
(ii)
Every two paths in P are edge-disjoint.
Proof
We call connector edges the edges of M that are not internal to any component Gi,
i ∈ [n]. We call connector vertices those vertices of Gi, i ∈ [n], that have an impinging
connector edge. In this way, each Gi has 4di − 1 connector vertices. These are placed in
the left-most column of Gi and in the top-most row of Gi. We claim that, for each i ∈ [n],
the connector vertices of Gi must all be outside of the subtree of TM below ai. To see this,
let us assume that some connector vertex of Gi is found inside the tree below ai. Because
connector vertices are connected to vertices outside of Gi, there would be a vertex not
internal to Gi inside the tree below ai, against Lemma 5(iii), or else there would be a
connector edge routed through ai, against Lemma 5(ii). We therefore conclude that the
connector vertices of Gi must all be outside of the tree below ai.
The idea underlying the proof is to define each (i, j)-path in P as the concatenation
of three sub-paths, called segments, specified as follows.
(cid:114)
(cid:114)
(cid:114)
The first segment starts with an edge routed through arc ai. By definition,
one of the two end vertices of such an edge is inside the subtree below ai,
and this is also the starting vertex of the segment. Furthermore, the
segment only uses edges that are internal to Gi, and ends with a connector
vertex of Gi.
The second segment only uses connector edges, and ends with a connector
vertex of Gj.
The third segment only uses edges internal to Gj, and ends with an edge
routed through arc aj. The end vertex of such an edge that is inside the
subtree below aj is the end vertex of the segment.
From now on, we focus on a specific vertex i of G and prove the existence of 4di edge
disjoint (i, j)-paths for vertices j that are neighbors of i in G. We do this by separately
specifying the three segments of the paths.
First segment. From Lemma 5(ii), there are 4k edges internal to Gi that are routed
through arc ai. We know that k ≥ di, since the carving width of a graph is always at
least its degree. Thus we start our segments with any choice of 4di disjoint edges routed
through arc ai. We now show that these edges can be extended to 4di segments that reach
the 4di − 1 connector vertices of Gi, without sharing any of their edges. We do this by
reducing our problem to a network flow problem.
If we assume that edges of Gi have flow capacity of one, each segment corresponds
to a flow through Gi having capacity one. The existence of 4di edge disjoint segments
is then equivalent to the existence of a flow of capacity 4di from the subtree below ai to
the 4di − 1 connector vertices of Gi. Let V(Gi) be the set of vertices of Gi. The max-flow
min-cut theorem states that this flow can be realized if and only if any cut of Gi with the
233
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Computational Linguistics
Volume 42, Number 2
source vertices on one side and the target vertices on the other side has capacity not less
than 4di.
In our specific case, consider the square grid Q underlying component Gi. Let S ⊆
V(Gi) be a vertex set including all the vertices in the subtree below ai and excluding all
the 4di − 1 connector vertices of Gi. If we can show the relation
|∂Q(S)| ≥ 4di
(7)
for any choice of S as before, then we have proved the existence of the desired 4di edge
disjoint segments.
In order to prove Equation (7), we observe that the set of vertices in the subtree
below ai has size greater than or equal to 4k2, by Lemma 5(iv). Furthermore, observe
that there are 2di connector vertices of Gi in the left-most column of Q as well as 2di
connector vertices in the top-most row. Under these conditions, we can apply Lemma 6
to the grid Q, with d = di ≤ k, and derive Equation (7). We then conclude that there exist
4di edge disjoint first segments starting at vertices in the subtree below ai, as desired.
Second segment. As already observed in Section 4.2.2, for every connector vertex in
the left-most column of Gi there is a red segment to some connector vertex in the left-
most column of Gj, where j is some neighbor of i in G. Symmetrically, for every connector
vertex in the top-most row of Gi there is a green segment to some connector vertex in
the top-most row of some Gj. This provides a total of 4di segments. These segments are
all edge disjoint, by construction of the components Xh, h ∈ [n], and by construction of
the interconnection edges.
Third segment. We observe that the third segment of an (i, j)-path is the reversal of the
first segment of a ( j, i)-path. Therefore we can use our argument for the first segments to
show the existence of 4di edge disjoint third segments ending at a vertex in the subtree
below aj.
To summarize, for each i ∈ [n] we have shown the existence of 4di edge disjoint
(i, j)-paths for vertices j that are neighbors of i in G, to be added to set P. To complete
the proof, we observe that two paths in P having disjoint start vertices must have edge
disjoint first segments, since these segments are defined for edges that are internal to
different components Gi. A similar argument applies to the third segments of paths
having disjoint end vertices. (cid:4)
Lemma 8
If M has a tree layout TM of width at most 4k, then G has a tree layout T of width at
most k.
Proof
For each i ∈ [n], let arc ai and the subtree below ai be defined as in Lemma 5. We denote
by r(ai) the root of the subtree below ai. The tree layout T of G is constructed from TM
as follows.
For each i ∈ [n], we prune from TM the subtree below ai, that is, we remove
from TM all the arcs of this subtree and all of its nodes but r(ai).
For every i, j ∈ [n] with i (cid:54)= j, consider the unique path in TM that joins
nodes r(ai) and r(aj). We remove from TM all nodes that are not in any such
path, along with the arcs impinging on the removed nodes.
(cid:114)
(cid:114)
234
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
(cid:114)
For any of the remaining nodes of TM with degree 2, we remove the node
and “merge” the impinging arcs into a new arc.
Observe that T has exactly n leaves, represented by the nodes r(ai), i ∈ [n]. We then
place each vertex i of G at node r(ai). We show now that this tree layout of G has width
no larger than k.
Let a be an arbitrary arc of T. Let aM be the arc of TM equal to a, if aM has been
preserved in the construction of T; otherwise, let aM be an arbitrary arc of TM that has
been merged into a (in the third bullet of the construction). Assume that wa is the width
of a. According to the definition of width of an arc, there are exactly wa edges (i, j) in G,
such that the paths in T joining nodes r(ai) and r(aj) share arc a. We denote by Sa this set
of edges.
Lemma 7 states that for each edge (i, j) ∈ Sa, there are 4 (i, j)-paths through M, for
a total of 4wa paths through M that are pairwise edge-disjoint. From the construction
of T, we know that arc aM must be in the path within TM joining nodes r(ai) and r(aj).
If we remove aM from TM, nodes r(ai) and r(aj) are no longer connected, since paths
are unique in a tree. This implies that, in the tree layout TM, our 4wa paths through
M associated with the edges in Sa must all be routed through arc aM. Because these
paths are pairwise edge-disjoint, we conclude that there must be 4wa distinct edges of
M routed through aM. Because in our tree layout the width of aM is at most 4k, we can
write 4wa ≤ 4k and hence wa ≤ k. Since arc a was chosen arbitrarily, this concludes our
proof. (cid:4)
We can now provide the main result of this section.
Theorem 1
Let M be a cyclic permutation multigraph and let k ≥ 1 be some integer. The problem of
deciding whether M has carving width less than or equal to k is NP-complete.
Proof
For the hardness part, we reduce from the problem of deciding whether a (stan-
dard) graph G has carving width less than or equal to k, which is an NP-complete
problem, as already mentioned. We construct the cyclic permutation multigraph M
from G as in Section 4.2. It is not difficult to see that the construction can be car-
ried out in time O(nk2 + e), where n and e are the number of vertices and edges,
respectively, in G. Because we can assume k ≤ e, we have that the reduction takes
polynomial time in the size of the input. By combining Lemmas 4 and 8, we have
that G has carving width k if and only if M has carving width 4k. This completes our
reduction.
For the completeness part, given a cyclic permutation multigraph M, we can guess
a tree layout T, and accept if the width of T is less than or equal to k. All of this
computation can be carried out in linear time. (cid:4)
4.4 NP-Completeness of Fan-out
We can now present the main result on the optimization of the fan-out of parsing
strategies for an SCFG rule.
235
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Computational Linguistics
Volume 42, Number 2
Theorem 2
Let s be a synchronous rule and let k ≥ 1 be some integer. The problem of deciding
whether there exists a binary parsing strategy for s with fan-out less than or equal to k
is NP-complete.
Proof
This directly follows from Lemma 1 and from Theorem 1. (cid:4)
5. Time Complexity
In this section, we address the problem of finding the parsing strategy with the optimal
time complexity. The time complexity of one combination step in a parsing strategy is
determined by the total number of variables involved in the step, which is the total
number of distinct endpoints of the spans being combined. Time complexity can differ
from space complexity due to the fact that the time complexity takes into account which
endpoints are shared in the two groups of spans being combined.
Time complexity of general parsing problems can be analyzed using the graph-
theoretic concept of treewidth, which we now proceed to define precisely. A tree de-
composition of a graph G = (V, E) is a type of tree having a subset of G’s vertices at
each node. We define the nodes of this tree T to be the set I, and its edges to be the set
F. The subset of V associated with node i of T is denoted by Xi. A tree decomposition is
therefore defined as a pair ({Xi | i ∈ I}, T = (I, F)), where each Xi, i ∈ I, is a subset of V,
and tree T has the following properties
(cid:114)
(cid:114)
(cid:114)
Vertex cover: The nodes of the tree T cover all the vertices of G: (cid:83)
i∈I Xi = V.
Edge cover: Each edge in G is included in some node of T. That is, for all
edges (u, v) ∈ E, there exists an i ∈ I with u, v ∈ Xi.
Running intersection: The nodes of T containing a given vertex of G form a
connected subtree. Mathematically, for all i, j, k ∈ I, if j is on the (unique)
path from i to k in T, then Xi
(cid:84) Xk ⊆ Xj.
Figure 16
A tree decomposition of a graph is a set of overlapping clusters of the graph’s vertices, arranged
in a tree. This example has treewidth 2.
236
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
The treewidth of a tree decomposition ({Xi}, T) is maxi |Xi| − 1. The treewidth of a
graph is the minimum treewidth over all tree decompositions
tw(G) =
min
({Xi},T)∈TD(G)
max
i
|Xi| − 1
where TD(G) is the set of valid tree decompositions of G. We refer to a tree decomposi-
tion achieving the minimum possible treewidth as being optimal.
In general, more densely interconnected graphs have higher treewidth. Any tree has
treewidth 1, a graph consisting of one large cycle has treewidth 2, and a fully connected
graph of n vertices has treewidth n − 1. Low treewidth indicates some treelike structure
in the graph, as shown by the example with treewidth 2 in Figure 16. As an example
of the running intersection property, note that the vertex N appears in three adjacent
nodes of the tree decomposition. Finding the treewidth of a graph is an NP-complete
problem (Arnborg, Corneil, and Proskurowski 1987).
We use a representation known as a dependency graph to analyze the time com-
plexity of parsing for a given SCFG rule. Dependency graphs (known under a large
number of different names) are used to represent the interaction between variables in
constraint satisfaction problems, and have one vertex for each variable, and an edge
between vertices representing any two variables that appear together in a constraint. In
parsing problems, the variables consist of indices into the string being parsed, and the
constraints require that a nonterminal on the right-hand side of a rule has previously
been identified. Thus, the graph has a vertex for each endpoint involved in the rule, and
a clique for each nonterminal (including the left-hand side nonterminal) connecting all
its endpoints.
Example 9
The dependency graph for a CFG rule with n nonterminals, including the left-hand
side nonterminal, is a cycle of n vertices. For instance, the CFG rule S → A B C D is
shown in the left part of Figure 17. The five vertices represent the possible endpoints
that need to be considered when parsing any instantiation of the rule, and each arc is a
nonterminal appearing in the rule, including the left-hand side nonterminal. In this way,
each nonterminal has two endpoints, and adjacent nonterminals share one endpoint.
In particular, the left-hand side nonterminal shares one endpoint with the left-most
y0
x0
x1
x2
x3
x4
x0
y1
x1
y2
x2
y3
x3
y4
x4
Figure 17
Dependency graph for the CFG rule S → A B C D (left) and the SCFG rule
[S → A 1 B 2 C 3 D 4 , S → B 2 D 4 A 1 C 3 ] (right).
237
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Computational Linguistics
Volume 42, Number 2
Figure 18
Gadget grid Gi in the construction of cyclic permutation multigraph M, assuming di = 3.
nonterminal in the right-hand side and one endpoint with the right-most nonterminal
in the right-hand side.
In the case of an SCFG rule, each nonterminal has four endpoints, two on the
English side and two on the Chinese side. If the rule has n linked nonterminals, in-
cluding the left-hand side linked nonterminal, the dependency graph consists of 2n
vertices and n cliques of size four. For instance, the SCFG rule [S → A 1 B 2 C 3 D 4 , S →
B 2 D 4 A 1 C 3 ] is shown in the right part of Figure 17.
A tree decomposition of the dependency graph corresponds directly to a pars-
ing strategy, and the treewidth of the graph plus one is the exponent in the time
complexity of the optimal parsing strategy (Gildea 2011). Each cluster of vertices in
the tree decomposition corresponds to a combination step in a parsing strategy. The
running intersection property of a tree decomposition ensures that each endpoint in
the parsing rule has a consistent value at each step. Treewidth depends on the number
of vertices in the largest cluster of the tree decomposition, which in turn determines
the largest number of endpoints involved in any combination step of the parsing
strategy.
Our hardness result in this section is a reduction from treewidth of general graphs.
Given an input graph G to the treewidth problem, we construct an SCFG rule using
techniques similar to those in the previous section. We then show that finding the pars-
ing strategy with optimal time complexity for this rule would imply an approximation
bound on the treewidth of the original graph G.
The precise construction proceeds as follows. Given an instance of the treewidth
problem consisting of a graph G and integer k, we construct a cyclic permutation
multigraph M as in the previous section, with the exception that the grid Gi will contain
only 2di rows and 2di columns. This grid, shown in Figure 18, is a simplified version of
the grid used in Figure 10. Furthermore, there are no gadgets Xi in M, but rather, for
each edge (i, j) in G, there are a total of four edges in M connecting Gi directly to Gj. We
refer to these edges as connector edges, and we refer to edges internal to some Gi as
internal edges.
We construct a dependency graph D by replacing each edge (i, j) in M with a vertex
v(i, j) in D, and adding edges (v(i, j), v(i, k)) to D connecting all vertices in D that derive
from adjacent edges in M, as shown in Figure 19. As is required for the dependency
graph representation, this gives us a graph with one vertex corresponding to each
boundary between adjacent nonterminals in the SCFG rule and a clique connecting the
238
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
Figure 19
Grid Di of dependency graph D. Connector vertices are in the left-most column and top-most
row, and have edges, shown as dashed lines, internal to other grids Dj.
vertices for the four boundary points of each nonterminal (two boundaries in English
and two in Chinese).
We refer to vertices in D derived from connector edges in M as connector vertices,
and we refer to vertices in D derived from internal edges in M as internal vertices.
We refer to the subgraph of D derived from vertex i of G as Di. To be precise, Di
consists of connector vertices and internal vertices derived from edges of M incident
on vertices of Gi, and the edges of D connecting these vertices. We refer to the vertices
of Di as red or green, depending on whether they are derived from red or green edges
in M.
We now show that any dependency graph D constructed from a cyclic permutation
multigraph M is the dependency graph for some SCFG rule r. Interpret an arbitrary
vertex in M as the left-hand side nonterminal. Each green edge represents a boundary in
English shared between two nonterminals, including the boundary shared between the
first right-hand side nonterminal and the left-hand side nonterminal, and the boundary
shared between the last right-hand side nonterminal and the left-hand side nonterminal.
Similarly, each red edge represents the boundary between two nonterminals in Chinese.
We then replace each edge in M with a vertex in D, and replace each vertex in M with a
clique in D. D is the dependency graph for an SCFG rule, because it has a vertex for each
boundary, and a clique that connects the boundaries of each nonterminal (including the
left-hand side nonterminal).
With this construction in place, we can prove the main result of this section.
Theorem 3
A polynomial time algorithm for finding the parsing strategy of an SCFG rule having the
lowest time complexity would imply a polynomial time constant-factor approximation
algorithm for the treewidth of graphs of fixed degree.
239
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Computational Linguistics
Volume 42, Number 2
Proof
Let T be a tree decomposition of G having treewidth k − 1, and let ∆(G) be the maximum
degree of a vertex of G. Let M and D be the cyclic permutation multigraph and the
dependency graph, respectively, constructed from G as previously specified. We can
construct a tree decomposition TD of D having treewidth 4k∆(G) − 1 by replacing each
occurrence of vertex i in T with the 4di connector vertices in D that are derived from
the 4di connector edges in M that are adjacent to Gi. From one of the nodes of TD
containing the connector vertices of i, we attach a subtree containing the remaining
vertices of Di. This subtree has treewidth 4di − 1, and can be constructed in a linear chain
adding vertices from Di one at a time. That is, we process the nodes in Di in column-
major order, at each step constructing a new node in the tree decomposition of D by
adding one vertex of Di, either red or green, and removing the corresponding vertex
from the previous red or green column respectively. In sum, if k − 1 = OPT = tw(G),
then
tw(D) < 4∆(G)(OPT + 1)
Suppose that we have an algorithm for finding the treewidth of dependency graphs
derived from SCFG rules. Given a tree decomposition TD of D of treewidth k(cid:48) − 1, a
valid tree decomposition TG of G having treewidth at most 2k(cid:48) − 1 can be constructed
by replacing each connector vertex in TD with the corresponding two vertices i and j of
G, and replacing occurrences of internal vertices of D with the corresponding vertex i of
G. This is a valid tree decomposition of G, because of the following.
(cid:114)
(cid:114)
For any edge (i, j) in G, there is a node in TD containing the corresponding
connector vertex in D, and hence a node in TG containing both i and j.
Any two vertices u and v in D that are associated with i in G (either as
connector vertices or as internal vertices) are connected by a path in D
containing only vertices associated with i. Hence, the path between any
two occurrences of i in TG contains nodes that also contain i, guaranteeing
the running intersection property of TG.
Thus, by constructing D from G, finding an optimal tree decomposition of D, and then
translating the result into a tree decomposition of G, we have a tree decomposition of G
having width SOL where SOL ≤ 2tw(D). Combining the results, we have
SOL < 8∆(G)(OPT + 1)
Therefore, an algorithm for minimizing the treewidth of dependency graphs of SCFG
rules would imply the existence of a constant-factor approximation algorithm for the
treewidth of graphs of fixed degree. (cid:4)
The existence of a polynomial time constant-factor approximation algorithm for
the treewidth of graphs of fixed degree is a longstanding open problem. Therefore
Theorem 3 suggests that designing a polynomial time algorithm for finding the parsing
strategy of an SCFG rule having optimal time complexity is a difficult task.
240
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
In the other direction, a proof of hardness for finding the optimal time complexity
for SCFG parsing would also require progress on a longstanding open problem, as
stated by the following theorem.
Theorem 4
If it is NP-hard to find the parsing strategy with optimal time complexity for an SCFG
rule, then it is also NP-hard to find the treewidth of graphs of degree 6.
Proof
Dependency graphs constructed from an SCFG rule have degree at most 6. This is
because each vertex in the graph is a member of two cliques corresponding to the two
nonterminals meeting at a given boundary point in the rule. Each clique has three other
vertices, for a total degree of 6. Any NP-hardness result for dependency graph of SCFG
rules would imply NP-hardness for general graphs of degree 6. (cid:4)
6. Conclusion
In the context of machine translation, the problem of synchronous parsing with an
SCFG corresponds to the problem of analyzing parallel text into grammar derivations,
often as part of learning a translation model with Expectation Maximization or related
algorithms. The synchronous parsing problem also applies to decoding with an SCFG
and an integrated n-gram language model.
Parsing with an SCFG requires time (and space) polynomials in the sentence length,
but with the degree of the polynomial depending on the specific grammar. A loose
time upper bound obtained using dynamic programming techniques is O(n2r+2), where
n is the sentence length and r is the maximum rule length, that is, the maximum
number of linked nonterminals in the right-hand side of a rule. Gildea and ˇStefankoviˇc
(2007) show a lower bound of Ω(ncr) for some constant c, meaning that the exponent
grows linearly with the rule length. In this article, we show that even finding this
exponent is itself computationally difficult. For space complexity, finding the best
parsing strategy is NP-hard. For time complexity, finding the best parsing strategy
would require progress on approximation algorithms for the treewidth of general
graphs.
To our knowledge, the use of the notion of carving width of a graph in connection
with the space complexity of dynamic programming parsing is novel to this article. In
previous work, Crescenzi et al. (2015) used the notion of graph cutwidth to investigate
space complexity for SCFGs restricted to linear parsing strategies, that is, strategies that
add only one nonterminal at a time. However, graph cutwidth does not extend to the
general binary parsing strategies that we consider in this article.
The connection of space complexity to carving width means that we can use
existing linear-time algorithms for computing the tree layout of graphs of bounded
carving width (Thilikos, Serna, and Bodlaender 2000), and approximation algorithms
for carving width such as the O(log n)-factor approximation algorithm of Khuller,
Raghavachari, and Young (1994). The connection of time complexity to treewidth means
that we can apply algorithms for treewidth of general graphs that tend to find op-
timal results quickly in practice, such as the branch-and-bound algorithm of Gogate
and Dechter (2004). We can also apply linear time algorithms for graphs of bounded
241
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Computational Linguistics
Volume 42, Number 2
treewidth (Bodlaender 1996), and approximation algorithms such as the
approximation algorithm of Feige, Hajiaghayi, and Lee (2005).
(cid:112)
log(k)-factor
Because SCFG is an instance of the general class of linear context-free rewriting
systems (LCFRSs) (Vijay-Shankar, Weir, and Joshi 1987), our hardness results also apply
to the latter class. Therefore our results generalize those of Crescenzi et al. (2011), who
show NP-hardness for optimizing space and time complexity for LCFRSs, again with
the restriction for linear parsing strategies.
Viewing SCFG as an instance of LCFRS also allows connecting the parsing op-
timization problems investigated in this article to the rule factorization problem for
SCFGs, investigated by Huang et al. (2009). Rule factorization is the problem of replac-
ing an individual rule with a large size with several rules of smaller size, in a way
that the language generated by the overall grammar is preserved. (The algorithm for
casting a CFG in Chomsky normal form is an example of rule factorization.) As already
mentioned in the Introduction, it is known that SCFGs do not admit any canonical
binary form (Aho and Ullman 1972). This means that, when we attempt to factorize
a SCFG rule into “smaller pieces,” some of the resulting pieces might no longer be
SCFG rules, because they span more than two substrings; see again Figure 3. However,
when viewing a SCFG as a LCFRS, factorization is always possible, and the parsing
strategy trees defined in Section 2.3 directly provide a binary factorization of a SCFG
into a LCFRS. Under this view, the problems investigated in this article are related to
the problem of detecting rule factorizations for certain LCFRS, where we require that
the factorized LCFRS are space or time optimal when used in standard algorithms for
parsing based on dynamic programming. Rule factorization for general LCFRS has been
investigated by G ´omez-Rodr´ıguez et al. (2009) and Gildea (2010).
As a final remark, if a strategy of space or time O(nk) exists for a fixed k, it can be
found in space or time O(rk), respectively, by parsing the SCFG rule’s right-hand itself,
represented as a string of nonterminals, with a grammar representing all possible pars-
ing strategies. Such a grammar can be constructed by instantiating all possible LCFRS
rules of space or time complexity O(nk). If a parse of the SCFG rule’s right-hand side is
found, the resulting parse tree can be used as a parsing strategy for the original rule.
Acknowledgments
We are grateful to Pierluigi Crescenzi for
pointing us in the direction of carving width.
D. Gildea has been partially funded by
NSF award IIS-1446996. G. Satta has been
partially supported by MIUR under project
PRIN No. 2010LYA9RH 006.
References
Ahlswede, R. and S. L. Bezrukov. 1995. Edge
isoperimetric theorems for integer point
arrays. Applied Mathematics Letters,
8(2):75–80.
Aho, Alfred V. and Jeffery D. Ullman. 1969.
Syntax directed translations and the
pushdown assembler. Journal of Computer
and System Sciences, 3:37–56.
Aho, Alfred V. and Jeffery D. Ullman. 1972.
The Theory of Parsing, Translation, and
Compiling, volume 1. Prentice-Hall,
Englewood Cliffs, NJ.
Arnborg, Stefen, Derek G. Corneil, and
Andrzej Proskurowski. 1987. Complexity
of finding embeddings in a k-tree. SIAM
Journal of Algebraic and Discrete Methods,
8:277–284.
Bodlaender, H. L. 1996. A linear time
algorithm for finding tree decompositions
of small treewidth. SIAM Journal on
Computing, 25:1305–1317.
Chiang, David. 2007. Hierarchical
phrase-based translation. Computational
Linguistics, 33(2):201–228.
Crescenzi, Pierluigi, Daniel Gildea, Andrea
Marino, Gianluca Rossi, and Giorgio Satta.
2011. Optimal head-driven parsing
complexity for linear context-free
rewriting systems. In Proceedings of the
49th Annual Meeting of the Association for
Computational Linguistics (ACL-11),
pages 450–459. Portland, OR.
Crescenzi, Pierluigi, Daniel Gildea, Andrea
Marino, Gianluca Rossi, and Giorgio Satta.
242
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Gildea and Satta
Synchronous Context-Free Grammars and Optimal Parsing Strategies
2015. Synchronous context-free grammars
and optimal linear parsing strategies.
Journal of Computer and System Sciences,
81(7):1333–1356.
Feige, Uriel, MohammadTaghi Hajiaghayi,
and James R. Lee. 2005. Improved
approximation algorithms for
minimum-weight vertex separators. In
STOC ’05: Proceedings of the Thirty-Seventh
Annual ACM Symposium on Theory of
Computing, pages 563–572. Baltimore,
MD.
Gildea, Daniel. 2010. Optimal parsing
strategies for linear context-free rewriting
systems. In Proceedings of the 2010 Meeting
of the North American Chapter of the
Association for Computational Linguistics
(NAACL-10), pages 769–776. Los Angeles,
CA.
Gildea, Daniel. 2011. Grammar factorization
by tree decomposition. Computational
Linguistics, 37(1):231–248.
Gildea, Daniel and Daniel ˇStefankoviˇc. 2007.
Worst-case synchronous grammar rules. In
Proceedings of the 2007 Meeting of the North
American Chapter of the Association for
Computational Linguistics (NAACL-07),
pages 147–154. Rochester, NY.
Gogate, Vibhav and Rina Dechter. 2004. A
complete anytime algorithm for treewidth.
In Proceedings of the 20th Conference on
Uncertainty in Artificial Intelligence (UAI),
pages 201–208.
G ´omez-Rodr´ıguez, Carlos, Marco
Kuhlmann, Giorgio Satta, and David Weir.
2009. Optimal reduction of rule length in
linear context-free rewriting systems. In
Proceedings of the 2009 Meeting of the North
American Chapter of the Association for
Computational Linguistics (NAACL-09),
pages 539–547. Boulder, CO.
Huang, Liang, Hao Zhang, Daniel Gildea,
and Kevin Knight. 2009. Binarization of
synchronous context-free grammars.
Computational Linguistics, 35(4):559–595.
Kay, M. 1980. Algorithm schemata
and data structures in syntactic
processing. Technical report CSL-80,
Xerox Palo Alto Research Center,
Palo Alto, CA.
Khuller, Samir, Balaji Raghavachari, and
Neal Young. 1994. Designing
multi-commodity flow trees. Information
Processing Letters, 50:49–55.
Kozawa, Kyohei, Yota Otachi, and Koichi
Yamazaki. 2010. The carving-width of
generalized hypercubes. Discrete
Mathematics, 310(21):2867–2876.
Lewis, II, P. M. and R. E. Stearns. 1968.
Syntax-directed transduction. Journal of the
Association for Computing Machinery,
15(3):465–488.
Rolim, Jos´e, Ondrej S ´ykora, and Imrich Vrt’o.
1995. Optimal cutwidths and bisection
widths of 2- and 3-dimensional meshes. In
Manfred Nagl, editor, Graph-Theoretic
Concepts in Computer Science, volume 1017
of Lecture Notes in Computer Science.
Springer, Berlin Heidelberg,
pages 252–264.
Seki, H., T. Matsumura, M. Fujii, and
T. Kasami. 1991. On multiple context-free
grammars. Theoretical Computer Science,
88:191–229.
Seymour, P. D. and R. Thomas. 1994. Call
routing and the ratcatcher. Combinatorica,
14(2):217–241.
Thilikos, Dimitrios M., Maria J. Serna, and
Hans L. Bodlaender. 2000. Constructive
linear time algorithms for small cutwidth
and carving-width. In Algorithms and
Computation (ISAAC 2000), Taipei.
pages 192–203.
Vijay-Shankar, K., D. L. Weir, and A. K. Joshi.
1987. Characterizing structural
descriptions produced by various
grammatical formalisms. In Proceedings of
the 25th Annual Conference of the Association
for Computational Linguistics (ACL-87),
pages 104–111, Stanford, CA.
243
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
c
o
l
i
/
l
a
r
t
i
c
e
-
p
d
f
/
/
/
/
4
2
2
2
0
7
1
8
0
7
7
9
9
/
c
o
l
i
_
a
_
0
0
2
4
6
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Descargar PDF