Dynastic Potential Crossover Operator
Francisco Chicano
ITIS Software, University of Malaga, Espagne
Gabriela Ochoa
University of Stirling, ROYAUME-UNI
L. Darrell Whitley
Colorado State University, Etats-Unis
Renato Tinós
University of Sao Paulo, Brazil
chicano@lcc.uma.es
gabriela.ochoa@cs.stir.ac.uk
whitley@cs.colostate.edu
rtinos@ffclrp.usp.br
https://doi.org/10.1162/evco_a_00305
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Abstrait
An optimal recombination operator for two-parent solutions provides the best solu-
tion among those that take the value for each variable from one of the parents (gene
transmission property). If the solutions are bit strings, the offspring of an optimal re-
combination operator is optimal in the smallest hyperplane containing the two parent
solutions. Exploring this hyperplane is computationally costly, in general, requiring ex-
ponential time in the worst case. Cependant, when the variable interaction graph of the
objective function is sparse, exploration can be done in polynomial time. In this arti-
clé, we present a recombination operator, called Dynastic Potential Crossover (DPX),
that runs in polynomial time and behaves like an optimal recombination operator for
low-epistasis combinatorial problems. We compare this operator, both theoretically and
experimentally, with traditional crossover operators, like uniform crossover and net-
work crossover, and with two recently defined efficient recombination operators: par-
tition crossover and articulation points partition crossover. The empirical comparison
uses NKQ Landscapes and MAX-SAT instances. DPX outperforms the other crossover
operators in terms of quality of the offspring and provides better results included in a
trajectory and a population-based metaheuristic, but it requires more time and memory
to compute the offspring.
Mots clés
Recombination operator, dynastic potential, gray box optimization.
1
Introduction
Gene transmission (Radcliffe, 1994) is a popular property commonly fulfilled by many
recombination operators for genetic algorithms. When the solutions are represented
by a set of variables taking values from finite alphabets (possibly different from each
other) with no constraints among the variables, this property implies that any variable
in any offspring will take the value of the same variable in one of the parents. Au pair-
particulier, the variables having the same value for both parents will have the same value
in all the offspring (c'est à dire., the respect property [Radcliffe, 1994] is obeyed). The other (dif-
fering) variables will take one of the values coming from a parent solution. The gene
transmission property is a formalization of the idea that taking (good) features from
Manuscript received: 7 Juillet 2019; revised: 31 Août 2020, 12 Août 2021, 18 Novembre 2021, et 27 Novem-
ber 2021; accepted: 29 Novembre 2021.
© 2021 Massachusetts Institute of Technology.
Publié sous Creative Commons
Attribution 4.0 International (CC PAR 4.0) Licence.
Evolutionary Computation 30(3): 409–446
F. Chicano et al.
the parents should produce better offspring. This probably explains why most of the
recombination operators try to fulfill this property or some variant of it. The set of all
the solutions that can be generated by a recombination operator from two parents is
called dynastic potential. If we denote with h(X, oui) the Hamming distance (number of
differing variables) between two solutions x and y, the cardinality of the largest dy-
nastic potential of a recombination operator fulfilling the gene transmission property is
2h(X,oui). The dynastic potential of uniform crossover has this size. The dynastic potential
of single-point crossover has size 2h(X, oui) because it takes from 1 to h(X, oui) consecu-
tive differing bits from any of the two parents. In two-point crossover, two cut points
are chosen. If the first one is before the first differing bit or the second one is after the
last differing bit, the operator behaves like the single-point crossover. Otherwise, le
two cut points are chosen among the h(X, oui) − 1 positions between differing bits and
(cid:2)
h(X,oui)−1
ad-
for each cut the bits can be taken from any of the two parents, providing 2
2
ditional solutions to the single-point crossover dynastic potential. Ainsi, the size of the
(cid:2)
h(X,oui)−1
= h(X, oui)2 − h(X, oui) + 2.
two-point crossover dynastic potential is 2h(X, oui) + 2
2
En général, z-point crossover has a dynastic potential of size (cid:2)(h(X, oui)z) when z is small
compared with the number of decision variables, n.
(cid:3)
(cid:3)
An optimal recombination operator (Eremeev and Kovalenko, 2013) obtains a best off-
spring from the largest dynastic potential of a recombination operator fulfilling the gene
transmission property, which has size 2h(X,oui). In the worst case, such a recombination
operator is computationally expensive because finding a best offspring in the largest
dynastic potential of an NP-hard problem is an NP-hard problem. A proof of the NP-
hardness is given by Eremeev and Kovalenko (2013), but it can also be easily concluded
from the fact that applying an optimal recombination operator to two complementary
solutions (par exemple., all zeroes and all ones) is equivalent to solving the original problem, et
the NP-hardness is derived from the NP-hardness of the original problem.
We propose a recombination operator, named Dynastic Potential Crossover (DPX),
that finds a best offspring of the largest dynastic potential if the objective function f
has low epistasis, c'est, if the number of nonlinear interactions among variables is
petit, typically (cid:2)(n). En particulier, we assume that the objective function f is defined
on n binary variables and has k-bounded epistasis. This means that f can be written as
a sum of m subfunctions f(cid:2), each one depending on at most k variables:
F (X) =
m(cid:4)
(cid:2)=1
F(cid:2)(xi((cid:2),1), xi((cid:2),2), . . . , xi((cid:2),k(cid:2) )),
(1)
where i((cid:2), j ) is the index of the j -th variable in subfunction f(cid:2) and k(cid:2) ≤ k. These func-
tions have been named Mk Landscapes by Whitley et al. (2016).
DPX has a non-negative integer parameter, β, bounding the exploration of the dy-
nastic potential. The precise condition to certify that the complete dynastic potential
has been explored depends on β and will be presented in Section 3.3. The worst-time
complexity of DPX is O(4β (n + m) + n2) for Mk Landscapes.
DPX uses the variable interaction graph of the objective function to simplify the
evaluation of the 2h(X,oui) solutions in the dynastic potential by using dynamic program-
ming. The ideas for this efficient evaluation date back to the basic algorithm by Ham-
mer et al. (1963) for variable elimination and are also commonly used in operations over
Bayesian networks (Bodlaender, 2005). DPX requires more than just the fitness values
of the parent solutions to do the job and, thus, it is framed in the so-called gray box
optimization (Whitley et al., 2016).
410
Evolutionary Computation Volume 30, Nombre 3
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
Recently defined crossover operators similar to DPX are partition crossover (PX)
(Tinós et al., 2015) and articulation points partition crossover (APX) (Chicano et al., 2018).
Although they were proposed to work with pseudo-Boolean functions, they can also
be applied to the more general representation of variables defined over a finite alpha-
bet. We will follow the same approach here. In the rest of the article, we will focus on
a binary representation, where each variable takes value in the set B = {0, 1}. But most
of the results, claims, and the operator itself can be applied to a more general solution
representation, where variables take values from finite alphabets. PX and APX also use
the variable interaction graph of the objective function to efficiently compute a good
offspring among a large number of them. PX and APX produced excellent performance
in different problems (Chicano et al., 2017, 2018; Chen et al., 2018; Tinós et al., 2018).
When combined with other gray-box optimization operators, PX and APX were capa-
ble of optimizing instances with one million variables in seconds. We compare DPX
with these two operators from a theoretical and experimental point of view. The new re-
combination operator is also compared with uniform crossover and network crossover
(Hauschild and Pelikan, 2010), which also uses the variable interaction graph.
This work extends a conference paper (Chicano et al., 2019) where DPX was first
présenté. The new contributions of this journal version are as follows:
• We added a theorem to prove the correctness of the dynamic programming
algorithme.
• We revised the text to clarify some terms, add extended explanations of some
details of the operator and provide more examples.
• We provided new experiments to compare the performance of DPX with the
one of other crossover operators when they are applied to random solutions.
The other crossover operators are uniform crossover, network crossover, par-
tition crossover, and articulations point partition crossover.
• We included DPX in an evolutionary algorithm, to check its behavior in this
algorithme.
• We compared the five crossover operators inside a trajectory-based and a
population-based metaheuristic used to solve two NP-hard pseudo-Boolean
optimization problems: NKQ Landscapes and MAX-SAT.
• We modified the code of DPX to improve its performance and updated the
source code repository1 with the new code of the evolutionary algorithm, le
newly implemented crossover operators used in the comparison, and the al-
gorithms to process the results.
• We added a local optima network analysis (Ochoa and Veerapen, 2018) of DPX
included within an iterated local search framework to better understand its
working principles.
The article is organized as follows. Section 2 presents the required background to
understand how DPX works. The proposed recombination operator is presented in
1Available at https://github.com/jfrchicanog/EfficientHillClimbers
Evolutionary Computation Volume 30, Nombre 3
411
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Section 3. Section 4 describes the experiments and presents the results and, finally, Sec-
tion 5 concludes the article.
2 Background
In our gray-box optimization setting, the optimizer can independently evaluate the set
of m subfunctions in Equation (1); c'est, it is able to evaluate f(cid:2) when the values of
xi((cid:2),1) . . . xi((cid:2),k(cid:2) ) are given, and it also knows the variables each f(cid:2) depends on. This con-
trasts with black-box optimization, where the optimizer can only evaluate full solutions
(x0, . . . , xn−1) and get their fitness value f (X). We assume that we do not know the in-
ternal details of f(cid:2) (we can only evaluate it), and this is why we call it gray box and not
white box.
2.1 Variable Interaction Graph
The variable interaction graph (VIG) (Whitley et al., 2016) is a useful tool that can be con-
structed under gray-box optimization. It is a graph V I G = (V , E), where V is the set
of variables and E is the set of edges representing all pairs of variables (xi, xj ) having
nonlinear interactions. The set Bn contains all the binary strings of length n. Given a set
of variables Y , the notation 1n
Y is used to represent a binary string of length n with 1
{0,2} = 101. If
in the positions of the variables in Y and zero in the rest, Par exemple, 13
the set Y has only one element, we will omit the curly brackets in the subscript, par exemple.,
{3} = 0001. Observe that the first index of the binary strings is 0. When the length
14
3
is clear from the context we omit n. The operator ⊕ is a bitwise exclusive OR.
= 14
We say that variables xi and xj have a nonlinear interaction when the expression
(cid:3)si (X) = f (x ⊕ 1i ) − f (X) does depend on xj . Checking the dependency of (cid:3)si (X) sur
xj can be computationally expensive because it requires an evaluation of the expres-
sion on all the strings in Bn−1. There are other approaches to find the nonlinear interac-
tions among variables: d'abord, assuming that every pair of variables appearing together
in a subfunction has a nonlinear interaction. It is not necessarily true that there is a
nonlinear interaction among variables appearing as arguments in the same subfunc-
tion, but adding extra edges to E does not break the correctness of the operators based
on the VIG and requires only a very simple check that is computationally cheap. Le
graph obtained this way is usually called a co-ocurrence graph. A second, and precise,
approach to determine the nonlinear interactions is to apply the Fourier transform
(Terras, 1999), and then look at every pair of variables to determine if there is a nonzero
Fourier coefficient associated to a term with the two variables. This second method is
precise and can be done in (cid:2)(m4k ) time if we know the variables appearing in each
subfunction f(cid:2) and we can evaluate each f(cid:2) independently (our gray box setting). Le
interested reader can see the work of Whitley et al. (2016) and Rana et al. (1998) for a
more detailed description of this second approach.
The first approach is especially useful when k is relatively large because it requires
temps (cid:2)(mk2), which is polynomial in k, in contrast with the exponential time in k of
the second approach. In some problems, like MAX-SAT, we know that the co-ocurrence
graph (obtained by the first approach) is exactly the variable interaction graph (obtained
by the second approach). In some other problems, like NK Landscapes, both are the
same with high probability. In these two cases, it makes sense to use the first approach
and work with the co-ocurrence graph even in the case that k is small enough to compute
the Fourier transform in a reasonable time. In the rest of the article, when we use the
term “variable interaction graph” we could replace it by the “co-ocurrence graph.”
412
Evolutionary Computation Volume 30, Nombre 3
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
Chiffre 1: Sample variable interaction graph (VIG).
An example of the construction of the variable interaction graph for a function with
n = 18 variables (numbered from 0 à 17) and k = 3 is given below. We will refer to the
variables using numbers, Par exemple, 9 = x9. The objective function is the sum over
the following 18 subfunctions:
f0(0, 6, 14) f5(5, 4, 2)
f10(10, 2, 17)
f15(15, 7, 13)
f1(1, 0, 6)
f6(6, 10, 13) f11(11, 16, 17)
f16(16, 9, 11)
f2(2, 1, 6)
f7(7, 12, 15) f12(12, 10, 17) f17(17, 5, 2)
f3(3, 7, 13) f8(8, 3, 6)
f13(13, 12, 15)
f4(4, 1, 14) f9(9, 11, 14) f14(14, 4, 16).
From these subfunctions, assume we extract the nonlinear interactions that are
shown in Figure 1. In this example, every pair of variables that appear together in a
subfunction has a nonlinear interaction.
2.2 Recombination Graph
Let us assume that we have two solutions to recombine. We call these two solutions red
and blue parents. All the variables with the same value in both parents will also share
the same value in the offspring and the solutions in the dynastic potential will be in a
hyperplane determined by the common variables. Par exemple, let the two parents be
red = 000000000000000000,
blue = 111101011101110110,
in our sample function presented in Section 2.1. Donc, x4, x6, x10, x14, and x17 are
identical in both parents. The rest of the variables are different. Both parents reside in
a hyperplane denoted by H = ∗∗∗∗0∗0∗∗∗0∗∗∗0∗∗0 where ∗ denotes the variables that
are different in the two solutions, et 0 marks the positions where they have the same
variable values.
We use the hyperplane H = ∗∗∗∗0∗0∗∗∗0∗∗∗0∗∗0 to decompose the VIG in order to
produce a recombination graph. We remove all the variables (vertices) that have the same
“shared variable assignments” and also remove all edges that are incident on the ver-
tices corresponding to these variables. This produces the recombination graph shown
in Figure 2.
Evolutionary Computation Volume 30, Nombre 3
413
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Chiffre 2: Recombination graph for the solutions (parents) red = 000000000000000000
and blue = 111101011101110110.
The recombination graph also defines a reduced evaluation function. This new eval-
uation function is linearly separable, and decomposes into q ≤ m subfunctions defined
over the connected components of the recombination graph. In our example:
g(X(cid:5)
) = a + g1(9, 11, 16) + g2(0, 1, 2, 5) + g3(3, 7, 8, 12, 13, 15),
where g(X(cid:5)) = f |H (X(cid:5)) and x(cid:5) are solutions restricted to a subspace of the hyperplane
H that contains the parent strings as well as the full dynastic potential. The constant
a = f (X(cid:5)
) depends on the common variables. Every recombination graph
with q connected components induces a new separable function g(X(cid:5)
) that is defined as:
i=1 gi (X(cid:5)
(cid:5)
3
) −
g(X(cid:5)
) = a +
q(cid:4)
je = 1
gi (X(cid:5)
).
(2)
Partition crossover (PX), defined by Tinós et al. (2015), generates an offspring, quand
recombining two parents, based on the recombination graph: all of the variables in the
same recombining component in the recombination graph are inherited together from
one of the two parents. Partition crossover selects the decision variables from one or an-
other parent yielding the best partial evaluation for each subfunction gi (X(cid:5)
). This way,
PX obtains a best offspring among 2q in O(q ) temps, which is a remarkable result. The ef-
ficiency of PX depends on the number of connected components q in the recombination
graph. Larger values for q provide a better performance. We wonder if this number can
be large in pseudo-Boolean problems with interest in practice. We provide a positive
answer with the VIG and recombination graph in Figure 3. It shows a sample recom-
bination graph with 1,087 connected components of a real SAT instance of the 2014
Competition.2 The graph was generated by Chen and Whitley (2017).
Articulation points partition crossover (APX) (Chicano et al., 2018) goes fur-
ther and finds the articulation points of the recombination graphs. Articulation points
are variables whose removal increases the number of connected components. A bi-
connected component of a graph is a maximal subgraph with the property that there
are two vertex-disjoint paths between any two vertices. Articulation points join sev-
eral bi-connected components. Variables x1, x2 and x3 are articulation points in our
example (voir la figure 2) and the subgraphs induced by the vertex sets {x5, x2
} et
} are examples of bi-connected components. Alors, APX efficiently
{x3, x7, x12, x13, x15
2http://www.satcompetition.org/2014/
414
Evolutionary Computation Volume 30, Number 3
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
Chiffre 3: Variable interaction graph (gauche) and recombination graph (droite), generated
by Chen and Whitley (2017), for instance atco_enc3_opt1_13_48 (1,067,657 variables)
from the SAT Competition 2014. The recombination graph contains 1,087 connected
components.
simulates what happens when the articulation points are removed, one at a time, depuis
the recombination graph by flipping the variable in each of the parent solutions before
applying PX, and the best solution is returned as offspring. In our example, APX would
work as follows. D'abord, it applies PX to the red solution and a copy of the blue solution
where the variable x1 is flipped and stores the best children. Alors, it applies PX to the
blue solution and a copy of the red solution with variable x1 flipped. The same pro-
cess is repeated with flips in variables x2 and x3 (the other articulation points). Enfin,
it applies PX to the original red and blue solutions. APX returns the best solution of
all the applications of PX. The key ingredient of APX is that all these computations do
not require repeated applications of PX. With the appropriate data structures, all the
computations can be done in O(n2 + m), the same complexity of PX, for any choice of
parents in Mk Landscapes.
3 Dynastic Potential Exploration
The proposed dynastic potential crossover operator (DPX) takes the ideas of PX and
APX even further. DPX starts from the recombination graph, like the one in Figure 2.
Alors, DPX tries to exhaustively explore all the possible combinations of the parent val-
ues in the variables of each connected component to find the optimal recombination
regarding the hyperplane H . This exploration is not done by brute force, but using dy-
namic programming. Following with our example, in order to compute the best com-
bination for the variables x9, x11, and x16, we need to enumerate the 8 ways of taking
each variable from each parent, and this is not better than brute force. Cependant, le
component containing variables x0, x1, x2, and x5 forms a path. Dans ce cas, we can store
in a table what is the best option for variable x0 when any of the two possible values
for variable x1 are selected. Alors, we can store in the same table what is the value of
the sum of subfunctions depending only on x0 and x1 (and possibly common variables
eliminated in the recombination graph). After this step, we can consider that variable x0
has been removed from the problem, and we can proceed in the same way with the rest
of the variables in order: x1, x2, and x5. Enfin, 12 evaluations are necessary, instead of
le 16 required by brute force. En général, for a path of length (cid:2), dynamic programming
requires 4((cid:2) − 1) evaluations while brute force requires 2(cid:2) evaluations.
Evolutionary Computation Volume 30, Number 3
415
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
The idea of variable elimination using dynamic programming dates back to the
1960s and the basic algorithm by Hammer et al. (1963). The problem of variable
elimination has also been studied in other contexts, like Gaussian elimination (Tarjan
and Yannakakis, 1984) and Bayesian networks (Bodlaender, 2005). En fait, we utilize
the ideas for computing the junction tree in Bayesian networks. D'abord, a chordal graph
is obtained from the recombination graph using the maximum cardinality search and a
fill-in procedure to add the missing edges. Then the clique tree (or junction tree) is com-
puted, which will fix the order in which the variables are eliminated using dynamic
programming. After assigning the subfunctions to the cliques in the clique tree, dy-
namic programming is applied to find a best offspring, which is later reconstructed
using the information computed in tables during dynamic programming. The runtime
of the variable elimination method depends, entre autres, on the number of missing
edges added by the fill-in procedure. Malheureusement, finding the minimum fill-in is an
NP-hard problem (Bodlaender, 2005). Ainsi, we do not try to eliminate the variables in
the most efficient way, but we apply algorithms that are efficient in finding a variable
elimination order. Our proposal, DPX, is the first, to the best of our knowledge, applying
these well-known ideas to design a recombination operator. There is also a difference
between our approach and the variable elimination methods in the literature: we in-
troduce a parameter β to limit the exploration of the variables (see Section 3.3). Le
high-level pseudocode of DPX is outlined in Algorithm 1. In the next subsections we
will detail each of these steps.
3.1 Chordal Graphs
In Algorithm 1, after finding the recombination graph (Doubler 1), each connected com-
ponent is transformed into a chordal graph (Lines 2 et 3), if it is not already one. UN
chordal graph is a graph where all the cycles of length four or more have a chord (bord
joining two nodes not adjacent in the cycle). All the connected components in Figure 2
416
Evolutionary Computation Volume 30, Number 3
Dynastic Potential Crossover Operator
Chiffre 4: Maximum cardinality search applied to the third connected component of Fig-
ure 2 (gauche) and clique tree with the sets Si and Ri (droite).
are chordal graphs. Tarjan and Yannakakis (1984) provided algorithms to test if a graph
is chordal and add new edges to make it chordal if it is not. Their algorithms run in
time O(n + e), where n is the number of nodes in the graph and e is the number of
edges. In the worst case the complexity is O(n2). The first step to check the chordality is
to number the nodes using maximum cardinality search (MCS). This algorithm numbers
each node in descending order, choosing in each step the unnumbered node with more
numbered neighbors and solving the ties arbitrarily. The number associated to node
u is denoted with γ (toi). Chiffre 4 (gauche) shows the result of applying MCS to the third
connected component of Figure 2, where we started numbering node 12.
If the graph is chordal then MCS will provide a numbering of the nodes γ such that
for each triple of nodes u, v and w, avec (toi, v), (toi, w) ∈ E and γ (toi) < min{γ (v), γ (w)}, it
happens that (v, w) ∈ E. If this is not the case, the graph is not chordal. A fill-in algorithm
tests this condition and adds the required edges to make the graph chordal. This algo-
rithm runs in O(n + e(cid:5)
is the number of edges in the final chordal graph.
Again, in the worst case, the complexity is O(n2). These two steps, MCS and fill-in, can
be applied to each connected component separately or to the complete recombination
graph with the same result (Tarjan and Yannakakis, 1984).
) time, where e(cid:5)
3.2 Clique Tree
Dynamic programming is used to exhaustively explore all the variables in each clique3
in the chordal graph. The maximum size of a clique in the chordal graph is an upper
bound of its treewidth, and determines the complexity of applying dynamic program-
ming to find the optimal solution. A clique tree of a chordal graph is a tree where the
nodes are cliques and for any variable appearing in two of such cliques, the path among
the two cliques in the tree is composed of cliques containing the variable (junction tree
property). We can also identify a clique tree with a tree-decomposition of the chordal
graph (Bodlaender, 2005). This clique tree will determine the order in which the vari-
ables can be eliminated.
3We will use the term clique to refer to a maximal complete subgraph, as the cited literature does.
However, the term clique is sometimes used to refer to any complete subgraph (not necessarily
maximal).
Evolutionary Computation Volume 30, Number 3
417
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
.
/
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Starting from the chordal graph provided in the previous steps, we apply an al-
gorithm by Galinier et al. (1995) to find the clique tree T (Line 4 in Algorithm 1). This
algorithm runs in O(n + e(cid:5)
) time and finds all the O(n) cliques of the chordal graph. In a
chordal graph, the number of cliques cannot exceed the number of nodes n of the graph
(Galinier et al., 1995). The cliques will be identified with the sets of variables it contains,
Ci, where i is an integer index for the clique that increases when a clique is discovered
by the algorithm. An edge joining two cliques in the clique tree can be labeled with a
separator, which is the intersection of the variables in both cliques. A clique Ci is parent
of a clique Cj if they are joined by an edge and i < j . The set of child cliques of a clique i
is denoted with ch(i). Although separators are associated to edges, in each clique Ci we
highlight a particular separator, the separator with its parent clique, and we will use Si
to denote it. If a clique Ci has no parent, then Si = ∅. The residue, Ri, in a clique Ci is the
set of variables of Ci that are not in the separator with its parent, Si. In each clique Ci,
the residue, Ri, and the separator with the parent, Si, forms a partition of the variables
in Ci. Due to the junction tree property, for each variable xi, the cliques that contain it
form a connected component in the clique tree T . The variable is in the set Sj of all the
cliques j in the connected component except in the ancestor of all of them (with the
lowest index j ), where xi is member of its residue Rj . Thus, each variable is only in the
residue of one clique. In Figure 4 (right) the residues, Ri, and separators with the parent,
Si, for all the cliques of the third connected component of Figure 2 are shown.
After computing the clique tree, all the subfunctions f(cid:2) depending on a nonempty
set V d
(cid:2) of differing variables must be assigned to one (and only one) clique i where
⊆ Ci (Line 5 in Algorithm 1). They will be evaluated when this clique is processed.
V d
(cid:2)
There can be more than one clique where the subfunction can be assigned. All of them
are valid for a correct evaluation, but the clique with less variables is preferred to reduce
the runtime. We denote with Fi the set of subfunctions assigned to clique Ci.
An optimal offspring is found in Algorithm 2 by exhaustively exploring all the vari-
able combinations in each clique Ci and storing the best ones. Before describing the al-
gorithm, we need to introduce some additional notation. The operator ∧ is a bitwise
AND and the expression Bn ∧ 1Y denotes the set of binary strings of length n with zero
in all the variables not in Y .
For each combination of the variables in the separator Si (Line 2 in Algorithm 2),
all the combinations of the variables in the residue Ri are considered (Line 4 in Algo-
rithm 2) and evaluated over the subfunctions assigned to the clique (Lines 6–7) and their
child cliques (Lines 8–9). Then, the best combination for the residue Ri is stored in the
variable[i] array4 (Line 11) and its value in the value[i] array (Line 12). The eval-
uation in post-order makes it possible to have the value[j ] array of the child cliques
filled when they are evaluated in Line 9. At this point, variables in the residue Ri can
be obviated (eliminated) in the rest of the computation. When the separator Si = ∅, w
takes only one value, the string with n zeroes. This happens in the root of the clique tree,
and its effect is to iterate over all the variables combinations for Ri = Ci to find the best
value. The variable array will be used in the reconstruction of the offspring solution
(Line 7 in Algorithm 1).
Following with our previous example in Figure 4, first, clique C3 is evaluated. For all
the values of x3 (variable in S3), and all the values of x8 (variable in R3) the subfunctions
in F3 are evaluated. All these subfunctions depend only on x3, x8, and variables with
4What is really stored in Algorithm 2 is the change v of the variables in Ri over the parent solution
x, but this can be considered an implementation detail.
418
Evolutionary Computation Volume 30, Number 3
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
.
/
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
.
/
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
= 0, 1, the array value[3] is filled
common values in both parents. For each value x3
with the maximum value of the sum of the subfunctions in F3 for the two values of
= 0, 1. The array variable[3] will store the value of x8 for which the maximum is
x8
obtained in each case. After C3 has been evaluated, variable x8 is eliminated. Now clique
C2 is evaluated, and value[2] and variable[2] are filled for each combination of x7
and x13. In this case the variable to eliminate is x3 and the evaluation also includes the
values in value[3] in addition to the functions in F2, because C3 is a child clique of C2.
Finally, the root clique C1 is evaluated and all the 24 possible combinations of variables
x7, x12, x13, and x15 are evaluated using the subfunctions in F1 and the array value[2]
to find the objective value of an optimal offspring. The offspring itself is built using the
arrays variable. In particular, variable[1] will store the combination of variables
x7, x12, x13, and x15 that produces the best offspring. Array variable[2] will provide
the value of x3 given the ones of x7 and x13, which were provided by variable[1].
Array variable[3] will provide the value for x8 given the value of x3 provided by
variable[2].
Evolutionary Computation Volume 30, Number 3
419
F. Chicano et al.
(3)
(4)
Theorem 1: Given two parent solutions x and y with differing set of variables d(x, y) that
produces clique tree T , Algorithm 2 computes a best offspring z in the largest dynastic potential
of x and y. That is:
f (z) = max
f (x ⊕ w).
w∈Bn∧1d (x,y)
Proof: We will prove the theorem by structural induction over the clique tree. We will
denote with Ti the subtree of T with Ci in the root. We also introduce C(Ti ) as the
union of the Cj sets for all the cliques Cj ∈ Ti and will use the convenient notation
R(Ti ) = C(Ti ) − Si. Observe that C(T ) = R(T ) = d(x, y). Regarding the subfunctions,
we introduce the notation FTi to refer to the set of subfunctions associated with a clique
= ∪Cj ∈Ti Fj . We only need to consider the subfunctions in FT because the re-
in Ti: FTi
maining ones are constant in the dynastic potential of x and y. Thus, Eq. (3) is equivalent
to:
(cid:4)
f (z) = max
w∈Bn∧1d (x,y)
f (x ⊕ w).
f ∈FT
The claim to be proven is that for each clique Ci, after its evaluation using Lines 2
to 12 of Algorithm 2, the array value[i] holds the following equation:
value[i][w] = max
v∈Bn∧1R(Ti )
f (x ⊕ w ⊕ v) ∀w ∈ Bn ∧ 1Si .
(5)
Eq. (5) reduces to Eq. (4) when the clique Ci is the root of T and Ti = T . Thus, we only
need to prove Eq. (5) using structural induction. Let us start with the base case: a leaf
clique. In this case, R(Ti ) = Ri and C(Ti ) = Ci and there is no child clique to iterate over
in the for loop of Lines 8 to 9. Eq. (5) is reduced to:
value[i][w] = max
v∈Bn∧1Ri
f (x ⊕ w ⊕ v) ∀w ∈ Bn ∧ 1Si ,
(6)
(cid:4)
f ∈FTi
(cid:4)
f ∈Fi
and Lines 8 to 9 fill the value[i] array using exactly the expression in Eq. (6).
node in the tree. In this case we have C(Ti ) = Ci ∪
(cid:6)
Now, we use the induction hypothesis to prove that Eq. (5) holds for any other
j ∈ch(i) C(Tj ) and R(Ti ) = Ri ∪
j ∈ch(i) R(Tj ). The values computed by Lines 8 to 9 and stored in the value[i] array
(cid:6)
for all w ∈ Bn ∧ 1Si are:
value[i][w] = max
v∈Bn∧1Ri
⎛
⎝
(cid:4)
f ∈Fi
f (x ⊕ w ⊕ v) +
(cid:4)
j ∈ch(i)
value[j ][(w ⊕ v) ∧ 1Sj ]
⎞
⎠ ,
(7)
and using the induction hypothesis we can replace value[j ][(w ⊕ v) ∧ 1Sj ] by the right-
hand side of Eq. (5) to write:
⎛
⎛
value[i][w] = max
v∈Bn∧1Ri
(cid:4)
⎝
f ∈Fi
f (x ⊕ w ⊕ v) +
(cid:4)
j ∈ch(i)
max
v(cid:5)∈Bn∧1R(Tj )
(cid:4)
f ∈FTj
f (x ⊕ w ⊕ v ⊕ v(cid:5)
)
⎠ ,
where we replaced (w ⊕ v) ∧ 1Sj by w ⊕ v in the inner sum because the subfunctions
in FTj do not depend on any variable in Ci − Sj , which are the ones that differ in both
expressions. The sets FTj are disjoint for all j ∈ ch(i), as well as the sets R(Tj ). Thus, we
can swap the maximum and the sum to write:
value[i][w] = max
v∈Bn∧1Ri
(cid:4)
⎝
f ∈Fi
f (x ⊕ w ⊕ v) + max
v(cid:5)∈Bn∧1R(Ti )−Ri
(cid:4)
f ∈FTi
−Fi
f (x ⊕ w ⊕ v ⊕ v(cid:5)
)
⎠ ,
420
Evolutionary Computation Volume 30, Number 3
⎞
⎞
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
.
/
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
− Fi to
where we used the identities
simplify the expression. Finally, we can introduce the first sum in the maximum and
notice that v(cid:5)
j ∈ch(i) R(Tj ) = R(Ti ) − Ri and
is zero in the variables of Ri to write:
j ∈ch(i) FTj
= FTi
(cid:6)
(cid:6)
value[i][w] = max
v∈Bn∧1Ri
max
v(cid:5)∈Bn∧1R(Ti )−Ri
f (x ⊕ w ⊕ v ⊕ v(cid:5)
),
(cid:4)
f ∈FTi
which is Eq. (5) written in a different way.
(cid:2)
The operator described is an optimal recombination operator: it finds a best off-
spring from the largest dynastic potential. The time required to evaluate one clique in
Algorithm 2 is O((|Fi| + |ch(i)|)2
). The number of children is bounded by n and the
number of subfunctions m is bounded by O(nk ) due to the k-bounded epistasis of f .
However, the exponential factor is a threat to the efficiency of the algorithm. In the
worst case Ci can contain all the variables and the factor would be 2n.
|Ci |
3.3
Limiting the Complexity
In order to avoid the exponential runtime of DPX, we propose to limit the exploration
in Lines 2 and 4 of Algorithm 2. Instead of iterating over all the possible combinations
for all the variables in the separators Si and the residues Ri, we fix a bound β on the
number of variables that will be exhaustively explored. The remaining variables will
jointly take only two values, each one coming from one of the parents. In a separator
Si or residue Ri with more than β variables, we still have to decide which variables are
exhaustively explored and which ones will be taken from one parent. One important
constraint in this decision is that once we decide that two variables xi and xj will be
taken from the same parent, then this should also happen in the other cliques where the
two variables appear. We use a disjoint-set data structure (Tarjan, 1975) to keep track
of the variables that must be taken from the same parent. In each clique, the variables
that are articulation points in the VIG are added first to the list of variables to be fully
explored. The motivation for this can be found in Section 3.4. The other variables are
added in an arbitrary order (implementation dependent) to the list of variables to fully
explore. We defer to future work a more in depth analysis on the strategies to decide
which variables are fully explored in each clique.
Let us illustrate this with our example of Figures 2 and 4, where we set β = 2. This
does not affect the evaluation of C2 or C3, because in both cases the sets Si and Ri have
cardinality less than or equal to β = 2, so all the variables in R2, S2, R3, and S3 will be
fully explored. However, once cliques C3 and C2 (in that order) have been evaluated,
we need to evaluate C1 and, in this case, |R1
| = 4 > 2 = β. For the evaluation of C1,
two variables, say x7 and x12, are fully enumerated (the four combinations of values for
them are considered) and the other two variables, x13 and x15, are taken from the same
parent and only two combinations are considered for them: 00 (red parent) et 11 (blue
parent). In total, only 23 = 8 combinations of values for the variables in this clique are
explored, instead of the 24 = 16 possible combinations.
This reduces the exponential part of the complexity of Algorithm 2 à 22(β+1). Since
β is a predefined constant parameter decided by the user, the exponential factor turns
into a constant. The operator is not anymore an optimal recombination operator. In the
cases where β + 1 ≥ max{|Ri|, |Si|} for all the cliques, DPX will still return the optimal
offspring.
Theorem 2: Given a function in the form of Equation (1) with m subfunctions, the complexity
of DPX with a constant bound β for the number of exhaustively explored variables is O(4β (n +
m) + n2).
Evolutionary Computation Volume 30, Nombre 3
421
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Proof: We have seen in Section 3.1 that the complexity of maximum cardinality search,
the fill-in procedure and the clique tree construction is O(n2). The assignment of sub-
functions to cliques can be done in O(n + m) temps, using the variable ordering found by
MCS to assign the subfunctions that depend on each visited variable to the only clique
where the variable is a residue. The complexity of the dynamic programming compu-
tation is:
(cid:11)
(cid:12)(cid:12)
(cid:11)
(cid:12)
(cid:11)
(|Fi| + |ch(je)|) 22(β+1)
= O
4β
m +
|ch(je)|
= O(4β (m + n)),
(cid:4)
Ô
je
(cid:4)
je
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
where we used the fact that the sum of the cardinality of the children for all the cliques is
the number of edges in the clique tree, which is the number of cliques minus one, et le
number of cliques is O(n). The reconstruction of the offspring solution requires reading
all the variable arrays until building the solution. The complexity of this procedure
(cid:2)
is O(n).
In many cases, the number of subfunctions m is O(n) or O(n2). In these cases, le
complexity of DPX reduces to O(4βn2). But complexity can even reduce to O(n) in some
cases. En particulier, when all the connected components of the recombination graph
are paths or have a number of variables bounded by a constant, the number of edges
in the original and the chordal graph is O(n) and the complexity of DPX inherits this
linear behavior. This is the case for the recombination graph showed in Figure 3 for a
real SAT instance, so this linear time complexity is not unusual, even in real and hard
instances.
3.4
Theoretical Comparison with PX and APX
DPX is not worse than PX, depuis, in the worst case, it will pick the variables for each con-
nected component of the recombination graph from one of the parent solutions (what
PX does). Autrement dit, if β = 0 and there is only one clique in all the connected com-
ponents of the recombination graph (worst case), DPX and PX behave the same and
produce offspring with the same quality. We wonder, cependant, if this happens with
APX. If β + 1 ≥ max{|Ri|, |Si|} for all the cliques Ci in the chordal graph derived from
the recombination graph, DPX cannot be worse than any recombination operator with
the property of gene transmission and, in particular, it cannot be worse than APX. Oth-
erwise, if the limit in the exploration explained in Section 3.3 is applied, it could happen
that articulation points are not explored as they are in APX. One possible threat to the
articulation points exploration in DPX is that they disappear after making the graph
chordal. Heureusement, to make a graph chordal, the fill-in procedure only adds edges
joining vertices in a cycle and, thus, it keeps the articulation points. We provide formal
proofs in the following.
Lemma 1: The fill-in procedure adds edges joining vertices in a cycle of the original graph.
Proof: Let us assume that edge (v, w) is added to the graph by the fill-in procedure.
The values γ (v) and γ (w) are the numbers assigned by maximum cardinality search to
nodes v and w. Let us assume without loss of generality that γ (v) < γ (w). The definition
of fill-in (Tarjan and Yannakakis, 1984) implies that there is a path between v and w
where all the intermediate nodes have a γ value lower than γ (v). On the other hand,
during the application of maximum cardinality search the set of numbered nodes form
a connected component of the graph. This implies that at the moment in which v was
numbered there existed a path between v and w with γ values higher than γ (v). As a
422
Evolutionary Computation Volume 30, Number 3
Dynastic Potential Crossover Operator
Figure 5: Connected component in a recombination graph (left) and its clique tree
(right). DPX with β = 1 explores the articulation points in a different way as APX.
consequence, two non-overlapping paths exist between v and w in the original graph
(cid:2)
and they form a cycle.
Theorem 3: Articulation points of a graph are kept after the fill-in procedure.
Proof: According to Lemma 1, all the edges added by the fill-in procedure join vertices
in a cycle of the original graph. This means that the edges are added to bi-connected
components of the graph, and never join vertices in two different bi-connected compo-
nents. Adding edges to a bi-connected component never removes articulation points
(cid:2)
and the result follows.
The previous theorem implies that articulation points of the original recombination
graph are also articulation points of the chordal graph. Articulation points of a chordal
graph are minimal separators of cardinality one (Galinier et al., 1995) and they will ap-
pear in the sets Si of some cliques Ci. They are, thus, identified during the clique tree
construction. This inspires a mechanism to reduce the probability that a solution ex-
plored in APX is not explored in DPX. In each clique Ci when β variables are chosen
to be exhaustively explored (Lines 2 and 4 of Algorithm 2) we choose the articulation
points first. This way, articulation points can be exhaustively explored with higher prob-
ability. The only thing that can prevent articulation points from being explored is that
many of them appear in one single clique. This situation is illustrated in Figure 5. For
β = 0 all the cliques are evaluated only in the two-parent solutions as PX does, while
APX explores 20 different combinations of variables according to Eq. (5) of Chicano et al.
(2018). For β = 1, the cliques C2, C3, and C4 are fully explored, but in the clique of ar-
ticulation points, C1, only variable x4 is fully enumerated, variables x5 and x6 are taken
from the same parent. The total number of solutions explored is 32, which is more than
the ones analyzed by APX (20), but the articulation points x5 and x6 are not explored
in the same way as in APX and the set of solutions explored by DPX and APX differ.
Thus, APX could find an offspring with higher fitness than the one obtained by DPX
with β = 1.
3.5 Generalization of DPX
Although this article focuses on Mk Landscapes, defined over binary variables, DPX
can also be applied as is when the variables take their values in a finite set different
from the binary set. In this case, one can imagine that 0 represents the value of a differ-
ing variable in the red parent, and 1 the value of the same variable in the blue parent.
All the results, including runtime guarantees, are the same. The only difference is that
the offspring of DPX is not optimal, in general, in the smallest hyperplane containing
Evolutionary Computation Volume 30, Number 3
423
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
/
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
the parent solutions, because an optimal solution could have values not present in the
parents. Even in this case, DPX can be modified to keep the same runtime and provide
an optimal solution in the mentioned hyperplane, at the only cost of violating the gene
transmission property.
4
Experiments
This section will focus on the experimental evaluation of DPX in comparison with other
previous crossover operators. In particular, we want to answer the following two re-
search questions:
• RQ1: How does DPX perform compared with other crossover operators in
terms of runtime and quality of offspring?
• RQ2: How does DPX perform included in a search algorithm for solving NP-
hard pseudo-Boolean optimization problems?
Regarding the other operators in the comparison, we include PX and APX because
they are gray box crossover operators using the VIG and we want to check the claims
exposed in Section 3.4 that relate DPX with these two operators. We also include in
this comparison two other operators which dynastic potential has the same size as DPX
(2h(x,y)): uniform crossover and network crossover. In uniform crossover (UX), each vari-
able is taken from one of the two parents with probability 0.5. Network crossover (NX)
(Hauschild and Pelikan, 2010) uses the learned linkages among the variables to select
groups of variables from one parent such that variables from the same connected com-
ponent in the linkage graph are selected first. In our case, we have complete knowledge
of the linkage graph: the variable interaction graph. The VIG is used in our implemen-
tation of network crossover and variables are selected using randomized breadth first
search in the VIG and starting from a random variable until half of the variables are
selected. Then, the group of selected variables is taken from one of the parents and in-
serted into the other to form the offspring.
Two different kinds of NP-hard problems are used in the experiments: NKQ Land-
scapes, an academic benchmark which allows us to parameterize the density of edges in
the VIG by changing K, and MAX-SAT instances from the MAX-SAT Evaluation 2017.5
Random NKQ (“Quantized” NK) landscapes (Newman and Engelhardt, 1998) can be
seen as Mk Landscapes with one subfunction per variable (m = n). Each subfunction f(cid:2)
depends on variable x(cid:2) and other K random variables, and the codomain of each sub-
function is the set {0, 1, . . . , Q − 1}, where Q is a positive integer. Thus, each subfunction
depends on exactly k = K + 1 variables. The values of the subfunctions are randomly
generated; that is, for each subfunction and each combination of variables in the sub-
function, an integer value in the interval [0, Q − 1] is randomly selected following a uni-
form distribution. Random NKQ landscapes are NP-hard when K ≥ 2. The parameter
K determines the higher order nonzero Walsh coefficients in its Walsh decomposition,
which is a measure of the “ruggedness” of the landscape (Hordijk and Stadler, 1998).
Regarding MAX-SAT, we used the same instances as Chicano et al. (2018)6 to allow the
comparison with APX. They are 160 unweighted and 132 weighted instances.
5http://mse17.cs.helsinki.fi/benchmarks.html.
6The list of instances is available together with the source code of DPX in Github.
424
Evolutionary Computation Volume 30, Number 3
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
.
/
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
The computer used for the experiments is one multicore machine with four Intel
Xeon CPU (E5-2680 v3) at 2.5 GHz, summing a total of 48 cores, 192 GB of memory,
and Ubuntu 16.04 LTS. The source code of all the algorithms and operators used in the
experiments can be found in Github7 including a link to a docker image to ease the
reproduction of the experimental results.
Section 4.1 answers RQ1 and Section 4.2 answers RQ2. In Section 4.3, we include a
local optima network analysis of the best overall algorithm identified in Section 4.2 to
better understand its behavior.
4.1 Crossover Comparison
This section will present the experiments to answer RQ1: how does DPX perform com-
pared with APX, PX, NX, and UX in terms of runtime and quality of offspring? In the
case of DPX, we use values for β from 0 to 5. The optimization problem used is random
NKQ Landscapes with n = 10 000 variables, K = 2, 3, 4, 5 and Q = 64. For each value
of K we generated 10 different instances, summing a total of 40 NKQ Landscapes in-
stances. In each of them we randomly generated 6,000 pairs of solutions with different
Hamming distance between them and applied all the crossover operators. Six different
values of Hamming distance h were used, generating 1 000 pairs of random solutions
for each Hamming distance. Expressed in terms of the percentage of differing variables,
the values for h are 1%, 2%, 4%, 8%, 16%, and 32%. Two metrics were collected in each
application of all the crossover operators: runtime and quality improvement over the
parents. The crossover runtime was measured with nanoseconds precision (expressed
in the tables in an appropriate multiple) and the quality of the offspring is expressed
with a relative measure of quality improvement. If x and y are the parent solutions and z
is the offspring we define the quality improvement ratio (QIR) in a maximization problem
as:
QIRf (x, y, z) = f (z) − max{f (x), f (y)}
max{f (x), f (y)}
,
(8)
that is, the fraction of improvement of the offspring compared with the best parent. All
the experiments were run with a memory limit of 5 GB of RAM. In the case of PX, APX,
and DPX we also collected the number of implicitly explored solutions, expressed with
its logarithm (the offspring is one best solution in the set of implicitly explored solu-
tions) and the fraction of runs in which the crossover behaves like an optimal recombi-
nation (returns the best solution in the largest dynastic potential). Tables 1 to 4 present
the runtime, quality of improvement, logarithm of explored solutions and percentage
of crossover runs where an optimal offspring is returned. The figures are averages over
10 000 samples (1 000 crossover operations in each of the ten instances for each value
of K).
Regarding the runtime, we observe some clear trends that we will comment in the
following. Uniform crossover is the fastest algorithm (less than 200 μs in all the cases).
It randomly selects one parent for each differing bit and this can be done very fast.
The other operators are based on the VIG and they require more time to explore it and
compute the offspring. Their runtime can be best measured in milliseconds. NX, PX,
and APX have runtimes between less than one millisecond to 20 ms. DPX is clearly the
slowest crossover operator when the parent solutions differ in 32% of the bits (3 200
variables), reaching 1 second of computation for instances with K = 5. For lower val-
ues of h, APX is sometimes slower than DPX. We also observe an increase in runtime
7https://github.com/jfrchicanog/EfficientHillClimbers
Evolutionary Computation Volume 30, Number 3
425
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
/
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Table 1: Average runtime of crossover operators for random NKQ Landscapes with n =
10 000 variables. Time is in microseconds (μs) for UX and in milliseconds (ms) for the
rest. The Hamming distance between parents, h, is expressed in percentage of variables.
h
UX NX
PX APX
DPX (ms)
% μs ms ms ms
β = 0
β = 1
β = 2
β = 3
β = 4
β = 5
1
2
4
8
16
32
1
2
4
8
16
32
1
2
4
8
16
32
1
2
4
8
16
32
73
95
93
120
113
154
92
87
97
125
120
143
79
96
93
99
116
143
68
82
85
119
113
139
1.2
2.3
2.3
2.3
1.2
1.7
1.7
2.4
2.8
2.8
1.9
2.2
2.8
3.8
3.4
3.5
2.7
2.8
3.2
3.7
4.2
4.3
3.0
3.7
0.5
0.9
1.4
2.2
2.8
9.3
0.6
1.2
1.9
3.0
5.1
5.9
0.9
1.7
2.2
4.8
3.7
5.2
0.9
1.8
3.5
5.4
4.1
5.8
K = 2
0.9
2.3
2.8
6.9
5.9
22.8
K = 3
1.0
1.6
3.1
4.9
10.5
257.5
K = 4
1.1
1.9
3.3
5.7
31.9
601.9
K = 5
1.5
2.3
3.9
8.1
83.0
1 034.0
0.8
2.1
2.9
6.3
5.5
22.1
1.0
1.8
3.1
4.7
9.7
251.4
1.1
1.7
3.2
5.6
31.7
596.7
1.4
2.1
3.6
8.0
90.7
1 000.5
1.0
2.5
4.5
7.2
7.1
12.7
1.5
3.5
6.3
8.3
9.1
16.0
1.9
4.4
7.1
11.6
11.2
18.0
2.6
5.2
8.7
13.3
12.8
19.4
0.8
2.4
2.9
6.3
5.8
23.5
1.0
2.0
3.1
5.7
10.1
256.4
1.3
2.1
3.5
5.4
32.8
587.8
0.8
2.0
2.5
5.8
5.8
23.3
1.0
1.7
2.7
5.6
10.9
273.1
1.4
1.8
3.6
6.0
33.3
598.9
0.8
2.1
2.5
5.8
5.4
24.6
0.9
1.6
2.4
4.5
11.0
267.5
1.0
1.5
3.2
6.8
34.0
683.0
0.9
1.9
2.4
5.7
5.3
23.3
1.0
1.7
2.7
4.9
11.3
263.8
1.2
1.6
3.5
7.0
36.2
692.4
1.5
2.3
3.8
8.2
103.0
1 041.1
1.4
2.1
3.9
9.5
92.2
1 020.3
1.4
2.2
4.0
10.9
101.3
1 089.9
1.3
2.0
4.1
9.9
107.5
1 021.7
with h, which can be explained because the recombination graph to explore is larger.
It is interesting to note that no exponential increase is observed in the runtime when
β increases linearly. To explain this we have to observe Tables 3 and 4, where we can
see that DPX is able to completely explore the dynastic potential when h is low and
the logarithm of the number of solutions explored increase very slowly with β because
it is near the maximum possible. High runtime is one of the drawbacks of DPX, and
the other one is memory consumption. For the instances with n = 10 000 variables, 5GB
of memory is enough when K ≤ 5, but we run some experiments with n = 100 000 in
which DPX ended with an “Out of Memory” error. In these cases, increasing the mem-
ory could help, but the amount of memory required is higher than that required by PX
and APX, and much higher than the memory required by UX and NX.
We can observe that the quality improvement ratio (Table 2) is always positive in
PX, APX, and DPX. These three operators, by design, cannot provide a solution that is
426
Evolutionary Computation Volume 30, Number 3
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
.
/
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
Table 2: Average quality improvement ratio of crossover operators for random NKQ
Landscapes with n = 10 000 variables. The numbers are in parts per thousand (‰). The
Hamming distance between parents, h, is expressed in percentage of variables.
h
%
UX
‰
NX
‰
PX
‰
APX
‰
DPX (‰)
β = 0
β = 1
β = 2
β = 3
β = 4
β = 5
1 −0.58 −0.55
2 −0.79 −0.81
4 −1.13 −1.11
8 −1.56 −1.54
16 −2.08 −2.07
32 −2.72 −2.71
1 −0.64 −0.65
2 −0.92 −0.91
4 −1.29 −1.26
8 −1.72 −1.77
16 −2.39 −2.37
32 −2.85 −2.85
1 −0.74 −0.74
2 −1.04 −1.04
4 −1.42 −1.40
8 −1.92 −1.95
16 −2.47 −2.55
32 −3.15 −3.13
1 −0.79 −0.78
2 −1.10 −1.10
4 −1.53 −1.56
8 −2.07 −2.06
16 −2.68 −2.66
32 −3.15 −3.13
4.92
9.89
19.28
35.04
53.43
34.41
5.57
10.93
20.10
30.80
21.30
6.55
6.02
11.47
18.98
17.30
6.92
1.35
6.38
11.46
15.06
8.07
2.19
0.28
4.93
9.99
19.96
39.19
70.87
42.09
5.62
11.33
22.50
40.40
25.45
7.38
6.18
12.48
23.81
21.78
7.93
1.95
6.72
13.40
20.38
9.56
2.90
0.77
K = 2
4.92
9.95
19.70
38.15
75.03
108.86
5.04
10.38
21.21
42.80
85.72
123.98
5.04
10.39
21.23
42.92
86.21
134.38
5.04
10.39
21.23
42.92
86.21
137.29
5.04
10.39
21.23
42.92
86.21
138.78
5.04
10.39
21.23
42.92
86.21
139.76
K = 3
5.60
11.18
21.95
43.67
63.04
59.15
K = 4
6.12
12.20
24.25
41.39
41.63
40.98
K = 5
6.61
13.17
26.44
31.18
30.14
32.42
5.84
12.02
24.50
49.37
70.96
63.68
6.51
13.46
27.38
46.48
45.38
42.14
7.18
14.77
29.58
34.54
31.61
32.82
5.84
12.03
24.57
49.66
77.15
76.87
6.52
13.48
27.50
49.29
53.90
46.88
7.18
14.81
30.06
39.26
37.08
34.18
5.84
12.03
24.57
49.66
79.14
83.34
6.52
13.48
27.50
50.46
57.24
53.52
7.18
14.81
30.14
41.02
41.51
36.64
5.84
12.03
24.57
49.66
80.21
86.36
6.52
13.48
27.50
51.32
58.91
58.04
7.18
14.81
30.16
41.98
43.48
40.31
5.84
12.03
24.57
49.66
80.96
88.23
6.52
13.48
27.50
52.04
59.98
60.35
7.18
14.81
30.17
42.67
44.83
44.05
worse than the best parent. We also observe how the quality improvement ratio is al-
ways the highest for DPX. APX and PX are the second and third operators regarding
this metric, respectively. The worst operators are UX and NX. They always show a neg-
ative quality improvement ratio. We can explain this with the following intuition: the
expected fitness of the offspring is similar to the one of a random solution and the prob-
ability of improving both parents is 1/4 because the probability of improving each one
is 1/2, which means that in most of the cases (3/4 of the cases on average) the offspring
will be worse than the best parent. We can support this with some theory. Parents x and
y are random solutions and their fitness values, f (x) and f (y), are random variables
with unknown equal distribution.8 In UX and NX, the child z is a random solution in
8In general, they are not independent because the Hamming distance among them is fixed. But the
marginal distributions must be the same.
Evolutionary Computation Volume 30, Number 3
427
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
/
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Table 3: Average logarithm in base 2 of the solutions implicitly explored by PX, APX,
and DPX for random NKQ Landscapes with n = 10 000 variables. The Hamming dis-
tance between parents, h, is expressed in percentage of variables.
h
PX
APX
% log2
log2
DPX (log2)
β = 0
β = 1
β = 2
β = 3
β = 4
β = 5
1
2
4
8
16
32
1
2
4
8
16
32
1
2
4
8
16
32
1
2
4
8
16
32
97.1
188.1
352.9
613.5
873.6
660.7
94.1
176.5
306.6
437.3
351.5
141.5
90.2
161.2
247.0
238.2
119.7
31.1
85.4
142.0
175.4
113.2
38.9
7.5
97.3
190.3
368.1
703.3
1 220.6
828.3
95.2
184.3
352.3
602.5
426.4
155.1
93.0
179.0
324.7
305.8
134.1
39.5
91.1
172.4
246.1
132.7
47.6
13.7
97.2
189.3
362.0
679.7
1 311.2
2 055.6
94.7
181.2
341.2
663.0
1 019.5
1 099.0
91.9
173.7
332.8
580.8
651.3
719.8
89.1
168.5
332.2
420.5
449.0
534.0
K = 2
100.0
199.9
399.5
796.8
1 586.5
2 399.2
K = 3
100.0
199.8
398.4
793.2
1 174.7
1 179.1
K = 4
99.9
199.5
397.4
674.0
710.3
737.4
K = 5
99.9
199.2
390.6
470.8
469.0
539.3
100.0
200.0
400.0
800.0
1 600.0
2 586.9
100.0
200.0
400.0
799.9
1 271.0
1 395.2
100.0
200.0
400.0
713.9
831.5
812.0
100.0
200.0
398.2
530.8
542.8
559.3
100.0
200.0
400.0
800.0
1 600.0
2 636.5
100.0
200.0
400.0
800.0
1 300.4
1 499.5
100.0
200.0
400.0
729.6
878.1
914.8
100.0
200.0
399.4
552.6
601.9
595.7
100.0
200.0
400.0
800.0
1 600.0
2 661.3
100.0
200.0
400.0
800.0
1 316.0
1 547.6
100.0
200.0
400.0
740.9
901.3
983.3
100.0
200.0
399.8
564.4
627.7
649.7
100.0
200.0
400.0
800.0
1 600.0
2 677.4
100.0
200.0
400.0
800.0
1 326.9
1 576.8
100.0
200.0
400.0
750.4
915.9
1 018.2
100.0
200.0
399.9
572.8
645.3
703.5
the dynastic potential and the expectation of the random variable f (z) should not differ
too much from the one of f (x) and f (y) because the dynastic potential is large (at least
2100 in our experiments) and NKQ Landscapes are composed of functions randomly
generated. We can use the following equations for the random variables,
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
e
d
u
e
v
c
o
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
a
_
0
0
3
0
5
p
d
/
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
max{f (x), f (y)} + min{f (x), f (y)} = f (x) + f (y),
(cid:13)
(cid:13)
(cid:13) ,
(cid:13)f (x) − f (y)
max{f (x), f (y)} − min{f (x), f (y)} =
where we can take the expectation at both sides and add them to get:
2E[max{f (x), f (y)}] = E[f (x)] + E[f (y)] + E[
(cid:13)
(cid:13)f (x) − f (y)
(11)
Finally, we use the fact that E[f (x)] = E[f (y)] and the assumption that E[f (z)] =
E[f (x)] to write:
(9)
(10)
(cid:13)
(cid:13)
].
E[f (z) − max{f (x), f (y)}] = −E[
]/2 < 0.
(12)
(cid:13)
(cid:13)f (x) − f (y)
(cid:13)
(cid:13)
428
Evolutionary Computation Volume 30, Number 3
Dynastic Potential Crossover Operator
Table 4: Fraction of crossover runs where the full dynastic potential of the parent solu-
tions is explored in PX, APX, and DPX for random NKQ Landscapes with n = 10 000
variables. The numbers are in percentages (%). The Hamming distance between parents,
h, is expressed in percentage of variables.
h
%
1
2
4
8
16
32
1
2
4
8
16
32
1
2
4
8
16
32
1
2
4
8
16
32
PX
APX
DPX(%)
%
%
β = 0
β = 1
β = 2
β = 3
β = 4
β = 5
K = 2
5.61
0.00
0.00
0.00
0.00
0.00
0.27
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
6.45
0.00
0.00
0.00
0.00
0.00
0.47
0.00
0.00
0.00
0.00
0.00
0.02
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
5.61
0.00
0.00
0.00
0.00
0.00
0.27
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
99.07
93.17
60.73
4.11
0.00
0.00
96.46
77.88
20.97
0.22
0.00
0.00
92.11
59.18
8.21
0.00
0.00
0.00
K = 3
K = 4
100.00
100.00
100.00
99.98
99.67
0.00
99.99
99.94
98.39
88.87
0.00
0.00
99.95
99.54
95.38
0.06
0.00
0.00
K = 5
86.88
43.67
2.96
0.00
0.00
0.00
99.90
98.89
72.71
0.00
0.00
0.00
100.00
100.00
100.00
100.00
99.98
0.00
100.00
100.00
100.00
99.93
0.00
0.00
100.00
100.00
99.99
0.27
0.00
0.00
99.99
99.98
90.86
0.00
0.00
0.00
100.00
100.00
100.00
100.00
100.00
0.00
100.00
100.00
100.00
99.99
0.00
0.00
100.00
100.00
100.00
1.00
0.00
0.00
100.00
100.00
96.30
0.00
0.00
0.00
100.00
100.00
100.00
100.00
100.00
0.00
100.00
100.00
100.00
100.00
0.00
0.00
100.00
100.00
100.00
2.20
0.00
0.00
100.00
100.00
98.58
0.00
0.00
0.00
In Table 3, we can observe how DPX explores more solutions than PX and APX
when β > 0. We also observe how APX can outperform DPX in the number of explored
solutions when β = 0, as we illustrated with an example in Section 3.4. The latter hap-
pens when h ≤ 800 for K = 2, when h ≤ 400 for K = 3, and when h ≤ 200 for K = 4 et
K = 5. DPX always explores more solutions than PX independently of the value of β, comme
the theory predicts. As h grows, the dynastic potential increases and also the logarithm
of explored solutions for DPX. En fait, in many cases we observe that this logarithm
reaches in DPX the maximum possible value, h (the Hamming distance between the
parents). This is also reflected in Table 4, where we show the percentage of runs where
the full dynastic potential is explored by the operators or, equivalently, the percentage
of runs in which the logarithm of explored solutions is h. In the case of PX and APX the
increase in h does not always imply an increase in the number of explored solutions:
Evolutionary Computation Volume 30, Nombre 3
429
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
there is a value of h for which the logarithm of explored solutions reaches a maximum
and then decreases with h. The number of explored solutions in these two operators
is proportional to the number of connected components in the recombination graph.
Starting from an empty recombination graph the number of connected components in-
creases as new random variables are added, and this explains why the logarithm of
explored solutions in PX and APX increases with h for low values of h. At some criti-
cal value of h, the number of connected components starts to decrease because the new
variables in the recombination graph join connected components, instead of generating
new ones. The exact value of h at which this happens is approximately n/(K + 1) for the
adjacent NKQ Landscapes (Chicano et al., 2017). It is difficult to compute this value for
the random NKQ Landscapes that we use here, but the critical value must be a decreas-
ing function of K. This dependence on the critical value with K can also be observed in
Tableau 3: the value of h at which the number of explored solutions is maximum decreases
from h = 1,600 to h = 400 when K increases from 2 à 5.
Regarding the fraction of runs in which full dynastic potential exploration is
achieved, PX and DPX with β = 0 behaves the same and achieves full exploration for
some pairs of parents only when K ≤ 3. APX is slightly better and DPX is the best when
β ≥ 1, behaving like an optimal recombination operator in most of the cases for K = 2
and h ≤ 1,600. The fraction of runs with full dynastic potential exploration decreases
when K and h increase, due to the higher number of edges in the variable interaction
graph, which induces larger cliques.
4.2 Crossover in Search Algorithms
Dans cette section, we want to answer RQ2: how does DPX perform when it is included in a
search algorithm compared with APX, PX, NX, and UX? From Section 4.1, we conclude
that DPX is better than PX, APX, UX, and NX in terms of quality of the solution due to
its higher exploration capability. This feature is not necessarily good when the opera-
tor is included in a search algorithm, because it can produce premature convergence.
DPX is also slower, in general, than the other crossover operators in our experimen-
tal setting, and this could slow down the search. In order to answer the research ques-
tion, we included the crossover operators in two metaheuristic algorithms, each of them
from a different family: deterministic recombination and iterated local search (DRILS),
a trajectory-based metaheuristic, and an evolutionary algorithm (EA), a population-
based metaheuristic.
DRILS (Chicano et al., 2017) uses a first improving move hill climber to reach a local
optimum. The neighborhood in the hill climber is the set of solutions at Hamming dis-
tance one from the current solution. Alors, it perturbs the solution by randomly flipping
αn bits, where α is the so-called perturbation factor, the only parameter of DRILS to be
tuned. It then applies local search to the new solution to reach another local optimum
and applies crossover to the last two local optima, generating a new solution that is im-
proved further with the hill climber. This process is repeated until the stop condition is
satisfied. The pseudocode of DRILS can be found in Algorithm 3.
The evolutionary algorithm (EA) which we use in our empirical evaluation, is a
steady-state genetic algorithm where two parents are selected and recombined to pro-
duce an offspring that is mutated. The mutated solution replaces the worst solution in
the population only if its fitness is strictly higher. The mutation operator used is bit flip
with probability pm of independently flipping each bit. Three selection operators are
considered: binary tournament, rank selection, and roulette wheel selection. The pseu-
docode of the EA is presented in Algorithm 4.
430
Evolutionary Computation Volume 30, Nombre 3
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
We can also consider applying DPX alone to solve the pseudo-Boolean optimiza-
tion problems. C'est, if we use two parent solutions, x and y, that are complementary,
yi = 1 − xi for all i = 1, . . . , n, and we set β = ∞, DPX will return the global optimum.
This approach should be useful in families of Mk Landscapes where the treewidth of
the VIG is bounded by a small constant like, Par exemple, the adjacent model of NK
Landscapes, which can be solved in linear time (Wright et al., 2000). In these cases, le
cliques Ci found in the clique tree can be fully explored due to the low number of vari-
ables they contain. Cependant, we do not expect it to work well for NP-hard problems,
like the random model of NKQ Landscapes that we use here, or the general MAX-SAT
instances, that we also use in this section. The reason is that, in these cases, the cardinal-
ity of the cliques, |Ci|, usually grows with the number of variables, c'est, |Ci| ∈ ω(1)
and the complexity of DPX (with β = ∞) can be 2ω(1). This approach was used by Ochoa
and Chicano (2019) to find global optimal solutions in randomly generated MAX-SAT
instances of size n = 40 variables. It was found that DPX required much time to ex-
haustively explore the search space and the authors decomposed the search in different
parallel non-overlapping searches by fixing the values of some variables in the parent
Evolutionary Computation Volume 30, Nombre 3
431
F. Chicano et al.
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
solutions before applying DPX. For the sake of completeness, cependant, we analyze the
execution of DPX alone in Section 4.2.4.
For the experiments in this section, we use NKQ Landscapes with K = 2 and K = 5
and MAX-SAT instances (unweighted and weighted). With this diverse setting for
the experiments (two metaheuristics belonging to different families and two pseudo-
Boolean problems with two categories of instances each), we want to test DPX in dif-
ferent scenarios to identify when DPX is useful for pseudo-Boolean optimization and
when it is not.
Parameter Settings
4.2.1
We used an automatic tuning tool, irace (López-Ibáñez et al., 2016), to help us find a
good configuration for each algorithm in the different scenarios. The parameters to be
tuned are β in DPX, α in DRILS, and pm, the selection operator, and the population size
in EA. The ranges of values for each of them are presented in Table 5. The version of irace
432
Evolutionary Computation Volume 30, Nombre 3
Dynastic Potential Crossover Operator
Tableau 5: Parameters tuned by irace and their domain.
DPX
DRILS
EA
Parameter Domain
Parameter Domain
Parameter
Domain
β
[0-5]
un
[0,0.5]
pm
popsize
selection
[0,0.5]
[10,100]
(tournament, rank, roulette)
used is 3.4.1 and the budget is 1 000 runs of the tuned algorithm. The rest of parameters
of irace were set to the default. We applied irace independently for each combination of
algorithme, crossover operator, and family of instances used. This way we can compare
the algorithms using a good configuration in each scenario.
The stop condition in the search algorithms was set to reach a time limit of 60 sec-
onds (one minute) during the irace tuning and also in the experiments of the following
sections. We use runtime as stop condition because the number of fitness function eval-
uations is not a good metric for the computational budget in our case, where gray-box
optimization operators are taking profit from the VIG and the subfunction evaluations.
This can be easily observed comparing the huge difference in runtime of UX and DPX
in Table 1. The concrete value for the time limit (60 seconds) was chosen based on our
previous experience with these algorithms including gray box optimization. In previ-
ous works and experiments, where we stopped the experiments after 300 seconds, nous
found that the results are rather stable after 60 seconds, with few exceptions. Using
this short time also allows us to perform a larger set of experiments in the same time
obtaining more insight on how the algorithms work.
At the end of the tuning phase, irace provides several configurations that are consid-
ered equivalent in performance (irace does not find statistically significant differences
among them). We took, in all the cases, the first of those configurations and we show
it in Tables 6 et 7 for each combination of algorithm, crossover operator, and set of
instances. These are the parameters used in the experiments of Sections 4.2.2 et 4.2.3.
We did not check the convergence speed of irace and we do not know how far the pa-
rameters in Tables 6 et 7 are from the best possible configurations. Ainsi, we cannot get
any insight from the parameters computed by irace. Our goal is just to compare all the
algorithms and operators using good configurations, and avoid bias due to parameter
setting.
In order to reduce the bias due to the stochastic nature of the algorithms, they were
run ten times for each instance and average results are presented in the next sections.
The Mann–Whitney test was run with the samples from the ten independent runs to
check if the observed difference in the performance of the algorithms for each instance
is statistically significant at the 0.05 confidence level or not.
4.2.2 Results for NKQ Landscapes
Dans cette section, we analyze the results obtained for NKQ Landscapes. We do not know
the fitness value of the global optimal solutions in the instances. For this reason, pour
each instance (there are ten instances per value of K) we computed the fitness of the
best solution found in any run by any algorithm+crossover combination, f ∗
. Nous avons utilisé
f ∗
to normalize the fitness of the best solutions found by the algorithms in the instance.
Evolutionary Computation Volume 30, Nombre 3
433
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Tableau 6: Configuration proposed by irace during the tuning phase of the algorithms for
NKQ instances with n = 10 000 variables and K = 2, 5.
K = 2
K = 5
β
1
β
3
DRILS
DPX
APX
PX
NX
UX
EA
DPX
APX
PX
NX
UX
un
0.2219
0.1873
0.1240
0.0154
0.0159
pm
selection
popsize
0.0044
0.0172
0.0084
0.0007
0.0003
roulette
roulette
rank
roulette
rank
61
72
47
37
41
β
3
β
2
un
0.0462
0.0231
0.0191
0.0268
0.0238
pm
selection
popsize
0.0080
0.0002
0.0034
0.0008
0.0006
rank
roulette
rank
rank
roulette
15
27
70
54
14
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Tableau 7: Configuration proposed by irace during the tuning phase of the algorithms for
unweighted and weighted MAX-SAT instances.
Unweighted
Weighted
β
4
β
5
DRILS
DPX
APX
PX
NX
UX
EA
DPX
APX
PX
NX
UX
un
0.0582
0.0941
0.0482
0.0299
0.0571
pm
selection
popsize
0.0038
0.0096
0.0051
0.0047
0.0019
rank
tournament
tournament
rank
rank
18
19
27
18
18
β
2
β
2
un
0.1832
0.1870
0.0996
0.0241
0.0214
pm
selection
popsize
0.0018
0.0069
0.0086
0.0020
0.0012
rank
tournament
rank
rank
tournament
52
24
20
27
78
Ainsi, if x is the best solution found by an algorithm in an instance, we define the quality
of solution x as q(X) = f (X)/f ∗. The benefit of this is that the quality of a solution x is a
real number between 0 et 1 that measures how near is the fitness of x to the best known
fitness (in this set of experiments). This also allows us to aggregate quality values from
different instances because they are normalized to the same range. Higher values of
quality are better.
The main results of the section are shown in Table 8, where the column quality is the
average over ten runs and ten instances (100 samples in total) of the quality of the best
434
Evolutionary Computation Volume 30, Nombre 3
Dynastic Potential Crossover Operator
Tableau 8: Performance of the five recombination operators used in DRILS and EA when
solving NKQ Landscapes instances with n = 10 000 variables. The symbols (cid:3), (cid:4), and =
are used to indicate that the use of the crossover operator in the row yields statistically
better, worse, or similar results than the use of DPX in each algorithm.
K = 2
K = 5
Statistical difference Quality
Statistical difference Quality
DRILS
DPX
APX
PX
NX
UX
EA
DPX
APX
PX
NX
UX
0(cid:3) 8(cid:4) 2 =
0(cid:3) 10(cid:4) 0 =
0(cid:3) 10(cid:4) 0 =
0(cid:3) 10(cid:4) 0 =
0(cid:3) 10(cid:4) 0 =
0(cid:3) 10(cid:4) 0 =
0(cid:3) 10(cid:4) 0 =
0(cid:3) 10(cid:4) 0 =
0.9997
0.9995
0.9990
0.9786
0.9790
0.9795
0.9568
0.9445
0.8803
0.9313
0(cid:3) 7(cid:4) 3 =
0(cid:3) 7(cid:4) 3 =
0(cid:3) 10(cid:4) 0 =
0(cid:3) 10(cid:4) 0 =
1(cid:3) 0(cid:4) 9 =
10(cid:3) 0(cid:4) 0 =
0(cid:3) 1(cid:4) 9 =
0(cid:3) 1(cid:4) 9 =
0.9972
0.9947
0.9949
0.9934
0.9935
0.8132
0.8890
0.9085
0.7811
0.8407
solution found by the algorithm+crossover combination in the row. The column sta-
tistical difference shows the result of a Mann–Whitney test (with significance level 0.05)
and a median comparison to check if the differences observed in quality between the
algorithm+crossover in the row and algorithm+DPX are statistically significant or not.
In each row and for each instance, the results of the ten runs are used as input to the
Mann–Whitney test. It determines equivalence or dissimilarity of samples. The sense of
the inequality is determined by the median comparison. In the table, the numbers fol-
lowed by a black triangle ((cid:3)), white triangle ((cid:4)), and equal sign (=) are the numbers of
instances in which the algorithm+crossover combination in the row is statistically bet-
ter, worse or similar to algorithm+DPX. In Figures 6 et 8 we show the average quality
of the best found solution at any time during the search using the different algorithms
and crossover operators. We group the curves in the figures by algorithm (DRILS and
EA) and ruggedness of the instances (K = 2 and K = 5).
The first important conclusion we obtain from the results in Table 8 is that DRILS
performs better with DPX than with any other crossover operator. There is no single
NKQ Landscapes instance in our experimental setting where another crossover oper-
ator outperforms DPX included in DRILS. There are only a few instances (8 in total)
where APX and/or PX show a similar performance. We can observe in Figure 6a that
DRILS+DPX obtains the best average quality at any time during the search when K = 2,
followed by PX and APX. UX and NX provide the worst average quality in this set of
instances. We observe in the figure signs of convergence in all the crossover operators.
Cependant, after a careful analysis checking the time of last improvement, whose distri-
bution is presented in Figure 7a, we notice that DRILS with DPX, APX, and PX provides
improvements to the best solution after 50 seconds in around 50% of the runs, alors que
DRILS with UX and NX seems to stuck in 30 à 40 seconds after the start of the search,
and much earlier in some cases. We wonder if this time could be biased by the different
Evolutionary Computation Volume 30, Nombre 3
435
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Chiffre 6: Average quality of the best found solution at any time for DRILS using differ-
ent crossover operators solving NKQ Landscapes.
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Chiffre 7: Probability distribution of the time of last improvement to the best found so-
lution from the start of the search (un) and the average time between improvements for
the three last improvements (b) in DRILS with the different crossover operators when
solving NKQ Landscapes with K = 2.
runtime of crossover operators. Perhaps the algorithm produces the last improvement
near the end of the execution for DPX, APX, and PX but the previous one was in the
middle of the run, earlier than the last improvement of NX and UX. To investigate this,
we show in Figure 7b the distribution of the average time between improvements for
the last three improvements. This time is far below one second in most of the cases for
all the crossover operators, which means that they are producing better solutions sev-
eral times per second on average until the time of last improvement, and there is no bias
related to the different crossover runtime.
In the more rugged instances (K = 5), shown in Figure 6b, DRILS+DPX is the best
performing algorithm after 20 seconds of computation. Before that time DRILS+UX
provides the best performance. We can explain this with the help of Table 1. UX is
436
Evolutionary Computation Volume 30, Nombre 3
Dynastic Potential Crossover Operator
Chiffre 8: Average quality of the best found solution at any time for EA using different
crossover operators solving NKQ Landscapes.
the fastest crossover operator, and helps to advance in the search at the beginning.
DPX (as well as PX and APX) are slower operators and, even if they provide better
quality offspring, they slow down the search, requiring more time to be effective. Nous
have to recall here that DRILS includes a hill climber, which explains why when us-
ing a random black box operator like UX the quality of the best solution is still high
(au-dessus de 0.93).
If we analyze the performance of the crossover operators in EA, we observe that
DPX is also the best crossover operator for the instances with K = 2. Cependant, quand
K = 5 DPX is outperformed by PX and, in general, shows a performance similar to
APX, NX and UX. The best crossover operator in EA when K = 5 is PX. Taking a look
to Figure 8a, we observe that EA+PX and EA+APX provide the highest average quality
of the best solution during the first 35 à 40 seconds, then they are surpassed by DPX.
The reason for this slow behavior of EA+DPX is again the high runtime of DPX, lequel
in the case of EA is higher than in the case of DRILS due to the fact that the solutions in
the initial population are random and, thus, differ in around h = 0.5n = 5 000 bits. Ce
slow runtime is especially critical when K = 5 (Figure 8b), where EA+DPX is not able to
reach the average quality of EA+PX in 60 seconds. In Figure 9, we plot the time required
by DPX during the search when it is included in DRILS and EA. The runtime of DPX in
DRILS is a few milliseconds because the parent solutions differ in around αn = 462 bits
(α = 0.0462 according to Table 6), while the runtime of DPX in EA starts in six seconds
and goes down to one second at the end of the search. This behavior suggests that in
an EA a hybrid approach combining PX at the beginning of the search and DPX later
during the search could be a better strategy to reach better quality solutions in a short
temps. The random crossover operators, UX and NX, show a poorer performance in EA
compared with DRILS probably because there is no local search in EA.
Enfin, although it is not our goal to compare search algorithms (only crossover
operators) we would like to highlight some observations regarding the average quality
of the best found solutions in DRILS and EA. We conclude that DRILS is always better
than EA. For K = 2, the highest quality in EA is obtained when DPX is used and its
quality is only slightly higher than that of DRILS with NX and UX (the worst performing
crossover operators in DRILS). For K = 5 the difference is even higher: EA+PX reaches
an average quality of 0.9085 (highest for EA) which is far below any average quality
of DRILS, all above 0.9934 (the one of DRILS+NX). We think that the reason for that
Evolutionary Computation Volume 30, Nombre 3
437
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Chiffre 9: Runtime (in seconds) of DPX as the search progresses in DRILS and EA solving
NKQ Landscapes with K = 5.
Tableau 9: Performance of the five recombination operators used in DRILS and EA when
solving MAX-SAT instances. The symbols (cid:3), (cid:4), and = are used to indicate that the use
of the crossover operator in the row yields statistically better, worse, or similar results
than the use of DPX.
Unweighted
Weighted
Statistical difference Quality
Statistical difference Quality
DRILS
DPX
APX
PX
NX
UX
EA
DPX
APX
PX
NX
UX
14(cid:3) 91(cid:4) 55 =
8(cid:3) 103(cid:4) 49 =
2(cid:3) 126(cid:4) 32 =
0(cid:3) 124(cid:4) 36 =
52(cid:3) 68(cid:4) 40 =
17(cid:3) 107(cid:4) 36 =
18(cid:3) 101(cid:4) 41 =
27(cid:3) 96(cid:4) 37 =
0.9984
0.9973
0.9968
0.9946
0.9953
0.9644
0.9604
0.9095
0.8980
0.9134
15(cid:3) 86(cid:4) 31 =
25(cid:3) 80(cid:4) 27 =
1(cid:3) 126(cid:4) 5 =
1(cid:3) 126(cid:4) 5 =
43(cid:3) 63(cid:4) 26 =
8(cid:3) 109(cid:4) 15 =
18(cid:3) 103(cid:4) 11 =
18(cid:3) 99(cid:4) 15 =
0.9996
0.9984
0.9982
0.9915
0.9930
0.9583
0.9649
0.9057
0.8786
0.8989
could be the presence of a local search operator in DRILS, while EA is mainly guided
by selection and crossover (when PX, APX, or DPX is used).
4.2.3 Results for MAX-SAT
Dans cette section, we analyze the results obtained for MAX-SAT. We also use the qual-
ity of the solutions defined in Section 4.2.2 as a normalized measure of quality. Tableau 9
presents the main results of the section. The meaning of the columns is the same as
in Table 8. In the case of DRILS, all the instances are used in the statistical tests and
the computation of the average quality (160 instances in the unweighted category and
438
Evolutionary Computation Volume 30, Nombre 3
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
132 instances in the weighted category). In the case of EA, we observed that it failed
to complete the execution in some runs for some instances when it was combined with
DPX. The reason was an out of memory problem, caused by the large number of differ-
ing variables among the solutions in the initial generations. Dans ce cas, we computed
the average quality only for instances in which at least 90% of the runs were success-
ful (nine of the ten runs) and we manually counted the instances with less than 90%
successful runs as significantly worse than EA+DPX for the remaining EA+crossover
combinations in Table 9 without performing any statistical test. Nine unweighted in-
stances and five weighted instances had less than 90% successful EA+DPX runs. Three
unweighted instances and no weighted instance had exactly 90% of successful runs and
in the remaining instances EA+DPX ends successfully in all the runs.
From the results in Table 9 we conclude that both algorithms (DRILS and EA) par-
form better, in general, using DPX as the crossover operator. Only in very few cases
does any other crossover operator improve the final result of DRILS. When EA is used,
the difference is not so clear, but still significant.
Encore une fois, we also observe that the performance of DRILS is better than that of
EA. The maximum average quality in EA is 0.9649 (EA+APX in the weighted instances)
while the average quality of DRILS is always above 0.9915 for all the crossover operators
and category of instances.
We do not expect DRILS or EA to be competitive with state-of-the-art incomplete
MAX-SAT solvers like SATLike-c9 (Cai and Lei, 2020), because they are general opti-
mization algorithms. Cependant, DPX could be useful to improve the performance of
some incomplete MAX-SAT solvers, as PX did in recent work (Chen et al., 2018).
4.2.4 DPX Alone as Search Algorithm
Dans cette section, we analyze the results of DPX alone used to solve pseudo-Boolean opti-
mization problems. We apply DPX to a random solution and its complement, avec le
goal of finding the global optimum. Due to technical limitations of our current imple-
mentation of DPX, we cannot set β = ∞, but we use the maximum value of β allowed
by the implementation, which is 28. This also means that if the whole search space is not
explored for a concrete instance the result could depend on the initial (random) solu-
tion. For this reason, we run DPX ten times per instance on different random solutions.
We set a runtime limit of 12 hours.
After applying DPX to the 20 instances of NKQ Landscapes (ten instances for each
value of K), we found that all the runs were prematurely terminated due to an out of
memory error (the memory limit was set to 5 GB as in the previous experiments). Dans
the case of the MAX-SAT instances, the runs finished in less than 12 hours for eight
unweighted instances and two weighted instances. Dans 152 unweighted instances and
128 weighted ones, DPX ran out of memory. In two weighted instances, we stopped
DPX after 12 hours of computation. Tableau 10 shows the MAX-SAT instances that finished
without error and, for each one, it shows the average and maximum number of satisfied
clauses in the ten independent runs of DPX, the average runtime (in seconds) et le
logarithm in base 2 of the implicitly explored solutions (it is the same in all runs). Le
minimum number of satisfied clauses found in all the runs of DRILS+DPX in less than
60 seconds is shown for comparison in the last column. We mark with an asterisk in the
logarithm of implicitly explored solutions the runs that fully explored the search space
(thus, DPX was able to certify the global optimum).
9SATLike-c was the winner in the unweighted incomplete track of the MAX-SAT Evaluation 2021
and got third position in the weighted incomplete track.
Evolutionary Computation Volume 30, Nombre 3
439
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Tableau 10: Results of DPX with β = 28 for the MAX-SAT instances that finished in less
que 12 hours without memory error. The minimum number of satisfied clauses found
in all the runs of DRILS+DPX in less than 60 seconds is shown for comparison in the last
column. The asterisk means that the whole search space was explored and the number
of satisfied clauses is the global maximum.
Instance
avg
maximum
(s)
(log2)
+DPX
Satisfied clauses
Time
Explored DRILS
Unweighted instances
bcp-msp-normalized-ii16c1
maxcut-brock200_1.clq
maxcut-brock400_2.clq
maxcut-brock800_2.clq
maxcut-p_hat1000-2.clq
maxcut-p_hat700-3.clq
maxcut-san400_0.5_1.clq
maxcut-san400_0.9_1.clq
15 690
889
929
809
600
986
644
1047
15 848
892
933
816
600
989
644
1050
9 401
8 253
6 387
8 572
375
7 617
990
8 589
maxcut-johnson8-2-4.clq
maxcut-sanr200_0.9.clq
2 048
5 994
2 048
6 027
144
10 680
Weighted instances
49
32
32
31
*40
32
*40
30
*28
31
20 266
894
936
819
600
993
644
1052
2 048
6 036
The main conclusion is that DPX alone is not good compared with its combination
with a search algorithm. It requires a lot of memory and time to compute the result.
Even when DPX finishes, the results are always outperformed by DRILS+DPX (see last
column of Table 10).
In the previous experiment, we forced DPX to use a lot of memory and runtime be-
cause the value of β was high. One of the advantages of DPX compared with an optimal
recombination operator is that we can limit the time and space complexity of DPX using
β. We wonder if DPX alone with a low value of β could outperform DRILS or EA. Dans
order to answer this question we designed a new experiment in which we run a new al-
gorithm, called iDPX (iterated DPX), which consists of the iterated application of DPX
over a random solution and its complement. The algorithm stops when a predefined
time limit is reached (one minute in our case). We used irace to tune iDPX. The only pa-
rameter to tune is β, and we used the same configuration of irace that we used to tune
DRILS and EA. According to irace, the best value for β is 5 for both, NKQ Landscapes
and MAX-SAT. We run iDPX ten times per instance and compared it with DRILS and
EA using the five crossover operators.
In NKQ Landscapes, iDPX was statistically significantly worse than all the remain-
ing ten algorithms (DRILS and EA with the five crossover operators) for all the instances.
The average quality of the solutions obtained by iDPX was 0.7543 for K = 2 et 0.6640
for K = 5, which are very low values compared with the ones obtained by the other
algorithms (see Table 8).
In MAX-SAT, iDPX prematurely finished with out of memory errors in 59 et-
weighted instances and 25 weighted ones. In these cases, we ran iDPX with β = 0 à
alleviate the memory requirements but the algorithm ran out of memory again. Quand
iDPX does not have a result for an instance due to an error, we mark it as statistically
440
Evolutionary Computation Volume 30, Nombre 3
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
Tableau 11: Performance comparison of iDPX, DRILS and EA when solving MAX-SAT
instances. The symbols (cid:3), (cid:4), and = are used to indicate that the algorithm in the row
yields statistically better, worse, or similar results than the use of iDPX.
Unweighted
Weighted
Statistical difference Quality
Statistical difference Quality
iDPX
DRILS
DPX
APX
PX
NX
UX
EA
DPX
APX
PX
NX
UX
160(cid:3) 0(cid:4) 0 =
160(cid:3) 0(cid:4) 0 =
160(cid:3) 0(cid:4) 0 =
160(cid:3) 0(cid:4) 0 =
160(cid:3) 0(cid:4) 0 =
150(cid:3) 0(cid:4) 10 =
148(cid:3) 1(cid:4) 11 =
131(cid:3) 16(cid:4) 13 =
128(cid:3) 20(cid:4) 12 =
131(cid:3) 19(cid:4) 10 =
0.8772
0.9992
0.9979
0.9981
0.9966
0.9967
0.9869
0.9675
0.9344
0.9156
0.9334
132(cid:3) 0(cid:4) 0 =
132(cid:3) 0(cid:4) 0 =
132(cid:3) 0(cid:4) 0 =
130(cid:3) 1(cid:4) 1 =
130(cid:3) 1(cid:4) 1 =
126(cid:3) 0(cid:4) 6 =
126(cid:3) 0(cid:4) 6 =
112(cid:3) 3(cid:4) 17 =
95(cid:3) 27(cid:4) 10 =
120(cid:3) 4(cid:4) 8 =
0.8516
0.9997
0.9984
0.9982
0.9907
0.9925
0.9700
0.9676
0.9155
0.8836
0.9063
significantly worse than the other algorithms that finished with a result and equal to
the algorithms that did not finish (EA+DPX does not finish in some instances). Dans
Tableau 11 we compare iDPX with the other ten algorithms, providing the same met-
rics as in Section 4.2.3. The average quality values change with respect to the ones in
Tableau 9. The reason is that the instances used for the average computation in Table 11
are a subset of the instances used in Table 9, due to the memory errors of iDPX.
We observe in Table 11 that iDPX is statistically worse than DRILS+DPX,
DRILS+APX, and DRILS+PX in all the instances. iDPX does not outperform EA+DPX
in any case (it is equivalent in 16 instances). En outre, iDPX is worse in most of the
instances for algorithms not using DPX (par exemple., EA+UX). Ainsi, our main conclusion of
this section is that DPX alone is not competitive with any search algorithm and should
only be used as a crossover operator inside a search algorithm. DPX can provide the
global optimum only in the cases where the number of variables is low or the VIG has a
low number of edges. This is hardly the case in NP-hard problems (it happened in only
three MAX-SAT instances of our benchmark).
Local Optima Network Analysis of DRILS+DPX
4.3
The best algorithm+crossover combination identified in Section 4.2 is DRILS+DPX, pour
NKQ Landscapes and MAX-SAT. In this section we conduct a local optima network
(LON) analyse (Ochoa et al., 2008, 2015) in order to better understand the search dy-
namics of DRILS+DPX. A LON is a graph where nodes are local optima and edges rep-
resent search transitions among them with a given operator. DRILS has two transition
operators, crossover (DPX) and perturbation (each followed by local search to produce
a local optimum), which are modeled as two types of edges in the LON: DPX + hill
climber and perturbation + hill climber. An edge is, respectivement, improving if its end
node has better fitness than its start node, equal if the start and end nodes have the same
Evolutionary Computation Volume 30, Nombre 3
441
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Tableau 12: LON Statistics for NKQ Landscapes with n = 10 000 variables. Average (avg)
and standard deviation (std) of the ten instances for each value of K are shown.
Crossover Edges
Perturbation Edges
Nodes
improv.
equal worse.
improv.
equal worse.
K = 2
avg
std
K = 5
avg
std
24 361.7
2 387.7
20 490.7
1 970.6
3 676.9
1 422.5
23 371.8
3 137.5
22 697.6
3 033.4
123.0
15.0
0.0
0.0
0.0
0.0
10.8
0.6
53.2
5.4
0.0
0.0
0.0
0.0
12 263.4
1 194.9
11 898.8
1 612.8
fitness, and worsening, if the end node has worse fitness than the start node. Our analysis
reports the LONs extracted by ten runs of DRILS+DPX solving NKQ Landscapes in-
stances with n = 10 000 and the two extreme values of K used in our experiments (2 et
5). The parameters used are the ones in Table 5 for this algorithm and value of K. Le
process of extracting the LON data is as follows. For each of the ten runs per instance,
we record every unique local optima and edge encountered, from the start until the end
of the run. As it is done in previous work (Ochoa and Veerapen, 2018), we combined the
search trajectories produced by the ten runs as a sampling process to construct a single
LON and our analysis revealed that there are no overlapping nodes and edges across
the runs. Donc, the combined LONs contain ten connected components, one char-
acterizing each run. Tableau 12 reports some basic LON statistics, specifically, the number
of nodes (local optima) and the number edges of each type, crossover (DPX) and per-
turbation, grouped as improving, equal, and worsening edges. They are statistics over
the ten different instances (and LONs). There cannot be a worsening crossover edge, par
design of DPX, but we include the column for the sake of completeness. The number
of nodes and edges sampled is similar for the different values of K. Although DPX is
slower for K = 5, its perturbation factor α = 0.0462 is lower than in the case of K = 2.
With the parameters provided by irace, DRILS+DPX visits a similar number of unique
local optima during the search for both values of K. It is interesting to note that most
of the crossover edges are improving, while most of the perturbation edges are deterio-
rating. This suggests that the role of perturbation within DRILS is to provide diversity
as a raw material for DPX, which then incorporates newly found variable assignments
that improve fitness.
In order to have a closer look at the search trajectories, Chiffre 10 visualizes the LON
of a run obtaining the best solution for two selected instances with n = 10 000 et le
two values of K. The plots use a tree-based graph layout where the root node is the local
optimum at the beginning of the run, highlighted with cyan color. We color in yellow
the best local optima found. Red edges are crossover edges, where each parent adds an
edge pointing to the offspring. Blue edges with triangular marker represent perturba-
tion edges. Solid-, dashed-, and dotted-line styles are used for improving, worsening,
and equal edges. Since the complete LONs are large (plus que 2,000 nodes in each
run), we show only the neighborhood around the starting and best nodes. The search
process is very similar in structure for the two instances and reflects the working of the
algorithme. Cependant, we also observe some remarkable differences for different values
of K. We observe more improving perturbation edges when K = 5 than when K = 2.
442
Evolutionary Computation Volume 30, Nombre 3
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Chiffre 10: LONs for two NKQ Landscapes instances with n = 10 000 and K = 2, 5 comme
indicated in the captions. Yellow nodes highlight the best local optima found, alors que
cyan nodes the start local optimum. Red edges correspond to crossover and blue edges
with a triangular marker to perturbation. Improving edges are solid, worsening edges
are dashed, and equal edges are dotted.
En fait, only one improving perturbation edge appears for K = 2, which means that it
is difficult to improve a solution provided by DPX in the near neighborhood reached
by the perturbation operator followed by local search. We also should highlight that
improving perturbation edges appear only at the beginning of the search in both cases
(K = 2 and K = 5). After a few steps, only DPX provides improvements over the best
found solution. We also see that many different best local optima are found in K = 2
joined by equal crossover edges, while only one appears in the instance with K = 5.
Evolutionary Computation Volume 30, Nombre 3
443
F. Chicano et al.
DPX shows here an ability to jump from one best local optimum to another. No equal
crossover edges appear near the end node when K = 5: once the best local optimum is
trouvé, a worsening perturbation edge makes the search to escape from it.
5 Conclusions
This article proposes a new gray-box crossover operator, dynastic potential crossover
(DPX), with the ability to obtain a best offspring out of the full dynastic potential if the
density of interactions among the variables is low, where it can behave like an optimal
recombination operator. We have provided theoretical results proving that DPX, quand
recombining two parents, generates an offspring no worse than partition crossover (PX)
and usually no worse than articulation points partition crossover (APX). We also per-
formed a thorough experimental study to compare DPX with four crossover operators,
including PX and APX, and using two different algorithms: DRILS and EA. Nous avons utilisé
NKQ Landscapes and MAX-SAT instances in the comparison. We conclude that DPX
offers a great exploration ability, providing offspring that are much better than their
parents, compared with the other crossover operators. Cependant, this ability has a cost
in runtime and memory consumption, the main drawback of the operator. Included
in DRILS, DPX outperforms all the other crossover operators in NKQ Landscapes and
MAX-SAT instances. In the case of EA, it also outperforms the other crossover opera-
tors except in the case of NKQ Landscapes with K = 5, where PX is the best performing
operator. Ainsi, we suggest that a combined use of PX and DPX in EA could be optimal
in high epistasis functions.
An interesting future line of research is to analyze the shape of the connected com-
ponents of the recombination graph to design precomputed clique trees that could
speed up the operator. The code of DPX can also be much optimized when we focus
on particular problems, like MAX-SAT. We used in this work a general implementation
that can be optimized for particular problems. A problem-specialized version of DPX
could be combined with state-of-the-art incomplete MAX-SAT solvers to improve their
performance.
Remerciements
This research is partially funded by the Universidad de Málaga, Consejería de Economía
y Conocimiento de la Junta de Andalucía and FEDER under grant number UMA18-
FEDERJA-003 (PRECOG); under grant PID 2020-116727RB-I00 (HUmove) funded
by MCIN/AEI/ 10.13039/501100011033; and TAILOR ICT-48 Network (Non 952215)
funded by EU Horizon 2020 research and innovation program. The work is also par-
tially supported in Brazil by São Paulo Research Foundation (FAPESP), under grants
2021/09720-2 et 2019/07665-4, and National Council for Scientific and Technological
Développement (CNPq), under grant 305755/2018-8.
Special thanks to J.A. Lozano, Sebastian Herrmann, and Hansang Yun for providing
pointers to algorithms for the clique tree construction, and to the anonymous reviewers
for their thorough review and constructive suggestions.
Les références
Bodlaender, H. L. (2005). Discovering treewidth. In Proceedings of SOFSEM, pp. 1–16.
Cai, S., and Lei, Z. (2020). Old techniques in new ways: Clause weighting, unit propagation
and hybridization for maximum satisfiability. Artificial Intelligence, 287:103354. 10.1016/
j.artint.2020.103354
444
Evolutionary Computation Volume 30, Nombre 3
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
.
/
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Dynastic Potential Crossover Operator
Chen, W., and Whitley, D. (2017). Decomposing SAT instances with pseudo backbones. En Pro-
ceedings of EvoCOP, pp. 75–90. Lecture Notes in Computer Science, Vol. 10197.
Chen, W., Whitley, D., Tinós, R., and Chicano, F. (2018). Tunneling between plateaus: Improving
on a state-of-the-art MAXSAT solver using partition crossover. In Proceedings of the Genetic
and Evolutionary Computation Conference (GECCO), pp. 921–928.
Chicano, F., Ochoa, G., Whitley, D., and Tinós, R.. (2018). Enhancing partition crossover with artic-
ulation points analysis. In Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO), pp. 269–276.
Chicano, F., Ochoa, G., Whitley, D., and Tinós, R.. (2019). Quasi-optimal recombination operator.
In Proceedings of EvoCOP, pp. 131–146. Lecture Notes in Computer Science, Vol. 11452.
Chicano, F., Whitley, D., Ochoa, G., and Tinós, R.. (2017). Optimizing one million variable NK
landscapes by hybridizing deterministic recombination and local search. In Proceedings of
the Genetic and Evolutionary Computation Conference (GECCO), pp. 753–760.
Eremeev, UN. V., and Kovalenko, J.. V. (2013). Optimal recombination in genetic algorithms. CoRR,
abs/1307.5519.
Galinier, P., Habib, M., and Paul, C. (1995). Chordal graphs and their clique graphs. En M. Nagl
(Ed.), Graph-theoretic concepts in computer science, pp. 358–371. New York: Springer.
Hammer, P.. L., Rosenberg, JE., and Rudeanu, S. (1963). On the determination of the minima of
pseudo-Boolean functions. Studii ¸si Cercet˘ari de Matematic˘a, 14:359–364.
Hauschild, M.. W., and Pelikan, M.. (2010). Network crossover performance on NK landscapes
and deceptive problems. In Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO), pp. 713–720.
Hordijk, W., and Stadler, P.. F. (1998). Amplitude spectra of fitness landscapes. Advances in Complex
Systems, 1:39–66. 10.1142/S0219525998000041
López-Ibáñez, M., Dubois-Lacoste, J., Pérez-Cáceres, L., Birattari, M., and Stützle, T. (2016). Le
irace package: Iterated racing for automatic algorithm configuration. Operations Research Per-
spectives, 3:43–58.
Newman, M.. E. J., and Engelhardt, R.. (1998). Effect of neutral selection on the evolution of
molecular species. In Proceedings of the Royal Society of London B, 265:1333–1338. 10.1098/
rspb.1998.0438
Ochoa, G., and Chicano, F. (2019). Local optima network analysis for MAX-SAT. In Proceedings of
the Genetic and Evolutionary Computation Conference (GECCO) Companion, pp. 1430–1437.
Ochoa, G., Chicano, F., Tinós, R., and Whitley, D. (2015). Tunnelling crossover networks. En Pro-
ceedings of the Genetic and Evolutionary Computation Conference (GECCO), pp. 449–456.
Ochoa, G., Tomassini, M., Verel, S., and Darabos, C. (2008). A study of NK landscapes’ basins and
local optima networks. In Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO), pp. 555–562.
Ochoa, G., and Veerapen, N. (2018). Mapping the global structure of TSP fitness landscapes. Jour-
nal of Heuristics, 24(3): 265–294. 10.1007/s10732-017-9334-0
Radcliffe, N. J.. (1994). The algebra of genetic algorithms. Annals of Mathematics and Artificial Intel-
ligence, 10(4): 339–384. 10.1007/BF01531276
Rana, S., Heckendorn, R.. B., and Whitley, D. (1998). A tractable Walsh analysis of SAT and its
implications for genetic algorithms. In Proceedings of AAAI, pp. 392–397.
Tarjan, R.. E. (1975). Efficiency of a good but not linear set union algorithm. Journal of the ACM,
22(2): 215–225. 10.1145/321879.321884
Evolutionary Computation Volume 30, Nombre 3
445
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
F. Chicano et al.
Tarjan, R.. E., and Yannakakis, M.. (1984). Simple linear-time algorithms to test chordality of graphs,
test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM Journal on
Computing, 13(3): 566–579. 10.1137/0213035
Terras, UN. (1999). Fourier analysis on finite groups and applications. Cambridge, U.K.: Cambridge
Presse universitaire.
Tinós, R., Whitley, D., and Chicano, F. (2015). Partition crossover for pseudo-Boolean optimiza-
tion. In Proceedings of Foundations of Genetic Algorithms, pp. 137–149.
Tinós, R., Zhao, L., Chicano, F., and Whitley, D. (2018). NK hybrid genetic algorithm for clustering.
IEEE Transactions on Evolutionary Computation, 22(5): 748–761.
Whitley, D., Chicano, F., and Goldman, B. W. (2016). Gray box optimization for Mk landscapes
(NK landscapes and MAX-kSAT). Evolutionary Computation, 24:491–519. 10.1162/EVCO_a_
00184, PubMed: 27120114
Wright, UN., Thompson, R., and Zhang, J.. (2000). The computational complexity of NK fitness func-
tion. IEEE Transactions on Evolutionary Computation, 4(4): 373–379. 10.1109/4235.887236
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
/
/
e
d
toi
e
v
c
o
un
r
t
je
c
e
–
p
d
je
F
/
/
/
/
3
0
3
4
0
9
2
0
4
0
9
1
6
e
v
c
o
_
un
_
0
0
3
0
5
p
d
/
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
446
Evolutionary Computation Volume 30, Nombre 3