I've chosen EO since a few years to code metaheuristics, after having test −
and even code − several other frameworks,
and I did not regret this choice. Since a few month, I'm now the maintainer of
EO, and the main developper, along with my intern, Caner Candan, and
we have great plan for improving EO!
Petr Pošík let an interesting comment on a previous post : "Metaheuristic"
or "Stochastic Local Search"?. As bloggers always love worthwhile comments
from insightful readers, I copy it here, as an article, along with my
answer.
I would like to present here an idea that will take the "stochastic" and
"local" idea even more further than Thomas and Dirk. In that view, even GAs are
local search techniques! Why?
It has a lot to do with the definition of neighborhood of the current state
of the algorithm. The state may be the position of an algorithm in the search
space. In classical local search, the state is given by mere 1 point. In
population-based techniques, the state is the whole population. New solutions
are usually generated only by local modifications of the state, i.e. the search
takes place only in the local neighborhood of the algorithm state.
Yes, of course, the local neighborhood of the population as a whole is very
broad and such a local search is much less prone to premature convergence and
exhibits a kind of global behaviour. But, IMHO, it is still a local
search...
I completely understand your view: if there is a non-zero probability of
generating any solution in the search space, then the search must eventually
find the global optimum, and thus performs a global search.
This is the case for many stochastic local search algorithm in continuous
domain which use e.g. normal distribution. But if you set the variance too
high, you will get very slow convergence to that global optimum which is in
practice not desirable. If you set variance too low, you will quickly find
local optimum, but you will have to wait for the global one virtually
infinitely long - again undesirable. Or, put it in another way, would you call
a "global optimization" technique still "global" if it takes zilions of years
to find that global solution?
In practice, I think, for continuous spaces we have to resort to algorithms
that exhibit local behavior since we always have some hard constraints on the
length of the optimization run. In my eyes, this perfectly justifies the name
"stochastic local search".
In fact, as far as I comprehend them, both Dick Thierens and Thomas Stützle
shares your point of view. For them, GA are a kind of stochastic local
search.
More rigourously, I think that every stochastic metaheuristic does
try to avoid local optima. Stochastic processes are always here to do
so[1].
Maybe the problem is the definition of local search. In my humble
opinion, a local search is a method that search for a local optimum.
This can be opposed to global search, where one want to find a
global optimum.
You are right when you point out that this is related to ergodicity, as only
an ergodic[2] algorithm may converge[3], and thus be able to reliably find a global
optimum. Thus, yes, I will say that a true and rigorous optimization method is
global if, and only if, it is at least quasi-ergodic[4]. A pure random search is the basic global
optimization method, that a good metaheuristic should at least outperforms.
Here, it is the stochastic operators that manipulates the solutions that
permits the global behaviour. This is not related to the encoding of the
solutions/the neighbourhood structure, even if it is a really crucial
part of the algorithm.
Thus, a "stochastic local search" may be defined as a local search seeking a
global optimum, which is a paradoxical definition. I'd rather prefer that we
keep the implementation and the mathematical bases separated, and thus talk of
"stochastic search", or "stochastic metaheuristic".
Or else, one may want to use a paradoxical definition, as a matter of fun,
which is also a good reason to do so :-)
Notes
[1] There may be probabilistic choices that are not related
to such tasks, but they are not linked to the iterative aspects of the search
algorithms, thus not being stochastics
[3] i.e. that have a non-null probability of finding the
global optimum in a finite time, or to say it differently, that can find the
global optimum after a time that may tends towards infinity (for continuous
problems, for discrete ones it is bounded by the size of the instance)
[4] i.e. that can reach any solution in a finite
number of iterations
Here are some explanations about the work I have presented
at the META'08 conference. This post is based on the notes I used for my
presentation.
In all the automatic parameter setting methods, the problem of finding the
better parameter set is considered as an optimization problem, with only one
objective, generally finding the best optimum, or reducing the uncertainty of
the results. Sometimes, one try to improve the speed. More rarely, speed,
precision or robustness are aggregated in one criterion, with an adhoc formula.
In fact, one can set parameters according to several objectives (improve speed,
improve robustness, etc.). One cannot find a set of parameters fitting all the
potential uses of a single algorithm on a single problem instance. Thus,
parameter setting is a multi-objective problem.
The key point here is that it is easier to set the parameters of a solver
than to solve the problem directly. The simpler example of this idea is when
you want to solve a continuous optimization problem with hundreds of variables,
with a metaheuristic that have 3 parameters. Moreover, you only have to tune
your parameters once, even if you will solve many problems instances later.
In this work, I only consider speed and precision, although the method may
handle any performance metrics.
What is crucial in our method is that we do not want to aggregates the
criterions, instead, we want the Pareto front corresponding to all the
non-dominated parameters set. I use plots representing the Pareto front, that I
will sometimes call the « performance » front, or performance
profile.
The idea is that one can then compare more rigorously several algorithms, by
comparing their respective performance fronts. We can also benefits from having
a cursor, scaling from a behaviour oriented towards speed, at one extreme, or
precision, at the other side. Even more interesting is the performance profile
projected on the parameters space. One can see that every algorithm has its
very own profile, that tells a lot on how it behaves.
The figure above shows performance profiles of 4 metaheuristics: a
Simulated Annealing, a Genetic Algorithm, two Estimation of Distribution
Algorithms (produced by NSGA-II, with median estimation, on the Rosenbrock-2D
problem, using the parameter corresponding to the sampling density for each
method with an absolute time stopping criterion).
Our results suggest that the choice of the stopping criterion has a drastic
influence on the interest of the performance profile, it must be chosen
carefully. Similarly, the method can not naturally find a unique profile for a
set of problem instances, but is strictly valid only for an instance of a given
problem. Finally, we note that the performance profiles are often convex in the
objectives space, which could indicate that aggregation may be usefull.
The proposed method allows to aggregate all parameters into one, determining
the position within the profile of performance, since behavior strongly
oriented towards production (fast, unaccurate) to conception (slow, accurate).
The projection of the profile in the space of parameters can also reflect the
impact of parameters on performance, or dependence between parameters. Such
information may be very relevant to better understand some complex
metaheuristics. It also becomes possible to compare several metaheuristics,
delaying the performance profiles on the same scale. The statistic validation
also receives additional dimensions of discrimination.
In perspective, it remains to reduce the demand for calculations of the
meta-optimizer, using dedicated methods (SPO, racing, etc.).. It is also
possible to extend the method taking into account robustness as supplementary
objectives and checking the possibility of rebuilding correlations on a set of
instances.
Finally, here are the slides. I use light slides without a lot o text, so I
suggest that you read the notes while looking at the presentation. You will
find the abstract, the extended abstract and the slides on my professional
website, at the corresponding publication page.
During
their excellent tutorials at META 2008, both Thomas and Dick talked about
"stochastic local search" and seems to be rather uncomfortable with the
"metaheuristic" term. They seems to reserve it for very high level well known
algorithms.
I'm not sure that using the term "stochastic" along with "local" is a good
idea. In all the algorithms, the use of probabilistic processes aims at
avoiding local optima. Thus, stochastic algorithms are not "local search"
anymore, but "global search". While I agree that it is a very good approach to
start with local search techniques, I would say that when you introduce
stochastic processes, then you enter the field of metaheuristics. On the other
hand, the paradoxal use of "stochastic" along with "local" may be interesting
from a marketing point of vue... but I like paradoxes.
Anyway, despite the fact that there would be a lot more to say about the
problem of nomenclature in our field (who says "everything is evolutionary"?),
this is not very important, I tink I will continue using "metaheuristics",
until a common term establish itself in the litterature.
The weather was nice in Tunisia, we had a place to sleep, a restaurant and a
swimming pool, the organization was just fine. The acceptance rate was of 60%,
with 116 accepted papers, for 130 attendants and one short review by paper (at
least for mine).
OK, now let's talk about what is really exciting: science.
I was more than pleased to attend to two tutorials, given by Dick Thierens
and Thomas Stützle, that both were talking about the use of stochastic local
search.
What was definitely interesting is that these two talented researchers were
insisting a lot on the need of a rigorous experimental approach for the design
and the validation of metaheuristics. That's good news for our research domain:
the idea that metaheuristics should be employed in a scientific manner rather
than in an artistic one gains more and more importance.
First, they both says that a good way to tackle a hard optimization problem
is to employ a bottom-up approach: start first with a simple local search, then
use metaheuristics operators to improve the results.
Thomas Stützle, particularly, insist on the crucial need of rigorous
parameter setting and experimental validation with statistical tests. That's
definitely a very important point.
Another good point made by Thomas is the use of the term "algorithm
engineering" to describe a rigorous design and evaluation approach of
optimization algorithms. I was searching a nice term to name it, I think this
one is a good candidate. The bad news at this conference is that, despite these
two incredible tutorials, there was very few peoples speaking about algorithm
engineering. I was presenting a new method for
parameter setting and behaviour understanding, but I was in a unrelated
"metaheuristics for real-world problem" session. I haven't seen other works
specifically dedicated to such subjects.
More badly, I have attend to several presentations with very bad
experimental work. Some peoples keeps telling their stochastic algorithm is
better only by showing the best result found. More often, their is a mean and a
standard deviation, but without a statistical test. But there is hope as, since
2001 (when some works made by Éric D. Taillard definitely introduced
experimental validation for metaheuristics, at least for me), I find that the
proportion of better experimental plans is increasing in the literature.
Anyway, my wish is that there will be more and more special sessions on
algorithm engineering in future conferences on metaheuristics. In the meantime,
there is the 2nd
"Engineering Stochastic local search algorithms" conference, in september
2009, in Brussels, that seems really interesting...
There exists a lot of different algorithms families that can be called
"metaheuristics", stricly speaking, there are a very, very, very large number
of metaheuristicsinstances.
Defining what is a metaheuristic "family" is a difficult problem: when may I
called this or this algorithm an evolutionary one? Is estimation of
distribution a sub-family of genetic algorithms? What is the difference between
ant colony optimization and stochastic gradient ascent? Etc.
Despite the difficulty of classifying
metaheuristics, there is some interesting characteristics shared by
stochastic metaheuristics. Indeed, they are all iteratively manipulating a
sample of the objective function[1]
For example, simulated annealing is
often depicted as a probabilistic descent
algorithm, but it is more than that. Indeed, simulated annealing is based
on the Metropolis-Hastings
algorithm, which is a way of sampling any probability distributionn, as
long as you can calculate its density at any point. Thus, simulated
annealing use an approximation of the objective function as a probability
density function to generate a sampling.
It is even more obvious if you consider a step by step decrease of the
temperature. Estimation of
distribution are another obvious example: they are explicitly manipulating
samplings, but one can also have the same thoughts about evolutionary algorithms, even if they are
manipulating the sampling rather implicitely.
The diagram tries to illustrate this idea: (a) a descent algorithm can have
the same sampling behaviour than an iteration of a (b) "population" method.
Given these common processes, is it possible to design a kind of "universal"
metaheuristic ? Theoretically, the answer is yes. For example, in the
continuous domain, consider an estimation of distribution algorithm, using a
mixture of gaussian kernel:
it can learn any probability density function (possibly needing an infinite
number of kernels). Thus, carefully choosing the function to use at each
iteration and the selection operator, one can reproduce the behaviour
of any stochastic metaheuristic.
Of course, choosing the correct mixture (and the other parameters) is a very
difficult problem in practice. But I find interesting the idea that the
problem of designing a metaheuristic can be reduced to a configuration
problem.
Notes
[1] Johann Dréo, Patrick Siarry, "Stochastic
metaheuristics as sampling techniques using swarm intelligence. ", in
"Swarm Intelligence: Focus on Ant and Particle Swarm Optimization", Felix T. S.
Chan, Manoj Kumar Tiwari (Eds.), Advanced Robotic Systems International, I-Tech
Education and Publishing, Vienna, Austria , ISBN 978-3-902613-09-7 - December
2008
I personally think that estimation of distribution algorithms (EDA) are some
of the more elegant an easy to use metaheuristics. Obviously, this highly
depends of the mind configuration :-) Anyway, as a piece of code is more
comprehensive than a long speech, I've made a very simple and small EDA in
python, to illustrate my thought.
This is a simple continuous EDA using a (also simple) normal probability
density function to optimize a (once more simple) function with two variables.
As you can see, the code is (guess what ?) simple, only a few lines with some
scipy functions, and that's it.
from scipy import *
# The problem to optimize def x2y2( x ): return x[0]**2 +
x[1]**2
#
number of points to select
nb
= int( floor( self.sample_size * self.select_ratio ) )
#
selection
self.sample
= self.sample[:nb]
def estimate_parameters(
self ):
#
points sub array (without values)
mat
= self.sample[:,:self.dimensions]
#
row means (axis 0 in scipy)
self.means
= mean( mat, 0 )
#
row standard deviation
self.stdevs
= std( mat, 0 )
def draw_sample(self):
#
for each variable to optimize for i
in xrange(self.dimensions):
#
if the dispersion is null if self.stdevs[i]
== 0.0:
#
set it to a minimal value
self.stdevs[i]
= self.epsilon
#
for each point for i
in xrange( self.sample_size ):
#
draw in random normal
p
= random.normal( self.means, self.stdevs )
#
put it into the sample
self.sample[i][:self.dimensions]
= p
def evaluate(self):
#
for each point for i
in xrange( self.sample_size ):
d
= self.dimensions
#
call the objective function
#
the third element is the result of the objective function call
#
taking the first two elements as variables
self.sample[i][-1] =
self.objective_function( self.sample[i][:d] )
People using metaheuristics often forget that the price to pay for their
ease of adaptation to a new problem is the hard validation
work. There is several things to keep in mind when using a
metaheuristic, especially when one want to prove that they
work in practice.
This (kind of) mind map try to list what you should do, and a short set of
main tools to do it. It is not always mandatory to use all the tools, sometimes
it is just a matter of choice (like for the parameter setting), sometimes the
more you do, the better it is (like for performance assessment).
The graphic has been drawn in SVG, and I have put some references in a very
small font at the bottom of some boxes. Thus, it would be more confortable to
view it in Firefox or in Inkscape, and to zoom where needed. Try the SVG version.
At last, I found some time to translate my chronology of metaheuristics I
have made for the french Wikipedia. I put it here for the records, but you will
find a more up-to-date version on the english Wikipedia article about Metaheuristics.
Timeline
Detailled chronology
1952 : first works on stochastics optimization methods[1].
1954 : Barricelli carry out the firsts simulations of the evolution
process and use them on general optimization problems[2].
↑ V. Angel, La rugosité des paysages : une théorie pour
la difficulté des problèmes d’optimisation combinatoire relativement aux
métaheuristiques, thèse de doctorat de l’université de Paris-Sud, Orsay,
1998.
↑ J. Dreo, J.-P. Aumasson, W. Tfaili, P. Siarry, Adaptive
Learning Search, a new tool to help comprehending metaheuristics, to
appear in the International Journal on Artificial Intelligence Tools.
↑ El-Ghazali Talbi, A taxonomy of hybrid metaheuristics,
Journal of Heuristics, volume 8, no2, pages 541-564, septembre
2002
↑ W. Dullaert, M. Sevaux, K. Sörensen et J. Springael,
Applications of metaheuristics, numéro spécial du European Journal
of Operational Research, no179, 2007.
↑ Robbins, H. and Monro, S., A Stochastic Approximation
Method, Annals of Mathematical Statistics, vol. 22, pp. 400-407, 1951
↑ Barricelli, Nils Aall, Esempi numerici di processi di
evoluzione, Methodos, pp. 45-68, 1954
↑ Rechenberg, I., Cybernetic Solution Path of an Experimental
Problem, Royal Aircraft Establishment Library Translation, 1965
↑ W.K. Hastings. Monte Carlo Sampling Methods Using Markov
Chains and Their Applications, Biometrika, volume 57, no1,
pages 97-109, 1970
↑ Holland, John H., Adaptation in Natural and Artificial
Systems, University of Michigan Press, Ann Arbor, 1975
↑ Smith, S.F., A Learning System Based on Genetic Adaptive
Algorithms, PhD dissertation (University of Pittsburgh), 1980
↑ S. Kirkpatrick, C. D. Gelatt et M. P. Vecchi, Optimization
by Simulated Annealing, Science, volume 220, no 4598, pages 671-680,
1983
↑ V. Cerny, A thermodynamical approach to the travelling
salesman problem : an efficient simulation algorithm Journal of
Optimization Theory and Applications, volume45, pages 41-51, 1985
↑ Fred GLover, Future Paths for Integer Programming and Links to
Artificial Intelligence, Comput. & Ops. Res.Vol. 13, No.5, pp. 533-549,
1986
↑ J.D. Farmer, N. Packard and A. Perelson, The immune system,
adaptation and machine learning, Physica D, vol. 22, pp. 187--204,
1986
↑ F. Moyson, B. Manderick, The collective behaviour of
Ants : an Example of Self-Organization in Massive Parallelism, Actes
de AAAI Spring Symposium on Parallel Models of Intelligence, Stanford,
Californie, 1988
↑ Koza, John R. Non-Linear Genetic Algorithms for Solving
Problems. United States Patent 4,935,877. Filed May 20, 1988. Issued June
19, 1990
↑ Goldberg, David E., Genetic Algorithms in Search,
Optimization and Machine Learning, Kluwer Academic Publishers, Boston,
MA., 1989
↑ P. Moscato, On Evolution, Search, Optimization, Genetic
Algorithms and Martial Arts : Towards Memetic Algorithms, Caltech
Concurrent Computation Program, C3P Report 826, 1989.
↑ M. Dorigo, Optimization, Learning and Natural
Algorithms, Thèse de doctorat, Politecnico di Milano, Italie, 1992.
↑ Feo, T., Resende, M., Greedy randomized adaptive search
procedure, Journal of Global Optimization, tome 42, page 32--37, 1992
↑ Eberhart, R. C. et Kennedy, J., A new optimizer using
particle swarm theory, Proceedings of the Sixth International Symposium on
Micromachine and Human Science, Nagoya, Japan. pp. 39-43, 1995
↑ Kennedy, J. et Eberhart, R. C., Particle swarm
optimization, Proceedings of IEEE International Conference on Neural
Networks, Piscataway, NJ. pp. 1942-1948, 1995
↑ Mülhenbein, H., Paaß, G., From recombination of genes to
the estimation of distribution I. Binary parameters, Lectures Notes in
Computer Science 1411: Parallel Problem Solving from Nature, tome PPSN IV,
pages 178--187, 1996
↑ Rainer Storn, Kenneth Price, Differential Evolution – A
Simple and Efficient Heuristic for global Optimization over Continuous
Spaces, Journal of Global Optimization, volume 11, no4,
pages 341-359, 1997
↑ Rubinstein, R.Y., Optimization of Computer simulation
Models with Rare Events, European Journal of Operations Research, 99,
89-112, 1997
↑ Stefan Boettcher, Allon G. Percus, "Extremal
Optimization : Methods derived from Co-Evolution", Proceedings of the
Genetic and Evolutionary Computation Conference (1999)
↑ Takagi, H., Active user intervention in an EC Search,
Proceesings of the JCIS 2000
Marcelo De
Brito had interesting thoughts about what he call
New Wave Of Genetic Algorithms. He is surprised that when "evolutionary
computation" is applied to a new problem, the first algorithm used is the good
old canonic genetic algorithm, despite that there exist active researchs on
Estimation of Distribution Algorithms. Julian Togelius write
that it may be because people does not understand other algorithms, or even
know that anything else exists.
I think that is definitely true. This subject is a kind of hobby for me.
Indeed, as I have came from ecology to applied mathematics, I feel like a kind
of generalist researcher, not being able to be the best somewhere, but trying
to be as good as possible on several fields. Concerning the field of what
Marcelo called NWOGA, I would like to emphasize some other things.
As David E. Goldberg say in
its courses, genetic algorithm is the term everybody use. For specialist, a
GA is just a kind of "evolutionary algorithm", with specific rules, that are
more defined by history than by anything else.
The litterature on evolutionary computation is quite big, the first
algorithm being designed in 1965 (evolutionary strategies, followed by
evolutionary programming in 1966), making it difficult to spread deep changes
on basic concepts.
There exist a lot more stochastic algorithms for global optimization than
just evolutionary ones. I prefer to call the stochastic metaheuristics, or
simply metaheuristics, because this lead to far less bias than a metaphoric
naming (cf. the previous post on classification
of metaheuristics).
For example, during my PhD thesis, I was convinced that some Ant Colony
Optimization algorithms were just equivalent to Estimation of Distribution
Algorithms, when talking about continuous problems. Moreover, I'm now convinced
that a lot of metaheuristics just shares some common stochastic sampling
processes, that are not specifiquely related to evolution. For example,
mathematically, Simulated Annealing is just a kind of EDA using an
approximation of the objective function as a model (or inversely, of
course).
As Julian says: I know roughly what an EDA does, but I couldn't sit down
an implement one on the spot. This is, in my humble opinion, one of the
more important thing to keep in mind. Indeed, there exist more and more papers
claming that a correct parameter setting of a metaheuristic can lead to the
performances of any competing metaheuristic.
Thus, the true discriminatory criterion is not the fantasised intrinsic
capability, but the ease of implementation and parameter setting on a
specific problem. In other words, choose the algorithm you like, but be
aware that there exists a lot of other ones.
I eventually find some time to try a graphical representation of how
metaheuristics could be classified.
Here is a slidified version, that shows each classes
independently:
And a static image version:
Note that some metaheuristics are not completely contained in certains
classes, this indicate that the method could be considered as part of the class
or not, depending on your point of view.
population metaheuristics vs trajectory-based ones,
evolutionary computation or not,
nature-inspired methods or not,
dynamic objective function vs static ones,
memory-based algorithms vs memory-less,
implicit, explicit or direct metaheuristics.
I proposed the last class, so that it may not be well-known. You will find
more informations about it in the following paper: Adaptive Learning
Search, a new tool to help comprehending metaheuristics, J. Dreo, J.-P.
Aumasson, W. Tfaili, P. Siarry, International Journal on Artificial
Intelligence Tools, Vol. 16, No. 3.. - 1 June 2007
I didn't placed a stochastic category, as it seems a bit difficult
to represent graphically. Indeed, a lot of methods could be "stochasticized" or
"derandomized" in several ways.
There is surely several lacks or errors, feel free to give your point of
view with a trackback, an email or by modifying the SVG source version
(comments are disabled due to spam that I didn't have time to fight
accurately).
An interesting idea is to use meta-model (a priori representation of the
problem) as a filter to bias the sample produced by metaheuristics. This
approach seems especially promising for engineering problem, where computing
the objective function is very expensive.
One simple form of meta-model is a probability density function,
approximating the shape of the objective function. This PDF could thus be used
to filter out bad points before evaluation.
Why, then, do not directly use EDA to generate the sample ? Because one can
imagine that the problem shape is not well known, and that using a complex PDF
is impossible (too expensive to compute, for example). Then, using a classical
indirect metaheuristic (let say an evolutionary algorithm) should be preferable
(computationnaly inexpensive) for the sample generation. If one know a good
approximation to use for the distribution of the EDA (not too computationnaly
expensive), one can imagine using the best part of the two worlds.
An example could be a problem with real variable : using an EDA with a
multi-variate normal distribution is computationnaly expensive (due to the
estimation of the co-variance, mainly), and using a mixture of gaussian kernels
makes difficult to have an a priori on the problem. Thus, why not
using a indirect metaheuristic to handle the sample generation, and use a
meta-model which parameters are estimated from the previous sample, according
to a chosen distribution ?
Many metrics are used to assess the quality of approximation found by
metaheuristics. Two of them are used really often: distance to the true optimum
according to its position and to its value.
Unfortunately, the objective function's shape can vary a lot in real-world
problem, making these metrics difficult to interpret. For example, if the
optimum is in a very deep valley (in value), a solution close to it in position
may not signifiate that the algorithm have well learn the shape of it.
Inversely, a solution close to an optimum in value may not signifiate that it
is in the same valley.
One metric that can counter thse drawbacks is a distance taking into account
the parameters of the problem as well as the value dimension.
Anyway, the question of the type of distance to use is dependent of the
problem.
When adapting combinatorial metaheuristics to continuous problems, one
sometimes use a sphere as an approximation of the "neighborhood". The idea is
thus to draw the neighbours around a solution, for example in order to apply a
simple mutation in a genetic algorithm.
Sometimes, one choose to use an uniform law, but how to draw random vectors
uniformly in an hyper-sphere ?
The first idea that comes to mind is to use a polar coordinate system and
draw the radius r and the angles
a1...a2...ai...aN with a
uniform law. Then, one convert the coordinates in the rectangular system,
x1...x2...xi...xN.
The result is interesting for a metaheuristic design, but is not a uniform
distribution:
<img src="/public/randHS_false.png" />
The correct method is to draw each xi according to:
xi=(ri1/Nai)/√(∑(ai))
(in LATEX : $x_{i}=\frac{r^{\frac{1}{N}}_i\cdot
a_{i}}{\sqrt{{\displaystyle \sum _{j=1}^{N}a_{i}}}}$)
With ri uniformly drawn in U0,1 and
a following a normal law NO,1
The result is then a true uniform distribution:
<img src="/public/randHS_ok.png
/>
Credits goes to Maurice Clerc for
detecting and solving the problem.
The EO framework has just reached the
1.0 version. This is one of the most interesting library for
metaheuristics.
It is a templatized C++ framework, with a component based architecture. EO
is focused on "evolutionary computing" (a synonym of metaheuristics, imho) and
can be used for any population-base metaheuristics. There exists versions for
local searches, multi-objective optimization or parallel architectures... a
real good piece of software :-)
The post and its comments are talking about machine-learning, but can
largely be applied to metaheuristics. The page is listing several reason for
using randomization, from which some are of special intersts for
metaheuristics:
symmetry breaking as a way to make decision, which is of great importance
for metaheuristics, which must learn and choose where are the "promising
regions";
overfit avoidance, which is related to the intensification/diversification
balance problem;
adversary defeating and bias suppression, which can be interpreted as
trying to design a true meta-heuristic (i.e. that can be applied on
several problems without major changes).
Of course, it should be possible to design a completely deterministic
algorithm that takes decisions, achieve a correct i/d balance and can tackle
all problems... Even if this force to integrate the problems themselves in the
algorithm, it should be possible. The drawback is that it is
computationally intractable.
In fact, metaheuristics (and, as far as I understand, machine-learning
algorithms) are located somewhere between random search algorithms and
deterministic ones. The compromise between these two tendencies is dependent of
the problem and of the offered computational effort.
Metaheuristics and machine-learning algorithms shares a large number of
characteristics, like stochastic processes, manipulaton of probability density
functions, etc.
One of the interesting evolution of the research on metaheuristics these
years is the increasing bridge-building with machine-learning. I see at least
two interesting pathways: the use of metaheuristics in machine-learning and the
use of machine-learning in metaheuristics.
The first point is not really new, machine-learning heavily use
optimization, and it was natural to try stochastic algorithms where local
search or exact algorithms failed. Nevertheless, there is now a sufficient
litterature to organize some special sessions in some symposium. For 2007,
there will be a special session on Genetics-Based Machine Learning at
CEC, and a track
on Genetics-Based Machine Learning and Learning Classifier Systems at
GECCO. These events are centered around "genetic" algortihm (see the posts on
the IlliGAL blog : 1, 2), despite the fact that there are several papers using
other metaheuritics, like simulated annealing, but this is a common drawback,
and does not affect the interest of the subject.
The second point is less exploited, but I find it of great interest. A
simple example of what can be done with machine-learning inside metaheuristic
can be shown with estimation of distribution algorithms. In these
metaheuristics, a probability density function is used to explicitely build a
new sample of the objective function (a "population", in the evolutionary
computation terminology) at each iteration. It is then crucial to build a
probability density function that is related to the structure of the objective
function (the "fitness landscape"). There, it should be really interesting to
build the model of the pdf itself from a selected sample, using a
machine-learning algorithm. There is some interesting papers talking about
that.
If you mix these approaches with the problem of estimating a Boltzmann
distribution (the basis of simulated annealing), you should have an awesome
research field...
Thomas Bartz-Beielstein is working on the statistical analysis of the
behaviour of metaheuristics (see its tutorials at GECCO and CEC), and the publication of its book is a
really great thing. I haven't read it yet, but the table of content seems
really promising. There is a true need for such work in the metaheuristics
community, and in stochastic optimization in general.
A friend said to me that the lack of experimental culture in the computer
science community was a form of consensus, perhaps because theoretical aspects
of mathematics was the "only way to make true science". This is a true problem
when you deal with stochastic algorithm, applied to real world problem. Despite
the fact that several papers early call for more rigourous experimental studies
of metaheuristcs (E.D. Taillard has written papers on this problem several years ago,
for example), the community does not seems to quickly react.
Yet, things are changing, after the series of CEC special sessions on
benchmark for metaheuristics, there is more and more papers on how to test
stochastic optimization algorithms and outline the results. I think this book
is coming timely... the next step will be to promote the dissemination of the
results data (and code!), in an open format, along with the papers.
Note that descriptions are picked up from the web sites of the projects.
As one can see, most of these softwares are designed for evolutionnary
algorithms, but I recommend you to try out some of the generic frameworks,
because "genetic" algorithms are not always the best choice for solving an
optimization problem, despite their wide spread.
Here are the frameworks I would recommend. These frameworks are free
softwares, you can use, look at the code, modify it and redistribute it
(precious qualities for frameworks).
I would also recommend C or C++, which permits to implement
fast programs, while using object oriented programming. C++ compilers are also
available for a large choice of plateforms (with a special distinction for GCC,
which is free software). A fast progam is crucial for testing algorithms on
real problems and using a well-known langage is a good idea.
The main idea beside the design of the framework is specified as one of the
following keywords:
template: design a new algorithm concist in extending a base
class, perhaps the simple object model to understand, but it can be difficult
to re-use existing code.
component: design a new algorithm concist in select its component
from availables operators, make it easy to implement algorithms, but it can be
quite difficult to understand the underlying model. It can be hard to add new
algorithm paradigm (generaly used for evolutionnary algorithms).
function: design a new algorithm concist in use the framework's
primitives. Simple to understand, but one must learn the primitives to use and
code a lot of stuff.
Favorites
Here is my list :
Open Metaheuristics: a library aimed
at the conception of metaheuristics (i.e. genetic/evolutionnary algorithms,
tabu search, simulated annealing, ant colony algorithms, etc.). One of the main
goal of oMetah is to permit rigourous empirical tests of metaheuristics,
through a statistical approach. (C++, LGPL, template)
*EO: a set of
paradigm-free Evolutionary Computation libraries dedicated to the flexible
design of EAs through evolving objects superseding the most common dialects
(Genetic Algorithms, Evolution Strategies, Evolutionary Programming and Genetic
Programming). (C++, GPL, component)
Sferes: a framework that gathers an
evolution framework with a simulation framework. It is dedicated to experiments
involving the design of artificial agents through artificial evolutionary
algorithms, like Genetic Algorithms or Evolution Strategies. (C++,
CeCILL, component)
PISA: Platform and
Programming Language Independent Interface for Search Algorithms. PISA is
mainly dedicated to multi-objective search, where the optimization problem is
characterized by a set of conflicting goals and not just one criterion that
needs to be optimized. (C, BSD, function)
GAlib: defines classes for using
genetic algorithms to do optimization in any C++ program using any
representation and genetic operators. The distribution includes extensive
documentation and many examples.. (C++, MIT, component)
MOMHLib++: a
library of C++ classes that implements a number of multiple
objective metaheuristics. It has been developed on the basis of former
implementations of Pareto simulated annealing (PSA) and multiple objective
genetic local search (MOGLS). (C++, GPL, template).
ECJ Java Evolutionary
Computation Research System. (Java, AFL, component)
Other
These frameworks are not those which I would recommend, but they have some
properties that could be intersting :
Many of these projects are dead, be carefull if you need a maintained
framework. Take a close look to the project activity, generally, the number of
developers and file releases give you a good idea of the framework vitality.
Also check if it is a student's project for a training course or if it is made
by professional researchers.