# Mentales habitudes

Monday, October 19, 2015

## Discrete Paris, continuous provinces

France has an interesting cultural peculiarity, tied to the position of its capital: Paris is at the center of everything. Apparently this has been going on for four centuries and is not about to stop. I am not going to write yet another editorial here on the advantages or drawbacks of French economic, political and cultural metropolization, but rather tell you about an effect on the psychology of the French that I find interesting.

The effect can be summed up as follows: provincials reason in a continuous way, whereas Parisians have a discrete mind (in the sense of discontinuous). By "Parisian", I mean someone who has been living in Paris for several years, which is long enough to have adapted.

This observation dawned on me after a change of signage in the RER hub of Châtelet-les-Halles (one of the biggest metro stations in Paris): the simple addition of cardinal directions next to the destinations of the lines.

My Parisian friends were unanimous about this change: "it's useless", whereas many provincials expressed their satisfaction. Indeed, if you ask a Parisian where the Gare de l'Est is in Paris, chances are they will answer "in the 10th arrondissement", where a provincial will give you a mocking "well, in the east". In the same way, a Parisian does not live "in the south-east of Paris", they live "between Bastille and Nation", whereas a provincial lives "north-east of Angers" and not "between Pellouailles-les-Vignes and Écouflant".

Another notable difference: the perception of space and time. A Parisian will ask whether your meeting place is far away and expect you to answer "4 or 5 stops, but with one change", whereas a provincial will only know what you are talking about if you say "about 15 minutes". Once outside the metro network, the Parisian reasons in kilometres, which leaves the provincial puzzled, especially if they know the Parisian traffic jams. Thus, a Parisian living in the Montparnasse district will find it hard to consider taking the train to Épernon (80 km), while seeing no real problem in going to La Villette (9 km), which may sadden the provincial, for whom both trips take 40 minutes by public transport.

I should point out that these two kinds of reasoning are the product of adaptation to one's environment: provincial expectations would not be suited to Paris, just as Parisian behaviour would not be efficient in the provinces. For example, provincials and Parisians alike love having the freedom to come home from a party whenever they wish, but this simple wish produces diametrically opposed preconceptions. A Parisian will be reluctant to spend an evening in the provinces, knowing how painful it will be not to miss the last train, with the second-to-last one running an hour earlier. In the same way, the provincial finds it painful to go out in Paris, because finding a parking spot there is very hard, or expensive. Another example: if a Parisian never invites you to their home but always to a bar, it is simply because they do not have the room to host you and their friends. And if a provincial never comes out for a drink with you, it is because they cannot arrange a two-hour round trip at the last minute.

The (very relative) victory of the provincial mind at Châtelet-les-Halles illustrates this small difference, and highlights how extremely good the human brain is at learning general models and applying them wherever it can. The Parisian mind reasons in terms of relatively separate places, whereas the provincial mind handles rather continuous directions and durations.

Keeping this difference in mind can help you avoid falling too easily into lazy stereotypes, so do not deprive yourself of it.

## Change of subject

This blog had been abandoned for six years, for a whole lot of reasons I will not go into here. It will nevertheless try to resume some semblance of activity.

I am therefore changing its subject, its language and its name, to host more general reflections. The old articles are kept, because I do not like throwing things away.

Thursday, December 16, 2010

## Evolving Objects C++ framework -- release 1.1.0

After several years without a release, the EO team is proud to announce that a new version of the "Evolving Objects" framework is available.

EO is a template-based C++ evolutionary computation library which helps you write your own stochastic optimization algorithms insanely fast.

You will find the release 1.1.0 at the following address: https://sourceforge.net/projects/eodev/files/eo/1.1.0/

Here is a summary of the change log:

• massive documentation cleaning, now far more browsable!
• a CMake build system, replacing the old autotools one
• GCC 4.3 compatibility
• new versatile log system with several nested verbose levels
• an evaluator that throws an exception when a maximum number of evaluations is reached, independently of the number of generations
• a new monitor that can write on any ostream
• a new continuator that can catch POSIX system user signals
• a dual fitness class handling the feasibility of individuals, with the guarantee that a feasible fitness is always better than an unfeasible one
• more statistics: feasible fitness ratio, fitness interquartile range, average size of individuals
• new website, switch from CVS to Git, and a cool logo!

See a more detailed change log at: https://sourceforge.net/news/?group_id=9775

Do not hesitate to report the bugs you encounter: https://sourceforge.net/apps/trac/eodev/wiki/WikiStart

I chose EO a few years ago to code metaheuristics, after having tested − and even coded − several other frameworks, and I have not regretted this choice. For a few months now, I have been the maintainer and main developer of EO, along with my intern, Caner Candan, and we have great plans for improving it!

Happy evolutionary hacking :)

Monday, November 17, 2008

## More on "stochastic local search" definition

Petr Pošík left an interesting comment on a previous post: "Metaheuristic" or "Stochastic Local Search"?. As bloggers always love worthwhile comments from insightful readers, I copy it here as an article, along with my answer.

I would like to present here an idea that takes the "stochastic" and "local" ideas even further than Thomas and Dick. In that view, even GAs are local search techniques! Why?

It has a lot to do with the definition of the neighborhood of the current state of the algorithm. The state may be the position of the algorithm in the search space. In classical local search, the state is given by a single point. In population-based techniques, the state is the whole population. New solutions are usually generated only by local modifications of the state, i.e. the search takes place only in the local neighborhood of the algorithm's state.

Yes, of course, the local neighborhood of the population as a whole is very broad and such a local search is much less prone to premature convergence and exhibits a kind of global behaviour. But, IMHO, it is still a local search...

I completely understand your view: if there is a non-zero probability of generating any solution in the search space, then the search must eventually find the global optimum, and thus performs a global search.

This is the case for many stochastic local search algorithms in continuous domains which use e.g. a normal distribution. But if you set the variance too high, you will get very slow convergence to that global optimum, which is in practice not desirable. If you set the variance too low, you will quickly find a local optimum, but you will have to wait virtually infinitely long for the global one - again undesirable. Or, to put it another way, would you still call a "global optimization" technique "global" if it takes zillions of years to find that global solution?

In practice, I think, for continuous spaces we have to resort to algorithms that exhibit local behavior, since we always have some hard constraints on the length of the optimization run. In my eyes, this perfectly justifies the name "stochastic local search".

In fact, as far as I understand them, both Dick Thierens and Thomas Stützle share your point of view. For them, GAs are a kind of stochastic local search.

I find the classification into "descent" algorithms and "population" ones quite artificial. For example, simulated annealing is implemented as a descent algorithm, but it actually performs a sampling of the objective function, which is commonly seen as a characteristic of population algorithms.

More rigorously, I think that every stochastic metaheuristic tries to avoid local optima: its stochastic processes are always there to do so[1].

Maybe the problem is the definition of local search. In my humble opinion, a local search is a method that searches for a local optimum. This can be opposed to global search, where one wants to find a global optimum.

You are right when you point out that this is related to ergodicity, as only an ergodic[2] algorithm may converge[3], and thus be able to reliably find a global optimum. So, yes, I would say that a true and rigorous optimization method is global if, and only if, it is at least quasi-ergodic[4]. Pure random search is the basic global optimization method, which a good metaheuristic should at least outperform.
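As a concrete baseline, pure random search fits in a few lines of Python (a minimal sketch with a hypothetical sphere objective, not taken from any particular library):

```python
import random

def pure_random_search(f, bounds, evals=1000, seed=0):
    """Baseline global optimizer: draw i.i.d. uniform points over the
    whole domain and keep the best. It is trivially quasi-ergodic, as
    any region of the search space can be sampled at any iteration."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(evals):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# On a simple sphere function, 1000 uniform draws already get reasonably
# close to the global optimum at the origin.
best_x, best_f = pure_random_search(lambda p: p[0]**2 + p[1]**2,
                                    [(-100, 100)] * 2)
```

Any metaheuristic claiming global behaviour should at least beat this baseline for the same evaluation budget.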

Here, it is the stochastic operators manipulating the solutions that permit the global behaviour. This is not related to the encoding of the solutions or the neighbourhood structure, even if these are a really crucial part of the algorithm.

Thus, a "stochastic local search" may be defined as a local search seeking a global optimum, which is a paradoxical definition. I would rather keep the implementation and the mathematical bases separated, and thus talk of "stochastic search", or "stochastic metaheuristics".

Or else, one may want to use a paradoxical definition as a matter of fun, which is also a good reason to do so :-)

#### Notes

[1] There may be probabilistic choices that are not related to such tasks, but they are not linked to the iterative aspects of the search algorithm, and are thus not stochastic

[2] i.e. that can evaluate any solution

[3] i.e. that has a non-null probability of finding the global optimum in finite time, or to say it differently, that can find the global optimum after a time that may tend towards infinity (for continuous problems; for discrete ones it is bounded by the size of the instance)

[4] i.e. that can reach any solution in a finite number of iterations

Monday, November 10, 2008

## Multi-criteria meta-parameter tuning for mono-objective stochastic metaheuristics

Here are some explanations about the work I have presented at the META'08 conference. This post is based on the notes I used for my presentation.

In all automatic parameter setting methods, the problem of finding the best parameter set is considered as an optimization problem with only one objective: generally finding the best optimum, or reducing the uncertainty of the results. Sometimes, one tries to improve the speed. More rarely, speed, precision and robustness are aggregated into one criterion, with an ad hoc formula. In fact, one can set parameters according to several objectives (improving speed, improving robustness, etc.). One cannot find a set of parameters fitting all the potential uses of a single algorithm on a single problem instance. Thus, parameter setting is a multi-objective problem.

The key point here is that it is easier to set the parameters of a solver than to solve the problem directly. The simplest example of this idea is when you want to solve a continuous optimization problem with hundreds of variables, with a metaheuristic that has 3 parameters. Moreover, you only have to tune your parameters once, even if you will solve many problem instances later.

In this work, I only consider speed and precision, although the method may handle any performance metric.

What is crucial in our method is that we do not want to aggregate the criteria; instead, we want the Pareto front corresponding to all the non-dominated parameter sets. I use plots representing this Pareto front, which I will sometimes call the "performance" front, or performance profile.
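To make the idea concrete, here is a minimal sketch of how such a performance front can be extracted from measurements; the (time, error) pairs below are hypothetical, and both criteria are to be minimized:

```python
def non_dominated(points):
    """Return the points not dominated by any other point, where each
    point is a (time, error) pair measured for one parameter set and
    both criteria are to be minimized."""
    front = []
    for p in points:
        dominated = any(q != p and q[0] <= p[0] and q[1] <= p[1]
                        for q in points)
        if not dominated:
            front.append(p)
    return front

# Hypothetical (time, error) measurements for five parameter sets:
measures = [(1, 9), (2, 7), (3, 8), (4, 4), (5, 5)]
front = non_dominated(measures)  # the performance front
```

Here (3, 8) and (5, 5) drop out, because some other parameter set is at least as fast and at least as precise.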

The idea is that one can then compare several algorithms more rigorously, by comparing their respective performance fronts. We also benefit from having a cursor, scaling from behaviour oriented towards speed at one extreme, to precision at the other. Even more interesting is the performance profile projected onto the parameter space. One can see that every algorithm has its very own profile, which tells a lot about how it behaves.

The figure above shows the performance profiles of 4 metaheuristics: a simulated annealing, a genetic algorithm and two estimation of distribution algorithms (produced by NSGA-II, with median estimation, on the 2D Rosenbrock problem, using for each method the parameter corresponding to the sampling density, with an absolute-time stopping criterion).

Our results suggest that the choice of the stopping criterion has a drastic influence on the interest of the performance profile; it must be chosen carefully. Similarly, the method cannot naturally find a unique profile for a set of problem instances, but is strictly valid only for one instance of a given problem. Finally, we note that the performance profiles are often convex in the objective space, which could indicate that aggregation may be useful.

The proposed method allows all the parameters to be aggregated into a single one, determining the position within the performance profile, from behaviour strongly oriented towards production (fast, inaccurate) to conception (slow, accurate). The projection of the profile into the parameter space can also reveal the impact of parameters on performance, or dependencies between parameters. Such information may be very relevant to better understand some complex metaheuristics. It also becomes possible to compare several metaheuristics, by plotting their performance profiles on the same scale. Statistical validation also gains additional dimensions of discrimination.

As perspectives, it remains to reduce the computational demands of the meta-optimizer, using dedicated methods (SPO, racing, etc.). It is also possible to extend the method by taking robustness into account as a supplementary objective, and by checking the possibility of rebuilding correlations over a set of instances.

Finally, here are the slides. I use light slides without a lot of text, so I suggest that you read the notes while looking at the presentation. You will find the abstract, the extended abstract and the slides on my professional website, on the corresponding publication page.

Friday, November 7, 2008

## "Metaheuristic" or "Stochastic Local Search"?

During their excellent tutorials at META 2008, both Thomas and Dick talked about "stochastic local search" and seemed rather uncomfortable with the term "metaheuristic". They seem to reserve it for very high-level, well-known algorithms.

I'm not sure that using the term "stochastic" along with "local" is a good idea. In all these algorithms, the use of probabilistic processes aims at avoiding local optima. Thus, stochastic algorithms are no longer "local search", but "global search". While I agree that it is a very good approach to start with local search techniques, I would say that once you introduce stochastic processes, you enter the field of metaheuristics. On the other hand, the paradoxical use of "stochastic" along with "local" may be interesting from a marketing point of view... and I do like paradoxes.
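A small Python sketch can illustrate this transition (an illustrative toy, not one of the tutorial algorithms): the same greedy search behaves locally or globally depending only on the variance of its Gaussian perturbations.

```python
import math
import random

def f(x):
    # Multimodal toy objective: global minimum at x = 0,
    # local minima near every integer.
    return x * x + 10 * (1 - math.cos(2 * math.pi * x))

def stochastic_search(sigma, steps=2000, seed=1):
    """Greedy search with Gaussian perturbations of std deviation sigma."""
    rng = random.Random(seed)
    x, best = 5.0, f(5.0)           # start in a distant local basin
    for _ in range(steps):
        cand = x + rng.gauss(0.0, sigma)
        if f(cand) < best:          # accept only improvements
            x, best = cand, f(cand)
    return best

# With a small sigma the search stays trapped in the starting basin;
# with a larger sigma it behaves globally and escapes local optima.
local_best = stochastic_search(sigma=0.1)
global_best = stochastic_search(sigma=2.0)
```

The cost of the global behaviour is slower refinement around the optimum, which is exactly the variance trade-off discussed above.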

Anyway, despite the fact that there would be a lot more to say about the problem of nomenclature in our field (who said "everything is evolutionary"?), this is not very important; I think I will continue using "metaheuristics" until a common term establishes itself in the literature.

Thursday, November 6, 2008

## 2nd International Conference on Metaheuristics and Nature Inspired Computing

I've just attended the META 2008 international conference on metaheuristics and nature inspired computing.

The weather was nice in Tunisia, we had a place to sleep, a restaurant and a swimming pool; the organization was just fine. The acceptance rate was 60%, with 116 accepted papers for 130 attendees, and one short review per paper (at least for mine).

OK, now let's talk about what is really exciting: science.

I was more than pleased to attend two tutorials, given by Dick Thierens and Thomas Stützle, both about the use of stochastic local search.

What was definitely interesting is that these two talented researchers insisted a lot on the need for a rigorous experimental approach to the design and validation of metaheuristics. That's good news for our research domain: the idea that metaheuristics should be employed in a scientific manner rather than an artistic one is gaining more and more importance.

First, they both said that a good way to tackle a hard optimization problem is to employ a bottom-up approach: start with a simple local search, then use metaheuristic operators to improve the results.

Thomas Stützle, particularly, insisted on the crucial need for rigorous parameter setting and experimental validation with statistical tests. That's definitely a very important point.

Another good point made by Thomas is the use of the term "algorithm engineering" to describe a rigorous approach to the design and evaluation of optimization algorithms. I was searching for a nice term to name it; I think this one is a good candidate. The bad news at this conference is that, despite these two incredible tutorials, there were very few people speaking about algorithm engineering. I was presenting a new method for parameter setting and behaviour understanding, but I was placed in an unrelated "metaheuristics for real-world problems" session. I haven't seen other works specifically dedicated to such subjects.

Worse, I attended several presentations with very bad experimental work. Some people keep claiming that their stochastic algorithm is better, only by showing the best result found. More often, there is a mean and a standard deviation, but no statistical test. But there is hope: since 2001 (when some works by Éric D. Taillard definitely introduced experimental validation for metaheuristics, at least for me), I find that the proportion of better experimental plans is increasing in the literature.
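For illustration, here is a minimal two-sample permutation test in Python (a generic sketch; the run values are hypothetical, and a rank-based test such as Mann-Whitney would serve the same purpose):

```python
import random

def permutation_test(a, b, n_perm=2000, seed=0):
    """Approximate p-value for H0: the runs of both algorithms come from
    the same distribution, using the absolute difference of means as the
    test statistic."""
    rng = random.Random(seed)
    mean = lambda s: sum(s) / len(s)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Reassign the pooled runs to two groups at random and see how
        # often the difference is at least as large as the observed one.
        rng.shuffle(pooled)
        if abs(mean(pooled[:len(a)]) - mean(pooled[len(a):])) >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical final objective values from 2 x 5 runs of two solvers:
runs_a = [1.0, 1.1, 0.9, 1.2, 0.8]
runs_b = [5.0, 5.1, 4.9, 5.2, 4.8]
p = permutation_test(runs_a, runs_b)  # small p: significant difference
```

A table of means alone cannot tell you whether such a difference could arise by chance; the p-value can.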

Anyway, my wish is that there will be more and more special sessions on algorithm engineering at future conferences on metaheuristics. In the meantime, there is the 2nd "Engineering Stochastic Local Search Algorithms" conference, in September 2009 in Brussels, which seems really interesting...

Tuesday, September 30, 2008

## Tools for conferences about metaheuristics

Here are some tools that are interesting when trying to attend conferences about metaheuristics:

Thursday, September 11, 2008

## The ultimate metaheuristic?

There exist a lot of different algorithm families that can be called "metaheuristics"; strictly speaking, there is a very, very, very large number of metaheuristic instances.

Defining what a metaheuristic "family" is, is a difficult problem: when may I call this or that algorithm an evolutionary one? Is estimation of distribution a sub-family of genetic algorithms? What is the difference between ant colony optimization and stochastic gradient ascent? Etc.

Despite the difficulty of classifying metaheuristics, there are some interesting characteristics shared by stochastic metaheuristics. Indeed, they all iteratively manipulate a sample of the objective function[1].

For example, simulated annealing is often depicted as a probabilistic descent algorithm, but it is more than that. Indeed, simulated annealing is based on the Metropolis-Hastings algorithm, which is a way of sampling any probability distribution, as long as you can calculate its density at any point. Thus, simulated annealing uses an approximation of the objective function as a probability density function to generate a sampling. This is even more obvious if you consider a step-by-step decrease of the temperature. Estimation of distribution algorithms are another obvious example: they explicitly manipulate samplings, but one can have the same thoughts about evolutionary algorithms, even if they manipulate the sampling rather implicitly.
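To make the link explicit, here is a minimal sketch of Metropolis sampling at a fixed temperature (a simplified one-dimensional setting with Gaussian proposals; simulated annealing is essentially this loop with the temperature T decreased step by step):

```python
import math
import random

def metropolis_sample(f, T, steps=5000, seed=0):
    """Sample the density proportional to exp(-f(x)/T) with the
    Metropolis rule: the lower f(x) is, the more often x is visited."""
    rng = random.Random(seed)
    x, sample = 0.0, []
    for _ in range(steps):
        cand = x + rng.gauss(0.0, 1.0)
        # Always accept improvements; accept degradations with
        # probability exp(-(f(cand) - f(x)) / T).
        if rng.random() < math.exp(min(0.0, -(f(cand) - f(x)) / T)):
            x = cand
        sample.append(x)
    return sample

# At a low temperature the sample concentrates around the minima of f:
sample = metropolis_sample(lambda x: x * x, T=0.1)
```

Lowering T narrows the sampled distribution around the minima, which is exactly what the cooling schedule of simulated annealing does.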

The diagram tries to illustrate this idea: (a) a descent algorithm can have the same sampling behaviour as an iteration of (b) a "population" method.

Given these common processes, is it possible to design a kind of "universal" metaheuristic? Theoretically, the answer is yes. For example, in the continuous domain, consider an estimation of distribution algorithm using a mixture of Gaussian kernels: it can learn any probability density function (possibly needing an infinite number of kernels). Thus, by carefully choosing the function to use at each iteration and the selection operator, one can reproduce the behaviour of any stochastic metaheuristic.

Of course, choosing the correct mixture (and the other parameters) is a very difficult problem in practice. But I find it interesting that the problem of designing a metaheuristic can be reduced to a configuration problem.

#### Notes

[1] Johann Dréo, Patrick Siarry, "Stochastic metaheuristics as sampling techniques using swarm intelligence. ", in "Swarm Intelligence: Focus on Ant and Particle Swarm Optimization", Felix T. S. Chan, Manoj Kumar Tiwari (Eds.), Advanced Robotic Systems International, I-Tech Education and Publishing, Vienna, Austria , ISBN 978-3-902613-09-7 - December 2008

Monday, June 23, 2008

## An Estimation of Distribution Algorithm in a few lines of python

Commenting on an interesting post I had already discussed, Julian Togelius said: "I know roughly what an EDA does, but I couldn't sit down and implement one on the spot [...]"

I personally think that estimation of distribution algorithms (EDA) are among the more elegant and easy-to-use metaheuristics. Obviously, this highly depends on one's mind configuration :-) Anyway, as a piece of code is more comprehensible than a long speech, I've made a very simple and small EDA in Python, to illustrate my point.

This is a simple continuous EDA, using an (also simple) normal probability density function, to optimize a (once more, simple) function of two variables. As you can see, the code is (guess what?) simple: only a few lines with some numpy/scipy functions, and that's it.

```python
import numpy as np

# The problem to optimize
def x2y2(x):
    return x[0]**2 + x[1]**2

class eda:
    def __init__(self, of):
        # Algorithm parameters
        self.iterations = 100
        self.sample_size = 100
        self.select_ratio = 0.5
        self.epsilon = 10e-6
        # Class members
        self.objective_function = of
        self.dimensions = 2
        self.sample = []
        self.means = []
        self.stdevs = []
        self.debug = False

    def run(self):
        # Uniform initialization over [-100, 100]
        self.sample = np.random.rand(self.sample_size, self.dimensions + 1)
        self.sample = self.sample * 200 - 100
        self.evaluate()
        # Main loop
        i = 0
        while i < self.iterations:
            i += 1
            self.dispersion_reduction()
            self.estimate_parameters()
            self.draw_sample()
            self.evaluate()
        # Sort the final sample and output the optimum
        self.sample_sort()
        print("#[ x y f(x,y) ]")
        print(self.sample[0])

    def sample_sort(self):
        # Sort the rows on the last column (the objective value)
        self.sample = self.sample[np.argsort(self.sample[:, -1], 0)]

    def dispersion_reduction(self):
        # Selection: keep only the best fraction of the sample
        self.sample_sort()
        nb = int(np.floor(self.sample_size * self.select_ratio))
        self.sample = self.sample[:nb]

    def estimate_parameters(self):
        # Points sub-array (without the objective values)
        mat = self.sample[:, :self.dimensions]
        # Column-wise means and standard deviations (axis 0)
        self.means = np.mean(mat, 0)
        self.stdevs = np.std(mat, 0)

    def draw_sample(self):
        # If a dispersion is null, set it to a minimal value
        for i in range(self.dimensions):
            if self.stdevs[i] == 0.0:
                self.stdevs[i] = self.epsilon
        # New empty sample
        self.sample = np.zeros((self.sample_size, self.dimensions + 1))
        # Draw each point from the estimated normal distribution
        for i in range(self.sample_size):
            p = np.random.normal(self.means, self.stdevs)
            self.sample[i][:self.dimensions] = p

    def evaluate(self):
        # The last element of each row is the result of the objective
        # function call, taking the first elements as variables
        for i in range(self.sample_size):
            d = self.dimensions
            self.sample[i][-1] = self.objective_function(self.sample[i][:d])

if __name__ == "__main__":
    a = eda(x2y2)
    a.run()
```

See also the standalone file, with a debug mode, to see how it works in detail.

Wednesday, June 18, 2008

## Metaheuristic validation in a nutshell

People using metaheuristics often forget that the price to pay for their ease of adaptation to a new problem is the hard validation work. There are several things to keep in mind when using a metaheuristic, especially when one wants to prove that it works in practice.

This (kind of) mind map tries to list what you should do, with a short set of the main tools to do it. It is not always mandatory to use all the tools; sometimes it is just a matter of choice (as for parameter setting), sometimes the more you do, the better (as for performance assessment).

The graphic has been drawn in SVG, and I have put some references in a very small font at the bottom of some boxes. Thus, it is more comfortable to view it in Firefox or Inkscape, and to zoom where needed. Try the SVG version.

Monday, March 3, 2008

## A chronology of metaheuristics

At last, I found some time to translate the chronology of metaheuristics I made for the French Wikipedia. I put it here for the record, but you will find a more up-to-date version in the English Wikipedia article about metaheuristics.

#### References

1. V. Angel, La rugosité des paysages : une théorie pour la difficulté des problèmes d'optimisation combinatoire relativement aux métaheuristiques, PhD thesis, Université Paris-Sud, Orsay, 1998.
2. J. Dreo, J.-P. Aumasson, W. Tfaili, P. Siarry, Adaptive Learning Search, a new tool to help comprehending metaheuristics, to appear in the International Journal on Artificial Intelligence Tools.
3. El-Ghazali Talbi, A taxonomy of hybrid metaheuristics, Journal of Heuristics, volume 8, no. 2, pages 541-564, September 2002.
4. Examples of test functions for metaheuristics optimizing continuous problems.
5. W. Dullaert, M. Sevaux, K. Sörensen and J. Springael, Applications of metaheuristics, special issue of the European Journal of Operational Research, no. 179, 2007.
6. Robbins, H. and Monro, S., A Stochastic Approximation Method, Annals of Mathematical Statistics, vol. 22, pages 400-407, 1951.
7. Barricelli, Nils Aall, Esempi numerici di processi di evoluzione, Methodos, pages 45-68, 1954.
8. Rechenberg, I., Cybernetic Solution Path of an Experimental Problem, Royal Aircraft Establishment Library Translation, 1965.
9. Fogel, L., Owens, A.J., Walsh, M.J., Artificial Intelligence through Simulated Evolution, Wiley, 1966.
10. W.K. Hastings, Monte Carlo Sampling Methods Using Markov Chains and Their Applications, Biometrika, volume 57, no. 1, pages 97-109, 1970.
11. Holland, John H., Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, 1975.
12. Smith, S.F., A Learning System Based on Genetic Adaptive Algorithms, PhD dissertation, University of Pittsburgh, 1980.
13. S. Kirkpatrick, C. D. Gelatt and M. P. Vecchi, Optimization by Simulated Annealing, Science, volume 220, no. 4598, pages 671-680, 1983.
14. V. Cerny, A thermodynamical approach to the travelling salesman problem: an efficient simulation algorithm, Journal of Optimization Theory and Applications, volume 45, pages 41-51, 1985.
15. Fred Glover, Future Paths for Integer Programming and Links to Artificial Intelligence, Computers & Operations Research, vol. 13, no. 5, pages 533-549, 1986.
16. J.D. Farmer, N. Packard and A. Perelson, The immune system, adaptation and machine learning, Physica D, vol. 22, pages 187-204, 1986.
17. F. Moyson, B. Manderick, The collective behaviour of Ants: an Example of Self-Organization in Massive Parallelism, Proceedings of the AAAI Spring Symposium on Parallel Models of Intelligence, Stanford, California, 1988.
18. Koza, John R., Non-Linear Genetic Algorithms for Solving Problems, United States Patent 4,935,877, filed May 20, 1988, issued June 19, 1990.
19. Goldberg, David E., Genetic Algorithms in Search, Optimization and Machine Learning, Kluwer Academic Publishers, Boston, MA, 1989.
20. P. Moscato, On Evolution, Search, Optimization, Genetic Algorithms and Martial Arts: Towards Memetic Algorithms, Caltech Concurrent Computation Program, C3P Report 826, 1989.
21. M. Dorigo, Optimization, Learning and Natural Algorithms, PhD thesis, Politecnico di Milano, Italy, 1992.
22. Feo, T., Resende, M., Greedy randomized adaptive search procedure, Journal of Global Optimization, volume 42, pages 32-37, 1992.
23. Eberhart, R. C. and Kennedy, J., A new optimizer using particle swarm theory, Proceedings of the Sixth International Symposium on Micromachine and Human Science, Nagoya, Japan, pages 39-43, 1995.
24. Kennedy, J. and Eberhart, R. C., Particle swarm optimization, Proceedings of the IEEE International Conference on Neural Networks, Piscataway, NJ, pages 1942-1948, 1995.
25. Mühlenbein, H., Paaß, G., From recombination of genes to the estimation of distribution I. Binary parameters, Lecture Notes in Computer Science 1411: Parallel Problem Solving from Nature, volume PPSN IV, pages 178-187, 1996.
26. Rainer Storn, Kenneth Price, Differential Evolution – A Simple and Efficient Heuristic for global Optimization over Continuous Spaces, Journal of Global Optimization, volume 11, no. 4, pages 341-359, 1997.
27. Rubinstein, R.Y., Optimization of Computer simulation Models with Rare Events, European Journal of Operations Research, no. 99, pages 89-112, 1997.
28. Stefan Boettcher, Allon G. Percus, Extremal Optimization: Methods derived from Co-Evolution, Proceedings of the Genetic and Evolutionary Computation Conference, 1999.
29. Takagi, H., Active user intervention in an EC Search, Proceedings of the JCIS 2000.

## The problem with spreading new metaheuristics

Marcelo De Brito had interesting thoughts about what he calls the New Wave Of Genetic Algorithms. He is surprised that when "evolutionary computation" is applied to a new problem, the first algorithm used is the good old canonical genetic algorithm, despite the active research on Estimation of Distribution Algorithms. Julian Togelius writes that it may be because people do not understand other algorithms, or do not even know that anything else exists.

I think that is definitely true. This subject is a kind of hobby for me. Indeed, having come from ecology to applied mathematics, I feel like a generalist researcher, unable to be the best in any one place, but trying to be as good as possible in several fields. Concerning the field of what Marcelo calls NWOGA, I would like to emphasize some other things.

As David E. Goldberg says in his courses, "genetic algorithm" is the term everybody uses. For specialists, a GA is just one kind of "evolutionary algorithm", with specific rules that are defined more by history than by anything else.

The literature on evolutionary computation is quite large, the first algorithms having been designed in 1965 (evolution strategies, followed by evolutionary programming in 1966), which makes it difficult to spread deep changes to basic concepts.

There exist many more stochastic algorithms for global optimization than just evolutionary ones. I prefer to call them stochastic metaheuristics, or simply metaheuristics, because this leads to far less bias than a metaphoric naming (cf. the previous post on the classification of metaheuristics).

For example, during my PhD thesis, I became convinced that some Ant Colony Optimization algorithms are simply equivalent to Estimation of Distribution Algorithms when applied to continuous problems. Moreover, I am now convinced that many metaheuristics share common stochastic sampling processes that are not specifically related to evolution. For example, mathematically, Simulated Annealing is just a kind of EDA using an approximation of the objective function as a model (or the other way around, of course).

As Julian says: "I know roughly what an EDA does, but I couldn't sit down and implement one on the spot." This is, in my humble opinion, one of the most important things to keep in mind. Indeed, there are more and more papers claiming that, with a correct parameter setting, a metaheuristic can reach the performance of any competing metaheuristic.
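To make the point concrete: a continuous EDA really can be sketched in a few lines. This is a minimal, hypothetical example (a univariate-marginal Gaussian EDA minimizing the sphere function), not any specific published algorithm:

```python
import numpy as np

def sphere(x):
    return np.sum(x ** 2, axis=1)  # toy objective: minimum 0 at the origin

def gaussian_eda(f, dim=5, pop=100, keep=20, iters=50, seed=0):
    """Minimal continuous EDA: sample a Gaussian model, select the best
    points, re-estimate the model from them, and repeat."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)          # initial model
    for _ in range(iters):
        x = rng.normal(mu, sigma, size=(pop, dim))   # sample the model
        elite = x[np.argsort(f(x))[:keep]]           # truncation selection
        mu = elite.mean(axis=0)                      # re-estimate the model
        sigma = elite.std(axis=0) + 1e-12
    return mu

best = gaussian_eda(sphere)
```

The whole algorithmic content is the sample/select/re-estimate loop; everything else in the literature (covariance models, mixtures, selection schemes) is variation on these three steps.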

Thus, the true discriminatory criterion is not some fantasised intrinsic capability, but the ease of implementation and parameter setting on a specific problem. In other words, choose the algorithm you like, but be aware that many others exist.

Friday, October 12, 2007

## Classification of metaheuristics

I eventually found some time to attempt a graphical representation of how metaheuristics could be classified.

Here is a slidified version, which shows each class independently:

And a static image version:

Note that some metaheuristics are not completely contained in certain classes; this indicates that the method may or may not be considered part of the class, depending on your point of view.

I have reported the following metaheuristics:

• genetic programming, genetic algorithms, differential evolution,
• evolution strategies,
• estimation of distribution algorithms,
• particle swarm optimization,
• ant colony optimization,
• simulated annealing,
• tabu search,
• GRASP,
• variable neighborhood search,
• iterated, stochastic and guided local search.

And the following classes:

• metaheuristics vs local search,
• population metaheuristics vs trajectory-based ones,
• evolutionary computation or not,
• nature-inspired methods or not,
• dynamic objective function vs static ones,
• memory-based algorithms vs memory-less,
• implicit, explicit or direct metaheuristics.

I proposed the last class myself, so it may not be well known. You will find more information about it in the following paper: Adaptive Learning Search, a new tool to help comprehending metaheuristics, J. Dreo, J.-P. Aumasson, W. Tfaili, P. Siarry, International Journal on Artificial Intelligence Tools, Vol. 16, No. 3, 1 June 2007.

I did not place a stochastic category, as it seems a bit difficult to represent graphically. Indeed, many methods can be "stochasticized" or "derandomized" in several ways.

There are surely omissions or errors; feel free to give your point of view with a trackback, an email, or by modifying the SVG source version (comments are disabled due to spam that I have not had time to fight properly).

Friday, July 27, 2007

## Hybridization: estimation of distribution as a meta-model filter generator for metaheuristics?

An interesting idea is to use a meta-model (an a priori representation of the problem) as a filter to bias the sample produced by metaheuristics. This approach seems especially promising for engineering problems, where computing the objective function is very expensive.

One simple form of meta-model is a probability density function approximating the shape of the objective function. This PDF can then be used to filter out bad points before evaluation.

Why not, then, directly use an EDA to generate the sample? Because one can imagine that the shape of the problem is not well known, and that using a complex PDF is impossible (too expensive to compute, for example). Then a classical indirect metaheuristic (say, an evolutionary algorithm) should be preferable, being computationally inexpensive, for the sample generation. If one knows a good, not too computationally expensive approximation to use for the distribution of the EDA, one can imagine getting the best of both worlds.

An example could be a problem with real variables: using an EDA with a multivariate normal distribution is computationally expensive (mainly because of the estimation of the covariance), and using a mixture of Gaussian kernels makes it difficult to have an a priori on the problem. Thus, why not use an indirect metaheuristic to handle the sample generation, together with a meta-model whose parameters are estimated from the previous sample, according to a chosen distribution?
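As a rough sketch of this filtering idea (all names and thresholds here are hypothetical, and the Gaussian mutation stands in for any indirect metaheuristic):

```python
import numpy as np

def filtered_generation(parents, expensive_f, rng, sigma=0.3, keep_frac=0.5):
    """Mutate the parents, then use a cheap diagonal-Gaussian meta-model,
    fitted on the parents, to discard unpromising children *before* the
    expensive evaluation."""
    children = parents + rng.normal(0.0, sigma, size=parents.shape)  # mutation
    mu = parents.mean(axis=0)
    var = parents.var(axis=0) + 1e-12
    # log-density of each child under the meta-model
    logp = -0.5 * np.sum((children - mu) ** 2 / var + np.log(var), axis=1)
    n_keep = max(1, int(keep_frac * len(children)))
    kept = children[np.argsort(-logp)[:n_keep]]  # keep only the likely points
    return kept, expensive_f(kept)               # evaluate only the survivors

rng = np.random.default_rng(0)
parents = rng.normal(size=(10, 3))
kept, values = filtered_generation(parents, lambda x: np.sum(x ** 2, axis=1), rng)
```

Here only half of the generated children reach the objective function; the meta-model absorbs the other half for the cost of a mean and a variance.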

One more hybridization to try...

Thursday, July 5, 2007

## Error metrics

Many metrics are used to assess the quality of the approximation found by metaheuristics. Two of them are used very often: the distance to the true optimum in position, and the distance in value.

Unfortunately, the shape of the objective function can vary a lot in real-world problems, making these metrics difficult to interpret. For example, if the optimum lies in a very deep valley (in value), a solution close to it in position does not necessarily mean that the algorithm has learned the shape of that valley well. Conversely, a solution close to the optimum in value is not necessarily located in the same valley.

One metric that can counter these drawbacks is a distance taking into account the parameters of the problem as well as the value dimension.
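As an illustration, here is one possible combined metric (the weight put on the value axis is an arbitrary choice of mine, not a standard definition):

```python
import numpy as np

def error_metrics(x, fx, x_opt, f_opt, w=1.0):
    """Position error, value error, and a combined distance in the joint
    (parameters, value) space; the weight w on the value axis is arbitrary."""
    d_pos = np.linalg.norm(np.asarray(x) - np.asarray(x_opt))
    d_val = abs(fx - f_opt)
    d_combined = np.sqrt(d_pos ** 2 + (w * d_val) ** 2)
    return d_pos, d_val, d_combined

# A solution near the optimum in value but sitting in another valley
# still shows a large combined error:
d_pos, d_val, d_comb = error_metrics(x=[3.0, 4.0], fx=0.1,
                                     x_opt=[0.0, 0.0], f_opt=0.0)
```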

Anyway, the question of which type of distance to use depends on the problem.

Wednesday, March 28, 2007

## Random draw in a sphere

When adapting combinatorial metaheuristics to continuous problems, one sometimes uses a sphere as an approximation of the "neighbourhood". The idea is then to draw the neighbours around a solution, for example in order to apply a simple mutation in a genetic algorithm.

Sometimes one chooses to use a uniform law, but how does one draw random vectors uniformly in a hyper-sphere?

The first idea that comes to mind is to use a polar coordinate system and draw the radius r and the angles a1, ..., aN with a uniform law, then convert the coordinates to the rectangular system x1, ..., xN.

The result is interesting for metaheuristic design, but it is not a uniform distribution:

<img src="/public/randHS_false.png" />

The correct method is to draw each xi according to: xi = r^(1/N) · ai / √(∑j aj²)
(in LaTeX: $x_{i}=\frac{r^{\frac{1}{N}}\, a_{i}}{\sqrt{{\displaystyle \sum _{j=1}^{N}a_{j}^{2}}}}$)

with r drawn uniformly in U(0,1) and each aj drawn from the normal law N(0,1).

The result is then a true uniform distribution:

<img src="/public/randHS_ok.png" />

Credits goes to Maurice Clerc for detecting and solving the problem.
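The recipe above can be sketched as follows (a minimal version for the unit ball; scale and shift the result for an arbitrary centre and radius):

```python
import numpy as np

def uniform_in_ball(n, dim, rng=None):
    """Draw n points uniformly inside the unit hyper-ball of dimension dim."""
    rng = np.random.default_rng(rng)
    a = rng.standard_normal((n, dim))              # isotropic Gaussian direction
    a /= np.linalg.norm(a, axis=1, keepdims=True)  # project onto the unit sphere
    r = rng.uniform(size=(n, 1)) ** (1.0 / dim)    # radius with the 1/N correction
    return r * a

points = uniform_in_ball(1000, 3, rng=0)
```

The 1/N power on the radius is what the naive polar method misses: without it, points pile up near the centre, since the volume of a shell grows like r^(N-1).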

Sunday, January 7, 2007

## Evolving Objects 1.0 is out

The EO framework has just reached version 1.0. This is one of the most interesting libraries for metaheuristics.

It is a templatized C++ framework with a component-based architecture. EO is focused on "evolutionary computing" (a synonym of metaheuristics, imho) and can be used for any population-based metaheuristic. There exist versions for local search, multi-objective optimization and parallel architectures... a really good piece of software :-)

Thursday, December 21, 2006

## Random and learning

The Machine Learning (Theory) blog, by John Langford, has a very interesting post on "Explicit Randomization in Learning algorithms".

The post and its comments talk about machine learning, but they can largely be applied to metaheuristics. The page lists several reasons for using randomization, some of which are of special interest for metaheuristics:

1. symmetry breaking as a way to make decisions, which is of great importance for metaheuristics, which must learn and choose where the "promising regions" are;
2. overfit avoidance, which is related to the intensification/diversification balance problem;
3. adversary defeating and bias suppression, which can be interpreted as trying to design a true meta-heuristic (i.e. one that can be applied to several problems without major changes).

Of course, it should be possible to design a completely deterministic algorithm that makes decisions, achieves a correct i/d balance and can tackle all problems... Even if this forces one to integrate the problems themselves into the algorithm, it should be possible. The drawback is that it would be computationally intractable.

In fact, metaheuristics (and, as far as I understand, machine-learning algorithms) are located somewhere between random search algorithms and deterministic ones. The compromise between these two tendencies depends on the problem and on the available computational effort.

Tuesday, December 19, 2006

## Metaheuristics and machine-learning

Metaheuristics and machine-learning algorithms share a large number of characteristics, like stochastic processes, the manipulation of probability density functions, etc.

One of the interesting evolutions of research on metaheuristics in recent years is the increasing bridge-building with machine learning. I see at least two interesting pathways: the use of metaheuristics in machine learning, and the use of machine learning in metaheuristics.

The first point is not really new: machine learning heavily uses optimization, and it was natural to try stochastic algorithms where local search or exact algorithms failed. Nevertheless, there is now enough literature to organize special sessions in some symposia. For 2007, there will be a special session on Genetics-Based Machine Learning at CEC, and a track on Genetics-Based Machine Learning and Learning Classifier Systems at GECCO. These events are centered around "genetic" algorithms (see the posts on the IlliGAL blog: 1, 2), despite the fact that several papers use other metaheuristics, like simulated annealing; but this is a common drawback, and it does not affect the interest of the subject.

The second point is less exploited, but I find it of great interest. A simple example of what can be done with machine learning inside a metaheuristic is given by estimation of distribution algorithms. In these metaheuristics, a probability density function is used to explicitly build a new sample of the objective function (a "population", in evolutionary computation terminology) at each iteration. It is then crucial to build a probability density function that is related to the structure of the objective function (the "fitness landscape"). Thus, it should be really interesting to build the model of the pdf itself from a selected sample, using a machine-learning algorithm. There are some interesting papers about this.

If you mix these approaches with the problem of estimating a Boltzmann distribution (the basis of simulated annealing), you get an awesome research field...
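For reference, the Boltzmann distribution enters simulated annealing through the Metropolis acceptance rule; a minimal sketch (the function name is mine):

```python
import math
import random

def metropolis_accept(delta, temperature, rng=random):
    """Accept a candidate move with the Boltzmann probability exp(-delta/T),
    where delta is the increase in objective value (minimization)."""
    if delta <= 0:
        return True  # improving moves are always accepted
    return rng.random() < math.exp(-delta / temperature)
```

At high temperature almost every move is accepted (near-random search); at low temperature only (near-)improving moves survive, which is exactly the random-to-deterministic spectrum discussed above.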
