# Mentales habitudes

## Tag - estimation of distribution

jeudi 11 septembre 2008

## The ultimate metaheuristic?

There exists a lot of different algorithms families that can be called "metaheuristics", stricly speaking, there are a very, very, very large number of metaheuristics instances.

Defining what is a metaheuristic "family" is a difficult problem: when may I called this or this algorithm an evolutionary one? Is estimation of distribution a sub-family of genetic algorithms? What is the difference between ant colony optimization and stochastic gradient ascent? Etc.

Despite the difficulty of classifying metaheuristics, there is some interesting characteristics shared by stochastic metaheuristics. Indeed, they are all iteratively manipulating a sample of the objective function

For example, simulated annealing is often depicted as a probabilistic descent algorithm, but it is more than that. Indeed, simulated annealing is based on the Metropolis-Hastings algorithm, which is a way of sampling any probability distributionn, as long as you can calculate its density at any point. Thus, simulated annealing use an approximation of the objective function as a probability density function to generate a sampling. It is even more obvious if you consider a step by step decrease of the temperature. Estimation of distribution are another obvious example: they are explicitly manipulating samplings, but one can also have the same thoughts about evolutionary algorithms, even if they are manipulating the sampling rather implicitely. The diagram tries to illustrate this idea: (a) a descent algorithm can have the same sampling behaviour than an iteration of a (b) "population" method.

Given these common processes, is it possible to design a kind of "universal" metaheuristic ? Theoretically, the answer is yes. For example, in the continuous domain, consider an estimation of distribution algorithm, using a mixture of gaussian kernel: it can learn any probability density function (possibly needing an infinite number of kernels). Thus, carefully choosing the function to use at each iteration and the selection operator, one can reproduce the behaviour of any stochastic metaheuristic.

Of course, choosing the correct mixture (and the other parameters) is a very difficult problem in practice. But I find interesting the idea that the problem of designing a metaheuristic can be reduced to a configuration problem.

#### Notes

 Johann Dréo, Patrick Siarry, "Stochastic metaheuristics as sampling techniques using swarm intelligence. ", in "Swarm Intelligence: Focus on Ant and Particle Swarm Optimization", Felix T. S. Chan, Manoj Kumar Tiwari (Eds.), Advanced Robotic Systems International, I-Tech Education and Publishing, Vienna, Austria , ISBN 978-3-902613-09-7 - December 2008

lundi 23 juin 2008

## An Estimation of Distribution Algorithm in a few lines of python

About an interesting post I have already commented, Julian Togelius said: "I know roughly what an EDA does, but I couldn't sit down an implement one on the spot [...]"

I personally think that estimation of distribution algorithms (EDA) are some of the more elegant an easy to use metaheuristics. Obviously, this highly depends of the mind configuration :-) Anyway, as a piece of code is more comprehensive than a long speech, I've made a very simple and small EDA in python, to illustrate my thought.

This is a simple continuous EDA using a (also simple) normal probability density function to optimize a (once more simple) function with two variables. As you can see, the code is (guess what ?) simple, only a few lines with some scipy functions, and that's it.

```from scipy import * # The problem to optimize def x2y2( x ):         return x**2 + x**2 class eda:         def __init__(self, of):                 # Algorithm parameters                 self.iterations = 100                 self.sample_size = 100                 self.select_ratio = 0.5                 self.epsilon = 10e-6                 # class members                 self.objective_function = of                 self.dimensions = 2                 self.sample = []                 self.means = []                 self.stdevs = []                         self.debug = False         def run(self):                 # uniform initialization                 self.sample = random.rand( self.sample_size, self.dimensions+1 )                 # cosmetic                 self.sample = self.sample * 200 - 100                 self.evaluate()                                  # main loop                 i = 0                 while i < self.iterations:                         i += 1                         self.dispersion_reduction()                         self.estimate_parameters()                         self.draw_sample()                         self.evaluate()                 # sort the final sample                 self.sample_sort()                 # output the optimum                 print "#[ x y f(x,y) ]"                 print self.sample         def sample_sort(self):                 # sort rows on the last column                 self.sample = self.sample[ argsort( self.sample[:,-1], 0 ) ]         def dispersion_reduction(self):                 self.sample_sort()                 # number of points to select                 nb = int( floor( self.sample_size * self.select_ratio ) )                 # selection                 self.sample = self.sample[:nb]         def estimate_parameters( self ):                 # points sub array (without values)                 mat = self.sample[:,:self.dimensions]                                  # row means (axis 0 in scipy)                 self.means = mean( mat, 0 )                                  # row standard deviation                 self.stdevs = std( mat, 0 )         def draw_sample(self):                 # for each variable to optimize                 for i in xrange(self.dimensions):                         # if the dispersion is null                         if self.stdevs[i] == 0.0:                                 # set it to a minimal value                                 self.stdevs[i] = self.epsilon                                  # empty sample                 self.sample = zeros( (self.sample_size, self.dimensions+1) )                                  # for each point                 for i in xrange( self.sample_size ):                         # draw in random normal                         p = random.normal( self.means, self.stdevs )                         # put it into the sample                         self.sample[i][:self.dimensions] = p         def evaluate(self):                 # for each point                 for i in xrange( self.sample_size ):                         d = self.dimensions                         # call the objective function                         #   the third element is the result of the objective function call                         #   taking the first two elements as variables                         self.sample[i][-1]  = self.objective_function( self.sample[i][:d] ) if __name__=="__main__":         a = eda( x2y2 )         a.run() ``` See also the file alone with a debug mode, to see how it works in details.

vendredi 27 juillet 2007

## Hybridization : estimation of distribution as a meta-model filter generator for metaheuristics ?

An interesting idea is to use meta-model (a priori representation of the problem) as a filter to bias the sample produced by metaheuristics. This approach seems especially promising for engineering problem, where computing the objective function is very expensive.

One simple form of meta-model is a probability density function, approximating the shape of the objective function. This PDF could thus be used to filter out bad points before evaluation.

Why, then, do not directly use EDA to generate the sample ? Because one can imagine that the problem shape is not well known, and that using a complex PDF is impossible (too expensive to compute, for example). Then, using a classical indirect metaheuristic (let say an evolutionary algorithm) should be preferable (computationnaly inexpensive) for the sample generation. If one know a good approximation to use for the distribution of the EDA (not too computationnaly expensive), one can imagine using the best part of the two worlds.

An example could be a problem with real variable : using an EDA with a multi-variate normal distribution is computationnaly expensive (due to the estimation of the co-variance, mainly), and using a mixture of gaussian kernels makes difficult to have an a priori on the problem. Thus, why not using a indirect metaheuristic to handle the sample generation, and use a meta-model which parameters are estimated from the previous sample, according to a chosen distribution ?

One more hybridization to try...

mardi 1 août 2006