Mentales habitudes


Tag - estimation of distribution


Thursday, September 11, 2008

The ultimate metaheuristic?

There exist many different algorithm families that can be called "metaheuristics"; strictly speaking, there is a very, very, very large number of metaheuristic instances.

Defining what a metaheuristic "family" is remains a difficult problem: when may I call this or that algorithm an evolutionary one? Is estimation of distribution a sub-family of genetic algorithms? What is the difference between ant colony optimization and stochastic gradient ascent? Etc.

Despite the difficulty of classifying metaheuristics, there are some interesting characteristics shared by stochastic metaheuristics. Indeed, they all iteratively manipulate a sample of the objective function[1].

For example, simulated annealing is often depicted as a probabilistic descent algorithm, but it is more than that. Indeed, simulated annealing is based on the Metropolis-Hastings algorithm, which is a way of sampling any probability distribution, as long as you can compute its density at any point. Thus, simulated annealing uses an approximation of the objective function as a probability density function from which to generate a sample. This is even more obvious if you consider a step-by-step decrease of the temperature. Estimation of distribution algorithms are another obvious example: they explicitly manipulate samples, but one can have the same thoughts about evolutionary algorithms, even if they manipulate the sample rather implicitly.
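To make this concrete, here is a minimal sketch (mine, not taken from the chapter cited below) of simulated annealing written as a Metropolis-Hastings sampler: at a fixed temperature T, the acceptance rule samples the density proportional to exp(-f(x)/T); the Gaussian proposal, the step size and the geometric cooling are arbitrary choices made only for the illustration.

import math, random

# the objective function to minimize
def f( x ):
        return x**2

def sa_sample( f, x0, T=10.0, cooling=0.95, steps=1000, step_size=1.0 ):
        # simulated annealing seen as Metropolis-Hastings sampling of exp(-f(x)/T)
        x = x0
        for i in range( steps ):
                # Gaussian proposal around the current point
                candidate = x + random.gauss( 0.0, step_size )
                delta = f( candidate ) - f( x )
                # Metropolis acceptance rule: always accept improvements,
                # accept degradations with probability exp(-delta/T)
                if delta <= 0 or random.random() < math.exp( -delta / T ):
                        x = candidate
                # temperature decrease
                T *= cooling
        return x

print( sa_sample( f, x0=50.0 ) )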

The diagram tries to illustrate this idea: (a) a descent algorithm can have the same sampling behaviour as one iteration of (b) a "population" method.

Given these common processes, is it possible to design a kind of "universal" metaheuristic? Theoretically, the answer is yes. For example, in the continuous domain, consider an estimation of distribution algorithm using a mixture of Gaussian kernels: it can learn any probability density function (possibly needing an infinite number of kernels). Thus, by carefully choosing the mixture used at each iteration and the selection operator, one can reproduce the behaviour of any stochastic metaheuristic.
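As a rough sketch of this idea (my own illustration, on a one-dimensional problem), one iteration of such a kernel-based EDA can select the best points and then sample the mixture of Gaussian kernels centred on them, which amounts to picking a selected point at random and perturbing it; the selection ratio and the kernel bandwidth below are arbitrary values chosen for the example.

import random

# the objective function to minimize
def f( x ):
        return x**2

def kernel_eda_iteration( sample, f, select_ratio=0.5, bandwidth=0.5 ):
        # selection: keep the best points
        ranked = sorted( sample, key=f )
        selected = ranked[ : int( len(sample) * select_ratio ) ]
        # sample the mixture of Gaussian kernels centred on the selected points:
        # pick a kernel (a selected point) at random, then draw from it
        return [ random.choice( selected ) + random.gauss( 0.0, bandwidth )
                 for _ in range( len(sample) ) ]

sample = [ random.uniform( -100, 100 ) for _ in range( 50 ) ]
for iteration in range( 30 ):
        sample = kernel_eda_iteration( sample, f )

print( min( sample, key=f ) )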

Of course, choosing the correct mixture (and the other parameters) is a very difficult problem in practice. But I find it interesting that the problem of designing a metaheuristic can be reduced to a configuration problem.

Notes

[1] Johann Dréo, Patrick Siarry, "Stochastic metaheuristics as sampling techniques using swarm intelligence", in "Swarm Intelligence: Focus on Ant and Particle Swarm Optimization", Felix T. S. Chan, Manoj Kumar Tiwari (Eds.), Advanced Robotic Systems International, I-Tech Education and Publishing, Vienna, Austria, ISBN 978-3-902613-09-7, December 2008.

Monday, June 23, 2008

An Estimation of Distribution Algorithm in a few lines of python

Regarding an interesting post on which I have already commented, Julian Togelius said: "I know roughly what an EDA does, but I couldn't sit down and implement one on the spot [...]"

I personally think that estimation of distribution algorithms (EDA) are among the most elegant and easy-to-use metaheuristics. Obviously, this highly depends on one's mindset :-) Anyway, as a piece of code is more comprehensible than a long speech, I have written a very simple and small EDA in Python to illustrate my thought.

This is a simple continuous EDA using a (also simple) normal probability density function to optimize a (once more, simple) function of two variables. As you can see, the code is (guess what?) simple: only a few lines with some numpy functions, and that's it.

from numpy import argsort, floor, mean, std, zeros, random

# The problem to optimize
def x2y2( x ):
        return x[0]**2 + x[1]**2

class eda:
        def __init__(self, of):
                # Algorithm parameters
                self.iterations = 100
                self.sample_size = 100
                self.select_ratio = 0.5
                self.epsilon = 10e-6

                # class members
                self.objective_function = of
                self.dimensions = 2
                self.sample = []
                self.means = []
                self.stdevs = []        
                self.debug = False

        def run(self):
                # uniform initialization
                self.sample = random.rand( self.sample_size, self.dimensions+1 )
                # rescale the uniform sample to the search range [-100, 100]
                self.sample = self.sample * 200 - 100

                self.evaluate()
                
                # main loop
                i = 0
                while i < self.iterations:
                        i += 1
                        self.dispersion_reduction()
                        self.estimate_parameters()
                        self.draw_sample()
                        self.evaluate()

                # sort the final sample
                self.sample_sort()
                # output the optimum
                print "#[ x y f(x,y) ]"
                print self.sample[0]

        def sample_sort(self):
                # sort rows on the last column
                self.sample = self.sample[ argsort( self.sample[:,-1], 0 ) ]

        def dispersion_reduction(self):
                self.sample_sort()

                # number of points to select
                nb = int( floor( self.sample_size * self.select_ratio ) )

                # selection
                self.sample = self.sample[:nb]

        def estimate_parameters( self ):
                # points sub array (without values)
                mat = self.sample[:,:self.dimensions]
                
                # per-dimension means (mean over the rows, i.e. along axis 0)
                self.means = mean( mat, 0 )

                # per-dimension standard deviations
                self.stdevs = std( mat, 0 )

        def draw_sample(self):
                # for each variable to optimize
                for i in range(self.dimensions):
                        # if the dispersion is null
                        if self.stdevs[i] == 0.0:
                                # set it to a minimal value
                                self.stdevs[i] = self.epsilon
                
                # empty sample
                self.sample = zeros( (self.sample_size, self.dimensions+1) )
                
                # for each point
                for i in range( self.sample_size ):
                        # draw in random normal
                        p = random.normal( self.means, self.stdevs )
                        # put it into the sample
                        self.sample[i][:self.dimensions] = p

        def evaluate(self):
                # for each point
                for i in range( self.sample_size ):
                        d = self.dimensions
                        # call the objective function
                        #   the third element is the result of the objective function call
                        #   taking the first two elements as variables
                        self.sample[i][-1]  = self.objective_function( self.sample[i][:d] )

if __name__=="__main__":
        a = eda( x2y2 )
        a.run()

See also the standalone file, which includes a debug mode, to see in detail how it works.

Friday, July 27, 2007

Hybridization: estimation of distribution as a meta-model filter generator for metaheuristics?

An interesting idea is to use a meta-model (an a priori representation of the problem) as a filter to bias the sample produced by a metaheuristic. This approach seems especially promising for engineering problems, where computing the objective function is very expensive.

One simple form of meta-model is a probability density function, approximating the shape of the objective function. This PDF could thus be used to filter out bad points before evaluation.

Why, then, not directly use an EDA to generate the sample? Because one can imagine that the shape of the problem is not well known, and that using a complex PDF is impossible (too expensive to compute, for example). In that case, using a classical indirect metaheuristic (let's say an evolutionary algorithm) should be preferable (computationally inexpensive) for the sample generation. If one knows a good approximation to use for the distribution of the EDA (not too computationally expensive), one can imagine combining the best of both worlds.

An example could be a problem with real variables: using an EDA with a multivariate normal distribution is computationally expensive (mainly due to the estimation of the covariance), and using a mixture of Gaussian kernels makes it difficult to express an a priori on the problem. Thus, why not use an indirect metaheuristic to handle the sample generation, together with a meta-model whose parameters are estimated from the previous sample, according to a chosen distribution?
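As a purely illustrative sketch of such a filter (the function names and the threshold below are mine, chosen only to fix ideas): a simple normal distribution is estimated from the previously evaluated points, and the candidates proposed by a cheap generator, standing in for the indirect metaheuristic, only reach the expensive objective function if their density under this meta-model is high enough.

import math, random

def gaussian_density( x, mean, stdev ):
        return math.exp( -0.5 * ((x - mean) / stdev)**2 ) / (stdev * math.sqrt( 2 * math.pi ))

def filtered_sample( previous_points, propose, n, threshold=0.1 ):
        # estimate the meta-model parameters from the previously evaluated points
        mean = sum( previous_points ) / len( previous_points )
        var = sum( (p - mean)**2 for p in previous_points ) / len( previous_points )
        stdev = math.sqrt( var ) or 1e-6
        peak = gaussian_density( mean, mean, stdev )
        accepted = []
        while len( accepted ) < n:
                candidate = propose()
                # keep the candidate only if the meta-model finds it plausible,
                # so that the expensive objective function is called less often
                if gaussian_density( candidate, mean, stdev ) / peak >= threshold:
                        accepted.append( candidate )
        return accepted

# previously evaluated points (here, a made-up cloud around 2.0)
previous = [ random.gauss( 2.0, 1.0 ) for _ in range( 20 ) ]
# a cheap sample generator, standing in for an evolutionary variation operator
candidates = filtered_sample( previous, lambda: random.uniform( -10, 10 ), n=10 )
print( candidates )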

One more hybridization to try...

Tuesday, August 1, 2006

About this blog

This blog is an attempt to publish thoughts about metaheuristics and to share them with others. Indeed, blogs are fun, blogs are popular, ok... but most of all, blogs can be very useful for researchers, who constantly need to communicate and to share ideas and information.

Metaheuristics are (well, that's one definition among others, but in my opinion the best one) iterative (stochastic) algorithms for "hard" optimization. Well-known metaheuristics are the so-called "genetic algorithms" (let's call them evolutionary ones), but these are not the only class: don't forget simulated annealing, tabu search, ant colony algorithms, estimation of distribution, etc.

This blog will try to focus on the theory, the design, the understanding, the application, the implementation and the use of metaheuristics. I hope it will be useful to other people (researchers as well as users), and will be a place to share thoughts.

Welcome aboard, and let's sleep with metaheuristics.