<h1>Mentales habitudes - Tag: estimation of distribution</h1>
<h2>The ultimate metaheuristic?</h2>
<p><em>2008-09-11, by nojhan</em></p>
<p>There exist many different families of algorithms that can be called
"metaheuristics"; strictly speaking, there is a very, very large number
of <a href="http://metah.nojhan.net/tag/metaheuristic">metaheuristic</a> <em>instances</em>.</p>
<p>Defining what a metaheuristic "family" is is a difficult problem: when may I
call this or that algorithm an evolutionary one? Is estimation of
distribution a sub-family of genetic algorithms? What is the difference between
ant colony optimization and stochastic gradient ascent? Etc.</p>
<p>Despite the <a href="http://metah.nojhan.net/post/2007/10/12/Classification-of-metaheuristics">difficulty of classifying
metaheuristics</a>, there are some interesting characteristics shared by
stochastic metaheuristics. Indeed, they all iteratively manipulate a
sample of the objective function<sup>[<a href="http://metah.nojhan.net/post/2008/09/11/#pnote-276667-1" id="rev-pnote-276667-1" name="rev-pnote-276667-1">1</a>]</sup>.</p>
<p>For example, <a href="http://metah.nojhan.net/tag/simulated%20annealing">simulated annealing</a> is
often depicted as a probabilistic <a href="http://metah.nojhan.net/tag/descent%20algorithm">descent
algorithm</a>, but it is more than that. Indeed, simulated annealing is based
on the <a href="http://metah.nojhan.net/tag/Metropolis-Hastings%20algorithm">Metropolis-Hastings
algorithm</a>, which is a way of sampling any probability distribution, as
long as you can calculate its density at any point. Thus, <strong>simulated
annealing uses an approximation of the objective function as a probability
density function to generate a <a href="http://metah.nojhan.net/tag/sampling">sampling</a></strong>.
This is even more obvious if you consider a step-by-step decrease of the
temperature. <a href="http://metah.nojhan.net/tag/estimation%20of%20distribution">Estimation of
distribution</a> algorithms are another obvious example: they explicitly manipulate
samples, but one can have the same thoughts about <a href="http://metah.nojhan.net/tag/evolutionary%20computation">evolutionary algorithms</a>, even if they
manipulate the sample rather implicitly.</p>
<p><img src="http://upload.wikimedia.org/wikipedia/commons/thumb/f/f9/Metaheuristic_parcours-population.png/250px-Metaheuristic_parcours-population.png" alt="" /></p>
<p>The diagram tries to illustrate this idea: (a) a descent algorithm can have
the same sampling behaviour as an iteration of a (b) "population" method.</p>
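<p>To make the sampling view concrete, here is a minimal sketch (my own
illustration, not code from the cited chapter) of Metropolis-Hastings sampling
of a density proportional to exp(-f(x)/T), which is exactly the move-acceptance
rule at the heart of simulated annealing at a fixed temperature T:</p>
<pre><code>import math
import random

def metropolis_hastings(f, x0, temperature, steps, step_size=0.5, rng=random):
    """Sample from the density proportional to exp(-f(x)/T).

    At a fixed temperature this is the acceptance rule of simulated
    annealing: downhill moves are always accepted, uphill moves are
    accepted with probability exp(-delta/T).
    """
    x, fx = x0, f(x0)
    samples = []
    for _ in range(steps):
        # symmetric proposal: a small uniform move around the current point
        y = x + rng.uniform(-step_size, step_size)
        fy = f(y)
        delta = fy - fx
        if delta &lt;= 0 or rng.random() &lt; math.exp(-delta / temperature):
            x, fx = y, fy  # accept the move
        samples.append(x)
    return samples

if __name__ == "__main__":
    rng = random.Random(42)
    # sample the density exp(-x**2 / T): a Gaussian-shaped distribution
    chain = metropolis_hastings(lambda x: x**2, x0=5.0, temperature=1.0,
                                steps=5000, rng=rng)
    # after a burn-in, the sample concentrates near the minimum of f
    print(sum(chain[1000:]) / len(chain[1000:]))
</code></pre>
<p>Annealing then amounts to lowering <code>temperature</code> between runs of
this loop, which tightens the sampled distribution around the optima.</p>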
<p>Given these common processes, is it possible to design a kind of "universal"
metaheuristic? Theoretically, the answer is yes. For example, in the
continuous domain, consider an estimation of distribution algorithm using a
<a href="http://metah.nojhan.net/tag/mixture%20of%20gaussian%20kernel">mixture of Gaussian kernels</a>:
it can learn any probability density function (possibly needing an infinite
number of kernels). Thus, by carefully choosing the function used at each
iteration and the selection operator, <strong>one can reproduce the behaviour
of any stochastic metaheuristic</strong>.</p>
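<p>As an illustration of the building block involved (again my own sketch, not
taken from any particular EDA), drawing a sample from a mixture of Gaussian
kernels is only a few lines:</p>
<pre><code>import random

def sample_gaussian_mixture(kernels, n, rng=random):
    """Draw n points from a mixture of 1-D Gaussian kernels.

    `kernels` is a list of (weight, mean, stdev) triples, with weights
    summing to 1. With enough kernels, such a mixture can approximate
    any continuous density, which is what makes a kernel-based EDA
    so general.
    """
    weights = [w for w, _, _ in kernels]
    points = []
    for _ in range(n):
        # pick a kernel according to the weights, then draw from it
        _, mu, sigma = rng.choices(kernels, weights=weights)[0]
        points.append(rng.gauss(mu, sigma))
    return points

if __name__ == "__main__":
    rng = random.Random(1)
    # two kernels centred on two promising regions of the search space
    pts = sample_gaussian_mixture([(0.5, -3.0, 0.5), (0.5, 3.0, 0.5)],
                                  1000, rng=rng)
    print(len(pts))
</code></pre>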
<p>Of course, choosing the correct mixture (and the other parameters) is a very
difficult problem in practice. But I find it interesting that <strong>the
problem of designing a metaheuristic can be reduced to a configuration
problem</strong>.</p>
<div class="footnotes">
<h4>Notes</h4>
<p>[<a href="http://metah.nojhan.net/post/2008/09/11/#rev-pnote-276667-1" id="pnote-276667-1" name="pnote-276667-1">1</a>] Johann Dréo, Patrick Siarry, "<a href="http://www.nojhan.net/pro/spip.php?article26" hreflang="en">Stochastic
metaheuristics as sampling techniques using swarm intelligence</a>", in
"Swarm Intelligence: Focus on Ant and Particle Swarm Optimization", Felix T. S.
Chan, Manoj Kumar Tiwari (Eds.), Advanced Robotic Systems International, I-Tech
Education and Publishing, Vienna, Austria, ISBN 978-3-902613-09-7, December
2008</p>
</div>
<h2>An Estimation of Distribution Algorithm in a few lines of Python</h2>
<p><em>2008-06-23, by nojhan</em></p>
<p>Regarding an <a href="http://geneticargonaut.blogspot.com/2008/02/evolving-grid-computing-optimization.html">interesting post</a> I have <a href="http://metah.nojhan.net/post/2008/03/03/The-problem-with-spreading-new-metaheuristics">already
commented on</a>, <a href="http://www.blogger.com/profile/09333191187316058782">Julian Togelius</a> said:
<cite>"I know roughly what an EDA does, but I couldn't sit down an implement
one on the spot [...]"</cite></p>
<p>I personally think that estimation of distribution algorithms (EDA) are some
of the most elegant and easy-to-use metaheuristics. Obviously, this highly
depends on one's frame of mind :-) Anyway, as a piece of code is more
comprehensible than a long speech, I've made a very simple and small EDA in
Python, to illustrate my point.</p>
<p>This is a simple continuous EDA using a (also simple) normal probability
density function to optimize a (once more, simple) function of two variables.
As you can see, the code is (guess what?) simple: only a few lines with some
<a href="http://www.scipy.org">scipy</a> functions, and that's it.</p>
<pre><code>from scipy import *

# The problem to optimize
def x2y2( x ):
    return x[0]**2 + x[1]**2

class eda:
    def __init__(self, of):
        # Algorithm parameters
        self.iterations = 100
        self.sample_size = 100
        self.select_ratio = 0.5
        self.epsilon = 10e-6

        # class members
        self.objective_function = of
        self.dimensions = 2
        self.sample = []
        self.means = []
        self.stdevs = []
        self.debug = False

    def run(self):
        # uniform initialization
        self.sample = random.rand( self.sample_size, self.dimensions+1 )
        # cosmetic
        self.sample = self.sample * 200 - 100

        self.evaluate()

        # main loop
        i = 0
        while i &lt; self.iterations:
            i += 1
            self.dispersion_reduction()
            self.estimate_parameters()
            self.draw_sample()
            self.evaluate()

        # sort the final sample
        self.sample_sort()
        # output the optimum
        print "#[ x y f(x,y) ]"
        print self.sample[0]

    def sample_sort(self):
        # sort rows on the last column
        self.sample = self.sample[ argsort( self.sample[:,-1], 0 ) ]

    def dispersion_reduction(self):
        self.sample_sort()

        # number of points to select
        nb = int( floor( self.sample_size * self.select_ratio ) )

        # selection
        self.sample = self.sample[:nb]

    def estimate_parameters(self):
        # points sub-array (without the objective values)
        mat = self.sample[:,:self.dimensions]

        # column means (axis 0 in scipy)
        self.means = mean( mat, 0 )

        # column standard deviations
        self.stdevs = std( mat, 0 )

    def draw_sample(self):
        # for each variable to optimize
        for i in xrange(self.dimensions):
            # if the dispersion is null
            if self.stdevs[i] == 0.0:
                # set it to a minimal value
                self.stdevs[i] = self.epsilon

        # empty sample
        self.sample = zeros( (self.sample_size, self.dimensions+1) )

        # for each point
        for i in xrange( self.sample_size ):
            # draw from the normal distribution
            p = random.normal( self.means, self.stdevs )
            # put it into the sample
            self.sample[i][:self.dimensions] = p

    def evaluate(self):
        # for each point
        for i in xrange( self.sample_size ):
            d = self.dimensions
            # the last element of each row is the result of the objective
            # function called on the first d elements, taken as variables
            self.sample[i][-1] = self.objective_function( self.sample[i][:d] )

if __name__ == "__main__":
    a = eda( x2y2 )
    a.run()
</code></pre>
<p>See also <a href="http://metah.nojhan.net/public/eda_demo.py">the file
alone, with a debug mode</a>, to see how it works in detail.</p>
<h2>Hybridization: estimation of distribution as a meta-model filter generator for metaheuristics?</h2>
<p><em>2007-07-27, by nojhan</em></p>
<p>An interesting idea is to use a meta-model (an a priori representation of the
problem) as a filter to bias the sample produced by metaheuristics. This
approach seems especially promising for engineering problems, where computing
the objective function is very expensive.</p>
<p>One simple form of meta-model is a probability density function,
approximating the shape of the objective function. This PDF could thus be used
to filter out bad points <em>before</em> evaluation.</p>
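<p>A minimal sketch of that filtering step (my own illustration; the Gaussian
meta-model and its parameters are assumptions, not anything from a specific
method): candidates are kept with a probability given by the meta-model
density, so probably-bad points are rejected cheaply.</p>
<pre><code>import math
import random

def pdf_filter(candidates, mean, stdev, rng=random):
    """Keep each candidate with probability proportional to a Gaussian
    meta-model density centred on `mean`: a cheap rejection step that
    discards probably-bad points before any expensive evaluation."""
    kept = []
    for x in candidates:
        # Gaussian density, rescaled so that the mode has density 1
        density = math.exp(-0.5 * ((x - mean) / stdev) ** 2)
        if rng.random() &lt; density:
            kept.append(x)
    return kept

if __name__ == "__main__":
    rng = random.Random(3)
    candidates = [rng.uniform(-10, 10) for _ in range(1000)]
    # the meta-model believes the optimum is near 0
    survivors = pdf_filter(candidates, mean=0.0, stdev=2.0, rng=rng)
    # only the survivors would be passed to the expensive objective function
    print(len(survivors), len(candidates))
</code></pre>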
<p>Why not, then, directly use an EDA to generate the sample? Because one can
imagine that the shape of the problem is not well known, and that using a
complex PDF is impossible (too expensive to compute, for example). Then, using
a classical indirect metaheuristic (say, an evolutionary algorithm) should be
preferable (computationally inexpensive) for the sample generation. If one
knows a good approximation to use for the distribution of the EDA (not too
computationally expensive), one can imagine getting the best of both worlds.</p>
<p>An example could be a problem with real variables: using an EDA with a
multivariate normal distribution is computationally expensive (mainly due to
the estimation of the covariance), and using a mixture of Gaussian kernels
makes it difficult to have an <em>a priori</em> on the problem. Thus, why not
use an indirect metaheuristic to handle the sample generation, and use a
meta-model whose parameters are estimated from the previous sample, according
to a chosen distribution?</p>
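<p>A toy version of that hybrid loop (entirely my own sketch; every name and
parameter here is hypothetical): an evolutionary-style mutation generates
candidates, while a Gaussian meta-model, re-estimated from the previously kept
points, pre-filters them before evaluation.</p>
<pre><code>import math
import random

def hybrid_step(parents, objective, rng=random, step=1.0):
    """One generation: mutate parents, filter offspring through a Gaussian
    meta-model estimated from the parents, then evaluate and select."""
    n = len(parents)
    # estimate the meta-model parameters from the previous sample
    mean = sum(parents) / n
    stdev = max(1e-6, math.sqrt(sum((x - mean) ** 2 for x in parents) / n))
    offspring = []
    while len(offspring) &lt; n:
        # evolutionary-style variation: mutate a random parent
        x = rng.choice(parents) + rng.gauss(0.0, step * stdev)
        # meta-model filter: accept with probability given by the density
        if rng.random() &lt; math.exp(-0.5 * ((x - mean) / stdev) ** 2):
            offspring.append(x)
    # evaluate only the filtered offspring; keep the best half of the union
    pool = sorted(parents + offspring, key=objective)
    return pool[:n]

if __name__ == "__main__":
    rng = random.Random(7)
    pop = [rng.uniform(-100, 100) for _ in range(20)]
    for _ in range(50):
        pop = hybrid_step(pop, lambda x: x * x, rng=rng)
    print(min(abs(x) for x in pop))
</code></pre>
<p>The point of the design is that <code>objective</code> is only ever called
on points that survived the cheap meta-model filter.</p>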
<p>One more hybridization to try...</p>
<h2>About this blog</h2>
<p><em>2006-08-01, by nojhan</em></p>
<p>This blog is an attempt to publish thoughts about metaheuristics and to
share them with others. Indeed, blogs are fun, blogs are popular, OK... but
most of all, blogs can be very useful for researchers, who constantly need to
communicate and share ideas and information.</p>
<p>Metaheuristics are (well, that's one definition among others, but in my
opinion the best one) iterative (stochastic) algorithms for "hard"
optimization. Well-known metaheuristics are the so-called "genetic algorithms"
(let's call them <em>evolutionary</em> ones), but these are not the only class:
don't forget simulated annealing, tabu search, ant colony algorithms, estimation
of distribution, etc.</p>
<p>This blog will try to focus on the <em>theory</em>, the <em>design</em>,
the <em>understanding</em>, the <em>application</em>, the
<em>implementation</em> and the <em>use</em> of metaheuristics. I hope this
blog will be useful to other people (researchers as well as users), and
will be a place to share thoughts.</p>
<p>Welcome aboard, and let's sleep with metaheuristics.</p>