Optimisation of a rough function


David Baddeley
Hi all,

There are enough optimisation gurus here that hopefully someone might have some
ideas. I'm trying to optimise a goal function that has a well-defined minimum,
but also a certain amount of roughness (see the attached figure for an example).
Most of the time (80%) optimize.fmin seems to work, but it sometimes gets stuck
in the roughness. optimize.anneal works, but needs many more function
evaluations, making it a bit impractical (the goal is to be semi-interactive,
and the fmin implementation already struggles). The goal function is quite
complex and already written in C. I'm really after some form of solver which
will skip over all the little bumps, but still benefit from using the gradient
information. I should probably also mention that it will usually have ~10
parameters, rather than the 2-parameter case shown, but the same features
should be present. I'd welcome any ideas.

thanks,
David
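
For reference, a minimal sketch of the situation described above, with a
hypothetical toy goal function (a smooth bowl plus small sinusoidal ripples)
standing in for the real C objective:

import numpy as np
from scipy import optimize

def rough_bowl(p):
    # hypothetical stand-in for the real goal function: a smooth
    # quadratic bowl with small-scale ripples superimposed
    p = np.asarray(p, dtype=float)
    return (p ** 2).sum() + 0.2 * np.sin(20.0 * p).sum()

x0 = np.array([1.5, -2.0])
# local simplex search: cheap, but can stall on the ripples
xmin = optimize.fmin(rough_bowl, x0, disp=False)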




[Attachment: optimisation_surface.png (145K)]
Re: Optimisation of a rough function

Gael Varoquaux
On Fri, Apr 01, 2011 at 11:04:15PM -0700, David Baddeley wrote:

> [original message snipped]

So your function is rough, but bell-shaped at a large scale?

I don't have a silver bullet here (I'd love one), but I'd suggest looking
at a simplex algorithm, such as scipy.optimize.fmin, which uses Nelder-Mead.
Simplex optimization does not rely on gradients and is thus more robust to
such problems.

However, this will lead to other problems: deviations from the bell shape
can still mislead the Nelder-Mead, which may converge to the wrong position.
I think Matthieu Brucher suggested that a clever scheme alternating
Nelder-Mead with some kind of grid search in the simplex defined by the
algorithm would be a good option. If someone has a reference for such an
approach, I'd be very interested.

Gael
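
One possible reading of that alternating scheme, as a rough sketch only
(refine_with_grid and its parameters are made up for illustration; a full
grid grows exponentially with dimension, so for ~10 parameters the grid
stage would need to be replaced by random sampling in the box):

import numpy as np
from scipy import optimize

def refine_with_grid(f, x0, width=0.5, n=5, rounds=4):
    x = np.asarray(x0, dtype=float)
    for _ in range(rounds):
        # local polish with Nelder-Mead
        x = optimize.fmin(f, x, disp=False)
        # coarse axis-aligned grid search in a box around the result
        axes = [np.linspace(xi - width, xi + width, n) for xi in x]
        grid = np.array(np.meshgrid(*axes)).reshape(len(x), -1).T
        best = min(grid, key=f)
        if f(best) < f(x):
            x = best      # the grid hopped over a bump: restart there
        else:
            width *= 0.5  # no improvement: tighten the search box
    return x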
Re: Optimisation of a rough function

Sebastian Haase
On Sat, Apr 2, 2011 at 11:41 AM, Gael Varoquaux <[hidden email]> wrote:

> [quoted messages snipped]

How do you get your initial parameters? How about taking perpendicular
projections of the goal surface onto each parameter axis and doing a
Gaussian fit on each of those? The fits could simply use the global
extremum as initial values. For the example you show, that should work
really well. (I'm not sure whether sum, max, or perhaps min projections
would be best.)

- Sebastian
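
A sketch of one way to read this for the 2-parameter case, assuming the goal
surface has been coarsely sampled as Z[i, j] = f(xs[i], ys[j]); inv_gauss and
projection_guess are hypothetical helpers, and each fit is seeded from the
global extremum as suggested:

import numpy as np
from scipy.optimize import curve_fit

def inv_gauss(x, a, b, mu, sigma):
    # upside-down Gaussian: a baseline minus a Gaussian dip
    return a - b * np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))

def projection_guess(xs, ys, Z):
    guess = []
    # min-projection of the surface onto each parameter axis
    for coords, proj in ((xs, Z.min(axis=1)), (ys, Z.min(axis=0))):
        k = proj.argmin()  # global extremum seeds the fit
        p0 = (proj.max(), np.ptp(proj), coords[k], np.ptp(coords) / 4.0)
        popt, _ = curve_fit(inv_gauss, coords, proj, p0=p0)
        guess.append(popt[2])  # fitted centre mu
    return np.array(guess)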
Re: Optimisation of a rough function

Paul Anton Letnes

On 2 Apr 2011, at 08:04, David Baddeley wrote:

> [original message snipped]

I am no guru, but I have done some optimization work.

1) Genetic algorithms are good at not getting stuck in local minima.
2) The same can be said for the simulated annealing family of algorithms (simulated tempering et al.), at least if you get your parameters (temperature schedule etc.) right.
3) A simpler way is what some people call "bad derivatives". Simply put, smooth your objective function, or take long steps in the beginning; as you get closer to convergence, reduce the smoothing or the step length. I do not remember any references off-hand, but perhaps a sketch of the algorithm can be found in Numerical Recipes? (Disclaimer: I have never used this particular method myself.)
4) Simplest yet: since computation is quite cheap for such a simple (correct me if I am wrong here) objective function, simply perform N optimization runs with randomized starting points and pick the one that converges to the smallest (or largest) value; see the sketch below.

Good luck,
Paul.
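
A minimal sketch of option 4, assuming box bounds on the parameters are
known (multistart_fmin is an illustrative helper, not an existing SciPy
function):

import numpy as np
from scipy import optimize

def multistart_fmin(f, lower, upper, n_starts=20, seed=0):
    rng = np.random.RandomState(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    best_x, best_f = None, np.inf
    for _ in range(n_starts):
        # random starting point inside the box
        x0 = lower + rng.rand(lower.size) * (upper - lower)
        x = optimize.fmin(f, x0, disp=False)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f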

Re: Optimisation of a rough function

Matthieu Brucher
This does indeed look like a global optimization problem. A local optimization method (like Nelder-Mead or conjugate gradient) is only successful if the function is convex, or if you start near the global optimum in a region where the function is locally convex.

First, if you have access to the gradient, you should use it, provided its computation is not too slow.

Then, if your function is not convex (which seems to be the case), you may start with simulated annealing or a genetic algorithm to get inside a locally convex region.
The next step would be to retain several sets of parameters and try a local optimization method on each set; see the sketch below. I have several gradient-based optimizers available in scikits.optimization (there are some tutorials on my blog: http://matt.eifelle.com/category/python/generic-optimizers/) that you may try, but you can also use SciPy's gradient-based routines (e.g. fmin_bfgs) by supplying the gradient.

Matthieu
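
A rough sketch of that two-stage scheme, with cheap random sampling standing
in for the annealing/genetic stage and fmin_bfgs refining the few best
candidates using the supplied gradient (sample_then_refine and its defaults
are illustrative only):

import numpy as np
from scipy import optimize

def sample_then_refine(f, fprime, lower, upper,
                       n_samples=500, n_keep=5, seed=0):
    rng = np.random.RandomState(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # stage 1: coarse random sampling to locate promising regions
    X = lower + rng.rand(n_samples, lower.size) * (upper - lower)
    scores = np.array([f(x) for x in X])
    # retain several candidate parameter sets, not just the best one
    candidates = X[scores.argsort()[:n_keep]]
    # stage 2: gradient-based local refinement of each candidate
    results = [optimize.fmin_bfgs(f, x, fprime=fprime, disp=False)
               for x in candidates]
    return min(results, key=f)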

2011/4/2 Paul Anton Letnes <[hidden email]>

> [quoted message snipped]



--
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher


Re: Optimisation of a rough function

dpo
> On Fri, Apr 01, 2011 at 11:04:15PM -0700, David Baddeley wrote:
>> [original message snipped]

You have a noisy optimization problem. I'm not sure it will solve your
problem, but you may want to look into implicit filtering:
http://www4.ncsu.edu/~ctk/imfil.html

(I don't know whether there's a Python version of it.)

The method is described in Tim Kelley's book:

http://www.siam.org/books/kelley/fr18/index.php

--
Dominique
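
For readers without MATLAB, a heavily simplified sketch of the core idea of
implicit filtering: finite-difference gradient steps whose difference
increment h starts coarse, so early iterations only "see" the large-scale
shape, and is halved whenever the current stencil fails to find a decrease.
This is a toy version, not Kelley's full algorithm:

import numpy as np

def implicit_filtering(f, x0, h0=0.5, h_min=1e-3, max_iter=200):
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    h = h0
    for _ in range(max_iter):
        # central-difference gradient at scale h
        g = np.zeros_like(x)
        for i in range(x.size):
            e = np.zeros_like(x)
            e[i] = h
            g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
        # backtracking line search along -g
        step, improved = 1.0, False
        for _ in range(20):
            xn = x - step * g
            fn = f(xn)
            if fn < fx:
                x, fx, improved = xn, fn, True
                break
            step *= 0.5
        if not improved:
            h *= 0.5      # stencil failure: refine the scale
            if h < h_min:
                break
    return x, fx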
Re: Optimisation of a rough function

David Baddeley
Thanks everyone! Implicit filtering definitely sounds like it might be a step in
the right direction.

To address some of the previous comments: the goal function is normally quite
costly, making brute-force methods somewhat impractical. To give you a bit
more insight, I'm transforming a set of (x, y) points, computing a
triangulation of the transformed point set, and calculating the product of
all the edge lengths in that triangulation. I'm trying to find parameters
for the transformation which minimise this product. The goal surface I
plotted was for a simplified case: 100 points and a simple linear
transformation (i.e. 2 parameters). In normal applications I'll be looking
at ~100k points and a non-linear transformation (~10 parameters).

cheers,
David
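
For concreteness, a sketch of an objective along those lines using
scipy.spatial.Delaunay, written for the 2-parameter toy case with a guessed
scale-only transform (the real transform is not shown in the thread); the
sum of log edge lengths is minimised instead of the raw product to avoid
overflow:

import numpy as np
from scipy.spatial import Delaunay

def edge_length_objective(params, points):
    # hypothetical linear transform for the 2-parameter toy case:
    # independent scaling of x and y
    sx, sy = params
    tp = points * np.array([sx, sy])
    tri = Delaunay(tp)
    # collect the unique edges of the triangulation
    edges = set()
    for simplex in tri.simplices:
        for i in range(3):
            a, b = simplex[i], simplex[(i + 1) % 3]
            edges.add((min(a, b), max(a, b)))
    idx = np.array(sorted(edges))
    d = tp[idx[:, 0]] - tp[idx[:, 1]]
    lengths = np.sqrt((d ** 2).sum(axis=1))
    # log-sum is a monotone transform of the edge-length product
    return np.log(lengths).sum()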


----- Original Message ----
From: Dominique Orban <[hidden email]>
> [quoted message snipped]
Re: Optimisation of a rough function

Charles R Harris


On Sat, Apr 2, 2011 at 3:25 PM, David Baddeley <[hidden email]> wrote:
[quoted message snipped]


Map all the points to zero. Problem solved ;)

Chuck

