[SciPy-User] Normalization for optimization in python


Yuxiang Wang
Dear all,

During optimization, it is often helpful to normalize the input
parameters so that they are on the same order of magnitude, which can
greatly improve convergence. For example, if we want to minimize f(x)
and a reasonable initial guess is x0=[1e3, 1e-4], it may help to rescale
x0[0] and x0[1] to roughly the same order of magnitude (often O(1)).

My question is: I have been using scipy.optimize, specifically the
L-BFGS-B algorithm. Do I need to normalize the parameters manually by
writing a wrapper function, or does the algorithm already do it for me?

Thank you!

-Shawn
--
Yuxiang "Shawn" Wang
Gerling Research Lab
University of Virginia
[hidden email]
+1 (434) 284-0836
https://sites.google.com/a/virginia.edu/yw5aj/

Re: Normalization for optimization in python

Eric Hermes
Shawn,

I use scipy.optimize.fmin_l_bfgs_b extensively, and I always explicitly
normalize the input before passing it to the optimizer. I usually do it
by writing my function as g(x, norm) ==
f([x[0]*norm[0], x[1]*norm[1], ...]) and passing it to the optimizer as
fmin_l_bfgs_b(func=g, x0=[1., 1.], args=(norm,)). Note that you can
achieve rigorous convergence by multiplying norm by the result of the
optimization and iterating, but convergence behavior far from a minimum
may depend strongly both on your initial guess and on your initial
normalization factor.
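
A minimal sketch of that wrapper approach (the quadratic f below is just
a made-up stand-in for a real objective):

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    # Made-up objective with badly scaled parameters, for illustration.
    def f(x):
        return (x[0] - 1e3) ** 2 + (x[1] - 1e-4) ** 2

    # Wrapper working in normalized coordinates: x_physical = x_scaled * norm.
    def g(x_scaled, norm):
        return f(np.asarray(x_scaled) * norm)

    norm = np.array([1e3, 1e-4])   # rough magnitudes of the parameters
    x_scaled, fval, info = fmin_l_bfgs_b(
        func=g,
        x0=[1.0, 1.0],             # O(1) starting point in scaled space
        args=(norm,),              # args must be a tuple, hence the comma
        approx_grad=True,          # let the optimizer difference g numerically
    )
    x_physical = x_scaled * norm   # map the result back to physical units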

Eric


Re: Normalization for optimization in python

Yuxiang Wang
Hi Eric,

Great! Thanks a lot for the confirmation. That is really helpful.

-Shawn


Re: Normalization for optimization in python

cjaramillo
Hi, Eric.

I have a problem with my Jacobian matrices being non-homogeneous because they are derived from variables with different units, such as when combining rotation and translation parameters. Besides scaling the initial values by a factor, do you have an example of how to scale the corresponding Jacobian matrices?

Thanks in advance!

Carlos

Re: Normalization for optimization in python

Yuxiang Wang
Well, I guess I am not Eric, but I did start this whole conversation :)

Question: if your Jacobian is computed numerically (as done
automatically by the minimize() function), shouldn't it already be
normalized, given that we already normalized the independent variables
being passed in?
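
In code, I mean something like this (a sketch with a made-up f and norm,
as in the earlier example):

    import numpy as np
    from scipy.optimize import minimize

    # Made-up badly scaled objective, as in the earlier sketch.
    def f(x):
        return (x[0] - 1e3) ** 2 + (x[1] - 1e-4) ** 2

    norm = np.array([1e3, 1e-4])

    # minimize() differences the *wrapped* function, so the finite-difference
    # steps are taken in the well-scaled coordinates; by the chain rule the
    # estimated gradient is d(f)/d(x) * norm, i.e. already rescaled.
    res = minimize(lambda z: f(z * norm), x0=np.array([1.0, 1.0]),
                   method='L-BFGS-B')
    print(res.x * norm)   # solution mapped back to physical units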

Shawn


Re: Normalization for optimization in python

cjaramillo
I'm providing the Jacobian functions to be used in the optimization; it's faster that way. I found a couple of articles that normalize Jacobian matrices of non-homogeneous parameters. Any examples with scipy would be appreciated. I'm trying to grasp the normalization methods for the Jacobian, and they seem to amount to a simple diagonal matrix multiplication.
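
That diagonal multiplication would just be the chain rule. A sketch of
what it looks like with an analytic gradient (the objective, gradient,
and magnitudes here are all made up for illustration):

    import numpy as np
    from scipy.optimize import minimize

    # Made-up objective and its analytic gradient in physical units.
    def f(x):
        return (x[0] - 1e3) ** 2 + (x[1] - 1e-4) ** 2

    def grad_f(x):
        return np.array([2.0 * (x[0] - 1e3), 2.0 * (x[1] - 1e-4)])

    norm = np.array([1e3, 1e-4])   # characteristic magnitudes

    # Work in scaled coordinates: x = z * norm.
    def g(z):
        return f(z * norm)

    # Chain rule: d(g)/d(z) = grad_f(x) * norm, i.e. the gradient times
    # diag(norm); for a scalar objective this is an elementwise product.
    # For a vector-valued residual the same idea gives
    # J_scaled = J @ np.diag(norm).
    def grad_g(z):
        return grad_f(z * norm) * norm

    res = minimize(g, x0=np.array([1.0, 1.0]), jac=grad_g, method='L-BFGS-B')
    x_physical = res.x * norm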

Re: Normalization for optimization in python

Daπid

Scaling factors can be error-prone when you have to propagate them. Usually, the safest way is to blame it on the units: in your case, define the rotation angles in radians and choose a unit of length in which a typical displacement is 1 (assuming the rotations are large, i.e. around 1 rad).

This technique also works for other mathematical problems, such as differential equations or linear systems.
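
For instance, a sketch of that unit choice for a pose problem (all
numbers, and the characteristic length L, are made-up choices):

    import numpy as np

    # Hypothetical pose parameters: 3 rotation angles in radians and
    # 3 translations in millimetres, the latter around a few hundred mm.
    theta = np.array([0.5, -0.3, 0.1])       # already O(1) in radians
    t_mm = np.array([250.0, -420.0, 380.0])  # O(100) in mm

    # "Blame it on the units": measure translation in a characteristic
    # length (here 100 mm) so every parameter is O(1).
    L = 100.0
    x0 = np.concatenate([theta, t_mm / L])   # well-scaled starting point

    # After optimizing over x0, convert back: t_mm = x[3:] * L.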


/David.


Re: Normalization for optimization in python

cjaramillo
Thanks, David. However, you can still run into biased convergence even when all variables are in the same units, for example when one parameter varies over the range 0 to 10 mm and others vary over hundreds of mm. I'm searching the web for more on this, but I would appreciate it if you could refer me to a source where they employ unit-based scaling (normalization) of Jacobian matrices for use in optimization.

Re: Normalization for optimization in python

Daπid

On 27 April 2015 at 14:40, cjaramillo <[hidden email]> wrote:
> Thanks, David. However, you can still run into biased convergence even
> when all variables are in the same units, for example when one
> parameter varies over the range 0 to 10 mm and others vary over
> hundreds of mm.

That is still not a problem: 10 mm and 500 mm are roughly the same magnitude. The whole purpose of scaling is to minimise numerical inaccuracies when adding or subtracting numbers, but since the resolution of a double is about 1e-15, you need some nine orders of magnitude of difference before the accuracy degrades to that of a perfectly scaled float (about 1e-6) (waving my hands violently here).

Note that, due to the nature of optimisation problems, the numerical noise in most "simple" functions will be swamped by the procedure, unless you are after an extremely accurate result on a finely crafted function. You can check this in your own case by comparing runs with and without the scaling, without providing the gradient for simplicity.
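
Such a comparison could look like this (with a made-up objective; both
runs let the optimizer estimate the gradient numerically):

    import numpy as np
    from scipy.optimize import minimize

    # Made-up objective whose parameters live on very different scales.
    def f(x):
        return (x[0] / 10.0 - 0.5) ** 2 + (x[1] / 500.0 - 0.5) ** 2

    norm = np.array([10.0, 500.0])

    # Without scaling: optimize directly in physical units.
    raw = minimize(f, x0=np.array([1.0, 100.0]), method='L-BFGS-B')

    # With scaling: optimize in O(1) coordinates and map back.
    scaled = minimize(lambda z: f(z * norm), x0=np.array([0.1, 0.2]),
                      method='L-BFGS-B')

    # Compare solutions and the work done (raw.nit, raw.nfev vs.
    # scaled.nit, scaled.nfev).
    print(raw.x, scaled.x * norm)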

> I'm searching the web for more on this, but I would appreciate it if
> you could refer me to a source where they employ unit-based scaling
> (normalization) of Jacobian matrices for use in optimization.

This is usually called "natural units" or "nondimensionalisation". You will perhaps find more relevant hits in the context of differential equations. I don't know of any reference off the top of my head, but this should be covered in most numerical analysis texts.


/David.

