[SciPy-User] estimating cov_x matrix with leastsq, without doing a fit.


Andrew Nelson
Dear list,
I have a least squares data fitting system.  I use two different ways
of fitting the data; the first is with a differential evolution
algorithm, the second is with the scipy.optimize.leastsq function.

When I use the leastsq function I obtain the cov_x matrix, giving me
estimated parameter uncertainties.

However, with my home rolled DE algorithm I don't get a covariance
matrix and wish to estimate one.  However, I don't want to change the
fitted parameters, I just want the covariance matrix estimated.  Is
there any way of getting leastsq to give me the covariance matrix
solely based on the initial parameters?

If there isn't, can anyone suggest the best way for me to accomplish
this?  At the moment I am using numdifftools to calculate the Hessian
and inverting this, but this is not giving me the same numbers as I
get out of the leastsq function.

regards,
Andrew.

--
_____________________________________
Dr. Andrew Nelson


_____________________________________
_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

Re: estimating cov_x matrix with leastsq, without doing a fit.

josef.pktd
On Thu, Feb 27, 2014 at 2:03 AM, Andrew Nelson <[hidden email]> wrote:

> Dear list,
> I have a least squares data fitting system.  I use two different ways
> of fitting the data; the first is with a differential evolution
> algorithm, the second is with the scipy.optimize.leastsq function.
>
> When I use the leastsq function I obtain the cov_x matrix, giving me
> estimated parameter uncertainties.
>
> However, with my home rolled DE algorithm I don't get a covariance
> matrix and wish to estimate one.  However, I don't want to change the
> fitted parameters, I just want the covariance matrix estimated.  Is
> there any way of getting leastsq to give me the covariance matrix
> solely based on the initial parameters?

If your own fitted parameters also solve the leastsq problem, then
calling leastsq with your values as starting values should not change
the parameters.
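A minimal sketch of what that looks like (the residual function, data, and the pretend DE result below are made up for illustration; reuse whatever objective your DE solver minimized):

```python
import numpy as np
from scipy.optimize import leastsq

# Hypothetical residual function and synthetic data.
def residuals(p, x, y):
    return y - p[0] * np.exp(-p[1] * x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = 2.5 * np.exp(-1.3 * x) + 0.05 * rng.standard_normal(x.size)

p_de = np.array([2.5, 1.3])  # pretend this is the DE solver's solution

# Starting at the DE optimum, leastsq should converge almost immediately,
# leaving the parameters (nearly) unchanged while returning cov_x.
p_fit, cov_x, infodict, mesg, ier = leastsq(
    residuals, p_de, args=(x, y), full_output=True)

# cov_x is the raw inv(J^T J) factor; multiply by the residual variance
# to get the parameter covariance matrix.
dof = x.size - p_fit.size
s_sq = (residuals(p_fit, x, y) ** 2).sum() / dof
pcov = cov_x * s_sq
perr = np.sqrt(np.diag(pcov))  # 1-sigma parameter uncertainties
```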

>
> If there isn't, can anyone suggest the best way for me to accomplish
> this?  At the moment I am using numdifftools to calculate the Hessian
> and inverting this, but this is not giving me the same numbers as I
> get out of the leastsq function.

Depending on how "nice" your function is, the values can differ quite
a bit depending on the choice of the stepsize in the numerical
derivatives.

In general I would trust numdifftools more than the derivatives
returned by leastsq. You could also compare with the numerical
derivatives in statsmodels.
http://statsmodels.sourceforge.net/devel/tools.html#numerical-differentiation
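The step-size sensitivity is easy to demonstrate on a toy problem (nothing to do with leastsq specifically): with a forward difference, too large a step gives truncation error and too small a step gives round-off error.

```python
import numpy as np

# Forward-difference estimate of d/dx exp(x) at x = 0 (true value: 1.0)
f, x0, true = np.exp, 0.0, 1.0
errors = {}
for h in (1e-1, 1e-5, 1e-8, 1e-12):
    fd = (f(x0 + h) - f(x0)) / h
    errors[h] = abs(fd - true)

# Moderately small steps win; very small steps are dominated by round-off,
# so the error goes back *up* as h shrinks well past sqrt(machine epsilon).
```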

Josef



Re: estimating cov_x matrix with leastsq, without doing a fit.

Matt Newville
In reply to this post by Andrew Nelson
Andrew,

On Thu, Feb 27, 2014 at 1:03 AM, Andrew Nelson <[hidden email]> wrote:

> Dear list,
> I have a least squares data fitting system.  I use two different ways
> of fitting the data; the first is with a differential evolution
> algorithm, the second is with the scipy.optimize.leastsq function.
>
> When I use the leastsq function I obtain the cov_x matrix, giving me
> estimated parameter uncertainties.
>
> However, with my home rolled DE algorithm I don't get a covariance
> matrix and wish to estimate one.  However, I don't want to change the
> fitted parameters, I just want the covariance matrix estimated.  Is
> there any way of getting leastsq to give me the covariance matrix
> solely based on the initial parameters?

Does it work to run leastsq() starting with the solution from your DE
algorithm?   That is, if you've found a good solution, and start
leastsq() with that, wouldn't you be concerned if it finds a
significantly different solution?  Maybe leastsq() can be thought of
as refining the values found from the DE approach?  If you think (or
find) that leastsq() is running off to obviously wrong solutions,
tools like lmfit allow you to put bounds on parameter values with
leastsq().  Sometimes that complicates finding a good (non-singular)
covariance matrix, but it often works just fine.

I know a few people using lmfit to first do a fit with Powell's method
(often thought to be more stable than LM when far from solution, and
so safer in some automated/batch analysis), and then use that solution
with leastsq() to get a final solution with error bars and
correlations.  Lmfit tries to simplify that by allowing you to switch
fitting methods without changing your objective function.  We'd love
to add a DE algorithm.  If you know of a good one, or are willing to
share yours, I'd love to hear it.

> If there isn't, can anyone suggest the best way for me to accomplish
> this?  At the moment I am using numdifftools to calculate the Hessian
> and inverting this, but this is not giving me the same numbers as I
> get out of the leastsq function.

I don't think it would be too hard to lift the code that extracts and
uses the Jacobian to calculate the covariance matrix from leastsq()
(but be mindful of the pivoting).  It might be easier to start with
the MPFIT code, as it's pure Python.
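For reference, a sketch of what that extraction looks like. This mirrors (as I understand it) the covariance computation inside scipy.optimize.leastsq, rebuilding R from the pivoted QR factors it returns in infodict, so treat it as illustrative rather than a supported API:

```python
import numpy as np
from scipy.optimize import leastsq

def residuals(p, x, y):
    return y - (p[0] * x + p[1])

x = np.linspace(0.0, 1.0, 20)
y = 3.0 * x + 1.0

p_fit, cov_x, info, mesg, ier = leastsq(
    residuals, [1.0, 0.0], args=(x, y), full_output=True)

# Rebuild cov_x from the QR factors in infodict, including the column
# pivoting ('ipvt' is 1-based, in MINPACK style).
n = p_fit.size
perm = np.eye(n)[info['ipvt'] - 1]
r = np.triu(info['fjac'].T[:n, :])
R = r @ perm
cov_rebuilt = np.linalg.inv(R.T @ R)
```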

Hope that helps,

--Matt Newville

Re: estimating cov_x matrix with leastsq, without doing a fit.

Matt Newville
In reply to this post by josef.pktd
Hi Josef, Andrew,

On Thu, Feb 27, 2014 at 6:49 AM,  <[hidden email]> wrote:

> [earlier quoted text trimmed]
>
> Depending on how "nice" your function is, the values can differ quite
> a bit depending on the choice of the stepsize in the numerical
> derivatives.
>
> In general I would trust numdifftools more than the derivatives
> returned by leastsq. You could also compare with the numerical
> derivatives in statsmodels.
> http://statsmodels.sourceforge.net/devel/tools.html#numerical-differentiation

Do you think it would be worth adding an option to leastsq() to use
these methods to calculate the covariance after the best solution is
found?

I forgot to mention that lmfit also includes methods to do a
brute-force estimate of confidence intervals, which can be especially
useful when the confidence regions are far from elliptical.  That
might be a useful approach for a problem that needs a DE solver.
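To illustrate the brute-force idea (a plain profile scan, not lmfit's actual implementation; model, noise level, and grid are made up): fix the parameter of interest on a grid, evaluate chi-square at each point, and read off the interval where chi-square stays within 1 of its minimum.

```python
import numpy as np

# One-parameter toy model y = a*x with known noise sigma.
rng = np.random.default_rng(1)
sigma = 0.1
x = np.linspace(0.0, 1.0, 40)
y = 2.0 * x + sigma * rng.standard_normal(x.size)

def chi2(a):
    return (((y - a * x) / sigma) ** 2).sum()

a_grid = np.linspace(1.5, 2.5, 2001)
chi2_grid = np.array([chi2(a) for a in a_grid])
a_best = a_grid[chi2_grid.argmin()]

# 1-sigma interval: all grid points with chi2 within 1 of the minimum.
inside = a_grid[chi2_grid <= chi2_grid.min() + 1.0]
ci_low, ci_high = inside.min(), inside.max()
```

With more than one parameter, the other parameters would be re-optimized at each grid point (a profile likelihood) rather than held fixed.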

--Matt Newville

Re: estimating cov_x matrix with leastsq, without doing a fit.

josef.pktd
On Thu, Feb 27, 2014 at 11:08 AM, Matt Newville
<[hidden email]> wrote:

> Hi Josef, Andrew,
>
> On Thu, Feb 27, 2014 at 6:49 AM,  <[hidden email]> wrote:
>> [earlier quoted text trimmed]
>
> Do you think it would be worth adding an option to leastsq() to use
> these methods to calculate the covariance after the best solution is
> found?

SciPy has an open issue for adding more numerical-derivative facilities.

I only remember some spot checking where the cov_x of leastsq wasn't very good.
(I'm not using leastsq these days, and my last comment on scaling in
leastsq was wrong.)

IIRC, in statsmodels we never rely on the Hessian approximation that
the optimizer uses. And I have also seen cases that are "not nice",
where our own numerical derivatives can be very sensitive.

Serious problems like not being positive definite show up less often
at the minimum than during the optimization (with the non-leastsq
optimizers).
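A quick way to check for that failure mode is to attempt a Cholesky factorization of the Hessian (a small standalone helper, not anything from statsmodels):

```python
import numpy as np

def is_positive_definite(H):
    """True if the symmetric matrix H is positive definite."""
    try:
        np.linalg.cholesky(H)
        return True
    except np.linalg.LinAlgError:
        return False

# A valid Hessian/covariance candidate passes; an indefinite one fails.
good = np.array([[2.0, 0.5], [0.5, 1.0]])
bad = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues 3 and -1
```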

Josef


Re: estimating cov_x matrix with leastsq, without doing a fit.

federico vaggi-2
In reply to this post by josef.pktd

To expand on what Josef was saying:

You can take the parameter values you obtained at your solution and calculate the Jacobian there using finite differences (I didn't know statsmodels had those utility functions; I'll definitely use them in my own code). Near the minimum the Hessian is well approximated by J.T @ J (for a Jacobian with one row per residual), so you can invert that to get the covariance matrix.

The only caveat is that this approach is only really valid if your solution is already near a minimum; if it isn't, the approximation to the Hessian may be way off.
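That recipe, sketched end to end with a plain central-difference Jacobian (the model, data, and assumed-known solution below are made up for illustration):

```python
import numpy as np

def jacobian_fd(resid, p, eps=1e-6):
    """Central-difference Jacobian of a residual function at p (m x n)."""
    p = np.asarray(p, dtype=float)
    r0 = np.asarray(resid(p))
    J = np.empty((r0.size, p.size))
    for j in range(p.size):
        h = eps * max(1.0, abs(p[j]))
        dp = np.zeros_like(p)
        dp[j] = h
        J[:, j] = (resid(p + dp) - resid(p - dp)) / (2.0 * h)
    return J

# Toy linear model: residuals y - (a*x + b).
x = np.linspace(0.0, 1.0, 30)
y = 3.0 * x + 1.0

def resid(p):
    return y - (p[0] * x + p[1])

p_hat = np.array([3.0, 1.0])  # pretend this came from the DE solver

# Gauss-Newton approximation near the minimum: H ~= J^T J, and
# cov ~= s^2 * inv(J^T J), with s^2 the residual variance.
J = jacobian_fd(resid, p_hat)
H = J.T @ J
cov = np.linalg.inv(H)  # multiply by s^2 for actual uncertainties
```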
