I would like to calculate the Jacobian for a least squares problem, followed by a Hessian estimation, then the covariance matrix from that Hessian.
With my current approach I sometimes experience issues with the covariance matrix, in that it is sometimes not positive semi-definite. I am using the covariance matrix to seed an MCMC sampling process by supplying it to `np.random.multivariate_normal` to get initial positions for the MC chain. I am using the following code:

```
from scipy.optimize._numdiff import approx_derivative
jac = approx_derivative(residuals_func, x0)
hess = np.matmul(jac.T, jac)
covar = np.linalg.inv(hess)
```

Note that x0 may not be at a minimum.

- would this be the usual way of estimating the Hessian, is there anything incorrect with the approach?
- what is the recommended way (i.e. numerically stable) of inverting the Hessian in such a situation?
- does `optimize.leastsq` do anything different?
- if `x0` is not at a minimum should the covariance matrix be expected to be positive semi-definite anyway?

_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user
> Am 27.03.2018 um 01:57 schrieb Andrew Nelson <[hidden email]>:
>
> - would this be the usual way of estimating the Hessian, is there anything incorrect with the approach?

Your straightforward approach is OK, especially since you don't require the highest precision. An alternative would be to use automatic differentiation to calculate the derivatives accurately, e.g. using algopy, Theano, or TensorFlow.

> - what is the recommended way (i.e. numerically stable) of inverting the Hessian in such a situation?

If your hess matrix is close to being singular, you can gain some precision by using the QR decomposition of the Jacobian. In general, when solving a linear system it is recommended to avoid explicitly calculating the inverse matrix.

> - does `optimize.leastsq` do anything different?

leastsq wraps the MINPACK library, which brings its own carefully tuned numeric differentiation routines, and it uses QR decomposition.

> - if `x0` is not at a minimum should the covariance matrix be expected to be positive semi-definite anyway?

If x0 is not a minimum, then there is no guarantee. Even if x0 is a minimum, this might be violated due to numerical errors.

best
Gregor
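For illustration, the QR route Gregor describes might look like the sketch below. The random `jac` is a stand-in for the output of `approx_derivative`; the point is that with J = QR, one gets (JᵀJ)⁻¹ = R⁻¹R⁻ᵀ without ever forming JᵀJ, which would square the condition number.

```python
import numpy as np

rng = np.random.default_rng(0)
jac = rng.standard_normal((20, 3))  # stand-in for approx_derivative output

# Reduced QR of the Jacobian: J = Q R with R square upper-triangular
q, r = np.linalg.qr(jac)

# Solve rather than invert: R^-1 via a triangular solve against the identity
r_inv = np.linalg.solve(r, np.eye(r.shape[0]))

# (J^T J)^-1 = (R^T R)^-1 = R^-1 R^-T
covar = r_inv @ r_inv.T

# Compare against the naive formula from the original post
covar_naive = np.linalg.inv(jac.T @ jac)
print(np.allclose(covar, covar_naive))
```

For a well-conditioned problem the two agree; the benefit of the QR form shows up as J becomes ill-conditioned.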
Hi,

Additionally to what Gregor said:

- finite-differences estimation of the derivatives should really be a last resort; best is paper-and-pencil, or algorithmic differentiation (algopy et al). If that is not possible, I'd try some higher-order finite differences, e.g. approx_derivative with method '3-point' or 'cs' (if that works).

- approx_derivative is actually more sophisticated than minpack. IIUC minpack only does the simplest two-point forward scheme: https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack/fdjac2.f

- linalg.inv(matrix) is generally better spelled as solve(matrix, identity_matrix)

- in this case, it's indeed best to use QR or SVD. curve_fit does a pseudoinverse: https://github.com/scipy/scipy/blob/v1.0.0/scipy/optimize/minpack.py#L502-L790 (IIRC this was written by Nikolay, and he cited Ceres or some other industry-class optimization software).

Cheers,

Evgeni
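A sketch of the SVD/pseudo-inverse route described above, in the same spirit as what `curve_fit` does for `pcov` (this is an illustration, not `curve_fit`'s exact code); `jac` is again a random stand-in, and the singular-value cutoff is the usual machine-epsilon-based heuristic:

```python
import numpy as np

rng = np.random.default_rng(1)
jac = rng.standard_normal((20, 3))

# SVD of the Jacobian; (J^T J)^-1 = V diag(1/s^2) V^T
_, s, vt = np.linalg.svd(jac, full_matrices=False)

# Drop tiny singular values instead of dividing by them
threshold = np.finfo(float).eps * max(jac.shape) * s[0]
keep = s > threshold
s, vt = s[keep], vt[keep]

pcov = vt.T @ np.diag(1.0 / s**2) @ vt

# For a well-conditioned J this matches the naive inverse
print(np.allclose(pcov, np.linalg.inv(jac.T @ jac)))
```

When J is rank-deficient the truncation silently zeroes out the unidentified directions rather than producing a wildly noisy inverse.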
Hi Andrew,

The covariance can be calculated from the Jacobian as C = J S Jᵀ, where S is the covariance of the inputs, C is the covariance of the outputs, and J is the Jacobian with entries J_ij = ∂x_i/∂y_j.

Sorry if this is redundant or unhelpful.

Best,
Mark
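In code, the propagation formula C = J S Jᵀ is a one-liner; the Jacobian and input covariance below are made-up illustrative values:

```python
import numpy as np

rng = np.random.default_rng(2)
jac = rng.standard_normal((4, 3))    # J_ij = dx_i / dy_j
s_in = np.diag([0.1, 0.2, 0.3])      # covariance S of the inputs y

# Linear error propagation: C = J S J^T
c_out = jac @ s_in @ jac.T

# By construction the result is symmetric positive semi-definite
print(np.allclose(c_out, c_out.T))
```

Note that C is guaranteed PSD only if S is; this does not by itself fix an indefinite JᵀJ estimate.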
On Wed, Mar 28, 2018 at 2:37 AM, Evgeni Burovski <[hidden email]> wrote:
> - in this case, it's indeed best to use QR or SVD. curve_fit does a pseudoinverse.

Depending on the use of the Hessian, I would regularize it if it is not positive definite or positive semi-definite. Simplest is to clip the singular values to a threshold at or above zero, or to add a ridge factor. (I added a ridge factor to the home-made Newton method in statsmodels to avoid at least some invertibility or singularity problems.) statsmodels also has a function to find the nearest positive (semi-)definite matrix.

(We don't regularize the final Hessian used for the covariance of the parameter estimates in nonlinear maximum likelihood models, because if that is not positive definite, then there are more serious problems with the model or the data.)

Josef
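A minimal sketch of the clip-plus-ridge regularization Josef describes (not the statsmodels implementation; the ridge size is an arbitrary choice, and for a symmetric matrix clipping eigenvalues is equivalent to clipping singular values):

```python
import numpy as np

def regularize_hessian(hess, ridge=1e-10):
    """Clip negative eigenvalues to zero and add a small ridge so the
    matrix becomes safely positive definite. A sketch, not the
    statsmodels routine."""
    # Symmetrize first to remove numerical asymmetry
    hess = 0.5 * (hess + hess.T)
    w, v = np.linalg.eigh(hess)
    w = np.clip(w, 0.0, None)  # drop the negative eigenvalues
    return v @ np.diag(w) @ v.T + ridge * np.eye(hess.shape[0])

# An indefinite "Hessian" with one negative eigenvalue
h = np.array([[2.0, 0.0],
              [0.0, -1.0]])
h_reg = regularize_hessian(h)
print(np.all(np.linalg.eigvalsh(h_reg) > 0))
```

The ridge keeps the clipped directions invertible, at the cost of injecting a (large) artificial variance along them, which is why one would not want to do this to a final reported covariance.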
On Mon, Mar 26, 2018 at 7:57 PM, Andrew Nelson <[hidden email]> wrote:
> With my current approach I sometimes experience issues with the covariance matrix in that it's sometimes not positive semi-definite. I am using the covariance matrix to seed an MCMC sampling process by supplying it to `np.random.multivariate_normal` to get initial positions for the MC chain.

I never looked much at the details of MCMC. But if your data or starting point doesn't provide good information about the Hessian, then, I think, you could shrink the Hessian towards, or combine it with, the prior covariance matrix, e.g. use a weighted average.

Josef
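The shrinkage idea could be sketched like this; the 50/50 weight and the identity prior are arbitrary illustrative choices:

```python
import numpy as np

def shrink_covariance(cov_data, cov_prior, weight=0.5):
    """Weighted average of a possibly ill-conditioned data covariance
    and a well-conditioned prior covariance. The weight is a tuning
    choice, not a prescribed value."""
    return weight * cov_data + (1.0 - weight) * cov_prior

cov_data = np.array([[1.0, 0.999],
                     [0.999, 1.0]])   # nearly singular
cov_prior = np.eye(2)                 # stand-in prior covariance
cov = shrink_covariance(cov_data, cov_prior)

# The blend is much better conditioned than the raw data covariance
print(np.linalg.cond(cov) < np.linalg.cond(cov_data))
```

Any convex combination of a PSD matrix and a positive definite prior is positive definite, so the blend is always a valid input for `np.random.multivariate_normal`.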
I'm using the Hessian to calculate the covariance matrix for parameter estimates in least squares, i.e. the equivalent of `pcov` in `curve_fit` (I don't want to do a fit, I just want the covariance around the current location).
--
Dr. Andrew Nelson
On Wed, Mar 28, 2018 at 7:33 PM, Andrew Nelson <[hidden email]> wrote:
> I'm using the Hessian to calculate the covariance matrix for parameter estimates in least squares, i.e. the equivalent of `pcov` in `curve_fit` (I don't want to do a fit, I just want the covariance around the current location).

But if you are just using it to initialize a Bayesian estimator, then you don't need the covariance from the data by itself.

What I guess: you are using the cross-product of the Jacobian (outer product of the gradient). In exact arithmetic this is always positive semi-definite. If it is not positive definite, then locally at least one of the parameters is not identified, i.e. your X, in analogy to linear regression, is singular. You can fiddle with numerical precision to maybe get it noisily PSD, but then you just have an almost-singular Jacobian with a very noisy inverse.

pinv(jac)

Josef
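The `pinv(jac)` suggestion rests on the identity (JᵀJ)⁻¹ = J⁺(J⁺)ᵀ for a full-rank J, where J⁺ = (JᵀJ)⁻¹Jᵀ is the Moore-Penrose pseudo-inverse; a quick check with a random stand-in Jacobian:

```python
import numpy as np

rng = np.random.default_rng(3)
jac = rng.standard_normal((20, 3))

# pinv(J) @ pinv(J).T equals inv(J.T @ J) for full-rank J, but the
# pseudo-inverse degrades gracefully when J is (near-)rank-deficient.
jp = np.linalg.pinv(jac)
pcov = jp @ jp.T

print(np.allclose(pcov, np.linalg.inv(jac.T @ jac)))
```

For a rank-deficient J the explicit inverse fails outright, while `pinv` returns zero variance along the unidentified directions, which at least signals the identification problem instead of amplifying noise.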
> You are using the cross-product of the Jacobian (outer product of the gradient).

This is correct.