Quantcast

[SciPy-User] How to optimize 2D data with leastsq in Python?

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[SciPy-User] How to optimize 2D data with leastsq in Python?

nxkryptor nxkr

I have data of the form given below:

X   Y1  Y2  Y3  Y4  Y5  Y6
1.42857 4.83    4.58    4.43    4.31    4.22    4.14
1.40845 3.87    3.63    3.49    3.38    3.3 3.23
1.38889 3.17    2.93    2.79    2.69    2.62    2.56
1.36986 2.65    2.41    2.27    2.17    2.11    2.05
1.35135 2.27    2.02    1.88    1.78    1.72    1.67
    :     :       :       :       :        :      :
1.01010 2.26    1.6 1.21    0.97    0.79    0.66
1.00000 2.01    1.44    1.1 0.88    0.73    0.62

I am trying to optimize a function of the form:

enter image description here enter image description here

Now my y variable has 6 columns, I can optimize one column (in MWE #3) which I have done in the MWE. How can I optimize the function to get the parameters A, n and B for all the 6 column values? In other words how can I get the value one value of A, n and B for all 6 values of y.

MWE

import numpy as np
from scipy.signal import argrelextrema
import scipy.interpolate
from scipy.optimize import leastsq


with open('./exp_fit.dat', "r") as data:
    while True:
        line = data.readline()
        if not line.startswith('#'):
            break
    data_header = [i for i in line.strip().split('\t') if i]
    _data_ = np.genfromtxt(data, names = data_header, dtype = None, delimiter = '\t')
_data_.dtype.names = [j.replace('_', ' ') for j in _data_.dtype.names]
data = np.array(_data_.tolist())

x = _data_['X']
x_interp = np.linspace(x[-1], x[0], 100)

y = np.zeros(shape = (len(x), 6))
y_interp = np.zeros(shape = (len(x_interp), 6))
for i in range(6):
    y[:, i] = data[:, i + 1]
    tck = scipy.interpolate.splrep(x[::-1], y[::-1, i], s=0)
    y_interp[::-1, i] = scipy.interpolate.splev(x_interp, tck, der = 0)


def residuals(p, y, x):
    A_lt, n_lt, B_lt, A_it, n_it, B_it, A_ht, n_ht, B_ht = p
    err = y - (1 / (1 / ((A_it * (15 ** (-n_it)) * np.exp(B_it * x / 1000)) + \
        (A_lt * (15 ** (-n_lt)) * np.exp(B_lt * x / 1000))) + \
        1 / (A_ht * (15 ** (-n_ht)) * np.exp(B_ht * x / 1000))))
    return err
p1 = [2.789E-05, 1.47, 9.368E+03, 2.789E-05, 1.47, 9.368E+03, 2.789E-05, 1.47, 9.368E+03]
plsq = leastsq(residuals, p1, args=(y_interp[:, 2], x_interp))

The question is also available on SO.

nxkr

_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/scipy-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: How to optimize 2D data with leastsq in Python?

Matt Newville
Nxkr,

On Sat, Jul 2, 2016 at 4:02 AM, nxkryptor nxkr <[hidden email]> wrote:

I have data of the form given below:

X   Y1  Y2  Y3  Y4  Y5  Y6
1.42857 4.83    4.58    4.43    4.31    4.22    4.14
1.40845 3.87    3.63    3.49    3.38    3.3 3.23
1.38889 3.17    2.93    2.79    2.69    2.62    2.56
1.36986 2.65    2.41    2.27    2.17    2.11    2.05
1.35135 2.27    2.02    1.88    1.78    1.72    1.67
    :     :       :       :       :        :      :
1.01010 2.26    1.6 1.21    0.97    0.79    0.66
1.00000 2.01    1.44    1.1 0.88    0.73    0.62

I am trying to optimize a function of the form:

enter image description here enter image description here

Now my y variable has 6 columns, I can optimize one column (in MWE #3) which I have done in the MWE. How can I optimize the function to get the parameters A, n and B for all the 6 column values? In other words how can I get the value one value of A, n and B for all 6 values of y.

MWE

import numpy as np
from scipy.signal import argrelextrema
import scipy.interpolate
from scipy.optimize import leastsq


with open('./exp_fit.dat', "r") as data:
    while True:
        line = data.readline()
        if not line.startswith('#'):
            break
    data_header = [i for i in line.strip().split('\t') if i]
    _data_ = np.genfromtxt(data, names = data_header, dtype = None, delimiter = '\t')
_data_.dtype.names = [j.replace('_', ' ') for j in _data_.dtype.names]
data = np.array(_data_.tolist())

x = _data_['X']
x_interp = np.linspace(x[-1], x[0], 100)

y = np.zeros(shape = (len(x), 6))
y_interp = np.zeros(shape = (len(x_interp), 6))
for i in range(6):
    y[:, i] = data[:, i + 1]
    tck = scipy.interpolate.splrep(x[::-1], y[::-1, i], s=0)
    y_interp[::-1, i] = scipy.interpolate.splev(x_interp, tck, der = 0)


def residuals(p, y, x):
    A_lt, n_lt, B_lt, A_it, n_it, B_it, A_ht, n_ht, B_ht = p
    err = y - (1 / (1 / ((A_it * (15 ** (-n_it)) * np.exp(B_it * x / 1000)) + \
        (A_lt * (15 ** (-n_lt)) * np.exp(B_lt * x / 1000))) + \
        1 / (A_ht * (15 ** (-n_ht)) * np.exp(B_ht * x / 1000))))
    return err
p1 = [2.789E-05, 1.47, 9.368E+03, 2.789E-05, 1.47, 9.368E+03, 2.789E-05, 1.47, 9.368E+03]
plsq = leastsq(residuals, p1, args=(y_interp[:, 2], x_interp))

The question is also available on SO.

nxkr

 
I strongly recommend writing a function for your power-law/exponential function, and calling that 3 times.   I assume your code does what you want, but it's far too messy to actually read.    Also, it seems somewhat odd that you are interpolating your data onto a finer x-grid (but backwards??) rather than simply fitting the data you actually have.   Maybe there's a good reason for that.  Finally, I believe you can read in your data much more simply with

     data = np.loadtxt('./exp_fit.dat', skiprows=3)

I'm afraid I do not actually understand what you are trying to do.  You ask   "How can I optimize the function to get the parameters A, n and B for all the 6 column values? In other words how can I get the value one value of A, n and B for all 6 values of y".  

Do you mean that you want to do 6 separate fits, one for each column? 
Or perhaps you mean that you want to fit 9 parameter values (3 each A, n, B for 3 of you exponential functions) to all of the data columns simultaneously?   Or perhaps you mean something else entirely (for example that you mean one value for A, but a different value for n and B for each column).    Sorry, but I can't tell.

Finally, just so you're aware, fitting multiple damped exponential functions is often very difficult. 

--Matt Newville


_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/scipy-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: How to optimize 2D data with leastsq in Python?

josef.pktd
It looks to me that if P is a constant, then A_i and n_i are not separately identified.

Josef

On Sun, Jul 3, 2016 at 7:10 PM, Matt Newville <[hidden email]> wrote:
Nxkr,

On Sat, Jul 2, 2016 at 4:02 AM, nxkryptor nxkr <[hidden email]> wrote:

I have data of the form given below:

X   Y1  Y2  Y3  Y4  Y5  Y6
1.42857 4.83    4.58    4.43    4.31    4.22    4.14
1.40845 3.87    3.63    3.49    3.38    3.3 3.23
1.38889 3.17    2.93    2.79    2.69    2.62    2.56
1.36986 2.65    2.41    2.27    2.17    2.11    2.05
1.35135 2.27    2.02    1.88    1.78    1.72    1.67
    :     :       :       :       :        :      :
1.01010 2.26    1.6 1.21    0.97    0.79    0.66
1.00000 2.01    1.44    1.1 0.88    0.73    0.62

I am trying to optimize a function of the form:

enter image description here enter image description here

Now my y variable has 6 columns, I can optimize one column (in MWE #3) which I have done in the MWE. How can I optimize the function to get the parameters A, n and B for all the 6 column values? In other words how can I get the value one value of A, n and B for all 6 values of y.

MWE

import numpy as np
from scipy.signal import argrelextrema
import scipy.interpolate
from scipy.optimize import leastsq


with open('./exp_fit.dat', "r") as data:
    while True:
        line = data.readline()
        if not line.startswith('#'):
            break
    data_header = [i for i in line.strip().split('\t') if i]
    _data_ = np.genfromtxt(data, names = data_header, dtype = None, delimiter = '\t')
_data_.dtype.names = [j.replace('_', ' ') for j in _data_.dtype.names]
data = np.array(_data_.tolist())

x = _data_['X']
x_interp = np.linspace(x[-1], x[0], 100)

y = np.zeros(shape = (len(x), 6))
y_interp = np.zeros(shape = (len(x_interp), 6))
for i in range(6):
    y[:, i] = data[:, i + 1]
    tck = scipy.interpolate.splrep(x[::-1], y[::-1, i], s=0)
    y_interp[::-1, i] = scipy.interpolate.splev(x_interp, tck, der = 0)


def residuals(p, y, x):
    A_lt, n_lt, B_lt, A_it, n_it, B_it, A_ht, n_ht, B_ht = p
    err = y - (1 / (1 / ((A_it * (15 ** (-n_it)) * np.exp(B_it * x / 1000)) + \
        (A_lt * (15 ** (-n_lt)) * np.exp(B_lt * x / 1000))) + \
        1 / (A_ht * (15 ** (-n_ht)) * np.exp(B_ht * x / 1000))))
    return err
p1 = [2.789E-05, 1.47, 9.368E+03, 2.789E-05, 1.47, 9.368E+03, 2.789E-05, 1.47, 9.368E+03]
plsq = leastsq(residuals, p1, args=(y_interp[:, 2], x_interp))

The question is also available on SO.

nxkr

 
I strongly recommend writing a function for your power-law/exponential function, and calling that 3 times.   I assume your code does what you want, but it's far too messy to actually read.    Also, it seems somewhat odd that you are interpolating your data onto a finer x-grid (but backwards??) rather than simply fitting the data you actually have.   Maybe there's a good reason for that.  Finally, I believe you can read in your data much more simply with

     data = np.loadtxt('./exp_fit.dat', skiprows=3)

I'm afraid I do not actually understand what you are trying to do.  You ask   "How can I optimize the function to get the parameters A, n and B for all the 6 column values? In other words how can I get the value one value of A, n and B for all 6 values of y".  

Do you mean that you want to do 6 separate fits, one for each column? 
Or perhaps you mean that you want to fit 9 parameter values (3 each A, n, B for 3 of you exponential functions) to all of the data columns simultaneously?   Or perhaps you mean something else entirely (for example that you mean one value for A, but a different value for n and B for each column).    Sorry, but I can't tell.

Finally, just so you're aware, fitting multiple damped exponential functions is often very difficult. 

--Matt Newville


_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/scipy-user



_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/scipy-user
Loading...