[SciPy-User] Molecular Index Calculation

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[SciPy-User] Molecular Index Calculation

Stephen P. Molnar
I am a newcomer to Python and am attempting to translate a FORTRAN
program that I wrote about 20 years ago into Python, specifically v-3.6.

Let me say that I am not asking someone to write the program for me, but
only to point me is the right direction.

So far, I have bumbled my way to the point that I can get all of the
input data resulting from a quantum mechanical calculation of a very
simple organic molecule in to a Python program, but am encountering
problems with processing the data.

The equation that I a want to evaluate (the one that I programmed in
FORTRAN) is equation (7) in the attached scan of one of the pages of the
following document.

Stephen P. Molnar and James W. King, Theory and Applications of the
Integrated Molecular Transform and the Normalized Molecular Moment
Structure Descriptors: QSAR and QSPR Paradigms, Int. J Quantum Chem.,
85, 662 (2001).

the equation is

I(s) = sum from i=2toN sum from j=1toi-1 Z_sub_i*Z_sub_j *

    sin  s*r_sub_ij
   -----------------  Equation (7)
        s*r_sub_ij

What I have managed to do so far is for a simple organic molecule
containing 5 atoms (CHFClBr):

import numpy as np
import numpy, pandas, scipy.spatial.distance as dist

r=[]        #distances between pairs of atoms
z = []      #atomic numbers of atoms
Z_dist = [] #products of successive atomic numbers
I = []      #vector calculated in equation  (7)

start=1
finish=31
points=300
s = np.linspace(start,finish,points)


name = input("Enter Molecule ID: ")
name = str(name)
name_in = name+'.dat'

df = pandas.read_table(name_in, skiprows=2, sep=" ", skipinitialspace=True)
z = numpy.array(df["ZA"])
N =  numpy.ma.size(z)  ## of atoms in molecule
a = numpy.array(df[["X", "Y", "Z"]])
dist.squareform(dist.pdist(a, "euclidean"))
anrows, ancols = np.shape(a)
a_new = a.reshape(anrows, 1, ancols)
diff = a_new - a

r = (diff ** 2).sum(2)
r = np.sqrt(r)

for j in range(0,N-1):
      for k in range(j+1,N):
          Z_diff = (z[j]*z[k])


For the test molecule I get the following:

MASS =  [ 12.011   1.008  79.9    35.453  18.998]  (Atomic weights of atoms)

r:  [ 0.          2.059801    3.60937686  3.32591826  2.81569212]
      [ 2.059801    0.          4.71452879  4.45776445  4.00467382]
      [ 3.60937686  4.71452879  0.          5.66500917  5.26602175]
      [ 3.32591826  4.45776445  5.66500917  0.          5.02324896]
      [ 2.81569212  4.00467382  5.26602175  5.02324896  0.        ]

z:

6.0
210.0
102.0
54.0
35.0
17.0
9.0
595.0
315.0
153.0

I have checked these calculations with a spreadsheet calculation,
consequently I'm correct as far as I have gotten.

However, my problem is calculating the value of I for the 300
x-coordinate values required for the molecular index calculation.

The problem that I am currently having is calculating the product of the
pre-sine function and the sine term for the 300 values of s.

Unfortunately, the wealth of  function in Python makes it difficult for
me to ascertain just how to proceed.  It seems that I have the problem
of seeing the trees because of the size of the forest.  At this point
Goggling only makes my confusion deeper.  Any pointers in the correct
direction will be greatly appreciated.

Thanks in advance.

--
Stephen P. Molnar, Ph.D. Life is a fuzzy set
www.molecular-modeling.net Stochastic and multivariate
(614)312-7528 (c)
Skype: smolnar1


_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user

I_Formula.pdf (125K) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Molecular Index Calculation

Hjalmar Turesson
Hi Stephen,


I see that N = 10 (ie. the length of Z), and that r is indexed by i and j, which run up to N. However, r is only a 5x5 array. Is this correct?

Otherwise, something like this might work:

import numpy as np

start=1
finish=31
points=300
s = np.linspace(start, finish, points)

MASS =  np.array([12.011, 1.008, 79.9, 35.453, 18.998])

r = np.array([[0., 2.059801, 3.60937686, 3.32591826, 2.81569212],
              [2.059801, 0., 4.71452879, 4.45776445, 4.00467382],
              [3.60937686, 4.71452879, 0., 5.66500917, 5.26602175],
              [3.32591826, 4.45776445, 5.66500917, 0., 5.02324896],
              [2.81569212, 4.00467382, 5.26602175, 5.02324896, 0.]])

r[r == 0.] = 0.0001

Z = np.array([6.0, 210.0, 102.0, 54.0, 35.0, 17.0, 9.0, 595.0, 315.0, 153.0])

N = Z.size

I = np.zeros(s.shape)

for i in range(1, N):
    for j in range(j):
        I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])


Do you have an example of the correct output?

Best,
Hjalmar

On Sat, Apr 15, 2017 at 6:29 AM, Stephen P. Molnar <[hidden email]> wrote:
I am a newcomer to Python and am attempting to translate a FORTRAN program that I wrote about 20 years ago into Python, specifically v-3.6.

Let me say that I am not asking someone to write the program for me, but only to point me is the right direction.

So far, I have bumbled my way to the point that I can get all of the input data resulting from a quantum mechanical calculation of a very simple organic molecule in to a Python program, but am encountering problems with processing the data.

The equation that I a want to evaluate (the one that I programmed in FORTRAN) is equation (7) in the attached scan of one of the pages of the following document.

Stephen P. Molnar and James W. King, Theory and Applications of the Integrated Molecular Transform and the Normalized Molecular Moment Structure Descriptors: QSAR and QSPR Paradigms, Int. J Quantum Chem., 85, 662 (2001).

the equation is

I(s) = sum from i=2toN sum from j=1toi-1 Z_sub_i*Z_sub_j *

   sin  s*r_sub_ij
  -----------------  Equation (7)
       s*r_sub_ij

What I have managed to do so far is for a simple organic molecule containing 5 atoms (CHFClBr):

import numpy as np
import numpy, pandas, scipy.spatial.distance as dist

r=[]        #distances between pairs of atoms
z = []      #atomic numbers of atoms
Z_dist = [] #products of successive atomic numbers
I = []      #vector calculated in equation  (7)

start=1
finish=31
points=300
s = np.linspace(start,finish,points)


name = input("Enter Molecule ID: ")
name = str(name)
name_in = name+'.dat'

df = pandas.read_table(name_in, skiprows=2, sep=" ", skipinitialspace=True)
z = numpy.array(df["ZA"])
N =  numpy.ma.size(z)  ## of atoms in molecule
a = numpy.array(df[["X", "Y", "Z"]])
dist.squareform(dist.pdist(a, "euclidean"))
anrows, ancols = np.shape(a)
a_new = a.reshape(anrows, 1, ancols)
diff = a_new - a

r = (diff ** 2).sum(2)
r = np.sqrt(r)

for j in range(0,N-1):
     for k in range(j+1,N):
         Z_diff = (z[j]*z[k])


For the test molecule I get the following:

MASS =  [ 12.011   1.008  79.9    35.453  18.998]  (Atomic weights of atoms)

r:  [ 0.          2.059801    3.60937686  3.32591826  2.81569212]
     [ 2.059801    0.          4.71452879  4.45776445  4.00467382]
     [ 3.60937686  4.71452879  0.          5.66500917  5.26602175]
     [ 3.32591826  4.45776445  5.66500917  0.          5.02324896]
     [ 2.81569212  4.00467382  5.26602175  5.<a href="tel:02324896%20%200" value="+4623248960" target="_blank">02324896 0.        ]

z:

6.0
210.0
102.0
54.0
35.0
17.0
9.0
595.0
315.0
153.0

I have checked these calculations with a spreadsheet calculation, consequently I'm correct as far as I have gotten.

However, my problem is calculating the value of I for the 300 x-coordinate values required for the molecular index calculation.

The problem that I am currently having is calculating the product of the pre-sine function and the sine term for the 300 values of s.

Unfortunately, the wealth of  function in Python makes it difficult for me to ascertain just how to proceed.  It seems that I have the problem of seeing the trees because of the size of the forest.  At this point Goggling only makes my confusion deeper.  Any pointers in the correct direction will be greatly appreciated.

Thanks in advance.

--
Stephen P. Molnar, Ph.D.                Life is a fuzzy set
www.molecular-modeling.net              Stochastic and multivariate
(614)312-7528 (c)
Skype: smolnar1


_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user



_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Molecular Index Calculation

Stephen P. Molnar
On 04/15/2017 04:07 PM, Hjalmar Turesson wrote:

> Hi Stephen,
>
>
> I see that N = 10 (ie. the length of Z), and that r is indexed by i and
> j, which run up to N. However, r is only a 5x5 array. Is this correct?
>
> Otherwise, something like this might work:
>
> import numpy as np
>
> start=1
> finish=31
> points=300
> s = np.linspace(start, finish, points)
>
> MASS =  np.array([12.011, 1.008, 79.9, 35.453, 18.998])
>
> r = np.array([[0., 2.059801, 3.60937686, 3.32591826, 2.81569212],
>                [2.059801, 0., 4.71452879, 4.45776445, 4.00467382],
>                [3.60937686, 4.71452879, 0., 5.66500917, 5.26602175],
>                [3.32591826, 4.45776445, 5.66500917, 0., 5.02324896],
>                [2.81569212, 4.00467382, 5.26602175, 5.02324896, 0.]])
>
> r[r == 0.] = 0.0001
>
> Z = np.array([6.0, 210.0, 102.0, 54.0, 35.0, 17.0, 9.0, 595.0, 315.0,
> 153.0])
>
> N = Z.size
>
> I = np.zeros(s.shape)
>
> for i in range(1, N):
>      for j in range(j):
>          I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
>
>
> Do you have an example of the correct output?
>
> Best,
> Hjalmar
>
> On Sat, Apr 15, 2017 at 6:29 AM, Stephen P. Molnar
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     I am a newcomer to Python and am attempting to translate a FORTRAN
>     program that I wrote about 20 years ago into Python, specifically v-3.6.
>
>     Let me say that I am not asking someone to write the program for me,
>     but only to point me is the right direction.
>
>     So far, I have bumbled my way to the point that I can get all of the
>     input data resulting from a quantum mechanical calculation of a very
>     simple organic molecule in to a Python program, but am encountering
>     problems with processing the data.
>
>     The equation that I a want to evaluate (the one that I programmed in
>     FORTRAN) is equation (7) in the attached scan of one of the pages of
>     the following document.
>
>     Stephen P. Molnar and James W. King, Theory and Applications of the
>     Integrated Molecular Transform and the Normalized Molecular Moment
>     Structure Descriptors: QSAR and QSPR Paradigms, Int. J Quantum
>     Chem., 85, 662 (2001).
>
>     the equation is
>
>     I(s) = sum from i=2toN sum from j=1toi-1 Z_sub_i*Z_sub_j *
>
>         sin  s*r_sub_ij
>        -----------------  Equation (7)
>             s*r_sub_ij
>
>     What I have managed to do so far is for a simple organic molecule
>     containing 5 atoms (CHFClBr):
>
>     import numpy as np
>     import numpy, pandas, scipy.spatial.distance as dist
>
>     r=[]        #distances between pairs of atoms
>     z = []      #atomic numbers of atoms
>     Z_dist = [] #products of successive atomic numbers
>     I = []      #vector calculated in equation  (7)
>
>     start=1
>     finish=31
>     points=300
>     s = np.linspace(start,finish,points)
>
>
>     name = input("Enter Molecule ID: ")
>     name = str(name)
>     name_in = name+'.dat'
>
>     df = pandas.read_table(name_in, skiprows=2, sep=" ",
>     skipinitialspace=True)
>     z = numpy.array(df["ZA"])
>     N =  numpy.ma.size(z)  ## of atoms in molecule
>     a = numpy.array(df[["X", "Y", "Z"]])
>     dist.squareform(dist.pdist(a, "euclidean"))
>     anrows, ancols = np.shape(a)
>     a_new = a.reshape(anrows, 1, ancols)
>     diff = a_new - a
>
>     r = (diff ** 2).sum(2)
>     r = np.sqrt(r)
>
>     for j in range(0,N-1):
>           for k in range(j+1,N):
>               Z_diff = (z[j]*z[k])
>
>
>     For the test molecule I get the following:
>
>     MASS =  [ 12.011   1.008  79.9    35.453  18.998]  (Atomic weights
>     of atoms)
>
>     r:  [ 0.          2.059801    3.60937686  3.32591826  2.81569212]
>           [ 2.059801    0.          4.71452879  4.45776445  4.00467382]
>           [ 3.60937686  4.71452879  0.          5.66500917  5.26602175]
>           [ 3.32591826  4.45776445  5.66500917  0.          5.02324896]
>           [ 2.81569212  4.00467382  5.26602175  5.02324896 0
>     <tel:02324896%20%200>.        ]
>
>     z:
>
>     6.0
>     210.0
>     102.0
>     54.0
>     35.0
>     17.0
>     9.0
>     595.0
>     315.0
>     153.0
>
>     I have checked these calculations with a spreadsheet calculation,
>     consequently I'm correct as far as I have gotten.
>
>     However, my problem is calculating the value of I for the 300
>     x-coordinate values required for the molecular index calculation.
>
>     The problem that I am currently having is calculating the product of
>     the pre-sine function and the sine term for the 300 values of s.
>
>     Unfortunately, the wealth of  function in Python makes it difficult
>     for me to ascertain just how to proceed.  It seems that I have the
>     problem of seeing the trees because of the size of the forest.  At
>     this point Goggling only makes my confusion deeper.  Any pointers in
>     the correct direction will be greatly appreciated.
>
>     Thanks in advance.
>
>     --
>     Stephen P. Molnar, Ph.D.                Life is a fuzzy set
>     www.molecular-modeling.net <http://www.molecular-modeling.net>
>              Stochastic and multivariate
>     (614)312-7528 (c)
>     Skype: smolnar1
>
>
>     _______________________________________________
>     SciPy-User mailing list
>     [hidden email] <mailto:[hidden email]>
>     https://mail.python.org/mailman/listinfo/scipy-user
>     <https://mail.python.org/mailman/listinfo/scipy-user>
>
>
>
>
> _______________________________________________
> SciPy-User mailing list
> [hidden email]
> https://mail.python.org/mailman/listinfo/scipy-user
>
Hjalmar

Thanks for your response.

I'm sorry that my original request wasn't a bit clearer.  I presume that
the paper I attached to the request help was deleted before the email
was posted to the list.

Here is how I incorporated your suggestion in the code.

import numpy as np
import numpy, pandas, scipy.spatial.distance as dist

r = []
s=[]
z = []
Z = []
I = []

start=1
finish=31
points=300
s = np.linspace(start,finish,points)
np.savetxt('Number of Points',s,delimiter=' ')

name = input("Enter Molecule ID: ")
name = str(name)
name_in = name+'.dat'

df = pandas.read_table(name_in, skiprows=2, sep=" ",
skipinitialspace=True)
Z = numpy.array(df["ZA"])
print('z = ',z)
N =  numpy.ma.size(z)
a = numpy.array(df[["X", "Y", "Z"]])
dist.squareform(dist.pdist(a, "euclidean"))
anrows, ancols = np.shape(a)
a_new = a.reshape(anrows, 1, ancols)
diff = a_new - a

r = (diff ** 2).sum(2)
r = np.sqrt(z)

r[r == 0.] = 0.0001

N = Z.size

I = np.zeros(s.shape)

for i in range(1, N):
     for j in range(j):
         I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
         print('I: ',I)

np.savetxt('I',I,delimiter=' ')

The for j in range(j): has a marginal note (I'm using the Spyder IDE)
"undefined name 'y' " and I is a vector with three hundred 0.0 terms.

also, please note that I removed 'Z = np.array([6.0, 210.0, 102.0, 54.0,
35.0, 17.0, 9.0, 595.0, 315.0, 153.0])' from your suggestion as that is
the result of the per-sine term 'Z[i] * Z[j]'

Regards,

        Steve

--
Stephen P. Molnar, Ph.D. Life is a fuzzy set
www.molecular-modeling.net Stochastic and multivariate
(614)312-7528 (c)
Skype: smolnar1
_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Molecular Index Calculation

David Mikolas
I escaped a cluttered Fortran/C/Pascal/Basic/Perl jumble by "living" in stackoverflow. Mostly by reading questions and answers, and learning how not to get my questions closed in milliseconds. Break your program into small pieces, and look for how each piece might be done differently. 

If you try to find ways to do things without ever manually checking the sizes of arrays, you'll naturally be guided towards "pythonic" methods and techniques. There is almost always a better way.

Computers are so fast these days, don't worry about ultimate speed until you really hit a brick wall. Letting python, numpy, or scipy methods handle housekeeping for you means you are usually defaulting to very well written compiled routines.

So for example, let python do the thinking...

r = np.sqrt((diff ** 2).sum(2))
r[r < 0.0001] = 0.0001  
# or just
r = np.maximum(np.sqrt((diff ** 2).sum(2)), 0.0001)

# and

Zij = Z * Z[:, None]
triangle = np.triu(np.ones_like(Zij), k=1)
I = Zij*trangle.sum()

# or just

I = np.triu(Z * Z[:, None], k=1).sum()


On Sun, Apr 16, 2017 at 8:25 PM, Stephen P. Molnar <[hidden email]> wrote:
On 04/15/2017 04:07 PM, Hjalmar Turesson wrote:
Hi Stephen,


I see that N = 10 (ie. the length of Z), and that r is indexed by i and
j, which run up to N. However, r is only a 5x5 array. Is this correct?

Otherwise, something like this might work:

import numpy as np

start=1
finish=31
points=300
s = np.linspace(start, finish, points)

MASS =  np.array([12.011, 1.008, 79.9, 35.453, 18.998])

r = np.array([[0., 2.059801, 3.60937686, 3.32591826, 2.81569212],
               [2.059801, 0., 4.71452879, 4.45776445, 4.00467382],
               [3.60937686, 4.71452879, 0., 5.66500917, 5.26602175],
               [3.32591826, 4.45776445, 5.66500917, 0., 5.02324896],
               [2.81569212, 4.00467382, 5.26602175, 5.02324896, 0.]])

r[r == 0.] = 0.0001

Z = np.array([6.0, 210.0, 102.0, 54.0, 35.0, 17.0, 9.0, 595.0, 315.0,
153.0])

N = Z.size

I = np.zeros(s.shape)

for i in range(1, N):
     for j in range(j):
         I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])


Do you have an example of the correct output?

Best,
Hjalmar

On Sat, Apr 15, 2017 at 6:29 AM, Stephen P. Molnar
<[hidden email] <mailto:[hidden email]>> wrote:

    I am a newcomer to Python and am attempting to translate a FORTRAN
    program that I wrote about 20 years ago into Python, specifically v-3.6.

    Let me say that I am not asking someone to write the program for me,
    but only to point me is the right direction.

    So far, I have bumbled my way to the point that I can get all of the
    input data resulting from a quantum mechanical calculation of a very
    simple organic molecule in to a Python program, but am encountering
    problems with processing the data.

    The equation that I a want to evaluate (the one that I programmed in
    FORTRAN) is equation (7) in the attached scan of one of the pages of
    the following document.

    Stephen P. Molnar and James W. King, Theory and Applications of the
    Integrated Molecular Transform and the Normalized Molecular Moment
    Structure Descriptors: QSAR and QSPR Paradigms, Int. J Quantum
    Chem., 85, 662 (2001).

    the equation is

    I(s) = sum from i=2toN sum from j=1toi-1 Z_sub_i*Z_sub_j *

        sin  s*r_sub_ij
       -----------------  Equation (7)
            s*r_sub_ij

    What I have managed to do so far is for a simple organic molecule
    containing 5 atoms (CHFClBr):

    import numpy as np
    import numpy, pandas, scipy.spatial.distance as dist

    r=[]        #distances between pairs of atoms
    z = []      #atomic numbers of atoms
    Z_dist = [] #products of successive atomic numbers
    I = []      #vector calculated in equation  (7)

    start=1
    finish=31
    points=300
    s = np.linspace(start,finish,points)


    name = input("Enter Molecule ID: ")
    name = str(name)
    name_in = name+'.dat'

    df = pandas.read_table(name_in, skiprows=2, sep=" ",
    skipinitialspace=True)
    z = numpy.array(df["ZA"])
    N =  numpy.ma.size(z)  ## of atoms in molecule
    a = numpy.array(df[["X", "Y", "Z"]])
    dist.squareform(dist.pdist(a, "euclidean"))
    anrows, ancols = np.shape(a)
    a_new = a.reshape(anrows, 1, ancols)
    diff = a_new - a

    r = (diff ** 2).sum(2)
    r = np.sqrt(r)

    for j in range(0,N-1):
          for k in range(j+1,N):
              Z_diff = (z[j]*z[k])


    For the test molecule I get the following:

    MASS =  [ 12.011   1.008  79.9    35.453  18.998]  (Atomic weights
    of atoms)

    r:  [ 0.          2.059801    3.60937686  3.32591826  2.81569212]
          [ 2.059801    0.          4.71452879  4.45776445  4.00467382]
          [ 3.60937686  4.71452879  0.          5.66500917  5.26602175]
          [ 3.32591826  4.45776445  5.66500917  0.          5.02324896]
          [ 2.81569212  4.00467382  5.26602175  5.02324896 0
    <tel:02324896%20%200>.        ]


    z:

    6.0
    210.0
    102.0
    54.0
    35.0
    17.0
    9.0
    595.0
    315.0
    153.0

    I have checked these calculations with a spreadsheet calculation,
    consequently I'm correct as far as I have gotten.

    However, my problem is calculating the value of I for the 300
    x-coordinate values required for the molecular index calculation.

    The problem that I am currently having is calculating the product of
    the pre-sine function and the sine term for the 300 values of s.

    Unfortunately, the wealth of  function in Python makes it difficult
    for me to ascertain just how to proceed.  It seems that I have the
    problem of seeing the trees because of the size of the forest.  At
    this point Goggling only makes my confusion deeper.  Any pointers in
    the correct direction will be greatly appreciated.

    Thanks in advance.

    --
    Stephen P. Molnar, Ph.D.                Life is a fuzzy set
    www.molecular-modeling.net <http://www.molecular-modeling.net>
             Stochastic and multivariate
    <a href="tel:%28614%29312-7528" value="+16143127528" target="_blank">(614)312-7528 (c)
    Skype: smolnar1


    _______________________________________________
    SciPy-User mailing list
    [hidden email] <mailto:[hidden email]>
    https://mail.python.org/mailman/listinfo/scipy-user
    <https://mail.python.org/mailman/listinfo/scipy-user>




_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user

Hjalmar

Thanks for your response.

I'm sorry that my original request wasn't a bit clearer.  I presume that the paper I attached to the request help was deleted before the email was posted to the list.

Here is how I incorporated your suggestion in the code.

import numpy as np
import numpy, pandas, scipy.spatial.distance as dist

r = []
s=[]
z = []
Z = []
I = []

start=1
finish=31
points=300
s = np.linspace(start,finish,points)
np.savetxt('Number of Points',s,delimiter=' ')

name = input("Enter Molecule ID: ")
name = str(name)
name_in = name+'.dat'

df = pandas.read_table(name_in, skiprows=2, sep=" ", skipinitialspace=True)
Z = numpy.array(df["ZA"])
print('z = ',z)
N =  numpy.ma.size(z)
a = numpy.array(df[["X", "Y", "Z"]])
dist.squareform(dist.pdist(a, "euclidean"))
anrows, ancols = np.shape(a)
a_new = a.reshape(anrows, 1, ancols)
diff = a_new - a

r = (diff ** 2).sum(2)
r = np.sqrt(z)

r[r == 0.] = 0.0001

N = Z.size

I = np.zeros(s.shape)

for i in range(1, N):
    for j in range(j):
        I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
        print('I: ',I)

np.savetxt('I',I,delimiter=' ')

The for j in range(j): has a marginal note (I'm using the Spyder IDE) "undefined name 'y' " and I is a vector with three hundred 0.0 terms.

also, please note that I removed 'Z = np.array([6.0, 210.0, 102.0, 54.0, 35.0, 17.0, 9.0, 595.0, 315.0, 153.0])' from your suggestion as that is the result of the per-sine term 'Z[i] * Z[j]'

Regards,

        Steve


--
Stephen P. Molnar, Ph.D.                Life is a fuzzy set
www.molecular-modeling.net              Stochastic and multivariate
<a href="tel:%28614%29312-7528" value="+16143127528" target="_blank">(614)312-7528 (c)
Skype: smolnar1
_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user


_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Molecular Index Calculation

David Mikolas
wrong: I = Zij*trangle.sum()

better: I = (Zij*trangle).sum()


On Sun, Apr 16, 2017 at 11:56 PM, David Mikolas <[hidden email]> wrote:
I escaped a cluttered Fortran/C/Pascal/Basic/Perl jumble by "living" in stackoverflow. Mostly by reading questions and answers, and learning how not to get my questions closed in milliseconds. Break your program into small pieces, and look for how each piece might be done differently. 

If you try to find ways to do things without ever manually checking the sizes of arrays, you'll naturally be guided towards "pythonic" methods and techniques. There is almost always a better way.

Computers are so fast these days, don't worry about ultimate speed until you really hit a brick wall. Letting python, numpy, or scipy methods handle housekeeping for you means you are usually defaulting to very well written compiled routines.

So for example, let python do the thinking...

r = np.sqrt((diff ** 2).sum(2))
r[r < 0.0001] = 0.0001  
# or just
r = np.maximum(np.sqrt((diff ** 2).sum(2)), 0.0001)

# and

Zij = Z * Z[:, None]
triangle = np.triu(np.ones_like(Zij), k=1)
I = Zij*trangle.sum()

# or just

I = np.triu(Z * Z[:, None], k=1).sum()


On Sun, Apr 16, 2017 at 8:25 PM, Stephen P. Molnar <[hidden email]> wrote:
On 04/15/2017 04:07 PM, Hjalmar Turesson wrote:
Hi Stephen,


I see that N = 10 (ie. the length of Z), and that r is indexed by i and
j, which run up to N. However, r is only a 5x5 array. Is this correct?

Otherwise, something like this might work:

import numpy as np

start=1
finish=31
points=300
s = np.linspace(start, finish, points)

MASS =  np.array([12.011, 1.008, 79.9, 35.453, 18.998])

r = np.array([[0., 2.059801, 3.60937686, 3.32591826, 2.81569212],
               [2.059801, 0., 4.71452879, 4.45776445, 4.00467382],
               [3.60937686, 4.71452879, 0., 5.66500917, 5.26602175],
               [3.32591826, 4.45776445, 5.66500917, 0., 5.02324896],
               [2.81569212, 4.00467382, 5.26602175, 5.02324896, 0.]])

r[r == 0.] = 0.0001

Z = np.array([6.0, 210.0, 102.0, 54.0, 35.0, 17.0, 9.0, 595.0, 315.0,
153.0])

N = Z.size

I = np.zeros(s.shape)

for i in range(1, N):
     for j in range(j):
         I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])


Do you have an example of the correct output?

Best,
Hjalmar

On Sat, Apr 15, 2017 at 6:29 AM, Stephen P. Molnar
<[hidden email] <mailto:[hidden email]>> wrote:

    I am a newcomer to Python and am attempting to translate a FORTRAN
    program that I wrote about 20 years ago into Python, specifically v-3.6.

    Let me say that I am not asking someone to write the program for me,
    but only to point me is the right direction.

    So far, I have bumbled my way to the point that I can get all of the
    input data resulting from a quantum mechanical calculation of a very
    simple organic molecule in to a Python program, but am encountering
    problems with processing the data.

    The equation that I a want to evaluate (the one that I programmed in
    FORTRAN) is equation (7) in the attached scan of one of the pages of
    the following document.

    Stephen P. Molnar and James W. King, Theory and Applications of the
    Integrated Molecular Transform and the Normalized Molecular Moment
    Structure Descriptors: QSAR and QSPR Paradigms, Int. J Quantum
    Chem., 85, 662 (2001).

    the equation is

    I(s) = sum from i=2toN sum from j=1toi-1 Z_sub_i*Z_sub_j *

        sin  s*r_sub_ij
       -----------------  Equation (7)
            s*r_sub_ij

    What I have managed to do so far is for a simple organic molecule
    containing 5 atoms (CHFClBr):

    import numpy as np
    import numpy, pandas, scipy.spatial.distance as dist

    r=[]        #distances between pairs of atoms
    z = []      #atomic numbers of atoms
    Z_dist = [] #products of successive atomic numbers
    I = []      #vector calculated in equation  (7)

    start=1
    finish=31
    points=300
    s = np.linspace(start,finish,points)


    name = input("Enter Molecule ID: ")
    name = str(name)
    name_in = name+'.dat'

    df = pandas.read_table(name_in, skiprows=2, sep=" ",
    skipinitialspace=True)
    z = numpy.array(df["ZA"])
    N =  numpy.ma.size(z)  ## of atoms in molecule
    a = numpy.array(df[["X", "Y", "Z"]])
    dist.squareform(dist.pdist(a, "euclidean"))
    anrows, ancols = np.shape(a)
    a_new = a.reshape(anrows, 1, ancols)
    diff = a_new - a

    r = (diff ** 2).sum(2)
    r = np.sqrt(r)

    for j in range(0,N-1):
          for k in range(j+1,N):
              Z_diff = (z[j]*z[k])


    For the test molecule I get the following:

    MASS =  [ 12.011   1.008  79.9    35.453  18.998]  (Atomic weights
    of atoms)

    r:  [ 0.          2.059801    3.60937686  3.32591826  2.81569212]
          [ 2.059801    0.          4.71452879  4.45776445  4.00467382]
          [ 3.60937686  4.71452879  0.          5.66500917  5.26602175]
          [ 3.32591826  4.45776445  5.66500917  0.          5.02324896]
          [ 2.81569212  4.00467382  5.26602175  5.02324896 0
    <tel:02324896%20%200>.        ]


    z:

    6.0
    210.0
    102.0
    54.0
    35.0
    17.0
    9.0
    595.0
    315.0
    153.0

    I have checked these calculations with a spreadsheet calculation,
    consequently I'm correct as far as I have gotten.

    However, my problem is calculating the value of I for the 300
    x-coordinate values required for the molecular index calculation.

    The problem that I am currently having is calculating the product of
    the pre-sine function and the sine term for the 300 values of s.

    Unfortunately, the wealth of  function in Python makes it difficult
    for me to ascertain just how to proceed.  It seems that I have the
    problem of seeing the trees because of the size of the forest.  At
    this point Goggling only makes my confusion deeper.  Any pointers in
    the correct direction will be greatly appreciated.

    Thanks in advance.

    --
    Stephen P. Molnar, Ph.D.                Life is a fuzzy set
    www.molecular-modeling.net <http://www.molecular-modeling.net>
             Stochastic and multivariate
    <a href="tel:%28614%29312-7528" value="+16143127528" target="_blank">(614)312-7528 (c)
    Skype: smolnar1


    _______________________________________________
    SciPy-User mailing list
    [hidden email] <mailto:[hidden email]>
    https://mail.python.org/mailman/listinfo/scipy-user
    <https://mail.python.org/mailman/listinfo/scipy-user>




_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user

Hjalmar

Thanks for your response.

I'm sorry that my original request wasn't a bit clearer.  I presume that the paper I attached to the request help was deleted before the email was posted to the list.

Here is how I incorporated your suggestion in the code.

import numpy as np
import numpy, pandas, scipy.spatial.distance as dist

r = []
s=[]
z = []
Z = []
I = []

start=1
finish=31
points=300
s = np.linspace(start,finish,points)
np.savetxt('Number of Points',s,delimiter=' ')

name = input("Enter Molecule ID: ")
name = str(name)
name_in = name+'.dat'

df = pandas.read_table(name_in, skiprows=2, sep=" ", skipinitialspace=True)
Z = numpy.array(df["ZA"])
print('z = ',z)
N =  numpy.ma.size(z)
a = numpy.array(df[["X", "Y", "Z"]])
dist.squareform(dist.pdist(a, "euclidean"))
anrows, ancols = np.shape(a)
a_new = a.reshape(anrows, 1, ancols)
diff = a_new - a

r = (diff ** 2).sum(2)
r = np.sqrt(z)

r[r == 0.] = 0.0001

N = Z.size

I = np.zeros(s.shape)

for i in range(1, N):
    for j in range(j):
        I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
        print('I: ',I)

np.savetxt('I',I,delimiter=' ')

The for j in range(j): has a marginal note (I'm using the Spyder IDE) "undefined name 'y' " and I is a vector with three hundred 0.0 terms.

also, please note that I removed 'Z = np.array([6.0, 210.0, 102.0, 54.0, 35.0, 17.0, 9.0, 595.0, 315.0, 153.0])' from your suggestion as that is the result of the per-sine term 'Z[i] * Z[j]'

Regards,

        Steve


--
Stephen P. Molnar, Ph.D.                Life is a fuzzy set
www.molecular-modeling.net              Stochastic and multivariate
<a href="tel:%28614%29312-7528" value="+16143127528" target="_blank">(614)312-7528 (c)
Skype: smolnar1
_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user



_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Molecular Index Calculation

Stephen P. Molnar
On 04/16/2017 11:58 AM, David Mikolas wrote:

> wrong: I = Zij*trangle.sum()
>
> better: I = (Zij*trangle).sum()
>
>
> On Sun, Apr 16, 2017 at 11:56 PM, David Mikolas
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     I escaped a cluttered Fortran/C/Pascal/Basic/Perl jumble by "living"
>     in stackoverflow. Mostly by reading questions and answers, and
>     learning how not to get my questions closed in milliseconds. Break
>     your program into small pieces, and look for how each piece might be
>     done differently.
>
>     If you try to find ways to do things without ever manually checking
>     the sizes of arrays, you'll naturally be guided towards "pythonic"
>     methods and techniques. There is almost always a better way.
>
>     Computers are so fast these days, don't worry about ultimate speed
>     until you really hit a brick wall. Letting python, numpy, or scipy
>     methods handle housekeeping for you means you are usually defaulting
>     to very well written compiled routines.
>
>     So for example, let python do the thinking...
>
>     r = np.sqrt((diff ** 2).sum(2))
>     r[r < 0.0001] = 0.0001
>     # or just
>     r = np.maximum(np.sqrt((diff ** 2).sum(2)), 0.0001)
>
>     # and
>
>     Zij = Z * Z[:, None]
>     triangle = np.triu(np.ones_like(Zij), k=1)
>     I = Zij*trangle.sum()
>
>     # or just
>
>     I = np.triu(Z * Z[:, None], k=1).sum()
>
>
>     On Sun, Apr 16, 2017 at 8:25 PM, Stephen P. Molnar
>     <[hidden email] <mailto:[hidden email]>> wrote:
>
>         On 04/15/2017 04:07 PM, Hjalmar Turesson wrote:
>
>             Hi Stephen,
>
>
>             I see that N = 10 (ie. the length of Z), and that r is
>             indexed by i and
>             j, which run up to N. However, r is only a 5x5 array. Is
>             this correct?
>
>             Otherwise, something like this might work:
>
>             import numpy as np
>
>             start=1
>             finish=31
>             points=300
>             s = np.linspace(start, finish, points)
>
>             MASS =  np.array([12.011, 1.008, 79.9, 35.453, 18.998])
>
>             r = np.array([[0., 2.059801, 3.60937686, 3.32591826,
>             2.81569212],
>                             [2.059801, 0., 4.71452879, 4.45776445,
>             4.00467382],
>                             [3.60937686, 4.71452879, 0., 5.66500917,
>             5.26602175],
>                             [3.32591826, 4.45776445, 5.66500917, 0.,
>             5.02324896],
>                             [2.81569212, 4.00467382, 5.26602175,
>             5.02324896, 0.]])
>
>             r[r == 0.] = 0.0001
>
>             Z = np.array([6.0, 210.0, 102.0, 54.0, 35.0, 17.0, 9.0,
>             595.0, 315.0,
>             153.0])
>
>             N = Z.size
>
>             I = np.zeros(s.shape)
>
>             for i in range(1, N):
>                   for j in range(j):
>                       I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
>
>
>             Do you have an example of the correct output?
>
>             Best,
>             Hjalmar
>
>             On Sat, Apr 15, 2017 at 6:29 AM, Stephen P. Molnar
>             <[hidden email] <mailto:[hidden email]>
>             <mailto:[hidden email]
>             <mailto:[hidden email]>>> wrote:
>
>                  I am a newcomer to Python and am attempting to
>             translate a FORTRAN
>                  program that I wrote about 20 years ago into Python,
>             specifically v-3.6.
>
>                  Let me say that I am not asking someone to write the
>             program for me,
>                  but only to point me is the right direction.
>
>                  So far, I have bumbled my way to the point that I can
>             get all of the
>                  input data resulting from a quantum mechanical
>             calculation of a very
>                  simple organic molecule in to a Python program, but am
>             encountering
>                  problems with processing the data.
>
>                  The equation that I a want to evaluate (the one that I
>             programmed in
>                  FORTRAN) is equation (7) in the attached scan of one of
>             the pages of
>                  the following document.
>
>                  Stephen P. Molnar and James W. King, Theory and
>             Applications of the
>                  Integrated Molecular Transform and the Normalized
>             Molecular Moment
>                  Structure Descriptors: QSAR and QSPR Paradigms, Int. J
>             Quantum
>                  Chem., 85, 662 (2001).
>
>                  the equation is
>
>                  I(s) = sum from i=2toN sum from j=1toi-1 Z_sub_i*Z_sub_j *
>
>                      sin  s*r_sub_ij
>                     -----------------  Equation (7)
>                          s*r_sub_ij
>
>                  What I have managed to do so far is for a simple
>             organic molecule
>                  containing 5 atoms (CHFClBr):
>
>                  import numpy as np
>                  import numpy, pandas, scipy.spatial.distance as dist
>
>                  r=[]        #distances between pairs of atoms
>                  z = []      #atomic numbers of atoms
>                  Z_dist = [] #products of successive atomic numbers
>                  I = []      #vector calculated in equation  (7)
>
>                  start=1
>                  finish=31
>                  points=300
>                  s = np.linspace(start,finish,points)
>
>
>                  name = input("Enter Molecule ID: ")
>                  name = str(name)
>                  name_in = name+'.dat'
>
>                  df = pandas.read_table(name_in, skiprows=2, sep=" ",
>                  skipinitialspace=True)
>                  z = numpy.array(df["ZA"])
>                  N =  numpy.ma.size(z)  ## of atoms in molecule
>                  a = numpy.array(df[["X", "Y", "Z"]])
>                  dist.squareform(dist.pdist(a, "euclidean"))
>                  anrows, ancols = np.shape(a)
>                  a_new = a.reshape(anrows, 1, ancols)
>                  diff = a_new - a
>
>                  r = (diff ** 2).sum(2)
>                  r = np.sqrt(r)
>
>                  for j in range(0,N-1):
>                        for k in range(j+1,N):
>                            Z_diff = (z[j]*z[k])
>
>
>                  For the test molecule I get the following:
>
>                  MASS =  [ 12.011   1.008  79.9    35.453  18.998]
>             (Atomic weights
>                  of atoms)
>
>                  r:  [ 0.          2.059801    3.60937686  3.32591826
>             2.81569212]
>                        [ 2.059801    0.          4.71452879  4.45776445
>             4.00467382]
>                        [ 3.60937686  4.71452879  0.          5.66500917
>             5.26602175]
>                        [ 3.32591826  4.45776445  5.66500917  0.
>             5.02324896]
>                        [ 2.81569212  4.00467382  5.26602175  5.02324896 0
>                  <tel:02324896%20%200>.        ]
>
>
>                  z:
>
>                  6.0
>                  210.0
>                  102.0
>                  54.0
>                  35.0
>                  17.0
>                  9.0
>                  595.0
>                  315.0
>                  153.0
>
>                  I have checked these calculations with a spreadsheet
>             calculation,
>                  consequently I'm correct as far as I have gotten.
>
>                  However, my problem is calculating the value of I for
>             the 300
>                  x-coordinate values required for the molecular index
>             calculation.
>
>                  The problem that I am currently having is calculating
>             the product of
>                  the pre-sine function and the sine term for the 300
>             values of s.
>
>                  Unfortunately, the wealth of  function in Python makes
>             it difficult
>                  for me to ascertain just how to proceed.  It seems that
>             I have the
>                  problem of seeing the trees because of the size of the
>             forest.  At
>                  this point Goggling only makes my confusion deeper.
>             Any pointers in
>                  the correct direction will be greatly appreciated.
>
>                  Thanks in advance.
>
>                  --
>                  Stephen P. Molnar, Ph.D.                Life is a fuzzy set
>             www.molecular-modeling.net
>             <http://www.molecular-modeling.net>
>             <http://www.molecular-modeling.net
>             <http://www.molecular-modeling.net>>
>                           Stochastic and multivariate
>             (614)312-7528 <tel:%28614%29312-7528> (c)
>                  Skype: smolnar1
>
>
>                  _______________________________________________
>                  SciPy-User mailing list
>             [hidden email] <mailto:[hidden email]>
>             <mailto:[hidden email] <mailto:[hidden email]>>
>             https://mail.python.org/mailman/listinfo/scipy-user
>             <https://mail.python.org/mailman/listinfo/scipy-user>
>                  <https://mail.python.org/mailman/listinfo/scipy-user
>             <https://mail.python.org/mailman/listinfo/scipy-user>>
>
>
>
>
>             _______________________________________________
>             SciPy-User mailing list
>             [hidden email] <mailto:[hidden email]>
>             https://mail.python.org/mailman/listinfo/scipy-user
>             <https://mail.python.org/mailman/listinfo/scipy-user>
>
>         Hjalmar
>
>         Thanks for your response.
>
>         I'm sorry that my original request wasn't a bit clearer.  I
>         presume that the paper I attached to the request help was
>         deleted before the email was posted to the list.
>
>         Here is how I incorporated your suggestion in the code.
>
>         import numpy as np
>         import numpy, pandas, scipy.spatial.distance as dist
>
>         r = []
>         s=[]
>         z = []
>         Z = []
>         I = []
>
>         start=1
>         finish=31
>         points=300
>         s = np.linspace(start,finish,points)
>         np.savetxt('Number of Points',s,delimiter=' ')
>
>         name = input("Enter Molecule ID: ")
>         name = str(name)
>         name_in = name+'.dat'
>
>         df = pandas.read_table(name_in, skiprows=2, sep=" ",
>         skipinitialspace=True)
>         Z = numpy.array(df["ZA"])
>         print('z = ',z)
>         N =  numpy.ma.size(z)
>         a = numpy.array(df[["X", "Y", "Z"]])
>         dist.squareform(dist.pdist(a, "euclidean"))
>         anrows, ancols = np.shape(a)
>         a_new = a.reshape(anrows, 1, ancols)
>         diff = a_new - a
>
>         r = (diff ** 2).sum(2)
>         r = np.sqrt(z)
>
>         r[r == 0.] = 0.0001
>
>         N = Z.size
>
>         I = np.zeros(s.shape)
>
>         for i in range(1, N):
>              for j in range(j):
>                  I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
>                  print('I: ',I)
>
>         np.savetxt('I',I,delimiter=' ')
>
>         The for j in range(j): has a marginal note (I'm using the Spyder
>         IDE) "undefined name 'y' " and I is a vector with three hundred
>         0.0 terms.
>
>         also, please note that I removed 'Z = np.array([6.0, 210.0,
>         102.0, 54.0, 35.0, 17.0, 9.0, 595.0, 315.0, 153.0])' from your
>         suggestion as that is the result of the per-sine term 'Z[i] * Z[j]'
>
>         Regards,
>
>                  Steve
>
>
>         --
>         Stephen P. Molnar, Ph.D.                Life is a fuzzy set
>         www.molecular-modeling.net <http://www.molecular-modeling.net>
>                      Stochastic and multivariate
>         (614)312-7528 <tel:%28614%29312-7528> (c)
>         Skype: smolnar1
>         _______________________________________________
>         SciPy-User mailing list
>         [hidden email] <mailto:[hidden email]>
>         https://mail.python.org/mailman/listinfo/scipy-user
>         <https://mail.python.org/mailman/listinfo/scipy-user>
>
>
>
>
>
> _______________________________________________
> SciPy-User mailing list
> [hidden email]
> https://mail.python.org/mailman/listinfo/scipy-user
>
Great suggestions!

Many thanks.

--
Stephen P. Molnar, Ph.D. Life is a fuzzy set
www.molecular-modeling.net Stochastic and multivariate
(614)312-7528 (c)
Skype: smolnar1
_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/scipy-user
Loading...