I need help speeding up some code I wrote to perform a Runge-Kutta integration. I need to do the integration as part of a real-time control algorithm, so it needs to be fairly fast. scipy.integrate.odeint does too much error checking to be fast enough. My pure Python version was just a little too slow, so I tried coding it up in Cython. I have only used Cython once before, so I don't know if I did it correctly (the .pyx file is attached).

The code runs just fine, but there is almost no speedup. I think the core issue is that my dxdt_runge_kuta function gets called about 4000 times per second, so most of my overhead is in the function calls (I think). I am running my real-time control algorithm at 500 Hz and I need at least 2 Runge-Kutta integration steps per real-time step for numeric stability. And the Runge-Kutta algorithm needs to evaluate the derivative 4 times per time step. So, 500 Hz * 2 * 4 = 4000 calls per second.

I also tried coding this up in Fortran and using f2py, but I am getting a type mismatch error I don't understand. I have a function that declares its return value as double precision:

    double precision function dzdt(x,voltage)

and I declare the variable I want to store the returned value in to also be double precision:

    double precision F,z,vel,accel,zdot1,zdot2,zdot3,zdot4

    zdot1 = dzdt(x_prev,volts)

but somehow it is not happy.

My C skills are pretty weak (the longer I use Python, the more C I forget, and I didn't know that much to start with). I started looking into Boost as well as using f2py on C code, but I got stuck.

Can anyone either make my Cython or Fortran approaches work or point me in a different direction?

Thanks,

Ryan

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user
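To make the call-count arithmetic in the post above concrete, here is a minimal pure-Python sketch of a classical fourth-order Runge-Kutta step driven at 500 Hz with 2 substeps per control tick. The 3-state dynamics in `toy_dxdt` and all function names are hypothetical stand-ins, not the model or code from the attached .pyx file.

```python
def rk4_step(f, x, u, dt):
    """Advance state x by one classical RK4 step of size dt.

    f(x, u) returns dx/dt as a list; it is evaluated four times per step.
    """
    n = len(x)
    g1 = f(x, u)
    g2 = f([x[i] + 0.5 * dt * g1[i] for i in range(n)], u)
    g3 = f([x[i] + 0.5 * dt * g2[i] for i in range(n)], u)
    g4 = f([x[i] + dt * g3[i] for i in range(n)], u)
    return [x[i] + dt / 6.0 * (g1[i] + 2.0 * g2[i] + 2.0 * g3[i] + g4[i])
            for i in range(n)]


def simulate_one_second(rate_hz=500, substeps=2, volts=1.0):
    """Run one second of control ticks; return (final state, derivative calls)."""
    n_calls = [0]  # mutable counter shared with the closure below

    def toy_dxdt(x, u):
        # Hypothetical 3-state dynamics, for illustration only.
        n_calls[0] += 1
        pos, vel, z = x
        return [vel, -2.0 * vel - 5.0 * pos + u, -z + u]

    dt = 1.0 / rate_hz / substeps
    x = [0.0, 0.0, 0.0]
    for _ in range(rate_hz):          # one iteration per real-time step
        for _ in range(substeps):     # RK4 substeps per real-time step
            x = rk4_step(toy_dxdt, x, volts, dt)
    return x, n_calls[0]


x_final, n = simulate_one_second()
print(n)  # 500 ticks * 2 substeps * 4 evaluations = 4000
```

Every derivative evaluation here is an ordinary Python function call that also builds several small lists, which is the kind of per-call overhead the post suspects.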
On 8/3/2012 11:02 AM, Ryan Krauss wrote:
[clip]
> I have a function that declares its return values as double precision:
>
> double precision function dzdt(x,voltage)
>
> and I declare the variable I want to store the returned value in to
> also be double precision:
>
> double precision F,z,vel,accel,zdot1,zdot2,zdot3,zdot4
>
> zdot1 = dzdt(x_prev,volts)
>
> but some how it is not happy.

I'm not much of a Fortran programmer and I may misunderstand the above, but have you tried adding dzdt to your double precision declaration?
In reply to this post by Ryan Krauss-2
03.08.2012 19:02, Ryan Krauss wrote:
[clip]
> Can anyone either make my Cython or Fortran approaches work or point
> me in a different direction?

Regarding Cython: run

    cython -a runge_kuta.pyx

and check the created HTML file. Slow points are highlighted in yellow.

Regarding this case:

- `cdef`, not `def`, for the dxdt_* function

- from libc.math cimport exp

- Do not use small numpy arrays inside loops.
  Use C constructs instead.

- Use @cython.cdivision(True), @cython.boundscheck(False)

PS. Runge-Kutta

--
Pauli Virtanen
In reply to this post by Ryan Krauss-2
03.08.2012 19:02, Ryan Krauss wrote:
[clip]
> zdot1 = dzdt(x_prev,volts)
>
> but some how it is not happy.

It's Fortran 77. In the calling routine you need to declare

    double precision dzdt

Otherwise implicit typing treats dzdt as a default (single precision) real there.

I'd suggest writing Fortran 90 --- no need to bring more F77 code into existence ;)

--
Pauli Virtanen
In reply to this post by Pauli Virtanen-3
Thanks for the suggestions.
> - Do not use small numpy arrays inside loops.
>   Use C constructs instead.

This is where I ran into trouble with my knowledge of C. I have several 3x1 arrays that I need to pass into the dxdt function, multiply by scalars, and add together. I don't know how to do that cleanly in C. For example:

    x_out = x_prev + 1.0/6*(g1 + 2*g2 + 2*g3 + g4)

where x_prev, g1, g2, g3, and g4 are all 3x1.

A little googling led me to valarrays, but I don't know if that is the best approach or how to use them within Cython.

How would you do basic math on small arrays in pure C?
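One common answer to the question above is that C has no array expressions at all: the usual approach is fixed-size arrays, an explicit loop over the three elements, and an output array that the caller allocates once and passes in. The sketch below shows that loop structure in plain Python purely for illustration; the function name `rk4_combine` and the sample values are made up, and in Cython the same loop would be written over typed C arrays or typed memoryviews.

```python
def rk4_combine(x_prev, g1, g2, g3, g4, x_out):
    """Compute x_out = x_prev + 1.0/6*(g1 + 2*g2 + 2*g3 + g4) element by element.

    x_out is preallocated by the caller and filled in place, so no temporary
    arrays are created inside the integration loop.
    """
    for i in range(3):
        x_out[i] = x_prev[i] + (g1[i] + 2.0 * g2[i] + 2.0 * g3[i] + g4[i]) / 6.0


x_prev = [1.0, 2.0, 3.0]
g1 = [6.0, 0.0, 0.0]
g2 = [0.0, 3.0, 0.0]
g3 = [0.0, 0.0, 3.0]
g4 = [0.0, 0.0, 0.0]
x_out = [0.0, 0.0, 0.0]   # allocated once, reused on every step

rk4_combine(x_prev, g1, g2, g3, g4, x_out)
print(x_out)  # [2.0, 3.0, 4.0]
```

With only three elements, the loop body is a handful of multiply-adds, so the cost of creating and destroying small array objects would dominate if array expressions were used instead.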
Fortran, so fast, yet so painful. Once I got it working, it was 94 times faster than my pure Python version.

Thanks to Jim and Pauli for helping me find my error. Ironically, I was thinking like a C programmer. Just because a Fortran function declares its return value data type doesn't mean all calling functions or subroutines will know the data type of the function when they call it.

I am still open to Cython suggestions. I don't want to bring more F77 code into the world.....
In reply to this post by Pauli Virtanen-3
On 03.08.2012 21:05, Pauli Virtanen wrote:
> It's Fortran 77. You need to declare
>
> double precision dzdt
>
> I'd suggest writing Fortran 90 --- no need to bring more F77 code into
> existence ;)

With the new typed memoryviews in Cython, there is no need to bring more Fortran of any sort into existence. ;-)

Sturla
In reply to this post by Ryan Krauss-2
Hey,
Just to add to what was said previously: isn't float in Cython single precision? I doubt this was intended here, and it should be replaced with DTYPE_t everywhere. Other than that, as was already said, np.zeros/np.exp is bad there...

Regards,

Sebastian
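To illustrate the precision point, a variable declared as a C float holds roughly 7 significant decimal digits, while a C double (which is what a Python float is) holds roughly 16. The sketch below uses the standard-library struct module to round a value to single precision and back; the helper name `to_single` is made up, and the constant is the value of J that appears in the Cython error listing quoted later in this thread.

```python
import struct


def to_single(value):
    """Round a Python float (a C double) to the nearest C float and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


J = 0.0011767297528720126        # double-precision constant from the thread
J_single = to_single(J)          # what a single-precision variable would hold
rel_err = abs(J_single - J) / J  # relative error introduced by the rounding

print(J)
print(J_single)
print(rel_err)
```

A relative error on this order per stored value may or may not matter for a given controller, but it is rarely what was intended when the rest of the computation is done in double precision.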
In reply to this post by Ryan Krauss-2
I am sure properly coded Cython is great, but I really struggled when I tried to use it. I found that it allows you to write really slow code without errors or warnings. I found the profiling tools to be only marginally helpful. So many different ways to do the same thing... which is the best? All the documentation is nice, but very long and dense.

I am having much more success with f2py (using F90 syntax). Either your code runs fast, or it simply will not compile (excepting segfault bugs that can sometimes be difficult to track down). Improved and updated documentation would be helpful, but otherwise f2py is now what I turn to when speed is crucial.

My 2 cents. YMMV.

Jonathan

On 08/04/2012 02:45 AM, [hidden email] wrote:
> Date: Sat, 04 Aug 2012 03:03:38 +0200
> From: Sturla Molden
> Subject: Re: [SciPy-User] help speeding up a Runge-Kuta algorithm
> (cython, f2py, ...)
>
> With the new typed memoryviews in Cython, there is no need to bring more
> Fortran of any sort into existance.;-)
On 04.08.2012 21:35, Jonathan Stickel wrote:
> I am sure properly coded Cython is great, but I really struggled when I
> tried to use it.

Using np.ndarray declarations and array expressions will be slow. To write fast and numpyonic Cython, use typed memoryviews instead, and write out all loops. I.e. there is no support for array expressions with these yet. And unfortunately there is a huge lack of documentation on how to use typed memoryviews efficiently.

Here is example code showing how to use Cython as a Fortran killer:

https://github.com/sturlamolden/memview_benchmarks/blob/master/memview.pyx

In this case, the performance with -O2 was just 2.2% slower than "plain C" with pointer arithmetic. It is possible to write very fast array code with Cython, but you must do it right.

For comparison, this is very slow:

https://github.com/sturlamolden/memview_benchmarks/blob/master/cythonized_numpy_2b.pyx

What this means is this: for anything but trivial code, the NumPy syntax is just too slow and should be avoided!

I briefly looked at the Cython code posted in this thread, and it suffers from all these issues.

Sturla
In reply to this post by Ryan Krauss-2
Not tested and debugged, but to me it looks like something like this might be what you want.

Sturla

Attachment: runge_kuta.pyx (2K)
In reply to this post by Jonathan Stickel-5
On Sat, Aug 04, 2012 at 01:35:17PM -0600, Jonathan Stickel wrote:
> I am sure properly coded Cython is great, but I really struggled when I
> tried to use it. I found that it allows you to write really slow code
> without errors or warnings. I found the profiling tools to be only
> marginally helpful.

To write fast Cython code, compile it with 'cython -a' and open the resulting HTML file in a web browser. The yellow lines are where the problems are: click on them and you'll find that they correspond to lines of Cython code that lead to long and complex C code. Improve your code (by making sure that it relies on typed variables and fast array access) until it has no yellow lines.

HTH,

Gael
In reply to this post by Sturla Molden-2
Thanks to Sturla for helping me get this working in Cython.
I am trying to compile the code to compare it against Fortran for speed. I have run into two bugs so far (I mentioned that my C skills are weak).

The first has to do with the "const trick":

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline void dxdt_runge_kuta(double *x "const double *",
                                 double voltage "const double",
                                 double *dxdt):
    #cdef double J = 0.0011767297528720126 "const double"
    cdef double J = 0.0011767297528720126
    cdef double alpha0 = 4.1396263800000002 "const double"
                                           ^
------------------------------------------------------------

runge_kuta_v2.pyx:12:44: Syntax error in C variable declaration

I don't know what the problem is here, so for now I just got rid of all the "const double" statements. (In case the formatting doesn't come through, the little error caret ^ points to the space between the last number and the quote.)

After getting rid of all the "const double" expressions (just to see if everything else would compile), I got this:

Error compiling Cython file:
------------------------------------------------------------
...
    dxdt[0] = vel
    dxdt[1] = accel
    dxdt[2] = dzdt


def runge_kuta_one_step(double _x[::1], Py_ssize_t factor, double volts,
                                 ^
------------------------------------------------------------

runge_kuta_v2.pyx:31:34: Expected an identifier or literal

The caret points to the first square bracket.

Thanks,

Ryan
So, I get the same error when I try to compile Sturla's memview.pyx example. I think my version of Cython is too old:

    Cython version 0.15.1

Let me look into that...
I upgraded to cython 0.16 and made a bit more progress.
I don't know if this is headed in the right direction or not, but based on the memview.pyx example I changed

    double _x[::1]

to

    np.float64_t[::1] _x

and did the same thing with

    cdef double out[::1] = np.zeros(3)

I seem to be closer to compiling successfully, but now have this error:

Error compiling Cython file:
------------------------------------------------------------
...
import numpy as np
cimport numpy as np
from libc.math cimport exp, fabs

cdef inline void dxdt_runge_kuta(double *x "const double *",
                               ^
------------------------------------------------------------

runge_kuta_v2.pyx:8:32: Function argument cannot have C name specification

(The caret points to the last "a" in runge_kuta.)

Thanks again,

Ryan
In reply to this post by Ryan Krauss-2
On 06.08.2012 15:51, Ryan Krauss wrote:
> Thanks to Sturla for helping me get this working in Cython.
>
> I am trying to compile the code to compare it against fortran for
> speed. I have run into two bugs so far (I mentioned that my C skills
> are weak).

Sorry, I should have debugged :(

This one compiles with

    $ python setup.py build_ext

Is this what you wanted?

Sturla

Attachment: runge_kutta.zip (1K)
Thanks Sturla. That code compiles just fine and will go a long way toward helping me understand how to use Cython to write fast code for these kinds of applications.

For many Runge-Kutta steps, your Cython code is 200 times faster than my pure Python version. Fortran is still 1.6 times faster than the Cython version, but the Fortran version is much more work to code up.

Thanks again,

Ryan
On 07.08.2012 18:37, Ryan Krauss wrote:
> For many Runge-Kutta steps, your Cython code is 200 times faster than
> my pure Python version. Fortran is still 1.6 times faster than the
> Cython version, but the Fortran version is much more work to code up.

Don't expect anything to be "faster than Fortran" for certain kinds of numerical work. Cython has a certain overhead (larger than C and Fortran), and since it compiles to ANSI C (C89, not C99) we cannot use restrict-qualified pointers. But still, ~75% of Fortran performance is often acceptable! Another thing is that you need to look at "scalability". How much of that extra runtime is constant, due to differences between Cython and f2py? How much is variable, due to the numerical kernel being faster in Fortran? Will differently sized problems give you the same overhead from using Cython? It often helps to plot a graph of the performance (mean and error bars) for various problem sizes, rather than benchmarking at one single point.

Correctness is always more important than speed. That is one thing to consider too. With Cython we can begin with a tested Python prototype and optimize along the way, using the Python profiler to pinpoint where it matters the most. Python, NumPy and Cython will not win the world championship of being "fastest on the CPU" for simple numerical kernels, but that is not the idea either. Implementing complex algorithms in Fortran can be a PITA compared to Python. But Cython gives us a straightforward way to speed up Python code and/or interface with C or C++. Fortran is only nice for helping us scientists to avoid the pointer arithmetic of C, but Cython's memoryviews do that too.

Sturla
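The benchmarking advice above can be sketched roughly as follows. This is only an illustration under stated assumptions: `rk4_steps` is a hypothetical stand-in kernel (swap in the Cython or f2py wrapper you actually want to time), and `benchmark` and `fit_overhead` are made-up helper names. The straight-line fit t = a + b*n is one simple way to separate constant call overhead (a) from per-step cost (b); it assumes the runtime is roughly linear in the number of steps.

```python
import statistics
import timeit


def rk4_steps(n_steps, dt=0.001):
    """Hypothetical stand-in kernel: n_steps RK4 steps of dx/dt = -x."""
    x = 1.0
    for _ in range(n_steps):
        k1 = -x
        k2 = -(x + 0.5 * dt * k1)
        k3 = -(x + 0.5 * dt * k2)
        k4 = -(x + dt * k3)
        x += (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x


def benchmark(func, sizes, repeats=5):
    """Return {size: (mean_seconds, stdev_seconds)} over `repeats` timed runs."""
    results = {}
    for n in sizes:
        times = timeit.repeat(lambda: func(n), number=1, repeat=repeats)
        results[n] = (statistics.mean(times), statistics.stdev(times))
    return results


def fit_overhead(sizes, mean_times):
    """Least-squares fit of t = a + b*n; returns (a, b)."""
    count = len(sizes)
    mean_x = sum(sizes) / count
    mean_y = sum(mean_times) / count
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, mean_times))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    sizes = [10, 100, 1000, 10000]
    results = benchmark(rk4_steps, sizes)
    for n, (mean, std) in results.items():
        print(f"{n:>6d} steps: {mean:.6f} s +/- {std:.6f} s")
    a, b = fit_overhead(sizes, [results[n][0] for n in sizes])
    print(f"constant overhead ~ {a:.2e} s, per-step cost ~ {b:.2e} s")
```

Running the same script against each implementation and comparing the fitted a and b values gives a rough picture of whether the gap comes from wrapper overhead or from the kernel itself. The means and standard deviations can be passed to a plotting library to draw the error-bar graph.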
I agree. Thanks again.