Hello!
I use scipy.optimize.leastsq to adopt paramters of a model to measured data. Each evaluation of that model costs 1.5 h of computation time. Unfortunately I can not specify a gradient function. While observing the approximation process I found that the first 3 runs were always with the same parameters. First I thought, the parameter variation for gradient approximation is too tiny for a simple print command. Later I found out, that these three runs were independent of the number of fit parameters. A closer look to the code reveals the reason (svn dir trunk/scipy/optimize): 1st call is to check with python code wether the function is valid line 265 of minpack.py m = check_func(func,x0,args,n)[0] 2nd call is to get the right amount of memory for paramters. line 449 of __minpack.h ap_fvec = (PyArrayObject *)call_python_function(fcn, n, x, extra_args, 1, minpack_error); 3rd call is from inside the fortran algorithm (the essential one!) Unfortunately that behaviour is not described and I would eagerly demand to avoid the superficial calls to the function. Yours, Achim _______________________________________________ SciPy-user mailing list [hidden email] http://projects.scipy.org/mailman/listinfo/scipy-user |
You could try using leastsq from scikits.openopt; it has a mechanism for
preventing a double call of the objective function with the same x value as before.

Regards, D.

Achim Gaedke wrote:
> I use scipy.optimize.leastsq to fit the parameters of a model to measured
> data. Each evaluation of that model costs 1.5 h of computation time.
[clip]
In reply to this post by Achim Gaedke
If you have multiple CPUs available and your modeling can be adapted to parallelization, you might look at Parallel Python:
http://www.parallelpython.com/

James Phillips
http://zunzun.com

On Tue, Apr 15, 2008 at 1:15 PM, Achim Gaedke <[hidden email]> wrote:
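The idea of running independent model evaluations side by side can be sketched with nothing but the standard library (a minimal illustration, not Parallel Python's actual API, and written for a modern Python interpreter rather than the Python 2 of this thread). `expensive_model` is a hypothetical stand-in for the 1.5 h simulation; threads only pay off if each evaluation releases the GIL, e.g. by shelling out to an external simulator — otherwise a `ProcessPoolExecutor` would be needed:

```python
from concurrent.futures import ThreadPoolExecutor

def expensive_model(params):
    # Hypothetical stand-in for the 1.5 h simulation: a cheap polynomial.
    a, b = params
    return a * a + b

def evaluate_batch(param_sets, workers=4):
    # Evaluate several independent parameter sets concurrently,
    # preserving the input order of the results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(expensive_model, param_sets))

print(evaluate_batch([(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]))  # [3.0, 7.0, 13.0]
```

Whether this helps depends on the optimizer: plain leastsq requests one point at a time, so batching mainly pays off for the finite-difference gradient steps or for population-based methods.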
In reply to this post by Achim Gaedke
Hi Achim
On 15/04/2008, Achim Gaedke <[hidden email]> wrote:
> I use scipy.optimize.leastsq to fit the parameters of a model to measured
> data. Each evaluation of that model costs 1.5 h of computation time.
>
> While observing the approximation process I found that the first 3 runs
> were always with the same parameters.

Whoops, that should be fixed, if possible. In the meantime, you can use the memoize decorator as a workaround:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/466320

Your function calls take so long that you really won't notice the (tiny) overhead.

Regards
Stéfan
In reply to this post by dmitrey.kroshko
On 15 Apr 2008, at 11:47 AM, dmitrey wrote:
> You could try using leastsq from scikits.openopt; it has a mechanism
> for preventing a double call of the objective function with the same x
> value as before.

In principle, you could also just roll your own memoization to cache the results (assuming that you can afford to store them, but since it takes 1.5 hours per call, you can't have too many calls!):

    import numpy

    def memoize(f):
        def memoized_f(x, _cache={}):
            # Numpy arrays are not valid dictionary keys...
            key = tuple(numpy.ravel(x))
            if key not in _cache:
                _cache[key] = f(x)
            return _cache[key]
        return memoized_f

Now you can decorate your expensive function, and it will not recompute values for the same input:

    @memoize
    def expensive_function(x):
        print "Computing %s**2..." % str(x)
        return x * x

    >>> expensive_function(1.01)
    Computing 1.01**2...
    1.0201
    >>> expensive_function(1.01)
    1.0201
    >>> expensive_function(numpy.array([1, 2, 3]))
    Computing [1 2 3]**2...
    array([1, 4, 9])
    >>> expensive_function(numpy.array([1, 2, 3]))
    array([1, 4, 9])

Michael.
In reply to this post by Achim Gaedke
On 15/04/2008, Achim Gaedke <[hidden email]> wrote:
> I use scipy.optimize.leastsq to fit the parameters of a model to measured
> data. Each evaluation of that model costs 1.5 h of computation time.
> Unfortunately I cannot specify a gradient function.

Yikes. I'm afraid this is going to be a rather painful process. Unfortunately, while minimizing the number of function evaluations is a goal of optimization procedures, they are not necessarily tuned carefully enough for such an expensive function.

You may want to look into taking advantage of any structure your problem has (for example, when I had a similar problem I found I could modify most of my parameters rather rapidly, while changing one of them was expensive, so I used nested optimizers) or, if you have to do this often, coming up with an interpolating function and optimizing that. You may also want to write a function that is fast but behaves in a similar fashion and hold a shootout of all the optimizers available to see which ones require the fewest evaluations for the accuracy you need. If you have access to a computing cluster, you may also want to look into some kind of parallel optimization procedure that can run a number of function evaluations concurrently.

> While observing the approximation process I found that the first 3 runs
> were always with the same parameters.
[clip]
> 3rd call is from inside the Fortran algorithm (the essential one!)
>
> Unfortunately that behaviour is not described and I would eagerly demand
> to avoid the superfluous calls to the function.

This is annoying, and it should be fixed inside scipy if possible; the FORTRAN code will make this more difficult, but file a bug on the scipy Trac and we'll look at it. In the meantime, you can use "memoizing" to avoid recomputing your function. See

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/498110

for a bells-and-whistles implementation, but the basic idea is just that you wrap your function in a wrapper that stores a dictionary mapping inputs to outputs. Then every time you call the function, it checks whether the function has been called before with these values and, if so, returns the value computed before.

Good luck,
Anne
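The "shootout" suggestion is easy to instrument: wrap a cheap stand-in function in a counter, run each candidate optimizer on the wrapped version, and compare the counts afterwards. A minimal sketch (the stand-in residual function is illustrative, and the code targets a modern Python interpreter):

```python
def counted(f):
    # Wrap f so the number of calls an optimizer makes
    # can be read off afterwards from wrapper.calls.
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return f(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

def residuals(p):
    # Cheap stand-in shaped like the real problem (zero residuals at p = (1, 2)).
    return [p[0] - 1.0, p[1] - 2.0]

# In a real shootout, each optimizer (leastsq, fmin, fmin_powell, ...) would be
# handed its own counted wrapper; here we just call it by hand:
f = counted(residuals)
f([0.0, 0.0])
f([1.0, 2.0])
print(f.calls)  # 2
```

Comparing `wrapper.calls` across optimizers then shows directly which one is cheapest for the accuracy required.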
In reply to this post by Achim Gaedke
Tue, 15 Apr 2008 20:15:33 +0200, Achim Gaedke wrote:
> I use scipy.optimize.leastsq to fit the parameters of a model to measured
> data. Each evaluation of that model costs 1.5 h of computation time.
> Unfortunately I cannot specify a gradient function.
[clip]
> Unfortunately that behaviour is not described and I would eagerly demand
> to avoid the superfluous calls to the function.

As a workaround, you can memoize the function calls, something like this:

---- clip ----
import scipy as sp

def memoize_single(func):
    """Optimize out repeated function calls with the same arguments."""
    # A mutable container lets the wrapper update the stored state
    # (plain rebinding of closure variables would not work here).
    last = {'z': None, 'f': None}
    def wrapper(z, *a, **kw):
        if last['f'] is not None and sp.all(z == last['z']):
            return last['f'].copy()
        last['z'] = sp.array(z, copy=True)
        last['f'] = sp.array(func(z, *a, **kw), copy=True)
        return last['f'].copy()
    return wrapper

@memoize_single
def target_function(z):
    print "Evaluating..."
    z = sp.asarray(z)
    return sum(z**2)

for k in xrange(10):
    print target_function([1, 2, 3])
---- clip ----

It should output:

---- clip ----
Evaluating...
14
14
14
14
14
14
14
14
14
14
---- clip ----

--
Pauli Virtanen
Pauli Virtanen wrote:
> @memoize_single
> def target_function(z):
>     print "Evaluating..."
>     z = sp.asarray(z)
>     return sum(z**2)

Sorry for this off-topic question, but can somebody explain to me what that @... syntax means? I searched the Python manuals several times without luck.

Thanks, Christian
It's a decorator (introduced in Python 2.4). It is simply equivalent to:

    target_function = memoize_single(target_function)

after the function definition.

Matthieu

2008/4/15, Christian K. <[hidden email]>:
> Sorry for this off-topic question, but can somebody explain to me what
> that @... syntax means?

--
French PhD student
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher
In reply to this post by Christian K.
On 15 Apr 2008, at 2:12 PM, Christian K. wrote:

> Sorry for this off-topic question, but can somebody explain to me what
> that @... syntax means? I searched the Python manuals several times
> without luck.

It is function decorator syntax:

http://www.python.org/dev/peps/pep-0318/
http://docs.python.org/ref/function.html#l2h-629

    @memoize
    def f(x): ...

is equivalent to

    def f(x): ...
    f = memoize(f)

Michael.
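A self-contained toy example makes the equivalence concrete (the names here are made up purely for illustration):

```python
def shout(f):
    # A trivial decorator: upper-case whatever f returns.
    def wrapper(*args):
        return f(*args).upper()
    return wrapper

@shout
def greet(name):
    return "hello, %s" % name

# The @shout line above is sugar for this explicit rebinding:
def greet2(name):
    return "hello, %s" % name
greet2 = shout(greet2)

print(greet("world"))   # HELLO, WORLD
print(greet2("world"))  # HELLO, WORLD
```

Both spellings leave the name bound to the wrapped function, which is why `memoize_single` can intercept every call to `target_function`.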
In reply to this post by Anne Archibald
Anne Archibald wrote:
> This is annoying, and it should be fixed inside scipy if possible; the
> FORTRAN code will make this more difficult, but file a bug on the
> scipy Trac and we'll look at it. In the meantime, you can use
> "memoizing" to avoid recomputing your function.
[clip]

I am testing different physics models to see whether they comply with the measurement. So I cannot optimize each model before testing it in detail; I could not even simplify the model, because I want to investigate the contribution of different effects. I am not really terrified by 1.5 h for each data point. But I have the opportunity to compare the logs of each run.
I've already written a stub to avoid unnecessary evaluation, so my intention was to notify people that the behaviour is not as expected. There are parameters for function-evaluation limits; they might be wrong, because they count only the Fortran calls.

Thanks for all the answers,
Achim
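A stub along those lines — a hypothetical sketch, not Achim's actual code, and written for a modern Python interpreter — can both log every parameter vector the optimizer requests and suppress the duplicate start-up evaluations:

```python
def dedup_and_log(f):
    # Record every requested parameter vector, but evaluate f only once
    # per distinct vector; repeats are served from the cache.
    cache = {}
    def wrapper(x, *args):
        key = tuple(x)
        wrapper.requests.append(key)
        if key not in cache:
            cache[key] = f(x, *args)
        return cache[key]
    wrapper.requests = []
    return wrapper

def model(x):
    # Illustrative model that counts its own real evaluations.
    model.evaluations += 1
    return [xi ** 2 for xi in x]
model.evaluations = 0

g = dedup_and_log(model)
g([1.0, 2.0]); g([1.0, 2.0]); g([1.0, 2.0])   # leastsq-style triple call
print(len(g.requests), model.evaluations)      # 3 1
```

Comparing `wrapper.requests` against the evaluation count also makes the discrepancy with the maxfev-style limits visible: the optimizer's counter only sees the calls that reach the underlying model.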
On Wednesday 16 April 2008 at 15:32 +0200, Achim Gaedke wrote:
> I am testing different physics models to see whether they comply with the
> measurement. So I cannot optimize each model before testing it in detail;
> I could not even simplify the model, because I want to investigate the
> contribution of different effects.
> I am not really terrified by 1.5 h for each data point. But I have the
> opportunity to compare the logs of each run.

Which language do you use to compute each 1h30 evaluation? Python, C, Fortran? If Python, it may be worth rewriting it in Fortran and using f2py, for example...

--
Fabrice Silva
Fabrice Silva wrote:
> Which language do you use to compute each 1h30 evaluation? Python, C,
> Fortran? If Python, it may be worth rewriting it in Fortran and using
> f2py, for example...

I would also add that it would help to know what you want to achieve. One reason is that there may be a better way to reach your goal if the model is extremely complex or the data used are inappropriate (very badly conditioned). If you are just changing parameters and the model is reasonably well behaved, then you should use starting values close to the base model. If the problem is the data, then you should do something to the data, like normalizing it.

Bruce