Hello,
I am using scikits.timeseries to evaluate a long-term measurement data set.

How can I extract those years which have complete measurements?

In the code below, years 2004 & 2008 are not complete.
Is there a generic way to mask all incomplete years?

Thanks & regards,
Timmie

###code

import numpy as np
import numpy.ma as ma
import scikits.timeseries as ts

data = np.arange(0, 40800)
start_dt = ts.Date(freq='H', year=2004, month=3, day=1, hour=0)
s_all = ts.time_series(data, freq='H', start_date=start_dt)

_______________________________________________
SciPy-user mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/scipy-user
Timmie,
There's no generic function to perform what you want, as it'll depend on the frequency. What you can do is:

1. Get the set of years:
>>> singleyears = set(s_all.years)

2. For each year, check what the first and last days of the year are:
>>> firstandlast = [tuple([year] + s_all[s_all.years==year].yeardays[[0,-1]].tolist())
...                 for year in singleyears]
That gives you a list of tuples (year, first day, last day).

3. Find the years for which the first day is strictly larger than 1 or the last strictly lower than 365:
>>> maskyears = [y for (y,f,l) in firstandlast if f>1 or l<365]

4. Mask the corresponding years:
>>> for y in maskyears:
...     s_all[s_all.years==y] = ma.masked

That's far from efficient and rather ugly, but it should give you a generic idea.
Let me know how it goes.
P.

On Nov 17, 2008, at 3:34 PM, Timmie wrote:
> How can I extract those years, which have complete measurements?
> [...]
> Is there a generic possibility that all incomplete years get masked?
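For readers without scikits.timeseries installed, the same goal can be sketched with plain NumPy. Note that this is a different check from the first-day/last-day test above: it counts the hourly samples in each year and compares the count with the number of hours that calendar year should have, which also accounts for leap years. The datetime64 index and the names `dates`, `years`, `incomplete` and `masked` are assumptions made for this illustration, and the count-based test assumes the series has no duplicated timestamps.

```python
import calendar

import numpy as np
import numpy.ma as ma

# Same hourly data as in the question, indexed with NumPy datetime64
# values instead of a scikits.timeseries series (assumption).
data = np.arange(0, 40800)
start = np.datetime64("2004-03-01T00", "h")
dates = start + np.arange(data.size).astype("timedelta64[h]")

# Calendar year of every sample (datetime64[Y] counts years since 1970).
years = dates.astype("datetime64[Y]").astype(int) + 1970

# Number of samples actually present in each year.
uniq, counts = np.unique(years, return_counts=True)

# A year is complete only if it holds 24 samples for each of its days.
incomplete = [
    int(y)
    for y, n in zip(uniq, counts)
    if n < 24 * (366 if calendar.isleap(int(y)) else 365)
]

# Mask every sample that belongs to an incomplete year.
masked = ma.masked_where(np.isin(years, incomplete), data)
```

With the data from the question, `incomplete` should contain 2004 and 2008, and `masked.count()` gives the number of samples left in the complete years.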
In reply to this post by Timmie
Timmie,
There's a smarter way than the previous answer, if you're not afraid of temporary arrays. Here's a copy-pasted version, commented.
Let me know how it goes.
Cheers
P.

#### BELOW A SAMPLE SCRIPT THAT MAY ILLUSTRATE ####

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import numpy as np
import numpy.ma as ma
import scikits.timeseries as ts

data = np.arange(0, 40800)
start_dt = ts.Date(freq='H', year=2004, month=3, day=1, hour=0)
s_all = ts.time_series(data, freq='H', start_date=start_dt)

# Convert to a (5, 24*366) annual series: each row is a year, each column an hour.
# Because of leap years, we have 24*366 columns, not 24*365.
a_s_all = s_all.convert('A')

# If the first column (the first date) is masked, mask the row.
a_s_all[a_s_all[:,0].mask] = ma.masked

# If column -25 (last hour of 12/31, or of 12/30 in a leap year) is masked, mask the row.
a_s_all[a_s_all[:,-25].mask] = ma.masked

# Make a new series from the annual series.
# We can't use convert because the annual series is 2D.
# Instead, we create a new series starting at the first date of the annual series,
# converted to the correct frequency (s_all.freq).
# As the method asfreq defaults to END, we need to force 'START' for relation
# (check the docstring of asfreq).
starting_date = a_s_all.dates[0].asfreq(s_all.freq, relation='START')

# For the data, we can't use a_s_all.ravel() directly because a_s_all is 2D,
# but we only need the data actually, not the dates.
s_new = ts.time_series(a_s_all._series.ravel(), start_date=starting_date)

# And if you want, you can force the starting and ending dates of this new series
# to the initial ones.
s_mod = ts.align_with(s_all, s_new)
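The reshape-by-year idea in the script above can also be sketched with NumPy alone, which may help to see what the 2D annual layout looks like. This is only an illustration under stated assumptions, not what `convert('A')` does internally: the grid is built by hand with one row per calendar year and 24*366 hourly columns, and the names `values`, `present`, `incomplete` and `s_mod` are made up for the example.

```python
import numpy as np
import numpy.ma as ma

# Same hourly data as in the question, with a datetime64 index (assumption).
data = np.arange(0, 40800)
start = np.datetime64("2004-03-01T00", "h")
dates = start + np.arange(data.size).astype("timedelta64[h]")

# Row = years since the first year, column = hour of the year (0-based).
year_start = dates.astype("datetime64[Y]")
row = (year_start - year_start[0]).astype(int)
col = (dates - year_start.astype("datetime64[h]")).astype(int)

# Hand-built annual grid: 24*366 columns so that leap years fit.
n_rows, n_cols = row.max() + 1, 24 * 366
values = np.zeros((n_rows, n_cols), dtype=data.dtype)
present = np.zeros((n_rows, n_cols), dtype=bool)
values[row, col] = data
present[row, col] = True

# A year is incomplete if its first hour is missing, or if column -25
# (last hour of 12/31, or of 12/30 in a leap year) is missing.
incomplete = ~present[:, 0] | ~present[:, -25]
present[incomplete] = False

# The 2D masked view, and the result mapped back onto the hourly axis.
grid = ma.array(values, mask=~present)
s_mod = ma.array(data, mask=~present[row, col])
```

As in the original script, the column -25 test lets a leap year that stops on 12/30 pass as complete, so a stricter check would be needed if that case matters.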