Hello Pierre, Matt and others!
The thing you suggested worked and gave the result I wanted. The crucial thing was -- as Pierre wrote -- the filling of the missing dates:

    timeseries.fill_missing_dates(series)

But now I have kind of 'two different' masks:

(1) One mask that I created when importing the data or creating the masked array. It masks all data values that are physically implausible or invalid.

(2) Another mask that I just created with fill_missing_dates to get the missing dates filled.

You'd say that this is fine. I now want to continue masking invalid data with filters (e.g. discard x lower than 5 AND higher than 100), and many more filters in between. In the end I would like to count all masked data points to get a feeling for the performance of my logging device, or of the measurement process as a whole.

If I now count all masked values, the result would include those data points masked in stage (2). This would significantly reduce the accuracy of my data recovery ratio: number of valid data points / number of expected data points. Any suggestion how I can get around this?

BTW, is there a more efficient way to get properties of the masked array, like the number of masked and unmasked values? I tried this:

    # number of unmasked (valid) values
    number_of_valid_values = filled.mask.size - sum(filled.mask)
    # number of masked values
    number_of_masked_values = sum(filled.mask)

Greetings,
Marco

_______________________________________________
SciPy-user mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/scipy-user
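To make the two-mask situation concrete, here is a minimal sketch using plain numpy.ma with hypothetical readings (a valid range of 5..100 is assumed; the appended masked slots merely stand in for what fill_missing_dates does, scikits.timeseries itself is not needed to show the counting problem):

```python
import numpy as np
import numpy.ma as ma

# Stage (1): mask values that are physically implausible on import
raw = ma.masked_outside(np.array([3.0, 7.0, 150.0, 42.0, 99.0]), 5, 100)

# Remember which points were masked as invalid *data*, before any date filling
stage1_mask = ma.getmaskarray(raw).copy()

# Stage (2): stand-in for fill_missing_dates -- two masked placeholder
# slots appended where dates were missing
filled = ma.concatenate([raw, ma.array([0.0, 0.0], mask=[True, True])])

# Counting the combined mask mixes both stages together:
print(filled.mask.sum())   # 4 masked in total: 2 invalid values + 2 date gaps
print(stage1_mask.sum())   # 2 masked as invalid data only
print(filled.count())      # 3 valid data points
```

Keeping a copy of the stage-(1) mask before filling is one way to tell the two kinds of masked points apart later.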
Marco,
> (1) One mask that I created when importing the data or creating the masked
> array. It masks all data values that are physically implausible or invalid.
> (2) Another mask that I just created with fill_missing_dates to get the
> missing dates filled.

Just count the number of unmasked data with series.count(), and store it in a count_ini variable. Then keep applying your filters, counting the number of unmasked data each time. You can then compare these new counts to count_ini (the original one).

> BTW, is there a more efficient way to get properties of the masked array,
> like the number of masked and unmasked values?

If you look at the source code for the count method (in numpy.ma), you'll see that the result of count is only the difference between the size along the given axis and the sum of the mask along the same axis:

    ma.count(s, axis) = numpy.size(s._data, axis) - numpy.sum(s._mask, axis)

So, the number of "valid" values is given by series.count(axis), the number of "invalid" values by series._mask.sum(axis), and the total number of data by numpy.size(s, axis), or simply series.shape[axis].

If you only have 1D data, that's even faster:

    nb of valid:   series.count()
    nb of invalid: series._mask.sum()
    nb of data:    series.size
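The count_ini workflow above can be sketched like this (illustrative data; `series` stands in for your filled time series, and one value is assumed already masked on import):

```python
import numpy as np
import numpy.ma as ma

# One value (80.0) assumed masked on import
series = ma.array([2.0, 7.0, 50.0, 120.0, 80.0],
                  mask=[False, False, False, False, True])

count_ini = series.count()        # unmasked values before filtering: 4

# Apply a filter: discard values lower than 5 or higher than 100
series = ma.masked_outside(series, 5, 100)
count_after = series.count()      # 2 (2.0 and 120.0 are now masked too)

# Data lost to this filtering stage alone:
lost = count_ini - count_after    # 2
```

The same comparison can be repeated after each filter to attribute losses to individual filtering stages rather than to the date filling.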